[torqueusers] Dependencies being ignored from some submit hosts.

John Hanks griznog at gmail.com
Fri Feb 22 13:33:56 MST 2008


Reverse lookups are consistent from both hosts, and hostA is the
primary nameserver for itself and submitA, so they should be getting
their answers from the same place.

[root@hostA hostA]# host hostA
hostA.hpc.usu.edu has address 192.168.0.1
[root@hostA hostA]# host 192.168.0.1
1.0.168.192.in-addr.arpa domain name pointer hostA.hpc.usu.edu.

user@submitA ~ $ host hostA
hostA.hpc.usu.edu has address 192.168.0.1
user@submitA ~ $ host 192.168.0.1
1.0.168.192.in-addr.arpa domain name pointer hostA.hpc.usu.edu.
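
The host output above only shows what DNS returns. A client program goes
through the system resolver, which may also consult /etc/hosts depending on
nsswitch.conf, so the two can disagree. Below is a minimal C sketch of a
cross-check from the client's point of view. Both canonical_name() and
same_short_name() are hypothetical helpers, not TORQUE code, and the sketch
assumes get_fullhostname() does something similar to a forward lookup
followed by a reverse lookup, which has not been verified against the TORQUE
source.

```c
/* Hypothetical cross-check helpers; not taken from TORQUE. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose getaddrinfo/getnameinfo in strict -std modes */
#endif

#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Forward-resolve `name` to an IPv4 address, then reverse-resolve that
 * address through the system resolver. Returns 0 and fills `out` on
 * success, 1 on any lookup failure. */
int canonical_name(const char *name, char *out, size_t outlen)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo(name, NULL, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "forward lookup of %s failed: %s\n",
                name, gai_strerror(rc));
        return 1;
    }

    /* NI_NAMEREQD turns a missing reverse entry into an error instead of
     * silently returning the numeric address. */
    rc = getnameinfo(res->ai_addr, res->ai_addrlen, out, (socklen_t)outlen,
                     NULL, 0, NI_NAMEREQD);
    freeaddrinfo(res);
    if (rc != 0) {
        fprintf(stderr, "reverse lookup for %s failed: %s\n",
                name, gai_strerror(rc));
        return 1;
    }
    return 0;
}

/* Nonzero when `a` and `b` share the same first label (the part before
 * the first '.'), compared case-insensitively. */
int same_short_name(const char *a, const char *b)
{
    size_t la = strcspn(a, ".");
    size_t lb = strcspn(b, ".");
    size_t i;

    if (la != lb)
        return 0;
    for (i = 0; i < la; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}
```

Calling canonical_name("hostA", buf, sizeof(buf)) on both hostA and submitA
and printing buf would show whether each machine's resolver hands back the
short name or the FQDN. same_short_name() only confirms that two names refer
to the same first label; it does not say which form pbs_server uses in job
ids.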

qmgr -c 'p s' shows this for the server configuration:

set server scheduling = True
set server default_queue = route
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server scheduler_iteration = 60
set server node_check_rate = 150
set server tcp_timeout = 6
set server log_level = 1
set server pbs_version = 2.2.1
set server submit_hosts = submitA

I'm bumping up my personal "clueless" status to Orange.

Thanks,

jbh

On Fri, Feb 22, 2008 at 12:30 PM, Garrick Staples <garrick at usc.edu> wrote:
> On Fri, Feb 22, 2008 at 09:04:38AM -0700, John Hanks alleged:
>
>  > So all the q* commands I've looked at seem to call get_server() to
>  > create/modify a job id from the job argument if it exists. The last
>  > thing get_server() does is:
>  >
>  >     if (get_fullhostname(def_server,host_server,PBS_MAXSERVERNAME,NULL) != 0)
>  >       {
>  >       /* FAILURE */
>  >
>  >       return(1);
>  >       }
>  >
>  >     strcat(job_id_out,host_server);
>  >
>  > Which makes job_id_out take the form JOBID.FQDN, which is breaking my
>  > client's ability to use q* commands because the server calls all jobs
>  > JOBID.SHORTHOSTNAME. I have to think I'm not the first person who
>  > wanted to run q* commands on one machine that controlled pbs_server on
>  > another machine, so this has to be something I've broken. I just have
>  > no idea what it is. Any suggestions for where to poke next would be
>  > appreciated. Is there a way to force pbs_server to use the FQDN for job
>  > ids?
>
>  The most likely reason the two hosts disagree on the correct name is
>  reverse name resolution.  Are they configured differently?
>
>  Verify the reverse resolution of the server's IP is the same on both hosts.
>
>
> _______________________________________________
>  torqueusers mailing list
>  torqueusers at supercluster.org
>  http://www.supercluster.org/mailman/listinfo/torqueusers
>
>

