[torqueusers] Getting "qsub: Job rejected by all possible destinations"
Prakash Velayutham
prakash.velayutham at cchmc.org
Wed Feb 25 08:41:05 MST 2009
Hi,
Torque-2.3.6 Server and MOMs
Torque-2.1.10 submission client
I am getting "qsub: Job rejected by all possible destinations" on the
client side. Some of the details baffle me.
My client's name is client1.domain.com and its IP is x.y.z. This
mapping is in the DNS server.
The client's /etc/hosts file has the following entry:
x.y.z client2.domain.com client2
This entry has existed for a while, and the same jobs used to work.
But suddenly, at around 09:35 this morning, I started getting the
rejection from the Torque server. Once I remove the entry from
/etc/hosts, jobs go in fine.
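
For anyone wanting to reproduce the mismatch, here is a minimal sketch
of which name an /etc/hosts entry like the one above yields for the
client's address. It assumes the client consults files before DNS,
which depends on its resolver configuration. The helper name
canonical_name_from_hosts is hypothetical, and 192.0.2.10 is only a
stand-in for the anonymised x.y.z address.

```python
def canonical_name_from_hosts(hosts_text, ip):
    """Return (canonical_name, aliases) for ip from hosts-file text, or None.

    Hypothetical helper: the first name after the address on a hosts
    line is treated as the canonical name, the rest as aliases.
    """
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == ip:
            return fields[1], fields[2:]
    return None


# Stand-in for the anonymised x.y.z address (assumption).
CLIENT_IP = "192.0.2.10"

# Same shape as the entry in the client's /etc/hosts.
HOSTS = """
127.0.0.1 localhost
192.0.2.10 client2.domain.com client2
"""

print(canonical_name_from_hosts(HOSTS, CLIENT_IP))
```

Under that assumption the client sees itself as client2.domain.com,
while DNS on the server side answers client1.domain.com for the same
address. Whether that difference is what triggers the rejection is not
established here.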
client1.domain.com is listed in the Torque server's /etc/hosts.equiv
and in qmgr's acl_hosts as well.
Even now, if I add the same entry back to /etc/hosts, I get
rejections. I have no idea why this is happening, because when I run
nslookup on the Torque server for the client's IP address x.y.z, I get
back client1.domain.com. This is baffling and disturbing. The relevant
entries from the Torque server logs follow.
02/25/2009 09:35:02;0100;PBS_Server;Req;;Type AuthenticateUser request received from user at client1.domain.com, sock=13
02/25/2009 09:35:02;0100;PBS_Server;Req;;Type QueueJob request received from user at client1.domain.com, sock=10
02/25/2009 09:35:02;0100;PBS_Server;Req;;Type JobScript request received from user at client1.domain.com, sock=10
02/25/2009 09:35:02;0100;PBS_Server;Req;;Type ReadyToCommit request received from user at client1.domain.com, sock=10
02/25/2009 09:35:02;0100;PBS_Server;Req;;Type Commit request received from user at client1.domain.com, sock=10
02/25/2009 09:35:02;0100;PBS_Server;Job;2799.bmiclustersvcd1.cchmc.org;enqueuing into routing, state 1 hop 1
02/25/2009 09:35:02;0008;PBS_Server;Job;2799.bmiclustersvcd1.cchmc.org;Job rejected by all possible destinations
02/25/2009 09:35:02;0100;PBS_Server;Job;2799.bmiclustersvcd1.cchmc.org;dequeuing from routing, state QUEUED
02/25/2009 09:35:02;0080;PBS_Server;Req;req_reject;Reject reply code=15039(Job rejected by all possible destinations), aux=0, type=Commit, from user at client1.domain.com
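
To pick the rejected jobs out of a longer server log, a small sketch
like the following may help. It assumes each entry occupies a single
line in the log file on disk and has six semicolon-separated fields, as
in the excerpt above. The field names (timestamp, event, daemon, kind,
name, message) and both function names are labels chosen here for
illustration, not taken from the Torque documentation.

```python
def parse_log_line(line):
    """Split one server log entry into six fields, or return None.

    Field names are assumptions inferred from the excerpt above.
    """
    parts = line.split(";", 5)  # the message itself may contain ';'
    if len(parts) != 6:
        return None
    keys = ("timestamp", "event", "daemon", "kind", "name", "message")
    return dict(zip(keys, (p.strip() for p in parts)))


def rejected_jobs(lines):
    """Return the names of Job entries whose message mentions a rejection."""
    jobs = []
    for line in lines:
        entry = parse_log_line(line)
        if entry and entry["kind"] == "Job" and "rejected" in entry["message"]:
            jobs.append(entry["name"])
    return jobs


# Three entries copied from the excerpt, each joined onto one line.
SAMPLE = [
    "02/25/2009 09:35:02;0100;PBS_Server;Req;;Type Commit request "
    "received from user at client1.domain.com, sock=10",
    "02/25/2009 09:35:02;0100;PBS_Server;Job;"
    "2799.bmiclustersvcd1.cchmc.org;enqueuing into routing, state 1 hop 1",
    "02/25/2009 09:35:02;0008;PBS_Server;Job;"
    "2799.bmiclustersvcd1.cchmc.org;Job rejected by all possible destinations",
]

print(rejected_jobs(SAMPLE))
```

The Req entry carrying the reject reply is skipped on purpose, since it
has a different object type in the fourth field; only the Job entry is
reported.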
Any suggestions/ideas please?
Thanks,
Prakash