[torqueusers] pbs_server error 15008

Adrian Sevcenco Adrian.Sevcenco at cern.ch
Wed Apr 30 14:31:39 MDT 2008


Steve Snelgrove wrote:
> You should do the command "momctl -d3" on the mom and see if the server 
> address is in the trusted client list.  If not, you are experiencing the 
> effect of my goof in the code that is now corrected in the latest 
> snapshots.
> 
> The latest snapshots can be obtained at 
> http://www.clusterresources.com/downloads/torque/snapshots/.
Hi! Thanks for looking into this. I don't have momctl on the nodes, and on 
the server it gives:
[root at grid01 bin]# momctl -d3
ERROR:    query[0] 'diag3' failed on localhost (errno: 0:5)
This is version 2.1.9, packaged by EGEE for the gLite (grid) install,
so I think the problem is something in my configuration.
Thank you,
Adrian
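
One way to narrow down the "unauthorized server" message quoted below is to check, on the node, which IPv4 addresses the server's hostname resolves to and whether the source address reported in mom_log is among them. The following is only a minimal sketch of that check: the function names resolve_ipv4 and source_is_known are made up for illustration, the hostname and address in the usage part are placeholders, and it assumes (not confirmed here) that the mom authorizes connections by comparing the connecting address with the addresses it knows for the server.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname.

    The local resolver consults /etc/hosts and DNS according to the
    system configuration. An unresolvable name yields an empty list.
    """
    try:
        infos = socket.getaddrinfo(
            hostname, None, socket.AF_INET, socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


def source_is_known(hostname, source_ip):
    """True if source_ip is one of the addresses hostname resolves to."""
    return source_ip in resolve_ipv4(hostname)


if __name__ == "__main__":
    server_name = "grid01.x.x"   # placeholder: server hostname as in the logs
    seen_source = "192.0.2.10"   # placeholder: address shown in mom_log
    print("resolved:", resolve_ipv4(server_name))
    print("source known:", source_is_known(server_name, seen_source))
```

If the address from mom_log is not in the resolved list, the server is reaching the node from an interface other than the one named in hosts, which would match a connection arriving from the public IP while only the private IP is listed.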

> Adrian Sevcenco wrote:
>> Hi! I have an installation of TORQUE and I am trying to send some test 
>> jobs, but all jobs stop with the status Deferred, and on the server 
>> side I receive errors of this type:
>> 04/30/2008 22:37:57;0040;PBS_Server;Svr;grid01.x.x;Scheduler sent 
>> command new
>> 04/30/2008 22:37:57;0100;PBS_Server;Req;;Type ModifyJob request 
>> received from root at grid01.x.x, sock=9
>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;Job Modified at 
>> request of root at grid01.x.x
>> 04/30/2008 22:37:57;0100;PBS_Server;Req;;Type RunJob request received 
>> from root at grid01.x.x, sock=9
>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;Job Run at 
>> request of root at grid01.x.x
>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;send of job to 
>> wn02 failed error = 15008
>> 04/30/2008 22:37:57;0001;PBS_Server;Svr;PBS_Server;Access from host 
>> not allowed, or unknown host (15008) in send_job, child failed in 
>> previous commit request for job 133.grid01.x.x
>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;unable to run 
>> job, MOM rejected/rc=1
>> 04/30/2008 22:37:57;0080;PBS_Server;Req;req_reject;Reject reply 
>> code=15041(Execution server rejected request MSG=cannot send job to 
>> mom, state=PRERUN), aux=0, type=RunJob, from root at grid01.x.x
>>
>> On the node I have this in mom_log:
>> 1193  04/30/2008 22:18:33;0008;   pbs_mom;Job;process_request;request 
>> type QueueJob from host grid01.x.x rejected (host not authorized)
>>   1194  04/30/2008 22:18:33;0080;   pbs_mom;Req;req_reject;Reject 
>> reply code=15008(Access from host not allowed, or unknown host 
>> MSG=request not authorized), aux=0, type=QueueJob, from 
>> PBS_Server at grid01.x.x
>>
>> I have public key identification. I don't know how to pursue the 
>> problem, and I would appreciate any advice on finding it.
>> I also have:
>> 1195  04/30/2008 22:19:32;0001;   pbs_mom;Svr;pbs_mom;is_request, bad 
>> connect from public_ip:1023 - unauthorized server
>> But in hosts I put only the private IP. Why are the hostname and 
>> private IP that I put in hosts not used?
>> Thank you for any advice you can give me.
>> Best regards,
>> Adrian
>>
>> -------------------------------------------------------
>> Adrian Sevcenco - Institute of Space Sciences, Romania
>> -------------------------------------------------------