[torqueusers] pbs_server error 15008

Brock Palen brockp at umich.edu
Wed Apr 30 14:40:45 MDT 2008


On Apr 30, 2008, at 4:31 PM, Adrian Sevcenco wrote:
> Steve Snelgrove wrote:
>> You should do the command "momctl -d3" on the mom and see if the  
>> server address is in the trusted client list.  If not, you are  
>> experiencing the effect of my goof in the code that is now  
>> corrected in the latest snapshots.
>> The latest snapshots can be obtained at
>> http://www.clusterresources.com/downloads/torque/snapshots/.
> Hi! Thanks for looking into this. I don't have momctl on the nodes,
> and on the server it gives:
> [root@grid01 bin]# momctl -d3
> ERROR:    query[0] 'diag3' failed on localhost (errno: 0:5)
> This is version 2.1.9, packaged by EGEE for the gLite (GRID) install.
> So I think it is something in my configuration.
> Thank you,
> Adrian

When you run momctl from another host, you have to tell it which  
host's mom to connect to. Otherwise it defaults to localhost.

On nyx559:
[root@nyx559 ~]# momctl -h nyx555 -d3
Host: nyx555.engin.umich.edu/nyx555.engin.umich.edu   Version: 2.1.9
<snip>
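
To run the same check against several nodes, something like the sketch below might help. It only wraps the "momctl -h <host> -d3" call shown above and searches the output for the server's name or address. The function name check_mom and the MOMCTL variable are made up for illustration, and the sketch makes no assumption about the exact layout of the momctl output beyond the server appearing somewhere in it when it is trusted.

```shell
#!/bin/sh
# Hypothetical helper, not part of TORQUE: query one MOM with
# "momctl -h <host> -d3" and report whether a given server name or
# address shows up anywhere in the diagnostic output.

# Allow the momctl binary to be overridden (assumption: it is on PATH).
MOMCTL="${MOMCTL:-momctl}"

check_mom() {
    # $1 = host whose MOM should be queried
    # $2 = server name or IP expected in the trusted client list
    # grep -F does a plain substring match, so 10.0.0.1 also matches 10.0.0.10.
    if "$MOMCTL" -h "$1" -d3 2>&1 | grep -F -q -- "$2"; then
        echo "$1: $2 appears in momctl output"
    else
        echo "$1: $2 does not appear in momctl output"
    fi
}

# Example use (host names taken from this thread, adjust to your site):
#   check_mom wn02 grid01.x.x
```

If the server is reported as missing, the full "momctl -h <host> -d3" output for that node is the next thing to look at.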

Brock Palen

>
>> Adrian Sevcenco wrote:
>>> Hi! I have an installation of torque and I am trying to send some
>>> test jobs, but all jobs stop with the status Deferred, and on the
>>> server side I receive errors of this type:
>>> 04/30/2008 22:37:57;0040;PBS_Server;Svr;grid01.x.x;Scheduler sent command new
>>> 04/30/2008 22:37:57;0100;PBS_Server;Req;;Type ModifyJob request received from root@grid01.x.x, sock=9
>>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;Job Modified at request of root@grid01.x.x
>>> 04/30/2008 22:37:57;0100;PBS_Server;Req;;Type RunJob request received from root@grid01.x.x, sock=9
>>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;Job Run at request of root@grid01.x.x
>>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;send of job to wn02 failed error = 15008
>>> 04/30/2008 22:37:57;0001;PBS_Server;Svr;PBS_Server;Access from host not allowed, or unknown host (15008) in send_job, child failed in previous commit request for job 133.grid01.x.x
>>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;unable to run job, MOM rejected/rc=1
>>> 04/30/2008 22:37:57;0080;PBS_Server;Req;req_reject;Reject reply code=15041(Execution server rejected request MSG=cannot send job to mom, state=PRERUN), aux=0, type=RunJob, from root@grid01.x.x
>>>
>>> On the node, mom_log shows this:
>>> 04/30/2008 22:18:33;0008;   pbs_mom;Job;process_request;request type QueueJob from host grid01.x.x rejected (host not authorized)
>>> 04/30/2008 22:18:33;0080;   pbs_mom;Req;req_reject;Reject reply code=15008(Access from host not allowed, or unknown host MSG=request not authorized), aux=0, type=QueueJob, from PBS_Server@grid01.x.x
>>>
>>> I have public key identification. I don't know how to pursue the
>>> problem and would appreciate any advice for finding it.
>>> I also have:
>>> 04/30/2008 22:19:32;0001;   pbs_mom;Svr;pbs_mom;is_request, bad connect from public_ip:1023 - unauthorized server
>>> But in hosts I put only the private IP. Why are the hostname and
>>> private IP that I put in hosts not used?
>>> Thank you for any advice you can give me.
>>> Best regards,
>>> Adrian
>>>
>>> -------------------------------------------------------
>>> Adrian Sevcenco - Institute of Space Sciences, Romania
>>> -------------------------------------------------------
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> torqueusers mailing list
>>> torqueusers at supercluster.org
>>> http://www.supercluster.org/mailman/listinfo/torqueusers
>>>


