[torqueusers] pbs_server error 15008

Steve Snelgrove ssnelgrove at clusterresources.com
Wed Apr 30 14:52:08 MDT 2008


Okay, the problem that I was referring to is not in 2.1.9.

However, it is always helpful to run the momctl command to see if the 
mom is happy and is talking to the server.

As Brock says, you can run momctl from the server node, but you must 
give the "-h" option to specify which node's mom to contact for the diag info.
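
For anyone who wants to script that check across several nodes, here is a rough Python sketch (not part of TORQUE). It assumes the "momctl -d3" output contains a "Trusted Client List:" line with comma-separated addresses, possibly carrying a ":port" suffix; if your version prints something different, the parsing will need adjusting. The function names are made up for illustration.

```python
import subprocess


def momctl_command(node, level=3):
    """Build the argv for querying a remote mom, e.g. momctl -h wn02 -d3."""
    return ["momctl", "-h", node, f"-d{level}"]


def trusted_clients(diag_output):
    """Return the addresses found on the 'Trusted Client List:' line.

    Assumption: entries are comma-separated IPv4 addresses, optionally
    followed by ':port'. Returns an empty list if no such line exists.
    """
    for line in diag_output.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip().lower() == "trusted client list":
            entries = [entry.strip() for entry in value.split(",")]
            # Drop an optional ':port' suffix from each entry.
            return [entry.split(":")[0] for entry in entries if entry]
    return []


def server_is_trusted(diag_output, server_ip):
    """True if server_ip appears in the mom's trusted client list."""
    return server_ip in trusted_clients(diag_output)


def check_node(node, server_ip):
    """Run momctl against one node and report whether the server is trusted."""
    result = subprocess.run(
        momctl_command(node), capture_output=True, text=True, check=False
    )
    return server_is_trusted(result.stdout, server_ip)
```

For example, check_node("wn02", "10.0.0.1") would run "momctl -h wn02 -d3" from the server node and report whether 10.0.0.1 (a placeholder for the server's address) shows up in that list.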


Brock Palen wrote:
> On Apr 30, 2008, at 4:31 PM, Adrian Sevcenco wrote:
>> Steve Snelgrove wrote:
>>> You should do the command "momctl -d3" on the mom and see if the 
>>> server address is in the trusted client list.  If not, you are 
>>> experiencing the effect of my goof in the code that is now corrected 
>>> in the latest snapshots.
>>> The latest snapshots can be obtained at 
>>> http://www.clusterresources.com/downloads/torque/snapshots/.
>> Hi! Thanks for looking into this. I don't have momctl on the nodes, and 
>> on the server it gives:
>> [root@grid01 bin]# momctl -d3
>> ERROR:    query[0] 'diag3' failed on localhost (errno: 0:5)
>> This is version 2.1.9, packaged by EGEE for the gLite (grid) install.
>> So I am thinking that it is something in my configuration.
>> Thank you,
>> Adrian
>
> When you run momctl from another host, you have to tell it which 
> host's mom to connect to. Otherwise it defaults to localhost.
>
> On nyx559
> [root@nyx559 ~]# momctl -h nyx555 -d3
> Host: nyx555.engin.umich.edu/nyx555.engin.umich.edu   Version: 2.1.9
> <snip>
>
> Brock Palen
>
>>
>>> Adrian Sevcenco wrote:
>>>> Hi! I have an installation of TORQUE and I am trying to send some test 
>>>> jobs, but all jobs stop with the status Deferred and I receive 
>>>> this type of error on the server side:
>>>> 04/30/2008 22:37:57;0040;PBS_Server;Svr;grid01.x.x;Scheduler sent command new
>>>> 04/30/2008 22:37:57;0100;PBS_Server;Req;;Type ModifyJob request received from root@grid01.x.x, sock=9
>>>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;Job Modified at request of root@grid01.x.x
>>>> 04/30/2008 22:37:57;0100;PBS_Server;Req;;Type RunJob request received from root@grid01.x.x, sock=9
>>>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;Job Run at request of root@grid01.x.x
>>>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;send of job to wn02 failed error = 15008
>>>> 04/30/2008 22:37:57;0001;PBS_Server;Svr;PBS_Server;Access from host not allowed, or unknown host (15008) in send_job, child failed in previous commit request for job 133.grid01.x.x
>>>> 04/30/2008 22:37:57;0008;PBS_Server;Job;133.grid01.x.x;unable to run job, MOM rejected/rc=1
>>>> 04/30/2008 22:37:57;0080;PBS_Server;Req;req_reject;Reject reply code=15041(Execution server rejected request MSG=cannot send job to mom, state=PRERUN), aux=0, type=RunJob, from root@grid01.x.x
>>>>
>>>> On the node, mom_log shows this:
>>>> 04/30/2008 22:18:33;0008;   pbs_mom;Job;process_request;request type QueueJob from host grid01.x.x rejected (host not authorized)
>>>> 04/30/2008 22:18:33;0080;   pbs_mom;Req;req_reject;Reject reply code=15008(Access from host not allowed, or unknown host MSG=request not authorized), aux=0, type=QueueJob, from PBS_Server@grid01.x.x
>>>>
>>>> I have public key identification. I don't know how to pursue the 
>>>> problem and would appreciate any advice on finding it.
>>>> I also see:
>>>> 04/30/2008 22:19:32;0001;   pbs_mom;Svr;pbs_mom;is_request, bad connect from public_ip:1023 - unauthorized server
>>>> But in hosts I put only the private IP. Why are the hostname and 
>>>> private IP that I put in hosts not used?
>>>> Thank you for any advice you can give me.
>>>> Best regards,
>>>> Adrian
>>>>
>>>> -------------------------------------------------------
>>>> Adrian Sevcenco - Institute of Space Sciences, Romania
>>>> -------------------------------------------------------
>>>>
>>>>
>>>> _______________________________________________
>>>> torqueusers mailing list
>>>> torqueusers at supercluster.org
>>>> http://www.supercluster.org/mailman/listinfo/torqueusers
>>>>