[torqueusers] qrerun fails due to Unauthorized Request

David Beer dbeer at adaptivecomputing.com
Fri Nov 18 09:59:36 MST 2011


Are the superuser and/or your user on that box listed as managers on pbs_server? You need manager privileges to qrerun a job.
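
A rough sketch of how this could be checked, assuming it is run on the pbs_server host by an account that is already a manager (typically root). The helper names show_admins and add_manager are made up for illustration; only the qmgr invocations inside them are real commands. Operators may also be permitted to rerun jobs, so the first helper prints both lists.

```shell
#!/bin/sh
# Sketch only: inspect and extend the pbs_server manager list with qmgr.
# Helper names are hypothetical; adapt the user@host value to your site.

# Print the managers/operators lines from the server configuration.
show_admins() {
    qmgr -c 'print server' | grep -E 'managers|operators'
}

# Append one user@host entry to the managers list.
# $1 is the entry to add, e.g. roboos@mentat001.dccn.nl (example value).
add_manager() {
    qmgr -c "set server managers += $1"
}
```

Usage would be something like "show_admins" to see who is currently listed, then "add_manager roboos@mentat001.dccn.nl" and a retry of the qrerun from that host. Note that the entry is matched per user and per host, so root on the client machine is not covered by an entry for root on the server.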

David

----- Original Message -----
> Dear torque users,
> 
> I am trying to use qrerun in a shell script to deal with the
> (potential) limit in available MATLAB licenses. Let me briefly
> outline the idea before explaining the problem.
> 
> I have a shell script that starts MATLAB with the "-r <filename>"
> option for a MATLAB script. In case there is no license available,
> MATLAB returns immediately with a descriptive error about the
> license failure. I would like to catch that error and if it happens,
> issue "qalter -h u JOBID" and "qrerun JOBID" to reschedule the job
> for execution at a later time. Note that I am aware of the ability
> to configure floating resources in moab, but I am using maui.
> Furthermore, the floating resources for the Matlab license don't
> optimally represent the license requirements for scheduling multiple
> jobs by the same user on a multicore machine. Hence I prefer to use
> qrerun instead of making the license a managed resource.
> 
> The problem I run into can be summarized in the following snippet
> from the command line. I schedule a simple job that subsequently
> starts running on one of the execution hosts:
> 
> roboos@mentat001> echo sleep 1000 | qsub
> 45254.dccn-l014.dccn.nl
> 
> Then I try to use qrerun, first as regular user then as super user
> (which I normally would not do of course):
> 
> roboos@mentat001> qrerun 45254
> qrerun: Unauthorized Request  45254.dccn-l014.dccn.nl
> roboos@mentat001> sudo qrerun 45254
> qrerun: Unauthorized Request  MSG=operation not permitted 45254.dccn-l014.dccn.nl
> 
> So as root/administrative user I am also not allowed to do it from
> the client machine. I am able to log in directly on the torque
> server, where as regular user I am also not allowed to qrerun. It is
> not a general failure of qrerun, since the root user on the
> torque server is allowed to use it:
> 
> roboos@mentat001> ssh torque
> roboos@torque> qrerun 45254
> qrerun: Unauthorized Request  45254.dccn-l014.dccn.nl
> roboos@torque> sudo qrerun 45254
> 
> after which the job is correctly requeued and starts over again.
> 
> To provide some info from the log files: as regular user I get the
> following message in /var/spool/torque/server_logs
> 
> 11/16/2011 09:36:55;0080;PBS_Server;Req;req_reject;Reject reply
> code=15018(Request invalid for state of job), aux=0, type=RerunJob,
> from roboos@mentat001.dccn.nl
> 
> and as root on the torque server I get
> 
> 11/16/2011 09:38:12;0080;PBS_Server;Req;req_reject;Reject reply
> code=15018(Request invalid for state of job), aux=0, type=RerunJob,
> from root@dccn-l014.dccn.nl
> 
> The log message is basically the same. In the logs on the
> execution host I cannot find anything that pertains to the failed
> qrerun request.
> 
> Does anyone have an idea on what might be the problem for the regular
> user not being allowed to restart the job? I tried the same thing on
> a different torque cluster (not managed by me) that I have access
> to, and also there it failed.
> 
> 
> best regards,
> Robert
> 
> 
> 
> -----------------------------------------------------------
> Robert Oostenveld, PhD
> Senior Researcher & MEG Physicist
> Donders Institute for Brain, Cognition and Behaviour
> Centre for Cognitive Neuroimaging
> Radboud University Nijmegen
> tel.: +31 (0)24 3619695
> e-mail: r.oostenveld at donders.ru.nl
> web: http://www.ru.nl/neuroimaging
> skype: r.oostenveld
> -----------------------------------------------------------
> 
> 
> 
> 

-- 
David Beer 
Direct Line: 801-717-3386 | Fax: 801-717-3738
     Adaptive Computing
     1712 S East Bay Blvd, Suite 300
     Provo, UT 84606


