[torqueusers] qsub: Bad UID for job execution

Torsten Rohlfing rohlfing at ieee.org
Fri Mar 31 13:09:11 MST 2006


Hi everyone!

I guess I have to ask about this myself now, because I have had the same
problem (I should say symptom) with one of my machines for a long time,
and none of the proposed solutions works for me.

Here's what I have: 12 machines in the cluster, one of them the server,
running torque-2.0.0p7 (the problem has persisted since 1.2.something),
and the other 11 compute nodes. All machines use the same NIS server. All
compute nodes are in the server's /etc/hosts.equiv. The server is in all
nodes' hosts.equiv, just for good measure. I just set the
ALLOWCOMPUTEHOSTSUBMIT flag in torque.cfg also, since I wasn't aware of
that one before (yes, I restarted the pbs_server process).
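
One cheap check for the setup described above is whether every node name really appears in the server's hosts.equiv in the form the server sees it (short name versus fully qualified name is an easy thing to get wrong). The following is only a sketch under assumptions: the function names are made up for illustration, '#' is treated as a comment marker, entries starting with '+' or '-' are skipped rather than interpreted, and exact string comparison is a simplification of whatever matching the server actually performs.

```python
def parse_hosts_equiv(text):
    """Return the set of plain host names found in hosts.equiv-style text.

    Assumptions: '#' starts a comment, and entries beginning with '+' or '-'
    (wildcards, netgroups, denials) are skipped instead of interpreted.
    """
    hosts = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line[0] in "+-":
            continue
        # The first field is the host name; an optional user field may follow.
        hosts.add(line.split()[0].lower())
    return hosts


def unlisted(names, equiv_text):
    """Return the names (in input order) that have no exact entry in the file."""
    listed = parse_hosts_equiv(equiv_text)
    return [name for name in names if name.lower() not in listed]


if __name__ == "__main__":
    # Hypothetical usage: pass the names exactly as the server resolves them.
    with open("/etc/hosts.equiv") as handle:
        print(unlisted(["node01", "node02"], handle.read()))
```

Feeding it the name that a reverse lookup on the server returns for the failing node, rather than the name typed by hand, is the more telling test.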

Now here's the funny thing - all my compute nodes (and the server, which
is also a compute node) can submit jobs EXCEPT one of the compute nodes.
The only difference I can remotely think of between that compute node
and all the others is that this one used to be the server in a torque
test installation before I got the actual server. Yet, I have checked
all config files many times, and all the compute nodes have essentially
(except for number of CPUs, max loads, etc) the same configs, including
the one that cannot submit jobs.
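
Checking configs by eye is error-prone, so a mechanical comparison between a working node and the failing one may be worth a try. This is a minimal sketch, assuming the files have already been copied to one place and read into strings; the function names and the sample directive values are illustrative only, and sorting the lines assumes that the order of entries does not matter.

```python
import difflib


def normalized(text):
    """Drop '#' comments and blank lines, strip whitespace, and sort the rest."""
    lines = (line.split("#", 1)[0].strip() for line in text.splitlines())
    return sorted(line for line in lines if line)


def config_diff(good_text, bad_text):
    """Return '-'/'+' lines for settings that differ between two config texts."""
    diff = difflib.unified_diff(
        normalized(good_text), normalized(bad_text),
        fromfile="working-node", tofile="failing-node", lineterm="", n=0)
    changed = [line for line in diff if line.startswith(("+", "-"))]
    # When anything differs, the first two entries are the '---'/'+++' file
    # headers; slicing them off leaves only the changed settings.
    return changed[2:]


if __name__ == "__main__":
    good = "$pbsserver master\n$logevent 255\n"
    bad = "$logevent 255\n$pbsserver oldmaster\n"
    for line in config_diff(good, bad):
        print(line)
```

Expected differences such as CPU counts and load limits will show up as well and have to be ignored by hand.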

So I have to ask - is there any not-so-well-known and straightforward
cause of this problem? Or is there a more fundamental solution - like
telling the server to allow submission from anywhere? All my machines are
on a private network, so I really don't care much about restricting
submissions.

Thanks for your help!
  Torsten
> On Thu, Mar 30, 2006 at 05:29:57PM -0500 or thereabouts, Doug Renfrew wrote:
> > Hi Gang,
> >
> > I am having trouble setting things up so that users can submit jobs
> > from the compute hosts. I have a pretty simple setup. Machine 1 is
> > acting as the PBS server, the NFS server, and the NIS server. Machines
> > 2-16 are acting as PBS clients, NFS clients, and NIS clients. In the
> > torque.cfg file on the PBS server I have set ALLOWCOMPUTEHOSTSUBMIT
> > true. Users can log in to any of the machines and use qstat, qmgr,
> > pbsnodes, etc., but qsub fails with the error below.
> >
> > qsub: Bad UID for job execution
> >
>
> Hi Doug,
>
>  A well-known one, I am sure you are pleased to hear.
>
>  Add your pbs clients to /etc/hosts.equiv on your pbs_server
>  host or use the newer
>
>  ALLOWCOMPUTEHOSTSUBMIT
>
>  as defined here.
>
>  http://www.clusterresources.com/products/torque/docs20/a.ktorquecfg.shtml
> >
> > Users can submit jobs from machine 1 but not from machines 2-16. Since
> > we are using NIS the user ids are the same no matter which machine the
> > user is logged into. Can anyone give me any advice on how to figure
> > out what is going on.
> >
> > Doug
> > --
> > ---------------------------------------------
> > P. Douglas Renfrew
> > Graduate Student
> > Molecular and Cellular Biophysics Program
> > Dept. Biochemistry and Biophysics
> > Unv. of North Carolina at Chapel Hill
> > ---------------------------------------------
> -- 
> Steve Traylen
> s.traylen at rl.ac.uk
> http://www.gridpp.ac.uk/


--
Torsten Rohlfing, PhD          SRI International, Neuroscience Program
  Research Scientist             333 Ravenswood Ave, Menlo Park, CA 94025
   Phone: ++1 (650) 859-3379      Fax: ++1 (650) 859-2743
    torsten at synapse.sri.com        http://www.stanford.edu/~rohlfing/

      "Though this be madness, yet there is a method in't"


