[torqueusers] qsub: Bad UID for job execution

Santiago Iturriaga siturria at fing.edu.uy
Fri Mar 26 05:31:34 MDT 2010


More info on the problem: I changed log_events from 64 to 511 and got the 
following in the server log:

03/26/2010 11:26:27;0100;PBS_Server;Req;;Type AuthenticateUser request 
received from siturria at cluster.fing.edu.uy, sock=11
03/26/2010 11:26:27;0100;PBS_Server;Req;;Type QueueJob request received 
from siturria at cluster.fing.edu.uy, sock=10
03/26/2010 11:26:27;0080;PBS_Server;Req;req_reject;Reject reply 
code=15023(Bad UID for job execution), aux=0, type=QueueJob, from 
siturria at cluster.fing.edu.uy
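
For anyone following along, the log level change above can be done with qmgr; a sketch (the spool path is the TORQUE default and may differ on your install):

```shell
# Raise pbs_server's event mask so rejects are logged (511 = all event classes).
# Must be run by a TORQUE manager (e.g. root) on the server host.
qmgr -c "set server log_events = 511"

# Then watch today's server log while re-running qsub.
# /var/spool/torque is the default spool; adjust for your installation.
tail -f /var/spool/torque/server_logs/$(date +%Y%m%d)
```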

On 03/26/2010 10:50:31 AM, Santiago Iturriaga wrote:
> I'm trying to set up TORQUE 2.1.8 and every time I try to submit a job 
> from the pbs_server node I keep getting this error: "qsub: Bad UID for 
> job execution". There's no log entry in the server_logs when the error 
> occurs.
> 
> I'm not using root to submit the job and the user is shared among 
> all the cluster nodes. When I execute `id siturria` I get the same 
> UID on all the nodes:
> 
> [siturria at node02 hola_mundo]$ id siturria
> uid=524(siturria) gid=501(clusterusers) groups=501(clusterusers),515
> (pbs),516(maui)
> 
> The pbs_server node has two network interfaces (and two names): 
> cluster.fing.edu.uy (164.73.47.186) and node01.cluster.fing 
> (192.168.242.1). The pbs_server is configured to run with the 
> hostname cluster.fing.edu.uy.
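
Since the server has two names, it may be worth checking that each name and its address round-trip cleanly through the resolver; a mismatch between the name a client connects with and the name the server resolves the connection back to is a common cause of rejected submissions. A rough sketch (`getent` consults the same name service order as the daemons):

```shell
# Forward-and-reverse lookup check for one hostname.
roundtrip() {
    # First address the resolver returns for the name.
    addr=$(getent hosts "$1" | awk '{print $1; exit}')
    [ -n "$addr" ] || { echo "$1 -> UNRESOLVED"; return 1; }
    # Name the address maps back to.
    back=$(getent hosts "$addr" | awk '{print $2; exit}')
    echo "$1 -> $addr -> $back"
}

# On this cluster you would check both of the server's names, e.g.:
#   roundtrip cluster.fing.edu.uy
#   roundtrip node01.cluster.fing
roundtrip localhost
```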
> 
> Here's my pbs_server configuration (I modified a couple of 
> things while trying to identify the problem, so there is some 
> unnecessary stuff in it): 
> 
> Qmgr: print server
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue workq
> #
> create queue workq
> set queue workq queue_type = Execution
> set queue workq resources_max.cput = 10000:00:00
> set queue workq resources_max.ncpus = 64
> set queue workq resources_max.nodect = 8
> set queue workq resources_max.walltime = 10000:00:00
> set queue workq resources_min.cput = 00:00:01
> set queue workq resources_min.ncpus = 1
> set queue workq resources_min.nodect = 1
> set queue workq resources_min.walltime = 00:00:01
> set queue workq resources_default.cput = 10000:00:00
> set queue workq resources_default.ncpus = 1
> set queue workq resources_default.nodect = 1
> set queue workq resources_default.walltime = 10000:00:00
> set queue workq resources_available.nodect = 8
> set queue workq enabled = True
> set queue workq started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server managers = root at cluster.fing.edu.uy
> set server managers += siturria at cluster.fing.edu.uy
> set server managers += siturria at node02.cluster.fing
> set server managers += siturria at node01.cluster.fing
> set server operators = root at cluster.fing.edu.uy
> set server operators += siturria at cluster.fing.edu.uy
> set server operators += siturria at node02.cluster.fing
> set server operators += siturria at node01.cluster.fing
> set server default_queue = workq
> set server log_events = 64
> set server mail_from = adm
> set server query_other_jobs = True
> set server resources_available.ncpus = 64
> set server resources_available.nodect = 8
> set server resources_available.nodes = 8
> set server resources_max.ncpus = 64
> set server resources_max.nodes = 8
> set server scheduler_iteration = 60
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server pbs_version = 2.1.8
> set server submit_hosts = node01.cluster.fing
> set server submit_hosts += cluster.fing.edu.uy
> set server submit_hosts += node02.cluster.fing
> set server allow_node_submit = True
> 
> Regards,
> Santiago.
> 
> 
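
For what it's worth, the 15023 reject is the server-side user/host authorization check, so besides matching UIDs the submitting host itself has to be trusted by the server. The dump above already contains the relevant knobs; the sketch below restates them as general TORQUE behaviour rather than a diagnosis, and with a multi-homed server the names must match what the server resolves the connecting address back to:

```shell
# Any one of these mechanisms is enough to authorize a submitting host:
qmgr -c "set server allow_node_submit = True"             # any host in the nodes file
qmgr -c "set server submit_hosts += node01.cluster.fing"  # explicit host list

# Or, outside qmgr, list the submitting host in /etc/hosts.equiv
# on the server node.
```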
