[torqueusers] qsub: Bad UID for job execution
Santiago Iturriaga
siturria at fing.edu.uy
Fri Mar 26 05:31:34 MDT 2010
More info on the problem. I changed log_events from 64 to 511 and I
got the following error log:
03/26/2010 11:26:27;0100;PBS_Server;Req;;Type AuthenticateUser request received from siturria at cluster.fing.edu.uy, sock=11
03/26/2010 11:26:27;0100;PBS_Server;Req;;Type QueueJob request received from siturria at cluster.fing.edu.uy, sock=10
03/26/2010 11:26:27;0080;PBS_Server;Req;req_reject;Reject reply code=15023(Bad UID for job execution), aux=0, type=QueueJob, from siturria at cluster.fing.edu.uy
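
As far as I can tell, log_events is a bitmask, and the second field of each
log record is the event class of that record in hexadecimal. That would
explain why nothing was logged with log_events = 64. Here is a minimal Python
sketch of the idea. The bit values are an assumption (they are what I believe
TORQUE's log.h defines, so check them against the headers of your version),
and the helper names decode_mask and parse_record are made up for this
example.

```python
# Event-class bits. ASSUMPTION: these mirror the PBSEVENT_* values in
# TORQUE's log.h; verify them against your own version's headers.
EVENT_BITS = {
    0x001: "error",
    0x002: "system",
    0x004: "admin",
    0x008: "job",
    0x010: "job_usage",
    0x020: "security",
    0x040: "sched",
    0x080: "debug",
    0x100: "debug2",
}


def decode_mask(mask):
    """Return the names of the event classes enabled in a bitmask."""
    return [name for bit, name in sorted(EVENT_BITS.items()) if mask & bit]


def parse_record(line):
    """Split one server log record into its six semicolon-separated fields.

    The message itself may contain semicolons, so split at most 5 times.
    """
    timestamp, event, daemon, obj_class, obj_name, message = line.split(";", 5)
    return {
        "timestamp": timestamp,
        "event": int(event, 16),  # the event field is hexadecimal
        "daemon": daemon,
        "object_class": obj_class,
        "object_name": obj_name,
        "message": message,
    }


if __name__ == "__main__":
    reject = parse_record(
        "03/26/2010 11:26:27;0080;PBS_Server;Req;req_reject;"
        "Reject reply code=15023(Bad UID for job execution), aux=0, "
        "type=QueueJob, from siturria at cluster.fing.edu.uy"
    )
    print(decode_mask(64))               # classes enabled by the old setting
    print(decode_mask(511))              # classes enabled by the new setting
    print(decode_mask(reject["event"]))  # class of the reject record
    print(bool(64 & reject["event"]))    # was it visible with log_events = 64?
```

Under those assumptions, 64 enables only the scheduler class, while the
reject record belongs to the debug class (0080), so it only shows up once the
debug bits are set, as they are with 511.
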
On 03/26/2010 10:50:31 AM, Santiago Iturriaga wrote:
> I'm trying to set up TORQUE 2.1.8, and every time I try to submit a job
> from the pbs_server node I get this error: "qsub: Bad UID for job
> execution". There's no log entry in the server_logs when the error
> occurs.
>
> I'm not using root to submit the job, and the user is shared among
> all the cluster nodes. When I execute `id siturria` I get the same UID
> on all the nodes:
>
> [siturria at node02 hola_mundo]$ id siturria
> uid=524(siturria) gid=501(clusterusers) groups=501(clusterusers),515(pbs),516(maui)
>
> The pbs_server node has two network interfaces (and two names):
> cluster.fing.edu.uy (164.73.47.186) and node01.cluster.fing
> (192.168.242.1). The pbs_server is configured to run with the hostname
> cluster.fing.edu.uy.
>
> Here's my pbs_server configuration (I modified a couple of things
> while trying to identify the problem, so there's some unnecessary
> stuff in the configuration):
>
> Qmgr: print server
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue workq
> #
> create queue workq
> set queue workq queue_type = Execution
> set queue workq resources_max.cput = 10000:00:00
> set queue workq resources_max.ncpus = 64
> set queue workq resources_max.nodect = 8
> set queue workq resources_max.walltime = 10000:00:00
> set queue workq resources_min.cput = 00:00:01
> set queue workq resources_min.ncpus = 1
> set queue workq resources_min.nodect = 1
> set queue workq resources_min.walltime = 00:00:01
> set queue workq resources_default.cput = 10000:00:00
> set queue workq resources_default.ncpus = 1
> set queue workq resources_default.nodect = 1
> set queue workq resources_default.walltime = 10000:00:00
> set queue workq resources_available.nodect = 8
> set queue workq enabled = True
> set queue workq started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server managers = root at cluster.fing.edu.uy
> set server managers += siturria at cluster.fing.edu.uy
> set server managers += siturria at node02.cluster.fing
> set server managers += siturria at node01.cluster.fing
> set server operators = root at cluster.fing.edu.uy
> set server operators += siturria at cluster.fing.edu.uy
> set server operators += siturria at node02.cluster.fing
> set server operators += siturria at node01.cluster.fing
> set server default_queue = workq
> set server log_events = 64
> set server mail_from = adm
> set server query_other_jobs = True
> set server resources_available.ncpus = 64
> set server resources_available.nodect = 8
> set server resources_available.nodes = 8
> set server resources_max.ncpus = 64
> set server resources_max.nodes = 8
> set server scheduler_iteration = 60
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server pbs_version = 2.1.8
> set server submit_hosts = node01.cluster.fing
> set server submit_hosts += cluster.fing.edu.uy
> set server submit_hosts += node02.cluster.fing
> set server allow_node_submit = True
>
> Regards,
> Santiago.
>
>