[torqueusers] Bad UID for job execution MSG=ruserok failed validating ...

Carlos Borroto carlos.borroto at gmail.com
Wed Nov 7 15:21:01 MST 2012


Hi,

After a very frustrating day trying to fix this by myself, I give up
and am asking for help.

I'm installing the latest Torque version (4.1.3) on CentOS 6. I would
like to avoid using rsh/ssh. If I understood the documentation[1]
correctly, there are three ways I could accomplish this: by setting
"submit_hosts", by setting "allow_node_submit", or by enabling munge
(./configure --enable-munge-auth). None of them seems to work for me.
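
For reference, here is a minimal sketch of how the first two options map
to qmgr commands. The helper name submit_host_cmds is hypothetical; it
only prints the commands so they can be reviewed before being run as a
manager on the head node. The attribute names and the host name are the
ones that appear in this post.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the qmgr commands for the
# "submit_hosts" and "allow_node_submit" options.
submit_host_cmds() {
    host=$1
    # Append the host to the server's list of trusted submit hosts.
    printf 'qmgr -c "set server submit_hosts += %s"\n' "$host"
    # Allow job submission from any host listed as a compute node.
    printf 'qmgr -c "set server allow_node_submit = True"\n'
}

submit_host_cmds borroto-lx.domain.local
```

Printing instead of executing keeps the sketch side-effect free; the
output can be pasted into a shell on the server once it looks right.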

No matter what I do, I keep getting this error when submitting from
any host other than the head node:
$ echo 'sleep 10' | qsub
qsub: submit error (Bad UID for job execution MSG=ruserok failed
validating cborroto/cborroto from borroto-lx.domain.local)

This is my server configuration:
# qmgr -c "p s"
#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 01:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server managers = cborroto@services-lx.domain.local
set server operators = cborroto@services-lx.domain.local
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 300
set server job_stat_rate = 45
set server poll_jobs = True
set server mom_job_sync = True
set server keep_completed = 300
set server submit_hosts = borroto-lx.domain.local
set server allow_node_submit = True
set server next_job_number = 9
set server moab_array_compatible = True

rsh is not installed on any of the systems, and I would like to keep
it that way. The system users come from LDAP, in case this is relevant.
The exact same configuration works with the torque 2.5 packages from
EPEL, which require munge. I'm OK with using munge, but I haven't
figured out why it does not work with the packages I'm compiling. I
can't even tell whether torque is querying munge at all.

config.log:
$ ./configure --enable-munge-auth --enable-drmaa
...
configure:23972: checking whether to support munge authorization
configure:23975: result: yes
...

Testing whether munge is correctly configured:
[cborroto@borroto-lx ~]$ munge -n | ssh services-lx.domain.local unmunge
STATUS:           Success (0)
ENCODE_HOST:      borroto-lx.domain.local (X.X.X.X)
ENCODE_TIME:      2012-11-07 17:14:38 (1352326478)
DECODE_TIME:      2012-11-07 17:14:37 (1352326477)
TTL:              300
CIPHER:           aes128 (4)
MAC:              sha1 (3)
ZIP:              none (0)
UID:              cborroto (1002)
GID:              cborroto (1002)
LENGTH:           0
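
The same check can be scripted so that it is easy to repeat from each
submit host. This is only a sketch: unmunge_ok is a hypothetical helper
name, and it assumes the STATUS line has the format shown in the output
above.

```shell
#!/bin/sh
# Hypothetical helper: read unmunge output on stdin and succeed only if
# a STATUS line is present and reports "Success (0)".
unmunge_ok() {
    awk '
        /^STATUS:/ { found = 1; ok = ($2 == "Success" && $3 == "(0)") }
        END        { exit !(found && ok) }
    '
}

# Example usage (needs munge on both hosts, so it is left commented out):
#   munge -n | ssh services-lx.domain.local unmunge | unmunge_ok \
#       && echo "credential accepted by the server"
```

This only confirms that munge itself works between the two hosts; it
says nothing about whether torque is actually using it.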

[1]http://www.adaptivecomputing.com/resources/docs/torque/4-0-2/help.htm#topics/1-installConfig/serverConfig.htm#configJobSubHost

Any help will be highly appreciated,
Thanks,
Carlos

