[torqueusers] job failure- cannot find user in password file

shazly hmelshazly at gmail.com
Mon Jun 17 10:39:00 MDT 2013


Hi there guys,

I'm having a problem with PBS (TORQUE) and I hope someone can help me with it.

First, here is some background information:
server=hatem-Inspiron-5520
client=toma-VirtualBox
shazly= a user on the server
job id=9.hatem-Inspiron-5520

I installed and configured PBS TORQUE on a mini-cluster (server + one 
client). When I submit a job from the server, I don't get the output files, 
so I checked the mom log on the client machine and found these 
entries:

pbs_mom;Svr;mom_server_add;server hatem-Inspiron-5520 added
pbs_mom;Svr;pbs_mom;LOG_ALERT::mom_server_valid_message_source, bad connect from "the server ip"- unauthorized server
mom_server_check_connection;sending hello to server 'hatem-Inspiron-5520'
pbs_mom;LOG_ERROR::start_exec, no password entry for user 'shazly'
pbs_mom;Req;send_sisters;sending ABORT to sisters for job '9.hatem-Inspiron-5520'
pbs_mom;Svr;pbs_mom;LOG_ERROR::sucess(0) in fork_to_user, cannot find 'shazly' in password file
pbs_mom;Req;req_reject;Reject reply code=15025(BAD UID for job execution REJHOST=toma-virtualbox MSG=cannot find user 'shazly' in password file), aux=0, type=CopyFiles, from PBS_Server at hatem-Inspiron-5520
pbs_mom;Svr;pbs_mom;LOG_ERROR::Inappropriate ioctl for device (25) in req_cpyfile, fork_to_user failed with rc=-15025 'cannot find user 'shazly' in password file'-returning failure
pbs_mom;Job;removed job script

Also, when I run "qstat -f" on the server after submitting the job, I get:
sched_hint=Post Job file processing error; job 9.hatem-Inspiron-5520 on host 
toma-VirtualBox/0 BAD UID for job execution REJHOST=toma-virtualBox 
MSG=cannot find user 'shazly' in password file
exit_status=-1
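
Both the mom log and the qstat output say that 'shazly' cannot be found in the password file on toma-VirtualBox. Below is a minimal sketch of how that lookup could be reproduced by hand on the client. It assumes Python is installed there and that the check pbs_mom performs corresponds to an ordinary passwd database lookup. The function name lookup_user is hypothetical and is not part of TORQUE.

```python
import pwd
import sys


def lookup_user(username):
    """Return the passwd entry for username, or None if no entry exists.

    pwd.getpwnam raises KeyError when the name is not in the password
    database, which is the condition the mom log appears to describe.
    """
    try:
        return pwd.getpwnam(username)
    except KeyError:
        return None


if __name__ == "__main__":
    # Default to the user name from the log; another name can be passed
    # as the first command-line argument.
    name = sys.argv[1] if len(sys.argv) > 1 else "shazly"
    entry = lookup_user(name)
    if entry is None:
        print("no password entry for user '%s' on this host" % name)
    else:
        print("found '%s': uid=%d gid=%d home=%s"
              % (entry.pw_name, entry.pw_uid, entry.pw_gid, entry.pw_dir))
```

Running this on both machines and comparing the output would show whether the account exists on the client at all, and whether its uid and gid match the ones on the server.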

Everything in /etc/hosts looks correct, I can ssh without a password from 
the server to the client and vice versa, and I can ping both IPs. I'm 
frustrated here, so any help is appreciated.

Thanks



