[torqueusers] pbs_mom no password entry for user
Mary Ellen Fitzpatrick
mfitzpat at bu.edu
Thu Jul 31 14:15:39 MDT 2008
Hi,
I have installed and configured torque-2.3.1 and maui-3.2.6p18 on my head
node, nona-man. I thought I had everything configured correctly, but
apparently not.
When I submit a test job, I get the following error on compute node node1003:
07/31/2008 15:52:55;0008; pbs_mom;Job;process_request;request type
CopyFiles from host nona-man allowed
07/31/2008 15:52:55;0008; pbs_mom;Job;11.nona-man;attempting to copy
file 'nona-man:/home/mef/test3.out'
07/31/2008 15:52:55;0001; pbs_mom;Svr;pbs_mom;Success (0) in
fork_to_user, cannot find user 'mef' in password file
07/31/2008 15:52:55;0080; pbs_mom;Req;req_reject;Reject reply
code=15023(Bad UID for job execution REJHOST=node1003 MSG=cannot find
user 'mef' in password file), aux=0, type=CopyFiles, from
PBS_Server at nona-man
But I can run ypcat passwd on node1003 and the account mef is listed.
Not sure why I am getting that error.
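
One way to narrow this down is to check whether the account resolves through the system's normal lookup path on node1003, and not only through ypcat. ypcat queries the NIS map directly, while a daemon that calls getpwnam() goes through the name service switch (/etc/nsswitch.conf), so the two can disagree. The sketch below assumes that pbs_mom resolves users via getpwnam(), which the "cannot find user ... in password file" message suggests but which I have not verified; the helper names lookup_user and describe are made up for illustration.

```python
import pwd
import sys


def lookup_user(name):
    """Return the passwd entry for `name` via getpwnam(), or None if absent.

    pwd.getpwnam() uses the C library resolver, so it honours the passwd
    line in /etc/nsswitch.conf (files, nis, ...), unlike ypcat.
    """
    try:
        return pwd.getpwnam(name)
    except KeyError:
        return None


def describe(name):
    """Build a one-line summary of the lookup result for `name`."""
    entry = lookup_user(name)
    if entry is None:
        return f"{name}: not resolvable via getpwnam()"
    return f"{name}: uid={entry.pw_uid} gid={entry.pw_gid} home={entry.pw_dir}"


if __name__ == "__main__":
    # Default to the account from the log; pass other names as arguments.
    for user in sys.argv[1:] or ["mef"]:
        print(describe(user))
```

If this reports the user as not resolvable on node1003 while ypcat passwd still lists it, the passwd line in /etc/nsswitch.conf on that node would be the first thing to look at. It may also be worth running it as the same user that pbs_mom runs as.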
Also, regarding the "Bad UID for job execution" error, qmgr lists my
acl_hosts and submit_hosts settings as:
set server acl_host_enable = True
set server acl_hosts = *
set server acl_hosts += nona-man
set server submit_hosts += nona-man
/var/log/messages on the head node has the following errors:
Jul 31 15:52:20 nona-man PBS_Server: stream_eof, connection to node1003
is bad, remote service may be down, message may be corrupt, or
connection may have been dropped remotely (End of File). setting node
state to down
Jul 31 15:52:50 node1003 pbs_mom: start_exec, no password entry for user
mef
Jul 31 15:52:50 nona-man kernel: maui[20200]: segfault at
0000000000000694 rip 0000000000456a21 rsp 000007fbff4bb00 error 4
Jul 31 15:52:50 node1003 pbs_mom: open_std_file, cannot determine filename
Jul 31 15:52:55 node1003 pbs_mom: open_std_file, cannot determine filename
Jul 31 15:52:55 node1003 pbs_mom: Success (0) in fork_to_user, cannot
find user 'mef' in password file
Jul 31 15:52:55 node1003 pbs_mom: Inappropriate ioctl for device (25) in
req_cpyfile, fork_to_user failed with rc=-15023 'cannot find user 'mef'
in password file' - returning failure
When I run pbsnodes -a, node1003 is listed as "free".
And just in case that is not enough, maui crashes as indicated by the
segfault above.
Any advice or help would be great.
--
Thanks
Mary Ellen