[torqueusers] pbs_mom no password entry for user
Miles O'Neal
meo at intrinsity.com
Fri Aug 1 10:30:07 MDT 2008
Mary Ellen Fitzpatrick said...
|
|Not sure what you mean by upgrading the moms. Do you mean on the
|compute nodes?
Yes, the torque-mom packages on the compute nodes.
Someone here wrote a script to make sure each
node was offline & drained, restarted the mom w/
the new software, then put the node back online.
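
A minimal sketch of that sequence in Python, not the actual script.
It assumes pbsnodes is on the PATH of the machine running it, and
that the caller supplies the restart command (restart_argv is a
hypothetical parameter; how pbs_mom gets restarted depends on how
the packages were installed). The function names are made up for
illustration.

```python
import subprocess
import time


def run_cmd(argv):
    """Run a command and return its stdout; raises if it exits non-zero."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def node_has_jobs(pbsnodes_output):
    """True if `pbsnodes <node>` output carries a non-empty 'jobs = ...' line."""
    for line in pbsnodes_output.splitlines():
        key, sep, value = line.strip().partition(" = ")
        if sep and key == "jobs" and value.strip():
            return True
    return False


def upgrade_mom(node, restart_argv, run=run_cmd, poll_seconds=60, sleep=time.sleep):
    """Offline one node, wait until it is drained, restart its mom, re-enable it."""
    run(["pbsnodes", "-o", node])        # mark offline so no new jobs land here
    while node_has_jobs(run(["pbsnodes", node])):
        sleep(poll_seconds)              # still busy: wait for running jobs to end
    run(restart_argv)                    # caller-supplied restart (assumption)
    run(["pbsnodes", "-c", node])        # clear the offline flag
```

Usage would look something like
upgrade_mom("node1003", ["ssh", "node1003", "service", "pbs_mom", "restart"]),
where the ssh/service command is only a guess at a typical init
script setup.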
|If so, I did that with torque packages created after running make
|packages in the build dir.
|any other hints as to what solved the issue for you?
Not that I'm aware of, but I'll ask.
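
One thing that can be checked independently of the upgrade: the
"no password entry for user mef" message suggests the compute node
itself could not look the user up. A small sketch, to be run on the
compute node, that tests whether a user name resolves through the
system's passwd lookup. has_passwd_entry is a hypothetical helper
name, and this only approximates what pbs_mom does internally.

```python
import pwd


def has_passwd_entry(user, lookup=pwd.getpwnam):
    """Return True if the local passwd lookup knows this user name."""
    try:
        lookup(user)
    except KeyError:
        # pwd.getpwnam raises KeyError when no entry exists for the name
        return False
    return True


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        status = "found" if has_passwd_entry(name) else "NO passwd entry"
        print(f"{name}: {status}")
```

If this reports no entry for mef on node1003 but finds it on the
head node, the account information is probably not reaching the
compute nodes.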
|
|Miles O'Neal wrote:
|> Mary Ellen Fitzpatrick said...
|> |
|> |I installed/configured torque-2.3.2-snap.200807231134.tar.gz. I get the
|> |same error messages.
|>
|> When we upgraded to the current maui and pbs_server
|> we had all kinds of ridiculous errors and had to
|> upgrade the moms as well. That was just this week.
|>
|> |
|> |headnode: /var/spool/torque/server_log
|> |Aug 1 11:47:51 node1003 pbs_mom: open_std_file, cannot determine filename
|> |Aug 1 11:47:51 node1003 pbs_mom: Success (0) in fork_to_user, cannot
|> |find user 'mef' in password file
|> |Aug 1 11:47:51 node1003 pbs_mom: Inappropriate ioctl for device (25) in
|> |req_cpyfile, fork_to_user failed with rc=-15023 'cannot find user 'mef'
|> |in password file' - returning failure
|> |
|> |node1003: /var/spool/torque/mom_log:
|> |08/01/2008 11:47:46;0001; pbs_mom;Job;job_nodes;job: 5.nona-man
|> |numnodes=1 numvnod=1
|> |08/01/2008 11:47:46;0001; pbs_mom;Svr;pbs_mom;start_exec, no password
|> |entry for user mef
|> |08/01/2008 11:47:46;0008; pbs_mom;Req;send_sisters;sending command
|> |ABORT_JOB for job 5.nona-man (10)
|> |08/01/2008 11:47:46;0008; pbs_mom;Req;send_sisters;sending ABORT to
|> |sisters
|> |08/01/2008 11:47:46;0002; pbs_mom;n/a;mom_server_update_stat;status
|> |update successfully sent to nona-man
|> |08/01/2008 11:47:46;0080; pbs_mom;Svr;scan_for_exiting;searching for
|> |exiting jobs
|> |08/01/2008 11:47:46;0008; pbs_mom;Job;kill_job;scan_for_exiting:
|> |sending signal 9, "KILL" to job 5.nona-man, reason: local task
|> |termination detected
|> |
|> |
|> |
|> |
|> |
|> |Glen Beane wrote:
|> |> On Thu, Jul 31, 2008 at 4:15 PM, Mary Ellen Fitzpatrick <mfitzpat at bu.edu> wrote:
|> |>
|> |>
|> |> Hi,
|> |> I have installed/configured torque-2.3.1 and maui-3.2.6p18 on my head node, nona-man. I thought I had everything configured correctly, but apparently not.
|> |>
|> |>
|> |>
|> |> what OS are you using?
|> |>
|> |> can you try
|> |>
|> |> http://www.clusterresources.com/downloads/torque/snapshots/torque-2.3.2-snap.200807231134.tar.gz
|> |>
|> |> instead of the 2.3.1 release? 2.3.1 has a couple known issues
|> |>
|> |>
|> |>
|> |>
|> |
|> |--
|> |Thanks
|> |Mary Ellen
|> |
|> |_______________________________________________
|> |torqueusers mailing list
|> |torqueusers at supercluster.org
|> |http://www.supercluster.org/mailman/listinfo/torqueusers
|> |
|>
|>
|>
|
|--
|Thanks
|Mary Ellen
|
|
--
Miles O'Neal
Intrinsity, Inc.
meo at intrinsity.com