[torqueusers] OS X Leopard, torque, authentication problem

Tapio Simula tapio.simula at gmail.com
Mon Jan 5 19:34:15 MST 2009

I am trying to set up a queue on a Mac Pro cluster running OS X Leopard
10.5.6 (the problem described below also existed on an earlier version of
Leopard). When testing on a single node, everything seems to work fine. With
two nodes, one running pbs_server (and the scheduler) and the other running
pbs_mom, everything still works as long as I stay logged in on the node
running pbs_mom. As soon as I log out, file staging (the scp copy) fails.
Restarting pbs_mom fixes the issue, but again only for as long as I stay
logged in on the node running the mom. Below is an example of an interactive
job, started after I had logged out from the momhost, which may point to an
authentication issue. The interactive job starts, runs and exits, but the
user name matching the uid cannot be read from the (local) database for some
reason (id returns the correct uid but no user name). This is also the
reason that file staging fails for normal jobs.

serverhost:~ myusername$ qsub -I
qsub: waiting for job 30.mydomain to start
qsub: job 30.mydomain ready

momhost:~ I have no name!$ dscl . -read /Users
Operation failed with error: eServerNotRunning
momhost:~ I have no name!$ dscacheutil -flushcache
Flushcache failed, unable to talk to daemon
momhost:~ I have no name!$ exit

qsub: job 30.mydomain completed
serverhost:~ myusername$
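
The failed lookup in the session above could also be checked from a batch
job rather than an interactive one. The following is only a minimal sketch,
assuming a POSIX shell on the momhost; the helper name name_resolves is
hypothetical and not part of torque. It relies on id -un either failing or
falling back to printing the numeric uid when the uid has no user name,
which is how the id implementations I know of behave.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if NAME ($2) is a real user name for
# UID ($1). An empty name, or a "name" that is just the numeric uid echoed
# back, is treated as a failed lookup.
name_resolves() {
    [ -n "$2" ] && [ "$2" != "$1" ]
}

uid=$(id -u)
# If id -un exits non-zero (no name for this uid), fall back to an empty name.
name=$(id -un 2>/dev/null) || name=""

if name_resolves "$uid" "$name"; then
    echo "uid $uid resolves to user name '$name'"
else
    echo "uid $uid has no user name; the user database lookup failed"
fi
```

Submitting this script with qsub while logged out of the momhost should show
in the job's output file whether the lookup fails for batch jobs as well.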

ssh/scp works fine both ways without prompting for passwords. All of the
above is also independent of the scheduler (Maui and pbs_sched yield the
same results). I am currently testing torque-2.4.0b1 but have the same issue
with various earlier versions of torque.

Any help on how to get torque working with Leopard would be much
appreciated.

Kind regards,
