[torqueusers] torque on itanium
Jan Snigula
snigula at usm.uni-muenchen.de
Tue Feb 12 14:06:34 MST 2008
Hi torque developers,
I'm trying to run torque on an 8-node Itanium cluster (HP blade
servers, CentOS 4.6, Linux 2.6.9-47). Whenever a job is started on a
node, pbs_mom goes up to 100% CPU; the job is executed but ends up
stuck in the exiting state. At that point only a qdel -p (which leaves
pbs_mom at 100% CPU) or a /etc/init.d/pbs_mom purge (which restores
normal behavior) releases the job, and only the purge brings the CPU
usage back down.
To test this I set up a single-node, execution-only environment and
attached strace to pbs_mom before submitting a job:

    strace -etrace=desc -F -f -ff -p pid_of_pbs_mom

(I saved the output and can send it to anyone interested.) The overall
behavior is that, as soon as the job goes into execution, pbs_mom
issues a huge number of "select" system calls, which drives the
process to 100% CPU usage.
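
A saved trace like this can be summarized by counting calls per system
call name, which makes a "select" storm easy to confirm. The following
is a minimal sketch in Python, not part of torque: the function name
count_syscalls and the sample lines are invented for illustration (they
are not taken from the actual trace), and it assumes strace's default
line format without timestamps.

```python
import re
from collections import Counter

# Optional "[pid N] " or "N " prefix (added by strace -f), then the
# syscall name immediately before the opening parenthesis.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def count_syscalls(lines):
    """Count how often each syscall name starts a line of strace output."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:  # signal lines and "<... resumed>" lines are skipped
            counts[match.group(1)] += 1
    return counts


# Invented sample lines in strace's usual format, for illustration only.
sample = [
    "select(8, [3 4 5], NULL, NULL, {0, 0}) = 0 (Timeout)",
    "[pid 4242] select(8, [3 4 5], NULL, NULL, {0, 0}) = 0 (Timeout)",
    'read(5, "", 4096) = 0',
    "<... select resumed> ) = 0 (Timeout)",
    "--- SIGCHLD (Child exited) @ 0 (0) ---",
]

if __name__ == "__main__":
    for name, n in count_syscalls(sample).most_common():
        print(name, n)
```

To use it on a real trace, pass an open file object instead of the
sample list, for example count_syscalls(open("pbs_mom.trace")), where
pbs_mom.trace is a hypothetical file holding the saved strace output.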
I tested versions from torque-2.0pl11 up to
torque-2.3.0-snap.200801151629.
Can anyone help me?
Jan Snigula