[torqueusers] torque on itanium

Jan Snigula snigula at usm.uni-muenchen.de
Tue Feb 12 14:06:34 MST 2008


Hi torque developers,

I'm trying to run torque on an 8-node Itanium cluster (Linux 2.6.9-47,
CentOS 4.6, HP blade servers). Any time a job is started on a node,
pbs_mom goes up to 100% CPU. The job is executed but ends up stuck in
the exiting state. At that point only a qdel -p (which leaves pbs_mom
at 100% CPU) or a /etc/init.d/pbs_mom purge (which restores normal
behavior) releases the job.

To test it I set up a single-node, execution-only environment and ran
	strace -etrace=desc -F -f -ff -p pid_of_pbs_mom
before submitting a job. (I saved the output and can send it to you if
you are interested.) The overall behavior is that when the job goes
into execution, pbs_mom issues a huge number of "select" system calls,
which drives the process to 100% CPU usage.
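
To put a number on "a huge number of select calls", the saved strace
output can be tallied per system call. The following is only a minimal
sketch: it assumes the usual strace line layout (an optional
"[pid N]" or bare PID prefix, then the call name and an opening
parenthesis), and the function name count_syscalls is made up for
illustration.

```python
import re
from collections import Counter

# Assumed strace line layout: optional "[pid N] " or "N " prefix,
# then the system call name immediately followed by "(".
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def count_syscalls(lines):
    """Count how often each system call name appears in strace output lines.

    Lines that do not start a call (signal notices such as "--- SIGCHLD",
    or "<... select resumed>" continuations) are skipped.
    """
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Called on the lines of a saved trace, for example
count_syscalls(open("pbs_mom.trace")) with a hypothetical file name,
the returned Counter's most_common() method lists the calls by
frequency, so a select busy loop would show up at the top.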

I tested versions from torque-2.0pl11 up to
torque-2.3.0-snap.200801151629.

Can anyone help me?

Jan Snigula

