[torqueusers] Torque 2.3.4 - Jobs not running

Wayne Mallett wayne.mallett at jcu.edu.au
Tue Nov 25 16:48:25 MST 2008


On Tue, Nov 25, 2008 at 07:05:26AM +1000, Wayne Mallett alleged:
 > > G'day all,
 > >
 > > I have recently upgraded to Torque 2.3.4 and have found jobs won't run on
 > > some servers unless I direct them to with a "qrun <jobid>".   Using
 > > "tracejob <jobid>" on a job that wasn't forced to run, I get the following
 > > output
 >
 > Do you have a scheduler running?
 >
 > Note that 2.3.5 was released last week.

Yes, I do have a scheduler (maui) running.  The problem only occurs on
_some_ compute nodes.  I recently added 33 servers to the cluster I manage;
18 of these will accept jobs and 15 won't, and I'm trying to diagnose why.
All systems should be built from the same image (using xCAT).  The
pbs_server/maui daemons run on a VM that has been handling jobs (across
various Torque versions) for several years now.
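
One way to narrow down a split like this is to group the nodes by the state
that pbs_server reports for them and then compare the two groups.  The
following is only a minimal Python sketch, not something from the original
thread.  It assumes the usual "pbsnodes -a" layout (an unindented node name
followed by indented "key = value" lines); the node names and values in
SAMPLE are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample in the layout assumed for `pbsnodes -a` output.
SAMPLE = """\
n01
     state = free
     np = 8

n02
     state = down
     np = 8
"""


def parse_pbsnodes(text):
    """Parse pbsnodes-style text into {node: {attribute: value}}."""
    nodes = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None              # a blank line ends a node record
        elif not line[0].isspace():
            current = line.strip()      # an unindented line names a node
            nodes[current] = {}
        elif current is not None and " = " in line:
            key, value = line.strip().split(" = ", 1)
            nodes[current][key] = value
    return nodes


def group_by_state(nodes):
    """Return {state: [node names]}; nodes without a state go under 'unknown'."""
    groups = defaultdict(list)
    for name, attrs in nodes.items():
        groups[attrs.get("state", "unknown")].append(name)
    return dict(groups)


for state, names in sorted(group_by_state(parse_pbsnodes(SAMPLE)).items()):
    print(f"{state}: {len(names)} node(s): {' '.join(sorted(names))}")
```

On a real cluster the text would come from the saved output of "pbsnodes -a"
rather than SAMPLE.  If the nodes that refuse jobs all share a state or
attribute that the working nodes lack, that is a reasonable place to start
looking.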

Thanks,
Wayne

-- 
Dr. Wayne Mallett
Email:	Wayne.Mallett at jcu.edu.au
Smail:	High Performance & Research Computing
	James Cook University
	Townsville  Qld 4811
Phone:	0747815084