[torqueusers] all jobs in Q state

James J Coyle jjc at iastate.edu
Thu Feb 22 10:40:31 MST 2007


Roland,



   You want a fix right now, I'm sure.
I suggest the following:

(Caveat: I am a Torque user, not a developer. Here is what I have done
  in a similar situation.  I run 4 clusters containing about 200
  multiprocessor nodes.)


Edit the file

 /var/spool/torque/sched_priv/sched_config

  Change the line 
help_starving_jobs     true    ALL

to

help_starving_jobs      false   ALL
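
(If you would rather script that change than open an editor, something
  like the following should work; this is an untested sketch that assumes
  GNU sed and the sched_config path above:

# keep a .bak copy, then flip true to false on the help_starving_jobs line
sed -i.bak 's/^\(help_starving_jobs[[:space:]]*\)true/\1false/' \
    /var/spool/torque/sched_priv/sched_config
)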


  Save and exit the editor, then issue:

killall -9 pbs_sched;  /usr/local/sbin/pbs_sched
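
  To confirm the scheduler came back and is picking jobs up again, a
quick check along these lines should do (assuming pgrep and the usual
Torque client commands are in your path):

# is pbs_sched running again?
pgrep -l pbs_sched

# watch jobs move from the Q state to R
qstat -a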


  The currently running jobs should stay running, and
queued jobs should start being scheduled again.

  The worst case would be losing the currently running jobs,
but I have started and stopped both the scheduler and the
server in this fashion with Torque and have not lost any jobs.
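
  (If you ever need to bounce pbs_server as well, an alternative to
killall is the quick shutdown, which is meant to leave running jobs in
place; the pbs_server path below is a guess based on where pbs_sched
lives on my systems:

qterm -t quick
/usr/local/sbin/pbs_server
)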



 - Jim Coyle 


