[torqueusers] Default C Scheduler behaves strange / messages

Garrick Staples garrick at usc.edu
Tue Jan 17 15:54:42 MST 2006


On Fri, Jan 13, 2006 at 03:10:48PM +0100, Henryk Feider alleged:
> Hi,
> 
> I am confronted with some strange behavior of the default C Scheduler. 
> It kills jobs and prints one of the three messages below. But after 
> reviewing the submit scripts, I cannot see where the problem is.
> Can someone please explain to me which conditions trigger the following 
> events in the scheduler:
> 
> 1)
> PBS Job Id: 2744.peyote.aei.mpg.de
> Job Name:   D20%IH
> Aborted by PBS Server
> Job aborted on PBS Server initialization

This happens when shutting down or starting pbs_server, depending on the
type of shutdown or startup, or if it can't recover the job files.  See
the pbs_server and qterm manpages.

It can also happen in MOM if it can't create the necessary job task
directory.

Maybe check for full disks?  Check the MOM dirs?
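
A rough sketch of the full-disk check in Python, in case it helps to run
it across several nodes. The path /var/spool/torque is only an assumed
location for the server and MOM directories, and the 95% threshold and
function names are made up for illustration; adjust them for your
install.

```python
import shutil


def disk_usage_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo filesystems report a size of zero.
        return 0.0
    return usage.used / usage.total


def is_nearly_full(path, threshold=0.95):
    """True if the filesystem holding path is at or above the threshold."""
    return disk_usage_fraction(path) >= threshold


if __name__ == "__main__":
    # Assumed paths: replace with the real pbs_server and MOM directories.
    for path in ("/var/spool/torque", "/tmp"):
        try:
            percent = round(disk_usage_fraction(path) * 100, 1)
            print(path, percent, "% used")
        except OSError as exc:
            print(path, "could not be checked:", exc)
```

This only looks at free space. A directory that is missing or has the
wrong permissions would need a separate check.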

 
> 2)
> PBS Job Id: 2773.peyote.aei.mpg.de
> Job Name:   D02%Nik
> job deleted
> Job deleted at request of Scheduler at peyote.aei.mpg.de
> Job could never run

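The scheduler wasn't able to find resources that matched the request,
and concluded that the job could never run on this cluster (for
example, more nodes or more processors per node requested than exist).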
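
To illustrate the idea (this is not the scheduler's actual code), here
is a small Python sketch that tests whether a simple request of the
form nodes=N:ppn=M could ever fit a given set of nodes. The inventory
dict and both function names are hypothetical, and real node
specifications can also name hosts and properties, which this ignores.

```python
def parse_nodes_request(spec):
    """Parse a spec such as 'nodes=8:ppn=2' into (node_count, procs_per_node).

    Only the plain count form is handled; missing parts default to 1.
    """
    nodes, ppn = 1, 1
    for part in spec.split(":"):
        key, _, value = part.partition("=")
        if key == "nodes":
            nodes = int(value)
        elif key == "ppn":
            ppn = int(value)
    return nodes, ppn


def could_ever_run(spec, node_procs):
    """Return True if enough nodes have at least ppn processors each.

    node_procs maps a node name to its processor count (hypothetical data,
    e.g. copied by hand from the server's nodes file).
    """
    nodes, ppn = parse_nodes_request(spec)
    usable = sum(1 for procs in node_procs.values() if procs >= ppn)
    return usable >= nodes


if __name__ == "__main__":
    # Made-up inventory: six dual-processor nodes.
    cluster = {"node%02d" % i: 2 for i in range(6)}
    print(could_ever_run("nodes=8:ppn=2", cluster))
```

If a check like this fails against your real node list, compare the
request in the submit script with what the server thinks the nodes
provide.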

 
> 3)
> qsub.interactive -I -l nodes=8:ppn=2
> qsub: waiting for job 2875.peyote.aei.mpg.de to start
> qsub: job 2875.peyote.aei.mpg.de apparently deleted

The reason why it was deleted would have been emailed.

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California