[torqueusers] Empty output/error log file

David Beer dbeer at adaptivecomputing.com
Wed Mar 23 15:01:53 MDT 2011


Before I respond to anything else, I would say your best bet is to look in the log files for pbs_mom and pbs_server to see whether they report any errors about the jobs.
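
In case it helps, here is a rough sketch of that log search in Python. It is only an illustration: the helper name find_job_lines is made up, and the log locations mentioned below (mom_logs and server_logs under /var/spool/torque) are an assumption about a default install, so adjust them to wherever your site keeps its logs.

```python
from pathlib import Path


def find_job_lines(job_id, log_dirs):
    """Return (file, line) pairs from the given log directories that mention job_id.

    Hypothetical helper: it simply scans every regular file in each directory.
    """
    hits = []
    for log_dir in log_dirs:
        directory = Path(log_dir)
        if not directory.is_dir():
            # Skip locations that do not exist on this host.
            continue
        for log_file in sorted(directory.iterdir()):
            if not log_file.is_file():
                continue
            # errors="replace" keeps the scan going if a log has odd bytes.
            with open(log_file, errors="replace") as handle:
                for line in handle:
                    if job_id in line:
                        hits.append((str(log_file), line.rstrip("\n")))
    return hits
```

You would call it with the job id and the log directories, for example find_job_lines("1234", ["/var/spool/torque/mom_logs", "/var/spool/torque/server_logs"]) (assumed paths), on the compute node for the pbs_mom log and on the server host for the pbs_server log.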

----- Original Message -----
> >> It looks like this type of problem happens when the machine on
> >> which
> >> the PBS script is executed is busy (due to other running PBS jobs
> >> for
> >> instance).
> >
> > If you have a number of jobs running on a machine and there is still
> > room (i.e. free processors) for new jobs then those will get
> > scheduled
> > and start to run. If there is no room for new jobs (aka "busy") then
> > nothing will be started.
> Yes I know that ;-)
> My feeling is that when a core is freed because a PBS script ends
> well, its hard drive might remain busy because of other jobs running.
> Thus the hard drive might not be available for the next coming
> jobs....
> Does it make sense?
> If yes, how can this problem be avoided?

I have never heard of such a problem, but if that were the case it would be an error in the filesystem or on the hard drive. These sorts of problems are usually dealt with using a node health check script (http://www.adaptivecomputing.com/resources/docs/torque/11.2healthcheck.php), but I still think the log files will provide the most insight into this problem.
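
For illustration, a health check along those lines could be sketched as below. This is not an official script. It assumes the convention, as I understand the documentation linked above, that the script reports a problem by printing a line beginning with "ERROR", and that pbs_mom is pointed at the script through $node_check_script in its config file. The function names and the idea of probing a scratch directory are my own assumptions.

```python
import os
import sys
import tempfile


def check_scratch(path):
    """Return None if a small file can be written to and read back from path.

    Otherwise return a message starting with "ERROR" (assumed convention).
    """
    try:
        # The probe file is removed automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=path) as probe:
            probe.write(b"ok")
            probe.flush()
            os.fsync(probe.fileno())  # push the write down to the disk
            probe.seek(0)
            if probe.read() != b"ok":
                return f"ERROR read-back mismatch on {path}"
    except OSError as exc:
        return f"ERROR cannot write to {path}: {exc}"
    return None


def main(paths):
    """Print the first problem found and return a non-zero status, else 0."""
    for path in paths:
        problem = check_scratch(path)
        if problem:
            print(problem)
            return 1
    return 0
```

To use it as a standalone script you would end the file with something like sys.exit(main(sys.argv[1:])) and pass the directories your jobs write to. A check like this only catches a disk that cannot be written at all; it will not detect one that is merely slow because other jobs are using it.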

David Beer 
Direct Line: 801-717-3386 | Fax: 801-717-3738
     Adaptive Computing
     1656 S. East Bay Blvd. Suite #300
     Provo, UT 84606
