[torqueusers] Delayed Job Execution
Mark A. White
mawhite at utmb.edu
Tue Oct 5 08:59:39 MDT 2010
Has anyone experienced a problem with PBS/Torque delaying execution of
jobs, even when adequate resources are available? In my experience,
this is not a repeatable problem: sometimes 100 jobs will be launched
simultaneously, and another time 5 of 50 will be delayed, with no
apparent difference in circumstances.
Any ideas on how to debug this problem?
On Wed, 2010-09-29 at 17:24 -0500, Mark A. White wrote:
> A user is experiencing an apparently random issue with the Torque/PBS
> scheduler delaying execution of jobs, even when adequate resources are
> available.
> He will submit a batch of 50 or 100 small jobs simultaneously (in a
> single script). Sometimes all 50 or 100 jobs will run. At other times
> several jobs will be held until the others finish (just a few minutes).
> This does not seem to relate to the available resources: there is not
> a conflict with other jobs queued or running.
> Has anyone else experienced this, or does anyone know what the cause of and
> solution to the problem might be?
> torque 2.1.9-1
> CentOS 5.2
> 120-processor cluster (15 boxes, 64-bit dual quad-core Opterons)
Mark A. White, Ph.D.
Associate Professor of Biochemistry and Molecular Biology,
Manager, Sealy Center for Structural Biology and Molecular Biophysics
X-ray Crystallography Laboratory,
Basic Science Building, Room 6.660 C
University of Texas Medical Branch
Galveston, TX 77555-0647
Tel. (409) 747-4747
Cell. (281) 734-3614
Fax. (409) 747-1404