[torqueusers] Delayed Job Execution

Mark A. White mawhite at utmb.edu
Wed Sep 29 16:24:31 MDT 2010


Hello,

A user is experiencing an apparently random issue with the Torque/PBS
scheduler delaying execution of jobs, even when adequate resources are
available.

He will submit a batch of 50 or 100 small jobs simultaneously (in a
single script). Sometimes all 50 or 100 jobs will run.  At other times
several jobs will be held until the others finish (just a few minutes).
This does not seem to relate to the available resources: there is not a
conflict with other jobs queued or running.


Has anyone else experienced this, or does anyone know what the cause of
and solution to the problem might be?

System:
torque 2.1.9-1
CentOS 5.2
120 processor cluster (15 boxes, 64 bit dual quad-core opterons)

Yours sincerely,

Mark A. White, Ph.D.
Associate Professor of Biochemistry and Molecular Biology, 
Manager, Sealy Center for Structural Biology and Molecular Biophysics
X-ray Crystallography Laboratory,
Basic Science Building, Room 6.660 C
University of Texas Medical Branch
Galveston, TX 77555-0647
Tel. (409) 747-4747
Cell. (281) 734-3614
Fax. (409) 747-1404
mailto://mawhite@utmb.edu
http://xray.utmb.edu
http://xray.utmb.edu/~white