Bugzilla – Bug 165
qstat -a reports wrong 0 value in TSK column if nodes is a hostname
Last modified: 2013-02-19 02:34:50 MST
cat ppn2_TSK0.batch
    #!/bin/sh
    #PBS -S /bin/sh
    #PBS -l nodes=horizon11:ppn=2
    #PBS -N Npp
    #PBS -j oe
    sleep 22m

qsub ppn2_TSK0.batch
    246.horizon

horizon: ~/torque_tests > qstat -a

horizon.iap.fr:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID   NDS TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
246.horizon          rouberol batch    Npp                4445     1   0     -- 04:00 R    --

TSK value is 0 instead of 2.

The problem comes from torque-3.0.3/src/cmds/qstat.c code, line 709 in 3.0.3 version:

    int nodes = atoi(pat->value);

This returns 0 if pat->value does not begin with a number, like "horizon11" in the job script example above.

The qstat.c code should distinguish between the 2 possibilities indicated in http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml:

    nodes={<node_count> | <hostname>}

to get an accurate value of TSK in case of nodes=<hostname> use.

Regards,
sr
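A minimal sketch of the distinction suggested above, not a patch against qstat.c. `count_tasks` is a hypothetical helper that does not exist in TORQUE. It assumes the nodes string is a '+'-separated list of segments, each starting with either a node count or a hostname and optionally followed by `:ppn=<n>`. A first token made only of digits is treated as a count; anything else is treated as a hostname and counted as one node, so a purely numeric hostname would be misread as a count.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not TORQUE code): number of tasks requested by a
 * "nodes" resource string such as "2:ppn=4" or "horizon11:ppn=2". */
int count_tasks(const char *spec)
{
    int total = 0;
    const char *seg = spec;

    while (*seg != '\0') {
        size_t seg_len = strcspn(seg, "+");     /* this '+'-separated segment */
        size_t first_len = strcspn(seg, ":+");  /* leading count or hostname */
        int all_digits = first_len > 0;
        int nodes = 1;  /* a hostname names exactly one node */
        int ppn = 1;    /* assumed default when no ppn= is given */
        size_t i;

        for (i = 0; i < first_len; i++) {
            if (!isdigit((unsigned char)seg[i])) {
                all_digits = 0;
                break;
            }
        }
        if (all_digits)
            nodes = (int)strtol(seg, NULL, 10);

        /* Look for ":ppn=<n>" inside this segment only. */
        for (i = first_len; i + 5 <= seg_len; i++) {
            if (strncmp(seg + i, ":ppn=", 5) == 0) {
                if (isdigit((unsigned char)seg[i + 5]))
                    ppn = (int)strtol(seg + i + 5, NULL, 10);
                break;
            }
        }

        total += nodes * ppn;
        seg += seg_len;
        if (*seg == '+')
            seg++;  /* skip the separator */
    }
    return total;
}
```

Under these assumptions, `count_tasks("horizon11:ppn=2")` gives 2 for the job above, where `atoi("horizon11")` gives 0.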
Slightly different issue in 2.4.x (yes, I know it's old, but it works!).

[root@bruce-m ~]# qstat -u samuel -a

bruce-m.vlsci.unimelb.edu.au:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID   NDS TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
979069.bruce-m.v     samuel   batch    STDIN             29325     1 bru     -- 01:00 C 00:00
979070.bruce-m.v     samuel   batch    STDIN             10876     1   1     -- 01:00 R    --

The first job requested a specific node, but instead of being converted to a number it was just passed through truncated at 3 characters.

Don't know if that's better or worse than the behaviour in 3.0.x. :)
This bug is still present in torque 4.1.4:

    qsub ... -l nodes=mynode001:ppn=10  => bug, TSK=0
    qsub ... -l nodes=1:ppn=10          => ok, TSK=10 (but different in meaning, of course)

Thx.