[torquedev] New 2.5.6 snapshot
siegert at sfu.ca
Mon Apr 11 20:05:00 MDT 2011
On Thu, Apr 07, 2011 at 05:05:04PM -0600, Ken Nielson wrote:
> There is a new snapshot for 2.5.6 available. This fixes a problem with
> a patch for Bugzilla 116 where the new resource procct was added. If the
> -l nodes option was not used in a job submission then the job would not
> be run by Moab because procct was added to the Resource_List attribute
> and treated like a generic resource by Moab. Because the generic resource
> procct does not exist Moab never schedules the job.
> This is now fixed.
> You can download this snapshot at http://www.clusterresources.com/downloads/torque/snapshots/torque-2.5.6-snap.201104071657.tar.gz
> Please download and let us know if you find any problems.
I am afraid this does not work: I haven't traced this back to the
source routine, but apparently this new version presets the nodes
resource to 1, correct?
Thus, if a user only requests -l procs=N, with 2.5.6-snap.201104071657
procct is set to N+1, not N, see
resc_def_all.c, line 1118:
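
(The excerpt from resc_def_all.c is not reproduced in this message. The
arithmetic behind the N+1 observation can still be sketched under a loud
assumption: that procct is the sum of nodes*ppn and procs. The function
name procct_model and the formula are hypothetical and are not taken
from the TORQUE source.)

```c
/* Hypothetical model, NOT the code from resc_def_all.c.
 * Assumption: procct = nodes * ppn + procs. */
static int procct_model(int nodes, int ppn, int procs)
{
    return nodes * ppn + procs;
}
```

Under that assumption, a preset of nodes=1 (with ppn=1) turns a request
of -l procs=8 into procct_model(1, 1, 8) == 9, whereas with no preset
the result would be procct_model(0, 1, 8) == 8.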
torque-2.5.6-snap.201104041023 actually worked flawlessly for me.
Which means that I haven't figured out how to trigger the bug that
torque-2.5.6-snap.201104071657 was supposed to fix.
Regardless of whether I specified -l nodes=..., -l procs=..., or
neither, Moab always started my job; i.e., the procct resource was
always removed before the job was sent to Moab, see
svr_jobfunc.c, line 1965:
  if (strcmp(pque->qu_attr->at_val.at_str, "Execution") == 0)
  /* job routed to Execution queue successfully */
  /* unset job's procct resource */
  pctdef = find_resc_def(svr_resc_def, "procct", svr_resc_size);
  if ((pctresc = find_resc_entry(&pjob->ji_wattr[JOB_ATR_resource], pctdef)) != NULL)
If somebody can explain to me how to submit a job that is not caught in
this if block, I may be able to fix this.
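
(For readers without the source at hand, the behaviour of the quoted
block can be modelled in a few lines. This is a simplified stand-in, not
TORQUE code: struct resc_entry and strip_procct are hypothetical names,
and the real server uses find_resc_def and find_resc_entry on its own
attribute structures.)

```c
#include <string.h>

/* Simplified stand-in for one Resource_List entry (hypothetical type). */
struct resc_entry {
    const char *name;
    long value;
    int is_set;
};

/* Unset procct only when the destination queue type is "Execution".
 * Returns 1 if procct was unset, 0 otherwise. */
static int strip_procct(const char *queue_type,
                        struct resc_entry *list, size_t n)
{
    size_t i;

    if (strcmp(queue_type, "Execution") != 0)
        return 0; /* any other queue type: procct is left in place */

    for (i = 0; i < n; i++) {
        if (list[i].is_set && strcmp(list[i].name, "procct") == 0) {
            list[i].is_set = 0; /* job never carries procct to the scheduler */
            return 1;
        }
    }
    return 0; /* no procct entry was set */
}
```

In this model the only way for procct to survive is for the job to land
in a queue whose type is not "Execution", which is the case the question
above asks about.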
Simon Fraser University