[torquedev] New 2.5.6 snapshot

Ken Nielson knielson at adaptivecomputing.com
Tue Apr 12 09:50:47 MDT 2011


On 04/11/2011 08:05 PM, Martin Siegert wrote:
> Hi,
>
> On Thu, Apr 07, 2011 at 05:05:04PM -0600, Ken Nielson wrote:
>> There is a new snapshot for 2.5.6 available. This fixes a problem with
>> a patch for Bugzilla 116 where the new resource procct was added. If the
>> -l nodes option was not used in a job submission then the job would not
>> be run by Moab because procct was added to the Resource_List attribute
>> and treated like a generic resource by Moab. Because the generic resource
>> procct does not exist, Moab would never schedule the job.
>>
>> This is now fixed.
>>
>> You can download this snapshot at http://www.clusterresources.com/downloads/torque/snapshots/torque-2.5.6-snap.201104071657.tar.gz
>>
>> Please download and let us know if you find any problems.
> I am afraid this does not work: I haven't traced this back to the
> source routine, but apparently this new version presets the nodes
> resource to 1, correct?
> Thus, if a user only requests -l procs=N, with 2.5.6-snap.201104071657
> procct is set to N+1, not N, see
>
> resc_def_all.c, line 1118:
>
>      ppct->rs_value.at_val.at_long =
>        count_proc(pnodesp->rs_value.at_val.at_str)
>        + pprocsp->rs_value.at_val.at_long;
>
> torque-2.5.6-snap.201104041023 actually worked flawlessly for me.
> Which means that I haven't figured out how to trigger the bug that
> torque-2.5.6-snap.201104071657 was supposed to fix.
> Regardless of whether I specified -l nodes=... or -l procs=... or
> neither, Moab always started my job, i.e., the procct resource
> always got removed before the job was sent to Moab, see
>
> svr_jobfunc.c, line 1965:
>
>        if (strcmp(pque->qu_attr->at_val.at_str, "Execution") == 0)
>          {
>          /* job routed to Execution queue successfully */
>          /* unset job's procct resource */
>          resource_def *pctdef;
>          resource *pctresc;
>          pctdef = find_resc_def(svr_resc_def, "procct", svr_resc_size);
>          if ((pctresc = find_resc_entry(&pjob->ji_wattr[JOB_ATR_resource], pctdef)) != NULL)
>             pctdef->rs_free(&pctresc->rs_value);
>          }
>        }
>
> If somebody can explain to me how to submit a job that is not caught in
> this if block, I may be able to fix this.
>
> Cheers,
> Martin
>
Martin,

I will fix it, unless of course you fix it first.

Ken
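
For readers following the thread, the N+1 arithmetic Martin describes can be reproduced with a small standalone sketch. It is not TORQUE code: sketch_count_proc() is a hypothetical, simplified stand-in for count_proc() that only understands specs of the form "N[:ppn=M][+N[:ppn=M]...]", and sketch_procct() mirrors only the addition quoted from resc_def_all.c. That the nodes resource defaults to "1" is Martin's guess from the message above, not something confirmed here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for TORQUE's count_proc().
 * Assumes a spec like "2:ppn=4+3:ppn=2"; host names and other
 * node properties are NOT handled. */
long sketch_count_proc(const char *spec)
{
    long total = 0;
    const char *p = spec;

    assert(spec != NULL);

    while (*p != '\0') {
        char *end;
        long nodes = strtol(p, &end, 10);
        long ppn = 1;

        if (end == p)
            nodes = 1;              /* no leading count: treat as one node */

        /* A chunk runs up to the next '+' or the end of the string. */
        const char *plus = strchr(p, '+');
        size_t len = plus ? (size_t)(plus - p) : strlen(p);

        /* Use ppn= only if it appears inside the current chunk. */
        const char *q = strstr(p, "ppn=");
        if (q != NULL && (size_t)(q - p) < len)
            ppn = strtol(q + 4, NULL, 10);

        total += nodes * ppn;

        p += len;
        if (*p == '+')
            p++;
    }
    return total;
}

/* Mirrors the quoted expression: count_proc(nodes) + procs. */
long sketch_procct(const char *nodes_spec, long procs)
{
    return sketch_count_proc(nodes_spec) + procs;
}
```

Under these assumptions, sketch_procct("1", 8) returns 9: a job that asks only for -l procs=8 ends up with procct one higher than requested if nodes has been preset to 1, whereas an empty nodes spec gives sketch_procct("", 8) == 8.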
