[torqueusers] TORQUE 2.5.13 release candidate

Ken Nielson knielson at adaptivecomputing.com
Thu Aug 1 16:24:03 MDT 2013


Thanks. I will see if I can reproduce that here.

Ken

On Thu, Aug 1, 2013 at 4:21 PM, Burkhard Bunk <bunk at physik.hu-berlin.de> wrote:

> Hi,
>
> the problem shows up if a queue does not have a limit on walltime
> or if an array job is queried as a whole (elapsed time is undefined,
> I guess).
>
> Example 1: no walltime limit on the queue => "Elap Time" = 00:00:00
>
>                                               Req'd       Elap
>         Job ID                  ...           Time    S   Time
>         ----------------------- ----------- --------- - ---------
>         4119.gargamel.physik.h  ...               --  R  00:00:00
>
> "qstat -f" does show the correct resources_used.walltime.
> The info is there, the problem is with the display by qstat on the client
> side.
>
> Example 2: compact display of array job shows "Elap Time" = "Req'd Time".
>
>                                               Req'd       Elap
>         Job ID                  ...           Time    S   Time
>         ----------------------- ----------- --------- - ---------
>         60603[].irz14.physik.h  ...         720:00:00 R 720:00:00
>
> If "Req'd Time" is unset, I see "Elap Time" = 00:00:00 .
> I would prefer the behavior of 2.5.11: "Elap Time" = -- .
>
> Explicit display of array members (qstat -at) is correct:
>
>         60603[2].irz14.physik.  ...         720:00:00 R  08:00:25
>
>
> All that is easy to understand, given the fact that qstat.c in 2.5.13
> tries to compute
>
>
>         elap_time = req_walltime - rem_walltime
>
> which doesn't work if "Requested Time" or "Remaining Time" is undefined
> (falls back to zero).
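
A minimal sketch of how the subtraction quoted above could be guarded, so that an undefined operand produces "--" (the 2.5.11 display) instead of 00:00:00. This is not the code in qstat.c: the helper name format_elapsed and the TIME_UNSET sentinel are hypothetical, and it assumes both times are held in seconds. The actual fix may look different.

```c
#include <stdio.h>

/* Hypothetical sentinel for a walltime value that was never reported.
 * The real qstat.c falls back to zero instead, per the description above. */
#define TIME_UNSET (-1L)

/* Write the elapsed time as HH:MM:SS into buf, or "--" when it cannot be
 * derived from the requested and remaining walltime (both in seconds). */
static void format_elapsed(char *buf, size_t len,
                           long req_walltime, long rem_walltime)
{
  long elap;

  if (req_walltime == TIME_UNSET || rem_walltime == TIME_UNSET ||
      rem_walltime > req_walltime)
    {
    /* One operand is missing or inconsistent: show "--" rather than
     * a misleading zero. */
    snprintf(buf, len, "--");
    return;
    }

  elap = req_walltime - rem_walltime;

  snprintf(buf, len, "%02ld:%02ld:%02ld",
           elap / 3600, (elap % 3600) / 60, elap % 60);
}
```

With a 720-hour request and 8 h 0 min 25 s already used, this yields 08:00:25, matching the "qstat -at" line above. With either value unset it yields "--".
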
>
>
> Regards,
> Burkhard Bunk.
> ----------------------------------------------------------------------
>  bunk at physik.hu-berlin.de      Physics Institute, Humboldt University
>  fax:    ++49-30 2093 7628     Newtonstr. 15
>  phone:  ++49-30 2093 7980     12489 Berlin, Germany
> ----------------------------------------------------------------------
>
> On Thu, 1 Aug 2013, Ken Nielson wrote:
>
>  Burkhard,
>>
>> I looked at the output of qstat -a and the elapsed time seems to be
>> working as expected to me. What are you getting and what do you expect?
>>
>> Ken
>>
>> On Thu, Aug 1, 2013 at 11:09 AM, Burkhard Bunk <bunk at physik.hu-berlin.de>
>> wrote:
>>       Hi,
>>
>>       unfortunately the behavior of "Elapsed Time" in "qstat -a" is still
>>       slightly screwed up, as reported two days ago.
>>
>>       The problem is in src/cmds/qstat.c:
>>       "eltimewal" was replaced by "elap_time_string", which is computed
>>       from "rem_walltime" in
>>
>>           878     if ((*jstate != 'Q') && (*jstate != 'C'))
>>           879       {
>>           880       elap_time = req_walltime - rem_walltime;
>>           881       time_to_string(elap_time_string, elap_time);
>>           882       }
>>
>>       This will fail if "req_walltime" is not available.
>>
>>       I fixed it by reverting to "eltimewal" in the print statements
>>       (lines 904 and 924), and it works for me.
>>
>>       This is a preliminary hack, not a real fix. I'm sure that the
>>       "elap_time_string" mechanism was introduced for a reason (which I
>>       don't understand), and it should be fixed accordingly.
>>
>>       Regards,
>>       Burkhard Bunk.
>>       ----------------------------------------------------------------------
>>        bunk at physik.hu-berlin.de      Physics Institute, Humboldt University
>>        fax:    ++49-30 2093 7628     Newtonstr. 15
>>        phone:  ++49-30 2093 7980     12489 Berlin, Germany
>>       ----------------------------------------------------------------------
>>
>>       On Wed, 31 Jul 2013, Ken Nielson wrote:
>>
>>             Hi all,
>>
>>             I have fixed the last reported bug with 2.5.13. The download is
>>             available through GitHub. You can download it with the following
>>             command.
>>
>>             git clone https://github.com/adaptivecomputing/torque.git -b 2.5.13 2.5.13
>>
>>             Please let us know right away of any major problems.
>>
>>             Regards
>>
>>             --
>>             Ken Nielson
>>             +1 801.717.3700 office +1 801.717.3738 fax
>>             1712 S. East Bay Blvd, Suite 300  Provo, UT  84606
>>             www.adaptivecomputing.com
>>
>>
>>
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers
>>
>>
>>
>>
>> --
>> Ken Nielson
>> +1 801.717.3700 office +1 801.717.3738 fax
>> 1712 S. East Bay Blvd, Suite 300  Provo, UT  84606
>> www.adaptivecomputing.com
>>
>>
>>
>
>


-- 
Ken Nielson
+1 801.717.3700 office +1 801.717.3738 fax
1712 S. East Bay Blvd, Suite 300  Provo, UT  84606
www.adaptivecomputing.com