[torquedev] cpu use reporting

Brock Palen brockp at umich.edu
Tue Jun 13 08:35:27 MDT 2006


Attached is the output from qmgr -c 'print server'. I assume this is
what you mean by full server and queue config?
We do not use job suspend or preemption at present, though we plan to
in the future.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pbs.config
Type: application/octet-stream
Size: 4924 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/torquedev/attachments/20060613/1c68eeed/pbs.obj
-------------- next part --------------

Brock Palen
Center for Advanced Computing
brockp at umich.edu
(734)936-1985


On Jun 12, 2006, at 7:22 PM, garrick at speculation.org wrote:

> On Mon, Jun 12, 2006 at 10:53:08AM -0400, Brock Palen alleged:
>> We are seeing strange problems with torque-2.1.0p0  backed by
>> maui-3.2.6p14.
>> Basically a single cpu is being allocated more than once.
>> Example
>>
>> eliza:/usr/local/maui brockp$ qstat -n1 | grep nem064
>> 1809.nemesis.engin.u USER   cac_seri CsI_4P       6667     1  --  --  120:0 R 44:55   nem064/0
>> 1871.nemesis.engin.u USER   cac_seri Brain       11318     1  --  --  120:0 R 31:03   nem064/0
>>
>> Maui is correctly putting only two single-CPU jobs on the node (these
>> are 2-CPU nodes), as it only cares about the number of tasks.  It's
>> this reporting that's bad.  These machines appear as free in qmgr,
>
> Don't worry about "free", that only applies to the state reported from
> pbs_mom with regards to the node's load average and $ideal_load and
> $max_load configs.
>
>
>> Qmgr: l n nem064
>> Node nem064
>>         state = free
>>         np = 2
>>         properties = myrinet
>>         ntype = cluster
>>         jobs = 0/1871.nemesis.engin.umich.edu,
>> 0/1809.nemesis.engin.umich.edu
>
> It's worse than just bad reporting.  pbs_server should not allocate the
> same CPU to multiple jobs.
>
> Can you send me full server, queue, and job configs?  Are you using
> job suspend preemption?
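
For anyone who wants to check other nodes for the same symptom, here is a
minimal Python sketch. It assumes the node's "jobs" attribute is a
comma-separated list of slot/jobid entries, as in the nem064 listing quoted
above. The function name find_shared_slots is purely illustrative and is not
part of TORQUE or Maui; getting the attribute value out of qmgr or pbsnodes
is left to the reader.

```python
from collections import defaultdict


def find_shared_slots(jobs_attr):
    """Return {slot: [job ids]} for every slot that is listed more than once.

    jobs_attr is assumed to look like "0/1871.host, 0/1809.host", i.e. the
    value of the node's "jobs" attribute with any line wrapping removed.
    """
    slots = defaultdict(list)
    for entry in jobs_attr.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Each entry is "<slot index>/<job id>".
        slot, _, job_id = entry.partition("/")
        slots[int(slot)].append(job_id)
    return {slot: ids for slot, ids in slots.items() if len(ids) > 1}


if __name__ == "__main__":
    # Value copied from the nem064 listing quoted above.
    jobs = "0/1871.nemesis.engin.umich.edu, 0/1809.nemesis.engin.umich.edu"
    for slot, ids in find_shared_slots(jobs).items():
        print(f"slot {slot} is held by {len(ids)} jobs: {', '.join(ids)}")
```

On the nem064 value this reports slot 0 as held by two jobs. A node with one
job per slot (for example "0/a, 1/b") yields an empty result.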
