[torqueusers] procs= not working as documented (or understood?)

Glen Beane glen.beane at gmail.com
Tue Feb 14 12:21:53 MST 2012


If you switch schedulers you should be able to just stop your old
scheduler, let the running jobs continue to run, and then start up
your new scheduler.  No need to kill any running jobs.
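
Roughly, as a sketch (the init script name/location is install-dependent,
so adjust to taste):

  /etc/init.d/maui stop     # or just kill the running maui daemon
  qstat -r                  # list running jobs; they keep running under pbs_server
  pbs_sched                 # start the bundled FIFO scheduler (as root)
  qmgr -c 'list server'     # "scheduling = True" should still be set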

pbs_sched is a simple FIFO scheduler; it doesn't have fairshare.

On Tue, Feb 14, 2012 at 2:16 PM, Lance Westerhoff
<lance at quantumbioinc.com> wrote:
>
> Hi Glen-
>
> Yeah, that matches what we have seen as well. I have tried several different maui versions, and the problem seems to be pretty consistent.
>
> I assume that pbs_sched does not yet have FAIRSHARE - is that correct?
>
> I have put in a request for a quote for Moab, so we'll see what they come back with.
>
> Related question: if I wish to change to a different scheduler, do I need to shut down all of the currently running jobs, or will jobs scheduled using the old scheduler (maui) run unaffected when I switch to the new scheduler?
>
> Thanks!
>
> -Lance
>
> On Feb 14, 2012, at 1:59 PM, Glen Beane wrote:
>
>> -l procs= is supported by Moab and pbs_sched; I do not believe that
>> Maui handles it properly.
>>
>>
>>
>>
>> On Tue, Feb 14, 2012 at 1:11 PM, Lance Westerhoff
>> <lance at quantumbioinc.com> wrote:
>>>
>>> Hello All-
>>>
>>> We're still having trouble with this feature, and we are starting to shop around for a torque/maui replacement in order to be able to use it. Before we do that, however, I wanted to see if anyone has any thoughts on how to address the problem within torque/maui. Perhaps I simply don't understand the feature. The versions of torque and maui we are using are:
>>>
>>>        torque-3.0.2
>>>        maui-3.2.6p21
>>>
>>> Yes, we have tried newer versions of maui, but then the option doesn't work at all.
>>>
>>> Here is the scenario (I also included the conversation from November below for more information).
>>>
>>> Conceptually, our software is almost infinitely scalable in the sense that there is very little overhead associated with interprocess communication. Therefore, we do not require that all of the processes reside on a small number of nodes. In fact, we can spread the processes across any and all nodes in the cluster with essentially zero loss in performance, so we can literally have one node running a single process and another node running 8 processes. Given that level of scalability, we don't want to lock ourselves into requesting resources in the "nodes=X:ppn=Y" style, since that style requires nodes to open up or drain before they can be used. Because our users run a big mixture of single- and multi-processor jobs, waiting for nodes to drain can waste a lot of resources.
>>>
>>> I saw the "procs=#" option in the Requesting Resources table (see http://www.clusterresources.com/torquedocs/2.1jobsubmission.shtml#resources for more). It *appears* that this option should allow the user to simply request X*Y processors, and the scheduler should be able to schedule them any way it can fit them. So, using the following #PBS directive, we should be able to request 40 processors:
>>>
>>> #PBS -l procs=40
>>>
>>> Instead, we see that the scheduler seems to take this information, read it, and basically disregard it. The reason I know it reads it is that if I ask for, say, 40 processors and 40 processors are available in the cluster, it works as expected and all is right with the world. Where it gets choppier is when I ask for 40 processors and only 1 processor is available. The job doesn't wait in the queue for the remaining 39 processors to open up; instead, PBS simply starts the job on that one processor. I can't see how that is anything but a bug. If the user is asking for 40 processors, why isn't the scheduler waiting for all 40 processors to open up?
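>>>
>>> For illustration, here is a rough sketch of what such a job script could look like (the "packed" alternative is shown commented out; the solver name is made up):
>>>
>>> #!/bin/bash
>>> ### "scatter" style: 40 slots anywhere in the cluster
>>> #PBS -l procs=40
>>> ### the "packed" alternative would instead be:  #PBS -l nodes=5:ppn=8
>>> #PBS -l pmem=700mb
>>> #PBS -l walltime=744:00:00
>>>
>>> cd $PBS_O_WORKDIR
>>> ### torque writes one line per granted slot into $PBS_NODEFILE
>>> NP=$(wc -l < $PBS_NODEFILE)
>>> mpiexec -n $NP ./our_solver    # made-up executable name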
>>>
>>> I'll also post this to the maui list, so I apologize if you receive it twice. I'm just not sure if this is a problem with torque, maui, or both. If answering this question will require additional information, please ask. We are at our wits' end here.
>>>
>>> Thanks!
>>>
>>> -Lance
>>>
>>>
>>>
>>>
>>> On Nov 18, 2011, at 11:12 AM, Lance Westerhoff wrote:
>>>
>>>>
>>>> Hi Steve-
>>>>
>>>> Here you go. Here are the top few lines of the job script. I have then provided the output you requested, along with the maui.cfg. If you need anything further, please let me know.
>>>>
>>>> Thanks for your help!
>>>>
>>>> ===============
>>>>
>>>> + head job.pbs
>>>>
>>>> #!/bin/bash
>>>> #PBS -S /bin/bash
>>>> #PBS -l procs=100
>>>> #PBS -l pmem=700mb
>>>> #PBS -l walltime=744:00:00
>>>> #PBS -j oe
>>>> #PBS -q batch
>>>>
>>>> Report run on Fri Nov 18 10:49:38 EST 2011
>>>> + pbsnodes --version
>>>> version: 3.0.2
>>>> + diagnose --version
>>>> maui client version 3.2.6p21
>>>> + checkjob 371010
>>>>
>>>>
>>>> checking job 371010
>>>>
>>>> State: Running
>>>> Creds:  user:josh  group:games  class:batch  qos:DEFAULT
>>>> WallTime: 00:02:35 of 31:00:00:00
>>>> SubmitTime: Fri Nov 18 10:46:33
>>>>  (Time Queued  Total: 00:00:01  Eligible: 00:00:01)
>>>>
>>>> StartTime: Fri Nov 18 10:46:34
>>>> Total Tasks: 1
>>>>
>>>> Req[0]  TaskCount: 26  Partition: DEFAULT
>>>> Network: [NONE]  Memory >= 700M  Disk >= 0  Swap >= 0
>>>> Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
>>>> Dedicated Resources Per Task: PROCS: 1  MEM: 700M
>>>> NodeCount: 10
>>>> Allocated Nodes:
>>>> [compute-0-17:7][compute-0-10:4][compute-0-3:2][compute-0-5:3]
>>>> [compute-0-6:1][compute-0-7:2][compute-0-9:1][compute-0-12:2]
>>>> [compute-0-13:2][compute-0-14:2]
>>>>
>>>>
>>>> IWD: [NONE]  Executable:  [NONE]
>>>> Bypass: 0  StartCount: 1
>>>> PartitionMask: [ALL]
>>>> Flags:       RESTARTABLE
>>>>
>>>> Reservation '371010' (-00:02:09 -> 30:23:57:51  Duration: 31:00:00:00)
>>>> PE:  26.00  StartPriority:  4716
>>>>
>>>> + cat /opt/maui/maui.cfg | grep -v "#" | grep "^[A-Z]"
>>>> SERVERHOST            gondor
>>>> ADMIN1                maui root
>>>> ADMIN3                ALL
>>>> RMCFG[base]  TYPE=PBS
>>>> AMCFG[bank]  TYPE=NONE
>>>> RMPOLLINTERVAL        00:01:00
>>>> SERVERPORT            42559
>>>> SERVERMODE            NORMAL
>>>> LOGFILE               maui.log
>>>> LOGFILEMAXSIZE        10000000
>>>> LOGLEVEL              3
>>>> QUEUETIMEWEIGHT       1
>>>> FSPOLICY              DEDICATEDPS
>>>> FSDEPTH               7
>>>> FSINTERVAL            86400
>>>> FSDECAY               0.50
>>>> FSWEIGHT              200
>>>> FSUSERWEIGHT          1
>>>> FSGROUPWEIGHT         1000
>>>> FSQOSWEIGHT           1000
>>>> FSACCOUNTWEIGHT       1
>>>> FSCLASSWEIGHT         1000
>>>> USERWEIGHT            4
>>>> BACKFILLPOLICY        FIRSTFIT
>>>> RESERVATIONPOLICY     CURRENTHIGHEST
>>>> NODEALLOCATIONPOLICY  MINRESOURCE
>>>> RESERVATIONDEPTH            8
>>>> MAXJOBPERUSERPOLICY         OFF
>>>> MAXJOBPERUSERCOUNT          8
>>>> MAXPROCPERUSERPOLICY        OFF
>>>> MAXPROCPERUSERCOUNT         256
>>>> MAXPROCSECONDPERUSERPOLICY  OFF
>>>> MAXPROCSECONDPERUSERCOUNT   36864000
>>>> MAXJOBQUEUEDPERUSERPOLICY   OFF
>>>> MAXJOBQUEUEDPERUSERCOUNT    2
>>>> JOBNODEMATCHPOLICY          EXACTNODE
>>>> NODEACCESSPOLICY            SHARED
>>>> JOBMAXOVERRUN 99:00:00:00
>>>> DEFERCOUNT 8192
>>>> DEFERTIME  0
>>>> CLASSCFG[developer] FSTARGET=40.00+
>>>> CLASSCFG[lowprio] PRIORITY=-1000
>>>> SRCFG[developer] CLASSLIST=developer
>>>> SRCFG[developer] ACCESS=dedicated
>>>> SRCFG[developer] DAYS=Mon,Tue,Wed,Thu,Fri
>>>> SRCFG[developer] STARTTIME=08:00:00
>>>> SRCFG[developer] ENDTIME=18:00:00
>>>> SRCFG[developer] TIMELIMIT=2:00:00
>>>> SRCFG[developer] RESOURCES=PROCS(8)
>>>> USERCFG[DEFAULT]      FSTARGET=100.0
>>>>
>>>> ===============
>>>>
>>>> -Lance
>>>>
>>>>
>>>> On Nov 18, 2011, at 9:47 AM, Steve Crusan wrote:
>>>>
>>>>>
>>>>>
>>>>> On Nov 18, 2011, at 9:33 AM, Lance Westerhoff wrote:
>>>>>
>>>>>> The request that is placed is for procs=60. Both torque and maui see that there are only 53 processors available, and instead of letting the job sit in the queue and wait for all 60 processors to become available, the scheduler goes ahead and runs the job with what's available. Now, if the user could ask for procs=[50-60], where 50 is the minimum number of processors to provide and 60 is the maximum, this would be a feature. But as it stands, if the user asks for 60 processors and ends up with 2, the job just won't scale properly and he may as well kill it (when it shouldn't have run in the first place).
>>>>>
>>>>> Hi Lance,
>>>>>
>>>>>      Can you post the output of checkjob <jobid> for an incorrectly running job? Let's take a look at what Maui thinks the job is asking for.
>>>>>
>>>>>      Might as well add your maui.cfg file also.
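>>>>>
>>>>>      Roughly, as a sketch (substitute the real jobid; the maui.cfg path may differ on your install):
>>>>>
>>>>>      checkjob <jobid>
>>>>>      grep -v '^#' /opt/maui/maui.cfg | grep '^[A-Z]'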
>>>>>
>>>>>      I've found in the past that procs= is troublesome...
>>>>>
>>>>>>
>>>>>> I'm actually beginning to think the problem may be related to maui. Perhaps I'll post this same question to the maui list and see what comes back.
>>>>>>
>>>>>> This problem is infuriating, though, since without this functionality working as it should, using procs=X makes torque/maui behave more like a submit-and-run system than a queuing system.
>>>>>
>>>>> Agreed. HPC cluster job management should normally be set-it-and-forget-it. Anything other than maintenance, break fixes, and new features would be ridiculously time consuming.
>>>>>
>>>>>>
>>>>>> -Lance
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Message: 3
>>>>>>> Date: Thu, 17 Nov 2011 17:29:17 -0800
>>>>>>> From: "Brock Palen" <brockp at umich.edu>
>>>>>>> Subject: Re: [torqueusers] procs= not working as documented
>>>>>>> To: "Torque Users Mailing List" <torqueusers at supercluster.org>
>>>>>>> Message-ID: <20111118012930.C635E83A8026 at mail.adaptivecomputing.com>
>>>>>>> Content-Type: text/plain; charset="utf-8"
>>>>>>>
>>>>>>> Does maui only see one cpu or does mpiexec only see one cpu?
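>>>>>>>
>>>>>>> (A quick way to tell the two apart, as a rough sketch: print the allocation from inside the job script before calling mpiexec, e.g.
>>>>>>>
>>>>>>> echo "slots granted: $(wc -l < $PBS_NODEFILE)"
>>>>>>> sort $PBS_NODEFILE | uniq -c     # slots assigned per node
>>>>>>>
>>>>>>> If the count is already 1, the scheduler is the likely culprit; if it shows the full 60 but mpiexec still launches a single rank, the problem is on the mpiexec side.)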
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Brock Palen
>>>>>>> (734)936-1985
>>>>>>> brockp at umich.edu
>>>>>>> - Sent from my Palm Pre, please excuse typos
>>>>>>> On Nov 17, 2011 3:19 PM, Lance Westerhoff <lance at quantumbioinc.com> wrote:
>>>>>>>
>>>>>>> Hello All-
>>>>>>>
>>>>>>> It appears that when running with the following specs, the procs= option does not actually work as expected.
>>>>>>>
>>>>>>> ==========================================
>>>>>>>
>>>>>>> #PBS -S /bin/bash
>>>>>>> #PBS -l procs=60
>>>>>>> #PBS -l pmem=700mb
>>>>>>> #PBS -l walltime=744:00:00
>>>>>>> #PBS -j oe
>>>>>>> #PBS -q batch
>>>>>>>
>>>>>>> torque version: 3.0.2 (in v2.5.4, I think the procs option worked as documented)
>>>>>>> maui version: 3.2.6p21 (also tried maui 3.3.1, but there the procs option fails completely and the job asks for only a single CPU)
>>>>>>>
>>>>>>> ==========================================
>>>>>>>
>>>>>>> If there are fewer than 60 processors available in the cluster (in this case there were 53 available), the job will start and take whatever is left instead of waiting for all 60 processors to free up. Any thoughts as to why this might be happening? Sometimes it doesn't really matter, and 53 would be almost as good as 60; however, if only 2 processors are available and the user asks for 60, I would hate for the job to start anyway.
>>>>>>>
>>>>>>> Thank you for your time!
>>>>>>>
>>>>>>> -Lance
>>>>>>>
>>>>>>
>>>>>
>>>>> ----------------------
>>>>> Steve Crusan
>>>>> System Administrator
>>>>> Center for Research Computing
>>>>> University of Rochester
>>>>> https://www.crc.rochester.edu/
>>>>>
>>>>>
>>>>
>>>
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers

