[Mauiusers] Multiple job request peculiarities

Marvin Novaglobal marvin.novaglobal at gmail.com
Thu Mar 24 20:46:08 MDT 2011


Hi Peter,
    That doesn't work on my setup. To clarify, the failure has only shown up
with nodes=3 and nodes=5 so far; we don't have enough resources to test
nodes=7. So again,
qsub -l nodes=1:ppn=12+1:ppn=1 works, but
qsub -l nodes=3:ppn=12+1:ppn=1 does not.
    Which versions of Maui and Torque are you using? Could you also share
your Maui and Torque configs?
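
A minimal sketch for reproducing the pattern, assuming a POSIX shell. The
helper names (build_request, total_tasks, print_matrix) are made up for
illustration; only the -l nodes=... syntax and the job script name
parallel.sh come from this thread. It prints the qsub commands instead of
submitting them, together with the task count each request implies
(3 x 12 + 1 = 37 for the failing nodes=3 case).

```shell
#!/bin/sh
# Hypothetical helpers: nothing here is part of Torque or Maui.

# Build a two-part request: $1 nodes with $2 cores each (default 12),
# plus one extra node with a single core.
build_request() {
    printf 'nodes=%d:ppn=%d+1:ppn=1\n' "$1" "${2:-12}"
}

# Tasks implied by that request: nodes * ppn for the first part, 1 for the second.
total_tasks() {
    echo $(( $1 * ${2:-12} + 1 ))
}

# Print (do not submit) one qsub command line per node count given.
print_matrix() {
    for n in "$@"; do
        printf 'qsub -l %s parallel.sh   # %d tasks\n' \
            "$(build_request "$n")" "$(total_tasks "$n")"
    done
}

print_matrix 1 2 3 4 5
```

Once the printed lines look right, they can be pasted into a shell one at a
time, which makes it easy to note which node counts leave the job idle.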



Regards,
Marvin


On Fri, Mar 25, 2011 at 12:20 AM, Peter Michael Crosta <pmc2107 at columbia.edu> wrote:

> Hi Marvin,
>
> I have gotten multiple resource requests to work by using the "+" sign.
> Have you tried
>
> qsub -l nodes=3:ppn=12+1:ppn=1 ?
>
> Best,
> Peter
>
>
> On Thu, 24 Mar 2011, Marvin Novaglobal wrote:
>
>> Hi,
>> On my setup,
>> $ qsub -l nodes=1:ppn=12:1:ppn=1 (works)
>> $ qsub -l nodes=2:ppn=12:1:ppn=1 (works)
>> $ qsub -l nodes=3:ppn=12:1:ppn=1 (job goes to idle and never gets executed)
>> $ qsub -l nodes=4:ppn=12:1:ppn=1 (works)
>> $ qsub -l nodes=5:ppn=12:1:ppn=1 (job goes to idle and never gets executed)
>>
>> <Maui.cfg>
>> ...
>> ENABLEMULTINODEJOBS[0]            TRUE
>> ENABLEMULTIREQJOBS[0]              TRUE
>> JOBNODEMATCHPOLICY[0]             EXACTNODE
>> NODEALLOCATIONPOLICY[0]           MINRESOURCE
>>
>>
>> <Torque.cfg>
>> set server scheduling = True
>> set server acl_hosts = aquarius.local
>> set server managers = torque at aquarius
>> set server operators = torque at aquarius
>> set server default_queue = DEFAULT
>> set server log_events = 511
>> set server mail_from = adm
>> set server resources_available.nodect = 2048
>> set server scheduler_iteration = 600
>> set server node_check_rate = 150
>> set server tcp_timeout = 6
>> set server mom_job_sync = True
>> set server keep_completed = 300
>> set server next_job_number = 377
>>
>> <maui.log>
>> 03/24 20:23:48 MResDestroy(377)
>> 03/24 20:23:48 MResChargeAllocation(377,2)
>> 03/24 20:23:48 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
>> 03/24 20:23:48 INFO:     total jobs selected in partition ALL: 1/1
>> 03/24 20:23:48 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,DEFAULT,FReason,TRUE)
>> 03/24 20:23:48 INFO:     total jobs selected in partition DEFAULT: 1/1
>> 03/24 20:23:48 MQueueScheduleIJobs(Q,DEFAULT)
>> 03/24 20:23:48 INFO:     72 feasible tasks found for job 377:0 in partition DEFAULT (36 Needed)
>> 03/24 20:23:48 INFO:     72 feasible tasks found for job 377:1 in partition DEFAULT (1 Needed)
>> 03/24 20:23:48 ALERT:    inadequate tasks to allocate to job 377:1 (0 < 1)
>> 03/24 20:23:48 ERROR:    cannot allocate nodes to job '377' in partition DEFAULT
>> 03/24 20:23:48 MJobPReserve(377,DEFAULT,ResCount,ResCountRej)
>> 03/24 20:23:48 MJobReserve(377,Priority)
>> 03/24 20:23:48 INFO:     72 feasible tasks found for job 377:0 in partition DEFAULT (36 Needed)
>> 03/24 20:23:48 INFO:     72 feasible tasks found for job 377:1 in partition DEFAULT (1 Needed)
>> 03/24 20:23:48 INFO:     72 feasible tasks found for job 377:0 in partition DEFAULT (36 Needed)
>> 03/24 20:23:48 INFO:     72 feasible tasks found for job 377:1 in partition DEFAULT (1 Needed)
>> 03/24 20:23:48 INFO:     located resources for 36 tasks (144) in best partition DEFAULT for job 377 at time 00:00:01
>> 03/24 20:23:48 INFO:     tasks located for job 377:  37 of 36 required (144 feasible)
>> 03/24 20:23:48 MResJCreate(377,MNodeList,00:00:01,Priority,Res)
>> 03/24 20:23:48 INFO:     job '377' reserved 36 tasks (partition DEFAULT) to start in 00:00:01 on Thu Mar 24 20:23:49 (WC: 2592000)
>>
>> <pbs_server.log>
>> 03/24/2011 20:23:17;0100;PBS_Server;Job;377.aquarius;enqueuing into DEFAULT, state 1 hop 1
>> 03/24/2011 20:23:17;0008;PBS_Server;Job;377.aquarius;Job Queued at request of torque at aquarius, owner = torque at aquarius, job name = parallel.sh, queue = DEFAULT
>> 03/24/2011 20:23:17;0040;PBS_Server;Svr;aquarius;Scheduler was sent the command new
>>
>>
>> Has anyone encountered problems with multiple job requests?
>>
>>
>> Regards,
>> Marvin
>>
>>
>>
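
To cut a maui.log like the one quoted above down to the lines that matter,
something like the following may help. It is only a sketch: job_problems is a
hypothetical helper name, the log path is a placeholder, and the pattern
assumes the ALERT:/ERROR: line format seen in the excerpt (job IDs written as
377:1 or '377').

```shell
#!/bin/sh
# job_problems LOGFILE JOBID
# Print only the ALERT/ERROR lines that mention the given job ID.
# The trailing bracket expression requires ':', a quote, or a space after
# the ID, so that job 377 does not also match job 3770.
job_problems() {
    grep -E '(ALERT|ERROR):' "$1" | grep -E "job '?$2[:' ]"
}

# Usage (placeholder path, adjust to the local Maui log location):
#   job_problems /path/to/maui.log 377
```

For the excerpt above this would leave just the "inadequate tasks to allocate"
and "cannot allocate nodes" lines for job 377.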