[Mauiusers] Multiple job request peculiarities

Marvin Novaglobal marvin.novaglobal at gmail.com
Thu Mar 24 06:27:04 MDT 2011


Hi,
    On my setup:
$ qsub -l nodes=1:ppn=12:1:ppn=1 (works)
$ qsub -l nodes=2:ppn=12:1:ppn=1 (works)
$ qsub -l nodes=3:ppn=12:1:ppn=1 (job goes idle and never gets executed)
$ qsub -l nodes=4:ppn=12:1:ppn=1 (works)
$ qsub -l nodes=5:ppn=12:1:ppn=1 (job goes idle and never gets executed)

<Maui.cfg>
...
ENABLEMULTINODEJOBS[0]            TRUE
ENABLEMULTIREQJOBS[0]              TRUE
JOBNODEMATCHPOLICY[0]             EXACTNODE
NODEALLOCATIONPOLICY[0]           MINRESOURCE


<Torque.cfg>
set server scheduling = True
set server acl_hosts = aquarius.local
set server managers = torque@aquarius
set server operators = torque@aquarius
set server default_queue = DEFAULT
set server log_events = 511
set server mail_from = adm
set server resources_available.nodect = 2048
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server mom_job_sync = True
set server keep_completed = 300
set server next_job_number = 377

<maui.log>
03/24 20:23:48 MResDestroy(377)
03/24 20:23:48 MResChargeAllocation(377,2)
03/24 20:23:48 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
03/24 20:23:48 INFO:     total jobs selected in partition ALL: 1/1
03/24 20:23:48 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,DEFAULT,FReason,TRUE)
03/24 20:23:48 INFO:     total jobs selected in partition DEFAULT: 1/1
03/24 20:23:48 MQueueScheduleIJobs(Q,DEFAULT)
03/24 20:23:48 INFO:     72 feasible tasks found for job 377:0 in partition DEFAULT (36 Needed)
03/24 20:23:48 INFO:     72 feasible tasks found for job 377:1 in partition DEFAULT (1 Needed)
03/24 20:23:48 ALERT:    inadequate tasks to allocate to job 377:1 (0 < 1)
03/24 20:23:48 ERROR:    cannot allocate nodes to job '377' in partition DEFAULT
03/24 20:23:48 MJobPReserve(377,DEFAULT,ResCount,ResCountRej)
03/24 20:23:48 MJobReserve(377,Priority)
03/24 20:23:48 INFO:     72 feasible tasks found for job 377:0 in partition DEFAULT (36 Needed)
03/24 20:23:48 INFO:     72 feasible tasks found for job 377:1 in partition DEFAULT (1 Needed)
03/24 20:23:48 INFO:     72 feasible tasks found for job 377:0 in partition DEFAULT (36 Needed)
03/24 20:23:48 INFO:     72 feasible tasks found for job 377:1 in partition DEFAULT (1 Needed)
03/24 20:23:48 INFO:     located resources for 36 tasks (144) in best partition DEFAULT for job 377 at time 00:00:01
03/24 20:23:48 INFO:     tasks located for job 377:  37 of 36 required (144 feasible)
03/24 20:23:48 MResJCreate(377,MNodeList,00:00:01,Priority,Res)
03/24 20:23:48 INFO:     job '377' reserved 36 tasks (partition DEFAULT) to start in 00:00:01 on Thu Mar 24 20:23:49 (WC: 2592000)
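
For anyone who wants to check the per-request numbers in the maui.log excerpt above without reading it line by line, here is a minimal sketch in Python. It is not part of Maui or Torque. The function name summarize and both regular expressions are my own assumptions, written only against the line formats shown in this excerpt, so other Maui versions may need different patterns. It reports, for each job:req pair, how many tasks were feasible, how many were needed, and whether an "inadequate tasks" alert was logged.

```python
import re

# Patterns are assumptions based only on the maui.log lines quoted above.
FEASIBLE = re.compile(
    r"(\d+) feasible tasks found for job (\d+):(\d+) "
    r"in partition (\S+) \((\d+) Needed\)"
)
INADEQUATE = re.compile(
    r"inadequate tasks to allocate to job (\d+):(\d+) \((\d+) < (\d+)\)"
)


def summarize(log_text):
    """Return {(job, req): {"feasible", "needed", "shortfall"}}."""
    # Collapse all whitespace so mail-wrapped log lines are matched too.
    flat = " ".join(log_text.split())
    reqs = {}
    for m in FEASIBLE.finditer(flat):
        feasible, job, req, _partition, needed = m.groups()
        reqs[(job, int(req))] = {
            "feasible": int(feasible),
            "needed": int(needed),
            "shortfall": None,
        }
    for m in INADEQUATE.finditer(flat):
        job, req, got, want = m.groups()
        entry = reqs.setdefault(
            (job, int(req)),
            {"feasible": None, "needed": int(want), "shortfall": None},
        )
        # (allocated, required) exactly as printed in the ALERT line.
        entry["shortfall"] = (int(got), int(want))
    return reqs


SAMPLE = """\
03/24 20:23:48 INFO:     72 feasible tasks found for job 377:0 in partition
DEFAULT (36 Needed)
03/24 20:23:48 INFO:     72 feasible tasks found for job 377:1 in partition
DEFAULT (1 Needed)
03/24 20:23:48 ALERT:    inadequate tasks to allocate to job 377:1 (0 < 1)
"""

if __name__ == "__main__":
    for (job, req), info in sorted(summarize(SAMPLE).items()):
        print(f"job {job} req {req}: {info}")
```

Run against the excerpt, this flags request 377:1 as the one that got 0 of the 1 task it needed, even though 72 tasks were reported feasible for it.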

<pbs_server.log>
03/24/2011 20:23:17;0100;PBS_Server;Job;377.aquarius;enqueuing into DEFAULT, state 1 hop 1
03/24/2011 20:23:17;0008;PBS_Server;Job;377.aquarius;Job Queued at request of torque@aquarius, owner = torque@aquarius, job name = parallel.sh, queue = DEFAULT
03/24/2011 20:23:17;0040;PBS_Server;Svr;aquarius;Scheduler was sent the command new


Has anyone else run into problems with multi-request jobs?


Regards,
Marvin