[Mauiusers] procs= with torque 3.05-1 and maui 3.3.1-1

Daniel Davidson danield at igb.uiuc.edu
Fri Sep 21 13:54:29 MDT 2012


I am working on finalizing our cluster setup, and part of that is 
nailing down the torque/maui config.

I have been looking at what happens in maui when someone submits a 
script with qsub -l procs=x blah.sh.  Right now, it looks like maui is 
ignoring the procs request.  Here is an example:

bash-4.1$ qsub -I -q test_queue -l procs=6
qsub: waiting for job 76338.biocluster.igb.illinois.edu to start
qsub: job 76338.biocluster.igb.illinois.edu ready

-bash-4.1$

However, when I run tracejob:

[root at biocluster init.d]# tracejob -v 76338
/var/spool/torque/server_priv/accounting/20120921: Successfully located 
matching job records
/var/spool/torque/server_logs/20120921: Successfully located matching 
job records
/var/spool/torque/mom_logs/20120921: No such file or directory
/var/spool/torque/sched_logs/20120921: No such file or directory

Job: 76338.biocluster.igb.illinois.edu

09/21/2012 14:47:16  S    enqueuing into test_queue, state 1 hop 1
09/21/2012 14:47:16  S    Job Queued at request of 
danield at biocluster.igb.illinois.edu, owner = 
danield at biocluster.igb.illinois.edu, job name = STDIN, queue = test_queue
09/21/2012 14:47:16  A    queue=test_queue
09/21/2012 14:47:17  S    Job Run at request of 
maui at biocluster.igb.illinois.edu
09/21/2012 14:47:17  S    Not sending email: User does not want mail of 
this type.
09/21/2012 14:47:17  A    user=danield group=danield jobname=STDIN 
queue=test_queue ctime=1348256836 qtime=1348256836 etime=1348256836 
start=1348256837 owner=danield at biocluster.igb.illinois.edu 
exec_host=compute-0-1/0
                           Resource_List.mem=3gb Resource_List.ncpus=1 
Resource_List.neednodes=1 Resource_List.nodect=1 Resource_List.nodes=1 
Resource_List.procs=6

So it looks like only one processor is reserved: exec_host lists a 
single slot, compute-0-1/0.  If I change procs=6 to nodes=1:ppn=6 then 
it works as expected:
[root at biocluster init.d]# tracejob -v 76340
/var/spool/torque/server_priv/accounting/20120921: Successfully located 
matching job records
/var/spool/torque/server_logs/20120921: Successfully located matching 
job records
/var/spool/torque/mom_logs/20120921: No such file or directory
/var/spool/torque/sched_logs/20120921: No such file or directory

Job: 76340.biocluster.igb.illinois.edu

09/21/2012 14:50:12  S    enqueuing into test_queue, state 1 hop 1
09/21/2012 14:50:12  S    Job Queued at request of 
danield at biocluster.igb.illinois.edu, owner = 
danield at biocluster.igb.illinois.edu, job name = STDIN, queue = test_queue
09/21/2012 14:50:12  A    queue=test_queue
09/21/2012 14:50:13  S    Job Run at request of 
maui at biocluster.igb.illinois.edu
09/21/2012 14:50:13  S    Not sending email: User does not want mail of 
this type.
09/21/2012 14:50:13  A    user=danield group=danield jobname=STDIN 
queue=test_queue ctime=1348257012 qtime=1348257012 etime=1348257012 
start=1348257013 owner=danield at biocluster.igb.illinois.edu
exec_host=compute-0-1/5+compute-0-1/4+compute-0-1/3+compute-0-1/2+compute-0-1/1+compute-0-1/0 
Resource_List.mem=3gb Resource_List.ncpus=1 
Resource_List.neednodes=1:ppn=6 Resource_List.nodect=1
                           Resource_List.nodes=1:ppn=6
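
For a quick comparison of the two records, the slots in each exec_host 
string can be counted per node.  This is only a sketch: it assumes 
exec_host is a '+'-separated list of host/slot entries, as in the two 
records above, and the helper name count_slots is made up for 
illustration and is not part of torque or maui.

```python
from collections import Counter


def count_slots(exec_host: str) -> Counter:
    """Count allocated slots per host in an exec_host string.

    Assumes the 'host/slot+host/slot+...' layout seen in the
    accounting records above (hypothetical helper, not a torque API).
    """
    hosts = (entry.split("/")[0] for entry in exec_host.split("+") if entry)
    return Counter(hosts)


# exec_host values copied from the two tracejob records
procs_job = "compute-0-1/0"  # job 76338, submitted with procs=6
ppn_job = (                  # job 76340, submitted with nodes=1:ppn=6
    "compute-0-1/5+compute-0-1/4+compute-0-1/3+"
    "compute-0-1/2+compute-0-1/1+compute-0-1/0"
)

print(sum(count_slots(procs_job).values()))  # total slots for job 76338
print(sum(count_slots(ppn_job).values()))    # total slots for job 76340
```

Comparing these totals with Resource_List.procs or the ppn value in the 
same record shows whether the scheduler gave the job what it asked for.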

Can someone let me know why this would be, and why ncpus isn't set 
correctly in the last job?  If I am mistaken about what the procs field 
means, please let me know.

Dan

