[torqueusers] getting started

Alexander Saydakov saydakov at yahoo-inc.com
Thu Jan 19 13:17:27 MST 2006


Hi!

I am trying out Torque 2.0.0p5 on FreeBSD. At first it failed to compile:
a couple of lines appear to be missing in
src/resmom/freebsd/mom_mach.c. Here is my patch:

141a142,143
> extern int LOGLEVEL;
>
1754a1757,1758
>   char        *id = "setmax";
>

After that it appears to build and run just fine, except that the GUI x*
binaries are missing despite being enabled by default. It looks like some
build script cannot find tclsh for src/tools/xpbsmon/buildindex. I have not
looked into this further for now.

One more suspicious thing is in the pbsnodes report:

status = opsys=freebsd,uname=FreeBSD 4.10:i386,sessions= 60195 132 59821 59851 164 68625 5032,nsessions=7,nusers=4,idletime=258322,totmem=? 15201,availmem=? 15201,physmem=2058388kb,ncpus=4,loadave=1.98,gres=pbsserver:pilgrim.corp,netload=? 15201,state=busy,jobs=? 15201,rectime=1136576517

What do entries like 'totmem=? 15201' mean? Do they indicate a problem?
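
To see at a glance which attributes came back as '?', the status line can be split into name/value pairs. This is only a rough sketch in Python: the function names are made up for illustration, it assumes that no value contains a comma, and it does not try to interpret the number that follows the '?'.

```python
def parse_status(status):
    """Split a pbsnodes 'status = ...' line into a dict of attribute -> value.

    Assumes attributes are comma-separated and no value contains a comma.
    """
    text = status.strip()
    # Drop the leading "status =" label if it is present.
    if text.startswith("status ="):
        text = text.split("=", 1)[1].strip()
    attrs = {}
    for field in text.split(","):
        name, _, value = field.partition("=")
        attrs[name.strip()] = value.strip()
    return attrs


def unknown_attributes(attrs):
    """Return the sorted names of attributes whose value starts with '?'."""
    return sorted(name for name, value in attrs.items() if value.startswith("?"))


# The status line from the report above, joined into one string.
STATUS = (
    "status = opsys=freebsd,uname=FreeBSD 4.10:i386,"
    "sessions= 60195 132 59821 59851 164 68625 5032,nsessions=7,nusers=4,"
    "idletime=258322,totmem=? 15201,availmem=? 15201,physmem=2058388kb,"
    "ncpus=4,loadave=1.98,gres=pbsserver:pilgrim.corp,netload=? 15201,"
    "state=busy,jobs=? 15201,rectime=1136576517"
)

if __name__ == "__main__":
    print(unknown_attributes(parse_status(STATUS)))
    # ['availmem', 'jobs', 'netload', 'totmem']
```

For this node the list is totmem, availmem, netload and jobs, while physmem, ncpus and loadave are reported normally.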

 

Now the real question: I would like to run batches of 500 rather small jobs
(each takes from a few minutes to about 1 hour). Each job is an instance of
the same script with a number from 1 to 500 as a parameter. I have 5
dual-CPU x86 machines, so I configured 4 slots (ncpus) on each. I would like
to submit all 500 jobs so that they occupy all 20 slots, with the next job
starting as soon as a slot becomes free. What is the best way to do that?
Currently I run qsub 500 times, which is slow. It would also be great if I
could treat those jobs as a group: hold, delete, change priority. Is that
possible?
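
For reference, here is roughly what the current approach looks like when scripted, together with one way the jobs might be handled as a group afterwards. It is only a sketch in Python: the script name job.sh, the JOB_INDEX variable and the "batch" name prefix are placeholders, and it assumes that qsub prints the new job ID on stdout and that qhold, qdel and qalter accept several job IDs at once. It still calls qsub once per job, so it does not address the speed problem.

```python
import subprocess


def build_qsub_commands(script, count, group="batch"):
    """Return one qsub argument list per job index 1..count."""
    commands = []
    for index in range(1, count + 1):
        commands.append([
            "qsub",
            "-N", f"{group}_{index}",    # job name carries the group tag
            "-v", f"JOB_INDEX={index}",  # exported to the job's environment
            script,
        ])
    return commands


def submit_all(commands, run=subprocess.run):
    """Run each qsub command and collect the job IDs printed on stdout."""
    job_ids = []
    for command in commands:
        result = run(command, capture_output=True, text=True, check=True)
        job_ids.append(result.stdout.strip())
    return job_ids


def group_command(action, job_ids, extra=()):
    """Build one command line (qhold, qdel, qalter ...) covering all jobs."""
    return [action, *extra, *job_ids]


if __name__ == "__main__":
    # Dry run: show the first two commands instead of submitting anything.
    for command in build_qsub_commands("job.sh", 500)[:2]:
        print(" ".join(command))
```

With the job IDs saved from submit_all, something like group_command("qhold", job_ids) or group_command("qalter", job_ids, extra=("-p", "10")) gives a single command line for the whole batch, which could then be passed to subprocess.run.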

 

Thanks a lot.