[torqueusers] Lack of nodes & giant server log

Ari Pollak aripollak at gmail.com
Mon Nov 5 16:04:50 MST 2007


Hi all,

I'm testing a new Torque 2.2.1 setup, currently with one MOM set to np=2.

After running a simple qsub test:

  for i in `seq 1000`; do echo 'sleep 10' | qsub -q testQ; done

the server log file grew to over 500 MB, with tons of messages like these
written every second (many lines for every job still in the queue):
11/05/2007 15:43:04;0100;PBS_Server;Req;;Type ModifyJob request received from Scheduler at englab01, sock=10
11/05/2007 15:43:04;0008;PBS_Server;Job;1016.englab01;Job Modified at request of Scheduler at englab01
11/05/2007 15:43:04;0100;PBS_Server;Req;;Type RunJob request received from Scheduler at englab01, sock=10
11/05/2007 15:43:04;0008;PBS_Server;Job;1016.englab01;could not locate requested resources '1:accumulator' (node_spec failed) job allocation request exceeds currently available cluster nodes, 1 requested, 0 available
11/05/2007 15:43:04;0080;PBS_Server;Req;req_reject;Reject reply code=15044(Resource temporarily unavailable MSG=job allocation request exceeds currently available cluster nodes, 1 requested, 0 available), aux=0, type=RunJob, from Scheduler@englab01
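
To see which kinds of entries dominate a log like this, one option is to tally
them by event code and object type. The following is only a minimal sketch: it
assumes the semicolon-separated layout seen in the excerpt above
(timestamp;event code;daemon;object type;object name;message), and the function
name tally_log_events is made up for illustration.

```shell
# Tally server log entries by event code and object type.
# Assumption: fields are semicolon-separated, as in the excerpt above:
#   timestamp;event code;daemon;object type;object name;message
# Lines with fewer than six fields (e.g. wrapped fragments) are skipped.
tally_log_events() {
    awk -F';' 'NF >= 6 { count[$2 ";" $4]++ }
               END { for (k in count) print count[k], k }' "$1" | sort -rn
}
```

It would be called with the path to a server log file as its only argument; the
path depends on where Torque was installed. The output is one line per
(event code, object type) pair, most frequent first.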

I can't imagine this is good for performance, so I tried turning down the
verbosity by setting log_events=0, but lots of entries are still being
written every second. Most of the other server variables are set to their
defaults, and I'm using the default C scheduler, pbs_sched.

Is there anything else I can do to keep the server from writing so much useless
information so often? Would switching to Maui help at all?

Thanks,
Ari


