[torqueusers] a problem about torque (Post job file processing error)
=?big5?B?s6+0ZqzA?=
chwhs at hpds.ee.ncku.edu.tw
Fri Jan 26 03:16:45 MST 2007
Hi,
When I execute
qsub testpbs --> it's OK
qsub testpbs -l nodes=2 --> no output
So I used the tracejob command to trace the job:
Job: 65.g1.ee.ncku.edu.tw
01/21/2007 20:26:16 M JOIN JOB as node 1
01/21/2007 20:26:16 S enqueuing into batch, state 1 hop 1
01/21/2007 20:26:16 S Job Queued at request of cs@g1.ee.ncku.edu.tw, owner = cs@g1.ee.ncku.edu.tw, job name = testpbs, queue = batch
01/21/2007 20:26:16 S Job Modified at request of Scheduler@g1.ee.ncku.edu.tw
01/21/2007 20:26:16 L Job Run
01/21/2007 20:26:16 S Job Run at request of Scheduler@g1.ee.ncku.edu.tw
01/21/2007 20:26:17 S Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=536kb resources_used.vmem=6392kb resources_used.walltime=00:00:00
01/21/2007 20:26:17 M kill_job received
01/21/2007 20:26:25 S Post job file processing error
01/21/2007 20:26:25 S dequeuing from batch, state COMPLETE
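
To make the timeline in the trace easier to read, the lines can be parsed with a short script. This is only a sketch: it assumes each line has the form "date time letter message" as shown above, the sample lines are copied from this post, and the helper name parse_trace is made up for illustration. It does not diagnose the error; it only shows that the job exited with status 0 and that the error was logged some seconds later.

```python
from datetime import datetime

# Sample lines copied from the tracejob output in this post.
TRACE = """\
01/21/2007 20:26:16 L Job Run
01/21/2007 20:26:17 S Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=536kb
01/21/2007 20:26:17 M kill_job received
01/21/2007 20:26:25 S Post job file processing error
01/21/2007 20:26:25 S dequeuing from batch, state COMPLETE
"""


def parse_trace(text):
    """Return (timestamp, source_letter, message) for each well-formed line."""
    events = []
    for line in text.splitlines():
        parts = line.split(None, 3)  # date, time, source letter, message
        if len(parts) < 4:
            continue
        try:
            stamp = datetime.strptime(parts[0] + " " + parts[1],
                                      "%m/%d/%Y %H:%M:%S")
        except ValueError:
            continue  # header or wrapped continuation line, skip it
        events.append((stamp, parts[2], parts[3]))
    return events


events = parse_trace(TRACE)

# Lines whose message mentions an error.
errors = [e for e in events if "error" in e[2].lower()]

# The line that records the job's exit status.
exit_events = [e for e in events if e[2].startswith("Exit_status=")]
exit_status = int(exit_events[0][2].split()[0].split("=")[1])

# Seconds between the job exiting and the first error being logged.
gap = (errors[0][0] - exit_events[0][0]).total_seconds()

print("exit status:", exit_status)
print("first error:", errors[0][2])
print("seconds between exit and error:", gap)
```

The script itself finished cleanly (Exit_status=0); the error appears only in the step after the job ends.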
------------------------------------------------------------------------------
pbsnodes -a
g1
state = free
np = 4
ntype = cluster
status = opsys=linux,uname=Linux g1.ee.ncku.edu.tw 2.6.15-1.2054_FC5 #1 SMP Tue Mar 14 15:48:20 EST 2006 x86_64,sessions=7332 669 25304 25614 25797 6273 7121 14546 14601 31432,nsessions=10,nusers=5,idletime=17343,totmem=7535192kb,availmem=6838968kb,physmem=3342236kb,ncpus=4,loadave=0.04,netload=3296700297,state=free,jobs=? 15201,rectime=1169384151
g2
state = free
np = 4
ntype = cluster
status = opsys=linux,uname=Linux g2.ee.ncku.edu.tw 2.6.15-1.2054_FC5 #1 SMP Tue Mar 14 15:48:20 EST 2006 x86_64,sessions=2704 15991 16026,nsessions=3,nusers=2,idletime=349429,totmem=5438732kb,availmem=4471124kb,physmem=3342260kb,ncpus=4,loadave=0.00,netload=5519608940,state=free,jobs=? 15201,rectime=1169384196
g3
state = free
np = 4
ntype = cluster
status = opsys=linux,uname=Linux g3.ee.ncku.edu.tw 2.6.15-1.2054_FC5 #1 SMP Tue Mar 14 15:48:20 EST 2006 x86_64,sessions=2286 10173 10549,nsessions=3,nusers=2,idletime=258120,totmem=11270312kb,availmem=10289352kb,physmem=3342244kb,ncpus=4,loadave=0.00,netload=485218860,state=free,jobs=? 15201,rectime=1169384176
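
The long status lines are hard to scan by eye, so here is a small sketch that splits one of them into fields. It assumes the status value is a comma-separated list of key=value pairs, as it appears above; the string is a shortened copy of the g1 line from this post, and parse_status is a hypothetical helper name, not part of Torque.

```python
# Shortened status string for g1, copied from the pbsnodes output in this post.
STATUS = (
    "opsys=linux,uname=Linux g1.ee.ncku.edu.tw 2.6.15-1.2054_FC5 #1 SMP "
    "Tue Mar 14 15:48:20 EST 2006 x86_64,nsessions=10,nusers=5,"
    "ncpus=4,loadave=0.04,state=free,jobs=? 15201"
)


def parse_status(status):
    """Split a comma-separated key=value status string into a dict."""
    fields = {}
    for item in status.split(","):
        key, sep, value = item.partition("=")
        if sep:  # ignore fragments that have no '='
            fields[key.strip()] = value.strip()
    return fields


info = parse_status(STATUS)
print(info["state"], info["ncpus"], info["loadave"])
```

Run on each node's status line, this gives a quick check that all three nodes report state=free and ncpus=4.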
-------------------------------------------------------------------------------
[root@g1 ~]# /opt/pbs/bin/qmgr -c 'print server'
#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server operators = root@g1.ee.ncku.edu.tw
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server resources_default.nodes = 1
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server pbs_version = 2.0.0p11
---------------------------------------------------------------
I don't know what I forgot to do,
so I am rather at sea. ^^a
What should I do?
Thanks for your help!
Regards,
San.