[torqueusers] a problem about torque (Post job file processing error)

=?big5?B?s6+0ZqzA?= chwhs at hpds.ee.ncku.edu.tw
Fri Jan 26 03:16:45 MST 2007


Hi

When I run

qsub testpbs     -->   it works

qsub testpbs -l nodes=2  -->   no output

So I used the tracejob command to trace the job:

Job: 65.g1.ee.ncku.edu.tw

01/21/2007 20:26:16  M    JOIN JOB as node 1
01/21/2007 20:26:16  S    enqueuing into batch, state 1 hop 1
01/21/2007 20:26:16  S    Job Queued at request of cs@g1.ee.ncku.edu.tw, owner = cs@g1.ee.ncku.edu.tw, job name =
                          testpbs, queue = batch
01/21/2007 20:26:16  S    Job Modified at request of Scheduler@g1.ee.ncku.edu.tw
01/21/2007 20:26:16  L    Job Run
01/21/2007 20:26:16  S    Job Run at request of Scheduler@g1.ee.ncku.edu.tw
01/21/2007 20:26:17  S    Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=536kb resources_used.vmem=6392kb
                          resources_used.walltime=00:00:00
01/21/2007 20:26:17  M    kill_job received
01/21/2007 20:26:25  S    Post job file processing error
01/21/2007 20:26:25  S    dequeuing from batch, state COMPLETE
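
For anyone who wants to pick the failing step out of a trace like this automatically, here is a minimal Python sketch. It is not part of TORQUE; the line layout (timestamp, one-letter source code, message) is only assumed from the output above, and wrapped lines are assumed to start with whitespace. The names parse_tracejob and find_errors are made up for this example.

```python
import re

# Layout assumed from the output above: "MM/DD/YYYY HH:MM:SS  X    message",
# where X is the one-letter source code printed by tracejob.
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+([A-Z])\s+(.*)$")


def parse_tracejob(text):
    """Split tracejob output into (timestamp, source, message) tuples."""
    events = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw)
        if match:
            events.append([match.group(1), match.group(2), match.group(3).strip()])
        elif raw[:1].isspace() and raw.strip() and events:
            # Wrapped line: glue it onto the previous message.
            events[-1][2] += " " + raw.strip()
    return [tuple(event) for event in events]


def find_errors(events):
    """Return the events whose message mentions an error."""
    return [event for event in events if "error" in event[2].lower()]


# A few lines copied from the trace above, including one wrapped line.
SAMPLE = """\
Job: 65.g1.ee.ncku.edu.tw

01/21/2007 20:26:16  M    JOIN JOB as node 1
01/21/2007 20:26:17  S    Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=536kb resources_used.vmem=6392kb
                          resources_used.walltime=00:00:00
01/21/2007 20:26:17  M    kill_job received
01/21/2007 20:26:25  S    Post job file processing error
"""

if __name__ == "__main__":
    for stamp, source, message in find_errors(parse_tracejob(SAMPLE)):
        print(stamp, source, message)
```

On the sample lines this reports only the "Post job file processing error" entry; the job itself exited with Exit_status=0 before that point.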

------------------------------------------------------------------------------
pbsnodes -a
g1
     state = free
     np = 4
     ntype = cluster
     status = opsys=linux,uname=Linux g1.ee.ncku.edu.tw 2.6.15-1.2054_FC5 #1 SMP Tue Mar 14 15:48:20 EST 2006 x86_64,sessions=7332 669 25304 25614 25797 6273 7121 14546 14601 31432,nsessions=10,nusers=5,idletime=17343,totmem=7535192kb,availmem=6838968kb,physmem=3342236kb,ncpus=4,loadave=0.04,netload=3296700297,state=free,jobs=? 15201,rectime=1169384151

g2
     state = free
     np = 4
     ntype = cluster
     status = opsys=linux,uname=Linux g2.ee.ncku.edu.tw 2.6.15-1.2054_FC5 #1 SMP Tue Mar 14 15:48:20 EST 2006 x86_64,sessions=2704 15991 16026,nsessions=3,nusers=2,idletime=349429,totmem=5438732kb,availmem=4471124kb,physmem=3342260kb,ncpus=4,loadave=0.00,netload=5519608940,state=free,jobs=? 15201,rectime=1169384196

g3
     state = free
     np = 4
     ntype = cluster
     status = opsys=linux,uname=Linux g3.ee.ncku.edu.tw 2.6.15-1.2054_FC5 #1 SMP Tue Mar 14 15:48:20 EST 2006 x86_64,sessions=2286 10173 10549,nsessions=3,nusers=2,idletime=258120,totmem=11270312kb,availmem=10289352kb,physmem=3342244kb,ncpus=4,loadave=0.00,netload=485218860,state=free,jobs=? 15201,rectime=1169384176
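
The status field is a comma-separated list of key=value pairs, so the nodes can be compared with a few lines of Python. This is only a rough sketch: it assumes no value contains a comma (true for the output above), the helper names parse_status and kb_to_int are invented here, and the sample string is an abbreviated copy of the g2 status line.

```python
def parse_status(status):
    """Turn a pbsnodes status string into a dict of key -> value strings."""
    fields = {}
    for item in status.split(","):
        key, sep, value = item.partition("=")
        if sep:  # skip anything that is not a key=value pair
            fields[key.strip()] = value.strip()
    return fields


def kb_to_int(value):
    """Convert a value such as '4471124kb' to an integer number of kB."""
    return int(value[:-2]) if value.endswith("kb") else int(value)


# Abbreviated from the g2 entry above (uname, netload etc. left out).
G2_STATUS = (
    "opsys=linux,sessions=2704 15991 16026,nsessions=3,nusers=2,"
    "idletime=349429,totmem=5438732kb,availmem=4471124kb,"
    "physmem=3342260kb,ncpus=4,loadave=0.00,state=free"
)

if __name__ == "__main__":
    info = parse_status(G2_STATUS)
    print("ncpus:", info["ncpus"])
    print("available memory (kB):", kb_to_int(info["availmem"]))
    print("sessions:", info["sessions"].split())
```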

-------------------------------------------------------------------------------
[root@g1 ~]# /opt/pbs/bin/qmgr -c 'print server'
#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server operators = root@g1.ee.ncku.edu.tw
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server resources_default.nodes = 1
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server pbs_version = 2.0.0p11


---------------------------------------------------------------
I don't know what I have forgotten to do,
so I am rather at sea.   ^^a

What should I do?
Thanks for your help!

Regards,
San.



 