[torqueusers] Re:[Torque] Job is running but never completed
Manoj Kumar Singh
manoks at cat.ernet.in
Sun Oct 31 21:18:44 MST 2004
Dear Torque users,
When I run a test job on our six-node cluster, the job is accepted and
qstat shows that it is running, but it never completes. Even three days
after submitting the job, qstat still shows that the job is running.
The job "testpbs" is a very simple script, and it should complete
moments after being submitted on the server.
Following is the output of the qstat -f command:
$qstat -f
Job Id: 2.brahma.cluster.lmd.cat.ernet.in
Job_Name = testpbs
Job_Owner = mksingh at brahma.cluster.lmd.cat.ernet.in
job_state = R
queue = workq
server = brahma.cluster.lmd.cat.ernet.in
Checkpoint = u
ctime = Sat Oct 30 12:03:11 2004
Error_Path = brahma.cluster.lmd.cat.ernet.in:/home/mksingh/testpbs.e2
exec_host = node4/0+node3/0+node2/0+node1/0+brahma/0
Hold_Types = n
Join_Path = n
Keep_Files = n
Mail_Points = a
mtime = Mon Nov 1 09:39:07 2004
Output_Path = brahma.cluster.lmd.cat.ernet.in:/home/mksingh/testpbs.o2
Priority = 0
qtime = Sat Oct 30 12:03:11 2004
Rerunable = True
Resource_List.neednodes = 5
Resource_List.nodect = 5
Resource_List.nodes = 5
substate = 42
Variable_List = PBS_O_HOME=/home/mksingh,PBS_O_LANG=en_US.UTF-8,
PBS_O_LOGNAME=mksingh,
PBS_O_PATH=/home/mksingh/Ab-Initio/Graphics/XCrySDen-B1.0bin-static:/u
sr/local/sbin:/usr/local/pgi/linux86/5.2/bin:/opt/intel_cc_80/bin:/opt/
intel_fc_80/bin:/usr/local/mpich-1.2.6/ch_p4/bin:/usr/local/lf9561/bin:
/usr/kerberos/bin:/home/mksingh/Ab-Initio/Graphics/XCrySDen-B1.0bin-sta
tic:/usr/local/sbin:/usr/local/pgi/linux86/5.2/bin:/opt/intel_cc_80/bin
:/opt/intel_fc_80/bin:/usr/local/mpich-1.2.6/ch_p4/bin:/usr/local/lf956
1/bin:/usr/local/bin:/bin:/usr/bin:/home/mksingh/Ab-Initio/Graphics/XCr
ySDen-B1.0bin-static/scripts:/home/mksingh/Ab-Initio/Graphics/XCrySDen-
B1.0bin-static/util:/usr/X11R6/bin:/home/mksingh/Ab-Initio/Graphics/XCr
ySDen-B1.0bin-static/scripts:/home/mksingh/Ab-Initio/Graphics/XCrySDen-
B1.0bin-static/util:/home/mksingh/bin,
PBS_O_MAIL=/var/spool/mail/mksingh,PBS_O_SHELL=/bin/bash,
PBS_O_HOST=brahma.cluster.lmd.cat.ernet.in,
euser = mksingh
egroup = mksingh
hashname = 2.brahma.cl
queue_rank = 2
queue_type = E
comment = Job started on Mon Nov 01 at 09:39
etime = Sat Oct 30 12:03:11 2004
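In case it helps anyone check for the same symptom from a script, here is a minimal Python sketch that reads the "name = value" fields from output like the above and reports the interval between two of the timestamps. The names parse_qstat_full, seconds_between and SAMPLE are hypothetical helpers written for this illustration and are not part of Torque. The sketch assumes a single job in the output and English day and month names, and it simply skips the wrapped Variable_List continuation lines.

```python
from datetime import datetime

# Timestamp layout used by the ctime/qtime/mtime fields,
# e.g. "Sat Oct 30 12:03:11 2004" (assumes an English/C locale).
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"

# A few lines copied from the qstat -f output above, for illustration.
SAMPLE = """\
Job Id: 2.brahma.cluster.lmd.cat.ernet.in
Job_Name = testpbs
job_state = R
mtime = Mon Nov 1 09:39:07 2004
qtime = Sat Oct 30 12:03:11 2004
substate = 42
Variable_List = PBS_O_HOME=/home/mksingh,PBS_O_LANG=en_US.UTF-8,
PBS_O_LOGNAME=mksingh,
"""


def parse_qstat_full(text):
    """Collect the 'name = value' fields of one job into a dict.

    Lines without ' = ' (such as wrapped Variable_List continuations)
    are ignored.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Job Id:"):
            fields["Job Id"] = line.split(":", 1)[1].strip()
        elif " = " in line:
            key, value = line.split(" = ", 1)
            fields[key] = value
    return fields


def seconds_between(fields, start_key, end_key):
    """Seconds elapsed between two timestamp fields of the same job."""
    start = datetime.strptime(fields[start_key], TIME_FORMAT)
    end = datetime.strptime(fields[end_key], TIME_FORMAT)
    return (end - start).total_seconds()


if __name__ == "__main__":
    job = parse_qstat_full(SAMPLE)
    print(job["Job Id"], "state:", job["job_state"])
    print("seconds from qtime to mtime:",
          seconds_between(job, "qtime", "mtime"))
```

The same two functions could be pointed at the captured text of a real qstat -f run instead of SAMPLE.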
Please help me.
Thank you
I am
Manoj
Scientist, CAT
Indore- INDIA