[torqueusers] Re:[Torque] Job is running but never completed

Manoj Kumar Singh manoks at cat.ernet.in
Sun Oct 31 21:18:44 MST 2004


Dear Torque users

When I run a test job on our six-node cluster, the job is accepted and
qstat shows it as running, but it never completes. Even three days after
submitting the job, qstat still shows it as running.
The job "testpbs" is a very simple script, and it should complete within
moments of being submitted on the server.

The following is the output of the qstat -f command:

$ qstat -f
Job Id: 2.brahma.cluster.lmd.cat.ernet.in
    Job_Name = testpbs
    Job_Owner = mksingh at brahma.cluster.lmd.cat.ernet.in
    job_state = R
    queue = workq
    server = brahma.cluster.lmd.cat.ernet.in
    Checkpoint = u
    ctime = Sat Oct 30 12:03:11 2004
    Error_Path = brahma.cluster.lmd.cat.ernet.in:/home/mksingh/testpbs.e2
    exec_host = node4/0+node3/0+node2/0+node1/0+brahma/0
    Hold_Types = n
    Join_Path = n
    Keep_Files = n
    Mail_Points = a
    mtime = Mon Nov  1 09:39:07 2004
    Output_Path = brahma.cluster.lmd.cat.ernet.in:/home/mksingh/testpbs.o2
    Priority = 0
    qtime = Sat Oct 30 12:03:11 2004
    Rerunable = True
    Resource_List.neednodes = 5
    Resource_List.nodect = 5
    Resource_List.nodes = 5
    substate = 42
    Variable_List = PBS_O_HOME=/home/mksingh,PBS_O_LANG=en_US.UTF-8,
        PBS_O_LOGNAME=mksingh,
        PBS_O_PATH=/home/mksingh/Ab-Initio/Graphics/XCrySDen-B1.0bin-static:/u
        sr/local/sbin:/usr/local/pgi/linux86/5.2/bin:/opt/intel_cc_80/bin:/opt/
        intel_fc_80/bin:/usr/local/mpich-1.2.6/ch_p4/bin:/usr/local/lf9561/bin:
        /usr/kerberos/bin:/home/mksingh/Ab-Initio/Graphics/XCrySDen-B1.0bin-sta
        tic:/usr/local/sbin:/usr/local/pgi/linux86/5.2/bin:/opt/intel_cc_80/bin
        :/opt/intel_fc_80/bin:/usr/local/mpich-1.2.6/ch_p4/bin:/usr/local/lf956
        1/bin:/usr/local/bin:/bin:/usr/bin:/home/mksingh/Ab-Initio/Graphics/XCr
        ySDen-B1.0bin-static/scripts:/home/mksingh/Ab-Initio/Graphics/XCrySDen-
        B1.0bin-static/util:/usr/X11R6/bin:/home/mksingh/Ab-Initio/Graphics/XCr
        ySDen-B1.0bin-static/scripts:/home/mksingh/Ab-Initio/Graphics/XCrySDen-
        B1.0bin-static/util:/home/mksingh/bin,
        PBS_O_MAIL=/var/spool/mail/mksingh,PBS_O_SHELL=/bin/bash,
        PBS_O_HOST=brahma.cluster.lmd.cat.ernet.in,
    euser = mksingh
    egroup = mksingh
    hashname = 2.brahma.cl
    queue_rank = 2
    queue_type = E
    comment = Job started on Mon Nov 01 at 09:39
    etime = Sat Oct 30 12:03:11 2004
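
For anyone who wants to check for the same symptom automatically, here is a minimal sketch that parses the text of one `qstat -f` record and flags a job that is still in state R long after its ctime. It is not part of TORQUE: the function names and the one-hour threshold are hypothetical, and it assumes the attribute layout and timestamp format shown in the output above.

```python
import re
from datetime import datetime

# "name = value" attribute lines, as printed by qstat -f
ATTR_RE = re.compile(r"^\s*([A-Za-z_][\w.]*) = (.*)$")
TIME_FMT = "%a %b %d %H:%M:%S %Y"  # e.g. "Sat Oct 30 12:03:11 2004"


def parse_qstat_full(text):
    """Parse a single job record from `qstat -f` text into a dict."""
    attrs = {}
    last_key = None
    for line in text.splitlines():
        if line.startswith("Job Id:"):
            attrs["Job Id"] = line.split(":", 1)[1].strip()
            last_key = None
            continue
        match = ATTR_RE.match(line)
        if match:
            last_key = match.group(1)
            attrs[last_key] = match.group(2).strip()
        elif last_key and line.strip():
            # Wrapped continuation line (e.g. a long Variable_List).
            attrs[last_key] += line.strip()
    return attrs


def hours_since_creation(attrs, now):
    """Hours between the job's ctime and `now` (a naive datetime)."""
    created = datetime.strptime(attrs["ctime"], TIME_FMT)
    return (now - created).total_seconds() / 3600.0


def looks_stuck(attrs, now, limit_hours=1.0):
    """True if the job is in state R and older than limit_hours (assumed threshold)."""
    return (attrs.get("job_state") == "R"
            and hours_since_creation(attrs, now) > limit_hours)
```

One possible use is to save the output with `qstat -f > job.txt`, then call `looks_stuck(parse_qstat_full(open("job.txt").read()), datetime.now())`. The check only reports the symptom; it says nothing about why the job is not finishing.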

Please help me.


Thank you

I am
Manoj
Scientist, CAT
Indore- INDIA





