[torqueusers] qsub ... queue hang
Zhiliang Hu
zhu at iastate.edu
Wed Dec 5 09:34:55 MST 2007
At 11:37 PM 12/4/2007, Garrick Staples wrote:
>>> sh run | qsub
>>> 49.cluster2.xxxx.xxxxxxx.xxx
>>>
>>> -- it hangs there forever:
>
>'sh run' is executing and qsub is waiting for it to exit so it can submit the
>output as a job.
>
>I think you want 'echo sh run | qsub'.
That makes sense!
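The difference between the two pipelines can be reproduced without a batch system. This is only a sketch: `fake_qsub` is a hypothetical stand-in that prints whatever reaches its standard input (the real qsub treats that input as the job script), and `run_demo.sh` is a made-up substitute for the "run" script.

```shell
# Hypothetical stand-in for qsub: show each line received on stdin.
fake_qsub() { sed 's/^/job script line: /'; }

# Made-up substitute for the "run" script; it just prints one line.
cat > run_demo.sh <<'EOF'
echo "program output"
EOF

# The script runs first; its OUTPUT is what gets "submitted".
sh run_demo.sh | fake_qsub

# The command text itself is "submitted"; nothing runs until the job starts.
echo "sh run_demo.sh" | fake_qsub
```

In the first form the consumer cannot finish reading until the producer exits, which is consistent with the wait seen when a long-running program is piped into qsub.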
Now I tried:
> qsub run
-- It sends the "run" job to the queue, where it stays (hangs).
> qsub -l nodes=6 run
-- It sends the "run" job to the queue, and the job took a little while to disappear from the queue (it worked! :-). But I don't get anything back. I then tried another "run" job that directs its output to a file:
/opt/openmpi.gcc/bin/mpirun -np 12 -machinefile ./machines \
    /usr/local/bin/mpiblast -p blastn \
    -i /home/zhu/tests/mpiblast/datain4 \
    -d bta.genome.chr \
    -o /home/zhu/tests/mpiblast/out
This job also appeared on, and then disappeared from, the queue, but I don't see any output anywhere. (Note: the script runs fine without "qsub".)
This brings up a few more questions:
1. Of course -- where does the output go?
2. It appears "qsub" needs to know the number of nodes
to run the job. However, "mpirun" needs that too.
- I can use a "machinefile" to tell "mpirun" which nodes to use;
how can I do the same for "qsub"?
- I have 2 processors on each node, so I can specify "-np 12"
to tell "mpirun" to fire up 12 processes on 6 nodes.
How can I give "qsub" the same information?
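For reference, both questions can be illustrated with a job script, shown here as a hedged sketch rather than a confirmed fix for this cluster. It assumes the standard TORQUE `#PBS` directives: `-l nodes=6:ppn=2` requests 6 nodes with 2 processors each, and `-j oe` with `-o` sends the merged stdout/stderr to a chosen file. Without `-o`, TORQUE normally returns output as `<jobname>.o<jobid>` and `<jobname>.e<jobid>` in the directory where qsub was invoked. The file name `run.pbs`, the job name `mpiblast_test`, and the log path `run.log` are hypothetical; the program paths are copied from the command above. Using `$PBS_NODEFILE` as the machinefile assumes Open MPI accepts its one-line-per-processor format.

```shell
# Write a hypothetical job script, run.pbs (names are illustrative).
cat > run.pbs <<'EOF'
#!/bin/sh
#PBS -N mpiblast_test
#PBS -l nodes=6:ppn=2
#PBS -j oe
#PBS -o /home/zhu/tests/mpiblast/run.log

# Start in the directory from which the job was submitted.
cd "$PBS_O_WORKDIR"

# TORQUE lists one line per allocated processor in $PBS_NODEFILE,
# so the line count gives the process count (12 for nodes=6:ppn=2).
NP=$(wc -l < "$PBS_NODEFILE")

/opt/openmpi.gcc/bin/mpirun -np "$NP" -machinefile "$PBS_NODEFILE" \
    /usr/local/bin/mpiblast -p blastn \
    -i /home/zhu/tests/mpiblast/datain4 \
    -d bta.genome.chr \
    -o /home/zhu/tests/mpiblast/out
EOF

# Submit with: qsub run.pbs
```

Deriving the machinefile and process count from what the scheduler allocated keeps "qsub" and "mpirun" in agreement, instead of maintaining a separate ./machines file by hand.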
These questions may appear simple to experts, but I am having a hard time extracting useful information from the few Torque tutorial web sites I have found. Any hint would be appreciated...
Thanks in advance!
Zhiliang