[torqueusers] Node is not responding!
halmabrazi at idtdna.com
Thu Oct 13 13:34:22 MDT 2011
First, I am a newbie to Torque, and this is my first message to this group. I hope I am not wasting anyone's time with a basic question. I tried looking for answers in the list archive, but since it has no built-in search, it is hard to find what I need.
Here is where I am so far:
I installed the Torque 3.0 package on my Linux box (SUSE 11.2). I also configured a node on a different VM that is running SUSE as well. It seems things are installed and configured correctly (I think).
When I run pbsnodes, I get:
state = free
np = 1
ntype = cluster
status = rectime=1318533469,varattr=,jobs=,state=free,netload=116214003,gres=,loadave=0.00,ncpus=1,physmem=1017908kb,availmem=3012532kb,totmem=3115056kb,idletime=76,nusers=2,nsessions=7,sessions=1753 1767 1770 1889 1894 1997 3017,uname=Linux suse-ptpd-16 2.6.34-12-desktop #1 SMP PREEMPT 2010-06-29 02:39:08 +0200 i686,opsys=linux
mom_service_port = 15002
mom_manager_port = 15003
gpus = 0
When I shut down the node, its state changes to "down", which tells me everything is okay.
However, I ran into a problem when I tried to send my first job to the node. I used this example, which I found online:
# --- send the output to the test.out file
# the default is .o<jobid>
#PBS -o test.out
# --- send the error output to the test.err file
# the default is .e<jobid>
#PBS -e test.err
echo "Print out the hostname and date"
And then I ran it from the head node (not as root).
Looking at the submitted jobs (I submitted the job twice):
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
16.suse-halmabr           test.job         torqueuser             0 Q batch
17.suse-halmabr           test.job         torqueuser             0 Q batch
However, nothing seems to be happening after that.
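For reference, the S column in the listing is the job state, and Q means the job is queued and has not started. Below is a minimal sketch of filtering such a listing for jobs that are still waiting. It assumes the default qstat column layout shown in this message, and `queued_jobs` is a hypothetical helper name, not part of Torque.

```shell
# Hypothetical helper: print the IDs of jobs whose state column is Q (queued).
# Reads qstat-style output on stdin and assumes the default layout, where the
# state is the second-to-last whitespace-separated field on each job line.
queued_jobs() {
    # Skip the two header lines, then match on the state field.
    awk 'NR > 2 && $(NF-1) == "Q" { print $1 }'
}

# On a live system this would be used as: qstat | queued_jobs
# Here the listing from this message is fed in directly.
queued_jobs <<'EOF'
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
16.suse-halmabr           test.job         torqueuser             0 Q batch
17.suse-halmabr           test.job         torqueuser             0 Q batch
EOF
```

Both job IDs are printed here, which matches the symptom described: the jobs were accepted by the server but never left the queue.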
Can anybody tell me what I am doing wrong, or whether I am missing something? Also, if someone could point me to a good site with examples of how to use the server, that would be highly appreciated.