[torqueusers] jobs remain in Q state

Fernando Malick fmalick at yahoo.com.ar
Mon Jan 21 14:30:24 MST 2008


Hi, I'm new to Torque, and I'm having a problem getting jobs to run.

I compiled and installed Torque 2.2.1, followed the configuration steps, and created a queue named "batch" on a server named "gandalf". Then I wrote a very simple script to submit to the queue, and when I check the queue status I get this:


qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
6.gandalf                 nada             root                   0 Q batch
7.gandalf                 nada             root                   0 Q batch


I have pbs_server, pbs_mom, and pbs_sched running, but the jobs remain in the Q (queued) state and are never executed.

If someone can guide me or give me a clue as to what is happening, I will be very grateful.
I'm including the logs below.
-----------------------------------------------------------------------------------------------------------------------
server_logs:
01/21/2008 16:25:57;0002;PBS_Server;Svr;Log;Log opened
01/21/2008 16:25:57;0006;PBS_Server;Svr;PBS_Server;Server gandalf.xxx.yyy started, initialization type = 1
01/21/2008 16:25:57;0002;PBS_Server;Svr;Act;Account file /var/spool/torque/server_priv/accounting/20080121 opened
01/21/2008 16:25:57;0040;PBS_Server;Req;setup_nodes;setup_nodes()
01/21/2008 16:25:57;0086;PBS_Server;Svr;PBS_Server;Recovered queue batch
01/21/2008 16:25:57;0002;PBS_Server;Svr;PBS_Server;Expected 1, recovered 1 queues
01/21/2008 16:25:57;0100;PBS_Server;Job;7.gandalf.xxx.yyy;enqueuing into batch, state 1 hop 1
01/21/2008 16:25:57;0086;PBS_Server;Job;7.gandalf.xxx.yyy;Requeueing job, substate: 10 Requeued in queue: batch
01/21/2008 16:25:57;0100;PBS_Server;Job;6.gandalf.xxx.yyy;enqueuing into batch, state 1 hop 1
01/21/2008 16:25:57;0086;PBS_Server;Job;6.gandalf.xxx.yyy;Requeueing job, substate: 10 Requeued in queue: batch
01/21/2008 16:25:57;0002;PBS_Server;Svr;PBS_Server;Expected 2, recovered 2 jobs
01/21/2008 16:25:57;0006;PBS_Server;Svr;PBS_Server;Using ports Server:15001  Scheduler:15004  MOM:15002
01/21/2008 16:25:57;0002;PBS_Server;Svr;PBS_Server;Server Ready, pid = 6180, loglevel=0
01/21/2008 16:26:02;0040;PBS_Server;Req;ping_nodes;ping attempting to contact 1 nodes
01/21/2008 16:26:15;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:26:15;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:31:12;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:31:12;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:31:38;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:31:38;0100;PBS_Server;Req;;Type StatusServer request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:31:38;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:32:21;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:32:21;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:33:35;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:33:35;0100;PBS_Server;Req;;Type StatusServer request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:33:35;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:34:07;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:34:07;0100;PBS_Server;Req;;Type StatusServer request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:34:07;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:34:13;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:34:13;0100;PBS_Server;Req;;Type StatusServer request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:34:13;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:34:59;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:34:59;0100;PBS_Server;Req;;Type StatusServer request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:34:59;0100;PBS_Server;Req;;Type SelStat request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:35:07;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:35:07;0100;PBS_Server;Req;;Type StatusServer request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:35:07;0100;PBS_Server;Req;;Type SelStat request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:35:27;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:35:27;0100;PBS_Server;Req;;Type StatusQueue request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:36:25;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:36:25;0100;PBS_Server;Req;;Type StatusServer request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:38:22;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:38:22;0100;PBS_Server;Req;;Type ModifyJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:38:25;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:38:25;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:41:13;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:41:13;0100;PBS_Server;Req;;Type Manager request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:41:13;0004;PBS_Server;Que;batch;attributes set:  at request of root at gandalf.xxx.yyy
01/21/2008 16:41:13;0004;PBS_Server;Que;batch;attributes set: enabled = TRUE
01/21/2008 16:41:15;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:41:15;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:41:25;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:41:25;0100;PBS_Server;Req;;Type Manager request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:41:25;0004;PBS_Server;Que;batch;attributes set:  at request of root at gandalf.xxx.yyy
01/21/2008 16:41:25;0004;PBS_Server;Que;batch;attributes set: enabled = TRUE
01/21/2008 16:41:27;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:41:27;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:41:57;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:41:57;0100;PBS_Server;Req;;Type Manager request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:41:57;0004;PBS_Server;Que;batch;attributes set:  at request of root at gandalf.xxx.yyy
01/21/2008 16:41:57;0004;PBS_Server;Que;batch;attributes set: started = TRUE
01/21/2008 16:42:00;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:42:00;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:42:06;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:42:06;0100;PBS_Server;Req;;Type Manager request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:42:06;0004;PBS_Server;Que;batch;attributes set:  at request of root at gandalf.xxx.yyy
01/21/2008 16:42:06;0004;PBS_Server;Que;batch;attributes set: started = TRUE
01/21/2008 16:42:09;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:42:09;0100;PBS_Server;Req;;Type Manager request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:42:09;0004;PBS_Server;Que;batch;attributes set:  at request of root at gandalf.xxx.yyy
01/21/2008 16:42:09;0004;PBS_Server;Que;batch;attributes set: started = TRUE
01/21/2008 16:42:13;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:42:13;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:42:46;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:42:46;0100;PBS_Server;Req;;Type OrderJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:42:48;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:42:48;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:42:59;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:42:59;0100;PBS_Server;Req;;Type OrderJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:43:00;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:43:00;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 16:51:55;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 16:51:55;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9
01/21/2008 17:17:57;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10
01/21/2008 17:17:57;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9

----------------------------------------------------------------------------------------------------------------
mom_logs:

01/21/2008 16:25:08;0002;   pbs_mom;Svr;Log;Log opened
01/21/2008 16:25:08;0002;   pbs_mom;Svr;setpbsserver;gandalf.xxx.yyy
01/21/2008 16:25:08;0002;   pbs_mom;Svr;setpbsserver;server gandalf.xxx.yyy added
01/21/2008 16:25:08;0002;   pbs_mom;n/a;initialize;independent
01/21/2008 16:25:08;0080;   pbs_mom;Svr;pbs_mom;before init_abort_jobs
01/21/2008 16:25:08;0002;   pbs_mom;Svr;pbs_mom;Is up
01/21/2008 16:25:08;0002;   pbs_mom;Svr;mom_main;MOM executable path and mtime at launch: /usr/local/sbin/pbs_mom 1200328428
01/21/2008 16:25:08;0002;   pbs_mom;n/a;mom_main;hello sent to server gandalf.xxx.yyy
01/21/2008 16:25:54;0002;   pbs_mom;Svr;im_eof;End of File from addr www.xxx.yyy.zzz:15001
01/21/2008 16:25:54;0002;   pbs_mom;n/a;mom_main;hello sent to server gandalf.xxx.yyy

------------------------------------------------------------------------------------------------------------------
sched_logs:
01/21/2008 16:25:35;0002; pbs_sched;Svr;Log;Log opened
01/21/2008 16:25:35;0002; pbs_sched;Svr;TokenAct;Account file /var/spool/torque/sched_priv/accounting/20080121 opened
01/21/2008 16:25:35;0002; pbs_sched;Svr;main;pbs_sched startup pid 6175
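
In case it helps anyone reading these logs: every line above appears to be six semicolon-separated fields (timestamp, a numeric code, daemon name, a class such as Svr/Req/Job/Que, an object name, and the message). The following is only a minimal sketch under that assumption, not anything shipped with Torque; the field names and the helper functions parse_line and count_request_types are hypothetical names chosen for illustration. It tallies the "Type ... request received" lines so that the repeated status queries are easier to separate from the few lines that change state.

```python
from collections import Counter


def parse_line(line):
    """Split one log line into six semicolon-separated fields.

    Field names are assumptions based on how the pasted lines look.
    Returns None for lines that do not have six fields.
    """
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != 6:
        return None
    timestamp, code, daemon, kind, obj, message = parts
    return {
        "timestamp": timestamp,
        "code": code,
        "daemon": daemon.strip(),  # mom/sched logs pad the daemon name
        "kind": kind,
        "object": obj,
        "message": message,
    }


def count_request_types(lines):
    """Count 'Type <Name> request received' messages by request name."""
    counts = Counter()
    prefix, marker = "Type ", " request received"
    for line in lines:
        rec = parse_line(line)
        if rec is None or rec["kind"] != "Req":
            continue
        msg = rec["message"]
        if msg.startswith(prefix) and marker in msg:
            counts[msg[len(prefix):msg.index(marker)]] += 1
    return counts


# A few lines copied from the logs above.
sample = [
    "01/21/2008 16:25:57;0002;PBS_Server;Svr;Log;Log opened",
    "01/21/2008 16:26:02;0040;PBS_Server;Req;ping_nodes;ping attempting to contact 1 nodes",
    "01/21/2008 16:26:15;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at gandalf.xxx.yyy, sock=10",
    "01/21/2008 16:26:15;0100;PBS_Server;Req;;Type StatusJob request received from root at gandalf.xxx.yyy, sock=9",
    "01/21/2008 16:25:08;0002;   pbs_mom;Svr;Log;Log opened",
]

for name, n in sorted(count_request_types(sample).items()):
    print(name, n)
```

To run it against a whole file, pass an open file object (for example the server log for the day) to count_request_types instead of the sample list.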









