[torqueusers] jobs remain in Q state

Fernando Malick fmalick at yahoo.com.ar
Tue Jan 22 04:46:22 MST 2008


qmgr -c 'p s'
#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_max.walltime = 01:00:00
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 00:01:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server acl_roots = root@*
set server managers = root@gandalf.xx.yy.zz
set server operators = root@gandalf.xx.yy.zz
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server pbs_version = 2.2.1
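
A dump like the one above can be checked mechanically for attributes that are absent. The following is a minimal Python sketch, not part of the original exchange: the helper names `parse_qmgr` and `missing` are hypothetical, the sample text is a shortened copy of the output above, and only plain `set ... = ...` lines are handled. It looks for the server attribute `scheduling`, which TORQUE uses to decide whether pbs_server asks the scheduler to run jobs and which does not appear in the dump. Whether that is the cause of the stuck jobs here is an assumption that the thread does not confirm.

```python
import re

# Shortened sample in the format printed by: qmgr -c 'p s'
SAMPLE = """\
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server log_events = 511
set server scheduler_iteration = 600
"""

# Matches "set server <attr> = <value>" and "set queue <name> <attr> = <value>".
SET_RE = re.compile(r"^set\s+(server|queue\s+\S+)\s+(\S+)\s*=\s*(.*)$")


def parse_qmgr(text):
    """Return {scope: {attribute: value}} for every plain 'set' line."""
    settings = {}
    for line in text.splitlines():
        match = SET_RE.match(line.strip())
        if not match:
            continue  # comments, 'create queue' lines, blank lines
        scope, attr, value = match.groups()
        scope = " ".join(scope.split())  # normalise inner whitespace
        settings.setdefault(scope, {})[attr] = value.strip()
    return settings


def missing(settings, scope, names):
    """List the attribute names that are not set in the given scope."""
    present = settings.get(scope, {})
    return [name for name in names if name not in present]


settings = parse_qmgr(SAMPLE)
print("queue batch:", settings["queue batch"])
print("server attributes not set:", missing(settings, "server", ["scheduling"]))
```

If `scheduling` turns out to be unset on a real server, `qmgr -c 'set server scheduling = true'` is the usual way to enable it.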


----- Original Message ----
From: Chris Vaughan <chris at clusterresources.com>
To: Fernando Malick <fmalick at yahoo.com.ar>
CC: torqueusers at supercluster.org
Sent: Tuesday, January 22, 2008, 9:33:02
Subject: Re: [torqueusers] jobs remain in Q state

Fernando,

Can you provide the output of qmgr -c 'p s'? Thanks.

Fernando Malick wrote:
> Hi, I'm new to torque, and I'm having a problem getting jobs to run.
>
> I compiled and installed torque 2.2.1, followed the configuration steps,
> and created a queue named "batch" on a server named "gandalf". Then I
> wrote a very primitive script to submit to the queue, and when I ask
> for the queue status I get this:
>
>
> qstat
> Job id                    Name             User            Time Use S Queue
> ------------------------- ---------------- --------------- -------- - -----
> 6.gandalf                 nada             root                   0 Q batch
> 7.gandalf                 nada             root                   0 Q batch
>
>
> I have pbs_server, pbs_mom and pbs_sched running, but the jobs remain
> queued and never get executed.
>
> If someone can guide me or give me a clue as to what is happening,
> I will be very grateful.
> I'm including the logs:
>
> ------------------------------------------------------------------------
> server_logs:
> 01/21/2008 16:25:57;0002;PBS_Server;Svr;Log;Log opened
> 01/21/2008 16:25:57;0006;PBS_Server;Svr;PBS_Server;Server gandalf.xxx.yyy started, initialization type = 1
> 01/21/2008 16:25:57;0002;PBS_Server;Svr;Act;Account file /var/spool/torque/server_priv/accounting/20080121 opened
> 01/21/2008 16:25:57;0040;PBS_Server;Req;setup_nodes;setup_nodes()
> 01/21/2008 16:25:57;0086;PBS_Server;Svr;PBS_Server;Recovered queue batch
> 01/21/2008 16:25:57;0002;PBS_Server;Svr;PBS_Server;Expected 1, recovered 1 queues
> 01/21/2008 16:25:57;0100;PBS_Server;Job;7.gandalf.xxx.yyy;enqueuing into batch, state 1 hop 1
> 01/21/2008 16:25:57;0086;PBS_Server;Job;7.gandalf.xxx.yyy;Requeueing job, substate: 10 Requeued in queue: batch
> 01/21/2008 16:25:57;0100;PBS_Server;Job;6.gandalf.xxx.yyy;enqueuing into batch, state 1 hop 1
> 01/21/2008 16:25:57;0086;PBS_Server;Job;6.gandalf.xxx.yyy;Requeueing job, substate: 10 Requeued in queue: batch
> 01/21/2008 16:25:57;0002;PBS_Server;Svr;PBS_Server;Expected 2, recovered 2 jobs
> 01/21/2008 16:25:57;0006;PBS_Server;Svr;PBS_Server;Using ports Server:15001  Scheduler:15004  MOM:15002
> 01/21/2008 16:25:57;0002;PBS_Server;Svr;PBS_Server;Server Ready, pid = 6180, loglevel=0
> 01/21/2008 16:26:02;0040;PBS_Server;Req;ping_nodes;ping attempting to contact 1 nodes
> 01/21/2008 16:26:15;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:26:15;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:31:12;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:31:12;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:31:38;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:31:38;0100;PBS_Server;Req;;Type StatusServer request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:31:38;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:32:21;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:32:21;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:33:35;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:33:35;0100;PBS_Server;Req;;Type StatusServer request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:33:35;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:34:07;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:34:07;0100;PBS_Server;Req;;Type StatusServer request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:34:07;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:34:13;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:34:13;0100;PBS_Server;Req;;Type StatusServer request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:34:13;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:34:59;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:34:59;0100;PBS_Server;Req;;Type StatusServer request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:34:59;0100;PBS_Server;Req;;Type SelStat request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:35:07;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:35:07;0100;PBS_Server;Req;;Type StatusServer request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:35:07;0100;PBS_Server;Req;;Type SelStat request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:35:27;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:35:27;0100;PBS_Server;Req;;Type StatusQueue request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:36:25;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:36:25;0100;PBS_Server;Req;;Type StatusServer request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:38:22;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:38:22;0100;PBS_Server;Req;;Type ModifyJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:38:25;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:38:25;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:41:13;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:41:13;0100;PBS_Server;Req;;Type Manager request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:41:13;0004;PBS_Server;Que;batch;attributes set:  at request of root@gandalf.xxx.yyy
> 01/21/2008 16:41:13;0004;PBS_Server;Que;batch;attributes set: enabled = TRUE
> 01/21/2008 16:41:15;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:41:15;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:41:25;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:41:25;0100;PBS_Server;Req;;Type Manager request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:41:25;0004;PBS_Server;Que;batch;attributes set:  at request of root@gandalf.xxx.yyy
> 01/21/2008 16:41:25;0004;PBS_Server;Que;batch;attributes set: enabled = TRUE
> 01/21/2008 16:41:27;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:41:27;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:41:57;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:41:57;0100;PBS_Server;Req;;Type Manager request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:41:57;0004;PBS_Server;Que;batch;attributes set:  at request of root@gandalf.xxx.yyy
> 01/21/2008 16:41:57;0004;PBS_Server;Que;batch;attributes set: started = TRUE
> 01/21/2008 16:42:00;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:42:00;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:42:06;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:42:06;0100;PBS_Server;Req;;Type Manager request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:42:06;0004;PBS_Server;Que;batch;attributes set:  at request of root@gandalf.xxx.yyy
> 01/21/2008 16:42:06;0004;PBS_Server;Que;batch;attributes set: started = TRUE
> 01/21/2008 16:42:09;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:42:09;0100;PBS_Server;Req;;Type Manager request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:42:09;0004;PBS_Server;Que;batch;attributes set:  at request of root@gandalf.xxx.yyy
> 01/21/2008 16:42:09;0004;PBS_Server;Que;batch;attributes set: started = TRUE
> 01/21/2008 16:42:13;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:42:13;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:42:46;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:42:46;0100;PBS_Server;Req;;Type OrderJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:42:48;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:42:48;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:42:59;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:42:59;0100;PBS_Server;Req;;Type OrderJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:43:00;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:43:00;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 16:51:55;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 16:51:55;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
> 01/21/2008 17:17:57;0100;PBS_Server;Req;;Type AuthenticateUser request received from root@gandalf.xxx.yyy, sock=10
> 01/21/2008 17:17:57;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
>
>
> ------------------------------------------------------------------------
> mom_logs:
>
> 01/21/2008 16:25:08;0002;   pbs_mom;Svr;Log;Log opened
> 01/21/2008 16:25:08;0002;   pbs_mom;Svr;setpbsserver;gandalf.xxx.yyy
> 01/21/2008 16:25:08;0002;   pbs_mom;Svr;setpbsserver;server gandalf.xxx.yyy added
> 01/21/2008 16:25:08;0002;   pbs_mom;n/a;initialize;independent
> 01/21/2008 16:25:08;0080;   pbs_mom;Svr;pbs_mom;before init_abort_jobs
> 01/21/2008 16:25:08;0002;   pbs_mom;Svr;pbs_mom;Is up
> 01/21/2008 16:25:08;0002;   pbs_mom;Svr;mom_main;MOM executable path and mtime at launch: /usr/local/sbin/pbs_mom 1200328428
> 01/21/2008 16:25:08;0002;   pbs_mom;n/a;mom_main;hello sent to server gandalf.xxx.yyy
> 01/21/2008 16:25:54;0002;   pbs_mom;Svr;im_eof;End of File from addr www.xxx.yyy.zzz:15001
> 01/21/2008 16:25:54;0002;   pbs_mom;n/a;mom_main;hello sent to server gandalf.xxx.yyy
>
>
> ------------------------------------------------------------------------
> sched_logs:
> 01/21/2008 16:25:35;0002; pbs_sched;Svr;Log;Log opened
> 01/21/2008 16:25:35;0002; pbs_sched;Svr;TokenAct;Account file /var/spool/torque/sched_priv/accounting/20080121 opened
> 01/21/2008 16:25:35;0002; pbs_sched;Svr;main;pbs_sched startup pid 6175
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>   
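
The quoted logs use one semicolon-delimited record per entry, so they can be summarised with a few lines of code once the wrapped lines are rejoined. This is a minimal Python sketch under stated assumptions: the field labels in `FIELDS` are descriptive names chosen here, not official TORQUE terminology, and the sample records are copied from the logs above with the quote prefix removed.

```python
from collections import Counter

# Three records copied from the quoted logs, one per line.
SAMPLE_LOG = """\
01/21/2008 16:25:57;0002;PBS_Server;Svr;Log;Log opened
01/21/2008 16:26:15;0100;PBS_Server;Req;;Type StatusJob request received from root@gandalf.xxx.yyy, sock=9
01/21/2008 16:25:08;0002;   pbs_mom;Svr;pbs_mom;Is up
"""

# Assumed labels for the six semicolon-separated fields.
FIELDS = ("timestamp", "event", "daemon", "obj_class", "obj_name", "message")


def parse_record(line):
    """Split one log line into a dict, or return None if it is malformed."""
    parts = line.split(";", 5)  # the message itself may contain ';'
    if len(parts) != 6:
        return None
    return dict(zip(FIELDS, (part.strip() for part in parts)))


records = [r for r in map(parse_record, SAMPLE_LOG.splitlines()) if r]
by_daemon = Counter(r["daemon"] for r in records)

for record in records:
    print(record["timestamp"], record["daemon"], "->", record["message"])
print(dict(by_daemon))
```

Grouping by daemon and message in this way makes it easier to spot what is absent, for example any record of a job being dispatched to a node.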


-- 
Chris Vaughan
EMEA Systems Engineer
Cluster Resources, Ltd.
Direct - UK Office:  +44 (0)1223 437 132
Mobile - +44 (0)7800 973 062
US Headquarters:  +1 801 717 3700
Skype: supercomputer1
www.clusterresources.co.uk