[Mauiusers] job does not start

Jérôme Pansanel jerome.pansanel at iphc.cnrs.fr
Mon Mar 5 01:45:31 MST 2012


Hi,

We got a lot of errors with Maui version 3.2.6p1 (mainly segfaults). Since
updating to version 3.3.4, it has worked fine.

Best regards,

Jerome Pansanel

On Mon., 2012-03-05 at 14:06 +0530, Jayavant Patil wrote:
> >Hi,
> 
> >We have Torque Server version 2.5.8 and Maui version 3.2.6p1 installed on
> >a RHEL 5.2 server. "showstart" for one of the jobs says that the job should
> >start now, i.e.
> 
> >Earliest start in         00:00:00 on current time.
> >########################
> >checkjob -vv says that
> 
> >checkjob -vv 62235
> >checking job 62235 (RM job '62235.yc9.cn.yuva.param')
> >State: Idle
> >Creds:  user:abcd  group:pqr  account:PQR-PR  class:q1  qos:q1-qos
> >WallTime: 00:00:00 of 2:05:00:00
> >SubmitTime: Thu Feb 23 18:56:26
> >(Time Queued  Total: 1:21:27:05  Eligible: 1:21:27:05)
> 
> >Total Tasks: 2
> 
> >Req[0]  TaskCount: 2  Partition: ALL
> >Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
> >Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
> >Exec:  ''  ExecSize: 0  ImageSize: 0
> >Dedicated Resources Per Task: PROCS: 1
> >NodeAccess: SHARED
> >NodeCount: 0
> >IWD: [NONE]  Executable:  [NONE]
> >Bypass: 51  StartCount: 0
> >PartitionMask: [ALL]
> >Reservation '62235' (00:00:00 -> 2:05:00:00  Duration: 2:05:00:00)
> >PE:  2.00  StartPriority:  2727
> >job cannot run in partition DEFAULT (insufficient idle procs available: 0 < 2)
> >job can run in partition P1 (32 procs available.  2 procs required)
> >job can run in partition P2 (48 procs available.  2 procs required)
> >########################
> >showres -n 62235 says that
> 
> >reservations on Sat Feb 25 16:28:10
> 
> >NodeName            Type  ReservationID  JobState  Task  Start     Duration    StartTime
> 
> >node16.clusternode  Job   62235          Idle      2     00:00:00  2:05:00:00  Sat Feb 25 16:28:10
> >1 nodes reserved
> >############################
> >checknode node16.clusternode says that the node is available to run the job.
> 
> >But somehow the job is not starting, and there are no errors in the maui,
> >pbs_server, or pbs_mom logs either.
> 
> >What can be the issue?
> 
> Have you checked in maui.log whether Maui is starting the job? If it is, then
> there might be a communication problem with TORQUE.
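One way to check maui.log for the job is to pull out every line that mentions the job ID. The Python sketch below is only an illustration: the function name is hypothetical, the log path has to be supplied by the caller (its location depends on the local Maui installation), and the match is a simple whole-number search for the ID rather than an understanding of Maui's log format.

```python
import re
import sys

def lines_mentioning_job(log_path, job_id):
    """Return the log lines that contain job_id as a whole token.

    Hypothetical helper: it does not parse Maui's log format, it only
    looks for the job ID surrounded by word boundaries.
    """
    pattern = re.compile(rf"\b{re.escape(job_id)}\b")
    hits = []
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(log_path, errors="replace") as log:
        for line in log:
            if pattern.search(line):
                hits.append(line.rstrip("\n"))
    return hits

if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (path is site-specific): python find_job.py <path-to-maui.log> 62235
    for hit in lines_mentioning_job(sys.argv[1], sys.argv[2]):
        print(hit)
```

If no line mentions the job at all, Maui has probably not tried to start it; if start attempts do appear, the TORQUE side is the next place to look.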
> 
> >What can be done to make the job run and avoid the same problem in future?
> 
> How many partitions do you have in your cluster?
> 
> Can you try to submit the job by specifying the PARTITION as follows:
> 
> qsub -q <queue_name> -l nodes=<requirement> -W x=PARTITION:<partition name>
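
The checkjob output quoted earlier already names the partitions in which the job could run (P1 and P2, but not DEFAULT). As a rough sketch, those lines can be read mechanically and turned into the qsub command suggested here. The Python below assumes the checkjob line format shown in this thread; the function names, the queue name q1, the value nodes=2 and the script name job.sh are placeholders chosen for illustration, and the command is only printed, not executed.

```python
import re
import shlex

# Matches the partition lines printed by `checkjob -vv` as quoted in this thread.
PARTITION_LINE = re.compile(r"job (can|cannot) run in partition (\S+)")

def runnable_partitions(checkjob_output):
    """Return the partitions that checkjob reports the job can run in."""
    return [
        match.group(2)
        for match in PARTITION_LINE.finditer(checkjob_output)
        if match.group(1) == "can"
    ]

def qsub_command(queue, nodes, partition, script):
    """Build the qsub command line from the thread as a string (not run here)."""
    args = [
        "qsub", "-q", queue,
        "-l", f"nodes={nodes}",
        "-W", f"x=PARTITION:{partition}",
        script,
    ]
    return " ".join(shlex.quote(arg) for arg in args)

# Partition lines copied from the checkjob output above, with the wrapping undone.
sample = """\
job cannot run in partition DEFAULT (insufficient idle procs available: 0 < 2)
job can run in partition P1 (32 procs available.  2 procs required)
job can run in partition P2 (48 procs available.  2 procs required)
"""

partitions = runnable_partitions(sample)
for partition in partitions:
    # q1, nodes=2 and job.sh are placeholder values, not taken from a real site.
    print(qsub_command("q1", 2, partition, "job.sh"))
```

Printing the command instead of running it keeps the sketch safe to try on a machine without TORQUE installed.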
> 
> >thank you
> 
> >-pankakjd
> 
> -- 
> 
> Thanks & Regards,
> Jayavant Ningoji Patil
> +91 9923536030.
> 
> _______________________________________________
> mauiusers mailing list
> mauiusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/mauiusers

-- 
Jerome Pansanel
IPHC
23 rue du Loess, BP 28
F-67037 STRASBOURG Cedex 2
T. +33 (0)3 88 10 66 24
P. +33 (0)6 25 19 24 43
F. +33 (0)3 88 10 62 34


