[Mauiusers] job does not start
Jérôme Pansanel
jerome.pansanel at iphc.cnrs.fr
Mon Mar 5 01:45:31 MST 2012
Hi,
We got a lot of errors with Maui version 3.2.6p1 (mainly segfaults). Since
the update to version 3.3.4, it has worked fine.
Best regards,
Jerome Pansanel
On Mon, 2012-03-05 at 14:06 +0530, Jayavant Patil wrote:
> >Hi,
>
> >We have Torque Server Version 2.5.8 and maui version 3.2.6p1 installed on
> >rhel 5.2 server. "showstart" for one of the jobs says that job should start
> >now i.e.
>
> >Earliest start in 00:00:00 on current time.
> >########################
> >checkjob -vv says that
>
> >checkjob -vv 62235
> >checking job 62235 (RM job '62235.yc9.cn.yuva.param')
> >State: Idle
> >Creds: user:abcd group:pqr account:PQR-PR class:q1 qos:q1-qos
> >WallTime: 00:00:00 of 2:05:00:00
> >SubmitTime: Thu Feb 23 18:56:26
> >(Time Queued Total: 1:21:27:05 Eligible: 1:21:27:05)
>
> >Total Tasks: 2
>
> >Req[0] TaskCount: 2 Partition: ALL
> >Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
> >Opsys: [NONE] Arch: [NONE] Features: [NONE]
> >Exec: '' ExecSize: 0 ImageSize: 0
> >Dedicated Resources Per Task: PROCS: 1
> >NodeAccess: SHARED
> >NodeCount: 0
> >IWD: [NONE] Executable: [NONE]
> >Bypass: 51 StartCount: 0
> >PartitionMask: [ALL]
> >Reservation '62235' (00:00:00 -> 2:05:00:00 Duration: 2:05:00:00)
> >PE: 2.00 StartPriority: 2727
> >job cannot run in partition DEFAULT (insufficient idle procs available: 0 < 2)
> >job can run in partition P1 (32 procs available. 2 procs required)
> >job can run in partition P2 (48 procs available. 2 procs required)
> >########################
> >showres -n 62235 says that
>
> >reservations on Sat Feb 25 16:28:10
>
> > NodeName            Type  ReservationID  JobState  Task  Start     Duration    StartTime
>
> > node16.clusternode  Job   62235          Idle      2     00:00:00  2:05:00:00  Sat Feb 25 16:28:10
> >1 nodes reserved
> ############################
> >checknode node16.clusternode says that node is available for job run.
>
> >but somehow job is not going and is not giving any error in maui,
> >pbs_server,pbs_mom logs also.
>
> >What can be the issue?
>
> Does maui.log show that Maui is starting the job? If so, there
> might be a communication problem with TORQUE.
>
> >What can be done to make job run and avoid the same in future?
>
> How many partitions do you have in your cluster?
>
> Can you try to submit the job by specifying the PARTITION as follows:
>
> qsub -q <queue_name> -l nodes=<requirement> -W x=PARTITION:<partition name>
>
> >thank you
>
> >-pankakjd
>
> --
>
> Thanks & Regards,
> Jayavant Ningoji Patil
> +91 9923536030.
>
> _______________________________________________
> mauiusers mailing list
> mauiusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/mauiusers
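
As a rough illustration of the partition suggestion quoted above, here is a minimal Python sketch that reads the "job can run in partition ..." lines from checkjob output and builds the corresponding qsub command for each partition. It only parses text and prints commands; it does not call Maui or TORQUE. The helper names (runnable_partitions, qsub_command) are hypothetical, the queue name q1 is taken from the job's class in the checkjob output, and the nodes value 1:ppn=2 is an assumption based on the job's two tasks.

```python
import re

# Matches checkjob lines such as:
#   job can run in partition P1 (32 procs available. 2 procs required)
CAN_RUN = re.compile(
    r"job can run in partition (\S+) "
    r"\((\d+) procs available\.\s+(\d+) procs required\)"
)


def runnable_partitions(checkjob_output):
    """Return (partition, available, required) for each partition the job fits in."""
    return [
        (name, int(avail), int(req))
        for name, avail, req in CAN_RUN.findall(checkjob_output)
    ]


def qsub_command(queue, nodes, partition):
    """Build the qsub argument list in the form quoted in the thread.

    The queue and nodes values are placeholders supplied by the caller.
    """
    return ["qsub", "-q", queue, "-l", f"nodes={nodes}",
            "-W", f"x=PARTITION:{partition}"]


# Partition lines copied from the checkjob output in the thread.
SAMPLE = """\
job cannot run in partition DEFAULT (insufficient idle procs available: 0 < 2)
job can run in partition P1 (32 procs available. 2 procs required)
job can run in partition P2 (48 procs available. 2 procs required)
"""

if __name__ == "__main__":
    for name, avail, req in runnable_partitions(SAMPLE):
        # "q1" and "1:ppn=2" are assumed values for illustration only.
        print(" ".join(qsub_command("q1", "1:ppn=2", name)))
```

With the sample lines from the thread, this would suggest resubmitting to P1 or P2, since DEFAULT has no idle processors.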
--
Jerome Pansanel
IPHC
23 rue du Loess, BP 28
F-67037 STRASBOURG Cedex 2
T. +33 (0)3 88 10 66 24
P. +33 (0)6 25 19 24 43
F. +33 (0)3 88 10 62 34