[Mauiusers] job does not start

Jayavant Patil jayavant.patil82 at gmail.com
Mon Mar 5 01:36:45 MST 2012


>Hi,

>We have TORQUE server version 2.5.8 and Maui version 3.2.6p1 installed on
>a RHEL 5.2 server. "showstart" for one of the jobs says that the job should
>start now, i.e.

>Earliest start in         00:00:00 on current time.
>########################
>checkjob -vv says that

>checkjob -vv 62235
>checking job 62235 (RM job '62235.yc9.cn.yuva.param')
>State: Idle
>Creds:  user:abcd  group:pqr  account:PQR-PR  class:q1  qos:q1-qos
>WallTime: 00:00:00 of 2:05:00:00
>SubmitTime: Thu Feb 23 18:56:26
>(Time Queued  Total: 1:21:27:05  Eligible: 1:21:27:05)

>Total Tasks: 2

>Req[0]  TaskCount: 2  Partition: ALL
>Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
>Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
>Exec:  ''  ExecSize: 0  ImageSize: 0
>Dedicated Resources Per Task: PROCS: 1
>NodeAccess: SHARED
>NodeCount: 0
>IWD: [NONE]  Executable:  [NONE]
>Bypass: 51  StartCount: 0
>PartitionMask: [ALL]
>Reservation '62235' (00:00:00 -> 2:05:00:00  Duration: 2:05:00:00)
>PE:  2.00  StartPriority:  2727
>job cannot run in partition DEFAULT (insufficient idle procs available: 0 < 2)
>job can run in partition P1 (32 procs available.  2 procs required)
>job can run in partition P2 (48 procs available.  2 procs required)
>########################
>showres -n 62235 says that

>reservations on Sat Feb 25 16:28:10

>          NodeName  Type  ReservationID  JobState  Task     Start    Duration  StartTime
>
>node16.clusternode   Job          62235      Idle     2  00:00:00  2:05:00:00  Sat Feb 25 16:28:10
>1 nodes reserved
>########################
>checknode node16.clusternode says that node is available for job run.

>but somehow the job is not starting, and there are no errors in the maui,
>pbs_server, or pbs_mom logs either.

>What can be the issue?

Do you see Maui attempting to start the job in maui.log? If so, there
might be a communication problem between Maui and TORQUE.
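
One way to check is to search the Maui log for the job ID. This is only a
minimal sketch: it assumes the log is at /usr/local/maui/log/maui.log (the
path on your system may differ), and job_log_lines is a hypothetical helper
name, not a Maui command.

```shell
#!/bin/sh
# job_log_lines is a hypothetical helper: print every line of a log file
# that contains the given job ID as a fixed string.
job_log_lines() {
    job_id=$1
    log_file=$2
    grep -F -- "$job_id" "$log_file"
}

# Assumed log location; adjust to your Maui installation.
MAUI_LOG=${MAUI_LOG:-/usr/local/maui/log/maui.log}

# Show the most recent entries for job 62235, if the log is readable.
if [ -r "$MAUI_LOG" ]; then
    job_log_lines 62235 "$MAUI_LOG" | tail -n 20
fi
```

If the log shows repeated start attempts for the job, that would point
towards the Maui/TORQUE interface rather than the scheduling decision itself.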

>What can be done to make job run and avoid the same in future?

How many partitions do you have in your cluster?

Can you try to submit the job by specifying the PARTITION as follows:

qsub -q <queue_name> -l nodes=<requirement> -W x=PARTITION:<partition name>
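
For example, filling in the template with values from the checkjob output
above (class q1, partition P1) might look like the sketch below. It assumes
that the Maui class q1 corresponds to a TORQUE queue of the same name; the
node request 1:ppn=2 is a guess consistent with two tasks reserved on one
node, and job.sh is a hypothetical script name. The command is only printed,
not run.

```shell
#!/bin/sh
# build_qsub_cmd is a hypothetical helper that fills in the qsub template:
#   qsub -q <queue_name> -l nodes=<requirement> -W x=PARTITION:<partition name>
# It prints the command line instead of submitting anything.
build_qsub_cmd() {
    queue=$1
    nodes=$2
    partition=$3
    script=$4
    printf 'qsub -q %s -l nodes=%s -W x=PARTITION:%s %s\n' \
        "$queue" "$nodes" "$partition" "$script"
}

# Assumed values: queue q1, two processors on one node, partition P1.
build_qsub_cmd q1 1:ppn=2 P1 job.sh
```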

>thank you

>-pankakjd

-- 

Thanks & Regards,
Jayavant Ningoji Patil
+91 9923536030.