[torqueusers] maui and torque not communicating
DuChene, StevenX A
stevenx.a.duchene at intel.com
Fri Mar 16 18:50:32 MDT 2012
BTW, in my maui.log file I am seeing the following:
03/16 17:48:21 MRMWorkloadQuery()
03/16 17:48:21 MPBSWorkloadQuery(ELOGIN2,JCount,SC)
03/16 17:48:21 INFO: queue is empty
03/16 17:48:21 INFO: 0 PBS jobs detected on RM ELOGIN2
03/16 17:48:21 WARNING: no workload detected
This is even though qstat from Torque returns:
[root@elogin2 log]# qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
2.elogin2                 script.pbs       saducheX               0 Q batch
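For anyone who wants to check this mismatch from a script rather than by eye, here is a minimal Python sketch. It assumes the maui.log line format and the qstat layout shown in this post (other versions may differ), and the function names are made up for illustration; they are not part of Maui or Torque.

```python
import re

# Sample text copied from this post; in practice you would read maui.log
# and capture the output of `qstat` yourself.
LOG_SAMPLE = """\
03/16 17:48:21 MPBSWorkloadQuery(ELOGIN2,JCount,SC)
03/16 17:48:21 INFO: queue is empty
03/16 17:48:21 INFO: 0 PBS jobs detected on RM ELOGIN2
03/16 17:48:21 WARNING: no workload detected
"""

QSTAT_SAMPLE = """\
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
2.elogin2                 script.pbs       saducheX               0 Q batch
"""


def maui_detected_jobs(log_text):
    """Return the count from the last 'N PBS jobs detected on RM ...' line, or None."""
    matches = re.findall(r"INFO:\s+(\d+) PBS jobs detected on RM \S+", log_text)
    return int(matches[-1]) if matches else None


def count_qstat_jobs(qstat_text):
    """Count the job rows that follow the dashed separator in plain qstat output."""
    count = 0
    seen_separator = False
    for line in qstat_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if set(stripped) <= {"-", " "}:
            # The row of dashes marks the end of the column headers.
            seen_separator = True
            continue
        if seen_separator:
            count += 1
    return count


maui_count = maui_detected_jobs(LOG_SAMPLE)
torque_count = count_qstat_jobs(QSTAT_SAMPLE)
if maui_count is not None and maui_count != torque_count:
    print(f"Mismatch: Maui sees {maui_count} job(s), qstat lists {torque_count}")
```

A mismatch here only confirms the symptom (Maui's workload query comes back empty while Torque has a queued job); it does not say why.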
-----Original Message-----
From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of DuChene, StevenX A
Sent: Friday, March 16, 2012 5:46 PM
To: torqueusers at supercluster.org
Subject: [torqueusers] maui and torque not communicating
Does anyone actually have Torque-4.0 running and working with Maui as the scheduler?
I have Torque-4.0 compiled and running without the pbs_sched part installed.
I have Maui-3.3.1 installed and running as well, but the two systems do not seem to be talking to each other.
If I submit jobs with qsub from Torque I can see them sitting in the queue with qstat:
[root@elogin2 hwloc-1.4.1]# qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
2.elogin2                 script.pbs       saducheX               0 Q batch
But if I then use showq (a Maui tool), the job does not show up.
[saducheX@elogin2 ~]$ showq
ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME

     0 Active Jobs       0 of 1024 Processors Active (0.00%)
                         0 of  256 Nodes Active      (0.00%)

IDLE JOBS----------------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

0 Idle Jobs

BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

Total Jobs: 0   Active Jobs: 0   Idle Jobs: 0   Blocked Jobs: 0
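
The empty showq report can also be checked from a script by reading its summary line. This is a rough Python sketch that assumes the "Total Jobs: ..." line looks exactly like the one above; parse_showq_totals is a hypothetical helper name, not a Maui tool.

```python
import re

# Summary line as printed by showq in this post.
SHOWQ_SAMPLE = """\
BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

Total Jobs: 0   Active Jobs: 0   Idle Jobs: 0   Blocked Jobs: 0
"""


def parse_showq_totals(showq_text):
    """Return the counters from the 'Total Jobs: ...' summary line as a dict."""
    for line in showq_text.splitlines():
        if line.strip().startswith("Total Jobs:"):
            # Each counter is a label, a colon, and an integer.
            pairs = re.findall(r"([A-Za-z ]+?):\s*(\d+)", line)
            return {name.strip(): int(value) for name, value in pairs}
    return {}  # no summary line found


totals = parse_showq_totals(SHOWQ_SAMPLE)
if totals.get("Total Jobs") == 0:
    print("showq reports no jobs at all")
```

If Torque's qstat lists a queued job while "Total Jobs" here is 0, the job never reached Maui, which matches what mdiag and checkjob report below.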
If I try to run mdiag -j 2 it returns nothing:
[root@elogin2 hwloc-1.4.1]# mdiag -j 2
Name State Par Proc QOS WCLimit R Min User Group Account QueuedTime Network Opsys Arch Mem Disk Procs Class Features
The checkjob util says:
[root@elogin2 hwloc-1.4.1]# checkjob 2
ERROR: 'checkjob' failed
ERROR: cannot locate job '2'
[saducheX@elogin2 ~]$ checkjob 2.elogin2
ERROR: 'checkjob' failed
ERROR: cannot locate job '2.elogin2'
So my basic question is: does anyone have Maui working with Torque-4.0?
If so, what did you have to do to get things operational?
Is there something I am missing?
--
Steven DuChene
_______________________________________________
torqueusers mailing list
torqueusers at supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers