[torqueusers] maui and torque not communicating

DuChene, StevenX A stevenx.a.duchene at intel.com
Mon Mar 19 09:51:13 MDT 2012


RHEL6.1 with latest standard kernel.

From: torqueusers-bounces at supercluster.org On Behalf Of Ken Nielson
Sent: Monday, March 19, 2012 8:42 AM
To: Torque Users Mailing List
Subject: Re: [torqueusers] maui and torque not communicating

On Fri, Mar 16, 2012 at 6:45 PM, DuChene, StevenX A <stevenx.a.duchene at intel.com> wrote:
I am just wondering if anyone actually has Torque-4.0 running and working with Maui as the scheduler?

I have Torque-4.0 compiled and running without the pbs_sched part installed.

I have Maui-3.3.1 installed and running as well, but the two systems do not seem to be talking to each other.

If I submit jobs with qsub from Torque I can see them sitting in the queue with qstat:

[root at elogin2 hwloc-1.4.1]# qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
2.elogin2                  script.pbs       saducheX               0 Q batch

But if I then use showq (a Maui tool), the job does not show up.

[saducheX at elogin2 ~]$ showq
ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME


    0 Active Jobs       0 of 1024 Processors Active (0.00%)
                        0 of  256 Nodes Active      (0.00%)

IDLE JOBS----------------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME


0 Idle Jobs

BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME


Total Jobs: 0   Active Jobs: 0   Idle Jobs: 0   Blocked Jobs: 0

If I try to run mdiag -j 2, it returns only the header line, with no job listed:

[root at elogin2 hwloc-1.4.1]# mdiag -j 2
Name                  State Par Proc QOS     WCLimit R  Min     User    Group  Account  QueuedTime  Network  Opsys   Arch    Mem   Disk  Procs       Class Features

The checkjob util says:

[root at elogin2 hwloc-1.4.1]# checkjob 2
ERROR:    'checkjob' failed
ERROR:  cannot locate job '2'

[saducheX at elogin2 ~]$ checkjob 2.elogin2
ERROR:    'checkjob' failed
ERROR:  cannot locate job '2.elogin2'
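
As a rough way to check this mismatch mechanically, here is a minimal Python sketch. It assumes the default qstat table layout and the showq summary line exactly as pasted above; the function names are made up for illustration and are not part of Torque or Maui.

```python
import re

# Sample text copied from the qstat and showq output in this message.
QSTAT_SAMPLE = """\
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
2.elogin2                  script.pbs       saducheX               0 Q batch
"""

SHOWQ_SUMMARY = "Total Jobs: 0   Active Jobs: 0   Idle Jobs: 0   Blocked Jobs: 0"


def qstat_job_ids(qstat_text):
    """Return the job ids listed below the dashed rule of a default qstat table."""
    ids = []
    in_body = False
    for line in qstat_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("---"):
            in_body = True          # everything after the rule is a job row
        elif in_body and stripped:
            ids.append(stripped.split()[0])
    return ids


def showq_total_jobs(showq_text):
    """Return the 'Total Jobs' count from showq output, or None if absent."""
    match = re.search(r"Total Jobs:\s*(\d+)", showq_text)
    return int(match.group(1)) if match else None


def scheduler_missing_jobs(qstat_text, showq_text):
    """True when qstat lists more jobs than showq reports in total."""
    total = showq_total_jobs(showq_text)
    return total is not None and len(qstat_job_ids(qstat_text)) > total


if __name__ == "__main__":
    print(qstat_job_ids(QSTAT_SAMPLE))
    print(scheduler_missing_jobs(QSTAT_SAMPLE, SHOWQ_SUMMARY))
```

In practice the two strings would come from running qstat and showq on the server host; the sketch only compares counts and does not say why Maui is not seeing the job.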

So my basic question is: does anyone have Maui working with Torque-4.0?
If so, what did you have to do to get things operational?
Is there something I am missing?
--
Steven DuChene

Steve,

What version of Linux are you running?

Ken