[Mauiusers] setting busy status in function of disk space and cpuload

Arnau Bria arnaubria at pic.es
Tue Sep 23 02:52:28 MDT 2008


On Mon, 22 Sep 2008 12:29:02 -0500
Tom Rudwick wrote:

Hi Tom,
> We have this in our mom configs:
> size [fs=/tmp]
Yep, we have something similar (/home instead of /tmp).

> This makes the size resource hold the space available in /tmp
> 
> Then we do:
> 
> qsub -l ddisk=5gb
> 
> Or you could set a default for ddisk for a queue or server.
Exactly.
 
> It should only run on nodes with the proper amount of free
> space. 
Sure, but what happens if I don't define that resource on some nodes?
We have some nodes with big disks and others with small ones.

> I'm sure you can do this other ways, but I don't have
> experience with those. Hopefully you can adapt this to what
> you need.

I was trying to define a dynamic resource: a simple script that
returns the free disk space on nodes with small disks and a big constant
value on nodes with big disks. But I don't know how to use it. When I
submit a job requesting more space than any node has, the job is still
scheduled when it shouldn't be. I have also changed the dynamic resource
into a boolean value (1 if the node has enough space, 0 if not), but
again, if all nodes report resource=0 and I submit a job requesting
that resource, the job is scheduled.
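
For reference, a minimal sketch of the boolean variant of such a script.
The function names, the /home default and the 10 GB threshold are
illustrative assumptions, not the script actually deployed here:

```shell
#!/bin/sh
# Hypothetical dynamic-resource script for pbs_mom.
# Function names, default directory and threshold are assumptions.

# Print the free space, in kilobytes, of the filesystem holding directory $1.
# "df -P -k" gives one unwrapped data line; column 4 is the available space.
free_kb() {
    df -P -k "$1" 2>/dev/null | awk 'NR == 2 { print $4 }'
}

# Print 1 if directory $1 has at least $2 kilobytes free, otherwise 0.
# A directory that cannot be inspected is reported as 0 (no space).
espacio_flag() {
    avail=$(free_kb "$1") || avail=""
    case "$avail" in
        ''|*[!0-9]*) echo 0; return 0 ;;
    esac
    if [ "$avail" -ge "$2" ]; then
        echo 1
    else
        echo 0
    fi
}

# Default call: does /home have at least 10 GB (10485760 kB) free?
espacio_flag "${ESPACIO_DIR:-/home}" "${ESPACIO_MIN_KB:-10485760}"
```

Assuming the script is installed as /usr/local/bin/espacio.sh (a
hypothetical path), the mom config would reference it with a shell
escape, along the lines of:

espacio !/usr/local/bin/espacio.sh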

I'm following this documentation:
http://www.clusterresources.com/torquedocs21/a.cmomconfig.shtml


[root at pbs02 ~]# pbsnodes -a|grep -c espacio
122
[root at pbs02 ~]# pbsnodes -a|grep -c "espacio:0"
122
So, all nodes have the resource=0.
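
The same check can be sketched as a small helper that shows every
reported value at once instead of one grep per value. It assumes, as in
the grep above, that each node's status contains a token of the form
espacio:<digits>; the function name is made up for illustration:

```shell
# Summarise the "espacio" values found in `pbsnodes -a` output on stdin.
# Prints one line per distinct value: "<token> <number of occurrences>".
# Assumes tokens look like espacio:<digits>, and a grep that supports -o.
espacio_summary() {
    grep -o 'espacio:[0-9][0-9]*' | sort | uniq -c | awk '{ print $2, $1 }'
}
```

It would be used as: pbsnodes -a | espacio_summary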

I submit a job:

[arnaubria at ui01 ~]$ echo sleep 5|qsub -l other=espacio -q short
560171.pbs02.pic.es
[arnaubria at ui01 ~]$ qstat -f 560171.pbs02.pic.es
Job Id: 560171.pbs02.pic.es
    Job_Name = STDIN
    Job_Owner = arnaubria at ui01.pic.es
    job_state = Q
    queue = short
    server = pbs02.pic.es
    Checkpoint = u
    ctime = Tue Sep 23 10:49:10 2008
    Error_Path = ui01.pic.es:/nfs/pic.es/user/a/arnaubria/STDIN.e560171
    Hold_Types = n
    Join_Path = n
    Keep_Files = n
    Mail_Points = a
    mtime = Tue Sep 23 10:49:10 2008
    Output_Path = ui01.pic.es:/nfs/pic.es/user/a/arnaubria/STDIN.o560171
    Priority = 0
    qtime = Tue Sep 23 10:49:10 2008
    Rerunable = True
    Resource_List.cput = 01:30:00
    Resource_List.other = espacio
    Resource_List.walltime = 03:00:00
    Variable_List = PBS_O_HOME=/nfs/pic.es/user/a/arnaubria,
        PBS_O_LANG=en_US.UTF-8,PBS_O_LOGNAME=arnaubria,
        PBS_O_PATH=/usr/kerberos/bin:/opt/glite/bin:/opt/glite/externals/bin:
        /opt/lcg/bin:/opt/lcg/sbin:/opt/edg/bin:/opt/edg/sbin:/opt/globus/sbin
        :/opt/globus/bin:/opt/gpt/sbin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6
        /bin:/opt/d-cache//srm/bin:/opt/d-cache//dcap/bin:/usr/java/jdk1.5.0_1
        4/bin:/nfs/pic.es/user/a/arnaubria/bin,
        PBS_O_MAIL=/var/spool/mail/arnaubria,PBS_O_SHELL=/bin/bash,
        PBS_O_HOST=ui01.pic.es,PBS_O_WORKDIR=/nfs/pic.es/user/a/arnaubria,
        PBS_O_QUEUE=short
    etime = Tue Sep 23 10:49:10 2008
    submit_args = -l other=espacio -q short


[root at pbs02 ~]# qstat  560171
qstat: Unknown Job Id 560171.pbs02.pic.es
[root at pbs02 ~]#  checkjob 560171


checking job 560171

State: Running
Creds:  user:arnaubria  group:grid  class:short  qos:DEFAULT
WallTime: 00:00:00 of 3:00:00
SubmitTime: Tue Sep 23 10:49:10
  (Time Queued  Total: 00:01:46  Eligible: 00:01:46)

StartTime: Tue Sep 23 10:50:56
Total Tasks: 1

Req[0]  TaskCount: 1  Partition: DEFAULT
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [slc4]
Allocated Nodes:
[td060.pic.es:1]


IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 1
PartitionMask: [ALL]
Flags:       BACKFILL RESTARTABLE

Reservation '560171' (00:00:00 -> 3:00:00  Duration: 3:00:00)
PE:  1.00  StartPriority:  0


....

It shouldn't start....


If I request file=10000kb, as you do, it works, but then I lose the
ability to specify a different directory at the worker node level.


> Tom
Cheers,
Arnau

