[Mauiusers] setting busy status in function of disk space and cpuload

Arnau Bria arnaubria at pic.es
Tue Sep 23 02:52:28 MDT 2008

On Mon, 22 Sep 2008 12:29:02 -0500
Tom Rudwick wrote:

Hi Tom,
> We have this in our mom configs:
> size [fs=/tmp]
Yep, we have something similar (/home instead of /tmp).

> This makes the size resource hold the space available in /tmp
> Then we do:
> qsub -l ddisk=5gb
> Or you could set a default for ddisk for a queue or server.
> It should only run on nodes with the proper amount of free
> space. 
Sure, but what happens if I don't define that resource on some nodes?
We have some nodes with big disks and others with small ones.

> I'm sure you can do this other ways, but I don't have
> experience with those. Hopefully you can adapt this to what
> you need.

I was trying to define a dynamic resource: a simple script that
returns the free disk space on nodes with small disks and a big constant
value on nodes with big disks. But I don't know how to use it. When I
submit a job requesting more space than any node has, the job is always
scheduled when it shouldn't be. I have also changed the dynamic resource
into a boolean value (1 if the node has enough space, 0 if not), but
again, if all nodes report the resource as 0 and I submit a job
requesting that resource, the job is still scheduled.
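
A minimal sketch of the kind of script described above, for
illustration only. It is not the script actually in use: the names
advertised_kb and has_enough, the 500 GB threshold and the constant
are all hypothetical, and how the script gets registered as a dynamic
resource on each node is not shown. It only covers the node side and
does not explain why the job is scheduled anyway.

```python
#!/usr/bin/env python3
"""Sketch: report a disk-space value that a node could advertise."""
import shutil
import sys

# Hypothetical values, not taken from the real cluster setup.
BIG_DISK_KB = 500 * 1024 * 1024   # nodes with at least 500 GB count as "big"
BIG_CONSTANT_KB = 10**9           # constant value advertised by big-disk nodes


def free_kb(path):
    """Return the free space, in KB, of the filesystem that holds path."""
    return shutil.disk_usage(path).free // 1024


def advertised_kb(total_kb, free_kb_value):
    """Big-disk nodes report a constant; small-disk nodes report real free space."""
    if total_kb >= BIG_DISK_KB:
        return BIG_CONSTANT_KB
    return free_kb_value


def has_enough(value_kb, needed_kb):
    """Boolean variant: 1 if the node has enough space, 0 if not."""
    return 1 if value_kb >= needed_kb else 0


if __name__ == "__main__":
    # Pass the directory to check (for example /home or /tmp) as the
    # first argument; "/" is only a fallback for this sketch.
    path = sys.argv[1] if len(sys.argv) > 1 else "/"
    usage = shutil.disk_usage(path)
    print(advertised_kb(usage.total // 1024, usage.free // 1024))
```

Run by hand with a directory argument, it prints a single number in KB,
which is easy to compare against what the node reports in pbsnodes.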


[root@pbs02 ~]# pbsnodes -a|grep -c espacio
[root@pbs02 ~]# pbsnodes -a|grep -c "espacio:0"
So, all nodes have the resource=0.

I submit a job:

[arnaubria@ui01 ~]$ echo sleep 5|qsub -l other=espacio -q short
[arnaubria@ui01 ~]$ qstat -f 560171.pbs02.pic.es
Job Id: 560171.pbs02.pic.es
    Job_Name = STDIN
    Job_Owner = arnaubria@ui01.pic.es
    job_state = Q
    queue = short
    server = pbs02.pic.es
    Checkpoint = u
    ctime = Tue Sep 23 10:49:10 2008
    Error_Path = ui01.pic.es:/nfs/pic.es/user/a/arnaubria/STDIN.e560171
    Hold_Types = n
    Join_Path = n
    Keep_Files = n
    Mail_Points = a
    mtime = Tue Sep 23 10:49:10 2008
    Output_Path = ui01.pic.es:/nfs/pic.es/user/a/arnaubria/STDIN.o560171
    Priority = 0
    qtime = Tue Sep 23 10:49:10 2008
    Rerunable = True
    Resource_List.cput = 01:30:00
    Resource_List.other = espacio
    Resource_List.walltime = 03:00:00
    Variable_List = PBS_O_HOME=/nfs/pic.es/user/a/arnaubria,
    etime = Tue Sep 23 10:49:10 2008
    submit_args = -l other=espacio -q short

[root@pbs02 ~]# qstat  560171
qstat: Unknown Job Id 560171.pbs02.pic.es
[root@pbs02 ~]#  checkjob 560171

checking job 560171

State: Running
Creds:  user:arnaubria  group:grid  class:short  qos:DEFAULT
WallTime: 00:00:00 of 3:00:00
SubmitTime: Tue Sep 23 10:49:10
  (Time Queued  Total: 00:01:46  Eligible: 00:01:46)

StartTime: Tue Sep 23 10:50:56
Total Tasks: 1

Req[0]  TaskCount: 1  Partition: DEFAULT
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [slc4]
Allocated Nodes:

IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 1
PartitionMask: [ALL]

Reservation '560171' (00:00:00 -> 3:00:00  Duration: 3:00:00)
PE:  1.00  StartPriority:  0


The job shouldn't have started...

If I request file=10000kb instead, as you do, it works, but then I lose
the ability to specify a different directory at the worker node level.

> Tom
