[torqueusers] nodes file - basic install problems

Rob Holmes Rob.Holmes at bmtwbm.com.au
Thu Jul 5 22:58:01 MDT 2012


I’m installing a small HPC cluster at work. I’ve never done this before, and it’s causing me problems.

My nodes file contains 14 compute nodes named node01, node02, etc.  At the moment I have just four nodes switched on; the remainder are shown as ‘down’ by pbsnodes -a.  When I submit a number of jobs, they run only on the first two nodes, while the other two stay marked as ‘free’ regardless of how many jobs are waiting in the queue.  Queued jobs are held until either node01 or node02 comes free, and then they run.  node03 and node12 (the other two live nodes) never run a job.
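
To make the node states easier to compare at a glance, the output of pbsnodes -a can be reduced to a node-to-state table. The following is only a minimal sketch: the function name parse_pbsnodes and the SAMPLE text are hypothetical, the sample merely imitates the layout pbsnodes -a prints (an unindented node name followed by indented "key = value" lines), and the np values and the job-exclusive states are assumptions, not taken from the real cluster.

```python
def parse_pbsnodes(text):
    """Map each node name to its state from `pbsnodes -a` style output."""
    states = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate node records
        if not line[0].isspace():
            current = line.strip()  # an unindented line starts a new node record
        elif current is not None:
            key, sep, value = line.strip().partition(" = ")
            if sep and key == "state":
                states[current] = value
    return states


# Hypothetical sample shaped like `pbsnodes -a` output; all values are illustrative.
SAMPLE = """\
node01
     state = job-exclusive
     np = 1

node02
     state = job-exclusive
     np = 1

node03
     state = free
     np = 1

node04
     state = down
     np = 1

node12
     state = free
     np = 1
"""

if __name__ == "__main__":
    for node, state in parse_pbsnodes(SAMPLE).items():
        print(node, state)
```

On a live system the text would presumably come from running pbsnodes -a and capturing its output rather than from a hard-coded string.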

However, when I remove node01, for example (by commenting it out in the nodes file and restarting pbs_server), jobs will run on node12.  Bizarrely, node03 is then marked as ‘down’ in pbsnodes -a.
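
One way to pin down exactly what changes across such a restart is to compare node states before and after it. This is a rough sketch only: changed_states is a hypothetical helper, and the two snapshots are made-up dictionaries that mirror the behaviour described above (the job-exclusive state names are assumed, not copied from real output).

```python
def changed_states(before, after):
    """Return {node: (old, new)} for every node whose state differs
    between two snapshots, including nodes present in only one of them."""
    changes = {}
    for node in sorted(set(before) | set(after)):
        old = before.get(node, "absent")
        new = after.get(node, "absent")
        if old != new:
            changes[node] = (old, new)
    return changes


# Hypothetical snapshots; the states are illustrative, not real cluster output.
before = {
    "node01": "job-exclusive",
    "node02": "job-exclusive",
    "node03": "free",
    "node12": "free",
}
after = {
    "node02": "job-exclusive",
    "node03": "down",
    "node12": "job-exclusive",
}

if __name__ == "__main__":
    for node, (old, new) in changed_states(before, after).items():
        print(f"{node}: {old} -> {new}")
```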

This is long, but basically I’m getting a lot of odd behavior and I’m not sure where to start debugging.  All live nodes are running pbs_mom, and I’m running pbs_sched as the scheduler.  The system was working as expected with just one compute node; with more than one it is having problems.  Can anyone please help?


Rob Holmes

