[torqueusers] cannot locate feasible nodes

Mahmood Naderan nt_mahmood at yahoo.com
Thu Jun 14 09:18:48 MDT 2012


Sorry, our assistant made a mistake in his qsub script. The problem was a typo: the script had "#PBS -l nodes=sw01" instead of "#PBS -l nodes=ws01", so the job requested a node that does not exist.
Thanks for your help.
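
A typo like this could be caught before submission by comparing the host names in the script's "#PBS -l nodes=" line against the server's nodes file. The following is only a minimal sketch under stated assumptions: the function names known_nodes and unknown_nodes are hypothetical (they are not part of TORQUE), the nodes file text is the one quoted below from /var/spool/pbs/server_priv/nodes, and only plain host names and numeric node counts are handled, not node properties.

```python
import re

# Matches the node spec in a directive such as "#PBS -l nodes=ws01:ppn=4+ws02".
NODES_DIRECTIVE = re.compile(r"^#PBS\s+-l\s+.*?\bnodes=([^\s,]+)")

# Contents of the nodes file as quoted in this thread.
NODES_FILE = """\
srv np=14
ws01 np=24
ws02 np=32
ws03 np=32
ws04 np=32
ws05 np=32
"""


def known_nodes(nodes_file_text):
    """Return the host names listed in a server_priv/nodes style file."""
    names = set()
    for line in nodes_file_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.add(line.split()[0])
    return names


def unknown_nodes(script_text, nodes_file_text):
    """Return host names requested by the script that the nodes file lacks."""
    known = known_nodes(nodes_file_text)
    missing = []
    for line in script_text.splitlines():
        match = NODES_DIRECTIVE.match(line.strip())
        if not match:
            continue
        for part in match.group(1).split("+"):
            name = part.split(":")[0]
            # A bare number (nodes=2) is a node count, not a host name.
            # Assumption: node properties are not used in the first position.
            if not name.isdigit() and name not in known:
                missing.append(name)
    return missing


if __name__ == "__main__":
    print(unknown_nodes("#PBS -l nodes=sw01\n", NODES_FILE))  # ['sw01']
    print(unknown_nodes("#PBS -l nodes=ws01\n", NODES_FILE))  # []
```

A non-empty result means the script names a host the server does not know about, which is the situation that produced "cannot locate feasible nodes" here.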



// Naderan *Mahmood;


________________________________
From: Ken Nielson <knielson at adaptivecomputing.com>
To: Mahmood Naderan <nt_mahmood at yahoo.com>; Torque Users Mailing List <torqueusers at supercluster.org> 
Sent: Thursday, June 14, 2012 7:05 PM
Subject: Re: [torqueusers] cannot locate feasible nodes





On Thu, Jun 14, 2012 at 5:53 AM, Mahmood Naderan <nt_mahmood at yahoo.com> wrote:

Dear all,
>I faced this problem before (6 months ago), but I don't remember the solution. There are also discussions about this error, but I didn't find a clear solution in them. Here is the problem: when I run the qsub command, I get this error:
>
>qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
>
>Assumptions:
>1)
>
>mahmood@srv:~$ pbsnodes -l all
>srv                    free
>ws01                 free
>ws02                 free
>ws03                 free
>ws04                 free
>ws05                 free
>
>
>2)
>
>mahmood@srv:~$ showq
>ACTIVE JOBS--------------------
>JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME
>
>     0 Active Jobs       0 of  166 Processors Active (0.00%)
>                         0 of    6 Nodes Active      (0.00%)
>
>
>3)
>mahmood@srv:~$ cat /etc/hosts
>127.0.0.1       localhost.localdomain localhost
>192.168.1.100 hpclab.srv srv
>192.168.1.1 hpclab.ws01 ws01
>192.168.1.2 hpclab.ws02 ws02
>192.168.1.3 hpclab.ws03 ws03
>192.168.1.4 hpclab.ws04 ws04
>192.168.1.5 hpclab.ws05 ws05
>
>
>4) Passwordless ssh works fine:
>mahmood@srv:~$ ssh ws01
>Last login: Thu Jun 14 19:29:50 2012 from hpclab.srv
>mahmood@ws01:~$
>
>
>5)
>root@srv:~# cat /var/spool/pbs/server_priv/nodes
>srv np=14
>ws01 np=24
>ws02 np=32
>ws03 np=32
>ws04 np=32
>ws05 np=32
>
>
>6)
>root@srv:~# cat /var/spool/pbs/server_name
>hpclab.srv
>
>7)
>root@srv:~# qmgr
>Max open servers: 4
>Qmgr: print server
>#
># Create queues and set their attributes.
>#
>#
># Create and define queue srvq
>#
>create queue srvq
>set queue srvq queue_type = Execution
>set queue srvq Priority = 10
>set queue srvq resources_default.nodes = 1
>set queue srvq resources_default.walltime = 960:00:00
>set queue srvq enabled = True
>set queue srvq started = True
>#
># Create and define queue medium
>#
>create queue medium
>set queue medium queue_type = Execution
>set queue medium Priority = 80
>set queue medium resources_default.nodes = 1
>set queue medium resources_default.walltime = 05:00:00
>set queue medium enabled = True
>set queue medium started = True
>#
># Create and define queue small
>#
>create queue small
>set queue small queue_type = Execution
>set queue small Priority = 100
>set queue small resources_default.nodes = 1
>set queue small resources_default.walltime = 02:00:00
>set queue small enabled = True
>set queue small started = True
>#
># Create and define queue very_small
>#
>create queue very_small
>set queue very_small queue_type = Execution
>set queue very_small Priority = 120
>set queue very_small resources_default.nodes = 1
>set queue very_small resources_default.walltime = 01:00:00
>set queue very_small enabled = True
>set queue very_small started = True
>#
># Create and define queue big
>#
>create queue big
>set queue big queue_type = Execution
>set queue big Priority = 60
>set queue big resources_default.nodes = 1
>set queue big resources_default.walltime = 10:00:00
>set queue big enabled = True
>set queue big started = True
>#
># Set server attributes.
>#
>set server scheduling = True
>set server acl_hosts = hpclab.srv
>set server acl_hosts += srv
>set server default_queue = very_big
>set server log_events = 511
>set server mail_from = adm
>set server resources_available.nodect = 166
>set server scheduler_iteration = 600
>set server node_check_rate = 150
>set server tcp_timeout = 6
>set server next_job_number = 59522
>set server server_name = hpclab.srv
>
>
>
>I will be very thankful for any hint/help.
>Regards,
>
>// Naderan *Mahmood;
>
What version of TORQUE are you using, and what is the entire qsub line?

Ken

