[torqueusers] $PBS_NODEFILE
rcbord at wm.edu
Wed Oct 24 17:33:26 MDT 2007
Hi Aaron,
Something happened with the pbs_server, pbs_scheduler or pbs_mom.
We restarted all three and it started working again.
But to answer your question, the job script requested:
#PBS -l walltime=00:10:00
#PBS -l nodes=2:pinecone:ppn=2
It would run a two-processor job or a serial job, but only on the first
compute node, which in this case is pc02. With the above resource
request, the $PBS_NODEFILE would contain only "pc02 pc02".
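
A quick way to check what a job was actually given is to count the
entries per host in the node file. The sketch below is only an
illustration: summarize_nodefile is a made-up helper name, not part of
TORQUE, and it assumes the usual layout of one line per allocated
processor slot. For nodes=2:ppn=2 one would normally expect two hosts,
each listed twice.

```shell
#!/bin/sh
# Sketch: summarize how many slots each host has in a TORQUE node file.
# summarize_nodefile is a hypothetical helper name, not a TORQUE command.
summarize_nodefile() {
    # Assumes one line per allocated processor slot; count repeats per host
    # and print "host count" pairs.
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Inside a running job TORQUE sets PBS_NODEFILE; do nothing anywhere else.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    summarize_nodefile "$PBS_NODEFILE"
fi
```

Placed in the job script after the #PBS lines, this prints one line per
host, so a node file holding only "pc02 pc02" shows up as a single host
with two slots.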
Chris Bording
Application Analyst
High Performance Computing Group
Information Technology
The College of William and Mary
(757)-221-3488
rcbord at wm.edu
On Wed, 24 Oct 2007, Aaron Knister wrote:
> So jobs aren't running? What syntax are you using to request resources for
> submitted jobs?
>
> -Aaron
>
> On Oct 24, 2007, at 11:18 AM, rcbord at wm.edu wrote:
>
>> Hi all,
>> OK, we installed torque-2.1.9 and had it running on Monday, but now it is
>> not working correctly. The $PBS_NODEFILE only has two processors in
>> it.
>>
>> The server_priv/nodes file has
>>
>> pc02 np=2
>> pc03 np=2
>> .
>> .
>> pc13 np=2
>>
>> all the nodes are "free" according to the pbsnodes -a output
>>
>> pc12
>> state = free
>> np = 2
>> properties = pinecone,v20z,score
>> ntype = cluster
>> status = opsys=linux,uname=Linux pc12 2.6.16.53-0.8-smp #1 SMP Fri
>> Aug 31 13:07:27 UTC 2007 x86_64,sessions=4144,nsessions=1,nusers=1,
>> idletime=167833,totmem=7938936kb,availmem=7732980kb,physmem=3737980kb,
>> ncpus=2,loadave=0.00,netload=247513913,state=free,jobs=? 15201,
>> rectime=1193238080
>>
>>
>> #
>> # Set server attributes.
>> #
>> set server scheduling = True
>> set server default_queue = submit
>> set server log_events = 127
>> set server mail_from = adm
>> set server max_running = 24
>> set server max_user_run = 24
>> set server max_group_run = 24
>> set server acl_host_enable = True
>> set server acl_hosts = pinecone.cwm.edu
>> set server acl_hosts += pc02
>> set server acl_hosts += pc03
>> set server acl_hosts += pc04
>> set server acl_hosts += pc05
>> set server acl_hosts += pc06
>> set server acl_hosts += pc07
>> set server acl_hosts += pc08
>> set server acl_hosts += pc09
>> set server acl_hosts += pc10
>> set server acl_hosts += pc11
>> set server acl_hosts += pc12
>> set server acl_hosts += pc13
>> set server query_other_jobs = True
>> set server acl_roots = root@pinecone.cwm.edu
>> set server managers = manager1@pinecone.cwm.edu
>> set server operators = manager2@pinecone.cwm.edu
>> set server operators += manager1@pinecone.cwm.edu
>> set server resources_available.ncpus = 24
>> set server resources_available.nodect = 12
>> set server resources_max.nodect = 12
>> set server scheduler_iteration = 30
>> set server node_check_rate = 150
>> set server tcp_timeout = 6
>> set server log_level = 5
>> set server pbs_version = 2.1.9
>>
>> I don't think we have changed anything; it just quit!
>>
>>
>> Chris Bording
>> Application Analyst
>> High Performance Computing Group Information Technology
>> The College of William and Mary
>> (757)-221-3488
>> rcbord at wm.edu
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers
>
> Aaron Knister
> Associate Systems Administrator/Web Designer
> Center for Research on Environment and Water
>
> (301) 595-7001
> aaron at iges.org
>
>
>
>