[torqueusers] Problem: qsub fails to submit jobs

Raymond Page pagerc at ufl.edu
Thu Sep 2 17:46:58 MDT 2004


Output:

fidget $ pbsnodes -a
osgmon.cns.ufl.edu
      state = free
      np = 1
      properties = local,server,production,eli,osgmon
      ntype = time-shared
      status = arch=linux,uname=Linux osgmon 2.6.5-gentoo-r1 #1 Mon May 24 14:44:53 EDT 2004 i686,sessions=? 0,nsessions=? 0,nusers=0,idletime=33354,totmem=2153892kb,availmem=418560kb,physmem=643792kb,ncpus=1,loadave=0.00,rectime=1094167796

fidget.cns.ufl.edu
      state = free
      np = 2
      properties = remote,workstation,pagerc,fidget
      ntype = time-shared
      status = arch=linux,uname=Linux fidget 2.6.6 #5 Thu Jul 1 17:28:15 EDT 2004 i686,sessions=5022 5183 5184 5190 5200 5207 5214 5221 5228 5235 5242 5249 5256 5263 5270 5279 5286 5509 5544 5545 5564 5579 5552 5584 5598 32307,nsessions=26,nusers=2,idletime=1066,totmem=1519092kb,availmem=2726224kb,physmem=515040kb,ncpus=1,loadave=0.54,rectime=1094167805

rah.cns.ufl.edu
      state = free
      np = 1
      properties = remote,server,drakee,rah
      ntype = time-shared
      status = arch=linux,uname=Linux rah 2.6.1-gentoo-r1 #3 Tue Jun 1 13:56:01 EDT 2004 i686,sessions=3234 7538 7539 7540 8456 8457 8484 8485 8509 8510 32073 32074 8681 9741 9742 10231 10233 10235 10237 10239 10241 10243 10245 10247 10249 10255 10257 10259 10268 10492 10545 10546 11526 24814 2460 2461 2476 2477 2480 2481 2484 2485 2488 2489 2604 2605 17123 17147 17173 30443 8520,nsessions=51,nusers=3,idletime=5131,totmem=1013384kb,availmem=2281268kb,physmem=515380kb,ncpus=1,loadave=0.94,rectime=1094167816
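
For anyone who wants to compare node records like these programmatically, here is a minimal Python sketch. It assumes each attribute sits on a single line, as pbsnodes itself prints it, and that no status value contains a comma. The function names and the abbreviated SAMPLE text are illustrative only and are not part of TORQUE.

```python
def parse_pbsnodes(text):
    """Parse `pbsnodes -a` style text into {node_name: {attribute: value}}."""
    nodes = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate node records
        if not line[0].isspace():
            # An unindented line starts a new node record.
            current = line.strip()
            nodes[current] = {}
        elif current is not None and " = " in line:
            # Indented "key = value" lines belong to the current node.
            key, value = line.strip().split(" = ", 1)
            nodes[current][key] = value
    return nodes


def parse_status(status):
    """Split the comma-separated status string into a dict (assumes no commas in values)."""
    fields = {}
    for item in status.split(","):
        key, sep, value = item.partition("=")
        if sep:
            fields[key] = value
    return fields


# Abbreviated, hypothetical sample in the same layout as the output above.
SAMPLE = """\
osgmon.cns.ufl.edu
      state = free
      np = 1
      ntype = time-shared
      status = arch=linux,nusers=0,ncpus=1,loadave=0.00

fidget.cns.ufl.edu
      state = free
      np = 2
      ntype = time-shared
      status = arch=linux,nusers=2,ncpus=1,loadave=0.54
"""

if __name__ == "__main__":
    for name, attrs in parse_pbsnodes(SAMPLE).items():
        status = parse_status(attrs.get("status", ""))
        print(name, attrs.get("ntype"), "np=" + attrs.get("np", "?"),
              "loadave=" + status.get("loadave", "?"))
```

Feeding it the real output would be a matter of piping `pbsnodes -a` into the script and reading standard input instead of SAMPLE.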



Wightman wrote:

> What is the output of pbsnodes -a?
> 
> 
> 
> On Thu, 2004-09-02 at 09:12, Raymond Page wrote:
> 
>>fidget $ qsub -l nodes=rah.cns.ufl.edu qsub.sh
>>qsub: Job exceeds queue resource limits
>>
>>I'm attempting to use the setup below and cannot understand why I am 
>>receiving the qsub failure above.  I had shared clusters working, and 
>>now I want the ability to load balance with time-shared nodes.  I'd 
>>appreciate being told what I'm missing in my setup to let jobs 
>>execute on a time-shared host.
>>
>>--
>>Raymond Page
>>
>>
>>
>>server_priv/nodes:
>>osgmon.cns.ufl.edu:ts local server production osgmon
>>fidget.cns.ufl.edu:ts np=2 remote workstation fidget
>>rah.cns.ufl.edu:ts remote server rah
>>
>>mom_priv/config:
>>$serverhost     localhost
>>$clienthost     osgmon.cns.ufl.edu
>>$restricted     *.cns.ufl.edu
>>$usecp          *.cns.ufl.edu:/home /home
>>$usecp          *.cns.ufl.edu:/u /u
>>$logevent       255
>>ideal_load      5
>>max_load        8
>>
>>
>>$ qmgr -c "p s"
>># Create queues and set their attributes.
>>create queue test
>>set queue test queue_type = Execution
>>set queue test resources_default.nodect = 1
>>set queue test resources_default.nodes = 1
>>set queue test resources_default.walltime = 01:00:00
>>set queue test enabled = True
>>set queue test started = True
>># Set server attributes.
>>set server scheduling = True
>>set server default_queue = test
>>set server log_events = 511
>>set server mail_from = adm
>>set server scheduler_iteration = 600
>>set server node_ping_rate = 300
>>set server node_check_rate = 600
>>_______________________________________________
>>torqueusers mailing list
>>torqueusers at supercluster.org
>>http://supercluster.org/mailman/listinfo/torqueusers
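
The server_priv/nodes entries quoted above can also be read mechanically, which makes it easy to see which hosts carry the :ts (time-shared) suffix. The following is a rough Python sketch only: the function name is hypothetical, and the default of np=1 for entries without an explicit np= is an assumption based on the pbsnodes output in this thread.

```python
def parse_nodes_file(text):
    """Parse server_priv/nodes style lines into a list of dicts."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        tokens = line.split()
        # A ":ts" suffix on the host name marks a time-shared node.
        name, _, suffix = tokens[0].partition(":")
        entry = {
            "name": name,
            "timeshared": suffix == "ts",
            "np": 1,  # assumed default when np= is absent
            "properties": [],
        }
        for token in tokens[1:]:
            if token.startswith("np="):
                entry["np"] = int(token[3:])
            else:
                entry["properties"].append(token)
        entries.append(entry)
    return entries


# The three entries from the nodes file quoted in this thread.
NODES_FILE = """\
osgmon.cns.ufl.edu:ts local server production osgmon
fidget.cns.ufl.edu:ts np=2 remote workstation fidget
rah.cns.ufl.edu:ts remote server rah
"""

if __name__ == "__main__":
    for entry in parse_nodes_file(NODES_FILE):
        kind = "time-shared" if entry["timeshared"] else "cluster"
        print(entry["name"], kind, "np=%d" % entry["np"], ",".join(entry["properties"]))
```

Run against the quoted file, every entry comes back with the time-shared flag set, so this configuration defines no cluster-type nodes at all.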


