[torqueusers] beginning with torqueu
Josh Butikofer
josh at clusterresources.com
Thu Oct 19 16:32:52 MDT 2006
Standa,
You cannot submit jobs as root. Use "qsub" as a non-privileged user.
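
The two errors quoted below ("Bad UID for job execution" when run as root, and "pbs_iff: file not setuid root" when run as sk) can be checked for before calling qsub. The following is only a rough sketch, not part of TORQUE: the helper names uid_may_submit and is_setuid are made up for illustration, and the pbs_iff path is an assumed default install location that may differ on your system.

```shell
#!/bin/sh
# Sketch only: helper names and the pbs_iff path are assumptions.

# Return 0 if the given numeric UID is not root (UID 0).
uid_may_submit() {
    [ "$1" -ne 0 ]
}

# Return 0 if the file exists and has its set-user-ID bit set.
is_setuid() {
    [ -u "$1" ]
}

# Assumed default location; override by setting PBS_IFF in the environment.
PBS_IFF=${PBS_IFF:-/usr/local/sbin/pbs_iff}

if ! uid_may_submit "$(id -u)"; then
    echo "running as root: switch to a regular account before calling qsub" >&2
fi

if ! is_setuid "$PBS_IFF"; then
    echo "$PBS_IFF is missing or not setuid root" >&2
    echo "as root, try: chown root $PBS_IFF && chmod u+s $PBS_IFF" >&2
fi
```

If both checks stay silent, submitting from the regular account (for example after "su - sk") with the same command, echo "sleep 30" | qsub, is the next thing to try.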
--
Joshua Butikofer
Cluster Resources, Inc.
josh at clusterresources.com
Voice: (801) 717-3707
Fax: (801) 717-3738
--------------------------
Standa Kunc wrote:
> Thank you. I am now one step further, but I have a different problem.
> I cannot submit a job via qsub. This is the error message:
>
> root@pc-xubuntu:~/torque-2.1.3# echo "sleep 30" | qsub
> qsub: Bad UID for job execution
>
> My current configuration is:
>
> Xubuntu distribution, with the root user enabled
>
>
> I modified torque.setup to this:
> --
>
> root@pc-xubuntu:~/torque-2.1.3# cat torque.setup
> #!/bin/sh
> # torque.setup
>
> # USAGE: torque.setup <USERNAME>
>
> if [ "$1" = "" ] ; then
> echo "USAGE: torque.setup <USERNAME>"
> exit 1
> fi
>
> # create default queue
> # enable operator privileges
>
> #USER=$1@`hostname`
> USER=$1@localhost
>
> echo "initializing TORQUE (admin: $USER)"
>
> pbs_server -t create
>
> qmgr -c "set server scheduling=true"
>
> echo set server operators += $USER | qmgr
> echo set server operators += sk@localhost | qmgr
> echo set server managers += $USER | qmgr
> echo set server managers += sk@localhost | qmgr
> echo set server submit_hosts = localhost | qmgr
>
> qmgr -c 'create queue batch'
> qmgr -c 'set queue batch queue_type = execution'
> qmgr -c 'set queue batch started = true'
> qmgr -c 'set queue batch enabled = true'
> qmgr -c 'set queue batch resources_default.walltime = 1:00:00'
> qmgr -c 'set queue batch resources_default.nodes = 1'
>
> qmgr -c 'set server default_queue = batch'
>
>
> This is the result of running torque.setup:
> --
>
> root@pc-xubuntu:~/torque-2.1.3# ./torque.setup root
> initializing TORQUE (admin: root@localhost)
> PBS_Server pc-xubuntu.local: Create mode and server database exists,
> do you wish to continue y/(n)?y
> Max open servers: 4
> Max open servers: 4
> Max open servers: 4
> Max open servers: 4
> Max open servers: 4
>
>
> Server settings:
> --
> root@pc-xubuntu:~/torque-2.1.3# qmgr -c 'p s'
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue batch
> #
> create queue batch
> set queue batch queue_type = Execution
> set queue batch resources_default.nodes = 1
> set queue batch resources_default.walltime = 01:00:00
> set queue batch enabled = True
> set queue batch started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server managers = root@localhost
> set server managers += sk@localhost
> set server operators = root@localhost
> set server operators += sk@localhost
> set server default_queue = batch
> set server log_events = 511
> set server mail_from = adm
> set server scheduler_iteration = 600
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server pbs_version = 2.1.3
> set server submit_hosts = localhost
>
>
>
> Queue is set up properly:
> --
>
> root@pc-xubuntu:~/torque-2.1.3# qstat -q
>
> server: pc-xubuntu
>
> Queue Memory CPU Time Walltime Node Run Que Lm State
> ---------------- ------ -------- -------- ---- --- --- -- -----
> batch -- -- -- -- 0 0 -- E R
> ----- -----
> 0 0
>
> Compute node is replying properly:
> --
>
> root@pc-xubuntu:~/torque-2.1.3# pbsnodes -a
> localhost
> state = free
> np = 1
> ntype = cluster
> status = opsys=linux,uname=Linux pc-xubuntu 2.6.16-xen #1 SMP Thu Apr 13 18:46:07 BST 2006 i686,sessions=2949 3005 3007 3298 3339 3351 3356 3380 3464,nsessions=9,nusers=3,idletime=0,totmem=1641312kb,availmem=1552080kb,physmem=131244kb,ncpus=1,loadave=0.03,netload=0,state=free,jobs=? 0,rectime=1161291975
>
> There is trust between server and compute node:
> --
> root@pc-xubuntu:~/torque-2.1.3# cat /var/spool/torque/mom_priv/config
> $pbsserver localhost
>
> root@pc-xubuntu:~/torque-2.1.3# cat /var/spool/torque/localhost
> localhost
>
>
> Node is defined:
> --
> root@pc-xubuntu:~/torque-2.1.3# cat /var/spool/torque/server_priv/nodes
> localhost
>
>
> But submitting a job via qsub does not work:
> --
> root@pc-xubuntu:~/torque-2.1.3# echo "sleep 30" | qsub
> qsub: Bad UID for job execution
>
>
> User sk@localhost is also a server operator and manager, but submitting
> as this user does not work either:
> --
> pbs_iff: file not setuid root, likely misconfigured
> pbs_iff: cannot connect to pc-xubuntu:15001 - fatal error, errno=13
> (Permission denied)
> No Permission.
> qsub: cannot connect to server pc-xubuntu (errno=15007)
>
>
> Do I need sshd installed?
> Do I need a configured .rhosts file?
> Is it possible to submit a job via qsub as root?
>
> You said that you use a one-PC configuration too. Would you be so
> kind as to send me a short version of your configuration process?
> I mean a checklist that I can follow. Setting up a single
> computer is probably easier than setting up a network.
>
> And yes, I have searched the previous mails for a solution. I have
> tried Google too.
>
> Thank you very much
> S. Kunc
>
> On 19/10/06, Glen Beane <glen.beane+torque at gmail.com> wrote:
>> On 10/18/06, Standa Kunc <standa.kunc at gmail.com> wrote:
>> > Hello,
>> >
>> > I am beginning with TORQUE and I would like to test it. Just a simple
>> > test on one PC (the compute node will be the same computer as the server).
>> >
>> > I have installed Torque 2.1.3 successfully, but the next phase, called L.2
>> > Initialize/Configure TORQUE on the Server (pbs_server) in the QuickStart
>> > Manual, is not working for me.
>> >
>> > The problem is that running the command ./torque.setup root results in
>> > error messages.
>> >
>> > root@pc-xubuntu:~/torque-2.1.3# ./torque.setup root
>> > initializing TORQUE (admin: root@pc-xubuntu)
>> > Max open servers: 4
>> > qmgr obj= svr=default: Bad ACL entry in host list MSG=First bad host: pc-xubuntu
>> > Max open servers: 4
>> > qmgr obj= svr=default: Bad ACL entry in host list MSG=First bad host: pc-xubuntu
>> >
>> > Some errors in /var/log/daemon.log
>> >
>> > Oct 19 00:25:03 localhost PBS_Server: Connection refused (111) in
>> > contact_sched, Could not contact Scheduler - port 15004
>> > Oct 19 00:25:03 localhost PBS_Server: Bad ACL entry in host list
>> > (15073) in manager_oper_chk, bad entry in acl: root@pc-xubuntu
>> > Oct 19 00:25:03 localhost PBS_Server: Bad ACL entry in host list
>> > (15073) in manager_oper_chk, bad entry in acl: root@pc-xubuntu
>> >
>> > It seems to me that I have a problem with hostname resolution, but I am
>> > a Linux beginner and cannot fix it myself. It is probably a problem
>> > with forward/reverse name resolution.
>> >
>> > I hope I do not need a DNS server for this. Can you advise me or give me
>> > some simple material on setting up name resolution properly? Do you have any
>> > other ideas?
>>
>> Use localhost for your server name. I often test TORQUE on my
>> workstation.
>>