[torqueusers] PBS and Globus gatekeeper node
ckirby3 at colsa.com
Wed Feb 2 08:52:40 MST 2005
Try adding the machine to /etc/hosts.equiv on the PBS server. That helped
me even though I'm using ssh. I was getting the same error.
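A minimal sketch of that change, in case it helps. The hostname gatekeeper.mydomain.com is an assumption (use whatever name the PBS server sees for the submitting machine), and add_trusted_host is a made-up helper, not part of PBS. It only appends the entry if an identical line is not already in the file:

```shell
#!/bin/sh
# add_trusted_host FILE HOST
# Append HOST to FILE unless an identical line already exists.
# (Hypothetical helper for illustration; not part of PBS/TORQUE.)
add_trusted_host() {
    file=$1
    host=$2
    touch "$file" || return 1
    if ! grep -qxF -- "$host" "$file"; then
        printf '%s\n' "$host" >> "$file"
    fi
}

# On the PBS server, as root (the hostname here is an assumption):
# add_trusted_host /etc/hosts.equiv gatekeeper.mydomain.com
```

Running it twice leaves a single entry, so it should be safe to re-run.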
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org]On Behalf Of Chris Samuel
Sent: Tuesday, February 01, 2005 6:59 PM
To: torqueusers at supercluster.org
Subject: Re: [torqueusers] PBS and Globus gatekeeper node
On Fri, 28 Jan 2005 11:20 am, Gerson Galang wrote:
> But if I set "router@localhost" to route jobs to queues on another
> machine "batch@anothermachine.mydomain.com", PBS won't run the jobs
> anymore. PBS will tell me "Jobs rejected by all possible destinations"
> even without me seeing it tried contacting anothermachine.mydomain.com.
NB: We've never tried this, so I have no idea whether it can work.
A few thoughts:
1) Is anything logged about the rejection on the destination machine? It
could be that the gatekeeper is not permitted to queue jobs to that PBS
server, or that DNS and the local hostnames disagree.
2) Does the following command work from your gatekeeper?
qstat -q @anothermachine.mydomain.com
3) Can you submit a test job to that remote system without using a routing
queue? For example:
echo hostname | qsub -l walltime=0:1:0 -q queue@anothermachine.mydomain.com
This works here between machines (which surprised me!).
4) Do you see anything on the wire if you run a packet sniffer?
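Checks 2 to 4 above can be strung together into a small script to run on the gatekeeper. This is only a sketch under a few assumptions: the queue name batch and the host are taken from Gerson's message, the run helper and the DRY_RUN variable are invented for illustration, and port 15001 is the default pbs_server port, which may differ on your installation.

```shell
#!/bin/sh
# Values taken from the quoted message; adjust to your site.
REMOTE=${REMOTE:-anothermachine.mydomain.com}
QUEUE=${QUEUE:-batch}

# run CMD...: print the command, then execute it unless DRY_RUN=1.
# (Hypothetical helper, not part of PBS/TORQUE.)
run() {
    printf '+ %s\n' "$*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

check_remote_pbs() {
    # 2) Can the gatekeeper see the remote server's queues?
    run qstat -q "@$REMOTE"
    # 3) Can a job be submitted directly, bypassing the routing queue?
    run sh -c "echo hostname | qsub -l walltime=0:1:0 -q $QUEUE@$REMOTE"
    # 4) Watch for traffic to the remote pbs_server
    #    (needs root; stop with Ctrl-C). Port 15001 is assumed.
    run tcpdump -n host "$REMOTE" and port 15001
}

# Example: DRY_RUN=1 check_remote_pbs   # only prints the commands
```

If the tcpdump step shows nothing while a job is sitting in the routing queue, that would suggest the server never attempts the connection, which points back at the local routing configuration rather than the remote side.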
Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia