[torqueusers] PBS and Globus gatekeeper node

Clifton Kirby ckirby3 at colsa.com
Wed Feb 2 08:52:40 MST 2005


Try adding the machine to /etc/hosts.equiv on the PBS server.  That helped
me even though I'm using ssh.  I was getting the same error.
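
For anyone who wants to script that change, here is a minimal sketch. The
helper name add_equiv_host and the hostname gatekeeper.mydomain.com are
made up for illustration; only the /etc/hosts.equiv path comes from the
suggestion above. It appends the host only if it is not already listed.

```shell
# Sketch: add a submitting host to a hosts.equiv-style file, once only.
# add_equiv_host is a hypothetical helper, not part of PBS/TORQUE.
add_equiv_host() {
    host="$1"
    file="${2:-/etc/hosts.equiv}"   # default path, override for testing
    # -x: match whole line, -F: fixed string, -q: quiet
    if ! grep -qxF -- "$host" "$file" 2>/dev/null; then
        printf '%s\n' "$host" >> "$file"
    fi
}

# Example (as root on the PBS server; the hostname is a placeholder):
# add_equiv_host gatekeeper.mydomain.com
```

Passing a second argument points the helper at a scratch file, which makes
it easy to try out before touching the real one.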

- Cliff

-----Original Message-----
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org]On Behalf Of Chris Samuel
Sent: Tuesday, February 01, 2005 6:59 PM
To: torqueusers at supercluster.org
Subject: Re: [torqueusers] PBS and Globus gatekeeper node


On Fri, 28 Jan 2005 11:20 am, Gerson Galang wrote:

> But if I set "router@localhost" to route jobs to queues on another
> machine "batch@anothermachine.mydomain.com", PBS won't run the jobs
> anymore. PBS will tell me "Jobs rejected by all possible destinations"
> even without me seeing it try to contact anothermachine.mydomain.com.

NB: We've never tried this, so I have no idea whether it can work.

A few thoughts:

1) Is anything logged about rejection on the destination machine ?  Could
it be that it's not permitted to queue jobs to that PBS server, or some
DNS vs local hostname issues ?

2) Does the following command work from your gatekeeper ?

 qstat -q @anothermachine.mydomain.com

3) Can you submit a test job to that remote system without using a routing
queue ?  For example:

 echo hostname | qsub -l walltime=0:1:0 -q queue@anothermachine.mydomain.com

This works here between machines (which surprised me!).

4) Do you see anything on the wire if you run a packet sniffer ?
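
One possible way to do that capture with tcpdump is sketched below. It
assumes the remote pbs_server listens on TCP port 15001 (the usual default,
but check your own configuration), and pbs_capture_filter is a hypothetical
helper that only builds the filter expression.

```shell
# Sketch: build a tcpdump filter for traffic to a remote PBS server.
# pbs_capture_filter is a hypothetical helper; port 15001 is an assumed
# default and can be overridden with a second argument.
pbs_capture_filter() {
    printf 'host %s and tcp port %s' "$1" "${2:-15001}"
}

# Example (as root on the gatekeeper, then submit a job from another shell):
# tcpdump -n -i any "$(pbs_capture_filter anothermachine.mydomain.com)"
```

If nothing shows up while a job is being routed, the request is probably
being rejected locally before the remote server is ever contacted.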

cheers!
Chris
--
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia






