[torqueusers] OS X problems with $usecp

Glen Beane beaneg at umcs.maine.edu
Tue Sep 7 19:32:39 MDT 2004


I have user home directories mounted from an NFS server other than the 
pbs_server machine (the home directories will soon be part of a SAN, 
directly attached via fibre channel).

Anyway, with the current setup I can get jobs to start, but the spool 
files are not copied back properly.

In the current setup:

the pbs_server is called bender (pbs_servername is 
bender.bender.clusters.umaine.edu)
the nfs server is called calculon

On bender and the compute nodes the user's home directory is in:
/private/var/automount/Network/Servers/calculon/Volumes/Home


pbs_mom would try to copy to 
bender.bender.clusters.umaine.edu:/private/var/automount/Network/Servers/calculon/Volumes/Home

I have a lot of experience with torque/PBS Pro on Linux, so I knew to 
use the $usecp directive in the pbs_mom config file.

If I tried

$usecp bender.bender.clusters.umaine.edu:/Network/Servers/calculon/Volumes/Home /private/var/automount/Network/Servers/calculon/Vomumes/Home

then pbs_mom's log file would look like this when reading the config file:

09/07/2004 17:43:27;0002;   pbs_mom;Svr;usecp;bender.bender.clusters.umaine.edu:/Network/Servers/calculon/Volumes/Home /private/var/automount/Network/Servers/
09/07/2004 17:43:27;0080;   pbs_mom;n/a;add_static;config[6] add name calculon value /Volumes/Home
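
The $usecp text in the log entry above stops after 119 characters (counting the leading "$usecp "), and the rest of the line shows up as a separate config entry. That pattern would be consistent with the config file being read through a fixed-size line buffer of about 120 bytes, although I have not confirmed that against the source. A minimal Python sketch of the effect follows; the 119-character limit and the function name are assumptions for illustration, and the path is spelled as it appears in the log:

```python
# Assumed limit: the logged $usecp text is 119 characters long, which would
# fit a 120-byte C line buffer (119 characters plus the terminating NUL).
# This value is inferred from the log, not taken from the pbs_mom source.
ASSUMED_MAX_CHARS = 119


def split_like_fixed_buffer(line, max_chars=ASSUMED_MAX_CHARS):
    """Split a line into the pieces that repeated fixed-size reads would return."""
    return [line[i:i + max_chars] for i in range(0, len(line), max_chars)]


# The long $usecp line from the config file.
config_line = (
    "$usecp bender.bender.clusters.umaine.edu:"
    "/Network/Servers/calculon/Volumes/Home "
    "/private/var/automount/Network/Servers/calculon/Volumes/Home"
)

chunks = split_like_fixed_buffer(config_line)
for chunk in chunks:
    # Each chunk would be handed to the config parser as if it were its own line.
    print(len(chunk), repr(chunk))
```

Under that assumption the first piece ends at ".../Network/Servers/" and the leftover piece is "calculon/Volumes/Home", which lines up with the name "calculon" and value "/Volumes/Home" reported by add_static.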

I figured the $usecp line was too long, so I symlinked /home to the 
automount point mentioned above. This gave me the following output:

09/07/2004 17:54:08;0002;   pbs_mom;Svr;usecp;bender.bender.clusters.umaine.edu:/Network/Servers/calculon/Volumes/Home /home

Looks good so far, but when I run a job I get this error:

09/07/2004 17:56:08;0004;   pbs_mom;Fil;30.bender.b.OU;Unable to copy file 30.bender.b.OU to bender.bender.clusters.umaine.edu

It should be copying the file to /home, so I don't know why pbs_mom 
lists bender.bender.clusters.umaine.edu as the destination file. I 
looked up where this particular error is printed in requests.c, but I'm 
not exactly sure why this $usecp directive isn't working correctly.
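
For reference, here is a minimal Python sketch of the mapping I would expect $usecp to perform, assuming it matches the destination host and then the leading part of the destination path before substituting the local directory. The function name, the rule tuples, and the "someuser/job.o30" tail are hypothetical and exist only for illustration; this is not the pbs_mom implementation.

```python
from fnmatch import fnmatch


def map_with_usecp(rules, destination):
    """Return the local path for a 'host:path' destination, or None if no rule applies.

    Each rule is (host_pattern, remote_prefix, local_prefix). The matching
    behaviour (host pattern plus path prefix) is an assumption about $usecp.
    """
    host, _, path = destination.partition(":")
    for host_pattern, remote_prefix, local_prefix in rules:
        if fnmatch(host, host_pattern) and path.startswith(remote_prefix):
            # Swap the matched remote prefix for the local directory.
            return local_prefix + path[len(remote_prefix):]
    return None


HOST = "bender.bender.clusters.umaine.edu"
AUTOMOUNT_HOME = "/private/var/automount/Network/Servers/calculon/Volumes/Home"

# Destination in the form pbs_mom was trying to use; the file tail is hypothetical.
destination = HOST + ":" + AUTOMOUNT_HOME + "/someuser/job.o30"

# Rule as written in my config: the prefix starts at /Network/...
rule_as_configured = [(HOST, "/Network/Servers/calculon/Volumes/Home", "/home")]

# Rule whose prefix starts at /private/var/automount/...
rule_full_prefix = [(HOST, AUTOMOUNT_HOME, "/home")]

print(map_with_usecp(rule_as_configured, destination))
print(map_with_usecp(rule_full_prefix, destination))
```

If the matching really works on path prefixes like this, a rule starting at /Network/... would not match a destination that starts at /private/var/automount/..., which may be worth ruling out.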


Glen


