[torqueusers] PBS_NODEFILE munged during file staging?

Garrick Staples garrick at usc.edu
Fri Jan 6 13:07:43 MST 2006


On Thu, Jan 05, 2006 at 05:05:37PM -0800, Garrick Staples alleged:
> I've not seen this before and I can't reproduce it over here.  I know
> that part of the code pretty well and I can't think of any connections
> between PBS_NODEFILE and stagein.
> 
> MOM makes a nodefile if the neednodes resource is set on the job.  Does
> 'qstat -f' show "Resource_List.neednodes" for the job?  Be sure to run
> qstat as a server manager (pbs_server doesn't let non-managers see
> neednodes.)
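
If it helps to check many jobs at once, the neednodes lookup can be
scripted.  The following is only a rough sketch: the function name and the
sample job text are made up for illustration, and it assumes 'qstat -f'
prints attributes as "name = value" and wraps long values onto
continuation lines that start with a tab.

```python
import re


def find_neednodes(qstat_output):
    """Return the Resource_List.neednodes value from `qstat -f` text, or None.

    Hypothetical helper, not part of TORQUE.
    """
    # Assumption: wrapped attribute values continue on lines starting with a tab.
    unwrapped = re.sub(r"\n\t", "", qstat_output)
    match = re.search(
        r"^[ \t]*Resource_List\.neednodes[ \t]*=[ \t]*(.+)$",
        unwrapped,
        re.MULTILINE,
    )
    return match.group(1).strip() if match else None


# Made-up sample of `qstat -f` output, for illustration only.
sample = """Job Id: 1234.headnode.example.org
    Job_Name = stagein_test
    Resource_List.neednodes = 2:ppn=2
    Resource_List.nodes = 2:ppn=2
"""

print(find_neednodes(sample))
```

A result of None for a job that requested nodes would suggest that
neednodes was never set or was cleared along the way.  Remember that the
text has to come from qstat run as a server manager, otherwise neednodes
is hidden and the check always comes back empty.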

Which scheduler are you using?  I wonder if for some odd reason the
scheduler is clearing neednodes when stagein is used.

(Yes, I'm still struggling to find some sort of connection)

 
> A bug in wordexp would be the prime suspect here.  Does it happen when
> you rebuild without wordexp?

And of course, if it works without wordexp, I'd ask you to try again
with a p5 snapshot.  There haven't been any changes to neednodes,
nodefile, or a job's env, but the wordexp code got a facelift to better
support globbing.

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California