[torqueusers] Problem with Torque with AMD Opteron and RHEL 3

Bas van der Vlies basv at sara.nl
Thu Dec 9 03:05:25 MST 2004


Dave Jackson wrote:
> Bas,
> 
>   This should be easy to patch but we have so far been unable to
> reproduce it in our lab with or without root squash.  If any site can
> reliably reproduce it and is able to work with us, we can most likely
> correct this today.
> 

Dave,

  It is easy for me to reproduce. Just chmod 700 my home directory.
  Or should I try the new p5 snapshot on one node?


  We have a timezone difference ;-)
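
  For anyone who wants to check a node by hand, here is a minimal sketch
  of the test Chris describes in the quoted message below: try to chdir
  into a home directory and report errno together with its text, which
  is what the pbs_mom log line currently leaves out. This is not
  pbs_mom's code; the function name check_chdir and the example path are
  made up for illustration. Run it as root on a compute node to mimic
  what pbs_mom does.

```python
import errno
import os


def check_chdir(path):
    """Try to chdir into path and report errno plus its text on failure.

    Returns (0, "") on success, or (errno, message) on failure.
    The original working directory is restored in both cases.
    """
    saved = os.getcwd()
    try:
        os.chdir(path)
    except OSError as exc:
        # exc.errno is the raw error number; os.strerror gives its text,
        # the modern equivalent of a sys_errlist lookup.
        return exc.errno, os.strerror(exc.errno)
    finally:
        os.chdir(saved)
    return 0, ""


if __name__ == "__main__":
    # Hypothetical path; substitute the home directory of the job owner.
    code, message = check_chdir("/home/someuser")
    if code == 0:
        print("chdir succeeded")
    else:
        print("chdir failed: errno=%d (%s)" % (code, message))
```

  With a mode 700 home directory on a root_squash export, I would expect
  root to get EACCES (permission denied) here, while the stale file
  handle case Chris mentions should show up as ESTALE, so the errno
  value alone would tell the two apart.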


> 
> On Wed, 2004-12-08 at 03:58, Bas van der Vlies wrote:
> 
>>We at SARA have the same problem. I have turned on root_squash. The 
>>problem disappeared when I made my home directory 755, but that is not a
>>real solution. We are using torque 1.1.0p4.
>>
>>		Regards
>>
>>Leandro Tavares Carneiro wrote:
>>
>>>Chris,
>>>
>>>I can see the home directory of all users, but I don't have it exported 
>>>with the no_root_squash parameter because we didn't need it before, and this 
>>>home area is served to the users by some NetApp filers.
>>>
>>>We have other clusters here with a much larger number of nodes and we have 
>>>never had this problem. The other clusters are Xeon and the OS is the old 
>>>Red Hat. This problem only happens with this Opteron/RHEL WS cluster.
>>>
>>>Thanks for your help,
>>>
>>>Regards,
>>>
>>>Leandro Tavares Carneiro
>>>Petrobras TI/TI-E&P/STEP Suporte Tecnico de E&P
>>>Av Chile, 65 sala 1501 EDISE - Rio de Janeiro / RJ
>>>Tel: (0xx21) 2534-1427
>>>
>>>
>>>Chris Samuel wrote:
>>>
>>>
>>>>On Tue, 7 Dec 2004 10:18 pm, Leandro Tavares Carneiro wrote:
>>>>
>>>>
>>>>
>>>>>       I have checked everything on my nodes and server, and everything
>>>>>is OK. All the nodes recognize the user ID I'm using and the home
>>>>>directory is mounted, but I still get this error.
>>>>>
>>>>>Dec  7 09:04:32 node002 pbs_mom: scan_for_exiting, cannot chdir to user
>>>>>home directory
>>>>
>>>>
>>>>
>>>>Are you exporting the users' home directories with no_root_squash from 
>>>>the NFS server?
>>>>
>>>>The easiest way to check that is to log in to node002 as root and then try 
>>>>to cd to the user's home directory - if you get a permission denied 
>>>>error, this is probably what's going on.
>>>>
>>>>A number of folks have reported this recently. It doesn't affect us 
>>>>here as we're exporting with no_root_squash (we have total control 
>>>>over all clients and server).
>>>>
>>>>The other time we've seen this is after an NFS server crash when the 
>>>>clients have stale NFS file handles, again trying the above should 
>>>>tell you.
>>>>
>>>>It would be very nice if the pbs_mom reported the value of errno and 
>>>>its sys_errlist equivalent. :-)
>>>>
>>>>cheers,
>>>>Chris
>>>>
>>>>
>>>>------------------------------------------------------------------------
>>>>
>>>>_______________________________________________
>>>>torqueusers mailing list
>>>>torqueusers at supercluster.org
>>>>http://supercluster.org/mailman/listinfo/torqueusers
>>>
>>
> 


-- 
********************************************************************
*                                                                  *
*  Bas van der Vlies                     e-mail: basv at sara.nl      *
*  SARA - Academic Computing Services    phone:  +31 20 592 8012   *
*  Kruislaan 415                         fax:    +31 20 6683167    *
*  1098 SJ Amsterdam                                               *
*                                                                  *
********************************************************************

