[torqueusers] Problem with Torque with AMD Opteron and RHEL 3
Bas van der Vlies
basv at sara.nl
Thu Dec 9 03:05:25 MST 2004
Dave Jackson wrote:
> Bas,
>
> This should be easy to patch but we have so far been unable to
> reproduce it in our lab with or without root squash. If any site can
> reliably reproduce it and is able to work with us, we can most likely
> correct this today.
>
Dave,
It is easy for me to reproduce: I just chmod 700 my home directory.
Or should I try the new p5 snapshot on one node?
We have a time zone difference ;-)
>
> On Wed, 2004-12-08 at 03:58, Bas van der Vlies wrote:
>
>>We at SARA have the same problem. I have turned on root_squash. The
>>problem disappeared if I made my home directory 755, but that is not a
>>real solution. We are using torque 1.1.0p4.
>>
>> Regards
>>
>>Leandro Tavares Carneiro wrote:
>>
>>>Chris,
>>>
>>>I can see the home directory of all users, but I don't have it exported
>>>with the no_root_squash parameter because we did not need it before, and
>>>this home area is served to the users by some NetApp filers.
>>>
>>>We have other clusters here with many more nodes and we have never had
>>>this problem. The other clusters are Xeon and the OS is the old RedHat.
>>>This problem only happens with this Opteron/RHEL WS cluster.
>>>
>>>Thanks for your help,
>>>
>>>Regards,
>>>
>>>Leandro Tavares Carneiro
>>>Petrobras TI/TI-E&P/STEP Suporte Tecnico de E&P
>>>Av Chile, 65 sala 1501 EDISE - Rio de Janeiro / RJ
>>>Tel: (0xx21) 2534-1427
>>>
>>>
>>>Chris Samuel wrote:
>>>
>>>
>>>>On Tue, 7 Dec 2004 10:18 pm, Leandro Tavares Carneiro wrote:
>>>>
>>>>
>>>>
>>>>> I have checked everything on my nodes and server and everything is
>>>>>OK. All the nodes can recognize the user id I'm using and the home
>>>>>directory is mounted, but I still get this error.
>>>>>
>>>>>Dec 7 09:04:32 node002 pbs_mom: scan_for_exiting, cannot chdir to user
>>>>>home directory
>>>>
>>>>
>>>>
>>>>Are you exporting the users home directories with no_root_squash from
>>>>the NFS server ?
>>>>
>>>>Easiest way to check that is to login to node002 as root and then try
>>>>and cd to the users home directory - if you get a permission denied
>>>>error this is probably what's going on.
>>>>
>>>>A number of folks have reported this recently, it doesn't affect us
>>>>here as we're exporting with no_root_squash (we have total control
>>>>over all clients and server).
>>>>
>>>>The other time we've seen this is after an NFS server crash when the
>>>>clients have stale NFS file handles, again trying the above should
>>>>tell you.
>>>>
>>>>It would be very nice if the pbs_mom reported the value of errno and
>>>>its sys_errlist equivalent. :-)
>>>>
>>>>cheers,
>>>>Chris
>>>>
>>>>
>>>>------------------------------------------------------------------------
>>>>
>>>>_______________________________________________
>>>>torqueusers mailing list
>>>>torqueusers at supercluster.org
>>>>http://supercluster.org/mailman/listinfo/torqueusers
>>>
>>
>
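On Chris's remark above that it would help if pbs_mom reported errno and
its text equivalent: here is a minimal sketch of what such a report could
look like. This is not taken from the TORQUE source; the helper name
chdir_verbose is hypothetical, and strerror() is used as the portable
replacement for indexing sys_errlist directly.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, NOT the actual pbs_mom code: try to chdir into
 * dir and, on failure, print errno together with its message text.
 * Returns 0 on success, or the saved errno value on failure. */
int chdir_verbose(const char *dir)
{
    int saved;

    if (chdir(dir) == 0)
        return 0;

    saved = errno;  /* save it before another call can overwrite it */
    fprintf(stderr, "cannot chdir to %s: errno=%d (%s)\n",
            dir, saved, strerror(saved));
    return saved;
}
```

With a message like this, a squashed root hitting a mode 700 home
directory would be expected to show EACCES (permission denied), while the
stale file handle case Chris mentions would be expected to show ESTALE, so
the two causes could be told apart from the log alone.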
--
********************************************************************
* *
* Bas van der Vlies e-mail: basv at sara.nl *
* SARA - Academic Computing Services phone: +31 20 592 8012 *
* Kruislaan 415 fax: +31 20 6683167 *
* 1098 SJ Amsterdam *
* *
********************************************************************