[torqueusers] pbs_mom dies on exit of interactive session
DuChene, StevenX A
stevenx.a.duchene at intel.com
Fri Apr 27 22:29:17 MDT 2012
I don't suppose you have any idea why I am having tm connect problems in general, do you?
Or any ideas about what I could look at?
--
Steven DuChene
From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Ken Nielson
Sent: Friday, April 27, 2012 9:23 PM
To: Torque Users Mailing List
Cc: Brady Kimball; David Hill; Ryan Chabot
Subject: Re: [torqueusers] pbs_mom dies on exit of interactive session
On Fri, Apr 27, 2012 at 9:21 PM, DuChene, StevenX A <stevenx.a.duchene at intel.com> wrote:
I am running torque-4.0.1 that I pulled from the svn 4.0.1 branch just today.
Earlier today I was running the 4.0-fixes tree from 04/03 and I had the same results.
I was hoping the update to current sources would fix these problems but no such luck.
If I run the following:
qsub -I -l nodes=7 -l arch=atomN570
from my pbs job submission host I get:
qsub: waiting for job 4.login2.sep.here to start
qsub: job 4.login2.sep.here ready
and then I get a shell prompt on node 0 of this job.
If I then do:
$ echo $PBS_NODEFILE
/var/spool/torque/aux//4.login2.sep.here
And then:
$ cat /var/spool/torque/aux//4.login2.sep.here
atom255
atom255
atom255
atom255
atom254
atom254
atom254
and then I try:
$ pbsdsh -h atom254 ls /tmp
pbsdsh: error from tm_poll() 17002
Alternatively, if I use the -v option, it says:
$ pbsdsh -v -h atom254 /bin/ls /tmp
pbsdsh: tm_init failed, rc = TM_ESYSTEM (17000)
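As a quick sanity check on the node file shown earlier, the slots per host can be tallied before trying pbsdsh. This is only a minimal sketch: the function name count_slots is made up for illustration, and it assumes the node file lists one hostname per allocated slot, as in the listing above (four lines for atom255, three for atom254). It does not diagnose the tm_init or tm_poll failures themselves.

```shell
#!/bin/sh
# Hypothetical helper (not part of Torque): tally slots per host in a node file.
# Usage: count_slots [nodefile]   (falls back to $PBS_NODEFILE if no argument)
count_slots() {
    nodefile="${1:-${PBS_NODEFILE:-}}"
    if [ -z "$nodefile" ] || [ ! -r "$nodefile" ]; then
        echo "count_slots: no readable node file" >&2
        return 1
    fi
    # One line per slot; uniq -c needs sorted input to group identical hosts.
    # awk swaps the columns so each output line reads "host count".
    sort "$nodefile" | uniq -c | awk '{ print $2, $1 }'
}
```

Run inside the interactive job as `count_slots` (or `count_slots /path/to/nodefile`), it prints one "host count" line per host, which makes it easy to confirm that a host passed to `pbsdsh -h` really is part of the allocation.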
Steve,
I am able to reproduce the SIGABRT on the MOM. We will get this fixed. Thanks for the help.
Ken