[torqueusers] pbs_mom dies on exit of interactive session

Ken Nielson knielson at adaptivecomputing.com
Fri Apr 27 22:23:09 MDT 2012


On Fri, Apr 27, 2012 at 9:21 PM, DuChene, StevenX A <
stevenx.a.duchene at intel.com> wrote:

> I am running torque-4.0.1 that I pulled from the svn 4.0.1 branch just
> today.
>
> Earlier today I was running the 4.0-fixes tree from 04/03 and I had the
> same results.
>
> I was hoping the update to current sources would fix these problems but no
> such luck.
>
> If I run the following:
>
> qsub -I -l nodes=7 -l arch=atomN570
>
> from my pbs job submission host I get:
>
> qsub: waiting for job 4.login2.sep.here to start
> qsub: job 4.login2.sep.here ready
>
> and then I get a shell prompt on node 0 of this job.
>
> If I then do:
>
> $ echo $PBS_NODEFILE
> /var/spool/torque/aux//4.login2.sep.here
>
> And then:
>
> $ cat /var/spool/torque/aux//4.login2.sep.here
> atom255
> atom255
> atom255
> atom255
> atom254
> atom254
> atom254
>
> and then I try:
>
> $ pbsdsh -h atom254 ls /tmp
> pbsdsh: error from tm_poll() 17002
>
> Alternatively if I use the -v option it says:
>
> $ pbsdsh -v -h atom254 /bin/ls /tmp
> pbsdsh: tm_init failed, rc = TM_ESYSTEM (17000)

Steve,

I am able to reproduce the SIGABRT on the MOM. We will get this fixed.
Thanks for the help.

Ken