[torquedev] Problem with TM interface in Torque 2.1.0p0

Brock Palen brockp at umich.edu
Fri May 19 15:21:48 MDT 2006


On May 19, 2006, at 5:07 PM, garrick at speculation.org wrote:

> On Fri, May 19, 2006 at 09:01:38AM -0400, Brock Palen alleged:
>>
>> I'm jumping into the middle of this. I represent the owners of the PPC
>> cluster that's having the problem with TM and torque-2.1.0p0. The
>> machine previously ran PBSPro, but we have been switching everything
>> (Linux, OSX) from PBSPro+Maui to Torque+Moab. Below is the
>> requested information.
>
> So you have some of your PPC nodes running Linux and some running OSX?
> The problem happens on Linux, OSX, or both?  I have 0 experience with
> Linux on PPC, but I do have some OSX boxes to play with over here.

No, the PPC nodes all run OSX. We have Linux also, but that has 2.0.0p8,
and we have not tested 2.1 with Linux.


>
>> aon:~ root# /home/software/torque-2.1.0p0/sbin/momctl -d 4 -h aon038
>>
>> Host: aon038.engin.umich.edu/aon038.engin.umich.edu   Version: 2.1.0p0
>> job[407.aon.engin.umich.edu]  state=RUNNING  sidlist=24059
>> Assigned CPU Count:     2
>
> Job in running state, with 2 allocated CPUs.  Good..
>
>
>>     exec_host = aon038/1+aon038/0
>
> exec_host is set, that's good.
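
For anyone following along, here is a rough sketch of how the exec_host
string above can be checked against the assigned CPU count. It assumes
the format is exactly what momctl printed (entries of the form
node/index joined by "+"); the function names parse_exec_host and
cpus_per_node are made up for this example and are not part of Torque.

```python
from collections import Counter


def parse_exec_host(exec_host):
    """Split a 'node/index+node/index' string into (node, index) tuples.

    Assumes every entry contains exactly one '/' followed by an integer,
    as in the momctl output quoted above.
    """
    slots = []
    for entry in exec_host.split("+"):
        node, _, index = entry.partition("/")
        slots.append((node, int(index)))
    return slots


def cpus_per_node(exec_host):
    """Count how many slots each node contributes (hypothetical helper)."""
    return Counter(node for node, _ in parse_exec_host(exec_host))


if __name__ == "__main__":
    value = "aon038/1+aon038/0"
    print(parse_exec_host(value))
    print(dict(cpus_per_node(value)))
```

For the job shown here this gives two slots, both on aon038, which
agrees with the "Assigned CPU Count: 2" line.
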
>
> So pbs_mom has the right info, but for some reason the nodelist isn't
> getting passed back to TM clients.
>
> From reading the code, it still looks like that error message can only
> come from a failed calloc(), but that isn't a reasonable precondition
> given that you have no other complaints about your system.
>
> I'll poke at an OSX box here.