[torquedev] Problem with TM interface in Torque 2.1.0p0
Brock Palen
brockp at umich.edu
Fri May 19 15:21:48 MDT 2006
On May 19, 2006, at 5:07 PM, garrick at speculation.org wrote:
> On Fri, May 19, 2006 at 09:01:38AM -0400, Brock Palen alleged:
>>
>> I'm jumping into the middle of this. I represent the owners of the PPC
>> cluster that's having the problem with TM and torque-2.1.0p0. The
>> machine previously ran PBSPro, but we have been switching everything
>> (Linux, OSX) to Torque+Moab from PBSPro+Maui. Below is the
>> requested information.
>
> So you have some of your PPC nodes running Linux and some running OSX?
> The problem happens on Linux, OSX, or both? I have 0 experience with
> Linux on PPC, but I do have some OSX boxes to play with over here.
No, the PPC nodes are all OSX. We have Linux also, but that runs 2.0.0p8,
and we have not tested 2.1 on Linux.
>
>> aon:~ root# /home/software/torque-2.1.0p0/sbin/momctl -d 4 -h aon038
>>
>> Host: aon038.engin.umich.edu/aon038.engin.umich.edu Version: 2.1.0p0
>> job[407.aon.engin.umich.edu] state=RUNNING sidlist=24059
>> Assigned CPU Count: 2
>
> Job in running state, with 2 allocated CPUs. Good.
>
>
>> exec_host = aon038/1+aon038/0
>
> exec_host is set, that's good.
>
> So pbs_mom has the right info, but for some reason the nodelist isn't
> getting passed back to TM clients.
>
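For reference, the exec_host value above lists one host/slot pair per allocated CPU, joined with "+". A minimal Python sketch of how that string breaks down, which may help when comparing it against whatever node list a TM client ends up receiving. The function name parse_exec_host is hypothetical and not part of Torque; the only format assumed is the host/slot+host/slot form shown in the momctl output.

```python
from collections import Counter


def parse_exec_host(exec_host):
    """Split an exec_host string into (hostname, slot_index) pairs.

    Assumes the "host/slot+host/slot" form seen in the momctl output;
    this helper is illustrative only and is not a Torque API.
    """
    pairs = []
    for entry in exec_host.split("+"):
        # Split on the last "/" so the slot index is always the final field.
        host, _, slot = entry.strip().rpartition("/")
        if not host or not slot.isdigit():
            raise ValueError(f"unexpected exec_host entry: {entry!r}")
        pairs.append((host, int(slot)))
    return pairs


# Value taken from the momctl output quoted in this thread.
pairs = parse_exec_host("aon038/1+aon038/0")

# Count how many slots were allocated on each host.
slots_per_host = Counter(host for host, _ in pairs)

print(pairs)                 # [('aon038', 1), ('aon038', 0)]
print(dict(slots_per_host))  # {'aon038': 2}
```

Two entries on aon038 matches the "Assigned CPU Count: 2" line, so the allocation itself looks consistent on the pbs_mom side.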
> From reading the code, it still looks like that error message can only
> come from a failed calloc(), but that isn't a reasonable precondition
> given that you have no other complaints about your system.
>
> I'll poke at an OSX box here.
>