[torquedev] Re: 2.3.6 release?
Joshua Bernstein
jbernstein at penguincomputing.com
Tue Dec 16 12:08:30 MST 2008
Josh Butikofer wrote:
> No, it will not be based on the 2.4.0 snapshots ... it will be based on
> the 2.3.6 snapshots. :)
Hmmm. I guess I just grabbed the latest snapshot and went from there. I
can have a look at the 2.3.6 snap and generate a diff from there.
> However, I know the TORQUE developers are definitely interested in
> putting in your patch for the pbs_mom in 2.4. Are you sure that the
> pbs_mom's in 2.3.x are also not affected by the segfault you found?
Yes, pbs_mom is affected in the 2.3.x branch as well as the 2.1 branch,
though I've only directly observed the failures in versions 2.3.3, 2.3.5,
and 2.1.9.
-Joshua Bernstein
Software Engineer
Penguin Computing
> Josh Butikofer
> Cluster Resources, Inc.
> #############################
>
>
> Joshua Bernstein wrote:
>> Is the 2.3.6 release based on the 2.4.0 snapshots?
>>
>> If so I would like to see a fix go in for the pbs_mom segfault I
>> mentioned here:
>>
>> http://www.clusterresources.com/pipermail/torqueusers/2008-December/008411.html
>>
>>
>> I can provide a patch and an explanation shortly.
>>
>> -Joshua Bernstein
>> Software Engineer
>> Penguin Computing
>>
>> Josh Butikofer wrote:
>>> Agreed. I will start the process so we can release soon. Does anyone
>>> on the list have any objections to releasing 2.3.6? Is there anything
>>> that needs to be put into TORQUE before this release?
>>>
>>> Josh Butikofer
>>> Cluster Resources, Inc.
>>> #############################
>>>
>>>
>>> Glen Beane wrote:
>>>> I think we should get 2.3.6 released. As it is right now, pbs_sched
>>>> cannot read its config file properly because the hard tabs in strtok
>>>> delimiters were replaced by spaces when astyle was run (these have
>>>> been fixed so they are \t in 2.3.6).
>>>>
>>>>
>>>>
>>>> 2.3.6
>>>>   e - in Linux, a pbs_mom will now "kill" a job's task, even if that task
>>>>       can no longer be found in the OS processor table. This prevents jobs
>>>>       from getting "stuck" when the PID vanishes in some rare cases.
>>>>   e - forward-ported change from 2.1-fixes (r2581) (b - reissue job obit
>>>>       even if no processes are found)
>>>>   b - change back to not sending status updates until we get cluster addr
>>>>       message from server, also only try to send hello when the server
>>>>       stream is down.
>>>>   b - change pbs_server so log_file_max_size of zero behavior matches
>>>>       documentation
>>>>   e - added periodic logging of version and loglevel to help in support
>>>>   e - added pbs_mom config option ignvmem to ignore vmem/pvmem limit
>>>>       enforcement
>>>>   b - change to correct strtoks that accidentally got changed in astyle
>>>>       formatting
>>> _______________________________________________
>>> torquedev mailing list
>>> torquedev at supercluster.org
>>> http://www.supercluster.org/mailman/listinfo/torquedev