[torqueusers] what causes many defunct pbs_mom processes

Jeffrey Lang jrlang at uwyo.edu
Thu Dec 19 10:29:39 MST 2013


Roger

   This was a known bug in 4.2.3.1, and possibly in other older versions 
of Torque. We had this problem, upgraded to 4.2.6 (the latest Torque), 
and the problem seems to have been fixed.
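
To check how widespread the problem is on a node before and after an 
upgrade, one option is to count the zombie entries in "ps -ef" output. 
The following is only a minimal Python sketch and is not part of Torque. 
The function names count_defunct and count_defunct_on_this_host are made 
up for illustration, and the match assumes the "[pbs_mom] <defunct>" 
format shown in the ps line quoted below.

```python
import subprocess


def count_defunct(ps_output, name="pbs_mom"):
    """Count lines of `ps -ef` text that show a defunct (zombie) `name` process.

    Assumes the Linux ps convention of printing zombies as "[name] <defunct>".
    """
    marker = f"[{name}] <defunct>"
    return sum(1 for line in ps_output.splitlines() if marker in line)


def count_defunct_on_this_host(name="pbs_mom"):
    """Run `ps -ef` locally and count defunct processes called `name`."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return count_defunct(result.stdout, name)
```

Calling count_defunct_on_this_host() on a compute node returns the number 
of defunct pbs_mom entries at that moment, so a count that keeps growing 
as jobs start would match the behaviour described in the original report.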


On 12/19/2013 10:24 AM, Moye,Roger V wrote:
>
> Suddenly this week we have had a storm of problems with defunct 
> pbs_mom processes as shown here:
>
> root      6811  4589  0 11:19 ? 00:00:00 [pbs_mom] <defunct>
>
> The node this was taken from has been up only 45 minutes, so the 
> problem appeared almost immediately after new jobs started running 
> on it. At present it has 70 of these defunct processes. I am seeing 
> this on multiple nodes.
>
> We are using Torque version 4.2.3.1 with Maui 3.3.1 on RHEL 6.4.
>
> Does anyone know what causes these to occur?
>
> Many thanks!
>
> -Roger
>
> -----------------------------------------------------------
>
> Roger V. Moye
>
> Systems Analyst III
>
> XSEDE Campus Champion
>
> University of Texas - MD Anderson Cancer Center
>
> Division of Quantitative Sciences
>
> Pickens Academic Tower - FCT4.6109
>
> Houston, Texas
>
> (713) 792-2134
>
> -----------------------------------------------------------
>
>
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
