[torqueusers] Torque/Maui kills jobs running on the same node

Jerry Smith jdsmit at sandia.gov
Fri Feb 19 08:49:33 MST 2010


Evgeni,

Are you doing any process cleanup in the epilogue?  If so, you may be 
killing all of that user's processes on the node, and with them the user's 
other jobs, when the first job exits.
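
To illustrate the difference, here is a minimal sketch in Python. It is 
not taken from your setup: the process table, the user name, and both 
helper functions are hypothetical. It assumes the cleanup can see each 
process's owner and session ID, and that the epilogue knows the finished 
job's session ID (Torque passes the job's user and session ID to the 
epilogue as arguments; check the documentation for your version for the 
exact positions).

```python
from typing import List, NamedTuple


class Proc(NamedTuple):
    pid: int
    user: str
    sid: int  # session ID the process belongs to


def pids_by_user(procs: List[Proc], user: str) -> List[int]:
    """Over-broad cleanup: selects every process the user owns on the node."""
    return [p.pid for p in procs if p.user == user]


def pids_by_session(procs: List[Proc], user: str, job_sid: int) -> List[int]:
    """Narrower cleanup: selects only processes in the finished job's session."""
    return [p.pid for p in procs if p.user == user and p.sid == job_sid]


# Hypothetical node state: two jobs from the same user, one from another user.
procs = [
    Proc(1001, "alice", 1000),  # job A, which has just finished
    Proc(1002, "alice", 1000),  # job A
    Proc(2001, "alice", 2000),  # job B, still running
    Proc(3001, "bob", 3000),    # another user's job
]

too_broad = pids_by_user(procs, "alice")          # includes job B's process
job_only = pids_by_session(procs, "alice", 1000)  # job A's processes only
```

With a per-user cleanup, job B's process ends up in the kill list even 
though only job A has finished, which would match what you are seeing. 
Jobs from other users are untouched in both cases.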

--Jerry


Evgeni Bezus wrote:
> Hi all,
>
> We are running Maui and Torque on a 14-node cluster. Each node has 8 cores
> (two 4-core processors). When two (or more) jobs from a single user run
> on the same node, Maui (or Torque?) stops all of the jobs as soon as one
> of them finishes. The finished job has Exit_status=0; the killed jobs
> have Exit_status=271. The value of the NODEACCESSPOLICY parameter in
> maui.cfg is SHARED. This problem does not occur when running jobs from
> a single user on different nodes or when running jobs from different
> users on the same node.
>
> Does anyone know how to resolve the problem?


