[torqueusers] UC Torque deletes all jobs of user on same node
JMRUSHTON at qinetiq.com
Tue Nov 22 03:49:48 MST 2011
Have a look to see if you are running epilogue scripts that clean up
/dev/shm for you. We had exactly the same issue: the script naively
assumes that a user has only one job on the node and removes all the
files belonging to that user, which kills any other jobs the user has
there. When running on just one system the files are not created. We
have had to set the Moab configuration to include
"NODEACCESSPOLICY UNIQUEUSER"; I don't know if Maui has an equivalent.
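
As a rough illustration of the safer behaviour, here is a minimal Python
sketch of the check an epilogue could make before touching /dev/shm. It
is not the script from our site: the function name removable_shm_files
and the user_has_other_jobs flag are hypothetical, and how that flag is
worked out on a given cluster is an assumption left to the caller.

```python
import os
import stat


def removable_shm_files(shm_dir, uid, user_has_other_jobs):
    """Return the regular files in shm_dir owned by uid that may be deleted.

    Assumption: the caller already knows whether the user still has other
    jobs on this node. If so, nothing is returned, because those jobs may
    still be using the files.
    """
    if user_has_other_jobs:
        return []
    candidates = []
    for entry in os.scandir(shm_dir):
        info = entry.stat(follow_symlinks=False)
        # Only plain files owned by the job's user are candidates.
        if stat.S_ISREG(info.st_mode) and info.st_uid == uid:
            candidates.append(entry.path)
    return sorted(candidates)
```

The naive cleanup described above behaves as if user_has_other_jobs were
always False, which is why the user's remaining jobs on the node lose
their files.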
HPC System Manager, Weapons Technologies
Tel: 01959 514777, Mobile: 07939 219057
email: jmrushton at QinetiQ.com
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of Robert Jacobi
Sent: 20 November 2011 01:08
To: torqueusers at supercluster.org
Subject: [torqueusers] Torque deletes all jobs of user on same node
We've recently run into a curious problem with Torque. When a user
deletes one of his jobs using "qdel jobid", and the job being deleted
spans more than one processor on the node, then all other jobs of the
same user on the same node are canceled as well. If the deleted job only
runs on one processor, then the other jobs of the user on the node are
not affected and keep running.
Thus it seems to me that whenever the pbs_mom on the node has to delete
a job from more than one processor, it somehow indiscriminately tries to
delete jobs from all processors, and other users' jobs may be unaffected
only because it lacks privileges over other users' processes.
At this point I've no clue how to further diagnose or solve this issue.
I've tried to google this problem but couldn't find anything, so I hope
you have an idea.
University of Arizona
Department of Aerospace & Mechanical Engineering 1130 N. Mountain Ave.
Tucson, AZ, 85721-0119
tel: +1 (520) 621 4369
mail: rjacobi at email.arizona.edu
The less time you spent on algebra in life, the more time you have to be
a happy person. (Kerschen)
Doubt is not a pleasant condition, but certainty is absurd. (Voltaire)
All great truths begin as blasphemies. (Shaw)
Denken ist etwas, das auf Schwierigkeiten folgt und dem das Handeln