[torqueusers] Handling of double-fork-and-kill detached processes
David Singleton
David.Singleton at anu.edu.au
Thu Mar 3 13:51:02 MST 2005
Ian Stokes-Rees wrote:
> How does Torque deal with processes which detach from their parent
> process via the common "double fork and kill" technique? I'm just
> wondering if it is possible for users to start a process which then
> sticks around even when the original process group has been killed. We
> seem to be having this problem with a current cluster and are wondering
> if Torque does anything "auto-magically" to catch these processes and
> kill them.
>
PBS, and I presume Torque, uses process session ids to identify the
processes in a job (strictly, to identify the individual tasks of a
job). A simple "fork, fork and exit" daemonizing code like the
following, which does not change its session id, is OK:
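    #include <stdlib.h>
    #include <unistd.h>

    int main(void) {
        if ( !fork() ) {                /* first child */
            if ( fork() ) exit(0);      /* first child exits, orphaning the grandchild */
            else sleep(60);             /* grandchild: reparented to init, same session */
        } else {
            sleep(60);                  /* parent never waits, so the child shows as defunct */
        }
        return 0;
    }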
    > ps -fj
    UID       PID  PPID  PGID   SID  C STIME TTY      TIME     CMD
    dbs900   1744  1742  1744  1744  0 07:28 pts/6    00:00:00 -tcsh
    dbs900   1765  1744  1765  1744  0 07:28 pts/6    00:00:00 ./a.out
    dbs900   1766  1765  1765  1744  0 07:28 pts/6    00:00:00 [a.out <defunct>]
    dbs900   1767     1  1765  1744  0 07:28 pts/6    00:00:00 ./a.out
All of these processes keep the shell's session id (SID 1744 above), so
in a job they are all cleaned up when the job is killed or exits.
But processes that simply call setsid(), and all of their children, do
escape from a PBS job.
Two things:
1. PBS loses the cputime used by daemonized processes when they
   exit, because there is no parent in the job to inherit that
   usage.
2. Covering jobs that call setsid() requires an alternative job
   identifier such as PAGG (http://oss.sgi.com/projects/pagg/)
   or cpusets, or just hacking Linux to add a "sid" that users
   can't change.
David
--
--------------------------------------------------------------------------
                                    ANU Supercomputer Facility
   David.Singleton at anu.edu.au    and APAC National Facility
   Phone: +61 2 6125 4389           Leonard Huxley Bldg (No. 56)
   Fax:   +61 2 6125 8199           Australian National University
                                    Canberra, ACT, 0200, Australia
--------------------------------------------------------------------------