[Mauiusers] Job Dependencies

Dave Jackson jacksond at supercluster.org
Thu Dec 2 12:08:41 MST 2004


  In TORQUE/OpenPBS, job dependencies are implemented as dynamically
allocated linked lists attached to each job, so there is no hard limit
on the number of jobs that can depend on a single job.  If you are
seeing hangs, you are most likely running into processing delays inside
pbs_server.  The actual failure may result from a poorly handled
timeout or a buffer overflow.  You can probably isolate this by running
pbs_server under gdb or valgrind.
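
  As a rough illustration of why a per-job linked list has no built-in
cap, here is a minimal Python sketch.  It is not the TORQUE/OpenPBS
source; the names Job, DependNode, add_dependent and count_dependents
are hypothetical and chosen only for this example.  It shows that each
dependency is allocated on demand, so 800 dependents are accepted as
readily as 8, while any walk over the list costs time proportional to
its length.

```python
# Hypothetical model of a dependency list attached to a job.
# This is an assumption-based sketch, not the real pbs_server data structure.

class DependNode:
    """One list entry: a job that is waiting on the owning job."""

    def __init__(self, dependent_job_id, kind, next_node=None):
        self.dependent_job_id = dependent_job_id
        self.kind = kind          # dependency type, e.g. "afterok"
        self.next = next_node     # next entry, or None at the end of the list


class Job:
    """A job carrying the head of its own dependency list."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.depend_head = None   # empty list until a dependent registers

    def add_dependent(self, dependent_job_id, kind="afterok"):
        # Allocate a node on demand and prepend it: constant time,
        # and no fixed-size table that could fill up.
        self.depend_head = DependNode(dependent_job_id, kind, self.depend_head)

    def count_dependents(self):
        # Walking the list is linear in the number of dependents.
        count = 0
        node = self.depend_head
        while node is not None:
            count += 1
            node = node.next
        return count


if __name__ == "__main__":
    target = Job("target-job")
    for i in range(800):
        target.add_dependent("dependent-%d" % i)
    print(target.count_dependents())
```

  In a structure like this the practical limit is memory and the time
spent processing each entry, which is consistent with delays, rather
than a fixed cap, being the likely cause of a hang.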

  See http://clusterresources.com/torquedocs/torquetrouble.shtml


On Wed, 2004-12-01 at 17:34, Richard Rowbatham wrote:
> Does anyone know of a hard limit on the number of jobs that can be
> dependent on a single job in PBS? (We are using OpenPBS 2.3.16.) I had
> a user who queued up about 800 jobs, all with an afterok dependency
> on the same running job. This totally hosed my server until I deleted
> the target of the dependency from the server_priv/jobs directory and
> restarted. It would be nice to know what the limit is so as to avoid
> this in the future.
> Ricky Rowbatham
> Fairfield Industries
> Technology Group
> (281) 275-7547
> (281) 275-7660 Fax.