[torqueusers] qsub crashing with dependency list between 62 and 75 jobs long

Gabe Turner gabe at msi.umn.edu
Wed Nov 25 14:40:16 MST 2009


On Wed, Nov 25, 2009 at 12:17:24PM -0800, Chris Berthiaume wrote:
> Hello,
> 
> qsub will consistently crash when a job is submitted that has a job
> dependency list between 62 and 75 jobs long.  This could be caused by job
> count or by the size of the dependency string, I'm not sure.  All the job
> IDs are 5 digits long so the limit in terms of characters in the list
> would be 372 to 450 characters (5 digits per job ID plus one for every
> colon), not counting "afterany".  Fewer than 62 jobs work fine, and more
> than 75 jobs produce an "illegal -W value" from qsub. The error I'm
> seeing is

I just grepped through the 2.3.7 source and it looks like the limit is on
the length of the dependency string:

./include/cmds.h:#define PBS_DEPEND_LEN 2040
./cmds/qsub.c: pdepend = malloc(PBS_DEPEND_LEN);
./cmds/qsub.c: parse_depend_list(valuewd,pdepend,PBS_DEPEND_LEN))

I have no idea whether increasing PBS_DEPEND_LEN in include/cmds.h is safe
(a Torque developer will need to comment on that), but it's worth a shot.
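
As a rough sanity check on the numbers, here is a minimal C sketch of the
arithmetic. Only PBS_DEPEND_LEN and its value come from the grep output
above; depend_chars() and max_jobs_that_fit() are hypothetical helpers
written for illustration and are not part of the Torque source. The sketch
also assumes one byte of the buffer is reserved for the terminating NUL,
which has not been checked against parse_depend_list().

```c
#include <stddef.h>

/* Value quoted from include/cmds.h in the 2.3.7 grep output. */
#define PBS_DEPEND_LEN 2040

/*
 * Characters taken by njobs job IDs of id_len characters each, plus one
 * colon per ID. The "afterany" keyword is not counted, matching the
 * estimate in the original report.
 */
size_t depend_chars(size_t njobs, size_t id_len)
{
    return njobs * (id_len + 1);
}

/*
 * Largest number of IDs of id_len characters (each with its colon) that
 * fits in buf_len bytes, leaving one byte for the terminating NUL.
 * Hypothetical helper; not taken from the Torque source.
 */
size_t max_jobs_that_fit(size_t buf_len, size_t id_len)
{
    if (buf_len == 0)
        return 0;
    return (buf_len - 1) / (id_len + 1);
}
```

With 5-digit IDs, depend_chars(62, 5) is 372 and depend_chars(75, 5) is 450,
both far below 2040, and max_jobs_that_fit(PBS_DEPEND_LEN, 5) is 339. So the
string as typed on the command line should not hit the limit by itself. If
each ID were stored in a longer form, say 32 characters (a pure guess, for
example a fully qualified job ID), max_jobs_that_fit(PBS_DEPEND_LEN, 32)
would be 61, which would line up with the failures starting at 62 jobs.
Whether that is what actually happens is something the source or a Torque
developer would have to confirm.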

HTH,

Gabe

-- 
Gabe Turner                                             gabe at msi.umn.edu
HPC Systems Administrator,
University of Minnesota
Supercomputing Institute                          http://www.msi.umn.edu

