[torqueusers] qsub crashing with dependency list between 62 and 75 jobs long

Glen Beane glen.beane at gmail.com
Wed Nov 25 14:48:41 MST 2009


On Wed, Nov 25, 2009 at 4:40 PM, Gabe Turner <gabe at msi.umn.edu> wrote:
> On Wed, Nov 25, 2009 at 12:17:24PM -0800, Chris Berthiaume wrote:
>> Hello,
>>
>> qsub will consistently crash when a job is submitted that has a job
>> dependency list between 62 and 75 jobs long.  This could be caused by job
>> count or by the size of the dependency string, I'm not sure.  All the job
>> IDs are 5 digits long so the limit in terms of characters in the list
>> would be 372 to 450 characters (5 digits per job ID plus one for every
>> colon), not counting "afterany".  Fewer than 62 jobs work fine, and more
>> than 75 jobs produce an "illegal -W value" from qsub. The error I'm
>> seeing is
>
> I just grepped through the 2.3.7 source and it looks like the limit is on
> the length of the dependency string:
>
> ./include/cmds.h:#define PBS_DEPEND_LEN 2040
> ./cmds/qsub.c:              pdepend = malloc(PBS_DEPEND_LEN);
> ./cmds/qsub.c:                  parse_depend_list(valuewd,pdepend,PBS_DEPEND_LEN))
>
> I have no idea if increasing PBS_DEPEND_LEN in cmds.h will be safe (a
> Torque developer will need to comment on that), but it's worth a shot.


This has already been increased to 65528 in what will be released as 2.3.8.
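
For a rough sense of scale, here is a minimal sketch of the arithmetic from
the original report, set against the two buffer sizes mentioned above. It
assumes the buffer holds the raw dependency string exactly as typed
("afterany" followed by one colon plus the digits for each job); the actual
code in qsub may store a longer, expanded form, so treat the results as an
upper bound. The function names and the two *_OLD/*_NEW macro names are made
up for illustration and are not part of Torque.

```c
#include <stddef.h>

/* Buffer sizes quoted in this thread (macro names are hypothetical). */
#define PBS_DEPEND_LEN_OLD 2040   /* 2.3.7, from include/cmds.h */
#define PBS_DEPEND_LEN_NEW 65528  /* value reported for 2.3.8 */

/*
 * Length of the ":id:id:..." part of a dependency list, i.e. one colon
 * plus id_digits characters per job, not counting the "afterany" prefix.
 */
size_t depend_list_len(size_t njobs, size_t id_digits)
{
    return njobs * (id_digits + 1);
}

/*
 * Hypothetical helper: how many job IDs of id_digits digits fit in a
 * buffer of buf_len bytes, after reserving room for a prefix such as
 * "afterany" (prefix_len characters) and the terminating NUL.
 * Assumes the raw string is stored unmodified, which may not be true.
 */
size_t max_jobs_in_buffer(size_t buf_len, size_t prefix_len, size_t id_digits)
{
    size_t overhead = prefix_len + 1;  /* prefix plus NUL terminator */

    if (buf_len <= overhead)
        return 0;
    return (buf_len - overhead) / (id_digits + 1);
}
```

Under that assumption, depend_list_len(62, 5) and depend_list_len(75, 5) give
the 372 and 450 characters from the report, while
max_jobs_in_buffer(PBS_DEPEND_LEN_OLD, 8, 5) comes out at 338 jobs and
max_jobs_in_buffer(PBS_DEPEND_LEN_NEW, 8, 5) at 10919. Since 338 is well
above the 62 to 75 jobs reported, the raw string length alone may not be the
whole story.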

Speaking of 2.3.8, I think it has enough interesting fixes to warrant
a release.

