Bug 99 - qsub crashes with -W option and specific number of chars
Status: RESOLVED FIXED
Product: TORQUE
Component: clients
Version: 2.5.x
Platform: PC Linux
Importance: P5 major
Assigned To: Glen
Reported: 2010-11-12 07:30 MST by Maarten van Ingen
Modified: 2010-11-12 10:19 MST

Description Maarten van Ingen 2010-11-12 07:30:20 MST
Hi,

When I run qsub with the -W option, it crashes with the following error:

*** glibc detected *** qsub: double free or corruption (!prev): 0x000000000b9cd950 ***
======= Backtrace: =========
/lib64/libc.so.6[0x347ee7230f]
/lib64/libc.so.6(cfree+0x4b)[0x347ee7276b]
qsub[0x402a12]
qsub[0x402da4]
qsub[0x404670]
qsub[0x406a2c]
qsub[0x40702a]
qsub[0x4076ee]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x347ee1d994]
qsub[0x402559]
======= Memory map: ========
00400000-0040b000 r-xp 00000000 08:02 3228554 /usr/bin/qsub
0060a000-0060c000 rw-p 0000a000 08:02 3228554 /usr/bin/qsub
0060c000-0060e000 rw-p 0060c000 00:00 0
0b9cd000-0b9ee000 rw-p 0b9cd000 00:00 0 [heap]
347ea00000-347ea1c000 r-xp 00000000 08:02 6537318 /lib64/ld-2.5.so
347ec1b000-347ec1c000 r--p 0001b000 08:02 6537318 /lib64/ld-2.5.so
347ec1c000-347ec1d000 rw-p 0001c000 08:02 6537318 /lib64/ld-2.5.so
347ee00000-347ef4e000 r-xp 00000000 08:02 6537331 /lib64/libc-2.5.so
347ef4e000-347f14d000 ---p 0014e000 08:02 6537331 /lib64/libc-2.5.so
347f14d000-347f151000 r--p 0014d000 08:02 6537331 /lib64/libc-2.5.so
347f151000-347f152000 rw-p 00151000 08:02 6537331 /lib64/libc-2.5.so
347f152000-347f157000 rw-p 347f152000 00:00 0
3480600000-348060d000 r-xp 00000000 08:02 6537355 /lib64/libgcc_s-4.1.2-20080825.so.1
348060d000-348080d000 ---p 0000d000 08:02 6537355 /lib64/libgcc_s-4.1.2-20080825.so.1
348080d000-348080e000 rw-p 0000d000 08:02 6537355 /lib64/libgcc_s-4.1.2-20080825.so.1
2af4638d2000-2af4638d4000 rw-p 2af4638d2000 00:00 0
2af4638d4000-2af4638fe000 r-xp 00000000 08:02 3244368 /usr/lib/libtorque.so.2.0.0
2af4638fe000-2af463afe000 ---p 0002a000 08:02 3244368 /usr/lib/libtorque.so.2.0.0
2af463afe000-2af463b00000 rw-p 0002a000 08:02 3244368 /usr/lib/libtorque.so.2.0.0
2af463b00000-2af463be5000 rw-p 2af463b00000 00:00 0
2af463bfb000-2af463bfc000 rw-p 2af463bfb000 00:00 0
7fff42ff8000-7fff43021000 rw-p 7ffffffd5000 00:00 0 [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]
Aborted


This happens if and only if I use the following in the script (or as -W on the
command line):

#PBS -W stagein=CREAM798436615_jobWrapper.sh@gb-ce-ams.els.sara.nl:/opt/glite/var/cream_sandbox/ops/_O_dutchgrid_O_users_O_sara_CN_Ernst_Pijper_ops_Role_lcgadmin_Capability_NULL_ops001/79/CREAM798436615/CREAM798436615_jobWrapper.sh,stagein=cream_798436615.proxy@gb-ce-ams.els.sara.nl:/opt/glite/var/cream_sandbox/ops/_O_dutchgrid_O_users_O_sara_CN_Ernst_Pijper_ops_Role_lcgadmin_Capability_NULL_ops001/proxy/12881031212E589696sam2Dwms2Egrid2Esara2Enl13921030130163

When I remove one char or add at least 2 chars at the end, all goes well.

We have had a look and it seems that the source of this bug is the following in
qsub.c (2.5.3 version)

In the function smart_strtok (line 375 in qsub.c)
tmpLineSize = (line == NULL) ? strlen(*ptrPtr+ 1) : strlen(line) + 1;
It seems this needs to be:
tmpLineSize = (line == NULL) ? strlen(*ptrPtr)+ 1 : strlen(line) + 1;
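
To illustrate the difference between the two expressions, here is a minimal standalone sketch. It is not the qsub.c code: the helper names buggy_size, fixed_size and shortfall are hypothetical, and a plain string argument stands in for *ptrPtr. It only shows that strlen(s + 1) measures the string starting one character later, so the result is two bytes smaller than the strlen(s) + 1 needed to hold a copy including the terminating NUL.

```c
#include <stddef.h>
#include <string.h>

/* Size as computed by the original expression, strlen(*ptrPtr + 1).
 * The "+ 1" is applied to the pointer, so the first character is skipped
 * and the result is strlen(s) - 1.
 * Assumption for this sketch: s is a non-empty string. */
static size_t buggy_size(const char *s)
{
    return strlen(s + 1);
}

/* Size as computed by the proposed fix, strlen(*ptrPtr) + 1:
 * the full length plus one byte for the terminating NUL. */
static size_t fixed_size(const char *s)
{
    return strlen(s) + 1;
}

/* Number of bytes by which a buffer sized with buggy_size(s)
 * is too small to hold a complete copy of s. */
static size_t shortfall(const char *s)
{
    return fixed_size(s) - buggy_size(s);
}
```

For the string "abc", buggy_size returns 2 and fixed_size returns 4, so a copy into a buffer of the buggy size would write two bytes past its end. Whether such a small overrun actually damages the heap probably depends on how much slack the allocator leaves after rounding the request, which would be consistent with the crash only showing up at particular option lengths.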


We have seen this error in 2.4.10 and 2.5.3.
2.3.7 works fine, although we have not looked in the sources.

Comment 1 Ken Nielson 2010-11-12 10:19:40 MST
I was able to duplicate this bug and then verified the fix. It has been checked
into 2.4-fixes, 2.5-fixes, 3.0 and trunk.