[torqueusers] file size limited 2GB

Ronny T. Lampert telecaadmin at uni.de
Sat Nov 5 03:45:13 MST 2005


>>I have a cluster using Redhat 7.3, OpenPBS-2.3.16 and maui-3.2.5p2.
>>Users can log in to a compute node and create files larger than 2GB.
>>But when a user submits a job to a compute node, the job cannot create files larger than 2GB.
> I don't think RH7.3 had very good large file support.  I recall having
> lots of problems with RH7.2, it was one of the main reasons I upgraded
> to RHEL3 (that, and the lack of security updates).  But I could be wrong
> since I never actually used RH7.3.

Basically

1) you have to check that your code is largefile-safe and doesn't wrongly
use int / long or somesuch for offsets, sizes, etc. - that's what off_t is for.

2) make sure you don't use ftell(), fseek() or similar.
Use ftello() and fseeko() instead (or lseek() and friends, of course).

3) you have to compile your code with

-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64

added to CFLAGS. That activates the largefile magic that replaces all
file access calls and off_t with their 64-bit counterparts.

4) add some assert()s to check whether the seeks etc. were successful.

RH7.3 definitely had problems with important programs like cp, ls and rm
not being compiled with LARGEFILE support; that means you won't be able to
interact with such big files properly. Their libc SHOULD be ready, though.
If you also run their distribution kernels, I am not sure whether the
supplied ext3/ext2 was ready for large files.

Secondly, and more on topic, TORQUE definitely has a problem with job logs
(.err and .out files) larger than 2GB.
That is because of the shell redirection used, and most distros don't compile
bash/tcsh/... with -D_LARGEFILE_SOURCE et al.
You can fix this yourself by taking an SRPM and altering the CFLAGS.


I think now we've got everything covered :)

Cheers,
Ronny


