[torqueusers] Empty output/error log file

FyD fyd at q4md-forcefieldtools.org
Mon Mar 28 04:49:04 MDT 2011


Ok, thanks James; I have some tests to do now...

I will summarize for the Torque Users Mailing List what we did to solve
our problem. I will let you know. Regards, Francois


Quoting "Coyle, James J [ITACD]" <jjc at iastate.edu>:

> Francois,
>
>   That is a huge percentage of inodes used for a scratch disk.
> I don't have more than 2% used on the scratch disk for
> any of my nodes. If I read the listing you provided correctly,
> I see 80 million inodes used (80 million files?)
>
>   Make sure that you clean up scratch files used exclusively
> by one process when that process completes, and any shared
> scratch files when your job completes.
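>
>   For example, a minimal sketch of a per-job scratch directory that is
> removed when the job script exits (the /scratch/$USER/$PBS_JOBID layout
> is only an assumption; adjust it to your own setup):
>
> SCRATCH=/scratch/$USER/$PBS_JOBID      # hypothetical per-job directory
> mkdir -p "$SCRATCH"
> trap 'rm -rf "$SCRATCH"' EXIT          # clean up when the job script exits
> cd "$SCRATCH"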
>
>   If the above does not fix the problem, and if you can redo
> the filesystem for /dev/sdb1 on each node, I'd make more inodes.
> See the -i option on mke2fs.  (Inodes don't take much room and if
> you run out of inodes, files cannot be created.)
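>
>   For example (only a sketch; the bytes-per-inode value is illustrative):
>
> mke2fs -j -i 4096 /dev/sdb1    # ext3 with roughly one inode per 4 KiB of space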
>
>   OR:
>
>   Since you are on CentOS 5.x, I'd suggest switching
> to the xfs filesystem if you have large numbers of files.
> I switched from ext3 to xfs at the advice of my cluster vendor
> and my users report better performance for large numbers
> of files in a directory, and I see better fileserver stability.
> That is my experience, though yours might be different.
>
> For changing to XFS:
> --------------------
>   Become root and check that you have /sbin/mkfs.xfs;
> if not, install it with yum.
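>
>   For example (assuming the usual CentOS package names, which may differ
> depending on your repositories):
>
> rpm -q xfsprogs || yum install xfsprogs kmod-xfs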
>
>   If all you have on the /dev/sdb disk is the scratch partition, you
> can issue:
>
> umount /dev/sdb1
>
> parted /dev/sdb rm 1      #remove existing partition number 1 if any
> parted -s /dev/sdb mklabel gpt   #create the label for the drive
> parted /dev/sdb mkpart primary xfs "0 -0"  # make a single xfs partition spanning the whole drive
>
> mkfs.xfs -f /dev/sdb1         # Create the XFS filesystem.
>
>   You might want to use the -i option of mkfs.xfs to increase the
> number of inodes if the default is not enough.  On my fileserver, I
> got one inode per Mbyte.
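>
>   For example (a sketch; maxpct=50 is only illustrative):
>
> mkfs.xfs -f -i maxpct=50 /dev/sdb1   # allow up to 50% of the space for inodes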
>
>
>   For performance you might also want to mount with the options:
> noatime,nodiratime
>
> e.g.  replace your /etc/fstab entry for /dev/sdb1 with:
>
> /dev/sdb1 /scratch xfs rw,noatime,nodiratime,usrquota 0 0
>
> and issue umount /dev/sdb1 ; mount /dev/sdb1
>
> This is so that the filesystem does not need to keep updating the   
> access time.
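>
> You can verify the active mount options with, for example:
>
> grep scratch /proc/mounts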
>
>
>  James Coyle, PhD
>  High Performance Computing Group
>  115 Durham Center
>  Iowa State Univ.           phone: (515)-294-2099
>  Ames, Iowa 50011           web: http://www.public.iastate.edu/~jjc
>
>> -----Original Message-----
>> From: torqueusers-bounces at supercluster.org [mailto:torqueusers-
>> bounces at supercluster.org] On Behalf Of FyD
>> Sent: Friday, March 25, 2011 7:22 AM
>> To: Torque Users Mailing List
>> Subject: Re: [torqueusers] Empty output/error log file
>>
>> Michael,
>>
>>>> /dev/sdb1     ext3    917G  414G  457G  48% /scratch
>>>>
>>>> as you can see the /scratch partition is not full...
>>>
>>> still, those 1M files might be the problem.
>>
>> I requested 256,000 files (a smaller grid) instead of 1,000,000 and
>> the same problem happens...
>>
>>> Are those files temporary or do they belong to the result set?
>>
>> Yes
>>
>>> Will they be copied back to your head node and deleted afterwards?
>>
>> No
>>
>>> I guess you use /home with nfs to get your results back?
>>
>> No; otherwise (because of NFS) it would take forever...
>>
>>> Please check "df -i", too. A good way to exclude inode problems is
>> to
>>> run 8 jobs and issue "df -i" during computation.
>>
>> ok
>>
>> [xxxx at node2 ~]$ df -i
>> Filesystem            Inodes   IUsed   IFree IUse% Mounted on
>> /dev/sda3             767232   93248  673984   13% /
>> /dev/sda5            59211776      15 59211761    1% /tmp
>> /dev/sda1              26104      41   26063    1% /boot
>> tmpfs                1537806       1 1537805    1% /dev/shm
>> /dev/sdb1            122109952 80024695 42085257   66% /scratch
>> master0:/home        5859342208 1326970 5858015238    1% /home
>> master0:/usr/local   7285856  166312 7119544    3% /usr/local
>> master0:/opt         3840192   27564 3812628    1% /opt
>>
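>> We could check where those ~80 million inodes live with something
>> like (a rough sketch; it may take a while on a busy disk):
>>
>> for d in /scratch/*; do printf '%s\t' "$d"; find "$d" -xdev | wc -l; done
>>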
>>    ---
>>
>> Here is what we guess:
>>
>> When the first 8 jobs are started, all goes well. Then, among these 8
>> jobs, one finishes first while the 7 others are still writing to the
>> common hard drive (the /scratch partition), making the hard drive very
>> busy. A core is then freed and the 9th job can be run. However, for a
>> reason we do not understand, nothing is done for this 9th job and an
>> empty error log file is generated.
>>
>> We suspect that our hard drive (the /scratch partition) is busy and
>> the 'system' does not 'answer' when PBS sends the 9th job.
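>>
>> We could watch the disk while the 9th job starts, for example with
>> sysstat's iostat (5-second samples of extended statistics for sdb):
>>
>> iostat -dx sdb 5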
>>
>> Does it make sense to you?
>>
>> regards, Francois
>>
>>
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
>



           F.-Y. Dupradeau
                 ---
http://q4md-forcefieldtools.org/FyD/


