[torqueusers] Empty output/error log file

Coyle, James J [ITACD] jjc at iastate.edu
Fri Mar 25 15:17:46 MDT 2011


Francois,

  That is a huge %used figure for inodes on a scratch disk.
I don't have more than 2% of inodes used on the scratch disk of
any of my nodes. If I read the listing you provided correctly,
I see 80 million inodes used (80 million files?).

  Make sure that you clean up scratch files used exclusively 
by one process when that process completes, and any shared 
scratch files when your job completes. 
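  As a sketch (the directory layout below is an assumption, not
something from your setup), a job script can make its own scratch
directory and remove it when the job exits, even on failure:

#!/bin/sh
#PBS -l nodes=1:ppn=1
SCR=/scratch/$USER/$PBS_JOBID   # hypothetical per-job directory, named after the TORQUE job id
mkdir -p "$SCR"
trap 'rm -rf "$SCR"' EXIT       # clean up on any exit, normal or not
cd "$SCR"
# ... run the computation here, keeping temporaries under $SCR ...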

  If the above does not fix the problem, and if you can re-create
the filesystem for /dev/sdb1 on each node, I'd make more inodes.
See the -i option of mke2fs.  (Inodes don't take much room, and if
you run out of inodes, no more files can be created.)
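  For example (the 16384 bytes-per-inode value below is just an
illustration; pick a value based on your typical file size):

umount /dev/sdb1
mke2fs -j -i 16384 /dev/sdb1   # -j = ext3 journal; roughly one inode per 16 KB
mount /dev/sdb1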

  OR:

  Since you are on CentOS 5.x, I'd suggest switching
to the XFS filesystem if you have large numbers of files.
I switched from ext3 to XFS on the advice of my cluster vendor,
and my users report better performance for large numbers
of files in a directory, and I see better fileserver stability.
That is my experience, though yours might be different.

For changing to XFS:
--------------------
  Become root and check that you have /sbin/mkfs.xfs;
if it is missing, install it with yum.
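In case it is missing (package names are from my memory of CentOS, so
double-check: xfsprogs provides mkfs.xfs, and on CentOS 5 the xfs
kernel module may come from a separate kmod-xfs package):

[ -x /sbin/mkfs.xfs ] || yum install xfsprogs   # userspace XFS tools
modprobe xfs && grep xfs /proc/filesystems      # confirm the kernel knows about xfs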

  If all you have on the /dev/sdb disk is the scratch partition, you can issue:

umount /dev/sdb1

parted /dev/sdb rm 1      #remove existing partition number 1 if any
parted -s /dev/sdb mklabel gpt   #create the label for the drive
parted /dev/sdb mkpart primary xfs "0 -0"  # make a single xfs partition spanning the whole drive

mkfs.xfs -f /dev/sdb1         # Create the XFS filesystem.
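  As an optional sanity check before editing fstab:

parted /dev/sdb print    # should now show a single xfs partition
mount -t xfs /dev/sdb1 /scratch
df -i /scratch           # the inode count should be far larger than before
umount /dev/sdb1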

  You might want to use the -i option of mkfs.xfs to increase the
number of inodes available if the default is not enough.  On my
fileserver, I got one inode per Mbyte.
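  For instance (the maxpct=50 below is only an illustration: XFS
creates inodes on demand, and -i maxpct just caps how much of the
disk they can occupy, typically 25% by default):

mkfs.xfs -f -i maxpct=50 /dev/sdb1   # allow inodes to use up to half the space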


  For performance you might also want to mount with the options:
noatime,nodiratime

e.g.  replace your /etc/fstab entry for /dev/sdb1 with:

/dev/sdb1 /scratch xfs rw,noatime,nodiratime,usrquota 0 0

and issue umount /dev/sdb1 ; mount /dev/sdb1

This is so that the filesystem does not have to update file access
times on every read.
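You can check that the options took effect with:

grep sdb1 /proc/mounts   # noatime and nodiratime should appear in the option list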


 James Coyle, PhD
 High Performance Computing Group     
 115 Durham Center            
 Iowa State Univ.           phone: (515)-294-2099
 Ames, Iowa 50011           web: http://www.public.iastate.edu/~jjc

>-----Original Message-----
>From: torqueusers-bounces at supercluster.org [mailto:torqueusers-
>bounces at supercluster.org] On Behalf Of FyD
>Sent: Friday, March 25, 2011 7:22 AM
>To: Torque Users Mailing List
>Subject: Re: [torqueusers] Empty output/error log file
>
>Michael,
>
>>> /dev/sdb1     ext3    917G  414G  457G  48% /scratch
>>>
>>> as you can see the /scratch partition is not full...
>>
>> still, those 1M files might be the problem.
>
>I requested 256 000 files (a smaller grid) instead of 1 000 000 and
>the same problem happens...
>
>> Are those files temporary or do they belong to the result set?
>
>Yes
>
>> Will they be copied back to your head node and deleted afterwards?
>
>No
>
>> I guess you use /home with nfs to get your results back?
>
>No, otherwise (because of NFS) it would take forever...
>
>> Please check "df -i", too. A good way to exclude inode problems
>> is to run 8 jobs and issue "df -i" during computation.
>
>ok
>
>[xxxx at node2 ~]$ df -i
>Filesystem             Inodes    IUsed      IFree IUse% Mounted on
>/dev/sda3              767232    93248     673984   13% /
>/dev/sda5            59211776       15   59211761    1% /tmp
>/dev/sda1               26104       41      26063    1% /boot
>tmpfs                 1537806        1    1537805    1% /dev/shm
>/dev/sdb1           122109952 80024695   42085257   66% /scratch
>master0:/home      5859342208  1326970 5858015238    1% /home
>master0:/usr/local    7285856   166312    7119544    3% /usr/local
>master0:/opt          3840192    27564    3812628    1% /opt
>
>    ---
>
>Here is what we guess:
>
>When the first 8 jobs start, all goes well. Then one of these 8
>jobs finishes first while the 7 others are still writing to the
>common hard drive (the /scratch partition), making the hard drive
>very busy. A core is then freed and the 9th job can be run. However,
>for a reason we do not understand, nothing is done for this 9th job
>& an empty error log file is generated.
>
>We suspect that our hard drive (the /scratch partition) is busy and
>the 'system' does not 'answer' when PBS sends the 9th job.
>
>Does it make sense to you?
>
>regards, Francois
>
>

