[torqueusers] Altix cpusets

Dave Jackson jacksond at clusterresources.com
Thu Oct 27 14:55:24 MDT 2005


Jeroen,

  Thanks for the patch.  The changes have been made, but before they are
checked in, can you verify that the following is what was intended:

#ifdef CPUSETS_FIRST_CPU
    for (i = CPUSETS_FIRST_CPU;i < nCPUS;i++) /* CPUSET not CPUSETS */
#else
    for (i = 0;i < nCPUS;i++)
#endif

  Note the comment.

  If you can confirm, we will check in the changes and roll out a new
snapshot.

Thanks,
Dave
 
On Thu, 2005-10-27 at 11:07 +1000, Jeroen van den Muyzenberg wrote:
> Hi,
> 
> I've had the chance to play on an Altix 3700 before it joins our
> existing Altix in production next week and have been experimenting with
> using cpusets, with little initial success.
> 
> Turns out there were two problems. A cpuset name can be at most 8
> characters, and the string holding this name (cQueueName in
> start_exec.c) didn't have space for the terminating null. Also,
> cQueueName was being initialised with garbage, and the strncpy used to
> create the cpuset name doesn't append a null terminator when none is
> found within the copied length of the source string, while strncat
> relies on the destination already being terminated.
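
The failure mode described above can be sketched as follows. This is only
an illustration, not the attached patch: build_cpuset_name and
CPUSET_NAME_MAX are hypothetical names, and the 8-character limit is taken
from the message rather than from any header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CPUSET_NAME_MAX 8   /* assumed limit, as stated in the message */

/*
 * Hypothetical helper, not a TORQUE function: builds a cpuset name from a
 * prefix and a suffix, truncating to CPUSET_NAME_MAX characters and always
 * leaving a terminating null. dst must hold CPUSET_NAME_MAX + 1 bytes.
 */
void build_cpuset_name(char *dst, const char *prefix, const char *suffix)
{
    size_t used;

    assert(dst != NULL && prefix != NULL && suffix != NULL);

    /* Zero the whole buffer so it never starts out holding garbage. */
    memset(dst, 0, CPUSET_NAME_MAX + 1);

    /* strncpy() leaves dst unterminated when prefix has 8 or more
     * characters; the zeroed final byte covers that case. */
    strncpy(dst, prefix, CPUSET_NAME_MAX);

    /* strncat() appends at most n characters plus its own null, so n is
     * limited to the space left before the final byte. */
    used = strlen(dst);
    strncat(dst, suffix, CPUSET_NAME_MAX - used);
}
```

With a 9-byte buffer, build_cpuset_name(name, "batch", "12345") leaves
"batch123" in name, truncated to 8 characters and null-terminated.
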
> 
> We also intend to start using bootcpusets, and the existing code doesn't
> account for that, i.e. it will start placing jobs from CPU 0 onwards
> regardless of whether that CPU is already in another cpuset.
> 
> Attached is a patch that addresses all these issues. For bootcpuset support,
> there needs to be a define in pbs_config.h:
> 
> #define CPUSETS_FIRST_CPU X
> 
> where X is the first CPU outside the defined bootcpuset.
> 
> Looking forward to seeing this work in production next week.
> 
> Further improvements would be the ability to specify the type of memory
> access regime for the cpuset, and a better cpu allocation algorithm that
> would try to pack multi-cpu jobs onto the same node/brick if at all
> possible.
> 
> Cheers,
> Jeroen
> 
> Jeroen van den Muyzenberg
> CSIRO High Performance Scientific Computing
> Bureau of Meteorology/CSIRO HPCCC -
> High Performance Computing and Communications Centre
> Ph: +61 3 9669 8111 Fax: +61 3 9669 8112
> Jeroen.vandenMuyzenberg at csiro.au


