[torqueusers] There's any way to allocate nodes by CPUs ?

James J Coyle jjc at iastate.edu
Tue Feb 13 15:54:21 MST 2007


Leandro,

   Take a look at

http://andrew.ait.iastate.edu/HPC/lightning/lightning_script_writer.html

  I have this for users who don't want to have to learn a new batch scheduler
syntax for each machine.

  I'm attaching the CGI script, which is just a Perl script that you can get
with View->Page Source, in case you need to break it out of the web and
use it standalone.  I'll apologize in advance: it has been modified over time
for use on 4 different machines, from an SGI Origin 2000 to a new Opteron
cluster with 4 processors/node.
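
  The attached script assumes every node has 4 CPUs. For a mixed cluster like
the one in the question (3 nodes with 4 CPUs, 2 nodes with 2 CPUs), the same
idea can be extended roughly as sketched below. This is only an illustration,
not part of the attached script: node_request and the hard-coded %cluster
inventory are hypothetical names, and the sketch assumes you know how many
nodes of each size exist. It only builds the request string; which physical
nodes the job actually lands on is still up to Torque/Maui.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the attached CGI script.
# $inventory is a hash ref mapping "CPUs per node" => "number of such nodes".
# Returns a string such as "nodes=3:ppn=4+2:ppn=2", or undef when the
# inventory cannot supply the requested number of CPUs.
sub node_request {
    my ($cpus, $inventory) = @_;
    return if $cpus < 1;

    my @parts;
    my $left = $cpus;

    # Greedy packing: visit node sizes from largest to smallest.
    foreach my $ppn (sort { $b <=> $a } keys %$inventory) {
        last if $left == 0;
        my $avail = $inventory->{$ppn};

        # Whole nodes of this size that can be filled completely.
        my $full = int($left / $ppn);
        $full = $avail if $full > $avail;
        if ($full > 0) {
            push @parts, "$full:ppn=$ppn";
            $left  -= $full * $ppn;
            $avail -= $full;
        }

        # A remainder smaller than one node goes on one more node of this size.
        if ($left > 0 && $left < $ppn && $avail > 0) {
            push @parts, "1:ppn=$left";
            $left = 0;
        }
    }

    return if $left > 0;    # not enough CPUs in the inventory
    return 'nodes=' . join('+', @parts);
}

# Assumed inventory, taken from the question: 3 nodes x 4 CPUs, 2 nodes x 2 CPUs.
my %cluster = (4 => 3, 2 => 2);
my $req = node_request(16, \%cluster);
print "#PBS -l $req\n" if defined $req;   # prints "#PBS -l nodes=3:ppn=4+2:ppn=2"
```

  With the same inventory, a request for 9 CPUs gives nodes=2:ppn=4+1:ppn=1,
which matches what the attached script produces for 9 CPUs.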

 - Jim Coyle

-- 
 James Coyle, PhD
 SGI Origin, Alpha, Xeon and Opteron Cluster Manager
 High Performance Computing Group     
 235 Durham Center            
 Iowa State Univ.           phone: (515)-294-2099
 Ames, Iowa 50011           web: http://jjc.public.iastate.edu

> Hi,
> 
> Sorry for the cross-post, but I'm looking for a solution to a problem I have
> with Torque, with and without Maui.
> 
> What I need is a way to choose nodes by CPUs that is simpler than the
> "-l nodes=X:ppn=Y" syntax. What I'm looking for is a "-l cpus=X" option that
> lets the resource manager/scheduler choose the nodes, packing the tasks onto
> the nodes with more CPUs.
> 
> I'm testing a heterogeneous cluster, where I have nodes with 2 single-core
> CPUs and nodes with 2 dual-core CPUs (4 CPUs). This test cluster has 3
> nodes with 4 CPUs and 2 nodes with 2 CPUs.
> 
> All the jobs we run here are launched automatically by an application, which
> calls some scripts to build the job on the cluster. If I need 16 CPUs, I don't
> see any easy way to write a generic script, using some standard parameters,
> that creates a "-l nodes=3:ppn=4+2:ppn=2" line :-/
> 
> The easy way would be something like "-l cpus=16" for the previous example,
> but is it possible?
> 
> Today we are using dual-core CPUs, but what is going to happen when we use
> quad-core CPUs? I see an ugly picture for my job allocation task...
> 
> Thanks in advance for any help,
> 
> Regards,
> 
> -- 
> Leandro Tavares Carneiro
> Analista de Suporte Linux/Unix
> 


-------------- next part --------------
#!/usr/local/bin/perl
#
# Written by Jim Coyle (jjc at iastate.edu), Iowa State Univ., Ames, Iowa.
# (C) All rights reserved.
# Disclaimer: The author and Iowa State Univ. assume no 
#             responsibility regarding the use and/or misuse of
#             this software.
#
# This perl file is a CGI script written to go with the WWW form 
# nqs_writer.html. It writes the NQS script based upon user input.
# The sample programs come from the directory ../htdocs/SAMPLES/.
# All these files need to be owned by the username "nobody" to
# work properly.
#
# print mime header
print "Content-type: text/html\n\n";
# name for NQS output file:
$out="BATCH_OUTPUT";
$err="BATCH_ERRORS";
# Note for special arrangements
$arranged="\n# Time : as scheduled with machine room operator at 4-2256 option 2";
# Standard divider used in instructions
$divider="#-------------------------------------------------------------#\n";

#Queues may be open or restricted. Script will contain a URL for application
# form if queue is restricted.

%QACCESS=("FARM-S", "open",
           "FARM-T", "open",
           "FARM-M", "open",
           "FARM-L", "restricted",
           "FARM-H", "restricted",
           "FARM-P", "restricted",
           "FARM-PS","restricted",
           "FARM-LS","restricted",
           );

%QCOMMENT=("FARM-S", "# small queue: 1 hr CPU time 64MB memory ",
           "FARM-T", "# short queue: 10 min CPU time 200MB memory ",
           "FARM-M", "# medium queue: 2 hrs CPU time 200MB memory ",
           "FARM-L", "# large queue: 24 hrs CPU time 200MB memory ",
           "FARM-LS","# long-small queue: 24 hrs CPU time 100MB memory ",
           "FARM-H", "# huge queue: 450MB memory $arranged",
           "FARM-P", "# 4 CPU parallel queue: 24 hrs CPU time 1000MB memory ",
           "FARM-PS","# high priority 4 CPU parallel queue; 15 min CPU time "
           );

%SAMPLES= ("kf90" , "Parallel/OpenMP_F",
           "kcc" , "Parallel/FARM-P_4_KAP_C",
           "MPI_C", "Parallel/FARM-P_4_MPI_C",
           "MPI_F", "Parallel/FARM-P_4_MPI_F",
           "PVM_C", "Parallel/FARM-P_4_PVM_C",
           "PVM_F", "Parallel/FARM-P_4_PVM_F",
           "maple", "Serial/FARM_MAPLE",
           "matlab","Serial/FARM_MATLAB"
           );

%NOTIFY=("start", "# notify when job starts \n#\$ -mb \n", 
	 "end",   "# notify when job completes\n#\$ -me \n");

$Instructions = 
	   '# Instructions for new farm users:' . "\n"
         . '#  To use this script:' . "\n"
         . '#   1) Save this script as a file named myscript in an AFS directory ' . "\n"
         . '#       (eg. your home directory or your private AFS locker)' . "\n"
         . '#   2) Issue                   ' . "\n"
         . '#       add farm ( if you haven\'t already )' . "\n"
         . '#       relogin -l 2d ( if you will be running for hours ) ' . "\n"
         . '#       qsub myscript           ' . "\n"
         . '#        Use qstat to see job status; ' . "\n"
         . '#            or to delete one of your jobs' . "\n\n"
         . "# Start job in current directory.\n" . '#$ -cwd ' . "\n\n"
         . "# Output goes to file $out in current directory.\n#\$ -eo $out \n\n";

$Instructions = 
	   '# Instructions for new Opteron/HTX Cluster users:' . "\n"
         . '#  To use this script:' . "\n"
         . '#   1) Save this script as a file named myscript on hpc4' . "\n"
         . '#   2) On hpc4, Issue                   ' . "\n"
         . '#       qsub myscript    to submit the job ' . "\n"
         . '#        Use qstat -a to see job status, ' . "\n"
         . '#         Use qdel jobname to delete one of your jobs' . "\n"
         . '#         jobnames are of the form 1234.hpc4 '  . "\n\n"
#         . "# This script has cd $PBS_O_WORKDIR as the first command. \n"
#         . "# qsub command was executed. This is what most users want. Change\n"
#         . "# that command if you want something else.\n"
#         . "###########################################\n"
#         . "# MPI Users:                              \n"
#         . '#            For now, use mpirun -machinefile $PBS_NODEFILE' ."\n"
#         . "#            instead of just mpirun                         \n"
#         . "#            This will be fixed in the near future.         \n"
          . "###########################################\n"
         . "# Output goes to file $out.\n"
         . "# Error output goes to file $err.\n"
         . "# If you want the output to go to another file, change $out \n"
         . "# or $err in the following lines to the full path of that file. \n\n"
	 . "#PBS  -o $out \n"
	 . "#PBS  -e $err \n\n" ;
##NQE retired  . "#QSUB -o $out \n"
##NQE retired  . "#QSUB -e $err\n\n" ;



$Comments_for_restricted_queues= 
   $divider
.   "#   You must have a computing grant to use the $queue queue.\n"
.   "#   To obtain a computing grant, see URL \n"
.   '#      <a HREF="http://www.public.iastate.edu/~farm/application.html"> http://www.public.iastate.edu/~farm/application.html </a>' . "\n"
.   '#       http://www.public.iastate.edu/~farm/application.html ' . "\n"
.  "$divider \n"; 

$nodes=0;$cpus=0;
@NOTIFY_LIST=();
@args = &get_args();
foreach $a ( @args ) {
  ($t,$v) = split(/=/,$a);
  if ( $t eq "user_name" ) { $user=$v;}
  if ( $t eq "notify" ) { push(@NOTIFY_LIST, $NOTIFY{$v});}
  if ( $t eq "time" ) { $time=$v;$time =~ s/ *hr\.?/:00:00/; }
  if ( $t eq "memory" ) { $memory=$v; $memory =~ s/ *(.)bytes?/$1b/;}
  if ( $t eq "nodes" ) { $nodes=$v;}
  if ( $t eq "cpus" )  { $cpus=$v;}
  if ( $t eq "shell" ) { $shell=$v;}
  if ( $t eq "queue" ) { $queue=$v;}
  if ( $t eq "samples" ) { $sample=$v;}
  if ( $t eq "commands" ) { $commands=$v;}
}
$np=$cpus;
if ( $cpus ne 0 ) {
   $cpus_per_node=4;
   $cpu_remainder=$cpus % $cpus_per_node;
   $nodes += ($cpus-$cpu_remainder)/$cpus_per_node;
   $cpus = $cpu_remainder;
   if ( $nodes != 0 && $cpu_remainder !=0) {
# Syntax is like nodes=2:ppn=4+1:ppn=1  for 9 cpus (4 cpus per node).
     $node_req='nodes=' . "$nodes" . ':ppn=' . $cpus_per_node .
                   '+1:ppn=' . $cpu_remainder;
    }elsif ( $cpu_remainder == 0 )  {
# Syntax is like nodes=2:ppn=4  for 8 cpus.
     $node_req='nodes=' . "$nodes" . ':ppn=' . $cpus_per_node;
    }else {
# Syntax is like nodes=1:ppn=1  for 1 cpu.
     $node_req='nodes=1:ppn=' . $cpu_remainder;
   }

}


$SCRIPT  = '#!/bin/' . $shell . "\n\n" . $Instructions;
#if ( $#NOTIFY_LIST >= 0 ) 
#      {$SCRIPT .= join("\n",@NOTIFY_LIST , "\n")};
#if ( $QACCESS{$queue} eq "restricted" ) 
#      {$SCRIPT .=  $Comments_for_restricted_queues; }
#$SCRIPT .=     "$QCOMMENT{$queue} \n";
#$SCRIPT .=     '#$ -G ' . $queue . " 1 \n\n";
##NQE retired  $SCRIPT  .= '#QSUB -lt ' . $time . "\n";
##NQE retired  $SCRIPT  .= '#QSUB -lM ' . $memory . "\n";
###if ( $sample =~ /^MPI|^PVM|^OPEN_MP/i && $processors == 1 ) { 
##if (  $processors == 1 ) { 
#if ( $sample =~ /^MPI|^PVM|^OPEN_MP/i && ($nodes != 2) ) { 
#if ( $sample =~ /^MPI|^PVM|^OPEN_MP/i ) { 
#  $SCRIPT  .= "### The parallel sample requires 1 nodes to run.\n" ;
#  $SCRIPT  .= "### Changing to 4 nodes: four processors.\n";
#  $nodes = 2;
#  $node_req='nodes=1:ppn=4';
#  }
##NQE retired  $SCRIPT  .= '#QSUB -l mpp_p=' . $processors . "\n\n";
$cput=&cnvt_to_cput(4,$time);
#$SCRIPT  .= '#PBS -l' . "mem=$memory,nodes=$nodes:ppn=2,cput=$cput,walltime=$time\n\n";
# Use $node_req created above, which allows things like nodes=4:ppn=2+1:ppn=1
$SCRIPT  .= '#PBS -l' . "mem=$memory,$node_req,cput=$cput,walltime=$time\n\n";
$SCRIPT  .= "# Change to directory from which qsub command was issued \n" ;
$SCRIPT  .= '   cd $PBS_O_WORKDIR' . "\n\n";
if ( $sample eq "My_commands" ) { $SCRIPT .=     "$commands \n";}
  else {   
   $sample_file="../htdocs/HPC4/SAMPLES/" . $SAMPLES{$sample};
   open(FHtty,"<$sample_file") || warn ("Can't open $sample_file for read\n");
# append sample script to above NQS commands.
   @A=<FHtty>; $SCRIPT .= join("",@A) . "\n" ;
  }

$SCRIPT =~ s/\-np 4/-np $np/g;
#$SCRIPT =~ s/NPS/$np/g;
print "<PRE>\n";
# Escape &, < and > so the script displays literally inside <PRE>.
    print &txt2html($SCRIPT);
print "</PRE>\n";
#-----------------------------------------------------------------------
# Routine to convert processor* wall-time list into a total cputime in HH:MM:SS format
sub cnvt_to_cput{
  local($processors,$time) = @_;
  local(@T,$k,$carry,$cput);
  @T=split(/:/,$time);
  $carry=0.;
  foreach $k (reverse(0..$#T) ) {
   $T[$k] = $T[$k]*$processors+$carry;
# Reset so a carry from a lower field is not added again to higher fields.
   $carry=0;
   if ($k > 0 && $T[$k] > 59 ) {
     $carry = int($T[$k]/60);
     $T[$k] -= $carry*60;
    }
  }
  $cput=join(':',@T);
# Avoid returns like 10:0:5, instead make them 10:00:05 for looks.
  $cput =~ s/:(\d)(?!\d)/:0$1/g;
  return($cput);
}
sub get_args {
  local (@args) = ();
  local ($method,$length,$qstring);
  local ($string,$name,$value);

  $method  = $ENV{'REQUEST_METHOD'};
  $length  = $ENV{'CONTENT_LENGTH'};
  $qstring = $ENV{'QUERY_STRING'};

  if ( $method eq "GET" ) {
    $string = $qstring
  }
  elsif ( $method eq "POST" ) {
    sysread(STDIN,$string,$length) 
      || &reject("$0: Cannot read $length bytes from STDIN\n");
  }
  else {
    # guess request method - good for testing on command line
    if ( length($qstring) > 0 && $length > 0 ) {
      &reject("$0: Cannot guess request method. " . 
	    "Both QUERY_STRING and CONTENT_LENGTH are set.\n");
    }
    elsif ( length($qstring) > 0 ) {
      # probably a GET script
      $string = $qstring;
    }
    elsif (  $length > 0 ) {
      # probably a POST script
      sysread(STDIN,$string,$length) 
	|| &reject("$0: Cannot read $length bytes from STDIN\n");
    }
    else {
      &reject("<PRE>$0: unknown REQUEST_METHOD - $method</PRE>\n");
    }
  }
  foreach $temp ( split('&',$string) ) {
    ($name,$value) = split('=',$temp,2);
    $value =~ tr/+/ /;
    $value = &unescape_hex($value);
    push (@args,"$name=$value");
    $IN{$name} = $value;
  }
  return @args;
}
#-----------------------------------------------------------------------
sub unescape_hex {
  local($s) = @_;
  $s =~ s/%([0-9A-Fa-f]{2})/pack('C',hex($1))/ge;
  return $s;
}
#-----------------------------------------------------------------------
sub txt2html {
  local($txt)=@_;
  study $txt;
  $txt =~ s/&/&amp;/g;
  $txt =~ s/</&lt;/g;
  $txt =~ s/>/&gt;/g;
  return $txt;
}
#-----------------------------------------------------------------------
sub reject {
  print join(' ',@_);
  exit;
}
#-----------------------------------------------------------------------
sub get_add_alias {
 $setup_add_alias = "# Set up add alias for use in NQS script\n" 
# Escape \ and $ so Perl emits \!* and $aenv literally for the shell.
   . "alias add 'set aenv = `attach -c \\!*` && eval \$aenv ; unset aenv'"
   . "\n\n";
return( $setup_add_alias );
}
#-----------------------------------------------------------------------
# Local Variables:
# mode: perl
# End:
__END__


