[torqueusers] Fluent Job submission

Coyle, James J [ITACD] jjc at iastate.edu
Thu Mar 11 09:34:29 MST 2010

I.Kureshi ,

  We run a similar cluster here, and use Fluent frequently.

License issue:

  I suspect the problem is that your nodes cannot communicate with
the Windows machine that serves the licenses.

  To test that, use qsub -I -l ...
without the script name to get an interactive session, and
try the fluent commands there. I suspect that it won't work.
You could then try pinging the license server to see if it is
even reachable over the network.  My guess is that (like us) you are
on a private subnet, or you have a firewall running.  In either
case, you will need to find a way to allow network access to
the license server machine.  If it is a firewall, you may be able to
add a line like the following to /etc/sysconfig/iptables:

-A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

  Then issue /sbin/service iptables restart

to allow responses to requests that your machine initiates
to come back.
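
The reachability test suggested above could be scripted roughly as follows.
This is only a sketch: it assumes the license entry is 7241@mech1 (read off
the "License path" line in the output quoted below, where the archive shows
"@" as " at "), and that bash with /dev/tcp support and the coreutils
timeout command are available on the compute node. The function names are
made up for illustration.

```shell
#!/bin/bash
# Split a FLEXlm "port@host" entry into "host port".
split_license_entry() {
    entry="$1"
    port="${entry%%@*}"   # text before the first "@"
    host="${entry#*@}"    # text after the first "@"
    printf '%s %s\n' "$host" "$port"
}

# Return 0 if a TCP connection to host:port opens within 5 seconds.
# Assumption: bash was built with /dev/tcp redirection support.
can_connect() {
    timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example, to be run inside the interactive job (qsub -I -l ...):
#   set -- $(split_license_entry 7241@mech1)
#   ping -c 3 "$1"
#   can_connect "$1" "$2" && echo "license port reachable" \
#                         || echo "license port blocked"
```

If ping succeeds but can_connect fails, a firewall on the port is the more
likely cause; if both fail, routing off the private subnet is the first
thing to check.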

  If it is a private subnet, you will need to use a gateway machine
(probably the front-end machine).
You do this by issuing the command

/sbin/route add default gw ttt.xxx.yyy.zzz  eth0

on each of your compute nodes (you may also need this in /etc/rc3.d/S99local),
where ttt.xxx.yyy.zzz is the numerical IP address of your gateway machine (the login node).
On the login node, you may need to issue:

/sbin/sysctl net.ipv4.ip_forward=1

To get this set on each boot, change the net.ipv4.ip_forward line
in /etc/sysctl.conf on the login node so that it reads net.ipv4.ip_forward = 1.
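
One way to make that /etc/sysctl.conf change repeatable is sketched below.
The function name enable_ip_forward is hypothetical, and the sketch assumes
a sed that accepts -i with a backup suffix (GNU sed does). Try it on a copy
of the file first.

```shell
#!/bin/sh
# Ensure "net.ipv4.ip_forward = 1" is set in a sysctl.conf-style file.
# $1 = path to the file (for example /etc/sysctl.conf).
enable_ip_forward() {
    conf="$1"
    if grep -q '^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=' "$conf"; then
        # A setting already exists: rewrite it in place, keeping a .bak copy.
        sed -i.bak 's/^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=.*/net.ipv4.ip_forward = 1/' "$conf"
    else
        # No setting yet: append one.
        echo 'net.ipv4.ip_forward = 1' >> "$conf"
    fi
}

# Example, as root on the login node:
#   enable_ip_forward /etc/sysctl.conf
#   /sbin/sysctl -p        # reload the file without rebooting
```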

   If someone else has a better way, let me know.

Fluent issue:

  If you run on multiple nodes, you will need -cnf=nodefile
on the fluent command.

If you have IB or Myrinet, you may want
to add -pib or -pmyri
to get communication running over the low-latency switch as opposed to
the default of Ethernet.
We have one cluster of each, and we find huge improvements using these
options when running "small" fluent problems.
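
Putting the options above together inside a Torque job might look like the
sketch below. It assumes the node file is the one Torque names in
$PBS_NODEFILE (one line per allocated processor), reuses the 2d -g -ssh
flags from the original script quoted below, and uses a hypothetical helper
name, build_fluent_cmd. It has not been checked against a Fluent install.

```shell
#!/bin/sh
# Build a Fluent command line from a node file.
# $1 = node file, $2 = journal file,
# $3 = optional interconnect flag such as -pib or -pmyri.
build_fluent_cmd() {
    nodefile="$1"
    journal="$2"
    interconnect="$3"
    # One process per non-empty line in the node file.
    nprocs=$(grep -c . "$nodefile")
    printf 'fluent 2d -g -ssh -t%s -cnf=%s%s -i %s\n' \
        "$nprocs" "$nodefile" "${interconnect:+ $interconnect}" "$journal"
}

# Example, inside the job script after the #PBS lines:
#   cmd=$(build_fluent_cmd "$PBS_NODEFILE" /home/sengik/Desktop/test/input.in -pib)
#   echo "Running: $cmd"
#   $cmd
```

Deriving -t from the node file keeps the process count in step with
whatever #PBS -l nodes=... requested, so the two cannot drift apart.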

 James Coyle, PhD
 High Performance Computing Group     
 115 Durham Center            
 Iowa State Univ.           phone: (515)-294-2099
 Ames, Iowa 50011           web: http://www.public.iastate.edu/~jjc

-----Original Message-----
From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of I.Kureshi U0850037
Sent: Thursday, March 11, 2010 4:30 AM
To: torqueusers at supercluster.org
Subject: [torqueusers] Fluent Job submission

Hi all,

I am a system admin at a UK university and we got a request to provide an HPC resource for fluent.

After installing Fluent on one of our clusters, which runs CentOS 5.4 with OSCAR 5.1b2 and has 16 nodes plus a head node, we were able to start Fluent from the terminal and submit parallel jobs through the shell with a journal file and the -g switch. The University has 45 licenses for Fluent 6.3.26 and 30 licenses for an older version 6.0/2?? (not sure which). These licenses reside on a Windows server running FLEXlm. We have floating licenses for many software packages on that machine.

When I try to submit a job through the job scheduler Torque/MAUI the simulations do not run as there is a license problem, even though it seems to be looking in the right place.

I have posted this on a Fluent forum as well, but since it seems to be a case of the environment variables created by TORQUE, I thought users here might be of more help.

I would appreciate any help regarding this. Below are the submission script, the journal file, the output file and the error file respectively.

The job was submitted using qsub fluent.job

Submission Script

#PBS -S /bin/bash
#PBS -m e
#PBS -M sengik at hud.ac.uk
#PBS -N fluent
#PBS -l nodes=3
#PBS -e stderr
#PBS -o stdout
fluent 2d -g -ssh -t3 -i /home/sengik/Desktop/test/input.in

Journal File

file/read-case /home/sengik/Desktop/test/2dcar_10.cas
file/write-data /home/sengik/Desktop/test/2dcar_10.dat
#as you can see just a simple case of load initialise save and exit

Output File

/usr/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 2d -g -ssh -t3 -i /home/sengik/Desktop/test/input.in
Loading "/usr/Fluent.Inc/fluent6.3.26/lib/fluent.dmp.114-32"
/usr/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 2d -pethernet -host -alnx86 -t3 -mpi=hp -path/usr/Fluent.Inc -ssh -cx node16.testbed-CLS:56711:56126

Server node is down or not responding
See the system adminstrator about starting the server, or
make sure the you're referring to the right host (see LM_LICENSE_FILE)
Feature: fluent
Hostname: mech1
License path: 7241 at mech1:/usr/Fluent.Inc/license/lnx86/../license.dat
FLEXlm error: -96,7. System Error: 11 "Resource temporarily unavailable"
For further information, refer to the FLEXlm End User Manual,
available at "www.macrovision.com".

Error File
/usr/Fluent.Inc/fluent6.3.26/bin/fluent: line 2397: glxinfo: command not found
/usr/Fluent.Inc/fluent6.3.26/cortex/lnx86/cortex.3.7.3 -f fluent -g -i /home/sengik/Desktop/test/input.in (fluent "2d -pethernet -host -alnx86 -r6.3.26 -t3 -mpi=hp -path/usr/Fluent.Inc -ssh")
Starting /usr/Fluent.Inc/fluent6.3.26/lnx86/2d_host/fluent.6.3.26 host -cx node16.testbed-CLS:56711:56126 "(list (rpsetvar (QUOTE parallel/function) "fluent 2d -node -alnx86 -r6.3.26 -t3 -pethernet -mpi=hp -ssh") (rpsetvar (QUOTE parallel/rhost) "") (rpsetvar (QUOTE parallel/ruser) "") (rpsetvar (QUOTE parallel/nprocs_string) "3") (rpsetvar (QUOTE parallel/auto-spawn?) #t) (rpsetvar (QUOTE parallel/trace-level) 0) (rpsetvar (QUOTE parallel/remote-shell) 1) (rpsetvar (QUOTE parallel/path) "/usr/Fluent.Inc") (rpsetvar (QUOTE parallel/hostsfile) "") )"

Welcome to Fluent 6.3.26

Copyright 2006 Fluent Inc.
All Rights Reserved

Loading "/usr/Fluent.Inc/fluent6.3.26/lib/flprim.dmp.1119-32"

Unexpected license problem; exiting.


The simple line:
fluent 2d -g -ssh -t3 -cnf=<hostfile> -i /home/sengik/Desktop/test/input.in
works perfectly fine.

EDIT: TORQUE allocates the nodes correctly and is working fine. The simulation just ends when the license error occurs.

Thanks in advance for the help

This transmission is confidential and may be legally privileged. If you receive it in error, please notify us immediately by e-mail and remove it from your system. If the content of this e-mail does not relate to the business of the University of Huddersfield, then we do not endorse it and will accept no liability.