[torqueusers] strange behaviour of ppn
Govind
govind.rhul at googlemail.com
Fri Nov 26 09:21:56 MST 2010
Hi All,
I have found that Maui allocates the correct number of processors but
Torque does not.
I am using Torque 2.3.6 and Maui 3.2.6p21.
This is my basic test script:
====================
#PBS -q long
#PBS -l nodes=4:ppn=4
# Check the number of nodes and their names
echo "Number of nodes:"
wc -l < "$PBS_NODEFILE"
echo "Node list:"
cat "$PBS_NODEFILE"
===================
The job output and tracejob show only a single node, with 4 slots instead of 16:
========================
Number of nodes:
4
Node list:
node53.beowulf.cluster
node53.beowulf.cluster
node53.beowulf.cluster
node53.beowulf.cluster
=================
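Since every line of $PBS_NODEFILE is one allocated slot, a per-host summary makes the mismatch easier to see than a raw line count. The following is only a minimal sketch; summarise_nodefile is a made-up helper name, not something Torque provides.

```shell
#!/bin/sh
# summarise_nodefile is a hypothetical helper, not part of Torque.
# It prints one "host slots" line per distinct host in a node file.
summarise_nodefile() {
    # Each line of the node file is one allocated slot (processor).
    sort "$1" | uniq -c | awk '{ printf "%s %d\n", $2, $1 }'
}

# Inside a job Torque sets $PBS_NODEFILE; do nothing when run elsewhere.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "Slots: $(wc -l < "$PBS_NODEFILE")"
    echo "Hosts: $(sort -u "$PBS_NODEFILE" | wc -l)"
    summarise_nodefile "$PBS_NODEFILE"
fi
```

For the node list above this would report 4 slots on 1 host (node53.beowulf.cluster), where nodes=4:ppn=4 should give 16 slots on 4 hosts.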
Maui (showq, checkjob) shows a total of 16 processors allocated:
==============
checkjob -v 192520
checking job 192520 (RM job '192520.pbs1.pp.rhul.ac.uk')
State: Running
Creds: user:gsongara group:hep2 class:long qos:DEFAULT
WallTime: 00:00:22 of 2:12:00:00
SubmitTime: Fri Nov 26 16:14:34
(Time Queued Total: 00:00:01 Eligible: 00:00:01)
StartTime: Fri Nov 26 16:14:35
Total Tasks: 16
Req[0] TaskCount: 16 Partition: DEFAULT
Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [NONE]
Exec: '' ExecSize: 0 ImageSize: 0
Dedicated Resources Per Task: PROCS: 1
Utilized Resources Per Task: [NONE]
Avg Util Resources Per Task: PROCS: 0.01
Max Util Resources Per Task: [NONE]
Average Utilized Procs: 16.00
NodeAccess: SHARED
TasksPerNode: 4 NodeCount: 4
Allocated Nodes:
[node56.beowulf.clust:4][node55.beowulf.clust:4][node54.beowulf.clust:4][node53.beowulf.clust:4]
Task Distribution:
node56.beowulf.clust,node56.beowulf.clust,node56.beowulf.clust,node56.beowulf.clust,node55.beowulf.clust,node55.beowulf.clust,node55.beowulf.clust,node55.beowulf.clust,node54.beowulf.clust,node54.beowulf.clust,node54.beowulf.clust,...
========================================
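To compare the Maui allocation above with what Torque itself recorded, the exec_host attribute from `qstat -f <jobid>` could be summarised per host. This is a rough sketch under assumptions: it assumes the host/slot+host/slot form of exec_host, count_exec_hosts is a made-up helper name, and the example value is invented rather than taken from this job.

```shell
#!/bin/sh
# count_exec_hosts is a hypothetical helper, not part of Torque.
# Assumption: exec_host has the form host/slot+host/slot+...
count_exec_hosts() {
    printf '%s\n' "$1" | tr '+' '\n' | cut -d/ -f1 | sort | uniq -c |
        awk '{ printf "%s %d\n", $2, $1 }'
}

# In practice the string would come from the exec_host line of
# `qstat -f <jobid>`; long values may be wrapped across lines there,
# so this sketch takes the already-joined string as an argument.
# Invented example value:
count_exec_hosts "nodeA/1+nodeA/0+nodeB/0"
```

If Torque's exec_host lists only one host while checkjob lists four, the two sides disagree about the allocation.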
I have a basic queue config:
set queue long resources_max.nodes = 4
set queue long resources_default.nodes = 1
Could you please advise? I think Torque is the problem here.
Regards
Govind
On Wed, Nov 24, 2010 at 7:18 AM, Christopher Samuel
<samuel at unimelb.edu.au> wrote:
> On 24/11/10 18:15, Govind wrote:
>
> > We are using both Open MPI (compiled with Torque support) and MPICH.
> > In both cases jobs run on a single slot only. Please advise if I
> > need any specific configuration in Torque.
>
> For Open-MPI you need to specify --with-tm=$PATH_TO_TORQUE
> for it to recognise and use the Torque build.
>
> Other than that it works fine, we've been using it for
> ages without any issues.
>
> cheers,
> Chris
> --
> Christopher Samuel - Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computational Initiative
> Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.unimelb.edu.au/