[torqueusers] php based web interface for qstat viewing
arnaud.renard at univ-reims.fr
Thu Apr 19 02:21:44 MDT 2007
Hi all,
I'm new to this mailing list, so sorry if my questions are stupid.
I'm looking for a PHP-based web interface for viewing qstat output, perhaps like the one described here:
http://www.supercluster.org/pipermail/torqueusers/2006-November/004711.html
Can you give me some URLs?
-thanks-
--
Arnaud Renard,
tél : +33 326 91 85 91 - fax : +33 326 91 33 97
http://cosy.univ-reims.fr/~arenard/
Université de Reims Champagne-Ardenne - CReSTIC SysCom
UFR Sciences Exactes et Naturelles
Département de Mathématiques et Informatique
Moulin de la Housse - BP 1039
51687 Reims Cedex 2.
----- Original Message -----
From: Fabio Martinelli
To: torqueusers at supercluster.org
Sent: Wednesday, April 18, 2007 11:03 PM
Subject: [torqueusers] cloning the installation jobs for a generic LCG site
Hi all,
I manage an LCG Grid site in Italy with about 15 HPC Linux nodes.
Normally every node sees the application software on an
NFS storage element, and the application software is installed
from outside my site by someone who submits an 'installation job':
when the job arrives, it runs on a node and modifies the NFS area.
I never know when these jobs will arrive.
This is a terrible deployment strategy for me because:
1) you don't know when/where/why something changes in your cluster
2) you can't check whether something has changed after the installation
because you don't have a package manager, so you don't know
what the correct state of the software is (permissions? CRC? owner? size?)
3) the NFS area is a single point of failure
4) the NFS area is a performance bottleneck for all the nodes: there is no
concept of a 'cache', so over time the nodes read the same files
many times
5) my nodes have hard disks with a lot of unused storage: almost
every node could certainly store all the application software (about 90 GB)
without problems.
All the other Grid sites are designed this way!
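The check that point 2 says is missing (permissions, CRC, owner, size) could be approximated with a manifest taken before and after an installation job. This is only a rough sketch under assumptions: the function names are made up for illustration, CRC32 is used because the message mentions a CRC (it is not tamper-proof), and it is no substitute for a real integrity tool.

```python
import os
import stat
import zlib


def file_record(path):
    """Return (mode, uid, gid, size, crc32) for one regular file."""
    st = os.lstat(path)
    crc = 0
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large files are not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            crc = zlib.crc32(chunk, crc)
    return (stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid, st.st_size, crc)


def build_manifest(root):
    """Map each regular file under root (by relative path) to its record."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are recorded.
            if os.path.isfile(full) and not os.path.islink(full):
                manifest[os.path.relpath(full, root)] = file_record(full)
    return manifest


def diff_manifests(before, after):
    """Return sorted lists of added, removed and changed relative paths."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(p for p in set(before) & set(after) if before[p] != after[p])
    return added, removed, changed
```

Calling build_manifest() on the software area before and after a job, then diff_manifests() on the two results, lists what the job touched.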
To at least mitigate this scenario I want to:
1) install the application software on every node where it's possible; where it's
not possible, continue to use NFS for now (or AFS with a big cache? or something else?)
2) run tools like Tripwire or Samhain to understand when/where/why something changes
on each node (or something else?)
3) the most important point for Torque/Maui: run N copies of the installation job when it arrives,
one for each node that can store the application software.
I know the list of users who run the installation jobs well; there are about 15 of them.
So how can I solve this in Torque? Do you have suggestions for a better deployment design?
I want more security and performance from my cluster.
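One possible shape for point 3, as a sketch only: if the owner of a job is one of the known installer accounts, build one qsub command per node, each pinned to that node with `-l nodes=<hostname>`. The function name, the account names and the hostnames are hypothetical, and how the incoming job is intercepted in the first place is left open here.

```python
def clone_commands(script_path, owner, nodes, installers):
    """Build qsub argument lists for a job script.

    Jobs from installer accounts get one command per node, each requesting
    that specific host; jobs from anyone else get a single plain qsub.
    Nothing is executed here, the commands are only constructed.
    """
    if owner not in installers:
        return [["qsub", script_path]]
    return [["qsub", "-l", f"nodes={node}", script_path] for node in nodes]


# Hypothetical site data: installer accounts and nodes with local disk space.
INSTALLERS = {"sgm_a", "sgm_b"}
LOCAL_NODES = ["wn01.example.org", "wn02.example.org"]

for cmd in clone_commands("install.sh", "sgm_a", LOCAL_NODES, INSTALLERS):
    print(" ".join(cmd))
```

Each printed line could then be run by hand or passed to subprocess.run() on the submit host once the list of nodes is trusted.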
many thanks in advance,
Fabio Martinelli
--
------------------------------------------------------
Dr. Fabio Martinelli
Grid Computing Lab
institute INFN Tor Vergata
address via della ricerca scientifica, 00133 Roma
phone +39 06 7259 4113
+39 06 7259 4036
email fabio.martinelli at roma2.infn.it
fabio.martinelli at cern.ch
web http://grid.roma2.infn.it/
------------------------------------------------------------------------------
_______________________________________________
torqueusers mailing list
torqueusers at supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers