[torquedev] change in torque distribution (automake)

Josh Butikofer josh at clusterresources.com
Thu Apr 12 15:56:55 MDT 2007


Standa,

You may be interested in knowing that we have beta code within Moab Workload Manager that uses node
virtualization to accomplish much of the same functionality that you hope to achieve. Moab
interfaces with an unmodified TORQUE to control physical nodes as usual. When necessary, however,
Moab can launch a virtual node on a physical node and then run a job in this virtual node, via
TORQUE. The job needs no modification. For example, we and our collaborators have even been able to
run an unmodified MPI job across virtual nodes in two different clusters. After the jobs complete,
the virtual nodes are retired. Throughout this process, TORQUE reports the status of both the
physical and virtual nodes AND the jobs running on those nodes. The whole operation is fairly
transparent and works quite well. Moab can schedule the use of these virtual nodes and their
required resources (disk images, IP addresses, MAC addresses, etc.). You can check out more
information about Moab's virtualization capabilities at the following URL:

http://www.clusterresources.com/products/mwm/docs/5.6resourceprovisioning.shtml#virtualization

I don't mean to discourage your work on TORQUE and virtualization, but I just thought that you may
want to know more about others' efforts and progress as well.

Regards,

-- 
Joshua Butikofer
Cluster Resources, Inc.

josh at clusterresources.com
Voice: (801) 717-3707
Fax:   (801) 717-3738
--------------------------


Standa Kunc wrote:
> First, I will try to explain my motivation:
> 
> I can run an unmodified pbs_mom inside a virtual machine. But this
> virtual machine has to run all the time, and such a solution has
> limited flexibility.
> 
> I want to be able to start or stop virtual machines transparently. I
> want to have multiple virtual machines, none of which has to be
> running, or all of which can run when there are free resources on the
> physical computer.
> 
> I want to have one pbs_mom running on each physical compute node all
> the time (I call it physmom). This physmom acts a little like a local
> server for the pbs_moms running in virtual computers (I call them
> virtmoms).
> 
> The idea is that pbs_server communicates with the physmoms. Each
> physmom should report the state of all its virtmoms. When the
> scheduler decides that some job should run on some virtmom,
> pbs_server cooperates with the physmom to send the job (or to wake up
> the virtual computer and then send the job).
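
The wake-then-send flow described above can be sketched roughly as follows. This is
only an illustration under stated assumptions: the virtmom struct, the state enum,
wake_virtmom() and physmom_deliver_job() are hypothetical names invented for this
sketch and are not part of TORQUE, and the wake-up step is a stub rather than a real
call into a virtualization system.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical state that a physmom might track for each of its virtmoms. */
enum virtmom_state { VIRTMOM_STOPPED, VIRTMOM_RUNNING };

struct virtmom {
    const char *name;            /* label for the virtual computer */
    enum virtmom_state state;    /* what the physmom would report upstream */
    int jobs_received;           /* jobs handed to this virtmom so far */
};

/* Stub: a real physmom would start the virtual computer here. */
static int wake_virtmom(struct virtmom *vm)
{
    vm->state = VIRTMOM_RUNNING;
    return 0;
}

/* Deliver a job to a virtmom, waking the virtual computer first if needed.
 * Returns 0 on success and -1 if the virtual computer could not be woken. */
int physmom_deliver_job(struct virtmom *vm, const char *job_id)
{
    assert(vm != NULL && job_id != NULL);

    if (vm->state == VIRTMOM_STOPPED && wake_virtmom(vm) != 0)
        return -1;

    vm->jobs_received++;
    printf("job %s sent to %s\n", job_id, vm->name);
    return 0;
}
```

Calling physmom_deliver_job() on a stopped virtmom first flips its state to running
and then counts the job as delivered; on an already running virtmom it only delivers.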
> 
> The first step is to create some kind of API, with functions like
> start_vm() and stop_vm(). Then I can implement these generic
> functions for a concrete virtualization system. This is why I need
> the src/resmom/openvz directory: there could be other directories
> like src/resmom/xen etc.
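
One way such a generic API could look is a table of function pointers with one entry
per virtualization system. The sketch below is an assumption for illustration only:
struct vm_backend, find_backend() and the stub OpenVZ functions are hypothetical,
only start_vm() and stop_vm() are named in the message, and the stubs merely record
state instead of driving a real OpenVZ container.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generic interface; each directory such as src/resmom/openvz
 * would supply one instance of it. */
struct vm_backend {
    const char *name;                     /* e.g. "openvz" */
    int (*start_vm)(const char *vm_id);   /* 0 on success, -1 on failure */
    int (*stop_vm)(const char *vm_id);    /* 0 on success, -1 on failure */
};

/* Stub OpenVZ backend: a real one would invoke the OpenVZ tooling here. */
static int openvz_running = 0;

static int openvz_start(const char *vm_id)
{
    (void)vm_id;
    openvz_running = 1;
    return 0;
}

static int openvz_stop(const char *vm_id)
{
    (void)vm_id;
    openvz_running = 0;
    return 0;
}

/* Backends compiled into this build; a Xen entry could be added later. */
static const struct vm_backend backends[] = {
    { "openvz", openvz_start, openvz_stop },
};

/* Look up a backend by name; returns NULL if it is not compiled in. */
const struct vm_backend *find_backend(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(backends) / sizeof(backends[0]); i++)
        if (strcmp(backends[i].name, name) == 0)
            return &backends[i];
    return NULL;
}
```

With this layout the rest of pbs_mom would call only find_backend() and the two
function pointers, so adding another virtualization system would mean adding one more
table entry rather than touching the callers.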
> 
> 
> Now the problem I am facing:
> 
> I am working with Torque 2.1.6, original revision number 1147. I
> should be able to regenerate the makefiles and configure, but there
> are some errors and it fails.
> 
> I have an Ubuntu distribution, and all the tools are installed:
> autoconf (GNU Autoconf) 2.60
> automake (GNU automake) 1.4-p6
> ltmain.sh (GNU libtool) 1.5.22 Debian 1.5.22-4 (1.1220.2.365
> 2005/12/18 22:14:06)
> 
> Is there some special way to build the Torque distribution? Do I have
> to change some parameters in the configuration?
> 
> See the output of the commands below.
> 
> Thank you for your reply
> S. Kunc

