[torqueusers] initial torque setup - jobs are dying right away

Adams, Samuel D Contr AFRL/HEDR Samuel.Adams at BROOKS.AF.MIL
Mon Jul 23 15:15:08 MDT 2007


On the clients I typed

# make install_mom install_clients

Is this a nonstandard way to do it?  I didn't run the configure script or
compile the packages on the nodes, as they don't have compilers or
development tools installed.

Sam Adams
General Dynamics Information Technology
Phone: 210.536.5945

-----Original Message-----
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of James A.
Peltier
Sent: Monday, July 23, 2007 3:26 PM
To: torqueusers at supercluster.org
Subject: Re: [torqueusers] initial torque setup - jobs are dying right away

Garrick Staples wrote:
> On Mon, Jul 23, 2007 at 03:00:46PM -0500, Adams, Samuel D Contr AFRL/HEDR alleged:
>> I got it to work.  I needed to give the user trying to run the job
>> permission to write to /var/spool/torque/spool.  Then he could run a
>> job.
> 
> There's still something screwy there.  spool should be 1777, and should
> have been set that way when you ran 'make install' on the nodes.
> 
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers

I agree.  When I had that problem, the only cause was the
mom_priv/config file being wrong.
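
For reference, the 1777 mode mentioned in the quoted text above can be
checked and restored with something like the following. This is only a
sketch: fix_spool_mode is a hypothetical helper name, and the path in
the usage line assumes the default /var/spool/torque location used in
this thread.

```shell
# Sketch only: set the sticky, world-writable mode (1777) on a TORQUE
# spool directory. fix_spool_mode is a hypothetical helper name.
fix_spool_mode() {
    dir=$1
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    chmod 1777 "$dir" || return 1
    ls -ld "$dir"    # the mode column should now start with drwxrwxrwt
}
```

Run as root on each node, for example:
fix_spool_mode /var/spool/torque/spool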

If this is a new install, I would remove the /var/spool/torque directory
(or move it out of the way) on all nodes and run through the build:

./configure
make
sudo make packages

Then, on the head node, install the clients, doc, and server packages,
and optionally the mom (if the head node will also compute) and gui
packages.
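
A rough sketch of that install step follows. It assumes that
"make packages" left self-extracting installers named
torque-package-<kind>-linux-<arch>.sh in the build directory and that
they accept --install; check the names your build actually produced.
install_torque_packages is a hypothetical helper, not part of TORQUE.

```shell
# Sketch only: run the installers produced by "make packages".
# Usage: install_torque_packages <build_dir> <kind>...
install_torque_packages() {
    build_dir=$1
    shift
    for kind in "$@"; do
        found=0
        # The architecture suffix varies by build, hence the glob.
        for pkg in "$build_dir"/torque-package-"$kind"-*.sh; do
            [ -x "$pkg" ] || continue
            found=1
            "$pkg" --install || return 1
        done
        # Stop if no installer matched the requested kind.
        [ "$found" -eq 1 ] || { echo "no package for: $kind" >&2; return 1; }
    done
}
```

On the head node, as root, from the build directory:
install_torque_packages . clients doc server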

On the compute nodes just install the pbs_mom package and optionally the
clients package and start with

$pbsserver server_name
$usecp *:/ /

in the mom_priv/config file.  If it doesn't work after that, something
else is wrong and further troubleshooting would be needed.  This would
get you to somewhat of a "known good" configuration for everyone to work
from, since now everyone knows exactly how it is configured.
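
A minimal sketch of writing those two config lines on a compute node,
assuming the /var/spool/torque location used elsewhere in this thread.
write_mom_config is a hypothetical helper, and "headnode" in the usage
line stands in for your real server name.  Note that it overwrites any
existing config file.

```shell
# Sketch only: write the two-line mom_priv/config described above.
# Usage: write_mom_config <torque_home> <server_name>
write_mom_config() {
    torque_home=$1
    server=$2
    mkdir -p "$torque_home/mom_priv" || return 1
    # Single quotes keep the shell from expanding $pbsserver and $usecp.
    printf '$pbsserver %s\n$usecp *:/ /\n' "$server" \
        > "$torque_home/mom_priv/config"
}
```

For example, as root: write_mom_config /var/spool/torque headnode
and then restart pbs_mom so it rereads the file.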

-- 
James A. Peltier
Technical Director, RHCE
SCIRF | GrUVi @ Simon Fraser University - Burnaby Campus
Phone   : 604-291-3610
Fax     : 604-291-3045
Mobile  : 778-840-6434
E-Mail  : jpeltier at cs.sfu.ca
Website : http://gruvi.cs.sfu.ca | http://scirf.cs.sfu.ca
MSN     : subatomic_spam at hotmail.com

