[torqueusers] initial torque setup - jobs are dying right away

James A. Peltier jpeltier at cs.sfu.ca
Mon Jul 23 14:26:20 MDT 2007

Garrick Staples wrote:
> On Mon, Jul 23, 2007 at 03:00:46PM -0500, Adams, Samuel D Contr AFRL/HEDR alleged:
>> I got it to work.  I needed to give the user trying to run the job
>> permission to write to /var/spool/torque/spool.  Then he could run the
>> job.
> There's still something screwy there.  spool should be 1777, and should
> have been set that way when you ran 'make install' on the nodes.

I agree.  When I had that problem, the only cause was a wrong
mom_priv/config file.
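
To confirm the spool mode Garrick mentions, something like the following
could be run on each node. This is only a sketch: it assumes GNU stat and
the /var/spool/torque location used in this thread, and spool_mode is a
helper name made up for illustration.

```shell
# Assumption: TORQUE lives under /var/spool/torque, as in this thread.
TORQUE_HOME="${TORQUE_HOME:-/var/spool/torque}"

# spool_mode DIR: print the octal permission bits of DIR (GNU stat).
spool_mode() {
    stat -c '%a' "$1"
}

# Only report; do not change anything automatically.
if [ -d "$TORQUE_HOME/spool" ]; then
    mode=$(spool_mode "$TORQUE_HOME/spool")
    if [ "$mode" != "1777" ]; then
        echo "spool is mode $mode, expected 1777"
        echo "fix as root with: chmod 1777 $TORQUE_HOME/spool"
    fi
fi
```

If the mode is already 1777, the cause is more likely elsewhere, such as
the mom_priv/config file.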

If this is a new install, I would remove the /var/spool/torque directory
(or move it out of the way) on all nodes and then run

sudo make packages
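
For the "move it out of the way" option, a minimal sketch might look like
this. The move_aside helper and the .old-<timestamp> suffix are my own
choices for illustration, and it assumes the TORQUE daemons on the node
are stopped first.

```shell
# move_aside DIR: rename DIR to DIR.old-<timestamp> if it exists.
# Does nothing (and succeeds) when DIR is absent.
move_aside() {
    [ -d "$1" ] || return 0
    mv "$1" "$1.old-$(date +%Y%m%d%H%M%S)"
}

# Example, as root on each node (path taken from this thread):
# move_aside /var/spool/torque
```

Keeping the old directory around makes it possible to compare the old
mom_priv/config against the new one afterwards.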

Then, on the head node, install the clients, doc and server packages,
and optionally the mom package (if the head node will also compute) and
the gui package.

On the compute nodes, install just the pbs_mom package and optionally the
clients package, and start with

$pbsserver server_name
$usecp *:/ /

in the mom_priv/config file.  If it doesn't work after that, something
else is wrong and further troubleshooting is needed.  This gets you to
a "known good" configuration for everyone to work from, since everyone
then knows exactly how it is configured.
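
As a rough sketch, the two-line config above could be written on each
compute node like this. The write_mom_config helper and the server name
headnode.example.org are placeholders for illustration; substitute your
own pbs_server host.

```shell
# write_mom_config DIR SERVER: write the two-line config into DIR/config.
# The \$ and single quotes keep the shell from expanding $pbsserver
# and $usecp, which are literal pbs_mom directives.
write_mom_config() {
    mkdir -p "$1" || return 1
    printf '%s\n' "\$pbsserver $2" '$usecp *:/ /' > "$1/config"
}

# Example, as root (hypothetical server name, path from this thread):
# write_mom_config /var/spool/torque/mom_priv headnode.example.org
```

pbs_mom reads this file at startup, so it needs a restart after the file
changes.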

James A. Peltier
Technical Director, RHCE
SCIRF | GrUVi @ Simon Fraser University - Burnaby Campus
Phone   : 604-291-3610
Fax     : 604-291-3045
Mobile  : 778-840-6434
E-Mail  : jpeltier at cs.sfu.ca
Website : http://gruvi.cs.sfu.ca | http://scirf.cs.sfu.ca
MSN     : subatomic_spam at hotmail.com
