[torqueusers] momctl error - A description

Garrick Staples garrick at clusterresources.com
Tue Mar 6 16:26:08 MST 2007


On Tue, Mar 06, 2007 at 06:14:18PM -0500, michael young alleged:
> The thing that bothers me is that there is a job running currently.
> Should it show up with 'qstat'?

Yes, if there is a job running then it shows up in 'qstat'.  If it
doesn't, then there isn't a job.

> 
> >The jobid is first printed to the user when running qsub, and running
> >'qstat' with no arguments lists all jobs with their ids.
> > 
> 
> Our users use a GUI interface to submit jobs from Spartan '04. It does 
> not return the jobid.

Then I suspect it is just directly running processes and not actually
talking to TORQUE.

Do you see any activity in pbs_server's logfile indicating job
submissions?

You might also be doing this backwards.  Try running an interactive job
first, and then running Spartan from within the job.
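
Roughly, that could look like the sketch below.  'qsub -I' requests an
interactive job; the helper name in_job is made up for illustration, and
the Spartan launch command is left out because I don't know how it is
started on your system.

```shell
# Start an interactive job from the submit host, as a normal user:
#   qsub -I
# Once you get a prompt on a compute node, define the function below and
# call it to confirm the shell really is inside a job before starting
# Spartan from that same shell.

in_job() {
    # TORQUE sets PBS_JOBID in the environment of every job it starts.
    if [ -n "${PBS_JOBID:-}" ]; then
        echo "inside job $PBS_JOBID"
    else
        echo "not inside a job"
        return 1
    fi
}
```

If in_job reports a job id, then 'qstat' run from another terminal should
list that same id.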


> >>>Since your master node isn't running pbs_mom, this implies that the
> >>>problem is in your job script.  Is your job script using $PBS_NODEFILE
> >>>to spawn the processes?
> >>>
> >>>
> >>Where do I find the job script?
> >>I did a 'env' and there is no "$PBS_NODEFILE"
> >>   
> >>
> >
> >Inside of the job environment, the job will have the list of nodes
> >assigned to the job in the file named in $PBS_NODEFILE.

A job consists of a user-written job script that does what the user
wants and is submitted to TORQUE with qsub.  TORQUE then sends the
job to a node, where the job script is executed.
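
As a rough sketch of what such a job script could look like (the file
name job.sh and the program ./command are placeholders, not anything
from your setup):

```shell
#!/bin/sh
# job.sh: hypothetical job script, submitted with "qsub job.sh".
# TORQUE sets PBS_NODEFILE only inside a running job, which is why it
# does not show up in 'env' from a normal login shell.

# Print the number of lines in the node file given as $1
# (one line per processor slot assigned to the job).
count_slots() {
    wc -l < "$1" | tr -d ' '
}

if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "job has $(count_slots "$PBS_NODEFILE") slot(s) on these hosts:"
    sort -u "$PBS_NODEFILE"
    # ./command    # placeholder for the real work
else
    echo "PBS_NODEFILE is not set; not running inside a TORQUE job" >&2
fi
```

Run by hand outside of TORQUE, the script only prints the warning.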


> How do I get to this job environment?
> 
> In my reading on this, a doc. said to run "echo "sleep 30" | qsub" to 
> give me a second job.
> It returns "qsub: Bad UID for job execution".

Are you running qsub on the same host that is running pbs_server, or a
different host?  You aren't running it as root, right?
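
A quick check before submitting could look like this.  It is only a
sketch: the function name may_submit is made up, and the remark about
root assumes a default server configuration.

```shell
# Hypothetical pre-submit check: report whether the given uid
# (default: the current user) is root.
may_submit() {
    uid=${1:-$(id -u)}
    if [ "$uid" -eq 0 ]; then
        # Assumption: pbs_server has not been configured to accept root jobs.
        echo "uid 0: submit as a normal user instead"
        return 1
    fi
    echo "uid $uid: worth trying qsub"
}
```

If this passes and qsub still reports a bad UID, the submitting host is
the next thing to look at.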

 
> >For example, if you are launching an MPI program with mpirun, then you
> >would pass the nodes with something like:
> >
> > # one line per assigned processor slot
> > np=$(wc -l < "$PBS_NODEFILE")
> > mpirun -machinefile "$PBS_NODEFILE" -np "$np" ./command
> > 
> >
> 
> Does Linux come with a MPI program I can run or do I d/l 1 or make 1 or 
> what?
> Sorry, I'm really new to this whole clustering business.
> I do know Linux fairly well though.

If you aren't currently running MPI programs, then there is no point in
getting an MPI implementation.  I was just using it as an example of a
common type of job that could be run within PBS.


