[torqueusers] Problem with pbs_iff

Abraham Zamudio abraham.zamudio at gmail.com
Wed Nov 3 08:14:16 MDT 2010


Ken,

I found the problem. In my job script, the line

#PBS -o ROJ1_$PBS_O_JOBID.out

was the culprit.
My job script now begins like this:
#PBS -S /bin/bash
#PBS -N ROJ1
#PBS -q batch
#PBS -l nodes=2:ppn=4
#PBS -j oe
#PBS -o ROJ1.out
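
For anyone who still wants a per-job file name, here is a minimal sketch of one way to get it without putting a variable in the -o directive. It rests on assumptions I have not verified on every TORQUE version: that the variable reference in the directive was what broke post-job file processing, and that the job ID is exported to the running job as PBS_JOBID (PBS_O_JOBID is not a variable I know TORQUE to set). The names job_id and per_job_log and the "manual" fallback are hypothetical, added only so the script also runs outside the batch system.

```shell
#!/bin/bash
#PBS -S /bin/bash
#PBS -N ROJ1
#PBS -j oe
#PBS -o ROJ1.out

# Move to the submission directory; fall back to the current
# directory when the script is run by hand (assumption for testing).
cd "${PBS_O_WORKDIR:-$PWD}"

# PBS_JOBID is assumed to be set by TORQUE at run time.
# "manual" is a hypothetical placeholder used outside the batch system.
job_id="${PBS_JOBID:-manual}"

# Build the per-job file name inside the script body, where the shell
# performs normal variable expansion.
per_job_log="ROJ1_${job_id}.out"

# Write a marker line; the real program's output could be redirected
# to the same file in the same way.
echo "job ${job_id} started on $(hostname)" > "${per_job_log}"
```

The fixed name in the -o directive stays as the scheduler's own output file, while anything that needs the job ID in its name is written by the script itself.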


On Tue, Nov 2, 2010 at 6:01 PM, Ken Nielson
<knielson at adaptivecomputing.com> wrote:

>
>
> ----- Original Message -----
> From: "Abraham Zamudio" <abraham.zamudio at gmail.com>
> To: "Torque Users Mailing List" <torqueusers at supercluster.org>
> Sent: Tuesday, November 2, 2010 4:52:46 PM
> Subject: Re: [torqueusers] Problem with pbs_iff
>
>
> I'm not normally running pbs_iff from the command line; I only ran it
> from the command line as a test.
>
>
> My problem occurs when I run a job with the qsub command:
>
>
>
> grep 16.master /var/spool/torque/server_logs/20101102
> 11/02/2010 17:49:54;0100;PBS_Server;Job;16.master;enqueuing into batch,
> state 1 hop 1
> 11/02/2010 17:49:54;0008;PBS_Server;Job;16.master;Job Queued at request of
> mpiX at master, owner = mpiX at master, job name = ROJ1, queue = batch
> 11/02/2010 17:49:55;0008;PBS_Server;Job;16.master;Job Run at request of
> root at master
> 11/02/2010 17:49:56;000d;PBS_Server;Job;16.master;Not sending email: User
> does not want mail of this type.
> 11/02/2010 17:49:56;0010;PBS_Server;Job;16.master;Exit_status=0
> resources_used.cput=00:00:00 resources_used.mem=3916kb
> resources_used.vmem=234924kb resources_used.walltime=00:00:01
> 11/02/2010 17:49:56;000d;PBS_Server;Job;16.master;Post job file processing
> error; job 16.master on host quad2/2+quad2/1+quad2/0+quad4/2+quad4/1+quad4/0
> 11/02/2010 17:49:56;0100;PBS_Server;Job;16.master;dequeuing from batch,
> state COMPLETE
> 11/02/2010 17:50:56;000d;PBS_Server;Job;16.master;Email 'o' to mpiX at master
> failed: Child process '/usr/lib/sendmail -f adm mpiX at master'
> returned 78 (errno 10:No child processes)
>
>
> My qsub file is:
>
>
>
> #PBS -S /bin/bash
> #PBS -N ROJ1
> #PBS -q batch
> #PBS -l nodes=2:ppn=3
> #PBS -j oe
> #PBS -o ROJ1_$PBS_O_JOBID.out
> cd $PBS_O_WORKDIR
> /usr/local/mpiexec83/bin/mpiexec /jro_cluster/mpiX/CapacitacionMPI_ROJ/ROJ1-mpi
>
>
>
> grep 17.master /var/spool/torque/server_logs/20101102
> 11/02/2010 18:00:02;0100;PBS_Server;Job;17.master;enqueuing into batch,
> state 1 hop 1
> 11/02/2010 18:00:02;0008;PBS_Server;Job;17.master;Job Queued at request of
> mpiX at master, owner = mpiX at master, job name = ROJ1, queue = batch
> 11/02/2010 18:00:03;0008;PBS_Server;Job;17.master;Job Run at request of
> root at master
> 11/02/2010 18:00:03;000d;PBS_Server;Job;17.master;Not sending email: User
> does not want mail of this type.
> 11/02/2010 18:00:03;0010;PBS_Server;Job;17.master;Exit_status=0
> resources_used.cput=00:00:00 resources_used.mem=3932kb
> resources_used.vmem=234924kb resources_used.walltime=00:00:01
> 11/02/2010 18:00:03;000d;PBS_Server;Job;17.master;Post job file processing
> error; job 17.master on host quad2/2+quad2/1+quad2/0+quad4/2+quad4/1+quad4/0
> 11/02/2010 18:00:03;0100;PBS_Server;Job;17.master;dequeuing from batch,
> state COMPLETE
>
>
>
>
>
>
>
> On Tue, Nov 2, 2010 at 5:44 PM, Ken Nielson <
> knielson at adaptivecomputing.com > wrote:
>
>
>
>
> On 11/02/2010 04:37 PM, Abraham Zamudio wrote:
>
>
> I launch a job:
>
>
> [mpiX at master CapacitacionMPI_ROJ]$ qsub ROJ1-mpi.qsub
> 15.master
>
>
> The output of the tracejob command:
>
>
>
> [mpiX at master ~]$ tracejob 15
> /var/spool/torque/server_priv/accounting/20101102: Permission denied
> /var/spool/torque/mom_logs/20101102: No such file or directory
> /var/spool/torque/sched_logs/20101102: No such file or directory
>
>
> Job: 15.master
>
>
> 11/02/2010 17:42:28 S enqueuing into batch, state 1 hop 1
> 11/02/2010 17:42:28 S Job Queued at request of mpiX at master, owner =
> mpiX at master, job name = ROJ1, queue = batch
> 11/02/2010 17:42:29 S Job Run at request of root at master
> 11/02/2010 17:42:29 S Not sending email: User does not want mail of this
> type.
> 11/02/2010 17:42:29 S Not sending email: User does not want mail of this
> type.
> 11/02/2010 17:42:29 S Exit_status=1 resources_used.cput=00:00:00
> resources_used.mem=0kb resources_used.vmem=0kb
> resources_used.walltime=00:00:00
> 11/02/2010 17:42:29 S Post job file processing error
> 11/02/2010 17:42:29 S dequeuing from batch, state COMPLETE
>
> Abraham,
>
> I do not think pbs_iff is the problem. Reading the log files the job
> successfully submitted and ran. This is a post job processing error. It
> looks like maybe the output directory or something of that nature is not
> available to your application.
>
> pbs_iff is only used at the beginning of a client operation such as qsub.
>
> Ken
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>



-- 
Abraham Zamudio Ch.