[torqueusers] why not have error file and o file when job is finished
zhyang at lzu.edu.cn
Tue Dec 2 19:16:45 MST 2008
Hi Gus Correa
Thanks for your detailed explanation. Following what you said, I checked all of the directories and found the problem. It is OK now!
Thanks again.
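
For later readers of this thread: the three places listed in the quoted message below can be checked in one pass. The following is only a rough sketch. The function name find_job_output and the fallback path /var/spool/torque are assumptions made for illustration (PBS_HOME is usually not set in a user's shell, and it differs between sites), and the spool file pattern assumes the files are named after the full job ID, as the *.OU and *.ER names in the quoted message suggest.

```shell
#!/bin/sh
# Sketch: list output files that Torque/PBS may have left behind for one job.
# Assumptions (not from this thread): the function name and the fallback
# PBS_HOME path are illustrative; adjust them for your site.

find_job_output() {
    jobnum=$1                                      # numeric part of the job ID, e.g. 1234
    workdir=${2:-${PBS_O_WORKDIR:-$PWD}}           # directory the job was submitted from
    pbs_home=${3:-${PBS_HOME:-/var/spool/torque}}  # assumed fallback location

    # Places 1 and 2: delivered files, named <jobname>.o<N> and <jobname>.e<N>
    for dir in "$workdir" "$HOME"; do
        ls -d "$dir"/*.o"$jobnum" "$dir"/*.e"$jobnum" 2>/dev/null || true
    done

    # Place 3: undelivered files on the Mother Superior, still named *.OU / *.ER
    for dir in "$pbs_home/spool" "$pbs_home/undelivered"; do
        ls -d "$dir/$jobnum".*.OU "$dir/$jobnum".*.ER 2>/dev/null || true
    done
    return 0
}

# Example call (run the spool check on the Mother Superior node):
#   find_job_output 1234
```

The spool and undelivered directories are normally readable only by root, so that part of the check may need an administrator.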
>----------
> From: "Gus Correa" <gus at ldeo.columbia.edu>
> Sent: 2008-12-03 02:01:56
> To: "Torque Users" <torqueusers at supercluster.org>
> Cc: zhyang at lzu.edu.cn
> Subject: Re: [torqueusers] why not have error file and o file when job is finished
> Hi Zhyang and list
>
> On different occasions, with different versions of PBS,
> different PBS scripts, different computers and clusters,
> different NFS, local disk, etc,
> I found the *.o and *.e files on:
>
> 1) The work directory $PBS_O_WORKDIR
>
> 2) The user home directory.
> When the script doesn't cd to $PBS_O_WORKDIR.
> The home dir may be on the master node or the compute node, if there are
> multiple home directories on each node.
>
> 3) On the "Mother Superior" node in:
>
> $PBS_HOME/spool
>
> or in:
>
> $PBS_HOME/undelivered
>
> This indicates a problem and the files are still named *.ER and *.OU.
> Not necessarily a job failure, maybe a glitch in NFS, or something else.
>
> $PBS_HOME is wherever Torque/PBS is installed.
> The "Mother Superior" is the first node on the $PBS_NODEFILE list of
> each job.
>
> I hope this helps.
> Gus Correa
>
> ---------------------------------------------------------------------
> Gustavo Correa, PhD - Email: gus at ldeo.columbia.edu
> Lamont-Doherty Earth Observatory - Columbia University
> P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
> Garrick wrote:
>
> > Look in syslog on the node where the job executed, and in the email
> > that may have been sent to the user.
> >
> > HPCC/Linux Systems Admin
> >
> > On Dec 1, 2008, at 5:56 PM, zhyang at lzu.edu.cn wrote:
> >
> >> I looked at the frontend syslog and the pbs_server log. There seems to be
> >> no information about this error. Other accounts are all right; only one
> >> account runs into this problem.
> >>
> >>> From: "Garrick Staples" <garrick at usc.edu>
> >>> Sent: 2008-12-02 09:56:32
> >>> To: torqueusers at supercluster.org
> >>> Subject: Re: [torqueusers] why not have error file and o file when job is finished
> >>>
> >>> On Tue, Dec 02, 2008 at 09:24:07AM +0800, zhyang at lzu.edu.cn alleged:
> >>>
> >>>> Hi
> >>>>
> >>>> I recently found that when my job finishes, I do not get any output
> >>>> files, such as job.e* or job.o*. I know that for a job that finishes
> >>>> normally, Torque produces two files, the e file and the o file. Can
> >>>> anyone give me some suggestions? Thanks!
> >>>
> >>> Look in the syslog of the node where your job ran.
> >>>
> >>> --
> >>> Garrick Staples, GNU/Linux HPCC SysAdmin
> >>> University of Southern California
> >>>
> >>> Revoke LDS Church 501(c)(3) Status - http://lds501c3.wordpress.com/
> >>>
> >>
> >> --
> >> Lan Zhou University
> >> Email:zhyang at lzu.edu.cn
> >> _______________________________________________
> >> torqueusers mailing list
> >> torqueusers at supercluster.org
> >> http://www.supercluster.org/mailman/listinfo/torqueusers
>
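
On point 2 of the quoted message above (scripts that do not cd to $PBS_O_WORKDIR): one way to avoid guessing where the files will land is to name them explicitly in the job script. This is a minimal sketch, not the script used in this thread. The job name and file names are placeholders, and it relies on the qsub behaviour that a relative -o or -e path is taken relative to the directory where qsub was run.

```shell
#!/bin/sh
#PBS -N myjob
#PBS -o myjob.out
#PBS -e myjob.err
# The three directives above are placeholders: -N names the job,
# -o and -e name the stdout and stderr files.

# PBS sets PBS_O_WORKDIR to the directory where qsub was run.
# Fall back to the current directory so the script also runs outside PBS.
cd "${PBS_O_WORKDIR:-.}" || exit 1

# Replace this line with the real work of the job.
echo "job running in $(pwd) on $(uname -n)"
```

If the named files still do not appear after the job ends, the spool and undelivered directories mentioned in the quoted message are the next place to look.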
--
Lan Zhou University
Email:zhyang at lzu.edu.cn