[torqueusers] why not have error file and o file when job is finished

zhyang at lzu.edu.cn zhyang at lzu.edu.cn
Tue Dec 2 19:16:45 MST 2008


Hi Gus Correa,
Thanks for your detailed explanation. Following what you said, I checked all the directories and found the problem. Now it is OK!

Thanks again.


>----------
> From: "Gus Correa" <gus at ldeo.columbia.edu>
> Date: 2008-12-03 02:01:56
> To: "Torque Users" <torqueusers at supercluster.org>
> Cc: zhyang at lzu.edu.cn
> Subject: Re: [torqueusers] why not have error file and o file when job is finished
> Hi Zhyang and list
>
> On different occasions, with different versions of PBS, different PBS
> scripts, different computers and clusters, different NFS, local disk, etc.,
> I found the *.o and *.e files in:
>
> 1) The work directory, $PBS_O_WORKDIR.
>
> 2) The user's home directory, when the script doesn't cd to
> $PBS_O_WORKDIR. The home directory may be on the master node or on the
> compute node, if the nodes have separate home directories.
>
> 3) On the "Mother Superior" node, in:
>
> $PBS_HOME/spool
>
> or in:
>
> $PBS_HOME/undelivered
>
> This indicates a problem, and the files are still named *.ER and *.OU.
> It is not necessarily a job failure; it may be a glitch in NFS, or
> something else.
>
> $PBS_HOME is wherever Torque/PBS is installed.
> The "Mother Superior" is the first node on the $PBS_NODEFILE list of
> each job.
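
The locations listed above can be checked in one pass. What follows is only a minimal Python sketch, not part of Torque: the function names `candidate_dirs` and `find_job_files` are hypothetical, and the file-name patterns are assumptions (delivered files ending in `.o<seq>` / `.e<seq>`, undelivered ones named `<seq>.<server>.OU` / `<seq>.<server>.ER`, where `<seq>` is the numeric part of the job id). Adjust them to whatever your site actually produces.

```python
from pathlib import Path


def candidate_dirs(workdir, home, pbs_home):
    """Return the directories from the list above, in the same order."""
    pbs_home = Path(pbs_home)
    return [
        Path(workdir),             # 1) $PBS_O_WORKDIR
        Path(home),                # 2) the user's home directory
        pbs_home / "spool",        # 3) on the Mother Superior node
        pbs_home / "undelivered",  # 3) on the Mother Superior node
    ]


def find_job_files(seq, dirs):
    """Look in each directory for output files of job number `seq`.

    The patterns below are assumptions about file naming, not a Torque API.
    """
    patterns = [
        f"*.o{seq}",     # delivered stdout, e.g. job.o123
        f"*.e{seq}",     # delivered stderr, e.g. job.e123
        f"{seq}.*.OU",   # undelivered stdout, still in spool/undelivered
        f"{seq}.*.ER",   # undelivered stderr, still in spool/undelivered
    ]
    found = []
    for d in dirs:
        if not d.is_dir():
            continue  # directory does not exist on this host
        for pattern in patterns:
            found.extend(sorted(d.glob(pattern)))
    return found
```

A call might look like `find_job_files("123", candidate_dirs(workdir, home, pbs_home))`, with the three paths taken from your own job and installation. The two $PBS_HOME directories only make sense when the script runs on the Mother Superior node itself, and they may be readable only by root.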
>
> I hope this helps.
> Gus Correa
>
> ---------------------------------------------------------------------
> Gustavo Correa, PhD - Email: gus at ldeo.columbia.edu
> Lamont-Doherty Earth Observatory - Columbia University
> P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
> Garrick wrote:
>
> > Look in syslog on the node where the job executed, and in the email
> > that may have been sent to the user.
> >
> > HPCC/Linux Systems Admin
> >
> > On Dec 1, 2008, at 5:56 PM, zhyang at lzu.edu.cn wrote:
> >
> >> I checked the frontend syslog and the pbs_server log, but there seems
> >> to be no information about this error. Other accounts are all right;
> >> only one account runs into this problem.
> >>
> >>> "Garrick Staples" <garrick at usc.edu>
> >>> 2008-12-02 09:56:32
> >>> torqueusers at supercluster.org
> >>>
> >>> Re: [torqueusers] why not have error file and o file when job is
> >>> finished
> >>> On Tue, Dec 02, 2008 at 09:24:07AM +0800, zhyang at lzu.edu.cn alleged:
> >>>
> >>>> Hi
> >>>>
> >>>> I recently found that when my job finishes, I do not get any output
> >>>> files, such as job.e* or job.o*. I know that for a job that finishes
> >>>> normally, Torque produces two files, an e file and an o file. Can
> >>>> anyone give me some suggestions? Thanks!
> >>>
> >>> Look in the syslog of the node where your job ran.
> >>>
> >>> -- 
> >>> Garrick Staples, GNU/Linux HPCC SysAdmin
> >>> University of Southern California
> >>>
> >>> Revoke LDS Church 501(c)(3) Status - http://lds501c3.wordpress.com/
> >>>
> >>
> >> -- 
> >> Lan Zhou University
> >> Email:zhyang at lzu.edu.cn
> >> _______________________________________________
> >> torqueusers mailing list
> >> torqueusers at supercluster.org
> >> http://www.supercluster.org/mailman/listinfo/torqueusers

--
Lan Zhou University
Email:zhyang at lzu.edu.cn

