[torqueusers] why not have error file and o file when job is
finished
Gus Correa
gus at ldeo.columbia.edu
Tue Dec 2 11:01:56 MST 2008
Hi Zhyang and list
On different occasions, with different versions of PBS,
different PBS scripts, different computers and clusters,
and different storage setups (NFS, local disk, etc.),
I have found the *.o and *.e files in:
1) The work directory $PBS_O_WORKDIR
2) The user's home directory,
when the script doesn't cd to $PBS_O_WORKDIR.
The home dir may be on the master node or on the compute node,
if each node has its own home directories.
3) On the "Mother Superior" node in:
$PBS_HOME/spool
or in:
$PBS_HOME/undelivered
Finding them there indicates a problem; the files are still named *.ER and *.OU.
It is not necessarily a job failure; it may be a glitch in NFS, or something else.
$PBS_HOME is the Torque/PBS home directory chosen at installation (often /var/spool/torque).
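
As a rough illustration of the checklist above, here is a minimal Python sketch that looks for a job's files in those places. The function name find_job_output and its arguments are hypothetical, not part of Torque. It assumes delivered files end in .o<number> / .e<number>, and that undelivered files start with the job number (something like 123.master.OU), which may differ on your installation. The caller passes all directories explicitly; the spool directories on the Mother Superior usually need root access to read.

```python
import glob
import os


def find_job_output(job_number, workdir, home, pbs_home):
    """Return existing output/error files for a job, searched in the
    work dir, the home dir, and the spool/undelivered dirs under pbs_home."""
    delivered = [f"*.o{job_number}", f"*.e{job_number}"]
    # Assumption: undelivered files are named <jobnumber>.<something>OU / ER.
    undelivered = [f"{job_number}.*OU", f"{job_number}.*ER"]
    searches = [
        (workdir, delivered),
        (home, delivered),
        (os.path.join(pbs_home, "spool"), undelivered),
        (os.path.join(pbs_home, "undelivered"), undelivered),
    ]
    found = []
    for directory, patterns in searches:
        for pattern in patterns:
            for path in sorted(glob.glob(os.path.join(directory, pattern))):
                if path not in found:  # workdir and home may be the same dir
                    found.append(path)
    return found
```

An empty list from this sketch only means nothing was found in these four places on the machine where it ran; it has to be run on the node that actually holds the directory in question.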
The "Mother Superior" is the first node on the $PBS_NODEFILE list of
each job.
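
To see which node that is from inside a running job, something like the following Python sketch could be used. The helper name mother_superior is made up for illustration; it only assumes that the node file is plain text with one host name per line, and it returns the first non-empty one.

```python
def mother_superior(nodefile_path):
    """Return the first non-empty host name listed in the node file,
    or None if the file lists no hosts."""
    with open(nodefile_path) as nodefile:
        for line in nodefile:
            host = line.strip()
            if host:
                return host
    return None
```

Inside a job script one would call it as mother_superior(os.environ["PBS_NODEFILE"]), after importing os; outside a job that variable is normally not set.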
I hope this helps.
Gus Correa
---------------------------------------------------------------------
Gustavo Correa, PhD - Email: gus at ldeo.columbia.edu
Lamont-Doherty Earth Observatory - Columbia University
P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------
Garrick wrote:
> Look in syslog on the node where the job executed, and in the email
> that may have been sent to the user.
>
> HPCC/Linux Systems Admin
>
> On Dec 1, 2008, at 5:56 PM, zhyang at lzu.edu.cn wrote:
>
>> I checked the frontend syslog and the pbs_server log; there seems to be
>> no information about this error. Other accounts are all right. Only one
>> account runs into this problem.
>>
>>
>>> "Garrick Staples" <garrick at usc.edu>
>>> 2008-12-02 09:56:32
>>> torqueusers at supercluster.org
>>>
>>> Re: [torqueusers] why not have error file and o file when job is
>>> finished
>>> On Tue, Dec 02, 2008 at 09:24:07AM +0800, zhyang at lzu.edu.cn alleged:
>>>
>>>> Hi
>>>>
>>>> I recently found that when my job finishes, I do not have any output
>>>> files, such as job.e* or job.o*. I know that for a job that finishes
>>>> normally, Torque gives two files, an e file and an o file. Can anyone
>>>> give me some suggestions? Thanks!
>>>
>>> Look in the syslog of the node where your job ran.
>>>
>>>
>>>
>>> --
>>>
>>> Garrick Staples, GNU/Linux HPCC SysAdmin
>>>
>>> University of Southern California
>>
>> --
>>
>>
>>
>> Lan Zhou University
>>
>> Email:zhyang at lzu.edu.cn
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers