[torqueusers] regarding tracejob command

Andrus, Brian Contractor bdandrus at nps.edu
Fri Oct 25 15:11:16 MDT 2013


Pankaj,

It looks to me like it is working.
It went back 6 days from today and checked every log file under the /archive directory that would cover that range.
The job it did find started more than 6 days ago, so its earlier log entries fall outside that range and were skipped.
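
As a rough illustration of that search, the Python sketch below lists the files tracejob appears to look for, going by the error messages quoted further down this thread: one file named YYYYMMDD per day, in each of four subdirectories under the -p path. The function name candidate_log_paths and the fixed date are made up for the example, and this is only a guess at the behaviour, not tracejob's actual code.

```python
from datetime import date, timedelta

# Subdirectories that show up in the tracejob error output in this thread.
LOG_SUBDIRS = (
    "server_priv/accounting",
    "server_logs",
    "mom_logs",
    "sched_logs",
)


def candidate_log_paths(base, days, today):
    """Return the per-day log paths for `days` days, counting back from `today`."""
    paths = []
    for offset in range(days):
        stamp = (today - timedelta(days=offset)).strftime("%Y%m%d")
        for subdir in LOG_SUBDIRS:
            paths.append(f"{base}/{subdir}/{stamp}")
    return paths


if __name__ == "__main__":
    # Mirrors "tracejob -n 6 -p /archive <jobid>" as run on 24 Oct 2013.
    for path in candidate_log_paths("/archive", 6, date(2013, 10, 24)):
        print(path)
```

Anything logged on a day before the oldest date in that list is never opened, which would explain why only the tail end of the job shows up.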

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238




From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Pankaj Dorlikar
Sent: Thursday, October 24, 2013 2:12 AM
To: Torque Users Mailing List
Subject: Re: [torqueusers] regarding tracejob command

Hi,
Thanks for the reply. However, the -p option together with -n is not helping, i.e.:

tracejob -n 6 -p /archive 25829
/archive/server_priv/accounting/20131024: No such file or directory
/archive/server_logs/20131024: No such file or directory
/archive/mom_logs/20131024: No such file or directory
/archive/sched_logs/20131024: No such file or directory
/archive/server_priv/accounting/20131023: No such file or directory
/archive/server_logs/20131023: No such file or directory
/archive/mom_logs/20131023: No such file or directory
/archive/sched_logs/20131023: No such file or directory
/archive/server_priv/accounting/20131022: No such file or directory
/archive/server_logs/20131022: No such file or directory
/archive/mom_logs/20131022: No such file or directory
/archive/sched_logs/20131022: No such file or directory
/archive/server_priv/accounting/20131021: No such file or directory
/archive/server_logs/20131021: No such file or directory
/archive/mom_logs/20131021: No such file or directory
/archive/server_priv/accounting/20131020: No such file or directory
/archive/server_logs/20131020: No such file or directory
/archive/mom_logs/20131020: No such file or directory
/archive/sched_logs/20131020: No such file or directory
/archive/server_priv/accounting/20131019: No such file or directory
/archive/server_logs/20131019: No such file or directory
/archive/mom_logs/20131019: No such file or directory
/archive/sched_logs/20131019: No such file or directory

Job: 25829.pbs.server

10/21/2013 03:56:17  L    Exit_status=-10 resources_used.cput=202:47:37 resources_used.mem=1809384kb resources_used.vmem=4702624kb resources_used.walltime=48:30:57
10/21/2013 04:01:17  L    dequeuing from TESTq, state COMPLETE


On Wed, Oct 23, 2013 at 11:40 PM, Andrus, Brian Contractor <bdandrus at nps.edu> wrote:
Carles,

I think what we are saying is that you can use the -p option together with the -n option.
That way you can go back, for instance, 300 days, but it will only read the log files in the given path, which might contain, say, just the first 90 of those 300 days.
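
To make that concrete, here is a small Python sketch that works out the -n value needed to reach a job's start date and builds the matching tracejob command line. It assumes -n counts today as day one, which is what the output elsewhere in this thread suggests (-n 6 run on 24 Oct reached back to 19 Oct). The helper names, the job id, and the archive path are placeholders.

```python
from datetime import date


def days_back(start, today):
    """Smallest -n value whose range still includes `start`.

    Assumes -n counts today as day 1 (an inference from the thread's output).
    """
    if start > today:
        raise ValueError("start date is in the future")
    return (today - start).days + 1


def tracejob_command(job_id, start, today, archive):
    """Build a tracejob command line using only the -n and -p options."""
    return f"tracejob -n {days_back(start, today)} -p {archive} {job_id}"


if __name__ == "__main__":
    # Prints: tracejob -n 6 -p /archive 25829
    print(tracejob_command("25829", date(2013, 10, 19), date(2013, 10, 24), "/archive"))
```

If the archive path holds only the days the job actually ran, the large -n costs little, since the files for every other day are simply reported as missing.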


Brian Andrus



From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Pankaj Dorlikar
Sent: Wednesday, October 23, 2013 6:15 AM
To: Carles Acosta
Cc: Torque Users Mailing List
Subject: Re: [torqueusers] regarding tracejob command

Dear Carles,
Thanks for the reply.
Yes, we are aware of the flag, but if we were to trace a job that is 200 days old, then tracejob -n 200 <job-id> would scan all 200 days of log files, which we don't want. We have the start and end date/time of the job, so we are trying to find out how to give that day's pbs_server log file to tracejob directly as an argument, instead of having tracejob itself scan all the files.


On Wed, Oct 23, 2013 at 1:32 PM, Carles Acosta <cacosta at pic.es> wrote:
Dear Pankaj,

To search for older jobs, you have to use the "-n" option. For instance, "tracejob -n 3 $job_id" should search for the job in the log files of the last three days.

You can find more information here:

http://www.clusterresources.com/torquedocs21/11.1troubleshooting.shtml#tracejob

Regards,

Carles


On 10/23/2013 09:44 AM, Pankaj Dorlikar wrote:
Thanks a lot, sir, for the reply. It is very useful. However, the -p flag of tracejob always tries to look in the current day's log file only.
E.g. the job with id 258295 was started on 19th Oct 13 and ended on 21st Oct 13. I have included both files in the /archive/sched_logs/ directory, but it always looks for the current day's file only.

tracejob -p /archive/sched_logs/oct13 258295

/archive/sched_logs/oct13/server_priv/accounting/20131023: No such file or directory
/archive/sched_logs/oct13/server_logs/20131023: No such file or directory
/archive/sched_logs/oct13/mom_logs/20131023: No such file or directory
/archive/sched_logs/oct13/sched_logs/20131023: No such file or directory
Thanks for the help once again.


On Wed, Oct 23, 2013 at 2:28 AM, Andrus, Brian Contractor <bdandrus at nps.edu> wrote:
You may want to move logs to an archived area.
Then you could use the -p option to specify where they are.
Eg: tracejob -p /archive/torque/Jan2013 <jobid>

Then it would look at /archive/torque/Jan2013/*logs

This doesn't cover the corner case of a job that spans the dividing period, but it may work for you.

Otherwise, create an alias to a script that will grep as you need.
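
One possible shape for such a script, sketched in Python rather than as a shell alias: it reads only the per-day files between a known start and end date and prints the lines that mention the job id. The directory layout (server_priv/accounting, server_logs, mom_logs, sched_logs, one YYYYMMDD file per day) is taken from the tracejob error messages in this thread. The function names and the script name are hypothetical, and unlike tracejob it does not parse or sort the matched lines.

```python
import sys
from datetime import datetime, timedelta
from pathlib import Path

# Log subdirectories named in the tracejob error output in this thread.
LOG_SUBDIRS = ("server_priv/accounting", "server_logs", "mom_logs", "sched_logs")


def day_range(start, end):
    """Yield each date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def grep_job(base, job_id, start, end):
    """Return (path, line) pairs for lines containing job_id in the logs for start..end."""
    hits = []
    for day in day_range(start, end):
        stamp = day.strftime("%Y%m%d")
        for subdir in LOG_SUBDIRS:
            path = Path(base) / subdir / stamp
            if not path.is_file():
                continue  # skip days or log types that are not present
            with path.open(errors="replace") as handle:
                for line in handle:
                    if job_id in line:
                        hits.append((str(path), line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    if len(sys.argv) != 5:
        print("usage: grepjob.py BASE JOBID YYYYMMDD YYYYMMDD", file=sys.stderr)
    else:
        base, job_id, first, last = sys.argv[1:5]
        start = datetime.strptime(first, "%Y%m%d").date()
        end = datetime.strptime(last, "%Y%m%d").date()
        for path, line in grep_job(base, job_id, start, end):
            print(f"{path}: {line}")
```

The match is a plain substring test, so passing the full id (for example 25829.pbs.server rather than 25829) avoids also picking up longer job numbers that merely contain the same digits.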


Brian Andrus




From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Pankaj Dorlikar
Sent: Monday, October 21, 2013 10:19 PM
To: torqueusers
Subject: [torqueusers] regarding tracejob command

Hi,
We are running Torque version 2.5.8. How can we specify the pbs server log file as an argument to the tracejob command?

To see the details of old jobs using the tracejob command, -n <no. of days> needs to be provided, which traces through all the previous logs until it reaches that day. This takes a very long time if the job is very old.
To avoid this, if we know the job start date, can't we specify that day's log file as the argument to tracejob to see the job details?
Or is there any other solution to this issue?



--
Pankaj V. Dorlikar

_______________________________________________
torqueusers mailing list
torqueusers at supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers





--
Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 08
Fax: +34 93 581 41 10
http://www.pic.es
Avís - Aviso - Legal Notice: http://www.ifae.es/legal.html


