[Mauiusers] Output and error files are missing

Preethi Chockalingam cpreethi86 at yahoo.co.in
Tue Oct 30 06:27:11 MDT 2007


This is the output of the /var/log/messages file:
 
Oct 30 09:00:07 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/85.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e85' failed with status=1, giving up after 4 attempts
Oct 30 09:00:07 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/85.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e85
Oct 30 09:01:22 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/86.academyl.OU jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o86' failed with status=1, giving up after 4 attempts
Oct 30 09:01:22 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/86.academyl.OU to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o86
Oct 30 09:01:26 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/86.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e86' failed with status=1, giving up after 4 attempts
Oct 30 09:01:26 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/86.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e86
Oct 30 09:05:04 academylab3 nagios: SERVICE NOTIFICATION: admins;10.237.6.58;Mysql;CRITICAL;notify-by-email;#HY000Host 10.229.62.56 is not allowed to connect to this MySQL server
Oct 30 09:05:14 academylab3 nagios: SERVICE NOTIFICATION: admins;10.237.6.89;Mysql;CRITICAL;notify-by-email;#HY000Host 10.229.62.56 is not allowed to connect to this MySQL server
Oct 30 09:05:47 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/87.academyl.OU jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o87' failed with status=1, giving up after 4 attempts
Oct 30 09:05:47 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/87.academyl.OU to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o87
Oct 30 09:05:51 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/87.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e87' failed with status=1, giving up after 4 attempts
Oct 30 09:05:51 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/87.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e87
Oct 30 09:09:27 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/88.academyl.OU jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o88' failed with status=1, giving up after 4 attempts
Oct 30 09:09:27 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/88.academyl.OU to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o88
Oct 30 09:09:32 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/88.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e88' failed with status=1, giving up after 4 attempts
Oct 30 09:09:32 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/88.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e88

 
Oct 30 09:13:33 academylab3 sshd(pam_unix)[28875]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.229.60.143  user=jaya
Oct 30 09:13:42 academylab3 sshd(pam_unix)[28880]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.229.60.143  user=jaya
Oct 30 09:13:46 academylab3 sshd(pam_unix)[28885]: session opened for user jaya by (uid=0)
Oct 30 09:29:54 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/89.academyl.OU jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o89' failed with status=1, giving up after 4 attempts
Oct 30 09:29:54 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/89.academyl.OU to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o89
Oct 30 09:29:58 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/89.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e89' failed with status=1, giving up after 4 attempts
Oct 30 09:29:58 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/89.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e89
Oct 30 09:30:14 academylab3 nagios: Auto-save of retention data completed successfully.
Oct 30 09:30:17 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/90.academyl.OU jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o90' failed with status=1, giving up after 4 attempts
Oct 30 09:30:17 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/90.academyl.OU to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o90
Oct 30 09:30:21 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/90.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e90' failed with status=1, giving up after 4 attempts
Oct 30 09:30:21 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/90.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e90
Oct 30 09:37:02 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/91.academyl.OU jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o91' failed with status=1, giving up after 4 attempts
Oct 30 09:37:02 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/91.academyl.OU to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o91
Oct 30 09:37:06 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/91.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e91' failed with status=1, giving up after 4 attempts
Oct 30 09:37:06 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/91.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e91
Oct 30 09:49:39 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/92.academyl.OU jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o92' failed with status=1, giving up after 4 attempts
Oct 30 09:49:39 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/92.academyl.OU to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.o92
Oct 30 09:49:44 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/92.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e92' failed with status=1, giving up after 4 attempts
Oct 30 09:49:44 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/92.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/trial.e92
Oct 30 10:08:14 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/93.academyl.OU jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/loop.out' failed with status=1, giving up after 4 attempts
Oct 30 10:08:14 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/93.academyl.OU to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/loop.out
Oct 30 10:08:18 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/93.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/loop.error' failed with status=1, giving up after 4 attempts
Oct 30 10:08:18 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/93.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/loop.error
Oct 30 10:08:49 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/94.academyl.OU jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/loop.out' failed with status=1, giving up after 4 attempts
Oct 30 10:08:49 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/94.academyl.OU to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/loop.out
Oct 30 10:08:53 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/94.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/loop.error' failed with status=1, giving up after 4 attempts
Oct 30 10:08:53 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/94.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/loop.error
Oct 30 10:11:27 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/95.academyl.OU jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/try.o95' failed with status=1, giving up after 4 attempts
Oct 30 10:11:27 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/95.academyl.OU to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/try.o95
Oct 30 10:11:31 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/95.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/try.e95' failed with status=1, giving up after 4 attempts
Oct 30 10:11:31 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/95.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/try.e95
Oct 30 10:22:36 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/96.academyl.OU jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/try.o96' failed with status=1, giving up after 4 attempts
Oct 30 10:22:36 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/96.academyl.OU to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/try.o96
Oct 30 10:22:40 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/torque/spool/96.academyl.ER jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/try.e96' failed with status=1, giving up after 4 attempts
Oct 30 10:22:40 academylab3 pbs_mom: req_cpyfile, Unable to copy file /var/spool/torque/spool/96.academyl.ER to jaya at academylab2.ctc.com:/home/jaya/torque-2.1.6/try.e96
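
Every failure above is the same pbs_mom scp call. The -B flag puts scp in batch mode, so it cannot prompt for a password and fails unless key-based login already works for the job owner. The following is only a rough sketch for pulling the distinct failing commands out of the log so one can be re-run by hand. The function name failed_scp_commands is made up for illustration, and it assumes the exact syslog wording shown above.

```sh
# Print each distinct scp command that pbs_mom gave up on (syslog on stdin).
# Assumption: lines look like the "sys_copy, command '...' failed" entries above.
failed_scp_commands() {
    sed -n "s/.*sys_copy, command '\(.*\)' failed with status=.*/\1/p" |
        # The list archive rewrites "user@host" as "user at host"; undo that
        # for the remote target only (a no-op on the real log file).
        sed 's/\([^ ]*\) at \([^ ]*:\)/\1@\2/' |
        sort -u
}

# Possible use on the pbs_mom node:
#   failed_scp_commands < /var/log/messages
# Then, logged in as the job owner, run one printed command by hand
# to see scp's own error message.
```

Running the command as the job owner rather than as root matters, since that is presumably the account whose keys and known_hosts entries the copy depends on.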




----- Original Message ----
From: Alexander Piavka <piavka at cs.bgu.ac.il>
To: Preethi Chockalingam <cpreethi86 at yahoo.co.in>
Cc: rishi pathak <mailmaverick666 at gmail.com>; mauiusers at supercluster.org
Sent: Tuesday, 30 October, 2007 4:16:56 PM
Subject: Re: [Mauiusers] Output and error files are missing

On Tue, 30 Oct 2007, Preethi Chockalingam wrote:

> Hi,
>
> There are no error messages regarding scp or ssh on the pbs_mom node.

  So you don't have any errors in /var/log/messages on the pbs_mom node regarding scp/ssh?

  On the pbs_mom node, do you have the output and error files of your job in
/var/spool/pbs/undelivered
or /var/spool/pbs/spool while the job is in the E state and after it leaves the
queue?
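
A quick way to check this, as a sketch only: the helper name leftover_job_files is hypothetical, and it assumes the <jobid>.<server>.OU / .ER naming seen in the syslog lines at the top of this message. Note that the log there shows the spool under /var/spool/torque rather than /var/spool/pbs, so the directories are passed as arguments.

```sh
# List stdout/stderr files still sitting on the pbs_mom node for one job.
# $1 = numeric job id (e.g. 85); remaining arguments = directories to search.
leftover_job_files() {
    jobid=$1
    shift
    for dir in "$@"; do
        # An unmatched glob stays literal, so test that the file really exists.
        for f in "$dir/$jobid".*.OU "$dir/$jobid".*.ER; do
            [ -e "$f" ] && printf '%s\n' "$f"
        done
    done
    return 0
}

# Possible use (directories are assumptions based on this thread):
#   leftover_job_files 85 /var/spool/torque/spool /var/spool/torque/undelivered
```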


> Job status displays the names of the error and output paths, but I don't find the files in the specified paths.

You can add:
qmgr  -c "set queue NAME keep_completed = 600"
so that when the job completes it is still tracked for 10 minutes

and after the job completes run 'qstat -f jid'
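
For instance, the two path attributes can be picked out of the full job status like this. This is a minimal sketch: job_output_paths is an illustrative name, and it assumes the usual "Attribute = value" layout of 'qstat -f' output.

```sh
# Print the Output_Path and Error_Path attributes from 'qstat -f' output on stdin.
# Caveat: qstat wraps long values onto continuation lines; only the first
# line of each value is printed here.
job_output_paths() {
    awk '$1 == "Output_Path" || $1 == "Error_Path" { print $1, $3 }'
}

# Possible use, once keep_completed is set and the job has finished:
#   qstat -f jid | job_output_paths
```

The host:path printed for each attribute is where pbs_mom will try to deliver the file, so it can be compared with the place where the files are expected.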


>
> Thanks
> -Preethi
>
>
> ----- Original Message ----
> From: Alexander Piavka <piavka at cs.bgu.ac.il>
> To: Preethi Chockalingam <cpreethi86 at yahoo.co.in>
> Cc: rishi pathak <mailmaverick666 at gmail.com>; mauiusers at supercluster.org
> Sent: Tuesday, 30 October, 2007 1:43:07 PM
> Subject: Re: [Mauiusers] Output and error files are missing
>
>
>  Look for scp/ssh errors in the syslog messages on the pbs_mom node.
> What does 'qstat -f jid' give?
>
>
> On Tue, 30 Oct 2007, Preethi Chockalingam wrote:
>
>> Hi Rishi,
>>
>> I checked my pbs_server and mom logs. I don't find any errors.
>> I am able to scp from all nodes in the cluster to the server node, but the output and error files are still not created.
>> What else do you think could be wrong?
>>
>> Thanks,
>> -Preethi
>>
>> ----- Original Message ----
>> From: rishi pathak <mailmaverick666 at gmail.com>
>> To: Preethi Chockalingam <cpreethi86 at yahoo.co.in>
>> Cc: mauiusers at supercluster.org
>> Sent: Tuesday, 30 October, 2007 11:36:22 AM
>> Subject: Re: [Mauiusers] Output and error files are missing
>>
>> Hi,
>>  Check your mom logs and pbs_server logs for 'post job file processing error'.
>> Also check whether you can rsh/rcp (as a cluster user) from any compute node to the node where pbs_server is running.
>> This is not related to Maui.
>>
>> I suggest you post the mom_logs and server_logs for better identification of the problem.
>>
>>
>>
>> On 10/30/07, Preethi Chockalingam <cpreethi86 at yahoo.co.in> wrote:
>> Hi all,
>>
>> I have been integrating Maui and Torque. When I submit jobs through Torque they appear in state 'E' and then leave the queue.
>>
>> I am not able to find the output and input files anywhere.
>>
>> Any suggestions on this please??
>>
>> Thanks in Advance,
>> Preethi.C
>>
>>
>>
>> --
>> Regards--
>> Rishi Pathak
>>

