[Mauiusers] jobs not run when scheduled by maui

Glen Beane beaneg at umcs.maine.edu
Tue Dec 14 13:27:24 MST 2004


I just tried switching a small cluster over to Maui from the standard 
TORQUE FIFO scheduler.

Jobs run fine with the standard scheduler, but when Maui instructs 
them to run, the following happens:

(this is from the mother superior mom log)

12/14/2004 15:12:24;0008;   pbs_mom;Job;3564.kearney.clusters.umaine.edu;Job Modified at request of PBS_Server at kearney.clusters.umaine.edu
12/14/2004 15:12:26;0008;   pbs_mom;Job;3564.kearney.clusters.umaine.edu;Started, pid = 5718
12/14/2004 15:12:26;0001;   pbs_mom;Svr;pbs_mom;tm_eof, matching task located, marking interface closed
12/14/2004 15:12:26;0008;   pbs_mom;Job;3564.kearney.clusters.umaine.edu;start_process: task started, tid 2, sid 5759, cmd /bin/sh
12/14/2004 15:12:26;0008;   pbs_mom;Job;3564.kearney.clusters.umaine.edu;start_process: task started, tid 3, sid 5760, cmd /bin/sh
12/14/2004 15:12:27;0001;   pbs_mom;Svr;pbs_mom;tm_eof, matching task located, marking interface closed
12/14/2004 15:12:27;0008;   pbs_mom;Job;3564.kearney.clusters.umaine.edu;kill_task: killing pid 5722 task 1 with sig 9
12/14/2004 15:12:32;0008;   pbs_mom;Job;3564.kearney.clusters.umaine.edu;kill_task: killing pid 5758 task 1 with sig 9
12/14/2004 15:12:32;0008;   pbs_mom;Job;3564.kearney.clusters.umaine.edu;Terminated
12/14/2004 15:12:32;0001;   pbs_mom;Svr;pbs_mom;task_check, cannot tm_reply to 3564.kearney.clusters.umaine.edu task 1
12/14/2004 15:12:32;0001;   pbs_mom;Svr;pbs_mom;task_check, cannot tm_reply to 3564.kearney.clusters.umaine.edu task 1
12/14/2004 15:12:32;0001;   pbs_mom;Svr;pbs_mom;task_check, cannot tm_reply to 3564.kearney.clusters.umaine.edu task 1
12/14/2004 15:12:32;0001;   pbs_mom;Svr;pbs_mom;task_check, cannot tm_reply to 3564.kearney.clusters.umaine.edu task 1
12/14/2004 15:12:32;0001;   pbs_mom;Svr;pbs_mom;task_check, cannot tm_reply to 3564.kearney.clusters.umaine.edu task 1
12/14/2004 15:12:32;0001;   pbs_mom;Svr;pbs_mom;task_check, cannot tm_reply to 3564.kearney.clusters.umaine.edu task 1
12/14/2004 15:12:32;0001;   pbs_mom;Svr;pbs_mom;task_check, cannot tm_reply to 3564.kearney.clusters.umaine.edu task 1
12/14/2004 15:12:32;0001;   pbs_mom;Svr;pbs_mom;task_check, cannot tm_reply to 3564.kearney.clusters.umaine.edu task 1
.
.
.
12/14/2004 15:12:58;0008;   pbs_mom;Job;3564.kearney.clusters.umaine.edu;kill_task: killing pid 5759 task 2 with sig 9
12/14/2004 15:13:03;0008;   pbs_mom;Job;3564.kearney.clusters.umaine.edu;kill_task: killing pid 5760 task 3 with sig 9
12/14/2004 15:13:03;0080;   pbs_mom;Job;3564.kearney.clusters.umaine.edu;Obit sent
12/14/2004 15:13:04;0080;   pbs_mom;Job;3564.kearney.clusters.umaine.edu;Obit sent
12/14/2004 15:13:04;0001;   pbs_mom;Req;obit reply;Job not found for obit reply


This is what happens when the job is started using the standard scheduler:

12/14/2004 15:23:26;0008;   pbs_mom;Job;3566.kearney.clusters.umaine.edu;Started, pid = 5592
12/14/2004 15:23:26;0001;   pbs_mom;Svr;pbs_mom;tm_eof, matching task located, marking interface closed
12/14/2004 15:23:26;0008;   pbs_mom;Job;3566.kearney.clusters.umaine.edu;start_process: task started, tid 2, sid 5633, cmd /bin/sh
12/14/2004 15:23:26;0008;   pbs_mom;Job;3566.kearney.clusters.umaine.edu;start_process: task started, tid 3, sid 5634, cmd /bin/sh
12/14/2004 15:23:27;0001;   pbs_mom;Svr;pbs_mom;tm_eof, matching task located, marking interface closed

(then the job is happily running)
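
To compare the two excerpts side by side, it can help to pull out only the
kill_task records for each job. The following is a minimal sketch, not part
of TORQUE or Maui: it assumes each pbs_mom record has six semicolon-separated
fields as in the lines above, and the field labels and the function names
(join_wrapped, parse_record, kill_events) are made up for illustration. It
also rejoins records that a mail client has wrapped across lines.

```python
import re

# A record starts with a timestamp such as "12/14/2004 15:12:26;".
RECORD_START = re.compile(r"^\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2};")


def join_wrapped(lines):
    """Rejoin log records that were hard-wrapped across several lines."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line or set(line) == {"."}:
            continue  # skip blank lines and "." elision markers
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Continuation of the previous record.
            records[-1] = records[-1] + " " + line
    return records


def parse_record(record):
    """Split one record into six fields; return None if it does not fit.

    The field labels are assumptions chosen for this sketch.
    """
    parts = [p.strip() for p in record.split(";", 5)]
    if len(parts) != 6:
        return None
    keys = ("timestamp", "code", "daemon", "class", "name", "message")
    return dict(zip(keys, parts))


def kill_events(lines):
    """Return (timestamp, job id, message) for every kill_task record."""
    events = []
    for record in join_wrapped(lines):
        fields = parse_record(record)
        if fields is None or fields["class"] != "Job":
            continue
        if fields["message"].startswith("kill_task"):
            events.append(
                (fields["timestamp"], fields["name"], fields["message"])
            )
    return events


if __name__ == "__main__":
    import sys

    # Usage: python kill_events.py < mom_log_excerpt.txt
    for timestamp, job, message in kill_events(sys.stdin):
        print(timestamp, job, message)
```

Run against the first excerpt, this lists the kill_task lines for job 3564;
run against the second, it lists nothing, since job 3566 was never killed in
the portion shown.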


