[Mauiusers] jobs not run when scheduled by maui
Hyrum Carroll
hyrum at clusterresources.com
Wed Dec 15 09:55:20 MST 2004
Glen,
I need some more information to diagnose the problem. Please send me
the output of qmgr -c "p server", your maui.cfg, and a maui.log file
recorded with LOGLEVEL set to 7.
Best regards,
Hyrum Carroll
Cluster Resources, Inc.
On Tue, 2004-12-14 at 13:27, Glen Beane wrote:
> I just tried switching a small cluster over to Maui from the standard
> TORQUE FIFO scheduler.
>
> Jobs run fine with the standard scheduler, but when Maui instructs
> them to run, the following happens:
>
> (this is from the mother superior mom log)
>
> 12/14/2004 15:12:24;0008;
> pbs_mom;Job;3564.kearney.clusters.umaine.edu;Job Modified at request of
> PBS_Server at kearney.clusters.umaine.edu
> 12/14/2004 15:12:26;0008;
> pbs_mom;Job;3564.kearney.clusters.umaine.edu;Started, pid = 5718
> 12/14/2004 15:12:26;0001; pbs_mom;Svr;pbs_mom;tm_eof, matching task
> located, marking interface closed
> 12/14/2004 15:12:26;0008;
> pbs_mom;Job;3564.kearney.clusters.umaine.edu;start_process: task
> started, tid 2, sid 5759, cmd /bin/sh
> 12/14/2004 15:12:26;0008;
> pbs_mom;Job;3564.kearney.clusters.umaine.edu;start_process: task
> started, tid 3, sid 5760, cmd /bin/sh
> 12/14/2004 15:12:27;0001; pbs_mom;Svr;pbs_mom;tm_eof, matching task
> located, marking interface closed
> 12/14/2004 15:12:27;0008;
> pbs_mom;Job;3564.kearney.clusters.umaine.edu;kill_task: killing pid
> 5722 task 1 with sig 9
> 12/14/2004 15:12:32;0008;
> pbs_mom;Job;3564.kearney.clusters.umaine.edu;kill_task: killing pid
> 5758 task 1 with sig 9
> 12/14/2004 15:12:32;0008;
> pbs_mom;Job;3564.kearney.clusters.umaine.edu;Terminated
> 12/14/2004 15:12:32;0001; pbs_mom;Svr;pbs_mom;task_check, cannot
> tm_reply to 3564.kearney.clusters.umaine.edu task 1
> 12/14/2004 15:12:32;0001; pbs_mom;Svr;pbs_mom;task_check, cannot
> tm_reply to 3564.kearney.clusters.umaine.edu task 1
> 12/14/2004 15:12:32;0001; pbs_mom;Svr;pbs_mom;task_check, cannot
> tm_reply to 3564.kearney.clusters.umaine.edu task 1
> 12/14/2004 15:12:32;0001; pbs_mom;Svr;pbs_mom;task_check, cannot
> tm_reply to 3564.kearney.clusters.umaine.edu task 1
> 12/14/2004 15:12:32;0001; pbs_mom;Svr;pbs_mom;task_check, cannot
> tm_reply to 3564.kearney.clusters.umaine.edu task 1
> 12/14/2004 15:12:32;0001; pbs_mom;Svr;pbs_mom;task_check, cannot
> tm_reply to 3564.kearney.clusters.umaine.edu task 1
> 12/14/2004 15:12:32;0001; pbs_mom;Svr;pbs_mom;task_check, cannot
> tm_reply to 3564.kearney.clusters.umaine.edu task 1
> 12/14/2004 15:12:32;0001; pbs_mom;Svr;pbs_mom;task_check, cannot
> tm_reply to 3564.kearney.clusters.umaine.edu task 1
> .
> .
> .
> 12/14/2004 15:12:58;0008;
> pbs_mom;Job;3564.kearney.clusters.umaine.edu;kill_task: killing pid
> 5759 task 2 with sig 9
> 12/14/2004 15:13:03;0008;
> pbs_mom;Job;3564.kearney.clusters.umaine.edu;kill_task: killing pid
> 5760 task 3 with sig 9
> 12/14/2004 15:13:03;0080;
> pbs_mom;Job;3564.kearney.clusters.umaine.edu;Obit sent
> 12/14/2004 15:13:04;0080;
> pbs_mom;Job;3564.kearney.clusters.umaine.edu;Obit sent
> 12/14/2004 15:13:04;0001; pbs_mom;Req;obit reply;Job not found for
> obit reply
>
>
> This is what happens when the job is started by the standard scheduler:
>
> 12/14/2004 15:23:26;0008;
> pbs_mom;Job;3566.kearney.clusters.umaine.edu;Started, pid = 5592
> 12/14/2004 15:23:26;0001; pbs_mom;Svr;pbs_mom;tm_eof, matching task
> located, marking interface closed
> 12/14/2004 15:23:26;0008;
> pbs_mom;Job;3566.kearney.clusters.umaine.edu;start_process: task
> started, tid 2, sid 5633, cmd /bin/sh
> 12/14/2004 15:23:26;0008;
> pbs_mom;Job;3566.kearney.clusters.umaine.edu;start_process: task
> started, tid 3, sid 5634, cmd /bin/sh
> 12/14/2004 15:23:27;0001; pbs_mom;Svr;pbs_mom;tm_eof, matching task
> located, marking interface closed
>
> (then the job is happily running)
>
> _______________________________________________
> mauiusers mailing list
> mauiusers at supercluster.org
> http://supercluster.org/mailman/listinfo/mauiusers
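
To compare the two runs, it can help to pull just the kill_task records
out of the mom log. The following is a minimal sketch, not part of
TORQUE or Maui. It assumes each record has been re-joined onto one line
(the mail client wrapped them above) and that a record has six
';'-separated fields, which is inferred only from the quoted log lines.
The names parse_record and killed_tasks are hypothetical.

```python
import re
from collections import namedtuple

# Field layout inferred from the quoted pbs_mom log lines (an assumption):
# timestamp;event code;daemon;object class;object name;message
Record = namedtuple("Record", "timestamp event daemon kind name message")

KILL_RE = re.compile(r"kill_task: killing pid (\d+) task (\d+) with sig (\d+)")


def parse_record(line):
    """Split one unwrapped log record into its six fields, or return None."""
    parts = [p.strip() for p in line.split(";", 5)]
    if len(parts) != 6:
        return None
    return Record(*parts)


def killed_tasks(lines, job_id):
    """Return (timestamp, pid, task, signal) for each kill_task record of job_id."""
    hits = []
    for line in lines:
        rec = parse_record(line)
        if rec is None or rec.name != job_id:
            continue
        m = KILL_RE.search(rec.message)
        if m:
            hits.append((rec.timestamp, int(m.group(1)),
                         int(m.group(2)), int(m.group(3))))
    return hits


# Records copied from the log above, re-joined onto single lines.
sample = [
    "12/14/2004 15:12:26;0008; pbs_mom;Job;3564.kearney.clusters.umaine.edu;Started, pid = 5718",
    "12/14/2004 15:12:27;0001; pbs_mom;Svr;pbs_mom;tm_eof, matching task located, marking interface closed",
    "12/14/2004 15:12:27;0008; pbs_mom;Job;3564.kearney.clusters.umaine.edu;kill_task: killing pid 5722 task 1 with sig 9",
]

print(killed_tasks(sample, "3564.kearney.clusters.umaine.edu"))
```

Running the same function over the log from the standard scheduler
should return an empty list if no task was killed, which makes the
difference between the two runs easy to see.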