First of all, sorry for cross-posting this to multiple mailing lists.

I am running the SLURM testsuite with Maui configured, but I see many
FAILUREs, and srun does not respond for most options. Is this expected,
or should I fix something so that the SLURM testsuite runs smoothly?

Here is one example (I see many other failures like this):

TEST: 1.23
spawn /usr/bin/srun -N1 -l --mincpus=999999 -t1 hostname
srun: Job is in held state, pending scheduler release
srun: job 38 queued and waiting for resources

FAILURE: srun not responding

When I remove the --mincpus option, "srun -N1 -l -t1 hostname" works fine.

# cat slurm-maui-sles11.log |grep SUCCESS| wc -l

# cat slurm-maui-sles11.log |grep FAIL| wc -l

I couldn't finish the testsuite; it ran for more than 10 hours and kept
producing errors like these:

FAILURE: srun not responding
FAILURE: salloc not responding

Although Maui and SLURM seem to be working, I see this error in maui.log:

06/02 08:07:36 MRMCheckEvents()
06/02 08:07:36 ALERT:    cannot query events on RM (RM 'cluster-ib-5' does not support function 'rmeventquery')
06/02 08:07:36 MSUAcceptClient(5,ClientSD,HostName,TCP)
06/02 08:07:36 INFO:     accept call failed, errno: 11 (Resource temporarily unavailable)
06/02 08:07:36 INFO:     all clients connected.  servicing requests

# showq
ACTIVE JOBS--------------------

     0 Active Jobs       0 of    4 Processors Active (0.00%)

I would appreciate it if somebody could point me to the root of the
problem and clarify what is going on.

Thanks in advance.


Rafael Folco
Linux on Power
IBM Linux Technology Center
E-Mail: rfolco at linux.vnet.ibm.com
