[torqueusers] Torque 2.4.9 - Could not create cpuset (Was: Re: Torque 2.4.9 - job reported idle at time)

Ken Nielson knielson at adaptivecomputing.com
Wed Jul 28 09:17:31 MDT 2010


On 07/28/2010 02:23 AM, torqueusers at calcua.ua.ac.be wrote:
>> On Tue, 27 Jul 2010, Ken Nielson wrote:
>>
>>
>>> This might be something to look at. It appears job 19000 is failing
>>> with an exit status of -3. This job is on the machines in the
>>> hostlist. The job is then set to rerun.
>>>
>> I have set the loglevel of pbs_mom to 7 on the machine the job will run on
>> (I pin the job to that node with the -l option of qsub).  In the logfile,
>> I see "state return code=-3" coming up at 09:33:50, immediately followed
>> by a "job not started" message.
>>
>> Looking at syslog, I now find the following problem:
>>
>>    Jul 28 10:06:32 cn090 pbs_mom: LOG_ERROR::TMomFinalizeChild, Could not
>>    create cpuset for job 19007.ourmachine.com.
>>
>> Indeed, I used the --enable-cpuset option when configuring Torque.  So the
>> question now is why cpuset is not working.  Any ideas?
>>
> Having reread section 3.5 of the manual, I saw that /dev/cpuset had the
> wrong mount options and that /dev/cpuset/torque was missing.  Now jobs are
> starting again!
>
> Thank you very much for your help, and my apologies for the hassle.
>
> -- Regards,
>
> Franky
>
No apologies needed. I'm glad I could point you in the right direction.
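
For anyone else who hits the same "Could not create cpuset" error, the two
conditions Franky describes (something mounted at /dev/cpuset, and a torque
directory underneath it) can be checked with a small script. This is only a
rough sketch and not part of Torque: the function name find_cpuset_problems
is made up for illustration, and it assumes the mount shows up in
/proc/mounts either with filesystem type "cpuset" or as a "cgroup" mount
carrying the "cpuset" option. It does not try to judge which other mount
options are right, since that depends on your kernel and Torque version.

```python
import os


def find_cpuset_problems(mounts_text, mount_point="/dev/cpuset",
                         torque_dir="torque", is_dir=os.path.isdir):
    """Return a list of problem descriptions; an empty list means none found.

    mounts_text is the content of /proc/mounts. is_dir can be replaced
    with a stub so the check can be exercised without a real cpuset mount.
    """
    entry = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, ...
        if len(fields) >= 4 and fields[1] == mount_point:
            entry = fields  # keep the last match; later mounts shadow earlier ones

    if entry is None:
        return ["nothing is mounted at %s" % mount_point]

    problems = []
    fstype = entry[2]
    options = entry[3].split(",")
    # Assumption: a usable mount is either a plain cpuset filesystem or a
    # cgroup mount that carries the cpuset option.
    if not (fstype == "cpuset" or (fstype == "cgroup" and "cpuset" in options)):
        problems.append("%s is mounted as type %s, not as a cpuset"
                        % (mount_point, fstype))

    torque_path = mount_point + "/" + torque_dir
    if not is_dir(torque_path):
        problems.append("%s does not exist" % torque_path)
    return problems
```

On a compute node it could be run as
find_cpuset_problems(open("/proc/mounts").read()) and the returned list
printed; an empty list only means that neither of these two conditions was
detected, not that cpuset support is otherwise configured correctly.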

Ken

