<div id="RTEContent"> Hi,<br> <br> I have a 128-node cluster running Torque 1.2.0p6. Every time a user submits a batch of jobs, the Torque scheduler terminates itself and the following error appears in the log file. After that, users can't submit any more jobs until the Torque scheduler has been restarted.<br> <br> PBS_Server;Connection refused (111) in contact_sched, Could not contact Scheduler - port 15004 <br> 01/12/2006 09:58:46;0001;PBS_Server;Svr;PBS_Server;Connection refused (111) in contact_sched, Could not contact Scheduler - port 15004<br> <br> As a workaround, I have had to write a cron job that checks the health of the Torque scheduler process and starts it again if it is dead.<br> <br> Could anyone help me with this? Thanks.<br> </div><p>
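For reference, the cron workaround described above can be sketched roughly as follows. This is only a minimal illustration, not the exact script in use: it assumes the scheduler daemon is named pbs_sched, that it lives at /usr/local/sbin/pbs_sched (adjust to your install prefix), and that pgrep is available on the server node.

```python
import subprocess

# Assumptions (not from the original post): daemon name and install path.
SCHED_NAME = "pbs_sched"
START_CMD = ["/usr/local/sbin/pbs_sched"]


def is_running(name, run=subprocess.run):
    """Return True if a process with exactly this name exists (pgrep -x)."""
    result = run(["pgrep", "-x", name], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def ensure_scheduler(name=SCHED_NAME, start_cmd=START_CMD, run=subprocess.run):
    """Start the scheduler if it is not running; return True if it was started."""
    if is_running(name, run):
        return False
    # check=True raises CalledProcessError if the start command fails.
    run(start_cmd, check=True)
    return True


if __name__ == "__main__":
    if ensure_scheduler():
        print("pbs_sched was down; started it again")
```

Run from cron every few minutes, this only hides the symptom; it does not explain why the scheduler exits when a batch of jobs is submitted.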