[torquedev] [Fwd: [Mauiusers] FW: maui pausing on Torque multiple qsubs]

Craig Macdonald craigm at dcs.gla.ac.uk
Wed Feb 21 06:02:24 MST 2007



<snip>
> pbs_disconnect() sets an alarm, for 9 seconds, then tries to read the 
> socket.
> read() is defined as read_nonblocking_socket() in nonblock.c. However, 
> this is what blocks.
>
<snip>
> NB: I haven't recompiled torque to see what value of PBSAPITIMEOUT it 
> sees, but I have checked that Maui sets PBSAPITIMEOUT correctly.
The alarm in pbs_disconnect() is indeed set to 9 seconds.
>
> 2. Why isn't read_nonblocking_socket() doing what it says on the tin?

I suspect that is because the fcntl() call that makes the socket 
non-blocking before the read is commented out. The accompanying comment 
says the check should be moved into pbs_disconnect(), but it was never 
added there.
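
For reference, making a descriptor non-blocking with fcntl() generally
looks like the sketch below. This is not the TORQUE code: the names
set_nonblocking() and empty_read_would_block() are made up for
illustration, and the socketpair is only there so the example has
something to read from. It shows that, once O_NONBLOCK is set, a read on
an empty socket returns -1 with errno set to EAGAIN or EWOULDBLOCK
instead of blocking.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not a TORQUE function): add O_NONBLOCK to the
   status flags of fd. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
  int flags = fcntl(fd, F_GETFL, 0);

  if (flags == -1)
    return -1;

  if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
    return -1;

  return 0;
}

/* Hypothetical demo: read from an empty, non-blocking socket.
   Returns 1 if read() failed with EAGAIN/EWOULDBLOCK (it did not block),
   0 if read() behaved differently, -1 if the setup failed. */
int empty_read_would_block(void)
{
  int sv[2];
  char buf[16];
  ssize_t n;
  int result;

  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
    return -1;

  if (set_nonblocking(sv[0]) == -1)
    {
    close(sv[0]);
    close(sv[1]);
    return -1;
    }

  /* Nothing has been written to sv[1], so there is no data to read. */
  n = read(sv[0], buf, sizeof(buf));

  /* Evaluate errno before close() has a chance to change it. */
  result = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) ? 1 : 0;

  close(sv[0]);
  close(sv[1]);

  return result;
}
```

Without the F_SETFL step, the same read() would wait until data arrived,
the peer closed the connection, or a signal interrupted it.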

The change was made in svn revision r41:
r41 | dev | 2005-05-13 00:17:26 +0100 (Fri, 13 May 2005) | 2 lines
fix


Can anyone comment on why the code that sets O_NONBLOCK was removed from 
read_nonblocking_socket() in nonblock.c?
Is there any reason I should not check the socket and make it 
non-blocking in pbs_disconnect()?
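
To make the proposal concrete, here is a rough sketch of the
"check, then make non-blocking" idea applied on the disconnect side. It
is not taken from TORQUE and does not reproduce what pbs_disconnect()
actually does: ensure_nonblocking() and drain_and_close() are
hypothetical names, and the drain loop is only an assumption about what
a disconnect routine might want to do with leftover input.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set O_NONBLOCK only if it is not already set.
   Returns 1 if the flag had to be set, 0 if it was already set,
   -1 on error. */
int ensure_nonblocking(int fd)
{
  int flags = fcntl(fd, F_GETFL, 0);

  if (flags == -1)
    return -1;

  if (flags & O_NONBLOCK)
    return 0;

  if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
    return -1;

  return 1;
}

/* Hypothetical disconnect-style routine: discard whatever input is
   already pending on fd without ever blocking, then close it.
   Returns the number of bytes discarded, or -1 if the flags could not
   be read or changed. */
long drain_and_close(int fd)
{
  char buf[256];
  long total = 0;
  ssize_t n;

  if (ensure_nonblocking(fd) == -1)
    {
    close(fd);
    return -1;
    }

  for (;;)
    {
    n = read(fd, buf, sizeof(buf));

    if (n > 0)
      {
      total += n;
      continue;
      }

    if (n == -1 && errno == EINTR)
      continue;  /* interrupted by a signal: retry */

    break;  /* EOF, EAGAIN/EWOULDBLOCK, or another error: stop */
    }

  close(fd);

  return total;
}
```

One caveat with this approach: O_NONBLOCK is a file status flag, so it
is shared by every descriptor that refers to the same open file
description (for example after dup() or fork()). Setting it here also
changes the behaviour seen by any other holder of that socket.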



The diff between r40 and r41 is below:
+    /* NOTE:  the pbs scheduling API passes in a blocking socket which
+              should be a non-blocking socket in pbs_disconnect.  Also,
+              qsub passes in a blocking socket which must remain
+              non-blocking */

+    /* the below non-blocking socket flag check should be rolled into
+       pbs_disconnect and removed from here (NYI) */
+
+    /*
     if (fcntl(fd,F_SETFL,flags) == -1)
       {
       return(-1);
       }
-*/
+    */
     }    /* END else (flags & BLOCK) */


