[torquedev] [Bug 184] New: pbs_server fails after you've added 12 job arrays

bugzilla-daemon at supercluster.org
Tue Apr 24 00:21:28 MDT 2012


http://www.clusterresources.com/bugzilla/show_bug.cgi?id=184

           Summary: pbs_server fails after you've added 12 job arrays
           Product: TORQUE
           Version: 3.0.x
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: critical
          Priority: P5
         Component: pbs_server
        AssignedTo: dbeer at adaptivecomputing.com
        ReportedBy: rhys.hill at adelaide.edu.au
                CC: torquedev at supercluster.org
   Estimated Hours: 0.0


A documentation bug, and a code bug that follows from it, mean that once you
have added 12 job arrays to pbs_server, it logs this message:

04/24/2012 15:41:23;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::Cannot allocate
memory (12) in insert_array, No memory to resize the array...SYSTEM FAILURE

After this, job arrays are broken until pbs_server is restarted, and after the
restart phantom job arrays are left in the queue. The patch below fixes it
(note that ENOMEM == 12, so a valid array index of 12 is mistaken for the
error code):

Index: src/lib/Libutils/u_resizable_array.c
===================================================================
--- src/lib/Libutils/u_resizable_array.c    (revision 6023)
+++ src/lib/Libutils/u_resizable_array.c    (working copy)
@@ -186,7 +186,8 @@
 /*
  * inserts an item, resizing the array if necessary
  *
- * @return the index in the array or ENOMEM
+ * @return the index in the array or -1 on failure,
+ * which indicates ENOMEM.
  */
 int insert_thing(

Index: src/server/array_func.c
===================================================================
--- src/server/array_func.c    (revision 6023)
+++ src/server/array_func.c    (working copy)
@@ -1879,7 +1879,7 @@

   pthread_mutex_lock(allarrays.allarrays_mutex);

-  if ((rc = insert_thing(allarrays.ra,pa)) == ENOMEM)
+  if ((rc = insert_thing(allarrays.ra,pa)) == -1)
     {
     log_err(rc,id,"No memory to resize the array...SYSTEM FAILURE\n");
     }
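
The collision between a valid index and the errno value can be reproduced in
isolation. The following is a minimal sketch, not TORQUE code: toy_array,
toy_insert, buggy_failed, fixed_failed and first_false_alarm are hypothetical
names invented for illustration, and the doubling growth policy is an
assumption. It only shows why comparing a returned index against ENOMEM
misfires, and why a -1 sentinel does not.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a resizable array (not TORQUE's struct). */
struct toy_array {
    void **slots;
    int    used;
    int    capacity;
};

/* Returns the index of the inserted item, or -1 if resizing failed. */
int toy_insert(struct toy_array *a, void *item)
{
    if (a->used == a->capacity) {
        int new_cap = a->capacity ? a->capacity * 2 : 4;
        void **tmp = realloc(a->slots, (size_t)new_cap * sizeof(void *));
        if (tmp == NULL)
            return -1;
        a->slots = tmp;
        a->capacity = new_cap;
    }
    a->slots[a->used] = item;
    return a->used++;
}

/* Old-style check: a valid index equal to ENOMEM looks like a failure. */
int buggy_failed(int rc) { return rc == ENOMEM; }

/* Patched-style check: only the -1 sentinel means failure. */
int fixed_failed(int rc) { return rc == -1; }

/*
 * Inserts up to n items and returns the first valid index that the given
 * check wrongly reports as a failure, or -1 if there is no false alarm.
 */
int first_false_alarm(int n, int (*failed)(int))
{
    struct toy_array a = { NULL, 0, 0 };
    static int dummy;
    int result = -1;

    for (int i = 0; i < n; i++) {
        int rc = toy_insert(&a, &dummy);
        if (rc == -1)
            break;              /* genuine allocation failure, not a false alarm */
        if (failed(rc)) {
            result = rc;        /* valid index misread as an error */
            break;
        }
    }
    free(a.slots);
    return result;
}
```

With ENOMEM equal to 12, as noted above, first_false_alarm(20, buggy_failed)
returns 12, while first_false_alarm(20, fixed_failed) returns -1 because no
valid index can equal the sentinel.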
