Bug 188 - job log deadlock
Status: RESOLVED FIXED
Product: TORQUE
Component: pbs_server
Version: 4.0.*
Platform: PC Linux
Importance: P5 critical
Assigned To: David Beer
Reported: 2012-04-29 05:21 MDT by rhys.hill
Modified: 2012-05-03 13:59 MDT

Description rhys.hill 2012-04-29 05:21:17 MDT
A deadlock commonly occurs when job logging is enabled.

The deadlock occurs because the function mk_job_log_name locks job_log_mutex to
update joblog_open_day (the day on which the log was opened), even though the
lock is already held every time its only caller, job_log_open, is executed. The
thread therefore blocks trying to re-acquire a mutex it already owns. The
problem is fixed by simply removing the lock:

Index: src/lib/Liblog/pbs_log.c
===================================================================
--- src/lib/Liblog/pbs_log.c    (revision 6023)
+++ src/lib/Liblog/pbs_log.c    (working copy)
@@ -272,9 +272,7 @@
             ptm->tm_mday);
     }

-  pthread_mutex_lock(job_log_mutex);
   joblog_open_day = ptm->tm_yday; /* Julian date log opened */
-  pthread_mutex_unlock(job_log_mutex);

   return(pbuf);
   }  /* END mk_job_log_name() */

With this change, the structure of the code matches mk_log_name.
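
The failure mode can be reproduced outside TORQUE with a small sketch. This is
not the pbs_log.c code: demo_log_mutex, demo_open_day, demo_init, demo_open and
the two demo_mk_name_* helpers are hypothetical stand-ins for job_log_mutex,
joblog_open_day, job_log_open and mk_job_log_name. The sketch also assumes an
error-checking mutex (PTHREAD_MUTEX_ERRORCHECK) so that the second lock attempt
returns EDEADLK instead of hanging, which is what a default mutex would
typically do.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* exposes PTHREAD_MUTEX_ERRORCHECK under strict -std modes */
#endif
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for job_log_mutex and joblog_open_day. */
pthread_mutex_t demo_log_mutex;
int demo_open_day = -1;

/* Create the mutex as error-checking so a relock by the owner is reported
 * (EDEADLK) rather than blocking forever. Returns 0 on success. */
int demo_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(&demo_log_mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Pre-fix shape of the helper: takes the lock itself. */
int demo_mk_name_locking(int yday)
{
    int rc = pthread_mutex_lock(&demo_log_mutex);
    if (rc != 0)
        return rc;              /* EDEADLK when the caller already holds it */
    demo_open_day = yday;
    pthread_mutex_unlock(&demo_log_mutex);
    return 0;
}

/* Post-fix shape of the helper: relies on the caller holding the lock. */
int demo_mk_name_unlocked(int yday)
{
    demo_open_day = yday;
    return 0;
}

/* Caller modelled on job_log_open: holds the lock around the helper call
 * and passes the helper's result back. */
int demo_open(int (*mk_name)(int), int yday)
{
    int rc = pthread_mutex_lock(&demo_log_mutex);
    if (rc != 0)
        return rc;
    rc = mk_name(yday);
    pthread_mutex_unlock(&demo_log_mutex);
    return rc;
}
```

After demo_init(), calling demo_open(demo_mk_name_locking, 120) reports EDEADLK
and leaves demo_open_day untouched, while demo_open(demo_mk_name_unlocked, 120)
succeeds and stores the day. Build with -pthread.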
Comment 1 David Beer 2012-05-03 13:59:43 MDT
Committed to 4.0.2