[Mauiusers] maui-torque-openmpi-blcr query [SEC=UNCLASSIFIED]

DOHERTY, Greg gdz at ansto.gov.au
Tue Apr 19 17:53:45 MDT 2011


I am attempting to set up automatic job pre-emption using the maui
preemptor/preemptee queue mechanism, with the pre-empted openmpi jobs
checkpointed by blcr. Almost all of this works, and my thanks are due to
Eric Roman of lbl, who has written a procedure cr_mpirun to facilitate the
openmpi-blcr interaction, which he intends to release soon. I believe that
the remaining problem is caused by a maui-torque interaction on which I am
seeking advice here.

 

The problem is that for some jobs, identified below, the job on which a
hold has been placed restarts instantly and is checkpointed again. This
results in the time stamp on the ckpt file not matching what was
expected, so the pre-empted job goes into the W state and the preemptor
cannot start. (The other problem of incomplete .o files is minor by
comparison, because the simple workaround identified below will suffice
until the problem is fixed.) I have experimented unsuccessfully with a few
modifications to maui, as indicated below. I was hoping for some advice
on what else I might try. I would be interested to know whether the
moab-torque-openmpi-blcr combination is working for anyone.

Thank you. Greg Doherty 

	 
	A job asking for m nodes can pre-empt jobs with p<=m nodes, and restart
	the original jobs. However, the .o files of the pre-empted jobs do not
	contain the output produced prior to them being checkpointed. For the
	time being, this can be circumvented by redirecting stdout to a file
	when executing the cr_mpirun command, which works OK.
	 
	A job asking for m nodes cannot successfully pre-empt a job already
	running with p>m nodes. I believe that this is because the pre-empted
	job restarts immediately, so the ckpt files have labels which don't
	match. I tried to modify MPBSI.c to stop the pre-empted job from
	restarting immediately, by adding a pbs_alterjob between the pbs_holdjob
	and the pbs_rlsjob to delay the execution of the pre-empted job by one
	minute, but that simply fails with a pbs_error of 15016:
	 
	04/18 16:21:35 MRMJobCheckpoint(1272,1,SC)
	04/18 16:21:35 MPBSJobCkpt(1272,R,SC)
	04/18 16:21:37 MPBSJobCkpt(Execution_Time, 1622.37)
	04/18 16:21:37 MPBSJobCkpt(Illegal attribute or resource value for )
	04/18 16:21:37 ERROR: PBS job '1272.liberty.ansto.gov.au' attr 'Execution_Time:' to '1622.37' (rc: 15016 'Illegal attribute or resource value for ')
	04/18 16:21:37 INFO:     attribute 'PREEMPTEE' set for job 1272 
	 
	 
	So, obviously I don't know what I am doing. I have fiddled with various
	strings to include the month and day when trying to reset the execution
	time, but to no avail. Probably pbs_alterjob does not want me to fiddle
	with execution time at all at this point in proceedings. I can't find
	very much detailed documentation on those attributes. I have
	experimented with short sleep()s between the pbs_holdjob and pbs_rlsjob,
	also to no avail.
	 
	I enclose the following in case you can see immediately that I have done
	something stupid.
	
-------------------------------------------------------------------
	 
	int MPBSJobCkpt(
	 
	  mjob_t  *J,    /* I (modified) */
	  mrm_t   *R,    /* I */
	  mbool_t  DoTerminateJob, /* I (boolean) */
	  char    *Msg,  /* O (optional) */
	  int     *SC)   /* O (optional) */
	 
	  {
	  struct attrl Ckattrib;
	 
	  char          *CkRptr;
	  time_t        Cktime;
	  struct tm     *Cktmp;
	  char          Cktmps[256];
	  char          Cktmpline[MAX_MLINE];
	 
	  Ckattrib.next = NULL;
	  Ckattrib.name = ATTR_a;
	  Ckattrib.op = SET;
	 
	  Cktmpline[0] = '\0';
	  CkRptr = Cktmpline;
	  Ckattrib.resource = CkRptr;
	 
	  int   rc;
	  int   holdtimeout;
	 
	  char *ErrMsg;
	 
	  char tmpJobName[MAX_MNAME];
	 
	  const char *FName = "MPBSJobCkpt";
	 
	  DBG(2,fPBS) DPrint("%s(%s,R,SC)\n",
	    FName,
	    (J != NULL) ? J->Name : "NULL");
	 
	  if ((J == NULL) ||
	      (R == NULL) ||
	     ((J->State != mjsStarting) && (J->State != mjsRunning)))
	    {
	    return(FAILURE);
	    }
	 
	  MJobGetName(J,NULL,R,tmpJobName,sizeof(tmpJobName),mjnRMName);
	 
	  rc = blocking_pbs_holdjob(R->U.PBS.ServerSD,tmpJobName,"s",NULL);
	  /* still ok to release the job if the hold timed out, the request was
	   * successful.  */
	  if (rc != -2) { holdtimeout = 0; } else { holdtimeout = 1; }
	 
	  if (rc != 0 && !holdtimeout)
	    {
	    ErrMsg = pbs_geterrmsg(R->U.PBS.ServerSD);
	 
	    DBG(0,fPBS) DPrint("ERROR:    PBS job '%s' cannot be checkpointed (rc: %d  '%s')\n",
	      J->Name,
	      rc,
	      ErrMsg);
	 
	    if (R->FailIteration != MSched.Iteration)
	      {
	      R->FailIteration = MSched.Iteration;
	      R->FailCount     = 0;
	      }
	 
	    R->FailCount++;
	 
	    return(FAILURE);
	    }
	 
	  for (rc=0; rc<256; rc++) {
	        Cktmps[rc] = '\0';
	  }
	 
	  Cktime = time(NULL);
	  Cktime += 60;
	  Cktmp = localtime(&Cktime);
	 
	  if (strftime(Cktmps, sizeof(Cktmps), "%m%d%H%M.%S", Cktmp) == 0) {
	    /* time_t is not guaranteed to be int, so cast it for printing */
	    DBG(0,fPBS) DPrint("ERROR: Greg's checkpoint addition %ld\n",
	      (long)Cktime);
	    return(FAILURE);
	  }
	  Ckattrib.value = Cktmps;
	  DBG(2,fPBS) DPrint("%s(%s, %s)\n",
	    FName, Ckattrib.name, Ckattrib.value);
	 
	  rc = pbs_alterjob(R->U.PBS.ServerSD, tmpJobName, &Ckattrib, NULL);
	  ErrMsg = pbs_geterrmsg(R->U.PBS.ServerSD);
	  DBG(2,fPBS) DPrint("%s(%s)\n",
	    FName, ErrMsg);
	 
	  if (rc != 0)
	    {
	    ErrMsg = pbs_geterrmsg(R->U.PBS.ServerSD);
	 
	    DBG(2,fPBS) DPrint("ERROR: PBS job '%s' attr '%s:%s' to '%s' (rc: %d '%s')\n",
	      tmpJobName,
	      Ckattrib.name,
	      Ckattrib.resource,
	      Ckattrib.value,
	      rc,
	      ErrMsg);
	/* If I do not comment this bit out, maui simply stops of course
	   and I do not even get to see all the debug messages in the log
	   file.
	 
	    if (R->FailIteration != MSched.Iteration)
	      {
	      R->FailIteration = MSched.Iteration;
	      R->FailCount     = 0;
	      }
	 
	    R->FailCount++;
	 
	    return(FAILURE);
	*/
	    }
	 
	  rc = pbs_rlsjob(R->U.PBS.ServerSD,tmpJobName,"s",NULL);
	 
	  if (rc != 0)
	    {
	    ErrMsg = pbs_geterrmsg(R->U.PBS.ServerSD);
	 
	    DBG(0,fPBS) DPrint("ERROR:    PBS job '%s' cannot be released from hold (rc: %d  '%s')\n",
	      J->Name,
	      rc,
	      ErrMsg);
	 
	    if (R->FailIteration != MSched.Iteration)
	      {
	      R->FailIteration = MSched.Iteration;
	      R->FailCount     = 0;
	      }
	 
	    R->FailCount++;
	 
	    return(FAILURE);
	    }
	 
	  if (holdtimeout) { return(FAILURE); }
	 
	  /* NOTE:  'DoTerminateJob' flag not supported */
	 
	  DBG(7,fPBS) DPrint("INFO:     job '%s' checkpointed\n",
	    J->Name);
	 
	  return(SUCCESS);
	  }  /* END MPBSJobCkpt() */
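
One possibility that may be worth testing, offered as an assumption rather than a confirmed diagnosis: the [[CC]YY]MMDDhhmm[.SS] form is what qsub and qalter accept on the command line for -a, and the qalter client appears to convert that string to seconds since the epoch before it calls pbs_alterjob. If the server expects the Execution_Time value in that form, a strftime() string such as '1622.37' could be rejected with rc 15016. The sketch below builds such a value with a hypothetical helper, format_exec_time_epoch(), which is not part of maui or torque. Whether the server accepts a change to Execution_Time on a job that is being checkpointed is a separate question this does not answer.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper (not a maui or torque function).
 * Writes (now + delay_sec) into buf as a decimal count of seconds
 * since the epoch, the form assumed here for the ATTR_a value.
 * Returns 0 on success, -1 on bad arguments or if buf is too small.
 */
int format_exec_time_epoch(time_t now, long delay_sec, char *buf, size_t len)
{
  int n;

  if ((buf == NULL) || (len == 0) || (delay_sec < 0))
    {
    return(-1);
    }

  n = snprintf(buf, len, "%ld", (long)now + delay_sec);

  /* snprintf returns the length it wanted to write; detect truncation */
  if ((n < 0) || ((size_t)n >= len))
    {
    return(-1);
    }

  return(0);
}
```

In MPBSJobCkpt() this would stand in for the localtime()/strftime() pair, for example format_exec_time_epoch(time(NULL), 60, Cktmps, sizeof(Cktmps)) followed by Ckattrib.value = Cktmps, with the rest of the pbs_alterjob call unchanged.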

 
