[torqueusers] Changes to qstat to generate valid XML

Svancara, Randall rsvancara at wsu.edu
Thu Jan 29 17:52:05 MST 2009


How many people use or depend on the XML formatting for "other" uses?

I am curious to know since I have not seen this topic brought up before,
well at least in the few weeks I have been monitoring this list.

My patches are for version 2.3.6.  I think that is the latest release
version.  I would like to see some more testing on this by someone other
than me since I am not a proficient C programmer, however minor my
changes are.  Eventually I would like these changes to go into trunk,
but if there is a suitable branch that will accommodate this change that
will later be merged with trunk, then that is OK too.  I am not sure how
the development process works for Torque, so if anyone cares to enlighten
me, send me an email.

Thanks again,

Randall

On Thu, 2009-01-29 at 16:14 -0800, Garrick Staples wrote:
> Your changes make perfect sense to me.
> 
> Does anyone else have a reason to preserve the current XMLish format?
> 
> Is this suitable for fixes branches or just trunk?
> 
> On Wed, Jan 28, 2009 at 02:30:00PM -0800, Svancara, Randall alleged:
> > If you run qstat -x it does not generate valid XML.  The XML it
> > generates is:
> > 
> > <Data><Job>3834.local-gw.mainlab<Job_Name>test_job</Job_Name>.....
> > 
> > The string, 3834.local-gw.mainlab should be enclosed like this:
> > 
> > <Data><Job><Job_Id>3834.local-gw.mainlab</Job_Id><Job_Name>test_job</Job_Name>....
> > 
> > That is the first fix.  The second fix involves the <sched_hint>
> > element.  If you provoke a copy error, for example by disabling ssh RSA
> > key authentication, you will see ">>> error from copy" in the
> > <sched_hint> element.  The entire message looks like this:
> > 
> > <sched_hint>Post job file processing error; job 3834.local-gw.mainlab on
> > host node2/1
> > 
> > Unable to copy file /var/spool/torque/spool/3834.local-gw.mainlab.OU to
> > randalls at local-gw:/share/grid/jobs/jobid_1234567890
> > >>> error from copy
> > Permission denied (publickey,password).
> > lost connection
> > >>> end error output
> > Output retained on that host
> > in: /var/spool/torque/undelivered/3834.local-gw.mainlab.OU
> > 
> > Unable to copy file /var/spool/torque/spool/3834.local-gw.mainlab.ER to
> > randalls at local-gw:/share/grid/jobs/jobid_1234567890
> > >>> error from copy
> > Permission denied (publickey,password).
> > lost connection
> > >>> end error output
> > Output retained on that host
> > in: /var/spool/torque/undelivered/3834.local-gw.mainlab.ER</sched_hint>
> > 
> > The string ">>> error from copy" causes some serious problems with XML
> > parsers.  Furthermore, it is not the correct way to deal with text that
> > may contain >>> characters.  The best way would be to wrap the text in
> > <![CDATA[ STRING TEXT GOES HERE ]]>.  An alternative option is just to
> > eliminate the >>>.  I went with the latter because I am not a very good C
> > programmer and it was easier for me to replace the >>> with ***.  I am
> > including my patches below if anyone is interested.
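
For comparison with the *** substitution in the patch, here is a minimal
sketch of the CDATA approach mentioned above.  It is not part of the patch
and does not use Torque's MXML routines; cdata_wrap and its append helper
are hypothetical names chosen for illustration.  The one subtlety is that
the text itself must not contain "]]>", so the sketch splits that sequence
across two CDATA sections.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not Torque code: appends n bytes of s to dst,
 * keeping dst NUL-terminated. Returns 0 on success, -1 if dst is full. */
static int append(char *dst, size_t dstlen, size_t *used,
                  const char *s, size_t n)
{
  if (*used + n + 1 > dstlen)
    return -1;

  memcpy(dst + *used, s, n);
  *used += n;
  dst[*used] = '\0';

  return 0;
}

/* Hypothetical helper, not Torque code: writes src into dst wrapped in a
 * CDATA section. Any "]]>" inside src is split across two sections,
 * because that sequence would otherwise end the section early.
 * Returns 0 on success, -1 if dst is too small. */
int cdata_wrap(const char *src, char *dst, size_t dstlen)
{
  static const char open_tag[]  = "<![CDATA[";
  static const char close_tag[] = "]]>";
  /* "]]" + close + reopen + ">" */
  static const char split_seq[] = "]]]]><![CDATA[>";
  size_t used = 0;

  if (dstlen == 0)
    return -1;

  dst[0] = '\0';

  if (append(dst, dstlen, &used, open_tag, strlen(open_tag)) != 0)
    return -1;

  while (*src != '\0')
    {
    if (strncmp(src, close_tag, 3) == 0)
      {
      if (append(dst, dstlen, &used, split_seq, strlen(split_seq)) != 0)
        return -1;

      src += 3;
      }
    else
      {
      if (append(dst, dstlen, &used, src, 1) != 0)
        return -1;

      src++;
      }
    }

  return append(dst, dstlen, &used, close_tag, strlen(close_tag));
}
```

Calling cdata_wrap(">>> error from copy", buf, sizeof(buf)) leaves
<![CDATA[>>> error from copy]]> in buf, which could then be written as the
content of <sched_hint>.  Whether the MXML code in qstat.c would pass such
a string through unmodified is something I have not checked.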
> > 
> > 
> > diff -rup torque-2.3.6/src/cmds/qstat.c torque-2.3.6_mod/src/cmds/qstat.c
> > --- torque-2.3.6/src/cmds/qstat.c	2008-12-11 12:05:42.000000000 -0800
> > +++ torque-2.3.6_mod/src/cmds/qstat.c	2009-01-09 12:52:25.000000000 -0800
> > @@ -1075,6 +1075,7 @@ void display_statjob(
> >    mxml_t *JE;
> >    mxml_t *AE;
> >    mxml_t *RE1;
> > +  mxml_t *JI;
> >  
> >    /* XML only support for full output */
> >  
> > @@ -1126,9 +1127,15 @@ void display_statjob(
> >  
> >          MXMLCreateE(&JE, "Job");
> >  
> > -        MXMLSetVal(JE, p->name, mdfString);
> > +        /*MXMLSetVal(JE, p->name, mdfString);*/
> >  
> >          MXMLAddE(DE, JE);
> > +
> > +        JI=NULL;
> > +        MXMLCreateE(&JI,"Job_Id");
> > +        MXMLSetVal(JI,p->name,mdfString);
> > +        MXMLAddE(JE,JI);
> > +
> >          }
> >        else
> >          {
> > diff -rup torque-2.3.6/src/resmom/requests.c torque-2.3.6_mod/src/resmom/requests.c
> > --- torque-2.3.6/src/resmom/requests.c	2008-12-11 12:05:51.000000000 -0800
> > +++ torque-2.3.6_mod/src/resmom/requests.c	2009-01-28 13:52:58.000000000 -0800
> > @@ -2425,7 +2425,7 @@ static int del_files(
> >  
> >        default:
> >  
> > -        sprintf(log_buffer, ">>> failed to delete files, expansion of %s failed",
> > +        sprintf(log_buffer, "*** failed to delete files, expansion of %s failed",
> >                  path);
> >  
> >          add_bad_list(pbadfile, log_buffer, 1);
> > @@ -3403,7 +3403,7 @@ nextword:
> >  
> >        if ((fp = fopen(rcperr, "r")) != NULL)
> >          {
> > -        add_bad_list(&bad_list, ">>> error from copy", 1);
> > +        add_bad_list(&bad_list, "*** error from copy", 1);
> >  
> >          while (fgets(log_buffer, LOG_BUF_SIZE, fp) != NULL)
> >            {
> > @@ -3417,7 +3417,7 @@ nextword:
> >  
> >          fclose(fp);
> >  
> > -        add_bad_list(&bad_list, ">>> end error output", 1);
> > +        add_bad_list(&bad_list, "*** end error output", 1);
> >          }
> >  
> >  #ifdef HAVE_WORDEXP
> >   
> > 
> > 
> 
> 
> 
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers
> 
> 