[torqueusers] Changes to qstat to generate valid XML

Josh Butikofer josh at clusterresources.com
Fri Jan 30 10:22:38 MST 2009


From a CRI perspective, we think this is probably suitable for the fixes
branches as long as no one depends on the current invalid XML, which I highly
doubt, since any *real* XML parser would balk at it.

Josh Butikofer
Cluster Resources, Inc.
#############################


Garrick Staples wrote:
> Your changes make perfect sense to me.
> 
> Does anyone else have a reason to preserve the current XMLish format?
> 
> Is this suitable for fixes branches or just trunk?
> 
> On Wed, Jan 28, 2009 at 02:30:00PM -0800, Svancara, Randall alleged:
>> If you run qstat -x it does not generate valid XML.  The XML it
>> generates is:
>>
>> <Data><Job>3834.local-gw.mainlab<Job_Name>test_job</Job_Name>.....
>>
>> The string, 3834.local-gw.mainlab should be enclosed like this:
>>
>> <Data><Job><Job_Id>3834.local-gw.mainlab</Job_Id><Job_Name>test_job</Job_Name>....
>>
>> That is the first fix.  The second fix involves the <sched_hint>
>> element.  If you trigger a copy error, for example by disabling ssh RSA
>> key authentication, you will see ">>> error from copy" in the
>> <sched_hint> element.  The entire message looks like this:
>>
>> <sched_hint>Post job file processing error; job 3834.local-gw.mainlab on
>> host node2/1
>>
>> Unable to copy file /var/spool/torque/spool/3834.local-gw.mainlab.OU to
>> randalls@local-gw:/share/grid/jobs/jobid_1234567890
>> >>> error from copy
>> Permission denied (publickey,password).
>> lost connection
>> >>> end error output
>> Output retained on that host
>> in: /var/spool/torque/undelivered/3834.local-gw.mainlab.OU
>>
>> Unable to copy file /var/spool/torque/spool/3834.local-gw.mainlab.ER to
>> randalls@local-gw:/share/grid/jobs/jobid_1234567890
>> >>> error from copy
>> Permission denied (publickey,password).
>> lost connection
>> >>> end error output
>> Output retained on that host
>> in: /var/spool/torque/undelivered/3834.local-gw.mainlab.ER</sched_hint>
>>
>> The string ">>> error from copy" causes some serious problems with XML
>> parsers.  Furthermore, it is not the correct way to deal with text that
>> may contain >>> characters.  The best way would be to wrap the text in
>> <![CDATA[ STRING TEXT GOES HERE ]]>.  An alternative is simply to
>> eliminate the >>>.  I went with the latter because I am not a very good C
>> programmer and it was easier for me to replace the >>> with ***.  I am
>> including my patches below if anyone is interested.
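
For anyone who would rather try the CDATA route mentioned above, here is a
rough sketch of what the wrapping could look like in C. It is only an
illustration under stated assumptions: xml_cdata_wrap() and append() are
hypothetical helpers, not part of TORQUE or its MXML code, and the caller is
assumed to supply a large enough output buffer. The one sequence that cannot
appear inside a CDATA section is "]]>", so the sketch splits it across two
adjacent sections.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append n bytes of src to dst at *pos, keeping dst
 * NUL-terminated. Returns 0 on success, -1 if dst is too small. */
static int append(char *dst, size_t dstlen, size_t *pos,
                  const char *src, size_t n)
  {
  if (*pos + n >= dstlen)  /* leave room for the trailing NUL */
    return(-1);

  memcpy(dst + *pos, src, n);
  *pos += n;
  dst[*pos] = '\0';

  return(0);
  }

/* Hypothetical helper (not a TORQUE function): wrap text in CDATA so that
 * characters such as '<', '>' and '&' can be stored without escaping.
 * Returns 0 on success, -1 if dst is too small. */
int xml_cdata_wrap(const char *text, char *dst, size_t dstlen)
  {
  const char *end;
  size_t      pos = 0;

  assert(text != NULL && dst != NULL);

  if (dstlen == 0)
    return(-1);

  dst[0] = '\0';

  if (append(dst, dstlen, &pos, "<![CDATA[", 9) != 0)
    return(-1);

  /* "]]>" would end the section early: emit up to and including "]]",
   * close the section, then reopen it so '>' starts the next one */
  while ((end = strstr(text, "]]>")) != NULL)
    {
    if ((append(dst, dstlen, &pos, text, (size_t)(end - text) + 2) != 0) ||
        (append(dst, dstlen, &pos, "]]><![CDATA[", 12) != 0))
      return(-1);

    text = end + 2;
    }

  if ((append(dst, dstlen, &pos, text, strlen(text)) != 0) ||
      (append(dst, dstlen, &pos, "]]>", 3) != 0))
    return(-1);

  return(0);
  }
```

The result would then have to be written into the element as-is; whether
the MXML routines used by qstat can emit raw CDATA without re-escaping it is
something I have not checked.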
>>
>>
>> diff -rup torque-2.3.6/src/cmds/qstat.c torque-2.3.6_mod/src/cmds/qstat.c
>> --- torque-2.3.6/src/cmds/qstat.c	2008-12-11 12:05:42.000000000 -0800
>> +++ torque-2.3.6_mod/src/cmds/qstat.c	2009-01-09 12:52:25.000000000 -0800
>> @@ -1075,6 +1075,7 @@ void display_statjob(
>>    mxml_t *JE;
>>    mxml_t *AE;
>>    mxml_t *RE1;
>> +  mxml_t *JI;
>>  
>>    /* XML only support for full output */
>>  
>> @@ -1126,9 +1127,15 @@ void display_statjob(
>>  
>>          MXMLCreateE(&JE, "Job");
>>  
>> -        MXMLSetVal(JE, p->name, mdfString);
>> +        /*MXMLSetVal(JE, p->name, mdfString);*/
>>  
>>          MXMLAddE(DE, JE);
>> +
>> +        JI=NULL;
>> +        MXMLCreateE(&JI,"Job_Id");
>> +        MXMLSetVal(JI,p->name,mdfString);
>> +        MXMLAddE(JE,JI);
>> +
>>          }
>>        else
>>          {
>> diff -rup torque-2.3.6/src/resmom/requests.c torque-2.3.6_mod/src/resmom/requests.c
>> --- torque-2.3.6/src/resmom/requests.c	2008-12-11 12:05:51.000000000 -0800
>> +++ torque-2.3.6_mod/src/resmom/requests.c	2009-01-28 13:52:58.000000000 -0800
>> @@ -2425,7 +2425,7 @@ static int del_files(
>>  
>>        default:
>>  
>> -        sprintf(log_buffer, ">>> failed to delete files, expansion of %s failed",
>> +        sprintf(log_buffer, "*** failed to delete files, expansion of %s failed",
>>                  path);
>>  
>>          add_bad_list(pbadfile, log_buffer, 1);
>> @@ -3403,7 +3403,7 @@ nextword:
>>  
>>        if ((fp = fopen(rcperr, "r")) != NULL)
>>          {
>> -        add_bad_list(&bad_list, ">>> error from copy", 1);
>> +        add_bad_list(&bad_list, "*** error from copy", 1);
>>  
>>          while (fgets(log_buffer, LOG_BUF_SIZE, fp) != NULL)
>>            {
>> @@ -3417,7 +3417,7 @@ nextword:
>>  
>>          fclose(fp);
>>  
>> -        add_bad_list(&bad_list, ">>> end error output", 1);
>> +        add_bad_list(&bad_list, "*** end error output", 1);
>>          }
>>  
>>  #ifdef HAVE_WORDEXP
>>   
>>
>>
> 
> 
> 
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers