[torqueusers] Changes to qstat to generate valid XML

Joshua Bernstein jbernstein at penguincomputing.com
Wed Jan 28 15:54:24 MST 2009


Thanks Randall,

I was just working on fixing up the XML output generation myself. Thank you for 
beating me to the punch. You should probably cross-post this message to 
torque-dev, as that may make it easier to get the code merged upstream.

-Joshua Bernstein
Software Engineer
Penguin Computing

Svancara, Randall wrote:
> If you run qstat -x it does not generate valid XML.  The XML it
> generates is:
> 
> <Data><Job>3834.local-gw.mainlab<Job_Name>test_job</Job_Name>.....
> 
> The string, 3834.local-gw.mainlab should be enclosed like this:
> 
> <Data><Job><Job_Id>3834.local-gw.mainlab</Job_Id><Job_Name>test_job</Job_Name>....
> 
> That is the first fix.  The second fix involves the <sched_hint>
> element.  If you trigger a copy error, for example by disabling ssh RSA
> key authentication, you will see ">>> error from copy" in the
> <sched_hint> element.  The entire message looks like this:
> 
> <sched_hint>Post job file processing error; job 3834.local-gw.mainlab on
> host node2/1
> 
> Unable to copy file /var/spool/torque/spool/3834.local-gw.mainlab.OU to
> randalls at local-gw:/share/grid/jobs/jobid_1234567890
> >>> error from copy
> Permission denied (publickey,password).
> lost connection
> >>> end error output
> Output retained on that host
> in: /var/spool/torque/undelivered/3834.local-gw.mainlab.OU
> 
> Unable to copy file /var/spool/torque/spool/3834.local-gw.mainlab.ER to
> randalls at local-gw:/share/grid/jobs/jobid_1234567890
> >>> error from copy
> Permission denied (publickey,password).
> lost connection
> >>> end error output
> Output retained on that host
> in: /var/spool/torque/undelivered/3834.local-gw.mainlab.ER</sched_hint>
> 
> The string ">>> error from copy" causes some serious problems with XML
> parsers.  Furthermore, emitting it unescaped is not the correct way to
> deal with text that may have >>> characters.  The best way would be to
> incorporate <![CDATA[ STRING TEXT GOES HERE ]]>.  An alternative option
> is just to eliminate the >>>.  I went with the latter because I am not a
> very good C programmer and it was easier for me to replace the >>> with
> ***.  I am including my patches below if anyone is interested.
> 
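For reference, the CDATA approach mentioned above could look roughly like
the sketch below. It is only an illustration under stated assumptions:
xml_wrap_cdata() and append_bytes() are hypothetical helpers, not part of
TORQUE or its MXML code, and the sketch writes into a caller-supplied
buffer rather than hooking into MXMLSetVal(). The one subtlety it handles
is that the text itself may contain "]]>", which would otherwise end the
CDATA section early, so that sequence is split across two sections.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append n bytes of s to out, keeping out
 * NUL-terminated. Returns 0 on success, -1 if out is too small. */
static int append_bytes(char *out, size_t outlen, size_t *used,
                        const char *s, size_t n)
{
  if (*used + n + 1 > outlen)
    return -1;

  memcpy(out + *used, s, n);
  *used += n;
  out[*used] = '\0';

  return 0;
}

/* Hypothetical helper (not a TORQUE function): wrap text in a CDATA
 * section so that characters such as '<', '&' and '>' need no escaping.
 * Any "]]>" in the input is split across two CDATA sections.
 * Returns 0 on success, -1 if out is too small. */
static int xml_wrap_cdata(const char *text, char *out, size_t outlen)
{
  static const char cd_open[]  = "<![CDATA[";
  static const char cd_close[] = "]]>";
  /* "]]>" becomes "]]", end of section, new section, ">" */
  static const char cd_split[] = "]]]]><![CDATA[>";
  size_t used = 0;

  if (append_bytes(out, outlen, &used, cd_open, strlen(cd_open)) != 0)
    return -1;

  while (*text != '\0')
    {
    if (strncmp(text, "]]>", 3) == 0)
      {
      if (append_bytes(out, outlen, &used, cd_split, strlen(cd_split)) != 0)
        return -1;

      text += 3;
      }
    else
      {
      if (append_bytes(out, outlen, &used, text, 1) != 0)
        return -1;

      text++;
      }
    }

  return append_bytes(out, outlen, &used, cd_close, strlen(cd_close));
}
```

Called with ">>> error from copy" and a large enough buffer, this yields
<![CDATA[>>> error from copy]]>, so the mom's message text would not need
to change. Since the copy error output is arbitrary text, it could also
contain '<' or '&', which the *** replacement alone would not cover.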
> 
> diff -rup torque-2.3.6/src/cmds/qstat.c torque-2.3.6_mod/src/cmds/qstat.c
> --- torque-2.3.6/src/cmds/qstat.c	2008-12-11 12:05:42.000000000 -0800
> +++ torque-2.3.6_mod/src/cmds/qstat.c	2009-01-09 12:52:25.000000000 -0800
> @@ -1075,6 +1075,7 @@ void display_statjob(
>    mxml_t *JE;
>    mxml_t *AE;
>    mxml_t *RE1;
> +  mxml_t *JI;
>  
>    /* XML only support for full output */
>  
> @@ -1126,9 +1127,15 @@ void display_statjob(
>  
>          MXMLCreateE(&JE, "Job");
>  
> -        MXMLSetVal(JE, p->name, mdfString);
> +        /*MXMLSetVal(JE, p->name, mdfString);*/
>  
>          MXMLAddE(DE, JE);
> +
> +        JI=NULL;
> +        MXMLCreateE(&JI,"Job_Id");
> +        MXMLSetVal(JI,p->name,mdfString);
> +        MXMLAddE(JE,JI);
> +
>          }
>        else
>          {
> diff -rup torque-2.3.6/src/resmom/requests.c torque-2.3.6_mod/src/resmom/requests.c
> --- torque-2.3.6/src/resmom/requests.c	2008-12-11 12:05:51.000000000 -0800
> +++ torque-2.3.6_mod/src/resmom/requests.c	2009-01-28 13:52:58.000000000 -0800
> @@ -2425,7 +2425,7 @@ static int del_files(
>  
>        default:
>  
> -        sprintf(log_buffer, ">>> failed to delete files, expansion of %s failed",
> +        sprintf(log_buffer, "*** failed to delete files, expansion of %s failed",
>                  path);
>  
>          add_bad_list(pbadfile, log_buffer, 1);
> @@ -3403,7 +3403,7 @@ nextword:
>  
>        if ((fp = fopen(rcperr, "r")) != NULL)
>          {
> -        add_bad_list(&bad_list, ">>> error from copy", 1);
> +        add_bad_list(&bad_list, "*** error from copy", 1);
>  
>          while (fgets(log_buffer, LOG_BUF_SIZE, fp) != NULL)
>            {
> @@ -3417,7 +3417,7 @@ nextword:
>  
>          fclose(fp);
>  
> -        add_bad_list(&bad_list, ">>> end error output", 1);
> +        add_bad_list(&bad_list, "*** end error output", 1);
>          }
>  
>  #ifdef HAVE_WORDEXP
>   