[torquedev] "Fixing" qsig -s USR1 and kill_delay on torque 2.5.x
Alan Wild
alan at madllama.net
Sun Apr 15 19:51:10 MDT 2012
I got some more time this evening to compare my patch against the latest
3.0.5 snapshot. It turns out that the 2.x patch applies cleanly against
the 3.x tree. I think the last version of the 2.x patch was eaten by the
mail server during the outage, so here it is again.
-%<---%<--CUT HERE---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<--
diff -rN -U4 torque-3.0.5-snap.201204051313-old/src/resmom/start_exec.c torque-3.0.5-snap.201204051313/src/resmom/start_exec.c
--- torque-3.0.5-snap.201204051313-old/src/resmom/start_exec.c 2012-04-05 14:13:16.000000000 -0500
+++ torque-3.0.5-snap.201204051313/src/resmom/start_exec.c 2012-04-15 20:42:00.000000000 -0500
@@ -2048,8 +2048,13 @@
if (TJE->is_interactive == FALSE)
{
int k;
+ if (strlen(buf)+5 <= MAXPATHLEN) {
+ memmove(buf+5,buf,strlen(buf)+1);
+ strncpy(buf, "exec ", 5);
+ }
+
/* pass name of shell script on pipe */
/* will be stdin of shell */
close(TJE->pipe_script[0]);
@@ -3732,9 +3737,9 @@
{
arg[aindex] = malloc(
strlen(path_jobs) +
strlen(pjob->ji_qs.ji_fileprefix) +
- strlen(JOB_SCRIPT_SUFFIX) + 1);
+ strlen(JOB_SCRIPT_SUFFIX) + 6);
if (arg[aindex] == NULL)
{
log_err(errno,id,"cannot alloc env");
@@ -3745,9 +3750,10 @@
return(-1);
}
- strcpy(arg[aindex], path_jobs);
+ strcpy(arg[aindex], "exec ");
+ strcat(arg[aindex], path_jobs);
strcat(arg[aindex], pjob->ji_qs.ji_fileprefix);
strcat(arg[aindex], JOB_SCRIPT_SUFFIX);
arg[aindex + 1] = NULL;
-%<---%<--CUT HERE---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<--
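
For anyone reading the diff cold, here is a minimal standalone sketch of
what the two hunks do: the first shifts the script path right by five bytes
and writes "exec " in front of it, the second builds the same string in a
freshly allocated buffer for the SHELL_USE_ARGV case. The function names
prepend_exec() and build_exec_arg() and the bufsize parameter are
hypothetical and are not in the Torque source. The patch itself compares
against MAXPATHLEN, which is only safe if buf holds at least MAXPATHLEN+1
bytes; I have not restated buf's declared size here, so the sketch takes
the real buffer size as an argument instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define EXEC_PREFIX     "exec "
#define EXEC_PREFIX_LEN 5        /* strlen("exec ") */

/*
 * Prepend "exec " to the NUL-terminated string in buf, in place.
 * bufsize is the total size of buf in bytes (hypothetical parameter;
 * the patch checks against MAXPATHLEN instead).
 * Returns 0 on success, -1 if the result would not fit (buf unchanged).
 */
int prepend_exec(char *buf, size_t bufsize)
{
  size_t len;

  assert(buf != NULL);
  len = strlen(buf);

  /* need room for the prefix, the original text and the trailing NUL */
  if (len + EXEC_PREFIX_LEN + 1 > bufsize)
    return -1;

  /* source and destination overlap, so memmove rather than memcpy;
     len + 1 carries the terminating NUL along with the text */
  memmove(buf + EXEC_PREFIX_LEN, buf, len + 1);

  /* write the five prefix bytes without adding a NUL of our own */
  memcpy(buf, EXEC_PREFIX, EXEC_PREFIX_LEN);

  return 0;
}

/*
 * Build "exec <dir><prefix><suffix>" in a new heap buffer, mirroring the
 * second and third hunks: the old "+ 1" covered only the NUL, the new
 * "+ 6" covers the NUL plus the five bytes of "exec ".
 * Returns NULL if malloc fails; the caller frees the result.
 */
char *build_exec_arg(const char *dir, const char *prefix, const char *suffix)
{
  char *arg = malloc(strlen(dir) + strlen(prefix) + strlen(suffix) +
                     EXEC_PREFIX_LEN + 1);

  if (arg == NULL)
    return NULL;

  strcpy(arg, EXEC_PREFIX);
  strcat(arg, dir);
  strcat(arg, prefix);
  strcat(arg, suffix);

  return arg;
}
```

With a 32-byte buffer holding "/tmp/job.SC", prepend_exec() leaves
"exec /tmp/job.SC" in place; with a buffer too small for the extra bytes
it returns -1 and leaves the string alone, which matches the patch
skipping the rewrite when the length check fails.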
-Alan
On Wed, Mar 28, 2012 at 2:47 PM, Alan Wild <alan at madllama.net> wrote:
> We still don't have permission to install torque-4.0.0, even on our test
> systems. However, I thought I would take a look at the source for pbs_mom
> to see how it works. It appears, overall, very similar to 2.5.11, so I
> attempted to port my patch to 4.0.0.
>
> This does compile, but I can't comment on whether or not it will run. :)
>
>
> -%<---%<--CUT HERE---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<--
> diff -rN -U4 torque-4.0.0/src/resmom/start_exec.c torque-4.0.0-new/src/resmom/start_exec.c
> --- torque-4.0.0/src/resmom/start_exec.c 2012-02-21 17:43:51.000000000 -0600
> +++ torque-4.0.0-new/src/resmom/start_exec.c 2012-03-28 14:34:37.000000000 -0500
> @@ -2213,8 +2213,13 @@
> if (TJE->is_interactive == FALSE)
> {
> int k;
>
> + if (strlen(buf)+5 <= MAXPATHLEN) {
> + memmove(buf+5,buf,strlen(buf)+1);
> + strncpy(buf, "exec ", 5);
> + }
> +
> /* pass name of shell script on pipe */
> /* will be stdin of shell */
>
> close(TJE->pipe_script[0]);
> @@ -3881,9 +3886,9 @@
> {
> arg[aindex] = calloc(1,
> strlen(path_jobs) +
> strlen(pjob->ji_qs.ji_fileprefix) +
> - strlen(JOB_SCRIPT_SUFFIX) + 1);
> + strlen(JOB_SCRIPT_SUFFIX) + 6);
>
> if (arg[aindex] == NULL)
> {
> log_err(errno,id,"cannot alloc env");
> @@ -3894,9 +3899,10 @@
>
> return(-1);
> }
>
> - strcpy(arg[aindex], path_jobs);
> + strcpy(arg[aindex], "exec ");
> + strcat(arg[aindex], path_jobs);
> strcat(arg[aindex], pjob->ji_qs.ji_fileprefix);
> strcat(arg[aindex], JOB_SCRIPT_SUFFIX);
>
> arg[aindex + 1] = NULL;
> -%<---%<--CUT HERE---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<---%<--
>
>
> -Alan
>
>
> On Mon, Mar 26, 2012 at 11:15 PM, Alan Wild <alan at madllama.net> wrote:
>
>
>> Sorry, real world intruded the last couple of weeks and I haven't had a
>> chance to dive back into this. Yes, for users building Torque with
>> SHELL_USE_ARGV == 1, you would need to modify TMomFinalizeChild().
>> However, we don't build Torque this way, so I haven't had a chance to
>> really test this. Regardless, I took a stab at making the patch more
>> complete.
>>
>> This is against the released 2.5.11:
>>
>>
>> diff -rN -U2 torque-2.5.11/src/resmom/start_exec.c torque-2.5.11-new/src/resmom/start_exec.c
>> --- torque-2.5.11/src/resmom/start_exec.c 2012-03-08 15:34:57.000000000 -0600
>> +++ torque-2.5.11-new/src/resmom/start_exec.c 2012-03-26 23:03:56.000000000 -0500
>> @@ -1997,4 +1997,9 @@
>> int k;
>>
>> + if (strlen(buf)+5 <= MAXPATHLEN) {
>> + memmove(buf+5,buf,strlen(buf)+1);
>> + strncpy(buf, "exec ", 5);
>> + }
>> +
>> /* pass name of shell script on pipe */
>> /* will be stdin of shell */
>> @@ -3641,5 +3646,5 @@
>> strlen(path_jobs) +
>> strlen(pjob->ji_qs.ji_fileprefix) +
>> - strlen(JOB_SCRIPT_SUFFIX) + 1);
>> + strlen(JOB_SCRIPT_SUFFIX) + 6);
>>
>> if (arg[aindex] == NULL)
>> @@ -3654,5 +3659,6 @@
>> }
>>
>> - strcpy(arg[aindex], path_jobs);
>> + strcpy(arg[aindex], "exec ");
>> + strcat(arg[aindex], path_jobs);
>> strcat(arg[aindex], pjob->ji_qs.ji_fileprefix);
>> strcat(arg[aindex], JOB_SCRIPT_SUFFIX);
>>
>> I would love to know if anyone other than me has played with this patch
>> and whether or not it's looking viable.
>>
>> -Alan
>>
>>
>> On Mon, Mar 19, 2012 at 4:52 AM, <torquedev-request at supercluster.org>
>> wrote:
>> >
>> > I definitely agree that exec'ing the script is the correct way to
>> > spawn it. I think the patch is reasonable.
>> >
>> > Would "exec " also need to be added to the shell command line in
>> > TMomFinalizeChild() in the SHELL_USE_ARGV == 1 case?
>> >
>> > Michael
>>
>> --
>> alan at madllama.net http://humbleville.blogspot.com
>>
>
>
> --
> alan at madllama.net http://humbleville.blogspot.com
>
>
--
alan at madllama.net http://humbleville.blogspot.com