[torqueusers] 2 problems with torque-2.0.0p7

David B Jackson jacksond at clusterresources.com
Mon Jan 30 21:15:24 MST 2006


Garrick,

  The snapshot released contained 'Buffer = calloc(1,BufSize)'.

Dave

> I think calloc() is still necessary.
>
> On Mon, Jan 30, 2006 at 08:55:25PM -0700, David B Jackson alleged:
>> Martin,
>>
>>   Your patch is exactly right.  The latest 2.1.0 snapshot corrects this
>> issue.  With this change in place, does your multi-homed host issue
>> disappear?
>>
>> Dave
>>
>> > On Mon, Jan 30, 2006 at 05:22:39PM -0800, Martin Siegert wrote:
>> >
>> >> 2) this problem has to do with multi-homed hosts and is by far more
>> >> serious as it stops me dead in my tracks:
>> >>
>> >> $PBS_HOME/server_name contains "b001"
>> >> $PBS_HOME/torque.cfg contains "SERVERHOST b001"
>> >>
>> >> When I submit a job with qsub it returns jobids of the form
>> >> 2345.<hostname>
>> >> instead of 2345.b001. This used to work in torque-2.0.0p3 (which is the
>> >> last version I used before switching to 2.0.0p7)! Thus, this broke
>> >> somewhere in versions 2.0.0p4 - 2.0.0p7. The effect is that, e.g.,
>> >>
>> >> qdel 2345
>> >>
>> >> does not work anymore - I always have to enter the full jobid
>> >> 2345.<hostname>, which is rather annoying and more importantly
>> >> impossible to explain to users.
>> >> I suspect that the problem is with pbs_server.
>> >
>> > It appears that "TLoadConfig(Buffer,sizeof(Buffer))" in pbsd_main.c,
>> > line 505, only reads the first 4 characters of the torque.cfg file.
>> >
>> > Consider the following code:
>> >
>> > #include <stdio.h>
>> > #include <stdlib.h>
>> >
>> > int main (int argc, char *argv[]){
>> > char *Buffer;
>> > int BufSize;
>> >
>> >    BufSize = 65536*sizeof(char);
>> >    Buffer = (char *)malloc(BufSize);
>> >    /* sizeof(Buffer) is the size of the pointer, not of the allocation */
>> >    printf("BufSize=%i, sizeof(Buffer)=%zu\n", BufSize, sizeof(Buffer));
>> >    free(Buffer);
>> >    return 0;
>> > }
>> >
>> > When you run the corresponding program (here on a 32-bit system, where
>> > a pointer is 4 bytes) you get
>> >
>> > BufSize=65536, sizeof(Buffer)=4
>> >
>> > :-(
>> >
>> > In the older versions of torque Buffer was defined as
>> >
>> > char Buffer[65536];
>> >
>> > in which case sizeof(Buffer) has the desired result.
>> > Thus, we either
>> > 1) go back to the old version,
>> > 2) use the code from qsub.c (which is very similar to the old version),
>> > or use something like the following:
>> >
>> > --- src/server/pbsd_main.c.orig	Mon Jan 30 18:49:59 2006
>> > +++ src/server/pbsd_main.c	Mon Jan 30 19:08:47 2006
>> > @@ -452,6 +452,7 @@
>> >    time_t last_jobstat_time;
>> >    int    when;
>> >
>> > +  int    BufSize;
>> >    char   *Buffer;
>> >
>> >    void	 ping_nodes A_((struct work_task *ptask));
>> > @@ -476,7 +477,8 @@
>> >
>> >    ProgName = argv[0];
>> >
>> > -  Buffer=calloc(65536,sizeof(char));
>> > +  BufSize=65536*sizeof(char);
>> > +  Buffer=(char *)malloc(BufSize);
>> >
>> >    /* if we are not running with real and effective uid of 0, forget it */
>> >
>> > @@ -502,7 +504,7 @@
>> >
>> >    /* load/process config file first then override values with command line parameters */
>> >
>> > -  if (TLoadConfig(Buffer,sizeof(Buffer)) == 0)
>> > +  if (TLoadConfig(Buffer,BufSize) == 0)
>> >      {
>> >      char *ptr;
>> >      char *tptr;
>> >
>> >
>> > Cheers,
>> > Martin
>> >
>> > --
>> > Martin Siegert
>> > Head, HPC at SFU
>> > WestGrid Site Manager
>> > Academic Computing Services                        phone: (604) 291-4691
>> > Simon Fraser University                            fax:   (604) 291-4242
>> > Burnaby, British Columbia                          email: siegert at sfu.ca
>> > Canada  V5A 1S6
>> > _______________________________________________
>> > torqueusers mailing list
>> > torqueusers at supercluster.org
>> > http://www.supercluster.org/mailman/listinfo/torqueusers
>> >
>>
>
> --
> Garrick Staples, Linux/HPCC Administrator
> University of Southern California
>


