[torquedev] mom_priv/config line length limit, OS X struct differences

Neil Hodgson neil.hodgson at sirca.org.au
Thu Feb 5 17:01:28 MST 2009


    I should introduce myself first. I am working for SIRCA, a financial
services research organisation that uses TORQUE 1.2.0p5 (with local
patches) to run financial data queries on a set of nodes. A locally
developed scheduler written in TCL uses job type to schedule each job
onto a node that wants to receive that job type. Further, each node
specifies a relative priority for job types. The server and scheduler
run on a pair of machines using Red Hat clustering with a floating IP
address and name used to communicate with the primary. The local patches
to the scheduler implement a -h option for selecting the hostname
similar to the -H option to pbs_server. I worked for SIRCA for 6 months
last year, mostly on other areas, but I also made the TCL scheduler more
robust. I am back for around a month to upgrade TORQUE to 2.3.6 and to
rewrite the scheduler in C++ so it can be maintained by any of the
developers here.

    The first minor issue is that the code that reads mom_priv/config
uses fgets with a 120 character buffer. This has led to problems here
when a property for a node gets large. It would be better for SIRCA if
this buffer was larger - perhaps 250 to 1000 characters.
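
    To illustrate why the buffer size matters, here is a minimal sketch
(not TORQUE code; the function name fgets_calls_for_line is made up for
this example) that counts how many fgets calls it takes to consume one
line. fgets stores at most size - 1 characters per call, so a line longer
than that comes back in several pieces. How the config parser treats the
later pieces is not shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of TORQUE: writes one line of line_len
 * characters plus a newline to a temporary file, then counts how many
 * fgets calls with a buffer of bufsize bytes are needed to consume it.
 * Returns -1 if the temporary file or buffer cannot be created.
 */
int fgets_calls_for_line(size_t line_len, size_t bufsize)
{
    FILE *fp;
    char *buf;
    size_t i;
    int calls = 0;

    assert(bufsize >= 2);       /* room for one character plus the NUL */

    fp = tmpfile();
    if (fp == NULL)
        return -1;

    buf = malloc(bufsize);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }

    /* Build the test line: line_len characters followed by a newline. */
    for (i = 0; i < line_len; i++)
        fputc('x', fp);
    fputc('\n', fp);
    rewind(fp);

    /* Each call stores at most bufsize - 1 characters. */
    while (fgets(buf, (int)bufsize, fp) != NULL) {
        calls++;
        if (strchr(buf, '\n') != NULL)
            break;              /* reached the end of the line */
    }

    free(buf);
    fclose(fp);
    return calls;
}
```

    With a 120 byte buffer a 200 character line needs two calls, while a
1024 byte buffer reads it in one. Checking whether the buffer ends in a
newline, as the loop does, is also a cheap way to detect an over-long
line and report it instead of silently splitting it.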

    On OS X, sockaddr_in is different to Linux, notably in starting with
a sin_len field.

struct sockaddr_in {
	__uint8_t	sin_len;
	sa_family_t	sin_family;
	in_port_t	sin_port;
	struct	in_addr sin_addr;
	char		sin_zero[8];		/* XXX bwg2001-004 */
};

    In TORQUE, most common code fills in the sin_family, sin_port, and
sin_addr fields which leaves the sin_len and sin_zero fields
uninitialized. This appears to be safe in standard TORQUE where
INADDR_ANY is often used but can cause failures when sin_addr is set to
something else. The failures stopped when the structs were fully
initialized to zero with memset(&a, 0, sizeof(a)). I think it would be
good defensive programming to always fully initialize these structures.
While an initializer that names only the first field, like

     struct sockaddr_in a={0};

would be prettier in my opinion (the remaining fields are then zeroed
implicitly), recent GCC produces a warning for this idiom.
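
    As a sketch of the defensive pattern, the zeroing could be wrapped in
a small helper so that no caller can forget it. The names
init_sockaddr_in and padding_is_zero_after_init are hypothetical and do
not exist in TORQUE, and the port number is an arbitrary value chosen
for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper, not part of TORQUE: zero the whole structure
 * first, then set the three fields the caller cares about.
 * addr_host and port are given in host byte order.
 */
void init_sockaddr_in(struct sockaddr_in *sa, uint32_t addr_host, uint16_t port)
{
    assert(sa != NULL);
    memset(sa, 0, sizeof(*sa)); /* clears sin_zero, and sin_len where it exists */
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);
    sa->sin_addr.s_addr = htonl(addr_host);
}

/*
 * Self-check for the example: fill a struct with garbage, as an
 * uninitialized stack variable might be, then confirm the helper
 * leaves the padding zeroed and the three fields set.
 * Returns 1 on success, 0 otherwise.
 */
int padding_is_zero_after_init(void)
{
    struct sockaddr_in sa;
    size_t i;

    memset(&sa, 0xFF, sizeof(sa));          /* simulate stack garbage */
    init_sockaddr_in(&sa, INADDR_LOOPBACK, 12345);

    for (i = 0; i < sizeof(sa.sin_zero); i++) {
        if (sa.sin_zero[i] != 0)
            return 0;
    }
    return sa.sin_family == AF_INET
        && sa.sin_port == htons(12345)
        && sa.sin_addr.s_addr == htonl(INADDR_LOOPBACK);
}
```

    Because the memset covers the whole structure, the same code should
behave the same way on Linux and on OS X without any #ifdef for sin_len.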

    The manual page for pbs_server still shows the hostname parameter as
-h rather than -H.

    Neil





