Bugzilla – Bug 211
pbs_sched does not read TRQ_IFNAME
Last modified: 2012-08-22 13:18:21 MDT
The issue: if TRQ_IFNAME is defined in torque.cfg, it affects only pbs_server, not the pbs_sched daemon. I have created a patch that allows pbs_sched to use the TRQ_IFNAME interface, as pbs_server already does. We have been testing the patch on 3.0.5, but it is suitable for 2.5.x as well (the pbs_sched.c files are the same in both versions). For 4.x it should be similar, but I have not tried it yet. This is my first patch for torque, so I hope you will review it thoroughly.

--- ./torque-3.0.5/src/scheduler.cc/pbs_sched.c 2010-12-06 23:44:13.000000000 +0100
+++ ./torque-3.0.5_new/src/scheduler.cc/pbs_sched.c 2012-07-31 15:53:02.083264191 +0200
@@ -101,6 +101,8 @@
 #include <sys/socket.h>
 #include <netinet/in.h>
 #include <arpa/inet.h>
+#include <sys/ioctl.h>
+#include <net/if.h>

 #if defined(FD_SET_IN_SYS_SELECT_H)
 # include <sys/select.h>
@@ -995,6 +997,26 @@
     die(0);
     }
+  // If TRQ_IFNAME is set in torque.cfg then listen to it
+  char *if_name = trq_get_if_name();
+  if(if_name)
+    {
+    struct ifreq ifr;
+    strncpy(ifr.ifr_name, if_name, sizeof(if_name));
+    if(ioctl(server_sock, SIOCGIFADDR, &ifr) < 0)
+      {
+      fprintf(stderr, "can not resolve the network interface: %s\n", if_name);
+      if(if_name)
+        free(if_name);
+      die(0);
+      }
+    struct in_addr *if_addr = &((struct sockaddr_in*)&ifr.ifr_addr)->sin_addr;
+    char *if_addr_str = inet_ntoa(*if_addr);
+    memcpy(host, if_addr_str, strlen(if_addr_str));
+    memcpy(hp->h_addr, if_addr, sizeof(struct in_addr));
+    free(if_name);
+    }
+
   if (setsockopt(server_sock, SOL_SOCKET, SO_REUSEADDR, (char *)&t, sizeof(t)) == -1)
     {
There's not a lot of change between the 2.5 pbs_sched.c and the 4.x; it introduces some locks via mutexes, but the place where this goes appears to be outside of those.
Created an attachment (id=115): Using TRQ_IFNAME in pbs_sched
Created an attachment (id=116): Using TRQ_IFNAME in pbs_sched
(In reply to comment #1)
> There's not a lot of change between the 2.5 pbs_sched.c and the 4.x; it
> introduces some locks via mutexes, but the place where this goes appears to be
> outside of those.

OK, I've updated the patch for v2 and v3 and created a new one for v4. The patches are attached.
Before we do anything with this patch, it needs to be understood that torque.cfg is intended to be used for qsub only. Allowing a torque.cfg parameter to be used outside of qsub changes the paradigm. Comments?
Good catch Ken, that was a subtlety that had passed me by completely! It's been almost a decade since we used pbs_sched so I can't recall if it has any config options of its own..
I wonder: is there any reason why pbs_sched does not listen on any address (as PBS Pro pbs_sched does)?
Any comments about listening to any address?
(In reply to comment #7)
> I wonder: is there any reason why pbs_sched does not listen on any address
> (as PBS Pro pbs_sched does)?

I suspect that's just something that the PBSPro people added after they took it closed source. I don't see any reason why pbs_sched shouldn't be able to do this, but as Ken said, the question is whether torque.cfg is the right place. I'd have thought so if Ken hadn't pointed out it's only for qsub (but then I'd have expected it to be called qsub.cfg).

Reading the manual page for pbs_sched_cc (we didn't have it installed as we don't use it here), it does say it takes a -c option for a config file, but it won't read one at all if it's not passed to it at startup. I'd say that'd be the place to define it.
OK, Chris, I got it about torque.cfg. We also don't use pbs_sched, but some of our customers are using it on complex network configurations. So, my second question was why pbs_sched cannot listen on ANY address ALL the time. In this case no config parameter is needed at all. Maybe there are security reasons not to listen on any address?
(In reply to comment #10)
> So, my second question was why pbs_sched cannot listen on ANY address ALL the
> time. In this case no config parameter is needed at all. Maybe there are
> security reasons not to listen on any address?

That would be an understatement. :-)

I'm sure Bright Computing has security people on staff. I bet one of them could go into more detail for you, but the short answer is that you never want private services listening publicly. It's an unnecessary risk.
Hi Michael,

> I'm sure Bright Computing has security people on staff. I bet one of them
> could go into more detail for you, but the short answer is that you never want
> private services listening publicly. It's an unnecessary risk.

Sure, of course I agree (and security people will agree as well) that this is bad practice in general.

But:

1. On computing clusters, usually all ports (except several) are closed on external interfaces.
2. PBS Pro pbs_sched and pbs_mom listen on any address (OK, let's suppose for now that there are no security people on the PBS Pro team).
3. TORQUE pbs_server listens on any address:
tcp 0 0 0.0.0.0:15001 0.0.0.0:* LISTEN 3682/pbs_server

Could you please explain: if you follow the rule "do not listen on any address", then why does TORQUE pbs_server not follow this rule as well?
(In reply to comment #12)
> > the short answer is that you never want
> > private services listening publicly. It's an unnecessary risk.
>
> Sure, of course I agree (and security people will agree as well) that this is
> bad practice in general.
>
> But:
>
> 1. On computing clusters, usually all ports (except several) are closed on
> external interfaces.

Closed by what? By definition, the ports aren't closed if there's something listening on them. And if you're referring to a firewall...well, they fail. :-)

> 2. PBS Pro pbs_sched and pbs_mom listen on any address (OK, let's suppose for
> now that there are no security people on the PBS Pro team).

Note the word "private" in my previous comment. pbs_mom is not a private service. In TORQUE, pbs_sched is. Maybe it's not in PBSPro; I have no idea. The two diverged a long time ago, and just because they share an ancestry doesn't mean one can make assumptions about commonalities of current behavior. If one could, we'd all be climbing trees and slinging poo like the other primates. ;-)

> 3. TORQUE pbs_server listens on any address:
> tcp 0 0 0.0.0.0:15001 0.0.0.0:* LISTEN 3682/pbs_server

See above. pbs_server needs to listen to other hosts. pbs_sched doesn't, AFAIK.

> Could you please explain: if you follow the rule "do not listen on any
> address", then why does TORQUE pbs_server not follow this rule as well?

See above. :-)

As with most products, the defaults are configured for the general case. They won't cover every possible use case for every possible user. For the majority of users, pbs_sched listening on localhost only is the correct choice. Same for trqauthd. Sure, they could default to listening on 0.0.0.0, but that would violate the Principle of Least Privilege (see https://developer.apple.com/library/mac/#documentation/Security/Conceptual/Security_Overview/SecuritySvcs/SecuritySvcs.html#//apple_ref/doc/uid/TP40002650-SW4 for more).
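For readers less familiar with the distinction being debated, the difference between "listening on localhost only" and "listening on 0.0.0.0" comes down to the address passed to `bind()`. The following is a minimal sketch of that difference, not code from pbs_sched or pbs_server. The helper names `open_listener` and `bound_address` are hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: open a TCP listener bound to the given IPv4
 * address (host byte order) and port. Returns the socket fd, or -1. */
int open_listener(uint32_t addr, uint16_t port)
{
    struct sockaddr_in sa;
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(addr);  /* INADDR_LOOPBACK or INADDR_ANY */
    sa.sin_port = htons(port);         /* 0 lets the kernel pick a port */

    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 || listen(fd, 5) < 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}

/* Report the address the socket is bound to (host byte order),
 * or UINT32_MAX if the lookup fails. */
uint32_t bound_address(int fd)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof(sa);

    memset(&sa, 0, sizeof(sa));
    if (getsockname(fd, (struct sockaddr *)&sa, &len) < 0)
        return UINT32_MAX;
    return ntohl(sa.sin_addr.s_addr);
}
```

A socket opened with `open_listener(INADDR_LOOPBACK, port)` accepts connections only from the local host, while `open_listener(INADDR_ANY, port)` accepts them on every configured interface and shows up in netstat as `0.0.0.0:port`, like the pbs_server line quoted above. Binding to one specific interface address, as the proposed patch does, sits between the two.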