Networking Problem: "The Lawnmower Man"

Summary

Update 2017-09-19 01:49 AM CST: PROBLEM SOLVED

The problem is essentially the one faced by the anti-hero at the end of the movie "The Lawnmower Man", who must test every port to find the one that answers his request with "Access Granted" so that he can escape into his desired network environment (in my analogy, The Lawnmower Man is "seeking a DHCP address"). The Lawnmower Man takes the first port that responds with "Access Granted". If the patch port has a trunk on it, The Lawnmower Man effectively has TWO routes out to the nameserver olive that will respond with "Access Granted". We want a VLAN 11 Lawnmower Man to have only a SINGLE key, to port s1 on switch sw1, which is why port s1 must be set to VLAN 11.

Two OpenvSwitches are configured on a single physical host as shown below. This was the configuration that did not work for the purpose required. The changes that were required to get it working correctly are marked with "<--" annotations.

The key step was to realize that switch "sw1", which normally carries VLAN 10 traffic only, must in this configuration pass the packets arriving from Host 2 via the GRE tunnel on to switch sx1. From there the packets reach nameserver "olive" via port "olivex" on openvswitch sx1 and are handed a DHCP address in the desired way, just as if there were only one physical host. Therefore, the key is to give port "s1" on switch "sw1" the VLAN 11 tag of openvswitch "sx1" as shown below, because port "s1" is (loosely speaking) "the gateway" to openvswitch sx1 and port olivex.

Also, the patch ports must not be trunking ports; they should pass only the packets bound for the desired destination, namely the openvswitch that corresponds to that VLAN, in this case VLAN 11. At the risk of stating the obvious, patch port "s1" is the way out of sw1 for the VLAN 11 packets, and importantly it is the ONLY way out of sw1 for them. Patch port s1 on bridge sw1 must therefore simply be tagged with VLAN 11 so that the VLAN 11 packets have one and only one port available to "go home" to their VLAN 11 home on switch sx1.

ubuntu@hub:~/Downloads/orabuntu-lxc-master/anylinux$ sudo ovs-vsctl show | egrep -A20 'sw1|sx1'
    Bridge "sw1"
        Port "s1"
            tag: 10 <-- VLAN tag "10" is WRONG! Patch Port should be VLAN tag "11"
            trunks: [10, 11] <-- Trunks is WRONG! Patch Port should not be a trunk port!
            Interface "s1"
                type: patch
                options: {peer="a1"}
        Port "ora73c12"
            tag: 10
            Interface "ora73c12"
        Port "sw1"
            tag: 10
            trunks: [10, 11]
            Interface "sw1"
                type: internal
        Port "s3"
            tag: 10
            Interface "s3"
        Port "ora73c10"
            tag: 10
            Interface "ora73c10"
        Port olivew
            tag: 10
            Interface olivew
        Port "ora73c11"
            tag: 10
            Interface "ora73c11"
        Port "s2"
            tag: 10
            Interface "s2"
        Port "ora73c13"
            tag: 10
            Interface "ora73c13"
    Bridge "sx1"
        Port "a1"
            tag: 11
            trunks: [10, 11] <-- Patch Port should NOT trunk; it only accepts VLAN 11 packets.
            Interface "a1"
                type: patch
                options: {peer="s1"}
        Port olivex
            tag: 11
            Interface olivex
        Port "oel73c10"
            tag: 11
            Interface "oel73c10"
        Port "sx1"
            tag: 11
            trunks: [10, 11]
            Interface "sx1"
                type: internal

ubuntu@hub:~/Downloads/orabuntu-lxc-master/anylinux$ ifconfig sw1
sw1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1420
        inet 10.207.39.1 netmask 255.255.255.0 broadcast 0.0.0.0
        inet6 fe80::7075:d1ff:fe60:f640 prefixlen 64 scopeid 0x20<link>
        ether 72:75:d1:60:f6:40 txqueuelen 1000 (Ethernet)
        RX packets 56784 bytes 3401798 (3.4 MB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 1975 bytes 297889 (297.8 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

ubuntu@hub:~/Downloads/orabuntu-lxc-master/anylinux$ ifconfig sx1
sx1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1420
        inet 10.207.29.1 netmask 255.255.255.0 broadcast 0.0.0.0
        inet6 fe80::f86a:51ff:fe9c:4549 prefixlen 64 scopeid 0x20<link>
        ether fa:6a:51:9c:45:49 txqueuelen 1000 (Ethernet)
        RX packets 871 bytes 48462 (48.4 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 86077 bytes 120495776 (120.4 MB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

ubuntu@hub:~/Downloads/orabuntu-lxc-master/anylinux$

There is an LXC container called "olive" that is attached to BOTH sw1 and sx1. It delivers DHCP addresses on the appropriate subnet ranges (10.207.29.x for sx1-attached LXC containers and 10.207.39.x for sw1-attached LXC containers), and all works perfectly as long as there are no "patch ports" between the two switches sw1 and sx1.

The Correct Required Switch Configuration

Shown below is the switch configuration on Host 2 that does work as required. The switch configuration is exactly the same on both hosts, with the one exception that the GRE tunnel endpoints are different (each points to the "other" physical host in the GRE tunnel pair).

The packets enter the GRE tunnel and go to an identical switch configuration on Host 1 (except for the GRE endpoint IP). The packets from switch sx1 on VLAN 11 can traverse into switch sw1 on VLAN 10 (because sw1 is set to trunks=10,11). Once the sx1 VLAN 11 packets are in switch sw1 they enter the GRE tunnel and emerge on Host 1 in switch sw1, which they can traverse because it is trunked. They cannot enter the nameserver "olive" via port olivew, because olivew is tagged VLAN 10 only. This is as required: it prevents LXC containers on switch sx1 on Host 2 from being handed .39 addresses, since we want sx1 containers to get .29 addresses. The packets can therefore only continue into patch port "s1" on switch "sw1" (VLAN tag 11 only), which hands them to port "a1" (VLAN tag 11 only) on switch "sx1". They finally reach nameserver "olive" via port "olivex" (VLAN tag 11 only), which hands out a .29 address to the VLAN 11 packets.

All other packets, those on VLAN 10, can only remain in switch "sw1" (VLAN 10 tag only) and reach nameserver "olive" only via port "olivew" (VLAN 10), which hands out .39 addresses to them. All is good: the separation of the sx1 and sw1 IP address ranges has been preserved on both Host 1 and Host 2, with no random mixing of the IP addresses handed out.

The critical change that keeps the networks separated properly is marked with "<--" below.

    Bridge "sw1"
        Port "ora73c16"
            tag: 10
            Interface "ora73c16"
        Port "ora73c21"
            tag: 10
            Interface "ora73c21"
        Port "ora73c19"
            tag: 10
            Interface "ora73c19"
        Port "sw1"
            tag: 10
            trunks: [10, 11]
            Interface "sw1"
                type: internal
        Port "s2"
            tag: 10
            Interface "s2"
        Port "ora73c17"
            tag: 10
            Interface "ora73c17"
        Port "gre0"
            Interface "gre0"
                type: gre
                options: {remote_ip="192.168.1.5"}
        Port "s1"
            tag: 11 <-- VLAN tag "11" is now CORRECT! Only pass VLAN 11 packets to switch sx1
            Interface "s1"
                type: patch
                options: {peer="a1"}
        Port "s3"
            tag: 10
            Interface "s3"
        Port "ora73c20"
            tag: 10
            Interface "ora73c20"
        Port "ora73c18"
            tag: 10
            Interface "ora73c18"
    ovs_version: "2.6.1"
    Bridge "sx1"
        Port "a1"
            tag: 11
            Interface "a1"
                type: patch
                options: {peer="s1"}
        Port "sx1"
            tag: 11
            Interface "sx1"
                type: internal
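
Comparing the non-working listing with the working one, the patch-port changes come down to one new tag and two removed trunk lists. A minimal sketch of how they could be applied with ovs-vsctl follows. The "ovs" wrapper and the APPLY variable are illustrative helpers invented for this sketch (they only print the commands unless APPLY=1 is set); "set port ... tag=" and "clear port ... trunks" are ordinary ovs-vsctl database commands. The exact commands I typed at the time are not recorded here, so treat this as one possible way to get from the first listing to the second.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints each command by default.
# Set APPLY=1 to really run them (needs root and Open vSwitch installed).
ovs() {
    if [ "${APPLY:-0}" = "1" ]; then
        sudo ovs-vsctl "$@"
    else
        echo "ovs-vsctl $*"
    fi
}

fix_patch_ports() {
    ovs set port s1 tag=11      # s1 on sw1 now belongs to VLAN 11
    ovs clear port s1 trunks    # drop the [10, 11] trunk list from s1
    ovs clear port a1 trunks    # drop the trunk list from its peer on sx1
}

fix_patch_ports
```

Afterwards "sudo ovs-vsctl show" should list Port "s1" with tag: 11 and no trunks line, as in the working listing above.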

Now I want to run a SECOND LXC physical host, and I want ALL the traffic between the two physical hosts to go over a SINGLE GRE tunnel, so I build a GRE tunnel between the sw1 switches over the physical NICs of the two physical hosts (192.168.1.5 and 192.168.1.32). Here is how it looks on host 2 (192.168.1.32), with the GRE endpoint pointing back to host 1 (192.168.1.5). The ports to note are the patch port "s1" and the GRE port "gre0".

    Bridge "sw1"
        Port "ora73c18"
            tag: 10
            Interface "ora73c18"
        Port "gre0"
            Interface "gre0"
                type: gre
                options: {remote_ip="192.168.1.5"}
        Port "ora73c17"
            tag: 10
            Interface "ora73c17"
        Port "s1"
            tag: 10 11
            trunks: [10, 11]
            Interface "s1"
                type: patch
                options: {peer="a1"}
        Port "ora73c16"
            tag: 10
            Interface "ora73c16"
        Port "sw1"
            tag: 10
            trunks: [10, 11]
            Interface "sw1"
                type: internal
        Port "s3"
            tag: 10
            Interface "s3"
        Port "s2"
            tag: 10
            Interface "s2"
    ovs_version: "2.6.1"

And here is the tunnel on host 1 pointing to host 2 GRE endpoint:

    Bridge "sw1"
        Port "s2"
            tag: 10
            Interface "s2"
        Port "s3"
            tag: 10
            Interface "s3"
        Port "sw1"
            tag: 10
            trunks: [10, 11]
            Interface "sw1"
                type: internal
        Port olivew
            tag: 10
            Interface olivew
        Port "s1"
            tag: 10 11
            trunks: [10, 11]
            Interface "s1"
                type: patch
                options: {peer="a1"}
        Port "ora73c13"
            tag: 10
            Interface "ora73c13"
        Port "ora73c12"
            tag: 10
            Interface "ora73c12"
        Port "ora73c14"
            tag: 10
            Interface "ora73c14"
        Port "ora73c11"
            tag: 10
            Interface "ora73c11"
        Port "gre0"
            Interface "gre0"
                type: gre
                options: {remote_ip="192.168.1.32"}
    Bridge "sw2"
        Port "sw2"
            tag: 80
            Interface "sw2"
                type: internal
    ovs_version: "2.6.1"
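
The "gre0" ports in the two listings above differ only in remote_ip. As a rough sketch, a port like that could be created as below; "add-port" and "set interface ... type=gre options:remote_ip=" are standard ovs-vsctl usage, while the "ovs" wrapper, the APPLY variable and the "add_gre" function name are made up for illustration (the wrapper only prints the command unless APPLY=1 is set).

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints each command by default.
# Set APPLY=1 to really run them (needs root and Open vSwitch installed).
ovs() {
    if [ "${APPLY:-0}" = "1" ]; then
        sudo ovs-vsctl "$@"
    else
        echo "ovs-vsctl $*"
    fi
}

add_gre() {
    # $1 = IP address of the OTHER physical host
    ovs add-port sw1 gre0 -- set interface gre0 type=gre options:remote_ip="$1"
}

# On host 2 (192.168.1.32) the tunnel points back at host 1:
add_gre 192.168.1.5
```

On host 1 the same call would be made with 192.168.1.32 instead.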

Some Commentary Written Prior to Solution...

This GRE tunnel does a great job of handling the DNS/DHCP for the LXC containers on both hosts, but only for containers attached to the sw1 switch on each host. The containers on the sx1 switch on host 2 cannot get a DHCP address from host 1 (although of course the sx1-attached containers on host 1 can get addresses).

So my first solution for this was clunky and inefficient, but it did work reliably: I created a 2nd GRE tunnel on switch sx1 on each physical host, with endpoints at IP 10.207.39.1 on the host 1 sw1 switch and IP 10.207.39.4 on the host 2 sw1 switch. This solution had several drawbacks. One is that the MTU was cut down from 1500 not just to 1420 but further down to 1340, because the traffic effectively had to traverse 2 GRE tunnels. The other is that it is not scalable: it would be ridiculous to build a tunnel for each network that has to traverse between the hosts, because the MTU would be gradually degraded to nothing, and moreover it would become excessively complex to manage so many tunnels.
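
To make the MTU arithmetic concrete: the figures above (1500, then 1420, then 1340) work out to 80 bytes given up per level of tunnelling. That 80 is simply read off my own numbers, not derived from the GRE header size, so treat it as an assumption of this sketch. The "tunnel_mtu" function name is made up for illustration.

```shell
#!/bin/sh
tunnel_mtu() {
    # $1 = starting MTU, $2 = bytes given up per tunnel, $3 = number of nested tunnels
    echo $(( $1 - $2 * $3 ))
}

tunnel_mtu 1500 80 1    # single GRE tunnel: prints 1420
tunnel_mtu 1500 80 2    # tunnel inside a tunnel: prints 1340
```

Every further level of nesting would take another 80 bytes off at this rate, which is the degradation I wanted to avoid.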

Therefore, I looked for a solution that allows pushing all traffic from multiple openvswitches over the single GRE tunnel on the single pair of sw1 switches. After some research and study I discovered openvswitch "patch ports", which can be used to connect two openvswitches (on the same host). And they worked very well. I patch connect sx1 and sw1 on LXC host 2, and this sends all the sw1 and sx1 traffic from host 2 over the GRE tunnel to the host 1 sw1 switch.
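
For reference, a patch connection between two openvswitches is a pair of ports, one on each switch, each naming the other as its peer. The sketch below shows roughly how such a pair could be created. It uses the port names s1/a1 and the VLAN 11 tag from the working configuration at the top of this post, which is not how the ports were tagged when this commentary was written. The "ovs" wrapper, the APPLY variable and the "add_patch_pair" function are hypothetical helpers; the wrapper only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints each command by default.
# Set APPLY=1 to really run them (needs root and Open vSwitch installed).
ovs() {
    if [ "${APPLY:-0}" = "1" ]; then
        sudo ovs-vsctl "$@"
    else
        echo "ovs-vsctl $*"
    fi
}

add_patch_pair() {
    # $1 $2 = bridge and port on one side, $3 $4 = bridge and port on the other,
    # $5 = VLAN tag given to both ends
    ovs add-port "$1" "$2" -- set port "$2" tag="$5" -- set interface "$2" type=patch options:peer="$4"
    ovs add-port "$3" "$4" -- set port "$4" tag="$5" -- set interface "$4" type=patch options:peer="$2"
}

add_patch_pair sw1 s1 sx1 a1 11
```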

The problem arises, though, that because there are two subnets I need the host 1 switches sw1 and sx1 connected as well, so I patch port them too. That way DHCP addresses for both 10.207.29.x and 10.207.39.x can be handed out on host 1 and sent back to containers on host 2. I've tested this, and for this particular solution, to get both .29 and .39 addresses handed out, the patch ports have to be used on both sides of the GRE tunnel, on host 1 and host 2.

The problem is that with this patch port configuration the DHCP addresses are handed out randomly, and the separation of the switches and IP subnets breaks down. I now get something like this each time: a random mix in the first IP address listed for each container (the DHCP address), sometimes on .29 and sometimes on .39:

UPDATE 2017-09-19: The mixup of IP addresses below occurred because some but not all VLAN 10 packets were leaking into the sx1 switch and getting .29 addresses (actually it seems that path was the preferred one, because more .29 addresses were handed out than .39 addresses). The problem has been fixed; see above.

ubuntu@hub:~/Downloads/orabuntu-lxc-master/anylinux$ sudo lxc-ls -f
[sudo] password for ubuntu:
NAME     STATE   AUTOSTART GROUPS IPV4 IPV6
oel73c10 RUNNING 0         -      10.207.29.10 -
olive    RUNNING 0         -      10.207.29.2, 10.207.39.2 -
ora73c10 RUNNING 0         -      10.207.29.11, 172.230.40.10, 172.231.40.10, 192.220.39.10, 192.221.39.10, 192.222.39.10, 192.223.39.10 -
ora73c11 RUNNING 0         -      10.207.29.12, 172.230.40.11, 172.231.40.11, 192.220.39.11, 192.221.39.11, 192.222.39.11, 192.223.39.11 -
ora73c12 RUNNING 0         -      10.207.39.13, 172.230.40.12, 172.231.40.12, 192.220.39.12, 192.221.39.12, 192.222.39.12, 192.223.39.12 -
ora73c13 RUNNING 0         -      10.207.29.14, 172.230.40.13, 172.231.40.13, 192.220.39.13, 192.221.39.13, 192.222.39.13, 192.223.39.13 -
ubuntu@hub:~/Downloads/orabuntu-lxc-master/anylinux$

I get a mix of DHCP addresses handed out. I've tried taking the patch ports out on host 1 but then DHCP fails for containers on host 2 on the sx1 10.207.29.x network. Then I put the patch port back on host 1 and now the sx1 containers on host 2 can get IP addresses, but sometimes they are on 10.207.29.x and sometimes they are on 10.207.39.x.
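
Spotting the mixed-up leases by eye gets tedious, so here is a small sketch that flags them from "lxc-ls -f" output. It assumes, based on the ovs-vsctl listings in this post, that containers named oel* sit on sx1 (expected 10.207.29.x) and all other containers sit on sw1 (expected 10.207.39.x); "olive" is skipped because it is attached to both switches on purpose. The "check_subnets" function name is invented for the example, and the sample lines are copied from the listing above.

```shell
#!/bin/sh
check_subnets() {
    # Reads "lxc-ls -f" style text on stdin and prints one line for every
    # 10.207.x.x address that is on the wrong subnet for its container.
    awk '
    $1 != "olive" && $2 == "RUNNING" {
        want = ($1 ~ /^oel/) ? "10.207.29." : "10.207.39."
        for (i = 5; i <= NF; i++) {
            addr = $i
            sub(/,$/, "", addr)                 # strip the list separator
            if (addr ~ /^10\.207\./ && index(addr, want) != 1)
                print $1, addr, "expected", want "x"
        }
    }'
}

check_subnets <<'EOF'
NAME STATE AUTOSTART GROUPS IPV4 IPV6
oel73c10 RUNNING 0 - 10.207.29.10 -
olive RUNNING 0 - 10.207.29.2, 10.207.39.2 -
ora73c10 RUNNING 0 - 10.207.29.11, 172.230.40.10, 172.231.40.10, 192.220.39.10, 192.221.39.10, 192.222.39.10, 192.223.39.10 -
ora73c12 RUNNING 0 - 10.207.39.13, 172.230.40.12, 172.231.40.12, 192.220.39.12, 192.221.39.12, 192.222.39.12, 192.223.39.12 -
EOF
```

On a live host the same function would be fed with "sudo lxc-ls -f | check_subnets"; no output means every container is on its expected subnet.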

Is there a way to tell the nameserver olive on host 1 that containers on VLAN 11 (sx1) are to ONLY get 10.207.29.x ip addresses, and those on VLAN 10 (sw1) are to get only 10.207.39.x ip addresses?

Note that this problem is intrinsic to host 1, with or without the GRE tunnel. It's a problem with the design of the networking on host 1. The patch ports are the only solution I've found so far (I'm looking for others) to get DHCP addresses for the host 2 LXC containers from the DNS/DHCP on host 1, but even without the GRE tunnel, the LXC containers on host 1 start getting mixed-up IP addresses when the patch ports are active on host 1.

That is, I can take away the patch ports on host 1 and everything works fine: no random mixing of the subnets and switches, and life is good on host 1. Host 1 containers on sx1 get .29 addresses and host 1 containers on sw1 get .39 addresses.

But sw1 is the switch connected to the GRE tunnel on each physical host, so I need the patch ports on both sides of the tunnel, on both physical hosts, so that all the traffic on both switches on both hosts can traverse the GRE tunnel. And when I add the patch ports on host 1, I get into the mixing problem.

Is there an obvious solution to fix the problem? UPDATE: Yes, I just found the solution.