Pros and Cons of Overloading a Subnet to increase the efficiency of IP use in an ISP network and my implementation

This blog post is based on some work I did for one of my customers, implementing the concept described in https://web.archive.org/web/20210115215112/http://66.250.49.4/image/jimm/NewISP-IPV4conceptV1.1.pdf (note: the original document doesn’t seem to be accessible anymore). The concept has apparently been around the internet for a while, but I have not seen it implemented at scale before (for good reasons).

First off, “subnet overloading” is the term I am using because that’s the phrase Jeremy Austin gave me in a conversation in The Brothers WISP Slack group (join us by donating on Patreon) where we discussed this idea. I have done a bit of googling and have not found the phrase elsewhere, so I don’t know if it’s really the proper term, but it sounds right. Just to be clear, I am using subnet overloading to mean “reusing the same IP scope in multiple layer 2 networks on separate routers”.

This works because a router can hold the larger subnet as a connected route (which usually means the router has an interface in that subnet), while more specific routes to individual IPs in that subnet still override that connected route, since the longest matching prefix wins. So hosts in the same subnet can be connected to two different routers and still work just fine for most intents and purposes. (Note: there are exceptions, and I will talk about those later.)
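To make the “more specific route wins” idea concrete, here is a minimal Python sketch of longest-prefix matching. The routing table, the next-hop labels and the best_route helper are all made up for illustration; this is not how any router actually stores its routes.

```python
import ipaddress

def best_route(routes, destination):
    """Return the (prefix, next_hop) pair with the longest prefix containing destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes.items()
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), hop

# Hypothetical table on router A: the overloaded subnet is connected locally,
# and a /32 learned from router B covers one customer who sits behind B.
routes_on_a = {
    "192.0.2.0/24": "connected (vlan1000-public)",
    "192.0.2.20/32": "via router B",
}

print(best_route(routes_on_a, "192.0.2.20"))  # matched by the /32
print(best_route(routes_on_a, "192.0.2.21"))  # falls back to the connected /24
```

The first lookup follows the /32 toward router B even though router A also has the whole subnet connected, which is the whole trick.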

What are we trying to solve? Well, my customer got himself allocated a /22 for his ISP of just over 600 customers. He didn’t want to implement anything terribly complicated, but he also didn’t want to waste a bunch of IPv4 addresses on subnetting. So he chose this option, even after hearing me offer plenty of alternatives and recommend them over subnet overloading.

Okay, so why would you ever want to use subnet overloading?

First off, I am not going to recommend that anyone else follow in my footsteps. This is purely “I have done all this work to implement this weird system and, by golly, it seems to be working”. If you want to duplicate what I have done, do so at your network’s own risk. I do not think this is a good idea.

Pros

But reasons to go this route:

There is no need for a network overlay/underlay. This is purely a trick of routing/subnetting. You can use any equipment and almost any network will support this.
Minimal IP loss. You only “lose” 3 IPs per subnet you overload: one for the network ID, one for the gateway and one for the broadcast address. All the other IP addresses in the subnet are available to be used anywhere you have set up this overloaded subnet.
Without any other network configuration, you will provide -some- isolation between your various customers.
You can assign any number of IPv4 addresses to any POP, they do not need to be contiguous and can change as quickly as your DHCP leases.
Works with just about any kind of routed network redundancy out there. No matter how you keep your network “online”, this shouldn’t mess with it too much. (This is a big plus for my customer since he has multiple paths to many of his POPs.)
If you have migrated from a bridged network to a routed network you have probably done this, possibly by accident.
I am not sure, it’s simple?
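To put a rough number on the “minimal IP loss” point, here is a small Python sketch comparing one overloaded /22 against the same /22 carved into per-POP subnets. The 10.0.0.0/22 block is only a placeholder, and the /26-per-POP layout is an assumption I picked for the comparison, not something from the customer’s network.

```python
import ipaddress

LOST_PER_SUBNET = 3  # network ID, gateway, broadcast

def usable_overloaded(prefix):
    """Customer addresses when one subnet is reused everywhere (the 3 addresses are lost once)."""
    return ipaddress.ip_network(prefix).num_addresses - LOST_PER_SUBNET

def usable_subnetted(prefix, new_prefixlen):
    """Customer addresses when the block is split into equal subnets, each losing 3 addresses."""
    subnets = ipaddress.ip_network(prefix).subnets(new_prefix=new_prefixlen)
    return sum(s.num_addresses - LOST_PER_SUBNET for s in subnets)

block = "10.0.0.0/22"  # placeholder block, not the real allocation
print(usable_overloaded(block))      # 1021
print(usable_subnetted(block, 26))   # 976 (16 subnets of 61 usable addresses)
```

The raw count understates the difference a bit, since with fixed subnets any spare addresses are also stuck at whichever POP they were assigned to.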

Cons

Alright, but what are the issues? Well, there are lots.

Definitely not standard. If someone steps into your network and looks at this they are going to be confused.
You are breaking end-to-end connectivity. If you are handing out a public IPv4 address, it’s assumed that the address will be able to reach any other IPv4 address on the accessible internet. When that address seemingly at random cannot reach other IPv4 addresses in its own subnet, it will be a confusing frustration for your customers (unless you implement some kind of proxy ARP).
Depending on what routes and how you inject them into your network, you could pretty easily overload your routers with updates/routes. This is not a good system to expose to a network that is short on CPU resources in its routers.
You need to be extremely careful about how you allocate your IPv4 addresses out of the subnet. It may be rather difficult to identify duplicate IP allocations, and the issues they cause will not be easy for level 1-2 customer support staff to diagnose.
Every self-respecting network engineer will tell you this is a bad idea.
Some stuff will probably break in weird ways. Network monitoring especially will not appreciate subnet overloading.
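Since duplicate allocations are hard to spot after the fact, it may be worth checking the per-router pools before they go live. Here is a minimal Python sketch that does so, assuming you can list each router’s pool as a start and end address. The router names and ranges are invented for the example.

```python
import ipaddress
from itertools import combinations

def expand(start, end):
    """Return the set of addresses in an inclusive start-end range."""
    first = int(ipaddress.ip_address(start))
    last = int(ipaddress.ip_address(end))
    return {ipaddress.ip_address(n) for n in range(first, last + 1)}

def find_overlaps(pools):
    """Return {(router_a, router_b): sorted shared addresses} for every overlapping pair."""
    expanded = {router: expand(*rng) for router, rng in pools.items()}
    overlaps = {}
    for a, b in combinations(sorted(expanded), 2):
        shared = expanded[a] & expanded[b]
        if shared:
            overlaps[(a, b)] = sorted(shared)
    return overlaps

# Hypothetical pools; pop-c was carelessly carved out of both neighbours.
pools = {
    "pop-a": ("192.0.2.2", "192.0.2.34"),
    "pop-b": ("192.0.2.35", "192.0.2.80"),
    "pop-c": ("192.0.2.30", "192.0.2.40"),
}

for (a, b), shared in find_overlaps(pools).items():
    print(f"{a} and {b} share {len(shared)} addresses: {shared[0]} to {shared[-1]}")
```

This only catches overlapping DHCP pools; statically assigned addresses would need to be fed into the same check.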

The Actual implementation

With all that in mind, here’s how I went about making this work:

The network was entirely OSPF, which is not ideal; this would be better with BGP of some kind.

Here’s the order of operations I went through:

Checked for static routes we didn’t want advertised throughout the network.
Enabled redistribution of static routes in OSPF.
Added the /22 subnet to the interface facing the customers.
Created a DHCP server and configuration for the /22.
Created an IP pool for the DHCP server to hand out (be careful not to overlap these between routers).
Added a script to the DHCP server that adds a static route for each /32 IPv4 address that is handed out, plus a line in OSPF telling it to advertise that /32. The script also removes the /32 from being advertised when the lease expires or is otherwise removed.

Note: you could just statically advertise the /32 routes out of each router. I did the DHCP scripting at the request of my customer.
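The lease-driven part of the steps above boils down to “lease bound, announce the /32; lease gone, withdraw it”. Here is a toy Python model of just that logic. The HostRouteTable class is invented for illustration and does not talk to a router; it only shows the state the script is trying to maintain.

```python
class HostRouteTable:
    """Toy stand-in for the /32 routes one router redistributes."""

    def __init__(self):
        self.routes = {}  # "a.b.c.d/32" -> outgoing interface

    def on_lease_event(self, lease_bound, address, interface):
        """Mirror the lease script: bound=1 adds the /32, bound=0 withdraws it."""
        prefix = f"{address}/32"
        if lease_bound == 1:
            self.routes[prefix] = interface
        else:
            # Withdrawing a route that is not present is a no-op in this model.
            self.routes.pop(prefix, None)
        return sorted(self.routes)

table = HostRouteTable()
table.on_lease_event(1, "192.0.2.10", "vlan1000-public")
table.on_lease_event(1, "192.0.2.11", "vlan1000-public")
print(table.on_lease_event(0, "192.0.2.10", "vlan1000-public"))  # ['192.0.2.11/32']
```

Using a dictionary keyed by prefix keeps this model idempotent if the same lease is bound twice; whether a real router behaves that way is something to test rather than assume.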

Here’s some example code of what I was doing in MikroTik v7:

# OSPF instance that redistributes static routes (this carries the /32s)
/routing/ospf/instance
add disabled=no name=default-v2 redistribute=static router-id=172.11.11.3
# The overloaded subnet goes on the customer-facing interface
/ip/address/add address=192.0.2.1/24 interface=vlan1000-public
# DHCP server whose lease script adds and removes the /32 for each lease
/ip/dhcp-server
add add-arp=yes address-pool=pool-v1000-ipv4 interface=vlan1000-public lease-script=":if (\$leaseBound = 1) do={ /routing/ospf/interface-template/add  area=backbone-v2 auth-id=1 auth-key=\"\" cost=10 disabled=no interfaces=vlan1000-public networks=\$leaseActIP passive priority=64; }\r\
    \n:if (\$leaseBound = 1) do={ /ip/route/add disabled=no dst-address=\$leaseActIP gateway=vlan1000-public routing-table=main scope=10 suppress-hw-offload=no; }\r\
    \n\r\
    \n:if (\$leaseBound = 0) do={ /routing/ospf/interface-template/remove numbers=[find networks=\$leaseActIP]; }\r\
    \n:if (\$leaseBound = 0) do={ /ip/route/remove numbers=[find dst-address=\$leaseActIP]; }" lease-time=1h name=dhcp-v1000-ipv4

Note: the above code is from an export. Here’s the script in a more readable format:

:if ($leaseBound = 1) do={ /routing/ospf/interface-template/add  area=backbone-v2 auth-id=1 auth-key="" cost=10 disabled=no interfaces=vlan1000-public networks=$leaseActIP passive priority=64; }
:if ($leaseBound = 1) do={ /ip/route/add disabled=no dst-address=$leaseActIP gateway=vlan1000-public routing-table=main scope=10 suppress-hw-offload=no; }

:if ($leaseBound = 0) do={ /routing/ospf/interface-template/remove numbers=[find networks=$leaseActIP]; }
:if ($leaseBound = 0) do={ /ip/route/remove numbers=[find dst-address=$leaseActIP]; }

And the pool the DHCP server hands out from:

/ip/pool/add name=pool-v1000-ipv4 ranges=192.0.2.2-192.0.2.34

And that’s pretty much it. Now, if you want your customers to be able to reach each other when they are behind separate routers in your network, you will want to enable proxy ARP on each router. This might look different depending on your setup. The only way I found in my implementation was to add static ARP entries with “published” enabled on each router for every IP in the subnet. That looks like this:

/ip/arp/add disabled=no interface=vlan1000-public published=yes address=192.0.2.2
/ip/arp/add disabled=no interface=vlan1000-public published=yes address=192.0.2.3

And so on. Tedious, but it seems to work. Side note: you can add every address as a static entry on every router; you don’t need to keep track of which ones are on the router and which are not. At least it works in MikroTik v7. YMMV.
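Typing those entries by hand gets old fast, so here is a minimal Python sketch that prints one line per host address, in the same shape as the ARP commands above. The published_arp_commands helper and its skip parameter are my own invention, and leaving out the gateway address is an assumption about what you would want. Review the output before pasting it into a router.

```python
import ipaddress

def published_arp_commands(subnet, interface, skip=()):
    """Build one '/ip/arp/add' line per host address in subnet, except those in skip."""
    skipped = {ipaddress.ip_address(a) for a in skip}
    template = "/ip/arp/add disabled=no interface={iface} published=yes address={addr}"
    return [
        template.format(iface=interface, addr=host)
        for host in ipaddress.ip_network(subnet).hosts()
        if host not in skipped
    ]

# Interface name and gateway address are placeholders taken from the example config.
commands = published_arp_commands("192.0.2.0/24", "vlan1000-public", skip=["192.0.2.1"])
print(len(commands))  # 253
print(commands[0])
```

For a real /22 that is just over a thousand lines per router, which is another reason to generate them rather than type them.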

In conclusion: as I have said, this is not an endorsement. I made this work. I might get around to updating this article sometime to let you know if we found something really bad that required us to pull this out of the network. I might not. Implement at your own peril. I would highly recommend using BGP as your distribution protocol instead of OSPF. Unless you have some beefy CPUs on your POP routers, this could bog them down quite a bit. You could probably save yourself some pain by statically advertising the /32s out of each of your routers instead of using the DHCP trick I pulled. That way your network isn’t hammered every time a bunch of CPE devices go down because of a power outage or whatever.

If you have done something similar in your network, or would like to give me some feedback, feel free to reach out. I would love to chat about this topic.