Integrating Common Sense and Technology to Meet your Goals

Port Forwarding to Uncooperative Hosts

NAT is not a good thing, but it is frequently a necessity. Here are some quick tips for diagnosing port forwarding problems, plus some duct tape solutions, on Mikrotik routers.

This last week I ran into a somewhat interesting problem and I figured it might be helpful for people for their own diagnostics. The company has a half dozen monitoring devices that they want to monitor from remote locations. These devices all require you to “pull” the data (which means that some app or service needs to log into the device and get the data) so each device needs to have a port forwarded to it from the router.

Not a big deal in Mikrotik, just add a rule like this:

ip firewall nat add chain=dstnat dst-address=192.0.2.89 protocol=tcp dst-port=8888 action=dst-nat to-addresses=192.168.1.20 to-ports=22

Make sure you have the correct protocol, external IP and port, as well as the correct internal IP and port, and you are done, at least most of the time. (It is probably a good idea to statically assign the IP of whatever device you are port forwarding to.) If you want to be a bit more secure about this, you can add “src-address=” or, if you have multiple addresses, “src-address-list=” to limit which source addresses will be port forwarded. (Not great security, but better than opening up the port to the whole wide world.)

Sometimes this doesn’t work. I was called this week and the company was saying that many of the port forwards they had me doing on their hosted Mikrotik router were not working.

My first question is always if the network service can be reached from a device that is on the same network as the device hosting the service. If that’s not the case then it’s normally a firewall or port issue on the device and I leave that to whoever is setting up the device to figure out. If local devices can access the service, then it’s time to knuckle down and figure out where the problem is.

Since I run Windows and I am cheap, I like to use Packet Sender from a management IP (which I normally allow everything to/from; not great for security, but worth it in time saved on diagnostics) to check whether the device is accessible. The result indicates where the issue could be: either in the firewall or, probably, some sort of network issue. I am sure you could use the Packet Generator in Mikrotik to help with diagnostics, but I don’t know it well enough to do so.
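If Packet Sender is not handy, the same reachability check can be sketched in a few lines of Python. This is only a minimal sketch of a TCP connect test; the function name `tcp_port_open` is my own, and the addresses in the usage comment are the example ones from the rule above.

```python
import socket

def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

# Example usage (run from the management IP):
#   tcp_port_open("192.0.2.89", 8888)   # through the port forward
#   tcp_port_open("192.168.1.20", 22)   # directly, from the LAN side
```

Testing both the forwarded port and the device's real port from the LAN side narrows down whether the problem is in the router or in the device.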

Either result still calls for a cursory glance at the firewall rules, just to make sure the standard rules are in place and working and that nothing there could be blocking the traffic. Input and Output rules do not matter here, because in the Mikrotik packet flow DST-NAT happens before the routing decision, so the translated packets are forwarded and never hit the Input or Output chains. Forward rules, however, can get you into trouble.

Next up is to make sure your port forward rule is set up and working as expected. Checking whether the NAT rule is accumulating packets and bytes can tell you if you goofed up somewhere in your configuration. You can also use Packet Sender, which should increment your counters each time you send a packet. Make sure the right counters are incrementing; you could have another DST-NAT rule picking up your traffic.

ip firewall nat print stats

If you are seeing the correct rule incrementing at the correct times, then it’s likely a network issue of one type or another, especially if you are like me and don’t do a lot of blocking of traffic on your network. Though a good idea might be to just torch the interfaces and make sure you see the traffic coming and going correctly.

Sometimes you might want to do this earlier, especially if you don’t trust your customers to do network configuration correctly, but at this point you really ought to try pinging the device and see if it’s at least at the correct IP.

Sometimes the device won’t respond to pings because of an overactive firewall policy, at which point you should check your router’s ARP table to make sure the device is showing up there. (Note, make sure to have an active ping going to the device so you know the ARP table entry will be populated even if the device doesn’t respond to ICMP queries.)

ip arp print

Once you see the ARP entry, it’s not a bad idea to check to make sure what your router is seeing for the MAC address is the correct MAC address of the device. Plenty of bad network configs can happen with Static IP addressing.

At this point we are back to some sort of problem in the device hosting the service. If you can ping the device from the router’s LAN IP then the next step is to check to see if the device itself can send packets back out to the wider network, if it can’t access the wider internet then it’s diagnostics on that. Unfortunately, some devices are pretty stupid and don’t give you access to a utility for testing this. Fortunately for us, we have a Swiss Army knife for a router.

ping src-address=192.0.2.89 address=192.168.1.20

Make sure to use the WAN IP of your router and make sure you allow ICMP packets to your WAN IP on the router. This can tell you whether or not the device’s network is set up correctly. If this fails, but you just want to get things working, there is one more trick you can use:

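#After DST-NAT the packet is already addressed to the device's real port (22 here), so match on that
ip firewall nat add chain=srcnat action=src-nat dst-address=192.168.1.20 protocol=tcp dst-port=22 src-address=!192.168.1.0/24 to-addresses=192.168.1.1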

Make sure to use the LAN IP of your router in the “to-addresses” field. Again the packet flow diagram will help us here: the DST-NAT chain happens way before the SRC-NAT chain. So what happens is the router DST-NATs the packets, the packets go through the forward firewall chain, and then right before the router sends a packet out it SRC-NATs it. Input and Output rules for the router can be completely skipped, most of the time. If you are still having issues you can put in specific input and output rules to allow this traffic through, but that’s for another time.
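To make the order of operations concrete, here is a rough Python simulation of the two rewrites. It is a sketch, not how RouterOS is implemented: the `Packet` class and function names are made up for illustration, and the addresses are the example ones used in this post.

```python
from dataclasses import dataclass, replace
from ipaddress import ip_address, ip_network

@dataclass(frozen=True)
class Packet:
    src: str
    sport: int
    dst: str
    dport: int

WAN_IP, LAN_IP = "192.0.2.89", "192.168.1.1"
DEVICE_IP, LAN_NET = "192.168.1.20", ip_network("192.168.1.0/24")

def dst_nat(p: Packet) -> Packet:
    """Port forward: WAN_IP:8888 becomes DEVICE_IP:22 (happens first)."""
    if p.dst == WAN_IP and p.dport == 8888:
        return replace(p, dst=DEVICE_IP, dport=22)
    return p

def src_nat(p: Packet) -> Packet:
    """Rewrite the source to the router's LAN IP for non-local clients (happens last)."""
    if p.dst == DEVICE_IP and ip_address(p.src) not in LAN_NET:
        return replace(p, src=LAN_IP)
    return p

def forward(p: Packet) -> Packet:
    # DST-NAT, then (forward chain, omitted here), then SRC-NAT.
    return src_nat(dst_nat(p))
```

A remote client at 198.51.100.7 hitting 192.0.2.89:8888 ends up as a packet from 192.168.1.1 to 192.168.1.20:22, so the device only ever has to answer an address on its own subnet.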

Now, this could cause some issues with the device in question depending on how smart it is about connection tracking, some devices won’t appreciate it that there are multiple connections coming from a single IP, in which case issues can arise. At that point you would have to either get creative with multiple IP addresses you SRC-NAT to, or fix whatever the networking issue is with the device limiting it to only being able to send traffic to the local network.

Another option, which would be more secure, would be to set up a VPN that makes all the traffic appear to be on the local network. This would probably be preferable in a lot of cases, and then you can upsell to a more expensive router to handle the increased CPU load.

Going from Static Routing to Dynamic routing from Edge to Towers. (Part 2)

In my last post I went over the issues I was facing and the first steps I took to implement BGP to my upstreams. In this post I will look more at the internal issues I was seeing and the decisions I made to handle them.

Game plan: Network Isolation, VLANs. Routing Protocol, eBGP:

First off, there are many different ways to skin this cat of dynamic routing. Many people would point to OSPF, MPLS or some other system that gives you a similar end product with better results because of one thing or another. I am not recommending my solution to other people; I am merely documenting my thought processes and the issues involved with switching over your network. The way I am running BGP isn’t ideal for many networks. OSPF was my initial choice, but because of restrictions in my abilities, my networking hardware and the time frame I was working with, I decided to go this route. So, have mercy in your considerations: I don’t even have a degree in this stuff and I am working with what I have.

First off, a year ago I had experimented with implementing OSPF over the network. It took weeks of learning and testing, and I got some really good results, but when it came to actually implementing it into the network for a single tower back to my core network, I couldn’t get it to work. So the operation was scrapped for another day.

So, I did seriously consider setting up OSPF instead of BGP, but:

  • I was gun-shy of trying to set it up again
  • I knew I had most of the knowledge already to do the project with BGP
  • I would lessen the amount of training I would have to do when showing someone else how the network runs
  • I didn’t see the benefits of OSPF being as helpful to my network. I mostly want the dynamic routing to pick the path I want and not use something else unless it must.
  • Finally, I had just recently been promoted and was given the task to make a decision quickly and implement it, so I did.

I considered looking at other protocols, but I didn’t feel I had time to start from scratch and I certainly didn’t have time to both learn a protocol and learn how Mikrotik “implemented” it.

So I went back into my lab, and emerged with a game plan, but I also realized something. I couldn’t do this in the middle of the night. There were at least 4 years of poorly documented firewall/NAT/route rules in each of a dozen routers, which I had been fixing, but never had the time to take care of completely. There were routes that went places even I didn’t know, un-commented and undocumented firewall rules, NAT rules that were for IPs we never used, and some of the fiber was acting up. All of this made for a situation where I ended up having to do open heart surgery on the network. Never a good idea, but lots of fun.

Here’s what I was going for: a unique layer 2 between each router that had a direct connection to another router. Isolated layer 2s between each tower and its sub tower(s). Isolation anywhere and everywhere I could justify it. eBGP would be the system that enabled each router to know where/how to forward traffic. If I could just run a cable between routers I pretty much did that, but most of the time I used VLANs to pass traffic and VLAN filtering on my switches to prevent unexpected traffic from moving around.

“eBGP?” you say. Yes, I wanted to very tightly control my network and I didn’t want to have to set up a BGP peering between every single one of my routers or set up route reflectors in my network. I was going to have each tower be treated like just another network entity on the internet: it talks to its peers and shares what routes are connected/peered to it, although I didn’t want to push the entire internet routing table to each of my towers. I figured if eBGP works well enough for the internet, it would be fine for my uses. A side benefit to this was that as I added towers to the BGP network, I only had to change settings on routers that were going to be directly connected to each other. Note, to actually do eBGP, I used private AS Numbers and had to make sure I didn’t leak them outside of my network by using Aggregation on my edge/core routers.

Actual implementation, the Edge:

First off I had to connect my two core routers together through iBGP, (Still managed to find a spot for it) have them share all their routes to each other and let them route things out whichever path was more efficient. This took forever, mostly because I didn’t understand exactly how Mikrotik implemented BGP. A side issue was that we were doing our NAT translation on our core/edge routers. While this was fine for our statically routed network, this needed tuning to work with the new dynamic routing system. (The edge routers were NATing to different IPs, which would cause issues for our customers that have resources in different network ranges thereby introducing the possibility that a single customer might present as two different public IP addresses to a service if they routed out both of our upstreams)

The best option, and probably where we will end up in the future, would be to move where we are doing our NAT from the edge to a location slightly inside our network. Instead, what I did was to change our firewall rules from:

chain=srcnat action=src-nat to-addresses=X.X.X.17 src-address=172.16.4.0/24 out-interface=sfp1

To:

chain=srcnat action=src-nat to-addresses=X.X.X.17 src-address=172.16.4.0/24 dst-address-list=!Internal-Networks

I used an address list called “Internal-Networks” that listed every single subnet we used internally, a handy address list that I had previously created in my purge of my firewall rules. This way, the packet would get NATed and then sent out whichever upstream had the preferred route. I also did not need to buy/add more equipment to my network, always a plus in the boss’s book. Now, this isn’t as efficient as the previous rule: every single connection has to be compared to the address list, and that is much slower than just checking which interface the connection is heading out of. But I have a lot of processing power not being utilized on my core routers, so I made the sacrifice for now.
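The matching logic of the new rule can be sketched in Python as follows. The contents of `INTERNAL_NETWORKS` are hypothetical stand-ins (my real list is not shown here), and `should_src_nat` is a name invented for the example.

```python
from ipaddress import ip_address, ip_network

# Hypothetical stand-in for the "Internal-Networks" address list.
INTERNAL_NETWORKS = [
    ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
NAT_SOURCE = ip_network("172.16.4.0/24")

def should_src_nat(src: str, dst: str) -> bool:
    """Mirror src-address=172.16.4.0/24 dst-address-list=!Internal-Networks."""
    s, d = ip_address(src), ip_address(dst)
    # Every entry in the list may have to be checked, which is the extra
    # cost compared with a single out-interface comparison.
    return s in NAT_SOURCE and not any(d in net for net in INTERNAL_NETWORKS)
```

Because the decision no longer depends on the outgoing interface, the same rule works no matter which upstream the routing table picks.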

With that implemented I connected my two edge/core routers together and tried to have them share routes so their respective BGP instances could decide which path would be more efficient for traffic to head out over. After a week of fighting with my config (not straight, I had other things to deal with as well) I figured out I had two different instances of BGP running on my routers, one for between my routers and one for my upstreams. I needed to have a single instance so all the public routes would be compared by BGP before being inserted into the routing table. Thanks to Greg Sowell who was helping me at the time I figured out my mistake.

When I finally had the iBGP system up and running I saw the real benefits, particularly in that I was utilizing my cheaper upstream more because they had better peering agreements than my main upstream. Not only was I saving myself money, but various services saw significant performance improvements.

This is because not only was my customer traffic following a, likely, more optimized route to its destinations, but it could now come back over a, likely, more optimized route. I say “likely” because BGP doesn’t promise the lowest latency or largest bandwidth path to your destination, but instead the path that traverses the fewest networks… Most of the time. While this is good enough “most of the time” I have already done some route optimization so my traffic gets a little guidance in how it gets out to the internet. (More on this later)
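As a toy illustration of “fewest networks, not lowest latency”, here is a heavily simplified sketch in Python. Real BGP best-path selection looks at several attributes before and after AS-path length (local preference, MED and so on), all of which are ignored here; the AS numbers, names and latencies are made up.

```python
def pick_by_as_path(routes):
    """Return the route with the fewest AS hops (on a tie, the first one wins)."""
    return min(routes, key=lambda r: len(r["as_path"]))

# Two hypothetical paths to the same destination network.
routes = [
    {"via": "upstream-A", "as_path": [64500, 64501, 64502], "latency_ms": 20},
    {"via": "upstream-B", "as_path": [64510, 64502], "latency_ms": 45},
]

best = pick_by_as_path(routes)
print(best["via"])
```

Here the shorter AS path through upstream-B wins even though its latency is worse, which is why a little manual route optimization can still be worth doing.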

Even with those improvements, I was still stuck in my layer 2 statically routed network. The issue was compounded because we were about to introduce a loop into our network. I will talk about how I handled that next time.

Next article I will deal with connecting my towers to my core network and how I am controlling my network traffic.

The Brothers WISP Ep. 109

I had the privilege of being on The Brothers WISP again.

http://thebrotherswisp.com/index.php/the-brothers-wisp-109-hidden-master-dns-mtk-hap-ac3-5-9ghz-fcc-sta/

This week Greg, Mike and Tommy play name that tune and Hollywood squares…match game next time.  

This week we talk about: 

Touchless access control that takes your temp. 

Unimus 2.0.0-Beta2 released 

Hidden Master DNS 

Greg’s MUM 2020 presentation – Ansible Mikrotik Mass Configuration Fast 

Nathan P gave me a muuuch better reboot method on MTK using the execute script command. 

MTK hap AC3 – built in LTE, bigger case

Dave was asking how to graph stats on pure IPSec traffic(Miller said create a simple queue and graph that). 

What subnet sizes to use when assigning public IPs to infrastructure 

My first week with a Mac…only task was reloading it, and that took forever LOLOL 

5.9 GHz FCC STAs (Special Temporary Authority)

IPv6 and Mimosa/AirSpan

After many hours of hair pulling frustration, I figured out why I couldn’t keep my IPv6 working to my house. Mimosa (owned by AirSpan, who also has the iBridge network equipment, which I presume is running the same software) does not support Multicast packets on its PTMP hardware as of firmware version 2.5.2 (the A5/A5c is where it would need to be implemented, but there may be stuff that would need updating on the C5# line).

So, this led to really weird and frustrating issues with IPv6 for me. IPv6 requires Multicast for Neighbor Discovery, which is pretty important for IPv6 to work. Luckily I am in The Brothers WISP Patreon, and when I asked for help Mike Hammett and Nick Buraglio came to my rescue and diagnosed the problem, which I then confirmed with Mimosa. You cannot even use static routing to get IPv6 to work; Multicast and Neighbor Discovery are built into the protocol.
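To show why Neighbor Discovery cannot dodge multicast: a Neighbor Solicitation is sent to the target's solicited-node multicast address, which is derived from the low 24 bits of the unicast address. A small Python sketch of that derivation (the function name is my own):

```python
from ipaddress import IPv6Address

def solicited_node_multicast(addr: str) -> IPv6Address:
    """ff02::1:ff00:0/104 plus the low 24 bits of the unicast address."""
    low24 = int(IPv6Address(addr)) & 0xFFFFFF
    base = int(IPv6Address("ff02::1:ff00:0"))
    return IPv6Address(base | low24)

print(solicited_node_multicast("2001:db8:1234::1"))
```

If the radio link drops packets sent to that ff02:: group, the next hop never gets resolved, no matter how the route itself was configured.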

That did not deter me from figuring out a workaround, I had been playing around with BGP and discovered that I could forward an IPv6 network route over an IPv4 BGP connection. (Thank God for protocols that are designed to be other protocol agnostic.)

I just so happened to be running a Mikrotik hEX S router at my house, and my tower routers are all CCR1009s from Mikrotik. Because Mikrotik isn’t a PITA licensing hog for every feature in their system (you know who I am talking about), I could set up a BGP peering session between my home and tower router, forward the IPv6 route across the BGP session and kablam! IPv6 could pass traffic without Neighbor Discovery actually working.

Here is an example config I used, though you may need to adjust your firewall on each router to meet your needs.

Home router config:
#Set the BGP instance so that it has a unique AS Number inside your network
#Note, you will want to pick out your own AS Number from the private pools:
#64512-65534 or 4200000000-4294967294, whatever floats your boat.
#(The 65536 and 65537 used here are documentation placeholders, not private ASNs.)
routing bgp instance add disabled=no name=IPv6BGPPatch router-id=172.16.45.2 as=65536

#Tell BGP what networks you want to send. Make sure this network at least has a
#black hole route in your router, or is configured on an interface on your router.
routing bgp network add network=2001:db8:1234::/56 synchronize=yes

#Create some filters so we don't get ourselves into trouble
#Don't forget to change up the filter so it matches your IPv6 address and
#prefix length
/routing filter
add action=discard address-family=ip chain=out prefix=0.0.0.0/0 prefix-length=0-32
add action=accept address-family=ipv6 chain=out prefix=2001:db8:1234::/56 prefix-length=56
add action=discard address-family=ipv6 chain=out prefix=::/0 prefix-length=0-128
add action=discard address-family=ipv6 chain=in prefix=::/0 prefix-length=0-128
add action=discard address-family=ip chain=in prefix=0.0.0.0/0 prefix-length=0-32


#Where the real work gets done is here, make sure you use the correct
#information for your peer
routing bgp peer add address-families=ipv6 disabled=no name=towerRouter remote-address=172.16.45.1 remote-as=65537 tcp-md5-key=ChangeME!!! instance=IPv6BGPPatch out-filter=out in-filter=in

The tower’s config is pretty similar, you do have to do the rest of the IPv6 setup ahead of time and I would recommend that you verify that it is working there first as well.

/routing bgp instance add disabled=no name=IPv6BGPPatch router-id=172.16.45.1 as=65537

/routing filter
add action=discard address-family=ip chain=IPv6BandaidINCustomer1 prefix=0.0.0.0/0 prefix-length=0-32
add action=accept address-family=ipv6 chain=IPv6BandaidINCustomer1 prefix=2001:db8:1234::/56 prefix-length=56
add action=discard address-family=ipv6 chain=IPv6BandaidINCustomer1 prefix=::/0 prefix-length=0-128
add action=discard address-family=ipv6 chain=out prefix=::/0 prefix-length=1-128
add action=discard address-family=ip chain=out prefix=0.0.0.0/0 prefix-length=0-32

#in-filter has to name the incoming filter chain created above
/routing bgp peer add address-families=ipv6 disabled=no name=customer1 remote-address=172.16.45.2 remote-as=65536 tcp-md5-key=ChangeME!!! instance=IPv6BGPPatch out-filter=out in-filter=IPv6BandaidINCustomer1 default-originate=if-installed

Give it a couple seconds and as long as you entered the commands correctly you should have a working IPv6 route without having to rely on Multicast. Note, this is only meant to be a guide. I have been told by multiple people to not rely on this system and that stuff might break in really weird ways. Use at your own discretion. Let me know if you run into issues or if you succeed.
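When you pick your own AS Numbers for a setup like this, it is easy to check them against the two private pools mentioned in the config comments. A minimal Python sketch (the helper name is made up):

```python
# Private-use AS number pools: 16-bit and 32-bit.
PRIVATE_ASN_RANGES = ((64512, 65534), (4200000000, 4294967294))

def is_private_asn(asn: int) -> bool:
    """Return True if asn falls inside one of the private-use pools."""
    return any(lo <= asn <= hi for lo, hi in PRIVATE_ASN_RANGES)

for asn in (64512, 65534, 65536, 4200000000):
    print(asn, is_private_asn(asn))
```

Note that 65536 and 65537, as written in the example config, fall outside both pools, so substitute your own numbers before using it.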

Going from Static Routing to Dynamic routing from Edge to Towers. (Part 1)

Setting up BGP on the edge.

So, this is going to be my thoughts, preparations and experiences going through a network that was statically routed from customer to upstream. At the end we now have BGP running on our two core routers, isolated layer 2 links between routers and eBGP managing routes from/to my core and towers. I did most of this work during normal operating hours as well, not recommended, but I did so without significant interruptions to my customers and I was able to keep a sane sleep schedule.

Disclaimer, I am not recommending that anyone actually follows what I did at any point in these articles. There are many good options and arguably some of my solutions are not one of them, but I feel like the experience was good for learning and demonstrating cool things about networking. I will try to mention other good ideas as I go along and I will try to point out how my decisions might not be best for everyone. Secondly, I won’t be going into exact details of my config, unless it illustrates what I

Situation and Goals

Now that we have the disclaimers out of the way, I want to give you a brief overview of how the network was set up and why I felt, and my boss knew, we needed to change to a different way of telling our routers how to route. This network has been built over 12 years, initially using hacked Linksys WiFi Routers to be both Access Points and routers. There have been several intervening steps, but 8 years ago we converted almost entirely to Mikrotik routers at both our towers and in our core network, and that is what I have to work with today. Each tower is running a CCR-1009 and in our core we have two CCR-1036 routers, one for each of our upstreams. You can see a simplified diagram that shows mostly layer 2 links across the network. What is missing are lots of customer links that are inside the massive layer 2 and various bonded links that are not really important to this discussion.

This network also came with massive firewall lists; a year ago we had ~500 firewall rules in each of the core routers. There was very little documentation, and some of the rules were made without knowledge of how the Mikrotik firewall actually works. I have been paring them down to a more manageable number, but there is a massive amount of effort and stress in doing so. Sometime I will write up some notes on working on that, but this article is going to be long enough as is. Pretty much, no change could be implemented without a lot of testing to figure out which rules were breaking things.

My goals were pretty simple, increase network redundancy, simplify network layout (more PTP layer 2 connections, fewer/none large layer 2/3 networks connecting my core to towers), increase our benefit from having multiple connections out to the internet, allow us to utilize more different types of connectivity options and enable us to have a more flexible network architecture for growth.

With those requirements, my only real option for my edge routers was to get myself a public AS number and set up eBGP to each of my upstreams and iBGP between my core routers.

Actually implementing BGP:

So, my first step was to get BGP running on our “core routers”. To do that I had to do a lot of learning; I knew nothing besides generalities. Luckily for me, the resources of the internet are plentiful. I want to send a huge thanks out to Greg Sowell and his BGP lab tool. Being able to get a full internet feed was invaluable for understanding the breadth of effects on my network. It also let me test my filtering rules and my overall BGP setup and find lots of flaws in my assumptions. Unfortunately, my test lab did not have the same equipment as my production network (it actually couldn’t take a single full BGP feed), and the sheer number of managed switches and different types of links made duplicating the network, even generally, impossible for my budget. So testing in a lab only had so much benefit. Nonetheless, it let me get the basics figured out and made the implementation into the real network go much more smoothly.

Next up was to get my company an AS number from ARIN, (Not technically necessary, if I wanted to I could have worked out a private AS for each of them and gotten similar results, but that won’t work for all situations) then contact my upstreams and let them know what I was doing. They had us fill out some paperwork, pretty straightforward stuff and probably the easiest portion of this whole process. My upstreams each sent me the information that I would need for setting up the BGP session and requested their needed info from me. (port for BGP, and other basic stuff.)

You don’t really just “turn on” BGP; you have to work with your upstreams and schedule times for both of you to be on the phone as you turn up the connection. Nobody likes to do work during service windows (aka, middle of the night), so we didn’t. Honestly, if you do your stuff right you are not going to break anything. It’s all in your prep work; the rest is handled by your routers. The biggest problem that can happen is with your filters, or I guess your router could tip over while loading the full BGP table, but you should be able to test/verify that your router will do its job ahead of time.

The transitions went pretty smoothly on my end, all things considered. I prepared ahead of time by testing out my config and making sure the general details were set correctly. Even so, my upstream and I confirmed the settings we were using to each other just before we turned everything on. Note, we decided to have 2 separate BGP sessions running, one for IPv4 and one for IPv6, while it wasn’t necessary, it felt like a good idea in case one protocol acted up.

Things I learned:

  1. Routers route to the most specific (smallest) active route that contains the destination IP address. That means you can keep your default gateway route up on your routers while BGP loads its routes: the router will keep pushing traffic along the default route until BGP loads the routes into the table, at which point it will send traffic to its respective destination. This can be handy if your routers take a while to load all their BGP routes. Let your upstream know if you are keeping the default route.
  2. BGP probably doesn’t work the way you think it does, especially in the case of your particular router manufacturer. Keep a close eye on the documentation and how routes are loaded.
  3. Filter, then test your filters to make sure they are working. I won’t repeat what is already probably said very often about what and why you should filter.
  4. It’s totally cool to bring up your BGP session with your incoming/outgoing filters set to block your own subnets and everything else at first. You get to verify that the BGP session comes up, then move on to allowing your subnets out and letting subnets in. Then just verify with your upstream that they see the routes coming from you. Maybe even consider initially blocking every prefix smaller than a /22 or /23 as well. Then, when you have verified everything is working, allow more and more of the internet in. (If you do block everything smaller than a /23 or /22, make sure you have a default route set up; networks that are only announced as /24s cannot be reached otherwise.)
  5. If you have not dealt with ARIN (or whoever your regional registry is) before, their rules may look hard to understand at first and there is a lot of old/outdated/incorrect information out there. If you are unsure it helps to chat with someone who deals with them regularly or you can just ask them yourself.
  6. You are not the only one who can screw up a config, be very clear with your upstream with what you are doing. Mine forgot a step and broke their network a bit when I started announcing subnets they had been announcing for me. There isn’t a lot you can do and don’t be a jerk, but it might help.
  7. Watch your bandwidth going through your routers when you make any significant change; if there is a substantial drop then normally something has gone wrong. My upstream saw a drop and figured it was a blip, and that blip turned into a multi hour headache.
  8. There are a bunch of online “looking glasses” which are basically companies that allow you to check out their routing table and see what they see. It’s pretty fun to see your subnets going out across the internet. Also useful to make sure your upstream is actually announcing all of your subnets correctly.
  9. If your router supports BFD, you might consider using it for your BGP session. Ask your upstream provider if they support BFD and will turn it on for your connection. (In my experience, unless you are using an unreliable link you will see benefits in using BFD.)
  10. It’s a good idea to separate different protocols into different sessions, I separated IPv4 and IPv6 into their own sessions. This “could” allow one of the protocols to continue working if there was an issue with the other session/protocol. Not a lot of backup, but it does give you a little bit of protection and some flexibility for future configs. (Thanks Nick Buraglio for the tip)
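Point 1 in the list above (the most specific route wins) is easy to see in a small Python sketch. The route table and gateway labels are invented for the example, and a real router's table lookup is of course far more optimized than this linear scan.

```python
from ipaddress import ip_address, ip_network

def best_route(table, dst):
    """Return (prefix, gateway) of the most specific route containing dst."""
    addr = ip_address(dst)
    matches = [(ip_network(p), gw) for p, gw in table if addr in ip_network(p)]
    if not matches:
        return None
    # Longest prefix length means the smallest (most specific) network.
    net, gw = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), gw

# Hypothetical table: a static default plus one route learned from BGP.
table = [
    ("0.0.0.0/0", "static-default"),
    ("203.0.113.0/24", "bgp-upstream-B"),
]

print(best_route(table, "203.0.113.9"))
print(best_route(table, "198.51.100.1"))
```

Destinations BGP has not loaded yet simply fall through to the default route, which is why keeping it in place while the table fills does no harm.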

Once I got my two core routers running eBGP to my upstreams I didn’t see many big changes in my network performance, of course. Effectively, I had not changed anything about the flow of my network besides how traffic could get back to me. There were some changes: slightly lower latency to a couple of services and a little bit better utilization of one of my upstream’s bandwidth to me.

My next step was to connect my two edge routers together and have them share routes between each other so they could each send traffic out the better route (aka, set up iBGP). I also needed to decide how I would connect my towers back to my core network so they could stay running in case one of my core routers went down.