The standard approach is be liberal in what you accept and be specific in what you emit.
You could
1) Filter the broken message
2) Drop the broken message
3) Ignore the broken attributes but pass them on
4) Break with the broken attributes
To me, only 4 (Arista) is the really unacceptable behaviour. 3 (Juniper) isn't desirable but it's not a devastating behaviour.
EDIT: Actually rereading it, Arista did 2 rather than 4. I think it just closed the connection as being invalid rather than completely crash. That's arguably acceptable, but not great for the users.
There is already RFC 7606 (Revised Error Handling for BGP UPDATE Messages), which specifies in detail how broken BGP messages should be handled.
The most common approach is 'treat-as-withdraw', i.e. handle the update (announcement of a route) as if it was a withdraw (removal of previously announced route). You should not just drop the broken message as that whould lead to keeping old, no longer valid state.
> The standard approach is be liberal in what you accept and be specific in what you emit.
What you're paraphrasing here is the so-called "robustness principle", also known as "Poestel's law". It is an idea from the ancient history of the 1980s and 09s Internet. Today, it's widely understood that it is a misguided idea that has led to protocol ossification and countless security issues.
Postel's Law certainly has led to a lot of problems, but is it really responsible for protocol ossification? Isn't the problem the opposite, e.g. that middleboxes are too strict in what they accept (say only the HTTP application protocol or only the TCP and UDP transport protocols)?
Postel's law is absolutely great if you want to make new things and get them going in a hurry, and I think it was one of the major reasons the TCP/IP stack beat the ISO model. But as you say, it's a disaster if you want to build large robust systems for the long term.
The problem is that folks took advantage of the behavior of BGP where it would forward unknown attributes that the local device didn't understand, as a means to do all sorts of things throughout the network. People now rely on that behavior.
Now, we're experiencing the downside of this "feature"
BGP has classes attributes that it forwards. While it is true that it forwards route attributes it doesn't know about, this was an attribute that it DID know about and knows it shouldn't forward.
In fact it's a bit strange just how lenient Juniper's software was here. If a session is configured as IBGP on one end and EBGP on the other end, it should never get past the initial message. Juniper not only let it get past the connection establishment but forwarded obviously wrong routes.
At a glance this “feature” seems like an incredibly bad idea, as it allows possibly unknown information to propagate blindly through systems that do not understand the impact of what they are forwarding. However this feature has also allowed widespread deployment of things like Large Communities to happen faster, and has arguably made deployment of new BGP features possible at all.
Being that prescriptive is fundamentally unworkable in practice. Propagating unknown attributes is fundamentally what made the deployment of 32-bit AS numbers possible (originally RFC 4893; unaware routers pass the `AS4_PATH` attribute without needing to comprehend it), large communities (RFC 8092), the Only To Customer attribute (RFC 9234) and others.
A BGP Update message is mostly just a container of Type-Length-Value attributes. As long as the TLV structure is intact, you should be able to just pass on those TLVs without problems to any peers that the route is destined for.
The problem fundamentally is three things:
1. The original BGP RFC suggests tearing down the connection upon receiving an erroneous message. This is a terrible idea, especially for transitive attributes: you'll just reconnect and your peer will resend you the same message, flapping over and over, and the attribute is likely to not even be your peer's fault. The modern recommendation is Treat As Withdraw, i.e. remove any matching routes from the same peer from your routing table.
2. A lack of fuzz testing and similar by BGP implementers (Arista in this case)
3. Even for vendors which have done such testing, a number of have decided (IMO stupidly) to require you to turn on these robustness features explicitly.
That's in conflict with the philosophy behind the internet. If you'd just drop anything because some part of it you don't understand, you lose a lot of flexibility. You have to keep in mind that some parts of the internet are running on 20 year old hardware, but some other parts might work so much better if some protocol is modified a little. Just like with web browsers, if everything is a little bit flexible in what they accept, you both improve the smoothness of the experience and create room for growth and innovation.
I would perhaps argue that juniper's behavior is the preferable one.
Remember the definition of this "drop the message I think is broken" not inherently "drop the broken message," it's entirely plausible that the message is fine but you have a bug which makes you THINK it's a broken message.
There is also a huge difference between considering it a broken message and a broken session, which is what Arista did.
Arista did 2, but it also dropped the whole connection which was probably bad.
IMHO, just drop the broken attributes in the message and log them, and pass on the valid data if there's any left. If not, pretend you did not receive an UPDATE message from that particular peer.
Monitoring will catch the offending originator and people can deal with this without having to deal with any network instability.
In case you want to calibrate your sense of armchair-ness: you have completely missed the point that discarding an individual attribute can quite badly change the meaning of a route, and since we're talking about the DFZ here, such breakage can spread around the planet to literally every DFZ participant. The only safe thing you can do is to drop the entire route. Maybe there was a point to this being discussed at quite some length by very knowledgeable people, before 7606 became RFC ;)
(I haven't downvoted your comment, but I can see why others would — you're making very simple and definite statements about very complicated problems, and you don't seem to be aware of the complications involved. Hence: your calibration is a bit off.)
I still remember the mad scramble we had to fix CVE-2023-4481 across our entire network. This class of bugs is going to be an absolute nightmare to deal with, and because of the way BGP has been designed & implemented, it is going to take a _long_ time to fix these kinds of behaviors.
There was certainly a period of time where folks like AT&T alongside Juniper and Cisco drove BGP into crazytown by way of MPLS and VPN related features. Terrifyingly complex (imo) but lucrative for some.
Our IOS XR chassis' have gotten some of these packets. Corresponding with high bgp route advertisements. No idea what equipment upstream uses tbh.
Makes me wonder if the BGP protocol is properly fuzzed. Perhaps its one of those things that everyone is scared to try to knock over given it's so important.
I suppose it would be easy to write a fuzzer for bgp but very hard to diagnose crashes?
Is it just me or BGP is something I never learnt about until I heard about it causing issues? It seems it's essential to the internet, just like TCP/IP, but nevertheless I learnt about the latter in the university, during my career, I read many books about TCP/IP... but nothing about BGP (not in the university, not at work, not in books, nothing).
I can "play" with TCP/IP at home in dummy projects and learn more about it... but I have no idea how to "play" with BGP. In that regard, how does one learn about it at home?
Buy some routers that have BGP implementations (there are some cheap ones, Mikrotik for example), or use open source implementations. The article lists bird, another very popular one is FRR (free range routing). You can trivially stand up two docker containers, stand up a BGP session between them, and - for example - propagate static routes you set up within them.
It lets you setup multiple containers with direct connections between them in whatever topology you want. It allows you to run both Linux containers (with FRR for example) and emulated versions of popular router platforms (some of the ones mentioned in the article).
DESCRIPTION
bgpd is a Border Gateway Protocol (BGP) daemon which manages the network
routing tables. Its main purpose is to exchange information concerning
"network reachability" with other BGP systems. bgpd uses the Border
Gateway Protocol, Version 4, as described in RFC 4271.
At least to me one of the challenges is relating to the problems that BGP solves. You can get pretty far in network complexity before BGP (or OSPF etc) really does anything for you. What would be good scenarios one could encounter in "homelab" situation where BGP would be beneficial?
DN42[1] provides a playground for routing technologies. I wouldn't recommend digging in if you don't want to dedicate a lot of time into it. As someone fairly well versed in networking, WAN routing is still confusing to me.
GNS3 is probably the easiest way to get hands on experience with any networking technologies.
The other is to make a couple of virtual machines with a couple of network interfaces, make some sort of network betweeen them and then use some bgp routing deamon, eg:
My undergraduate networking course didn't touch BGP, my graduate networking course did touch BGP. We used a python package that acted as a simulator for different AS but I can't remember which one.
My undergrad networks course discussed a little BGP stuff but only on the blackboard.
To experiment with BGP you could use a network simulator like what the author of this blog did. In my class we used something called gini[1] which I think my profs grad student wrote but the author apparently used gns3 which seems to be a cisco specific ns3 version. I used ns3 once and found it had a steep learning curve. The gini simulator has a more basic user interface but is probably less powerful.
I think you need to manage a real (and large) network that's connected to global internet traffic in order to "play" with BGP. Well, you can tinker with it at home, but only by using a network simulator.
You can set up local BGP routers and peer them and play with it.
Another fun thing is to log into publicly available looking glass servers. Most ISPs (including very, very, very large ones) operate routers that have their full view of the BGP routing tables. They either run web interfaces that let you query those tables (more common) or make public ssh or telnet credentials to log in with roles that have very limited access to the available commands, but have read rights to those tables.
I've used BGP internally at my company for a decade, using AS65xxx range. At home I use BGP between the house, garage and shed, I much prefer it to OSPF.
You don’t need a large network to participate to BGP. You just need a /24 (IPv4) or /48 (IPv6) allocation, AS number, and a business class Internet connection that can do BGP. Might be out of reach for most hobbyists but not impossible.
When has BGP not been implicated in causing issues though?
The first widespread incident I found was from 1997 [1], but I didn't look too hard.
I don't think there's really a satisfying way to play with BGP as a small network. Traffic engineering is where I think the fun would be, but you've got to have lots of traffic and many connections for it to be worthwhile. Then you'd be trying to use your announcements to coax the rest of the internet to push packets through the connections you want. As well as perhaps adjusting your outgoing packets to travel the connections you prefer when possible. Sadly, nobody lets me play with their setup.
One of the ways to get a sense of emergent routing behavior is if you have hosting in many places, you'll likely see a lot of differences in routes when you start going to far off countries. If you run traceroutes (or mtr) from your home internet and your cell phone and various hosting, and if you can trace back... you'll likely see a diversity of routes. Sometimes you'll see things like going from west coast US to Brazil, where one ISP will send your packets to florida, then Brazil, and one ISP will send your packets to Spain, then Brazil, with a lot more latency.
You can play with BGP by joining https://dn42.eu/ - a fake internet with a few thousand participants who are mostly as clueless as you, and none of whom will lose millions of dollars per hour if it breaks (which is not infrequently).
TCP/IP affects every networked application and endpoint on the internet.
BGP runs the internet routing "in the background" and you only need to know it if you're an internet service provider or work in a large org managing the network. If you didn't learn network routing, you aren't going to learn BGP.
Put two or three VMs (OpenBSD has OpenBGPD daemon) onto a shared virtual switch and addresses in 172.31.255.0/24, connect the VMs. Also each of the VMs should have at least one other interface onto unique virtual switches with their own network (172.31.1.0/24, 172.31.2.0/24, etc).
So back when I did Wisp stuff I'd set up simulates networks between multiple machines with real and virtual networks. VyOS which was similar to the UBNT equipment we were using is light weight and supports multiple protocols.
In my opinion, containerlab is one of the easier tools to setup a lab environment for networking. You define a network with yaml which consists of nodes and links between them and it creates these using docker. They also have a BGP peering example lab: https://containerlab.dev/lab-examples/peering-lab/
Well, what did you study in the university? I did learn about BGP and routing in university since one of my subjects was information networks and protocols. But haven't really used it outside of some lab exercises since there's been no need at work nor at home.
I learned (and later taught) BGP (and routing in general), albeit at a superficial level, in high school already. Then I actually got to work with it during labs in university.
I remember Helsinki CS having quite a bit of BGP, TCP and both ipv4 and ipv6. No guarantees that every student aced those classes, but the teaching definitely was there
Cisco offers some simulator tooling. It basically virtualizes a lot of networking devices and allows you to play LEGO/SimCity with them: Cisco Packet Tracer
Now, we built toy networks from scratch while I was working toward my certification. Surely larger-scale simulation files could be loaded into Packet Tracer. And perhaps, vendors have simulators on a larger scale than the free downloads?
When I worked at a regional ISP, my supervisor was the BGP wizard. He referred to exterior routing as "a black art". Even more, the telcos were deploying their own technologies like Frame Relay and SMDS, which are Layer 1/Layer 2 protocols beyond the standard "point-to-point" leased lines.
We once experienced a fiber cut on our T-3 backbone (construction workers didn't dial 811). So my supervisor arranged the BGP routes to send everything over a 56k line, IIRC. He gloated about it. The packet loss rate was absurd, but our customers had connectivity!
if you're a linux person consider a routing on host setup with FRR with /32s. As every host is a /32 network you can focus more on the aspects of BGP rather than TCP/IP.
Given the impact of such bugs, I'm surprised there isn't a consortium with an interoperability test suite. Or maybe there is, and this specific issue isn't in the test suite. In which case, I'm surprised test cases aren't generated with a fuzzer and/or machine-generated full exploration of possible packet errors. I mean, it's fine if the suite takes hours or even days to run.
I guess the author of the article here has written a fuzzer with some coverage, and has come across similar issues before. Astonishing that the vendors don't pick up on this work hungrily.
You could
1) Filter the broken message
2) Drop the broken message
3) Ignore the broken attributes but pass them on
4) Break with the broken attributes
To me, only 4 (Arista) is the really unacceptable behaviour. 3 (Juniper) isn't desirable but it's not a devastating behaviour.
EDIT: Actually rereading it, Arista did 2 rather than 4. I think it just closed the connection as being invalid rather than completely crash. That's arguably acceptable, but not great for the users.
The most common approach is 'treat-as-withdraw', i.e. handle the update (announcement of a route) as if it was a withdraw (removal of previously announced route). You should not just drop the broken message as that whould lead to keeping old, no longer valid state.
Dead Comment
What you're paraphrasing here is the so-called "robustness principle", also known as "Poestel's law". It is an idea from the ancient history of the 1980s and 09s Internet. Today, it's widely understood that it is a misguided idea that has led to protocol ossification and countless security issues.
Widely claimed by some but certainly not "widely understood" because such phrasing implies a lack of controversy regarding the claim that follows it.
Now, we're experiencing the downside of this "feature"
In fact it's a bit strange just how lenient Juniper's software was here. If a session is configured as IBGP on one end and EBGP on the other end, it should never get past the initial message. Juniper not only let it get past the connection establishment but forwarded obviously wrong routes.
At a glance this “feature” seems like an incredibly bad idea, as it allows possibly unknown information to propagate blindly through systems that do not understand the impact of what they are forwarding. However this feature has also allowed widespread deployment of things like Large Communities to happen faster, and has arguably made deployment of new BGP features possible at all.
A BGP Update message is mostly just a container of Type-Length-Value attributes. As long as the TLV structure is intact, you should be able to just pass on those TLVs without problems to any peers that the route is destined for.
The problem fundamentally is three things:
1. The original BGP RFC suggests tearing down the connection upon receiving an erroneous message. This is a terrible idea, especially for transitive attributes: you'll just reconnect and your peer will resend you the same message, flapping over and over, and the attribute is likely to not even be your peer's fault. The modern recommendation is Treat As Withdraw, i.e. remove any matching routes from the same peer from your routing table.
2. A lack of fuzz testing and similar by BGP implementers (Arista in this case)
3. Even for vendors which have done such testing, a number of have decided (IMO stupidly) to require you to turn on these robustness features explicitly.
Remember the definition of this "drop the message I think is broken" not inherently "drop the broken message," it's entirely plausible that the message is fine but you have a bug which makes you THINK it's a broken message.
There is also a huge difference between considering it a broken message and a broken session, which is what Arista did.
IMHO, just drop the broken attributes in the message and log them, and pass on the valid data if there's any left. If not, pretend you did not receive an UPDATE message from that particular peer.
Monitoring will catch the offending originator and people can deal with this without having to deal with any network instability.
(I haven't downvoted your comment, but I can see why others would — you're making very simple and definite statements about very complicated problems, and you don't seem to be aware of the complications involved. Hence: your calibration is a bit off.)
Still think BGP is too complex and people keeping add new features and vendors keeping implement it based on RFC standard or draft.
And it seems BGP will never be deprecated so this sort of bugs will continue be found again and again...
https://en.wikipedia.org/wiki/HGC_Global_Communications
Makes me wonder if the BGP protocol is properly fuzzed. Perhaps its one of those things that everyone is scared to try to knock over given it's so important.
I suppose it would be easy to write a fuzzer for bgp but very hard to diagnose crashes?
Yes, this is exactly what I did in the post I linked to: https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-er...
This is great research!
I can "play" with TCP/IP at home in dummy projects and learn more about it... but I have no idea how to "play" with BGP. In that regard, how does one learn about it at home?
If you like guided tutorials, https://blog.ipspace.net/2023/08/bgp-labs-basic-setup/ is rather good and has been extended to somewhat advanced topics. Everything needed to follow along is free software.
It lets you setup multiple containers with direct connections between them in whatever topology you want. It allows you to run both Linux containers (with FRR for example) and emulated versions of popular router platforms (some of the ones mentioned in the article).
GNS3 is probably the easiest way to get hands on experience with any networking technologies.
[1]: https://wiki.dn42.us/home
One way to play with it is something like this: https://www.eve-ng.net/
The other is to make a couple of virtual machines with a couple of network interfaces, make some sort of network betweeen them and then use some bgp routing deamon, eg:
https://bird.network.cz/
https://www.nongnu.org/quagga/
etc.
To experiment with BGP you could use a network simulator like what the author of this blog did. In my class we used something called gini[1] which I think my profs grad student wrote but the author apparently used gns3 which seems to be a cisco specific ns3 version. I used ns3 once and found it had a steep learning curve. The gini simulator has a more basic user interface but is probably less powerful.
[1] https://citelab.github.io/gini5/ [2] https://docs.gns3.com/docs/
Check out this blog (not me, I just remember it from years back): https://blog.thelifeofkenneth.com/2017/11/creating-autonomou...
Another fun thing is to log into publicly available looking glass servers. Most ISPs (including very, very, very large ones) operate routers that have their full view of the BGP routing tables. They either run web interfaces that let you query those tables (more common) or make public ssh or telnet credentials to log in with roles that have very limited access to the available commands, but have read rights to those tables.
The first widespread incident I found was from 1997 [1], but I didn't look too hard.
I don't think there's really a satisfying way to play with BGP as a small network. Traffic engineering is where I think the fun would be, but you've got to have lots of traffic and many connections for it to be worthwhile. Then you'd be trying to use your announcements to coax the rest of the internet to push packets through the connections you want. As well as perhaps adjusting your outgoing packets to travel the connections you prefer when possible. Sadly, nobody lets me play with their setup.
One of the ways to get a sense of emergent routing behavior is if you have hosting in many places, you'll likely see a lot of differences in routes when you start going to far off countries. If you run traceroutes (or mtr) from your home internet and your cell phone and various hosting, and if you can trace back... you'll likely see a diversity of routes. Sometimes you'll see things like going from west coast US to Brazil, where one ISP will send your packets to florida, then Brazil, and one ISP will send your packets to Spain, then Brazil, with a lot more latency.
[1] https://en.m.wikipedia.org/wiki/AS_7007_incident
BGP runs the internet routing "in the background" and you only need to know it if you're an internet service provider or work in a large org managing the network. If you didn't learn network routing, you aren't going to learn BGP.
Put two or three VMs (OpenBSD has OpenBGPD daemon) onto a shared virtual switch and addresses in 172.31.255.0/24, connect the VMs. Also each of the VMs should have at least one other interface onto unique virtual switches with their own network (172.31.1.0/24, 172.31.2.0/24, etc).
Then set up BGP to redistribute connected routes.
It depends on what you study.
I did more of a sysadmin track, you (probably?) did comp sci/dev and would not encounter BPG in a dev job (probably).
/Live in Ericsson lands
A lab wont ever reflect the complexity of a carrier environment.
That said, just bang a couple of mikrotiks together if you want to play with it.
Cisco offers some simulator tooling. It basically virtualizes a lot of networking devices and allows you to play LEGO/SimCity with them: Cisco Packet Tracer
https://www.netacad.com/learning-collections/cisco-packet-tr...
Now, we built toy networks from scratch while I was working toward my certification. Surely larger-scale simulation files could be loaded into Packet Tracer. And perhaps, vendors have simulators on a larger scale than the free downloads?
https://developer.cisco.com/modeling-labs/
When I worked at a regional ISP, my supervisor was the BGP wizard. He referred to exterior routing as "a black art". Even more, the telcos were deploying their own technologies like Frame Relay and SMDS, which are Layer 1/Layer 2 protocols beyond the standard "point-to-point" leased lines.
We once experienced a fiber cut on our T-3 backbone (construction workers didn't dial 811). So my supervisor arranged the BGP routes to send everything over a 56k line, IIRC. He gloated about it. The packet loss rate was absurd, but our customers had connectivity!
Is the real issue that each vendor wants lock-in, so won't standardise?
DISCLAIMER: My understanding of BGP is hollow and shallow, I am not an expert.
I guess the author of the article here has written a fuzzer with some coverage, and has come across similar issues before. Astonishing that the vendors don't pick up on this work hungrily.