Thursday, September 3, 2009

Understanding BGP Misconfiguration; Mahajan, Wetherall, & Anderson

This paper examines the globally visible impact of small-scale misconfigurations in BGP routers. The faults studied are the accidental insertion of routes into global BGP tables and their accidental propagation. To measure the frequency and effects of these faults, the authors looked at route updates in 19 ASes (it would have been interesting to know exactly which ASes these were) and classified a route change as a potential misconfiguration if it lasted only a short time (less than a day).

This idea is slightly flawed: it misses misconfigurations that go undiscovered for a long time, and legitimate route changes can also last less than a day. The authors try to deal with the latter problem by contacting ISP operators for information about each potential misconfiguration (though ironically enough, a third of these e-mails didn't reach their intended target due to stale data).
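To make the heuristic concrete, here is a minimal sketch of the "short-lived route" filter. The `RouteEvent` record, its field names, and the function names are my own assumptions for illustration and are not taken from the paper; the one-day cutoff is the threshold mentioned above.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class RouteEvent(NamedTuple):
    """Hypothetical record of one route's lifetime as seen from a vantage point."""
    prefix: str
    origin_as: int
    announced: datetime
    withdrawn: datetime


def is_candidate_misconfig(event: RouteEvent,
                           threshold: timedelta = timedelta(days=1)) -> bool:
    # A route that disappears again within the threshold is only a *candidate*;
    # it still has to be confirmed, e.g. by asking the operator.
    return event.withdrawn - event.announced < threshold


def candidate_misconfigs(events: List[RouteEvent],
                         threshold: timedelta = timedelta(days=1)) -> List[RouteEvent]:
    # Keep only the short-lived routes, preserving input order.
    return [e for e in events if is_candidate_misconfig(e, threshold)]
```

A filter like this shows both weaknesses directly: a misconfiguration that persists for a week is never flagged, and a legitimate six-hour route change is.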

The authors classify the origins of new routes as self-deaggregation, related origin, or foreign origin; the general view was that a foreign origin was more likely to be an address space hijack than a legitimate update. However, I would suspect that legitimate route updates of foreign origin are more prevalent today than seven years ago when this paper was written, as tech companies like Google and Facebook, with equipment all over the world, implement policies that handle failures in one location by shifting traffic elsewhere.
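The three categories can be sketched as a small classifier. The definitions used here are my assumptions about what the terms mean, not quotations from the paper: self-deaggregation is a more-specific prefix announced by the same origin AS, related origin means one route's origin AS appears in the other route's AS path, and anything else is foreign. The function name and the extra "same origin" return value are hypothetical.

```python
import ipaddress
from typing import Sequence


def classify_origin(new_prefix: str, new_path: Sequence[int],
                    old_prefix: str, old_path: Sequence[int]) -> str:
    """Classify a new route against an existing one.

    AS paths are lists of AS numbers with the origin AS last.
    """
    new_net = ipaddress.ip_network(new_prefix)
    old_net = ipaddress.ip_network(old_prefix)
    new_origin, old_origin = new_path[-1], old_path[-1]

    if new_origin == old_origin:
        # Same AS announcing a strictly more-specific piece of its own block.
        if new_net.prefixlen > old_net.prefixlen and new_net.subnet_of(old_net):
            return "self-deaggregation"
        return "same origin"  # not a new-origin event at all

    # Assumed meaning of "related": the two origins share an AS path.
    if new_origin in old_path or old_origin in new_path:
        return "related origin"

    return "foreign origin"
```

For example, `classify_origin("10.0.0.0/16", [7, 8], "10.0.0.0/16", [1, 2, 3])` falls through to the foreign case, which is the one treated as a likely hijack.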

There were a number of interesting findings from this study. The authors found that the BGP misconfigurations they observed noticeably increased routing load but didn't significantly hinder connectivity. Looking into the causes turned up some slightly disturbing facts, such as a bug in a certain type of router that leaked entries during reboot, and how much manual configuration is required from operators; a misconfiguration can stem from underlying causes one might not have thought were involved at all. This paper is of particular interest because it studied routing changes in the wild and then took steps to uncover the causes of the misconfigurations.

1 comment:

Randy H. Katz said...

Of course the paper is a few years old, but it is interesting to note how many underpowered routers there appear to be out there.