Thursday, November 12, 2009

X-Trace: A Pervasive Network Tracing Framework; Fonseca, Porter, Katz, Shenker, & Stoica

This paper discusses X-Trace, a network diagnostic tool that works across a range of domains and applications by attaching metadata to every network operation that belongs to a traced task. Because the metadata records which operation caused which, X-Trace can reconstruct each task as a tree and inspect its steps to troubleshoot failures. Nodes propagate the task's ID to their next hop and build up these task trees through two functions: pushDown, which copies the metadata to the layer below, and pushNext, which passes it to the next hop in the same layer.
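To make the propagation idea concrete, here is a minimal Python sketch of how pushNext and pushDown could record edges that later get stitched into a task tree. It is my own illustration, not the authors' code: the `Metadata` class, the `reports` list, the `build_tree` helper, and the example labels are all hypothetical names I made up.

```python
import itertools
from dataclasses import dataclass

# Hypothetical operation-ID generator; real IDs would not be a global counter.
_op_ids = itertools.count(1)

@dataclass(frozen=True)
class Metadata:
    task_id: str  # shared by every operation in the same task
    op_id: int    # identifies the operation carrying this metadata

# Stand-in for the reports a collector would receive:
# (task_id, parent_op, child_op, edge_type, label)
reports = []

def _push(md, edge_type, label):
    """Create metadata for a new operation and record the causal edge."""
    child = Metadata(md.task_id, next(_op_ids))
    reports.append((md.task_id, md.op_id, child.op_id, edge_type, label))
    return child

def push_next(md, label):
    """Propagate metadata to the next hop in the same layer."""
    return _push(md, "next", label)

def push_down(md, label):
    """Propagate metadata to the layer below."""
    return _push(md, "down", label)

def build_tree(task_id):
    """Group the recorded edges by parent operation to rebuild the task tree."""
    tree = {}
    for tid, parent, child, edge, label in reports:
        if tid == task_id:
            tree.setdefault(parent, []).append((child, edge, label))
    return tree

# Hypothetical task: an HTTP request that goes through a proxy, with each
# HTTP hop carried by its own TCP connection one layer down.
root = Metadata("task-42", 0)
http_proxy = push_next(root, "HTTP proxy")
tcp_a = push_down(root, "TCP client->proxy")
http_server = push_next(http_proxy, "HTTP server")
tcp_b = push_down(http_proxy, "TCP proxy->server")

tree = build_tree("task-42")
```

In this sketch `tree` maps each operation to the operations it caused, so a missing branch would hint at where a request stopped making progress.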

X-Trace's design was driven by three core ideas: trace requests should be sent in-band, collected traces should be sent out-of-band, and the entities that request tracing should be decoupled from the entities that receive the trace reports. These principles have good reasons behind them: trace requests need to follow the same path as the original flow, collected data should avoid the troublesome spots that it will help diagnose, and administrative domains (ADs) that are unwilling to disclose some of their information need not fear that the tool will expose it.
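The three principles can be sketched in a few lines of Python. This is only a toy model under my own assumptions: `Request`, `Domain`, `share_reports`, and the collector lists are hypothetical names, and real reports would travel over a separate network path rather than being appended to a list.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Request:
    payload: str
    # In-band: the trace request rides along with the data it describes.
    xtrace_task_id: Optional[str] = None

@dataclass
class Domain:
    name: str
    share_reports: bool  # each AD decides whether its reports leave the domain
    local_reports: list = field(default_factory=list)

    def handle(self, request, shared_collector):
        if request.xtrace_task_id is None:
            return  # untraced traffic produces no reports
        report = (request.xtrace_task_id, self.name)
        # Out-of-band: the report goes to a collector, not back along the datapath.
        self.local_reports.append(report)
        # Decoupling: whoever asked for the trace only sees what the AD releases.
        if self.share_reports:
            shared_collector.append(report)

shared = []
campus = Domain("campus", share_reports=True)
isp = Domain("isp", share_reports=False)

traced = Request("GET /photo.jpg", xtrace_task_id="task-7")
for domain in (campus, isp):
    domain.handle(traced, shared)
```

Here the ISP still records its own report and can use it internally, but the requester's view in `shared` only contains what the campus domain chose to release.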

A few scenarios for using X-Trace are described, including how traces could pinpoint problems in each: web site requests handled by an Apache server, a web hosting service for photos, the i3 overlay network, tunneling, and troubleshooting ISP connections. A big open issue for X-Trace is security, both for its metadata and for report generation. I liked how frankly the authors acknowledged problems with X-Trace, namely how difficult it is to deploy and how much effort adopting it or retrofitting it into existing systems would take.
