Thursday, October 29, 2009

Active network vision and reality: lessons from a capsule-based system; Wetherall

This paper describes ANTS, a framework for Active Networks. ANTS allows users to implement network functionality across an unrestricted set of nodes. Applications send special packets called capsules, in essence extended IP packets that identify the service code to be run on them, which are passed around by extensible routers called active nodes.

ANTS decouples capsules from service code, caching service code for particular streams of packets and requesting code from a capsule's previously-visited active node if the current node doesn't have the service code cached. The sandbox in which ANTS runs capsule code keeps capsules from misbehaving, whether by hogging resources or by interfering with other traffic in the network.
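A minimal sketch of this demand-loading idea, assuming a toy in-memory setup: the node checks its cache for the capsule's service code and, if missing, requests it from the previous active node the capsule visited. The class and method names here (ActiveNode, process, code_cache) are my own stand-ins, not the actual ANTS API, and the "sandbox" is only hinted at in a comment.

```python
# Toy sketch of ANTS-style demand loading of service code (not the real ANTS classes).
from dataclasses import dataclass, field

@dataclass
class Capsule:
    code_id: str          # fingerprint identifying the service routine
    payload: bytes = b""

@dataclass
class ActiveNode:
    name: str
    code_cache: dict = field(default_factory=dict)  # code_id -> callable service routine

    def process(self, capsule, prev_node):
        code = self.code_cache.get(capsule.code_id)
        if code is None and prev_node is not None:
            # Demand-load the code from the previously visited active node and cache it
            # so later capsules in the same stream don't trigger another request.
            code = prev_node.code_cache.get(capsule.code_id)
            if code is not None:
                self.code_cache[capsule.code_id] = code
        if code is None:
            return                       # drop the capsule: its service code is unavailable
        code(self, capsule)              # real ANTS runs this inside a restricted Java sandbox

a, b = ActiveNode("a"), ActiveNode("b")
a.code_cache["fwd"] = lambda node, cap: print(node.name, "ran", cap.code_id)
b.process(Capsule("fwd"), prev_node=a)   # b fetches "fwd" from a, caches it, then runs it
```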

The authors show graphs of ANTS latency and throughput versus capsule payload size and compare these results to implementations of forwarding in Java and C. I had always heard people complain about how slow Java is compared to compiled languages like C, but I had never seen any results to verify that sentiment; these graphs were enough for me, as the C implementation gets about 4x the throughput of ANTS/Java. Hence, if the authors are more interested in ease of portability, they can stick with their Java implementation of ANTS, but if they want better performance, Java is not the answer.

Active Networks definitely seem to be in violation of the E2E principle: they implement functionality inside the network for not much gain. Apparently Active Networks would be useful for deploying a few services such as multicast over the wide area, but beyond those few applications, it's difficult to see the real need for them.

Resilient Overlay Networks; Andersen, Balakrishnan, Kaashoek, & Morris

This paper discusses Resilient Overlay Networks (RON), which was created as a solution to the connectivity problems caused by BGP's taking too much time to recover from faults. RON nodes continually monitor path (or virtual link) quality between themselves and other nodes and exchange this information amongst themselves to determine optimal routes based on some quality metric.

As different applications have different route needs in terms of latency, loss, and throughput, each group of clients (programs communicating with a node's RON software) can use its own metric to determine paths. The node where a packet enters the overlay identifies the needs of the incoming flow and selects a path from its routing table; it tags the packet with a flow ID and sends it on its way. Subsequent RON nodes can use the flow ID tag to speed up forwarding of future packets in the flow.

RON uses link-state routing, where each node in the overlay needs to send information to every other node, which seems like a lot of wasteful overhead. However, since the problem being tackled deals with convergence time, RON's floods are perhaps more desirable than a lack of connectivity in the network. The three metrics a RON node keeps track of for each link are latency, loss, and throughput, and nodes frequently probe virtual links to ensure they're still usable.
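A small sketch of metric-driven overlay path selection, assuming each node already has probe-derived latency/loss estimates for every virtual link and that, as in the paper, paths go through at most one intermediate overlay node. The link table and scoring functions are illustrative stand-ins, not RON's actual data structures or formulas.

```python
# Illustrative overlay path selection over probed virtual-link metrics (not RON's exact policy).
links = {  # (src, dst) -> probe-derived estimates
    ("A", "B"): {"latency_ms": 30, "loss": 0.01},
    ("B", "C"): {"latency_ms": 25, "loss": 0.02},
    ("A", "C"): {"latency_ms": 90, "loss": 0.20},
}

def path_latency(path):            # lower is better (latency-sensitive apps)
    return sum(links[h]["latency_ms"] for h in zip(path, path[1:]))

def path_delivery(path):           # higher is better (loss-sensitive apps)
    p = 1.0
    for h in zip(path, path[1:]):
        p *= 1.0 - links[h]["loss"]
    return p

def best_path(src, dst, nodes, score, better):
    # Candidates: the direct virtual link plus paths via one intermediate overlay node.
    candidates = [(src, dst)] + [(src, m, dst) for m in nodes if m not in (src, dst)
                                 if (src, m) in links and (m, dst) in links]
    return better(candidates, key=score)

print(best_path("A", "C", {"A", "B", "C"}, path_latency, min))   # latency-optimized route
print(best_path("A", "C", {"A", "B", "C"}, path_delivery, max))  # loss-optimized route
```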

In evaluating RON, the authors found that in one dataset RON was able to fully overcome outages (defined as packet loss rates above some given threshold for a certain length of time), but only up to 60% of outages in the second dataset. The authors don't discuss this difference much, brushing it off as differences in the setup and claiming that the remaining 40% were due to sites not being reachable through any other RON site. It's not clear to me that the connectivity problem posed by BGP's slow recovery justifies the use of RON, which will not scale well and also might not play well with NATs.

Tuesday, October 27, 2009

On Inferring Autonomous System Relationships in the Internet; Gao

This paper discusses ASes and inferring the relationships between them. It formally defines the concepts of peering, provider-customer, and sibling ASes based on the notion of transiting traffic on behalf of other ASes. The author uses the fact that the selective export rule produces valley-free paths to establish 6 possible patterns an AS path can fit (namely uphill, downhill, uphill->downhill, uphill->p2p, p2p->downhill, and uphill->p2p->downhill).

The author introduces maximal uphill and downhill paths, which are respectively the longest uphill portion at the start of an AS path (customer->provider and sibling->sibling edges only) and the remaining portion of the path. Relationships between pairs of ASes on a path can be determined once the top providers of the maximal uphill and downhill paths are known.

The size of an AS can be approximated by its degree, which lets us identify the top provider as the highest-degree AS on the path. Prior to the top provider, all relationships are customer->provider or sibling relationships, and all relationships following it are provider->customer or sibling relationships; these can be further refined based on transiting knowledge obtained from the ordering of ASes in a path.
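A rough sketch of this top-provider heuristic under the stated assumption that degree approximates size: find the highest-degree AS on the path, then label edges before it as customer-to-provider and edges after it as provider-to-customer. This deliberately omits Gao's sibling refinement and the handling of peering edges around the top provider, and the example degrees are made up.

```python
# Simplified version of the basic inference idea (omits siblings and peering refinement).
def infer_relationships(as_path, degree):
    """as_path: list of AS numbers in path order; degree: dict AS -> node degree."""
    top = max(range(len(as_path)), key=lambda i: degree[as_path[i]])  # top provider index
    rels = {}
    for i in range(len(as_path) - 1):
        a, b = as_path[i], as_path[i + 1]
        if i < top:
            rels[(a, b)] = "customer->provider"   # uphill portion of the path
        else:
            rels[(a, b)] = "provider->customer"   # downhill portion of the path
    return rels

# Example: ASes 1 and 4 are stubs, 2 and 3 are providers, and 3 has the largest degree.
print(infer_relationships([1, 2, 3, 4], {1: 5, 2: 40, 3: 300, 4: 8}))
```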

Using heuristic algorithms, the author finds that out of all AS relationships from BGP routing tables taken on 3 different dates, few (1-2%) are sibling relationships, a few more (~8%) are peering, and the remainder are customer-provider. He verifies his inferences with information from AT&T and the WHOIS lookup service, finding that the customer-provider inferences are correct, the peering inferences are generally correct, and the sibling inferences are sometimes off.

This paper presented a cool, simple method for inferring AS relationships. However, it all hinges on the notion that a provider is larger in size than its client, which seems like a generally safe assumption to make, though it would be nice if the author had done some verifications of his own.

Monday, October 26, 2009

On Power-Law Relationships of the Internet Topology; Faloutsos, Faloutsos, & Faloutsos

This paper presents empirically-discovered power laws that can shine some light on traits of the Internet topology, including node neighborhoods and inter-domain connectivity. The laws are based on AS-level topology data collected between 1997 and 1998.

A few power laws were uncovered. The first involves the rank exponent: the outdegree of a node (the number of edges incident to the node) is proportional to the rank of the node (its index when nodes are sorted by decreasing outdegree) raised to the power of a constant R (the slope of the log-log plot of outdegree vs. rank, which can differ per graph family). Through this power law, the authors derive an estimate for the number of edges in a graph, which in practical terms can give an idea of the number of connections between domains in the Internet.
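A small sketch of how one might check a law like this empirically: fit a straight line to the log-log plot of outdegree versus rank, with the slope serving as the estimated exponent (the same least-squares procedure applies to the other exponents below). The degree sequence here is a made-up toy example, not data from the paper.

```python
# Estimate the rank exponent R by least-squares on the log-log plot (toy data).
import math

outdegrees = sorted([50, 20, 12, 9, 7, 5, 4, 3, 3, 2, 2, 2, 1, 1, 1], reverse=True)
xs = [math.log(rank) for rank in range(1, len(outdegrees) + 1)]
ys = [math.log(d) for d in outdegrees]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)

print(f"estimated rank exponent R ~ {slope:.2f}")  # negative: outdegree falls with rank
```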

A second power law states that the frequency of an outdegree (the number of nodes with that outdegree) is proportional to the outdegree raised to the power of a constant O (the slope of the log-log plot of frequency vs. outdegree). Apparently the value of O is fairly constant regardless of the graph family. I found this law pretty amazing, as I would have expected much more randomness in outdegree frequency. In particular, the authors make a fairly strong statement about using this law to test the realism of a topology: if the exponent is far off or if the topology doesn't follow the law, "it probably does not represent a real topology."

A less powerful "law" (deemed an approximation because it consists of only 4 data points) deals with neighborhood connectivity, stating that the number of pairs of nodes within some number of hops is proportional to the number of hops raised to the power of H (the slope of the log-log plot of pairs of nodes within h hops vs. h). Only one of the four graphs actually had enough data to show any evidence of this approximation; the others were pretty worthless.

The final power law states that the eigenvalues of a graph are proportional to their order (index in the decreasing eigenvalue sequence) raised to the power of E (the slope of the log-log plot of sorted eigenvalues vs. order). This law is much more abstract to me; I'm not quite sure what its practical ramifications are.

I didn't think I'd like this paper initially, thinking it would be heavy on the math and light on applicability, but with each power law, the authors presented some ways to use the law, such as for estimating the number of edges, neighbors, and diameter of a graph. Additionally, these laws can be used to verify the realism of proposed topologies. In using these laws to make predictions for the future, however, they are basing estimates on a little over one year's worth of data from the late 90s. My intuition is that their guesses are far too low, but I would be interested in seeing current numbers for the Internet.

Sunday, October 25, 2009

White Space Networking with Wi-Fi like Connectivity; Bahl, Chandra, Moscibroda, Murty, & Welsh

This paper discusses WhiteFi, a Wi-Fi-like protocol meant for use in the UHF white spaces. The difficulty in configuring networks in the UHF realm is performing efficient channel assignment and Access Point (AP) discovery without packet transmissions causing interference to incumbent users of the spectrum. WhiteFi must deal with spatial/temporal variation (different channels available in different locations/times) and spectrum fragmentation (non-contiguous free channels).

WhiteFi uses special hardware called KNOWS, which consists of a UHF translator that shifts the Wi-Fi signal to the proper UHF band, a scanner that searches for signals, and a processor that performs an FFT on the incoming signals. An AP selects the channel to operate on by finding the one that maximizes the metric MCham, its expected share of the channel as calculated over all associated clients; channel selection happens periodically.
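A toy sketch of the channel-selection idea: for each candidate (channel, width) pair, estimate how much usable airtime the AP's clients would collectively get given how busy that spectrum looks to each of them, and pick the option with the highest aggregate score. The data structures and the width-weighted averaging below are my own stand-ins, not the paper's exact MCham computation.

```python
# Illustrative AP channel selection: maximize the clients' aggregate expected airtime share.
# (A stand-in for WhiteFi's MCham metric, not the exact formula from the paper.)
def expected_share(channel, width, free_fraction_by_client):
    # free_fraction_by_client[client][(channel, width)] = fraction of time that spectrum
    # chunk looks free at that client, from the client's own measurements.
    shares = [fractions.get((channel, width), 0.0)
              for fractions in free_fraction_by_client.values()]
    # Wider channels carry more bits per unit of airtime, so weight the share by width.
    return width * sum(shares) / max(len(shares), 1)

def pick_channel(candidates, free_fraction_by_client):
    return max(candidates,
               key=lambda cw: expected_share(cw[0], cw[1], free_fraction_by_client))

clients = {
    "c1": {(21, 5): 0.9, (21, 10): 0.4, (30, 5): 0.7},
    "c2": {(21, 5): 0.8, (21, 10): 0.3, (30, 5): 0.9},
}
print(pick_channel([(21, 5), (21, 10), (30, 5)], clients))  # -> (21, 5) for this toy data
```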

Variable channel widths complicate the process of clients finding an AP. Instead of scanning every possible channel at every channel width, clients scan a subset of channels and use the SIFT technique to detect an AP in hardware. Moving this process from software to hardware and reducing the number of scans improved performance significantly in simulation. One important aspect of channel assignment the paper neglects is incumbent detection; the authors bank on future research solving this problem and assume that part is dealt with.

When an AP or client senses an incumbent on its channel, it disconnects. WhiteFi handles these disconnections via a chirping protocol. In the AP's normal beacons, it advertises a backup channel. In the case of a disconnection, the AP and client switch to the backup channel and transmit chirps until a new channel can be selected.

This paper had a number of aspects that were mysterious to me. First off, I didn't have much previous knowledge of the UHF spectrum, and the paper's introduction didn't really help that much. (Thanks to Wikipedia for a very helpful summary and description of UHF.) Secondly, why all this focus on wireless microphones? Out of all the electronic equipment out there, are these really the main concern in terms of wireless transmitters that could be impacted by interference from packets? Maybe it's just that they are prevalent and their usage is particularly difficult to predict. An unfortunate fact of research on the UHF spectrum is that full-scale evaluations cannot be done without FCC approval, so simulations have to suffice.

Interactive WiFi Connectivity for Moving Vehicles; Balasubramanian, Mahajan, Venkataramani, Levine, & Zahorjan

This paper describes ViFi, a protocol for minimizing interruptions in connectivity of WiFi in moving vehicles. The authors describe six policies for handoffs between base stations: RSSI (highest signal strength BS), BRR (highest beacon reception ratio), Sticky (stay with BS until connectivity drops too low), History (BS with historically best performance), BestBS (BS that will be best in future), and AllBSes (utilize all BSes in range). AllBSes represents an upper bound to performance; if packets are lost for one BS, there will likely be another that received them, particularly in situations involving bursty loss.

ViFi works by having the vehicle designate one BS as its anchor, which acts as the vehicle's gateway to the Internet; other BSes in range of the vehicle are auxiliaries. The vehicle periodically broadcasts beacons that contain info about the current anchor and auxiliaries as well as the previous anchor. ViFi is actually deployed in VanLAN, and the authors also use traces collected from DieselNet for simulations.

ViFi employs the techniques of diversity and salvaging to improve session connections; diversity is the use of multiple base stations to deliver packets, and salvaging is the retransmission of unacknowledged packets from the Internet that are still held at a previous anchor. Some of the difficulties in ViFi are similar to those in the ExOR system, including coordination between nodes.

The algorithm for transmission works as follows: if the intended destination receives a packet, it sends an ACK. Any auxiliaries that overhear the packet and don't hear the ACK within a certain period probabilistically relay the packet. These relaying auxiliaries must take into account other auxiliaries that might be relaying as well as quality of connectivity, all while trying to keep the expected number of relays to a minimum. Packet reception probabilities are estimated from the beacons the vehicle sends.
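A much-simplified stand-in for that relaying decision, under the assumption that each auxiliary knows (from beacon-derived reception rates) which other auxiliaries probably heard the packet and how well each can reach the destination: weight better-connected auxiliaries more heavily and normalize so that roughly one relay is expected. This is not the paper's actual probability formula, just an illustration of the balancing act it describes.

```python
# Simplified stand-in for a ViFi-style relaying decision (not the paper's exact formula).
import random

def relay_probability(me, heard_by, delivery_prob_to_dst):
    """heard_by: auxiliaries believed to have overheard the packet.
    delivery_prob_to_dst: estimated delivery probability from each auxiliary to the destination.
    Aim for roughly one expected relay, favoring better-connected auxiliaries."""
    total = sum(delivery_prob_to_dst[a] for a in heard_by)
    if total == 0:
        return 0.0
    return min(1.0, delivery_prob_to_dst[me] / total)

def maybe_relay(me, heard_by, delivery_prob_to_dst):
    # Called after overhearing a packet but not its ACK within the ACK timeout.
    return random.random() < relay_probability(me, heard_by, delivery_prob_to_dst)

probs = {"aux1": 0.8, "aux2": 0.4, "aux3": 0.2}
print(relay_probability("aux1", ["aux1", "aux2", "aux3"], probs))  # ~0.57
```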

ViFi doesn't do any packet reordering, as the authors claim it's not necessary for good performance, but it might have been nice to see some numbers about how frequent reordering is. The graphs involving session length took me a little longer to understand than I would have liked; I think there may be more intuitive ways of showing how good the connections were.

On the whole, ViFi seems like a pretty good idea and gets good results for short TCP transfers and VoIP sessions. I wonder how representative of a typical vehicular workload these tests are, though. Certainly short TCP transfers from web surfing make sense, but how often are people Skyping on a Microsoft shuttle bus? I liked that the paper talked about the limitations of ViFi; unfortunately, it sounds like large-scale deployment would be very difficult, and if ViFi were only implemented in large islands, maintaining connectivity could be complex.

Wednesday, October 21, 2009

ViFi + White Space Papers

Will post about these papers later when wireless access is less flaky...

Thursday, October 15, 2009

XORs in the Air: Practical Wireless Network Coding; Katti, Rahul, Hu, Katabi, Medard, & Crowcroft

This paper describes COPE, a forwarding architecture that employs the wireless coding technique of XORing multiple packets together at a router to reduce the total number of transmissions. COPE sits between the IP and MAC layers and decides which packets it can encode together to maximize the number of packets sent in one transmission while ensuring that each nexthop has enough information to actually decode the packet intended for it.

Nodes know which packets their neighbors have via reception reports that nodes broadcast frequently; however, this method fails during periods of congestion, at which point nodes are left to guess which packets neighbors have based on routing metrics like ETX. The foremost principle used by COPE is to never delay the packet at the head of its queue while waiting for other packets it could code with. Additionally, COPE tries to code together packets of the same length, which equates to padding shorter packets with zeroes if needed and queueing short and long packets in two different virtual queues per nexthop.
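A minimal sketch of that head-of-queue coding rule: take the packet at the head of the output queue and greedily XOR in additional packets (each for a different nexthop) only when every nexthop involved is believed to already have all the other native packets, so each can decode its own. This mirrors the rule described above but is my own toy code, not the paper's implementation.

```python
# Greedy COPE-style coding decision (an illustrative sketch, not the paper's implementation).
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    pid: int
    nexthop: str

def can_decode(nexthop, coded, has_packet):
    # A nexthop can decode its packet iff it already has every other native packet in the XOR.
    return all(has_packet(nexthop, p.pid) for p in coded if p.nexthop != nexthop)

def pick_packets_to_code(queue, has_packet):
    head = queue[0]                      # never delay the head-of-queue packet
    coded = [head]
    for p in queue[1:]:
        if p.nexthop in {q.nexthop for q in coded}:
            continue                     # can't code two packets for the same nexthop
        trial = coded + [p]
        if all(can_decode(q.nexthop, trial, has_packet) for q in trial):
            coded = trial
    return coded                         # XOR these payloads into one transmission

# has_packet would be backed by reception reports (or ETX-based guessing under congestion).
neighbors_have = {("B", 2), ("C", 1)}
print(pick_packets_to_code([Packet(1, "B"), Packet(2, "C")],
                           lambda n, pid: (n, pid) in neighbors_have))
```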

COPE adds a packet header that includes IDs of the native packets and their nexthops, as well as reception reports and cumulative ACKs. Even with this additional overhead, COPE can provide an increase in goodput in addition to an increase in coding gain. The hidden terminal scenario of wireless hurts COPE's TCP goodput, but when hidden terminals are disallowed (hence removing lots of collisions) COPE can improve TCP's goodput by over 30%.

I was surprised to see that packets were coded between 20-50% of the time based on guessing. Since COPE seems to perform well with guessing, it makes me wonder if reception reports are even really necessary at all. An interesting point of this coding scheme is that because multiple packets destined for the same nexthop can't be coded together, it opens up the fairness realm for packets destined towards a neighbor that might not otherwise have been able to capture the medium for a while. I like the simple notion behind this paper of using XOR operations to reduce the number of packets transmitted, though as the paper admits, this technique will not work so well for a network that is energy-limited.

ExOR: Opportunistic Multi-Hop Routing for Wireless Networks; Biswas & Morris

ExOR is a routing technique that broadcasts batches of packets to some set of nodes, out of which the "best" receiver proceeds to send the packets it received, with other receivers taking turns to fill in the packets the "best" one left out. In this way, ExOR can send packets with a smaller number of retransmissions than via traditional routing. Since a packet is forwarded by a single node, ExOR works with existing radio hardware, unlike many cooperative diversity schemes, which is why I assume the paper didn't attempt to compare any cooperative diversity techniques to ExOR.

ExOR uses a forwarder list that contains an ordering of when receivers should send their packets. Receivers "closer" to the destination send their packets sooner than those farther away, and nodes further down in the forwarder list decide when to transmit as-yet-unsent packets based on a predicted time of how long it takes higher-priority nodes to send. Nodes pass around a batch map with each packet that contains information about which packets have been transmitted by which nodes, so packets won't be retransmitted unnecessarily. This method adds a bit of overhead since it's a map of all packets in a batch, sent in every packet, but this overhead doesn't seem to be all that detrimental to throughput based on an experiment that varied batch size.
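A simplified sketch of how a forwarder might use the batch map: when its slot in the priority-ordered forwarder list comes up, it transmits only the packets it has received that the batch map doesn't already credit to a higher-priority node. The data structures are my own, and real ExOR also handles transmission scheduling and batch-map propagation, which this ignores.

```python
# Simplified ExOR-style forwarding decision using a batch map (illustrative only).
def packets_to_forward(my_id, forwarder_list, received, batch_map):
    """forwarder_list: node ids ordered from highest to lowest priority.
    received: set of packet ids this node heard during the batch.
    batch_map: packet id -> highest-priority node known to have received/sent it."""
    my_rank = forwarder_list.index(my_id)
    higher = set(forwarder_list[:my_rank])
    # Send only packets not already covered by a higher-priority forwarder.
    return sorted(p for p in received if batch_map.get(p) not in higher)

forwarders = ["dst", "n2", "n3", "src"]           # closer to the destination = higher priority
batch_map = {1: "dst", 2: "n2", 3: "src"}
print(packets_to_forward("n3", forwarders, {1, 2, 3, 4}, batch_map))  # -> [3, 4]
```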

ExOR only guarantees 90% packet arrival and cannot be paired with reliable transport such as TCP because its batching scheme might not work with TCP's chosen window size. After 90% of packets have reached the destination, ExOR finishes the batch using traditional routing.

The authors compare ExOR to traditional source routing using the ETX metric in Roofnet. They enforce the "store and forward" model at the level of entire files to prevent multiple nodes from sending at a time. A strength ExOR has over ETX-based source routing is that ExOR can make use of asymmetric links because the batch map doesn't have to flow back over the same path the packets used, though the authors don't say how the return path is chosen in that case. ExOR gets over 2x the throughput, generally with fewer packet transmissions; its hops can travel longer distances, hence bypassing some intermediate transmissions.

One complaint I have about the paper is an aesthetic one: the figures were laid out out-of-order, which made it difficult to quickly find the figure the text was describing.

Thursday, October 8, 2009

A Performance Comparison of Multi-Hop Wireless Ad Hoc Network Routing Protocols; Broch, Maltz, Johnson, Hu, & Jetcheva

This paper presents performance comparisons of 4 routing protocols for ad hoc wireless networks: DSDV, TORA, DSR, and AODV. Each protocol is simulated in ns-2 in a network of 50 moving nodes.

Destination-Sequenced Distance Vector (DSDV) is a distance vector protocol that must frequently broadcast its route updates. It uses sequence numbers for choosing new routes: a node adopts a newly advertised route if it has a greater sequence number, or an equal sequence number and a lower metric. Temporally-Ordered Routing Algorithm (TORA) works to quickly establish a route with little overhead using a "reverse link" algorithm, which can sacrifice route optimality.
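A minimal sketch of that DSDV acceptance rule (fresher sequence number wins; ties broken by the lower metric). The flat table layout is a simplification of what a real DSDV implementation keeps.

```python
# DSDV-style route update rule (a sketch of the rule described above).
def update_route(table, dst, new_seq, new_metric, new_nexthop):
    """table: dst -> (seq, metric, nexthop). Accept the advertised route if it is fresher,
    or equally fresh but with a lower metric."""
    cur = table.get(dst)
    if cur is None or new_seq > cur[0] or (new_seq == cur[0] and new_metric < cur[1]):
        table[dst] = (new_seq, new_metric, new_nexthop)

routes = {}
update_route(routes, "D", 100, 4, "B")
update_route(routes, "D", 100, 6, "C")   # ignored: same sequence number, worse metric
update_route(routes, "D", 102, 6, "C")   # accepted: fresher sequence number
print(routes["D"])                        # -> (102, 6, 'C')
```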

Dynamic Source Routing (DSR) places the route a packet should take into the packet's header, keeping intermediate nodes from needing to maintain any routing info, though nodes can cache routes they've seen or overheard to prevent too many route requests from flooding the network. Ad Hoc On-Demand Distance Vector (AODV) is a cross between DSR and DSDV. It sends out route requests whose replies carry hop counts and sequence numbers, and state is maintained on a hop-by-hop basis, with each node remembering next-hop info only.

Simulations were run for different amounts of movement of the 50 nodes (from continuous movement to no movement at all), with 10 different patterns of movement for each of 7 different "pause time" values. The authors also varied the movement speed, sending rates, packet size, and number of CBR sources. Also of note is that TCP was not used in the simulation. Three metrics of comparison were used for the different scenarios: packet delivery ratio, routing overhead (total routing packets transmitted), and path optimality (the difference in length between the path taken and the known shortest path).

In dealing with mobility, DSDV does poorly with short pause times (< 300 s), but DSR and AODV provide high packet delivery ratios. DSR generally has the lowest amount of overhead, with TORA having the most, and DSDV has a constant amount regardless of mobility because its broadcasts are periodic. The number of CBR sources only appears to affect TORA in terms of packet delivery, though there is a large amount of variability that is not captured by the graphs.
DSR has the least overhead from increased sources in terms of packets, though since each of its packets contains more bytes to deliver route information, it loses to AODV with regard to byte overhead. I liked this paper a lot because it really took into account a lot of different scenarios for comparing these protocols; unfortunately, since it was all in simulation, it's hard to say if these tests are really representative of the protocols in the wild.

A High-Throughput Path Metric for Multi-Hop Wireless Routing; De Couto, Aguayo, Bicket, & Morris

This paper presents expected transmission count (ETX) as a metric for routing in wireless networks. ETX attempts to minimize the total number of transmissions (including re-transmits) required to deliver a packet. It also accounts for asymmetric links. The hope is that a minimum number of transmissions equates to maximum possible throughput.

The authors actually test this routing metric on a real setup in an office building. They compare ETX's performance to that of min-hop count. In fact, they spend a significant amount of time trying to convince the reader that minimum hop count doesn't always do the best job, which I thought was handled succinctly enough by the first example they presented. I assume they were having to fight very hard to get min-hop count enthusiasts to come over to their side.

ETX works by having each node calculate the forward and reverse delivery ratios along each of its links; a link's ETX is the inverse of the product of the two ratios, and a route's ETX is the sum of its links' values. Nodes broadcast link probes once every set period of time, and nodes at the other end of a link count the probes they receive. There is a danger of probes themselves being lost or unable to be sent along the link, which skews the neighbors' ETX estimates.
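A small sketch of that computation from probe counts: the delivery ratio in each direction is the fraction of expected probes that arrived over a measurement window, and the link metric is 1/(df × dr). The formula comes from the paper, but the window and interval values below are illustrative and the bookkeeping is simplified.

```python
# ETX from probe counts over a measurement window (sketch; window bookkeeping simplified).
def delivery_ratio(probes_received, window_seconds, probe_interval_seconds):
    expected = window_seconds / probe_interval_seconds
    return probes_received / expected

def link_etx(fwd_probes_rcvd, rev_probes_rcvd, window=10.0, interval=1.0):
    df = delivery_ratio(fwd_probes_rcvd, window, interval)  # forward delivery ratio
    dr = delivery_ratio(rev_probes_rcvd, window, interval)  # reverse delivery ratio
    if df == 0 or dr == 0:
        return float("inf")       # unusable link
    return 1.0 / (df * dr)        # expected transmissions, counting data + ACK retries

def route_etx(link_etxs):
    return sum(link_etxs)         # a route's metric is the sum of its links' ETX values

print(link_etx(9, 8))             # e.g., 9/10 forward, 8/10 reverse -> ~1.39
```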

The evaluation of ETX using the DSDV routing protocol shows that ETX does only a little better than min-hop count in the context of a 1 mW radio with small packets. When larger packets (1,386 bytes instead of 134) are used, ETX doesn't outperform min-hop count, apparently because it overestimates link quality: the probes are small packets, which are more likely to be delivered successfully than larger data packets. Transmitting at a higher power also hurts ETX's relative performance compared to min-hop since nodes require fewer hops to reach one another. I wasn't particularly impressed by the performance benefits of ETX over min-hop count, especially since we already read about ETT, which was developed by essentially the same group of authors after this metric and seemed to do well enough.

Tuesday, October 6, 2009

Architecture and Evaluation of an Unplanned 802.11b Mesh Network; Bicket, Aguayo, Biswas, & Morris

There are 2 families of community wireless networks: well-planned ones with good throughput and connectivity, and unplanned ones with minimal setup required but with worse coverage. This paper proposes an architecture for a wireless network that combines the best of both worlds, namely a network that's quick and easy to set up but that still gets good performance; this architecture consists of unconstrained node placement, omnidirectional antennas, and link-quality-aware multi-hop routing optimized for throughput. It examines the feasibility of this architecture as implemented in Roofnet, which consists of 37 nodes with roof-mounted antennas spread over about 1.5 square miles of Cambridge, MA.

Each node has a non-globally routable IP address and assigns IP addresses via DHCP to hosts attached to its Ethernet ports. Hosts can communicate with one another via their nodes due to the NAT between the Ethernet and Roofnet at each node. There are 4 nodes serving as gateways and acting as NATs from Roofnet to the Internet.

Routes are chosen through Roofnet by Srcr, which picks the route with the lowest estimated transmission time (ETT), which presumably corresponds to the highest throughput. Srcr tends to favor a larger number of short, fast hops over fewer long-distance hops. Additionally, SampleRate selects the best bit-rate at which to send data over each link.

The authors evaluate the throughput of Roofnet over all pairs of nodes within the network. The average throughput was around 600 kbits/sec, with throughput being very closely tied to the number of hops between the two nodes. Interestingly, 10% of node pairings failed to find routes between them due to link losses that couldn't be overcome by Srcr's flooded queries, repeated after 5 seconds. The use of omnidirectional antennas appears to be worthwhile because most nodes vary their first hop. While Roofnet is fairly robust to the loss of links, removing only two of the best-connected nodes cuts throughput in half. However, in evaluating Roofnet, the authors did not interfere with the Internet access of Roofnet users, meaning that non-experiment traffic could have skewed the findings.

Sunday, October 4, 2009

Modeling Wireless Links for Transport Protocols; Gurtov & Floyd

This paper describes models for wireless links. The 3 types of links considered are cellular (low bandwidth, high latency), wireless LANs (low latency), and satellite (high latency, low bandwidth). The topology considered is one in which only one end of a path is a wireless link. The authors say that wireless links on both ends are an unlikely scenario except for the "future" possibility of VoIP being used for telephone calls, which is clearly the case 5 years later.

In considering different types of links, it's also important to keep in mind that each link has a different usage pattern. For example, cellular users will likely be involved in smaller transfers of data, and the ratio of downlink to uplink will vary across the 3 link types.

The paper points to four problems that exist in current modeling techniques. The first is that many models are simply unrealistic. Using TCP over wireless links with unrealistically high loss rates, using deprecated versions of TCP (possibly because they're easy to set up in simulators), and using unrealistic TCP retransmission timer constants are all representative of practices that just don't reflect real wireless networks.

The second problem is that some models are realistic but only explore a limited parameter space. For instance, the evaluation might be done using a single flow without regard for the possibility of multiple flows, or the model might employ reverse channel allocation for ACKs without taking into account the effects on performance if the ACKs are delayed. At the other extreme, some models are overly realistic: they simulate components using implementations with known flaws. While this practice would be fine for evaluating the badness of said component, it does not generalize well.

The last problem listed is the lack of reproducibility of some models. This problem consists of researchers not fully specifying their experimental settings, which is not so much a problem of the model as of poor documentation on the researchers' part.

The paper suggests ideas about how to model a number of different tricky scenarios, including error losses/corruption, delay variation, packet reordering, bandwidth changes and asymmetry, and mobility/handover implementation. Graphs showing the effects of mobility show the extent to which small changes like buffer size can affect the accuracy of a model; in fact, none of the 3 simulations really capture what happens in the actual measurements, which almost seems to suggest that all this modeling is a waste of time. It seems like modeling is only a good idea when a researcher is trying to get a general idea of how new changes will affect a network, but certainly not when trying to get an accurate picture of performance.

Thursday, October 1, 2009

A Comparison of Mechanisms for Improving TCP Performance over Wireless Links; Balakrishnan, Padmanabhan, Seshan, & Katz

This paper examines three different ways to improve TCP performance for wireless networks. Loss might equate to congestion in wired networks, but with wireless, the story is different. Assuming that loss is due to congestion can hinder performance by causing the connection to revert back to slow start and undertake recovery mechanisms. The three techniques for improving wireless performance with TCP are: end-to-end, in which the sender is forced to handle the losses gracefully; split-connection, in which the wireless link is hidden from the sender by splitting the connection at the base station; and link-layer solutions, which also attempt to hide losses from the sender but via local retransmission and error correction.

This paper uses a wireless testbed of PC base stations with laptop mobile hosts. A somewhat odd choice made for the tests was for bit errors to be randomly generated using an exponential distribution. Apparently this choice was made to allow for multiple packets to be lost in close proximity to one another, letting the authors compare the techniques in this sort of worst-case loss scenario, and they state that this distribution was not intended to actually model a real network.

Some options that improve the end-to-end approach include partial and selective acknowledgments (SACKs) along with explicit loss notification (ELN), which prevents large congestion window fluctuations; the greatest improvements come from combining SACKs and ELN. SACKs give a sender more information about which packets have actually made it through, and ELN involves marking ACKs with a note that a loss happened so that lost packets can be retransmitted without moving into congestion-control mode.

Split-connection protocols use a wired connection between a TCP source and a base station and a wireless connection between the base station and TCP receiver; two TCP connections are actually established, and since the wired connection doesn't experience loss, the base station need be the only one doing any retransmitting. However, since the wireless connection is the bottleneck for the completion of transmission, it still requires that some special mechanisms be used to improve performance.

The authors find that, in general, link-layer protocols tend to give the best performance improvements, particularly if they do fine-grained retransmissions (and hence actually keep the sender from having to retransmit as well) and if they deliver packets in order after a loss, thus preventing duplicate ACKs from being sent and triggering fast retransmit/recovery. Interestingly, in the face of low error rates, all techniques seem to do equally well.