Monday, November 2, 2009

Looking Up Data in P2P Systems; Balakrishnan, Kaashoek, Karger, Morris, & Stoica

This article discusses the lookup problem in P2P networks, that is, scalable and efficient lookup. A few approaches include centralized databases, which have scalability and fault-tolerance issues, and DHTs, which involve passing around of key-value pairs. P2P systems do not employ hierarchical lookups, treating each host equally in the algorithm. Broadcasts between neighbors can be used when a node doesn't have a key, but too many broadcasts can quickly take down a system by using up all available bandwidth.

The authors then go into more depth about the DHTs, describing how each step in a lookup for some key or ID should progress to a node with a closer value to ID, based on some definition of closeness such as numeric difference or number of similar prefix bits. Chord is a DHT that uses a skip-list to allow binary searches for keys, whereas Pastry, Tapestry, Kademlia, and CAN all use tree-based structures.

The article is brief, having been published in communications of the ACM, but it seems to go into too much detail about the wrong things. I would have been more interested in an organized comparison of the different DHT implementations and an analysis of their strengths and weaknesses. Instead, I got a long discussion about P2P systems, skip lists, and hash table lookups.

2 comments:

Randy H. Katz said...

Such a critic! I was looking for a quick read paper that would talk about the different p2p systems of the time, and thought this was the one.

Priyanka Reddy said...

I think this paper did a good job at the start, explaining the differences between structured and symmetric lookup protocols and the pros/cons of each. But, once they went into the the various protocols, I got a little lost in the details. Their discussion at the end of open questions was helpful though.