Bram Cohen (bramcohen) wrote,


A bunch of people have been pestering me about Avalanche recently, so I'll comment on it.

First of all, I'd like to clarify that Avalanche is vaporware. It isn't a product which you can use or test with, it's a bunch of proposed algorithms. There isn't even a fleshed out network protocol. The 'experiments' they've done are simulations.

It's a bad idea to give much weight to simulations, especially of something so hairy as real-world internet behavior. I spent most of my talk at stanford explaining why it's difficult to benchmark, much less simulate, BitTorrent in a way which is useful. But we can look at their simulation to see if it might at least be ballpark.

Let's see, here's their simulation of 'tit-for-tat':

To simulate a tit-for-tat scenario, the simulator keeps track of the difference between uploaded blocks minus downloaded blocks from a user S (source) to a user D (destination). If the difference is larger than the pre-configured value (typically 2 when we study tit-for-tat), then the source S will not send any block to D even if there is spare upload capacity at S and spare download capacity at D.

I can't fathom how they came up with this. Researching either the source code or the documentation on the BitTorrent web site would have shown that the real choking algorithms work nothing like this. Either they just heard 'tit-for-tat' and just made this up, or they for some odd reason dredged up BitTorrent 1.0 and read the source of that. You see, BitTorrent work this way when it was nowhere near functional yet, and the first test among multiple peers (6 if my memory serves) showed that it sucks. It was promptly rewritten, way back in late 2001. This gaffe alone makes their simulation completely worthless, but it isn't the only one:

Whenever a user joins the system it picks four nodes at random and makes them its neighbors (provided that they have not exceeded the maximum number of neighbors, which is set to six in most of our experiments). The simulator supports dynamic user populations with nodes joining and leaving the system, and topology reconfigurations. In fact, at the end of each round, if a node determines that the utilization of its download capacity in the most recent rounds drops below a certain threshold (10% in most of our experiments), then it tries to discover and connect to new neighbors. Similarly, if the user has exceeded its maximum number of neighbors, then it will drop some of the old neighbors at random.

So in their simulations peers have 4-6 neighbors with a strong preference for 4. BitTorrent in the real world typically uses 30-50. Since BitTorrent depends entirely on its neighbors for information related to piece selection, this limitation has ratcheted the amount of useful information BitTorrent gets to the absolute minimum possible without making the system not work at all.

They also don't simulate varying transfer rate abilities, transfer rate abilities varying over time, or endgame mode.

In other words, intentionally or not, the simulation is completely rigged against BitTorrent.

The central idea here is basically 'Let's apply error correcting codes to BitTorrent'. This isn't a new idea, everybody comes up with it. In fact I saw fit to mention that it's a dubious idea before. (Some people will point out that 'error correcting codes' isn't the right term for the latest and greatest of this sort of technology, to which I say 'whatever'.) The main reason that this is a popular idea is that recent work in error correcting techology is very cool. While it is very cool, and very applicable to sending information across lossy channels, the case for using it in BitTorrent is unconvincing.

What error correction can in principle help with is that it the chances that any given peer has data which is of interest to another peer. In practice this isn't really a problem, because rarest first does a very good job of piece distribution, but error correction can in principle do as well as is theoretically possible, and rarest first is in fact less than perfect in practice.

One thing badly missing from this paper is back-of-the-envelope calculations about all of the work necessary to implement error correction. Potential problems are on the wire overhead, CPU usage, memory usage, and disk access time. Particularly worrisome for their proposed scheme is disk access. If the size of the file being transferred is greater than the size of memory, their entire system could easily get bogged down doing disk seeks and reads, since it needs to do constant recombinations of the entire file to build the pieces to be sent over the wire. The lack of any concrete numbers at all shows the typical academic hand-wavy 'our asymptotic is good, we don't need to worry about reality' approach. Good asymptotics are one thing, but constant multipliers can be killer, and it's necessary to work out constant multipliers for all pontentially problematic constants, not just the easy ones like CPU.

The really big unfixable problem with error correction is that peers can't verify data with a secure hash before they pass it on to other peers. As a result, it's quite straightforward for a malicious peer to poison an entire swarm just by uploading a little bit of data. The Avalanche paper conveniently doesn't mention that problem.

As you've probably figured out by now, I think that paper is complete garbage. Unfortunately it's actually one of the better academic papers on BitTorrent, because it makes some attempt, however feeble, to do an apples to apples comparison. I'd comment on academic papers more, but generally they're so bad that evaluating them does little more than go over epistemological problems with their methodology, and is honestly a waste of time.

If you're interested in doing more fleshed out research on error correction in BitTorrent, I suggest starting with a much less heavyweight approach. Having peers transfer the xor of exactly two pieces could potentially get most of the benefits of full-blown network coding.
  • Post a new comment


    Anonymous comments are disabled in this journal

    default userpic

    Your reply will be screened

    Your IP address will be recorded  

← Ctrl← Alt
Ctrl →Alt →
← Ctrl← Alt
Ctrl →Alt →