
Mon, Jun. 20th, 2005, 08:51 am

A bunch of people have been pestering me about Avalanche recently, so I'll comment on it.

First of all, I'd like to clarify that Avalanche is vaporware. It isn't a product which you can use or test; it's a bunch of proposed algorithms. There isn't even a fleshed-out network protocol. The 'experiments' they've done are simulations.

It's a bad idea to give much weight to simulations, especially of something as hairy as real-world internet behavior. I spent most of my talk at Stanford explaining why it's difficult to benchmark, much less simulate, BitTorrent in a way which is useful. But we can look at their simulation to see if it might at least be in the right ballpark.

Let's see, here's their simulation of 'tit-for-tat':

To simulate a tit-for-tat scenario, the simulator keeps track of the difference between uploaded blocks minus downloaded blocks from a user S (source) to a user D (destination). If the difference is larger than the pre-configured value (typically 2 when we study tit-for-tat), then the source S will not send any block to D even if there is spare upload capacity at S and spare download capacity at D.

I can't fathom how they came up with this. Researching either the source code or the documentation on the BitTorrent web site would have shown that the real choking algorithms work nothing like this. Either they just heard 'tit-for-tat' and made this up, or for some odd reason they dredged up BitTorrent 1.0 and read the source of that. You see, BitTorrent worked this way back when it was nowhere near functional yet, and the first test among multiple peers (6, if my memory serves) showed that it sucked. It was promptly rewritten, way back in late 2001. This gaffe alone makes their simulation completely worthless, but it isn't the only one:
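For contrast, here is a minimal sketch (with invented names; neither the Avalanche simulator's code nor BitTorrent's actual choker) of the policy they simulated next to a rate-based choker closer to what BitTorrent actually does:

```python
def simulated_tit_for_tat(uploaded_blocks, downloaded_blocks, threshold=2):
    """The policy the paper simulates: stop sending to a peer once the
    cumulative block-count imbalance exceeds a fixed threshold."""
    return (uploaded_blocks - downloaded_blocks) <= threshold

def rate_based_choking(peers, num_unchoked=4):
    """Closer to what BitTorrent actually does: periodically unchoke the
    peers currently uploading to us the fastest (the optimistic unchoke
    is omitted for brevity)."""
    ranked = sorted(peers, key=lambda p: p["download_rate"], reverse=True)
    return {p["id"] for p in ranked[:num_unchoked]}
```

The first function refuses service based on a lifetime block count; the second keeps re-evaluating instantaneous rates, which is exactly the distinction the simulation misses.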

Whenever a user joins the system it picks four nodes at random and makes them its neighbors (provided that they have not exceeded the maximum number of neighbors, which is set to six in most of our experiments). The simulator supports dynamic user populations with nodes joining and leaving the system, and topology reconfigurations. In fact, at the end of each round, if a node determines that the utilization of its download capacity in the most recent rounds drops below a certain threshold (10% in most of our experiments), then it tries to discover and connect to new neighbors. Similarly, if the user has exceeded its maximum number of neighbors, then it will drop some of the old neighbors at random.

So in their simulations peers have 4-6 neighbors, with a strong preference for 4. BitTorrent in the real world typically uses 30-50. Since BitTorrent depends entirely on its neighbors for information related to piece selection, this limitation ratchets the amount of useful information BitTorrent gets down to the absolute minimum possible without making the system stop working at all.

They also don't simulate varying transfer rate abilities, transfer rate abilities varying over time, or endgame mode.

In other words, intentionally or not, the simulation is completely rigged against BitTorrent.

The central idea here is basically 'Let's apply error correcting codes to BitTorrent'. This isn't a new idea; everybody comes up with it. In fact I saw fit to mention that it's a dubious idea before. (Some people will point out that 'error correcting codes' isn't the right term for the latest and greatest of this sort of technology, to which I say 'whatever'.) The main reason this is a popular idea is that recent work in error correcting technology is very cool. While it is very cool, and very applicable to sending information across lossy channels, the case for using it in BitTorrent is unconvincing.

What error correction can in principle help with is that it the chances that any given peer has data which is of interest to another peer. In practice this isn't really a problem, because rarest first does a very good job of piece distribution, but error correction can in principle do as well as is theoretically possible, and rarest first is in fact less than perfect in practice.
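Rarest-first piece selection can be sketched in a few lines (an illustration with invented names, not the client's actual code; the real client also randomizes ties and special-cases the first pieces):

```python
from collections import Counter

def rarest_first(my_pieces, peer_bitfields):
    """Pick the piece we lack that the fewest neighbors have.
    my_pieces: set of piece indices we hold; peer_bitfields: one set of
    piece indices per neighbor."""
    counts = Counter()
    for bitfield in peer_bitfields:
        for piece in bitfield:
            if piece not in my_pieces:
                counts[piece] += 1
    if not counts:
        return None  # nothing useful available from any neighbor
    return min(counts, key=lambda p: (counts[p], p))
```

Because each peer downloads the locally rarest piece first, piece availability stays close to uniform across the swarm, which is why peers almost always have something of interest to each other in practice.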

One thing badly missing from this paper is back-of-the-envelope calculations about all of the work necessary to implement error correction. Potential problems are on-the-wire overhead, CPU usage, memory usage, and disk access time. Particularly worrisome for their proposed scheme is disk access. If the size of the file being transferred is greater than the size of memory, their entire system could easily get bogged down doing disk seeks and reads, since it needs to do constant recombinations of the entire file to build the pieces to be sent over the wire. The lack of any concrete numbers at all shows the typical academic hand-wavy 'our asymptotics are good, we don't need to worry about reality' approach. Good asymptotics are one thing, but constant multipliers can be killer, and it's necessary to work out constant multipliers for all potentially problematic resources, not just the easy ones like CPU.
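As an illustration of the kind of back-of-the-envelope arithmetic that's missing, here is a rough cost model; the 8 ms seek time and 50 MB/s throughput are illustrative assumptions, not measurements of any real system:

```python
def recombination_read_cost(file_size_bytes, piece_size_bytes,
                            seek_ms=8.0, read_mb_per_s=50.0):
    """Time to re-read an entire file from disk to build one coded piece,
    assuming the file doesn't fit in RAM and each piece costs one seek."""
    pieces = file_size_bytes / piece_size_bytes
    seek_s = pieces * seek_ms / 1000.0
    read_s = file_size_bytes / (read_mb_per_s * 1024 * 1024)
    return seek_s + read_s

# A 4 GiB file in 256 KiB pieces: 16384 seeks plus a full sequential
# read per outgoing coded piece, i.e. on the order of minutes each.
cost = recombination_read_cost(4 * 2**30, 256 * 2**10)
```

Even if the assumed numbers are off by a factor of a few, paying a whole-file read per transmitted piece is the sort of constant multiplier the paper never confronts.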

The really big unfixable problem with error correction is that peers can't verify data with a secure hash before they pass it on to other peers. As a result, it's quite straightforward for a malicious peer to poison an entire swarm just by uploading a little bit of data. The Avalanche paper conveniently doesn't mention that problem.
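For contrast, the per-piece check that plain BitTorrent relies on is trivial to sketch (a minimal illustration, not the client's actual code; the expected hashes ship in the .torrent file):

```python
import hashlib

def verify_piece(piece_data, expected_sha1_hex):
    """Compare a received piece against the fixed hash from the
    .torrent file. Coded blocks have no such fixed hash to check
    against before forwarding, which is the poisoning problem."""
    return hashlib.sha1(piece_data).hexdigest() == expected_sha1_hex
```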

As you've probably figured out by now, I think that paper is complete garbage. Unfortunately it's actually one of the better academic papers on BitTorrent, because it makes some attempt, however feeble, to do an apples-to-apples comparison. I'd comment on academic papers more, but generally they're so bad that evaluating them does little more than go over epistemological problems with their methodology, and is honestly a waste of time.

If you're interested in doing more fleshed out research on error correction in BitTorrent, I suggest starting with a much less heavyweight approach. Having peers transfer the xor of exactly two pieces could potentially get most of the benefits of full-blown network coding.
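The XOR scheme is easy to sketch: a peer sends the XOR of two pieces, and any peer that already holds either one recovers the other for free (a sketch of the lightweight idea, not a protocol):

```python
def xor_pieces(a: bytes, b: bytes) -> bytes:
    """Encode: send a XOR b instead of either piece (pieces assumed to
    be the same length)."""
    return bytes(x ^ y for x, y in zip(a, b))

def recover(xored: bytes, known: bytes) -> bytes:
    """Decode: since (a XOR b) XOR a == b, one known input yields the
    other."""
    return xor_pieces(xored, known)
```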

Mon, Jun. 20th, 2005 07:47 pm (UTC)
(Anonymous): The authentication problem _is_ addressed.

You state:
The really big unfixable problem with error correction is that peers can't verify data with a secure hash before they pass it on to other peers. As a result, it's quite straightforward for a malicious peer to poison an entire swarm just by uploading a little bit of data. The Avalanche paper conveniently doesn't mention that problem.

You really should learn to read papers to the end, Bram. They do address the problem. Directly. The last paragraph of the paper addresses the problems of authenticating blocks in a rateless coding system. It references the public literature on the subject. It describes the computational cost of performing the check using the extant disclosed methods. It states that the authors have devised a scheme which does not have this computational cost, which they are not disclosing at the moment.

Mon, Jun. 20th, 2005 08:28 pm (UTC)
uf_arachnid: Re: The authentication problem _is_ addressed.

This paper describes a homomorphic hash algorithm that can be used to verify individual encoded blocks. Unfortunately, it's extremely slow to compute and verify, the resultant hashes are generally large, and it expands encoded blocks by 1 bit per sub-block (not much overhead, but generally in the "what a pain" category).

If the authors of this paper have come up with something better, I wish they'd hurry up and show us!

Tue, Jun. 21st, 2005 01:05 am (UTC)
ocker3: Error correction paragraph correction

What error correction can in principle help with is that it the chances that any given peer has data which is of interest to another peer.

Surely saying "What error correction can in principle help with is that it increases the chance . . ." makes more sense?

And the rest of the paragraph makes it unclear which method you prefer, error correction or rarest first (perhaps neither?).

Tue, Jun. 21st, 2005 04:30 pm (UTC)
(Anonymous): Re: Error correction paragraph correction

I think it's obviously rarest first.

Tue, Jun. 21st, 2005 01:59 am (UTC)
(Anonymous): Angry Geek?


There is too much vernacular, execration and self-aggrandisement in your response to consider it seriously from an academic perspective. Rework.

Tue, Jun. 21st, 2005 02:25 am (UTC)
(Anonymous): Re: Angry Geek?

Good thing an academic perspective doesn't have much application to reality =)

Tue, Jun. 21st, 2005 02:15 am (UTC)
(Anonymous): ipv6 checksum

Hi, I'm watching your Stanford talk - it's interesting. First thing: there must be a reasonable way to simulate torrent setups, if scientists can model airflow, nuclear reactions, lattice QCD, etc.

Also, you expressed concern about the checksum in IPv4 being only 4 bytes and hoped that IPv6 has more. I just did a cursory check - it seems to have less: 2 bytes. Do they know something more about error correction here? I don't know, but probably they do.



Tue, Jun. 21st, 2005 08:06 pm (UTC)
(Anonymous): Re: ipv6 checksum

do they know something more about error correction here?

Perhaps they are merely doing error detection; 16 bits was always sufficient for sectors on a floppy disc, where the length of data over which the CRC is calculated is relatively short. Doing correction requires significantly more bits, related to the size and number of holes you're trying to fix.

The problem would seem to be preventing poisoned or damaged blocks from further propagation - for example, using public-private key signing of blocks so the torrent can route around the "damage". The "correction" process is to discard the block, not try to fix it.


Tue, Jun. 21st, 2005 03:05 am (UTC)
(Anonymous): Focus.

Greetings Bram, Ignore the haters. Don't bog yourself down considering every single bit of input. Stay focused on the important things. Good luck.

Tue, Jun. 21st, 2005 03:21 am (UTC)
(Anonymous): Simple Question

Um... It seems legit to me... Then again, what do I know. meh.

P.S. Logging in would mean that I would have to remember a password and stuff. Much easier to just spam posts that sound like an ex-boyfriend that just can't get over the fact that your little lady actually found something that treats her like a woman...

Happy Cheese

Tue, Jun. 21st, 2005 06:58 am (UTC)
hydrocrates: Simply...thanks

Bram, BitTorrent has been and continues to be useful. Thank you for writing it, slaving over it to make it better, and sharing it with the world.

Thanks for the Stanford video link; it was helpful to watch it and then read the rest of your commentary.

Would that we could all get along and devote more of our energies into building better than bickering.

Keep the faith.
aka Tragen

Tue, Jun. 21st, 2005 08:41 am (UTC)

Nice stuff, Bram. I'm going a little off topic looking for some torrent help. The network I use currently blocks all forms of torrent activity through packet shaping (mangling). I really need to use torrents because that's how I share big files with other family members. I tried explaining this to the network director at my college but she wouldn't even consider it. I was wondering if there's any upgrade or new technology that enables torrents to work even through packet-shaper technology.

Thanks. I appreciate what you're doing for the community, and nice presentation at Stanford!

Tue, Jun. 21st, 2005 07:20 pm (UTC)
(Anonymous): Secure tunnels?

I'm in the same situation as you: the university I work at has traffic-shaped everything. I wonder whether it would be a good idea for BitTorrent clients to support encrypted sessions a la ssh. That way, it would look like supposedly legit traffic ;-)
Now, I haven't used torrents for almost a year, so perhaps this is already supported.

Tue, Jun. 21st, 2005 09:37 am (UTC)
hunkymouse: The importance of Skype and IM and VoIP

You'll notice that this piece (http://www.theikew.co.uk/2005/06/avalanche_vapou.html), by an old colleague of Dvorak's, suggests a reason why Microsoft might be trying to get its own P2P technology.

Essentially, it points out that IM servers are big and expensive, and MSN runs one of the biggest - and points out, also, that Skype is actually a very successful IM. An interesting point!

Wed, Oct. 26th, 2005 06:30 am (UTC)
fallibledragon: Re: The importance of Skype and IM and VoIP

I think the most obvious reason for MS wanting to get involved in P2P is that they hope they can dominate it with superior technology that no one can give up, and then introduce DRM.

Also -- and this just occurred to me when reading Bram's article -- BitTorrent doesn't have a built in search system, which makes it hard to swamp with fake information, like Gnutella gets from a few bad nodes injecting junk. For a company like MS, keeping track of popular sites that kids use, and injecting fake stuff in a way that's convincing is probably much tougher.

Tue, Jun. 21st, 2005 01:30 pm (UTC)

This article is indirectly linked from Slashdot. I recommend editing it to block or screen anonymous comments...

Tue, Jun. 21st, 2005 02:25 pm (UTC)
(Anonymous): Me too...

ciphergoth is right...I, also, recommend editing it to block or screen anonymous comments.

Tue, Jun. 21st, 2005 02:59 pm (UTC)
(Anonymous): Disk access

I believe that they can solve the poisoning problem. But I suspect that the disk access problem can be fatal for the technology. Their paper lacks any detail about how to pick the pockets to recombine. It suggests that they combine every available pocket whenever they construct any one outgoing pocket. Of course, this is not feasible. Bram, what's your guess, how do they choose the pockets to combine? Maybe they use a queue of outgoing pockets, randomly recombined with each other and with incoming pockets? Would this work?

For the conspiracy-theory crowd: Avalanche is a Microsoft Research project. Most probably the only reason for its existence is that the Microsoft researcher thinks that p2p is a sexy research topic. Hundreds of researchers work at Microsoft Research on projects that will never be commercialized. I know some of them, and most of the time there is no pressure on them from the business side.

Daniel Varga

Tue, Jun. 21st, 2005 07:47 pm (UTC)
secondshadow: Re: Disk access

it might be kindof interesting if the paper described how to effectively pick pockets... :)

Tue, Jun. 21st, 2005 03:35 pm (UTC)
(Anonymous): Good job

why are you saying stuff like: "The really big unfixable problem with error correction is that peers can't verify data with a secure hash before they pass it on to other peers. As a result, it's quite straightforward for a malicious peer to poison an entire swarm just by uploading a little bit of data"?

You should never point out the flaws of your system to your enemies...

Tue, Jun. 21st, 2005 05:20 pm (UTC)
(Anonymous): Re: Good job


we're still talking about data transfer protocols and not international warfare, right?

Tue, Jun. 21st, 2005 04:26 pm (UTC)

Potential problems [with Avalanche] are on the wire overhead, CPU usage, memory usage, and disk access time. Particularly worrisome for their proposed scheme is disk access

[MS-bashing]I thought these were design specs for most MS code anyhow. [/MS-bashing]

MS's catch-up strategy has been hit and miss. They hit (hard) with IE, and they hit low and painfully with MS Office, but MSN Search remains non-competitive with Google. This smells like the same: BT is too embedded in the file-sharing crowd (not to mention the large-file-distribution crowd) for anything but a true successor to knock it out, and Avalanche is not that, certainly not now, and dubiously ever.

Fri, Apr. 14th, 2006 07:08 am (UTC)

Unfortunately, if Avalanche comes with Windows (Vista or the next service packs) then it will become more used, especially if it also handles real torrents. But it will only be able to create its own version of .torrent files, thus making everyone migrate over without realising the overhead and complexity they are creating.
Just like the Netscape vs IE war: IE comes with Windows, Netscape vanishes, security problems for years, Firefox finally triumphs.
While OSS will probably eventually win, it is still a hit to widespread usage (and therefore widespread accommodation, such as which browser webmasters design their pages for).
There are some nice ideas in Avalanche, but IMHO not enough to make them worth the overheads they would create.

Tue, Jun. 21st, 2005 04:43 pm (UTC)
(Anonymous): More Microsoft FUD

Let the information warfare puzzle begin! Can you estimate how many postings were written by MSFT agents?

Tue, Jun. 21st, 2005 06:14 pm (UTC)

Uh, the public docs on your site state that BT uses "tit for tat" still. Can't blame them for using it!

Tue, Jun. 21st, 2005 06:49 pm (UTC)

BitTorrent does use tit-for-tat, but not the braindead way they implemented it in the simulation. BitTorrent bases it on instantaneous transfer rates rather than total block counts.

Tue, Jun. 21st, 2005 06:24 pm (UTC)
(Anonymous): Have fun trying to patent anything with Avalanche, M$

Microsoft may also be breaching a few pending patents here.
Murld Media, the developer of LX Systems and Peer Impact, has already applied for patents on a similar model to what the M$ paper proposes, although they don't seem to use any "rateless" protocol, and as Bram stated it has to be TCP.


Tue, Jun. 21st, 2005 08:22 pm (UTC)
(Anonymous): Avalanche

Excellent job keep up the good work and don't bother with them.

Tue, Jun. 21st, 2005 08:29 pm (UTC)
capveg: Apples to Apples

If you want an apples-to-apples comparison against BitTorrent, check out Slurpie (http://slurpie.sourceforge.net). We implemented our code (no simulations) and it outperformed BitTorrent (we ran against the actual BT code) in both our test LAN and in wide-area Internet experiments via PlanetLab.

The code was a proof of concept, so it is not robust (it gags on HTTP tags it doesn't recognize) and hasn't been maintained in the last 2 years, but there are some good ideas in it that might be worth adding to BitTorrent. For more info, check out the paper linked to off the website or contact me.

- Rob

Tue, Jun. 21st, 2005 10:04 pm (UTC)
adinb: Re: Apples to Apples


While I think that showing other protocols is germane to this discussion, I'd be a lot more interested to see a highlight of error-correction techniques.

There's almost always a tradeoff between error correction and speed (wire, local resources, etc). I know there's a lot going on in the industry about tweaking Reed-Solomon error correction (which can have up to a 2X-1 overhead on the wire), but reliability significantly decreases as the overhead of the RS code goes down.

Maybe I'm a little beyond the times, but is anything really showing a better result than RS these days?

Tue, Jun. 21st, 2005 11:28 pm (UTC)
azraphael: A modest proposal

I was wondering whether using an idea like the Free Network Project's FEC-encoded splitfiles would work? I have a little idea doc on my site.

P.S. You probably hear this from everyone, but excellent job Bram!


Tue, Jun. 21st, 2005 11:46 pm (UTC)
(Anonymous): Tracker load

Have you considered the impact of network coding on tracker scalability?
From what I understand, BitTorrent trackers need to track which chunks each peer has so far. But Avalanche doesn't care about tracking individual chunks, only which peers are currently connected.

Thu, Jun. 23rd, 2005 10:52 pm (UTC)
(Anonymous): Re: Tracker load

Actually, all the tracker does is keep a count of who is a peer and who is a seed. Theoretically you could even stop tracking that, just give a list of IPs and the clients would sort it between themselves.

Tue, Jun. 21st, 2005 11:51 pm (UTC)
(Anonymous): Well Put Mate

Very Well Put Mr Cohen,
I personally do not believe that a system that is still in theory can rival your BitTorrent. Microsoft's way of constructing the network has most probably been tried by other developers with no real success.
I look forward to when "Avalanche" goes live and we can start finding bugs :)

Wed, Jun. 22nd, 2005 12:55 am (UTC)
(Anonymous): Network Coding Explained

All right... there has been way too much rhetoric about how this thing is done. Firstly, network coding is NOT error correcting codes in any way, so all the statements about how cool but useless such codes are do not apply to this discussion.

For a good intro paper on the particular kind of network coding that Avalanche uses (called Random Linear Coding), go here (http://web.mit.edu/medard/www/allertonf.pdf).

Bram, you can have a hash outside the network coded block for secure message passing. The hash code does not need to be inside the coded block. So the "really big problem" is actually quite little.

But perhaps the weirdest part of your post is:
"Having peers transfer the xor of exactly two pieces could potentially get most of the benefits of full-blown network coding"
Dude, this IS network coding. There is no distinction between full-blown network coding and small-time-lets-hack-it network coding. You just have a smaller Galois field (GF(2)) than in the MSFT simulations. Now, that is hand-waving that would make even academics blanch.

You could make a cogent argument about decoding complexity. Network coding requires a matrix inversion and we are talking about very big matrices here. So that is definitely a problem -- but then that is about O(n^2.5) complex --- sorry, I have to use asymptotics here.
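A toy version of that decode step (not Avalanche's actual decoder): Gaussian elimination over GF(2), with each block's coefficient vector packed into an int bitmask so that row operations are single XORs:

```python
def solve_gf2(rows, payloads):
    """Solve a GF(2) linear system: rows[i] is a bitmask of which
    original blocks were XORed into coded block payloads[i] (payloads
    are ints here for simplicity). Returns the original blocks, or
    None if the coefficient matrix is singular."""
    n = len(rows)
    rows, payloads = rows[:], payloads[:]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r] >> col & 1), None)
        if pivot is None:
            return None  # not enough independent combinations yet
        rows[col], rows[pivot] = rows[pivot], rows[col]
        payloads[col], payloads[pivot] = payloads[pivot], payloads[col]
        for r in range(n):
            if r != col and rows[r] >> col & 1:
                rows[r] ^= rows[col]
                payloads[r] ^= payloads[col]
    return payloads
```

For k original blocks this naive elimination is cubic in k, which is exactly the constant-multiplier question being argued about.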

So in summation, the paper is not 'garbage'; it's merely a step in the right direction. Not all the results are in - it is likely that MSFT's approach is not much better. But it is equally likely Avalanche will kick ass - and then you just end up looking foolish for screaming yourself hoarse about the wrong things :)


Fri, Jun. 24th, 2005 12:10 am (UTC)
(Anonymous): Re: Network Coding Explained

Bram, you can have a hash outside the network coded block for secure message passing. The hash code does not need to be inside the coded block. So the "really big problem" is actually quite little.

I think you have no clue what the problem is. The hash is associated with the file (that's basically what is in the .torrent), and what Bram wants to know is how they make sure that a packet (which can't be decoded without the whole file - this sucks, btw) really comes from the original file. In most protocols it's very simple, since there is only one file: the signing is done simply using a hash of the file. In Avalanche, don't forget that on top of the network coding, the packets are mostly created by the clients, and thus the clients have to digitally sign their packets.

Basically, they state in their article that some very competent people have proved that in simpler cases it was possible using very complex computing; but in their case, they claim to have discovered "secure hash functions that survive network coding operations and consume very little computational resources", published in an internal MS tech report. This looks like a serious load of bs to me, but you are free to believe them.

Wed, Jun. 22nd, 2005 02:04 am (UTC)
(Anonymous): Three obvious(?) ideas

1. Instead of transmitting chunks in order, give each chunk an index number, and randomise the order you send the chunks. This should avoid the "rarest piece" problem as all chunks will be equally common. It means you can't stream content, though, and does require use of temp files before they're all glued together in the right order at the end.
2. For error correction, subdivide each chunk internally into, say, 1kb pieces, which is signed after error correction is applied.
3. For error correction at the 4-byte level, 3 bits of ecc allow 1 bit of error to be corrected. I.e. an overhead of about 10%: this sounds affordable.
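Item 3 resembles the classic Hamming construction, which corrects one flipped bit per 4 data bits using 3 parity bits (note that protecting 32 data bits in one codeword would actually need 6 parity bits, since 2^r must be at least data+r+1). A toy Hamming(7,4) sketch:

```python
def hamming74_encode(nibble):
    """Encode 4 data bits (0-15) into a 7-bit codeword with parity
    bits at positions 1, 2 and 4; corrects any single-bit error."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # covers codeword positions 1,3,5,7
    p2 = d[0] ^ d[2] ^ d[3]  # covers codeword positions 2,3,6,7
    p3 = d[1] ^ d[2] ^ d[3]  # covers codeword positions 4,5,6,7
    bits = [p1, p2, d[0], p3, d[1], d[2], d[3]]  # positions 1..7
    return sum(b << i for i, b in enumerate(bits))

def hamming74_decode(code):
    """The XOR of the 1-based positions of the set bits gives the
    position of a single flipped bit (0 means no error)."""
    bits = [(code >> i) & 1 for i in range(7)]
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos - 1]:
            syndrome ^= pos
    if syndrome:
        bits[syndrome - 1] ^= 1  # repair the corrupted bit
    return bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
```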


Wed, Jun. 22nd, 2005 02:12 am (UTC)
urban_terrorist: BitTorrent - QuickBooks - Netscape - Symphony etc.

Haven't I heard this song before?

If Microsoft actually produced something that was worth using, I'd have a different opinion; they haven't produced anything since BASIC V2.0 on the C64 that's been all that useful.

Dvorak did a nice write up on it:


Wed, Jun. 22nd, 2005 03:20 am (UTC)
(Anonymous): Cohen, take it easy. The world is moving.

I think it is a great idea to have competition. We all saw how torrents changed p2p in a drastic way (similar ideas were eMule and eDonkey, etc). I would say, Mr. Cohen, take it easy and don't blame the "money bags'" interests in the market, because I've seen not much improvement on user interests from you. Apparently, you forgot about users' convenience. (E.g. I did not see any improvement on the user interface except a guy holding his baby daughter asking for donations. If there's no donation, the banner will reappear again and again... Also, you neglected Unicode support, which limited your torrent client's potential to be used worldwide.) Look around at other clients - BitComet, ABC, and even Azureus in Java. I am sorry to say that you wasted your chance. And it is about time for someone to make a change.

A Comp. Eng. Student of PSU

Wed, Jun. 22nd, 2005 03:40 pm (UTC)
(Anonymous): Re: Cohen, take it easy. The world is moving.

That's stupid. It's the protocol we're talking about, not the client itself.