
Wed, Jan. 4th, 2012, 12:38 am
Moving

I've moved my blog over to bramcohen.com. See you all there!

Thu, May. 19th, 2011, 09:16 am
Practical Cryptography Corrected

The book 'Practical Cryptography' is perfectly good for giving an overview of basic concepts in cryptography, but its immediate practical advice to implementers is not terribly to the point or accurate. Here is advice which is much more to the point and accurate.

  • For a block cipher you should use AES-128. If you don't understand your protocol well enough to know whether there are birthday attacks on your keys, you have bigger issues. (Shame on Schneier for still trying to revisit the AES design competition by yammering on about Twofish.)

  • For an encryption mode, you should always use CTR, and always use a nonce of zero, and never reuse keys.

  • For a hash function, you should use sha-256 until the winner of the sha-3 design competition is announced, and then you should use sha-3.

  • You should always do encryption as a layer outside of authentication.

  • For entropy, you should always do whatever the Python os.urandom() call does on the local platform.

  • For a data format, you should use JSON. (Not XML!)

  • For an RSA exponent, you should always use 2. Technically that's Rabin-Williams, and requires slightly different implementation, but that actually works in its favor. Rabin-Williams has a reduction to factoring, RSA does not.

  • You should never use the same key for both encryption and authentication. If you need both encryption and authentication, you should use two keys: one for encryption and one for authentication. (A minimal sketch combining several of these rules appears after this list.)

  • If you're going to be using RSA, you should learn about encoding for it! This is by far the most historically error-prone part of crypto protocols, and Practical Cryptography bizarrely doesn't even mention it as an issue.

  • You should not parameterize your protocols! That just creates compatibility problems. If necessary you can parameterize it by having two values, one good for the next twenty years, and one good until the end of time, but key sizes have gotten big enough that skipping that first one should be the default.
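
To make several of these points concrete, here's a minimal sketch, assuming Python 3 with the pyca/cryptography package installed. The key handling and message framing here are purely illustrative, not a vetted protocol:

import os, hmac, hashlib, json
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def encrypt_message(obj):
    # Fresh AES-128 key for every message, so a zero nonce is safe with CTR.
    enc_key = os.urandom(16)
    # Separate key for authentication, never shared with encryption.
    mac_key = os.urandom(32)
    plaintext = json.dumps(obj).encode()   # JSON, not XML
    tag = hmac.new(mac_key, plaintext, hashlib.sha256).digest()
    cipher = Cipher(algorithms.AES(enc_key), modes.CTR(b'\x00' * 16),
                    backend=default_backend())
    e = cipher.encryptor()
    # Encryption as the outer layer, wrapping the authenticated plaintext.
    ciphertext = e.update(plaintext + tag) + e.finalize()
    return (enc_key, mac_key), ciphertext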


Maybe someday Schneier will write a book which I can recommend to people who are getting started in cryptography, but I'm not holding my breath.




Sun, Apr. 17th, 2011, 06:56 pm
Git Can't Be Made Consistent

This post complains about Git lacking eventual consistency. I have a little secret for you: Git can't be made to have eventual consistency. Everybody seems to think the problem is a technical one, of complexity vs. simplicity of implementation. They're wrong. The problem is semantics. Git follows the semantics which you want 99% of the time, at the cost of having some edge cases which it's inherently just plain broken on.

When you make a change in Git (and Mercurial) you're essentially making the following statement:

This is the way things are now. Forget whatever happened in the past, this is what matters.


Which is subtly and importantly different from what a lot of people assume it should be:

Add this patch to the corpus of all changes which have ever been made, and are what defines the most recent version.


The example linked above has a lot of extraneous confusing stuff in it. Here's an example which cuts through all the crap:

  A
 / \
B   B
|
A


In this example, one person changed a file's contents from A to B, then back to A, while someone else changed A to B and left it that way. The question is: What to do when the two heads are merged together? The answer deeply depends on the assumed semantics of what the person meant when they reverted back to A. Either they meant 'oops I shouldn't have committed this code to this branch' or they meant 'this was a bad change, delete it forever'. In practice people mean the former the vast majority of the time, and its later effects are much more intuitive and predictable. In fact it's generally a good idea to make the separate branch with the change to B at the same time as the reversion to A is done, so further development can be done on that branch before being merged back in later. So the preferred answer is that it should clean merge to B, the way 3 way merge does it.
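
A minimal sketch of the per-file three-way merge rule being appealed to here (hypothetical code, just to show why B wins):

def merge3(ancestor, ours, theirs):
    # Whichever side differs from the common ancestor made a change, and the
    # change wins; if both sides changed differently, it's a conflict.
    if ours == theirs:
        return ours
    if ours == ancestor:
        return theirs
    if theirs == ancestor:
        return ours
    raise ValueError('conflict')

# The common ancestor of the two heads is A: one side went A -> B -> A and is
# sitting at A, the other went A -> B, so the merge resolves cleanly to B.
assert merge3('A', 'A', 'B') == 'B'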

Unfortunately, this decision comes at significant cost. The biggest problem is that it inherently gives up on implicit cherry-picking. I came up with some magic merge code which allowed you to cut and paste small sections of code between branches, and the underlying version control system would simply figure out what you were up to and make it all work, but nobody seemed much interested in that functionality, and it unambiguously forced the merge result in this case to be A.

A smaller problem, but one which seems to perturb people more, is that there are some massively busted edge cases. The worst one is this:

  A
 / \
B   B
|   |
A   A


Obviously in this case both sides should clean merge to A, but what if people merge like this?

  A
 / \
B   B
|\ /|
A X A
|/ \|


(The 'X' here isn't a point in history. It's ascii art of two lines crossing over each other.)

Because of the cases we just went over, they should clean merge to B. What if they are then merged with each other? Since both sides are the same, there's only one thing they can merge to: B

  A
 / \
B   B
|\ /|
A X A
|/ \|
B   B
 \ /
  B


Hey, where'd the A go? Everybody reverted their changes from B back to A, and then via the dark magic of merging the B came back out of the ether, and no amount of further merging will get rid of it again!

The solution to this problem in practice is Don't Do That. Having multiple branches which are constantly pulling in each other's changes at a slight lag is bad development practice anyway, so people treat their version control system nicely and cross their fingers that the semantic tradeoff they made doesn't ever cause problems.

Thu, Apr. 14th, 2011, 05:34 pm
Python wish list

Now that the moratorium on Python language features is over, I'll put in my thoughts on what new stuff the language could use. I don't have much to suggest, and what I do have to suggest is fairly minor. This is because I'm happy with the language.

new on default parameters

One of the gotchas in Python is that default parameter values are only evaluated once and then reused across calls, so if you say:
def spam(eggs = []):

then eggs will get set to the same list every time, and modifications will get carried over between calls. This can be hacked like this:
def spam(eggs = None):
    if eggs is None:
        eggs = []

This works, but is ugly, and prevents passing in None as a value for eggs. It would be better to be able to simply say:
def spam(eggs = new []):

which should do exactly what you expect.
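
The usual workaround when None needs to stay a legal value is a private sentinel object, which works but is more verbose still (a sketch, with made-up names):

_missing = object()

def spam(eggs=_missing):
    # The sentinel can't collide with any value a caller could pass in.
    if eggs is _missing:
        eggs = []
    return eggs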

^ on bytes

A strange oversight in Python3 is that bitwise operators don't work on byte arrays. The ^, & and | operators should work on bytes of equal length, doing exactly what they obviously should. Trying to apply them to bytes of unequal length should probably result in an error. It's easy enough to write functions to do these things, but they're slow, and there's only one reasonable semantics for what those operators should do on byte arrays anyway.
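
A pure-Python stand-in, just to pin down the semantics being asked for (and to show the kind of code that shouldn't be necessary):

def xor_bytes(a, b):
    # What b'...' ^ b'...' ought to do: element-wise xor of equal-length bytes.
    if len(a) != len(b):
        raise ValueError('arguments must be the same length')
    return bytes(x ^ y for x, y in zip(a, b))

assert xor_bytes(b'\x0f\xf0', b'\xff\x00') == b'\xf0\xf0'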

raw binary conversion

Maybe this has been added to the standard library and I just haven't heard about it, but a longstanding annoying missing piece of functionality is simple conversion between ints and big or little endian representations of them as bytes. Again, this is easy enough to implement, but is slow when done in Python and is hardly an obscure piece of functionality.
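
For reference, a minimal sketch of the conversions being asked for, using the int.to_bytes / int.from_bytes methods that were added in Python 3.2 (so whether they count as 'already in the standard library' depends on your version):

n = 0xDEADBEEF
assert n.to_bytes(4, 'big') == b'\xde\xad\xbe\xef'
assert n.to_bytes(4, 'little') == b'\xef\xbe\xad\xde'
assert int.from_bytes(b'\xde\xad\xbe\xef', 'big') == n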

dictionary scrambling

This might be an obscure piece of functionality, but I'd like the ability to change the hashing function which dictionaries use, because I write tests which depend on my code behaving the same way every time it's run, and I'd like to be able to test that it doesn't have any dependencies on the order of iterating over dictionary keys or values.
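
There's no clean hook for swapping out the dict hash function, but a crude approximation, assuming Python 3.3+ and a hypothetical run_tests.py entry point, is to rerun the test suite under several values of PYTHONHASHSEED so that any dependence on the iteration order of string keys shows up as a failure:

import os, subprocess

for seed in ['0', '1', '2', 'random']:
    env = dict(os.environ, PYTHONHASHSEED=seed)
    subprocess.check_call(['python', 'run_tests.py'], env=env)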

Tue, Sep. 14th, 2010, 10:55 am
Censorship resistance attacks and counterattacks

Related to the recent Haystack hubbub, here's a basic overview of censorship resistance tools, of which Haystack was an example (unfortunately a fairly broken one).

For the purposes of these notes, by 'censorship resistance tools', I'll be referring to ones for browsing the web from inside of countrywide firewalls which are trying to limit access, such as Freegate, Ultrasurf, and the like. Obviously there are other forms of censorship and resistance to it, but that's what's being discussed for now.

The usage pattern for censorship resistance tools goes something like this:
  1. system sends information about proxies to users

  2. users use proxies to browse the web freely

  3. firewall operator finds out IPs of proxies and blocks them by IP

  4. go back to step 1

It's an ongoing cat and mouse game involving cycling through a lot of IPs and a lot of careful secrecy.

An attacker might also, instead of outright blocking an IP, artificially create a very high packet loss rate going to it, which might make users conclude that the anti-censorship system doesn't work very well and give up on it. That could be countered by trying to guess when there's an artificially high packet loss rate, but that's potentially an insidious game - the attacker might, for example, determine where the machines developers use for testing are, and not artificially drop packets to those.

There's considerable concern about the threat model of the censor finding out which users are using the proxies and doing bad things to them. I'll just cut to the chase on that issue - the resistance to attacks of that form is inherently weak. The censor can simply record the destinations of all outgoing connections, and retroactively correlate them to discovered proxies, unveiling the IP of a user. This is a vicious attack which can't be completely eliminated. Possession of the tool might also be incriminating.

High level methods of avoiding detection include:

  • Have lots of cover traffic - that is, lots of users, so attacking them all is impractical. This is probably the ultimate solution, because a tool which doesn't have enough users to provide cover traffic isn't successful, and a successful tool implicitly provides lots of cover traffic.

  • Have users use shared/ephemeral IPs. This is a low tech approach having little to do with the protocol.

  • Use no software, that is, http/https proxies. This makes the user have no recurring evidence, but can expose what the user is doing to snooping.

  • Use ephemeral or easy to dispose of software. This is a good idea, but the techniques for doing it are tricky or rely on physical security.

  • Run proxies on web sites running other services which are also used by users within the target area. This is a great approach, but requires cooperation of a web site which has the willingness to be (or confidence it won't be) blocked.

  • Use actual skype connections. This is an interesting approach which has the benefit of lots of cover traffic, but suffers from limitations on the bandwidth skype intermediaries will provide, and could be attacked by an attacker running lots of high quality skype nodes and noticing the very suspicious traffic.

  • Dial down the level of paranoia. In the end a certain amount of this may be necessary.


Censors have multiple ways of finding IP addresses which are used by the anti-censorship system:

  • Use the same methods as the software. This is a very insidious approach, putting the anti-censorship system in a position of trying to simultaneously publish new IPs and keep their distribution limited.

  • Correlation attacks on existing known IPs. This is also a very insidious attack - the attacker simply takes IPs which are known to use the anti-censorship tool, and looks for otherwise unpopular IPs which a lot of those are connecting to.

  • Probing - an attacker can connect to suspected proxies and try to get them to give themselves away by doing a handshake. Depending on the type of proxy connection used, this can be very effective, sometimes in combination with reverse DNS.

  • Trick proxy users into hitting a web site and observe what IPs the connections come from, revealing the IPs of the proxies directly.

  • Deep packet inspection and traffic pattern analysis, including packet sizes, connection number and duration, etc. These can be extremely effective, but can be extremely expensive for an anti-anti-censor to set up. Connection number and duration are probably the most telling pieces of information, and the cheapest to implement, as well as the easiest for the anti-censor to manipulate.


There are several ways for an anti-censor to make it hard to find their IPs:

  • Use lots of IPs. If each user can be given their own dedicated IP then the system is extremely hard to attack. Problem is, this approach requires procurement of lots of IPs, which isn't easy.

  • Limit how many users the info is given to. This is a good idea, but difficult to do.

  • Encrypt info with not widely circulated keys. This moves the problem to key distribution and management, which is a good idea.

  • Distribute fake IPs including stuff the censor would regret blocking. I think this is kind of fun.

  • Have clients only connect to one IP. This is a very good idea! Should be followed as closely as possible.

  • Make traffic go through more than one hop, masking the IPs of proxies to connections on the outgoing side. While clearly a good idea, this doubles the bandwidth used, which kind of sucks.

  • Rely on deep packet inspection being hard. Less unreasonable than you might imagine - deep packet inspection systems are very expensive and take a while to upgrade, and intelligence on what the deep packet inspection can do is sometimes available.

  • Steganographically encode connections to proxies - this obviously must be done, although it isn't obvious what the best approach is.


There are several things proxy connections could be made to look like -

  • HTTP - while there's plenty of cover traffic for HTTP, deep packet inspection and probing can probably be very effective in recognizing patterns in it, making it not very appealing for stego connections

  • SSL/TLS - there's a decent amount of cover traffic for TLS connections in the form of HTTPS, and using the HTTPS port is probably a good approach, especially since the traffic patterns are going to match http anyway, since that's what it is. There's some concern that man in the middle attacks might be launched, although those are difficult, and an attacker might get suspicious if reverse DNS doesn't return believable information. Still, this may be the best option, and is certainly the simplest to implement.

  • BitTorrent - BitTorrent has lots of cover traffic, and the obfuscated version of the protocol looks fairly generic, although its traffic patterns are very distinctive and wouldn't be closely matched by anti-censorship web browsing.

  • utp - utp is a udp-based TCP-alike originally designed for BitTorrent. It has the advantage that some deep packet inspection systems just plain don't support UDP, and it's easy to use as a swap-in replacement for TCP. It has some of the same cover traffic problems as regular BitTorrent.

  • SSH - tunneling over SSH is not uncommon, so using SSH connections is no more suspicious than having long-lived high-throughput SSH connections is to begin with, but that's already a high level of suspiciousness, so this probably isn't a great approach.

  • skype - skype traffic has good cover traffic, but is a very poor match in terms of usage patterns.

  • noise - a TCP connection which has just plain garbage going over it is a surprisingly reasonable approach. Lots of weird miscellaneous things on the internet are hard to classify, and obfuscated BitTorrent provides a decent amount of cover.


There are several methods a censorship resistance system can use to get IP addresses out -

  • offline - this is the most secure way, but it's very slow and expensive

  • spam cannon - a spam blast can be sent out containing addresses of proxies. This works but is moderately slow and moderately expensive. It's also potentially very easy to intercept.

  • to existing users - client software can be sent IPs of fallback proxies when it makes a proxy connection. This works and is fast, but has the problem that an attacker can run client software and use it to find proxies as well.

  • via web stego - this technique hasn't been used yet, but IPs could be encoded steganographically in real web traffic. Given the tremendous popularity of censorship resistance tools in the west, it might be possible to enlist the help of lots of web sites, and make it essentially impossible to filter them all out. I'm working on technology for this.

Sun, Apr. 4th, 2010, 09:06 pm
Rebasing

Rebasing is a controversial feature, with some people declaring it the greatest thing since sliced bread and others declaring it the spawn of the devil (okay I exaggerate, but you get the idea). I will now set the record straight by explaining The Truth.

Rebase is a good feature.

But it's hard to implement properly on a version control system using the standard Hg/Git architecture.

There are a number of problems with the standard architecture, but the one important for rebasing is that it treats data like a roach motel - changes can get in, but they can't get out. Rebasing is fundamentally about removing some changes in a controlled way, and the closest you can get is to rewrite some history which never made it off the local machine and pretend it never happened.

I'll now describe an architecture which supports rebasing very well. Whether similar techniques can be used to add rebasing to the standard architecture will have to be left for another day.

First, we need a concept of a branch. A branch is a snapshot of the codebase which changes over time. For rebasing to work well, there needs to be a very strong notion of branches, and the simplest way to get that is to have a single centralized repository with a list of branches whose values change over time and whose old values are remembered. To have relationships between branches, we dispense completely with the complex notion of history which hash-based architectures have, and introduce a concept of 'parent'. Each version of each branch has a parent specified, although it could be null. A version of a branch represents two things simultaneously:

(1) A snapshot of the codebase

(2) a patch off of its parent

Individual branches are modified with CVS/svn style update/commit.

Rebasing is now straightforward. You take a diff from the parent to the current snapshot, and apply that to the new parent. This can be done for any potential parent, including the latest version of the branch which the current parent is from, the latest version of another branch, and even older versions of other branches or the current parent. Any reparenting will propagate to all child branches, and changing the parent back will re-propagate nicely as well. This approach allows you to nicely specify what goes where, without having that roach motel feel of code committing.
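
Here's a minimal sketch of the reparenting operation, with made-up names and a toy line-set diff standing in for a real patch/merge engine:

def diff(old, new):
    # Toy patch: which lines were added and which were removed.
    return ([l for l in new if l not in old], [l for l in old if l not in new])

def apply_patch(base, patch):
    added, removed = patch
    return [l for l in base if l not in removed] + added

class BranchVersion:
    # A version of a branch is simultaneously a snapshot and a patch off
    # of its parent's snapshot.
    def __init__(self, parent_snapshot, snapshot):
        self.parent_snapshot = parent_snapshot
        self.snapshot = snapshot

    def rebase(self, new_parent_snapshot):
        # Reparenting: take the diff from the old parent to this snapshot
        # and replay it onto the new parent.
        p = diff(self.parent_snapshot, self.snapshot)
        return BranchVersion(new_parent_snapshot, apply_patch(new_parent_snapshot, p))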

There would be a number of practical benefits to this architecture beyond allowing nice rebasing, although writing a new version control system from scratch today seems like an extreme approach. In the future I'll comment about the possibility of using parents in Hg/Git, after I've given some thought to it.

Tue, Mar. 30th, 2010, 11:10 am
Patience Diff Advantages

There's no coherent explanation of what the advantages of Patience Diff are, so I'll explain now. First, a quick overview of how Patience Diff works -
  1. Match the first lines of both if they're identical, then match the second, third, etc. until a pair doesn't match.
  2. Match the last lines of both if they're identical, then match the next to last, second to last, etc. until a pair doesn't match.
  3. Find all lines which occur exactly once on both sides, then do longest common subsequence on those lines, matching them up.
  4. Do steps 1-2 on each section between matched lines
I've previously described it with the ordering a bit different and a recursive step at the end, but this ordering always gives the same result and performs much better, and the performance hit of doing the recursion isn't worth it, because it rarely if ever finds any more matches, and even when it does it isn't clear whether the extra matches produce a functionally superior diff. A much more detailed and more graphical explanation is here. A side by side of how that particular diff gets mangled by standard diff algorithms is here. This is the example which motivated patience diff in the first place.

If you make extensive changes to a file and remove a function from the end of it and add a function to the beginning, there will be a tendency for an LCS based diff algorithm to match up all of the curly brackets instead of the functions, so if it's matching F1 F2 F3 to F4 F1 F2 instead of viewing F4 as new and F3 as deleted, it will match the curly brackets of F1 to F4, F2 to F1, and F3 to F2. The result is every bit as gross and useless as it sounds, and can easily force a merge conflict in cases which could have been resolved completely cleanly. This has a particularly strong tendency to happen if you put curly brackets on lines by themselves or are religious about putting curly brackets around single-line blocks even though it isn't technically necessary or even if you just put lots of blank lines between your functions.
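
For concreteness, here's a minimal sketch of step 3, the heart of the algorithm: pair up lines that occur exactly once on both sides, then keep the longest run of pairs that is in order on both sides (a longest increasing subsequence, found by patience sorting). Steps 1, 2 and 4 would be wrapped around this:

from collections import Counter
from bisect import bisect_left

def unique_matches(old, new):
    # Lines occurring exactly once on each side, as (old index, new index)
    # pairs listed in old-file order.
    old_once = {l for l, c in Counter(old).items() if c == 1}
    new_once = {l for l, c in Counter(new).items() if c == 1}
    new_pos = {l: i for i, l in enumerate(new)}
    pairs = [(i, new_pos[l]) for i, l in enumerate(old)
             if l in old_once and l in new_once]
    # Longest increasing subsequence of the new-file positions via patience
    # sorting: piles[k] is the smallest tail of an increasing run of length
    # k+1, tops[k] is the pair currently ending that run.
    piles, tops, prev = [], [], {}
    for k, (_, j) in enumerate(pairs):
        p = bisect_left(piles, j)
        if p == len(piles):
            piles.append(j)
            tops.append(k)
        else:
            piles[p] = j
            tops[p] = k
        prev[k] = tops[p - 1] if p else None
    # Walk the predecessor links back from the end of the longest run.
    matches, k = [], tops[-1] if tops else None
    while k is not None:
        matches.append(pairs[k])
        k = prev[k]
    return matches[::-1]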

Another advantage of patience diff is that it frequently doesn't match lines which just plain shouldn't match. For example, if you've completely rewritten a section of code it shouldn't match up the blank lines in each version, as this example shows. Finally, there's this example:


 void func1() {
     x += 1
 }

+void functhreehalves() {
+    x += 1.5
+}
+
 void func2() {
     x += 2
 }
Which is straightforward and obvious, but frequently diff algorithms will interpret it like this:


 void func1() {
     x += 1
+}
+
+void functhreehalves() {
+    x += 1.5
 }
 
 void func2() {
     x += 2
 }
While technically that works, it's obviously inferior and is a bit of a pet peeve of mine. In principle one could tweak an LCS-based diff to always do the right thing here, but the algorithmic complexity of LCS makes implementations of it essentially unmaintainable. That's another advantage of patience diff - it's simple and understandable enough that it can be modified and extended reasonably, and why it performs correctly in all these examples can be easily analyzed.

Sat, Mar. 6th, 2010, 03:42 am
Enterprise Chatroulette

Chatroulette is an interesting concept. On the plus side, it reliably provides people to talk to (especially if you're a hot babe), and leads to some actual conversations. On the downside, it has too many (literal) wankers, and feels lonely and disconnected while at the same time feeling like it's always just about to give you some real social interaction in the same way IRC does. I don't feel myself getting sucked in, but then I stopped feeling the allure of IRC a few years ago.

What the world needs is a service which is to chatroulette what yammer is to twitter - the same service, but only for people who have accounts on the same domain. That would allow it to be un-anonymized, since it could display the user's login, and solve the wankers problem, because HR and a lack of anonymity takes care of that. For companies above some size, it would be a great way for people to get the sort of random interaction with coworkers that they normally do in the office while working remotely. It would also probably have an easy upsell to an expensive premium service for any company which started using it a lot, even if they didn't mean to.

Tue, Mar. 2nd, 2010, 07:04 am
A new card game

I came up with a new card game the other day. It's been play-tested a bunch, and is a lot of fun.

The Rules:

There are two players. A standard poker deck is shuffled, and sixteen cards are dealt out face up, eight to each player. Both players get to stare at the cards for a while.

Each player then gathers up their eight cards and sets aside four of the eight cards for the other player. They then exchange the cards they've set aside for each other. Each player then forms the best poker hand they can with any five of the eight cards in their hand, and the one with the better hand wins.

That's the entirety of the rules. What makes play interesting is that if you can guess what cards your opponent will keep there's almost always a set of cards you can set aside which will make you win.

Thu, Feb. 4th, 2010, 11:42 am
Freemium results in terrible games

Game designers in general think of themselves as performing a valuable service for humanity. Aside from simply entertaining, they have the goal of making their users smarter, more disciplined, and all around better than they were before playing. In large part they can succeed in doing this. Unfortunately the freemium business model doesn't encourage game mechanics which enhance the user's cognitive skills. In fact it directly fights against them.

Freemium at first blush doesn't seem like a bad idea. Players get to try out a game for free, and generally can keep playing for free, but if they're really into it then they can pay more for some game extra. This results in cheap games, which make money by servicing their most die-hard fans, with continuous development and, above all, more games. On the surface this sounds like a huge improvement over the flat fee model of trying to sucker as many people as possible into buying a game they only play a few times. The problem comes in with the psychology of players, and it's one rooted deeply in the human psyche which I don't have any good solutions to.

There are essentially two kinds of game players - those who view achievement at a game as validation of their skills, commitment, and endurance, and those who see game achievement as an end in and of itself. The first kind will not on principle ever pay any money to get a leg up in a game, because that would permanently cheapen whatever success they might later have. It would be, to use a vulgar term, cheating. Players who are just interested in progressing, on the other hand, are perfectly happy to do that, and hence are much easier to monetize - just offer them in-game gold for a few real bucks, and on a day when they're feeling frustrated, they'll happily fork it over.

The people who design games, and most of the hard-core gamers, are from the skill camp, and instinctively fight the pay to cheat model, sometimes to a ludicrous degree. When Ultima Online first experienced a large market for in-game items, a phenomenon which had never been seen before, the designers banned the practice of selling items and went to great lengths to stamp out the aftermarket, rather than simply selling in-game items themselves, which would have resulted in them making a lot more money and their players being a lot happier. I personally was very excited when the game Bloons Tower Defense 4 came out, because I'd made it through all the levels of Bloons Tower Defense 3 and greatly enjoyed them. I happily played through most of the levels of 4 on the hardest setting, until I got to the fifth one. That level is so ridiculously difficult that it's clearly just there to get people to pay for power-ups, a form of monetization which simply wasn't there for 3. After banging my head against it for a while, I simply stopped playing. The whole franchise feels tainted to me now, a cheap form of grabbing money rather than a legitimate intellectual pursuit. I realize how silly this is, especially for a game whose pinnacle of weaponry is a Super Monkey Storm, but it's fundamental to my motivation for game playing - if the reward for succeeding has lost its meaning, then I don't get a positive Pavlovian response to it, and I lose my interest.

Once a game starts to do freemium monetization, the natural tendency is to simply abandon aspects of gameplay which require greater cognitive skills altogether. The people who are attracted to such things will on principle never pay any money, and the ones who remain - the ones who really just care about getting ahead in the game by any means - will find anything which absolutely requires developing skills overly frustrating, and will likely just get scared off by it. Gameplay descends into repeatedly clicking next, with enough cute animations and just enough having to move the mouse around between clicks to keep the user entertained, and the monetization scheme is that if you pay a buck you can skip ahead a few thousand clicks.

I don't have a good solution to this problem. Freemium is a much more compelling games business model than anything else anybody's come up with, and absent any real alternative, the phenomenon of dumbing down of gameplay will continue.

Sat, Jan. 30th, 2010, 12:04 am
Freenode sucks

I logged into freenode, which I haven't logged into in a while, to help stop some trolling on #bittorrent. I couldn't identify as my nick any more, so I went to ask for help...

<mquin> when did you register it?
 <bramm> uh, circa 2000
 <mquin> the current registration is a little over 6 months old - if you did register it in 2000 it must have expired since
 <bramm> so someone just stole my nick?
 <bramm> since when does this thing expire registrations?
 <bramm> I'm pretty sure I've logged in within the last year
 <bramm> given that I've always had this nick, and that there are only two people named Bram which are recognizable names in open source, and that my name is one of the most well known in open source, I'd like my nick back :-P
 <mquin> nicks are considered expired after 60 days of inactivity, after which they can be dropped either on request or when we ocassionally clean up the services database
 <bramm> also, there's a problem that I'm an op on a channel, and need to give access in it to other people
 <bramm> that policy is completely retarded
 <bramm> the #bittorrent channel is having a problem with trolls, and we need to get rid of them, and thanks to that lamebrained policy there's currently noone with sufficient ops privileges in the channel to do anything about it
 <mquin> I'm sorry you feel that way, it's not really reasonable for us to keep nickname registrations perpetually when they are not being used
 <bramm> get real. I've logged in within the last year, getting rid of them after six months is nuts
 <bramm> if nobody does anything about this I'm going to go public about it, freenode does NOT want the publicity of me being pissed off
 <bramm> er, after 60 days I meant, I've never heard of nick expiration on such a short time scale, from any site
 <bramm> I can easily prove who I am. I'm the well-known author of an important project and need my nick back to stop trolling in the project channel, is there anything which can be done about this or do I have to make a stink?
 <mquin> handing the nick back to you, even if I were able to do that, would not restore any channel access you had when you held the registration
 <mquin> channel access flags are dropped along with the account
 <bramm> well how can we get someone to have ops on the channel?
 <mquin> if you are an offical representative of the bittorrent project you can assert that by filing a group registration, which would allow you to reclaim #bittorrent
 <bramm> and how can I do that?
 <mquin> http://freenode.net/group_registration.shtml
 <mquin> you may also wish to talk to the current channel registrant - he can add additional users to the access list at this point
 <mquin> oh, my mistake, it's been held
 <bramm> what do you mean held?
 <bramm> maybe you missed that part about me being the channel registrant
 <bramm> and my nick being stolen
 <mquin> yes, I misread something I was looking at - my mistake
 <mquin> to avoid primary namespace (single-#) channels being lost in sitations such as this we transfer them to staff control in the event of the founder's nick being dropped
 <mquin> it makes it fairly straightforward to reassign them when there is a group registration rather than having them appear to be available for re-registration by anyone
 <bramm> I have never, in my entire life, heard of a registration expiration process which was this aggresive, or this cavalier about damaging existing relationships
 <mquin> the 60 days figure is just a minimum - we normally allow more grace (typically 1 week per year) for long standing registraions when processing drops by hand
 <bramm> you say that as if adding a few weeks to the end would make the time frame reasonable
 <mquin> we don't feel it is reasonable to hold nickname registrations perpetually if they are not being used
 <bramm> I'm not asking for perpetually
 <bramm> just something vaguely reasonable
 <bramm> and I hope you realize that you just completely pissed off one of the most well known and respected people in the whole open source community
 <mquin> I'm sorry you are upset
 <bramm> I'm just going to pretend you're a robot and not blow my stack at you
 <bramm> but it's requiring effort
 <mquin> What do you expect me to do? I can't very well return a nick to you that has been in use by someone else for well over 6 months.
 <bramm> well maybe the policies could have kept that person from taking over the nick, seeing as how I was using it for NINE YEARS prior to that
 <mquin> Had we known at the time that you were planning to be away from the network for an extended period of time we could have arranged for it to be held for you
 <mquin> I know it's unfortunate to lose a long-standing registration, but we do have to have some limit on what we consider a reasonable activity level
 <bramm> I was never informed of there being any such policy. I was never informed via email than my nick was about to expire. Any minimal checking of expirations being done by hand, which you say it is, would have indicated that my nick should absolutely not have been expired
 <mquin> unfortuantely it's difficult to verify which steps were or were not taken this long after the event

[Update] Well, now that I've managed to get called an asshole (hi, HackerNews commenters who registered five minutes ago!), here are my calmer thoughts.

The reason I posted the log verbatim, me being pissed off and all, is that I wanted to make very clear that I was accurately representing official freenode policy, and that requesting help through support leads nowhere. My gripe is with freenode policy, which is asinine, not with the particular person I spoke to, who was merely being useless and patronizing.

The reason I got pissed wasn't because of the nick loss, which I find mildly annoying, but because channel ops got blown away, causing me to have to deal with this bullshit instead of just giving ops to someone else.

Yes I can be blunt. If you value the superficial affectation of politeness over the essential point of what someone is saying, you can shove it. I don't appreciate people saying that I'm this way because of asperger's, it just causes other people to whine that they're being oppressed because they can't criticize me. The whole line of argument is stupid. People are free to criticize me for not being polite, and I'm free to respond that they're being petty and superficial.

The whole 'it's free so you can't complain' argument is bullshit. There are plenty of free things which are of negative value to society because they suck up or distract resources which could be working on a much better alternative. I've provided lots of support for free stuff myself, both via employees and directly, and never have I claimed that a problem won't be fixed because the person airing a legitimate gripe hasn't gone through arbitrary bureaucratic processes, or that the person complaining should implement it themselves because they're a programmer, or refused to acknowledge that some pain a user experienced through no fault of their own really was unfortunate. And I always prioritize up users who matter and problems which need immediate fixing. That's the way you run things if you actually care about providing a valuable service.

As far as whether my ops problem might get resolved, whether I'd utterly cursed out the guy from support or had the humility of a saint, it probably wouldn't get handled regardless.

[Update 2] Some commenters don't seem to understand that Freenode policy, in fact Freenode's whole foundation for legitimacy, is that project leaders are entitled to control their channels. I am in fact a project leader with a long established channel, and in the time that site op spent pedantically repeating rules and procedures he could have verified who I was and fixed the situation, which, say what you will about lilo, is something he actually would do. I was not making any claim to importance which I don't unambiguously have, and my message to other programmers considering using public servers is that OFTC is down the hall and to the left.

Mon, Nov. 23rd, 2009, 03:22 pm
Threading in web forums

Threading in web forums isn't hard to get right, but it continues to be done right approximately nowhere. In the interests of helping to improve the situation, I will now explain the right way to do things.

When returning to a web forum, what should be displayed are all posts which are either new or immediate ancestors of a new post. The non-new posts should be displayed grayed out, since they're there for context, to avoid the annoying practice of people having to manually edit quoting in improperly threaded systems.

Posts should be displayed properly threaded, with indenting used to indicate responses. But the standard simple way of indenting each response one level more isn't quite right, because it leads to way too much indenting. What should happen is that if there is more than one response to a post currently displayed then the responses get extra indentation, but if there's only one displayed then it gets the same level, with a graphic included to indicate that it's a response rather than a separate post. Note that this only takes into account posts which are currently displayed. Not currently displayed posts are irrelevant and shouldn't affect formatting.
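
A minimal sketch of that indentation rule, with a hypothetical Post object holding only its currently displayed replies:

class Post:
    def __init__(self, text, replies=()):
        self.text = text
        self.replies = list(replies)   # only the replies currently displayed

def layout(post, depth=0, out=None):
    # A reply is indented one level deeper than its parent only when the
    # parent has more than one displayed reply; an only child stays at the
    # parent's level (and gets the 'this is a response' graphic instead).
    if out is None:
        out = []
    out.append((depth, post.text))
    child_depth = depth + 1 if len(post.replies) > 1 else depth
    for r in post.replies:
        layout(r, child_depth, out)
    return out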

There should of course be indicators and expanders for ancestors and undisplayed responses to displayed posts, and those should change the formatting of everything appropriately when hit.

If someone views a thread which has no new posts, they should simply be shown the entire thread.

There, that wasn't so complicated. Now please get it right!

Fri, Nov. 13th, 2009, 10:08 pm
Comments on Go

Here are my preliminary thoughts on the Go programming language.

The most interesting feature for me personally is the built-in threading. Aside from its superb support for multi-core, it's just plain a good set of ways of doing networking. The lack of a decent built-in networking library (and, more generally, coordination library) in Python has been a thorn in my side just about forever. In particular the promotion of queues to being one of the few built-in primitives with their own special syntax encourages good threading practice and is clearly warranted. Even such a simple command as 'wait until either the UI thread or the networking thread comes up with something' is a source of ongoing pain in most languages, but is built into Go as a core concept using select.

Go seems to finally get the static typing problem solved. Its := operator is a reasonable middle point between C++'s ludicrous verbosity and ML's excessive magic. Types being structural is also a huge win. There's no end of stupid architectural haggling over what module a base type sits in and who owns it, and the answer 'nowhere' completely gets rid of that problem. It seems to me that there are deep subtle problems with such declarations - for example, how does it statically check that the parameters accepted by methods of a type you're receiving are compatible with what you want to pass them? But maybe I just haven't thought about it enough. It's too bad that Go doesn't currently have generics. I for one won't start any new project in it until it reaches that level of maturity.

Go's lack of exception handling is a real problem, and another thing I'm blocking on to do real development in it. My own preferred method for adding it would be that if you call a function which has multiple return values and you don't handle one of them, it defaults to shorting to the same return of the caller, although some people might complain about that being too much like Java's 'throws'. That said, I've gotten so used to debugging by stack trace that I'd be loathe to not have stack building built into the language in some form, and in fact I've gotten really attached to a tricked out logging tool I wrote which can decorate any object and automatically logs a full stack trace of every assignment which is made to the object and allows you to print them all out at once. But perhaps such trickery is really the domain of highly dynamic languages, and not appropriate for something as low level and performance oriented as Go.

The primitives in Go are quite good. All languages should have special maps and lists built in. I think it actually doesn't go far enough with giving them special status, and should have Python-style special syntax for maps. The curly brackets could be freed up by simply eliminating their current usage and making formatting have syntax. It's more than a little bit absurd that the language designers themselves have a setup where a utility standardizes the formatting of their own code every time they commit, but they still maintain the nominal free-form nature of the language. Really guys, I know you were traumatized by Fortran's original awful enforced formatting, but that was a long time ago and it's time to let go.

That said, the primitives are given too much special status in other ways - they're the only things which have type parameterization, making it impossible to even implement their interfaces yourself, and worse, they're the only things which are call by reference. The call by reference thing worries me a lot. I really, really don't want Go to become the reference/pointer mix hell which C++ has become, but it's already headed in that direction. It really shouldn't matter that much - things which are passed are either an address or a copy, and the reference/pointer distinction really just has to do with what's the default (okay, so typically references don't let you overwrite either, but that's not a fundamental property). I for one strongly prefer the default be an address, and clearly when push comes to shove Go's designers do too, but more important than which way it is is that it should be consistent. Already transitioning to something consistent might require rewriting huge amounts of code, and it's getting worse, so fixing this problem might have to happen soon or never, and I'm afraid that it might already be never.

Go's speed of compilation is very nice, although I'm afraid I view that not so much as a strength of Go but as an awfulness of C++. Why C++ continues to take forever to compile even on machines many orders of magnitude faster than the first ones I ever used it on has long been a mystery to me. I hope the answer is simply that it's a language which wasn't designed with ease of parsing in mind, and has a whole layer of preprocessing on top of it which is horribly abused.

It's interesting that Go is going the garbage-collected route. If such a low-level language as Go can get away with that (and, truth be known, their preferred garbage collector isn't really integrated yet, so it's a little early to call it) then we may never see another non-garbage-collected language ever again.

I despise the use of initial capital letters to specify that something is public. Maybe if I used it for a while I'd learn to not hate it, but for now I hate it. Does chinese even have uppercase?

It's entirely possible that after using Go for a while something else would really start to gnaw at me about it, but it generally has a good smell, so hopefully not.

If you've read this far, you should follow me on Twitter.

Mon, Oct. 26th, 2009, 10:02 am
Dual Rings is on the market



This is my 'Dual Rings' puzzle, the result of a bizarre collaboration between myself and Oskar where I came up with a completely abstract puzzle and he came up with an unexpected mechanism for it. It's now going on the market, produced by Hanayama, and there are a few copies you can buy immediately (see video info for details).

Mon, Oct. 19th, 2009, 12:55 pm
Google suggests life's most important questions

Thanks to google search suggestions, here's the list of life's most pressing existential questions:

How can...
How can you tell if a guy likes you?
How can you tell if a girl is a virgin?
How can I make my hair grow faster?
How can you tell if a girl likes you?
How can you tell if you're pregnant?
How can I get pregnant?
How can I keep from singing lyrics?
How can you tell if someone is lying?
How can I lose weight fast?
How can I keep from singing?

Am I...
Am I pregnant?
Am I fat?
Am I in love?
Am I overweight?
Am I depressed?
Am I having a boy or girl?
Am I an alcoholic?
Am I bipolar?

How long...
How long does implantation bleeding last?
How long does it take to get a passport?
How long does alcohol stay in your system?
How long does it take to get pregnant?
How long does sperm live?

Why do...
Why do men have nipples?
Why do dogs eat grass?
Why do cats purr?
Why do dogs eat poop?
Why do men cheat?
Why do we dream?
Why do we yawn?
Why do mosquito bites itch?
Why do cats knead?
Why do dogs lick people?

When can...
When can you get pregnant?
When can I take a pregnancy test?
When can you find out the gender of the baby?
When can you take a pregnancy test?
When can you feel the baby move?
When can I get pregnant?
When can you hear a baby's heartbeat?
When can you tell the gender of a baby?
When can babies eat eggs?
When can I take a home pregnancy test?

When will...
When will I die?
When will the world end?
When will I get married?
When will windows 7 be released?
When will xbox live be back up?
When will iphone 3.0 be released?
When will the recession end?

Can someone...
Can someone tell if you are looking at their facebook page?
Can someone get pregnant during their period?
Can someone spy on my computer?
Can someone tell if you look at their myspace page?
Can someone be allergic to water?
Can someone die from a broken heart?
Can someone find out my ip address?
Can someone see if I looked at their profile on facebook?
Can someone track my ip address?
Can someone tell if you searched them on facebook?

Can you...
Can you run it?
Can you get pregnant on your period?
Can you duet?
Can you feel the love tonight lyrics?
Can you be pregnant and still have a period?
Can you get pregnant right after your period?
Can you get pregnant right before your period?
Can you get pregnant from being fingered?
Can you have your period and still be pregnant?
Can you get pregnant if he pulls out?

The less existential prefixes have a lot of questions about Michael Jackson.

How come Michael Jackson is white?
Did Michael Jackson die?
Did Michael Jackson bleach his skin?
Did Michael Jackson write his own songs?
Did Michael Jackson molest children?
What are Michael Jackson's kids names?
What is up with Jermaine Jackson's hair?
Where will Michael Jackson be buried?
Where did Michael Jackson die?
Where did Michael Jackson live?

Fri, Oct. 16th, 2009, 10:26 am

This post unintentionally demonstrates that functional-style Python is ugly and bad.

Let us start at the top. At the beginning, it has some sample code which defines a multiple() function, which could be trivially inlined, resulting in code which looks like this (I'm doing all examples in Python 3):
print(sum(x for x in range(1, 1000+1) if x % 3 == 0 or x % 5 == 0))

There's no reason whatsoever to expand that out. This should be an early indication that maybe the code samples here aren't the greatest.

Moving on, there's getting the sum of all fibonacci numbers less than 4 million. This is done in the example using itertools and yield, resulting in a fair amount of ugly code. Here's how a sane person does it:
def fibsum():
	a, b, c = 0, 1, 1
	total = 0
	while c < 4000000:
		total += c
		a, b, c = b, c, b + c
	return total

Now that's much more readable, flexible, and maintainable.

Finally, there's the problem of finding the largest palindrome which is the product of two three digit numbers. Here's my solution, which contains less code, is much more readable, and oh yeah, I threw in an optimization to make it return almost instantly instead of having to crunch for a second:
def bigpalindrome():
	best = 0
	for i in range(999, 0, -1):
		# No product with this or any smaller i can beat the best found so far.
		if i * 999 < best:
			return best
		for j in range(999, i-1, -1):
			x = list(str(i*j))
			x.reverse()
			if int(''.join(x)) == i*j and i*j > best:
				best = i*j

Much better. I think these examples do a good job of exploding the idea that the functional style of programming is clearly better and the appropriate first thing to teach people. Obviously some people are being driven to write horribly contorted and ugly code because they started in a functional language when they switch to a more, ahem, mainstream one.

Fri, Oct. 16th, 2009, 10:09 am
print isn't even vaguely thread safe

Thanks to everyone who provided suggestions for my problems with print yesterday. It turns out that print isn't even vaguely threadsafe, and that the rather surprising behavior when there's a collision is for there to be duplicate output bytes. So if you ever see duplicate print lines, you know what to guess it is now.

My quick hack for getting around the problem is to make a module called rprint which looks like this (I'm using python 3):
from threading import Lock

mylock = Lock()
p = print

def print(*a, **b):
	with mylock:
		p(*a, **b)
Then from every module where I have a print I say:
from rprint import print
I don't know if this is an elegant solution or an ugly hack, but hey it works.

Thu, Oct. 15th, 2009, 12:07 pm
print() experiencing deja vu

I have a very multithreaded Python3 app I'm running which I have a whole bunch of calls to print() in. I don't generally do multithreading, but this is spinning up a whole bunch of servers to check if they're working right. I have another test which is non-multithreaded and reproducible, and will of course have another test of actually running everything on multiple machines, but this is the intermediate step.

Anyhow, my problem is that print() appears to be sometimes causing the same line to be printed out repeatedly. I'm loath to draw the conclusion that the underlying libraries are misbehaving, but I've taken the following precautions:

  • The amount of data it's spitting out is extensive

  • The code is asserting that the exact string hasn't been printed before

  • The thread and object id's are included in what's printed

  • There's a call to randrange() included at the end, just to make sure

And yet, I'm still getting identical lines of output.

My questions are:

What the fuck?

Is there some config setup to make this stop?

If not, is there a workaround?

My next step is to try making everything dump objects to be printed on a queue and have a single thread do all the printing, in the hopes that that will get the duplicates to go away, or at least be next to each other instead of obnoxiously interleaved. But first I'm taking a break for lunch, I've already wasted a day on this crap.
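
For what it's worth, here's a minimal sketch of that queue-based plan, with a single printer thread owning all the output:

import threading
from queue import Queue

print_queue = Queue()

def printer():
    # The only thread that ever touches stdout.
    while True:
        line = print_queue.get()
        if line is None:   # sentinel: shut the printer down
            break
        print(line)

t = threading.Thread(target=printer)
t.daemon = True
t.start()
print_queue.put('hello from some worker thread')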

Wed, Oct. 7th, 2009, 07:35 pm
Sudoku and the way of the SAT solver

Writing a Sudoku solver is a slightly tricky but reasonably straightforward problem. Here I'll present a non-obvious way of implementing it which is short in terms of lines of code and also much easier to modify and extend than the more direct approaches. The general technique is applicable to a very wide range of problems, so hopefully someone will read this and then find it useful in the future.

My general approach is in multiple steps. First, express the Sudoku problem as a much more general type of problem called a SAT problem (short for 'satisfiability', traditionally capitalized for no particular reason). Then solve the SAT problem, then translate the solution back into the original format.

A SAT problem works on boolean variables, and the problem is to find an assignment which satisfies every one of a list of clauses. An individual clause states that at least one of a set of variables and their negations must be true, for example 'A or B or C' or 'not B or not D or E' are both typical clauses. To translate a Sudoku problem into a SAT problem, I make one boolean variable for each possible state of each cell of the original problem, so there's one variable for whether the upper-left cell is a one, another for whether it's a two, etc. Translating the constraints on the Sudoku problem is then straightforward - each cell must contain exactly one digit, each digit must occur exactly once in each row, column, and subsquare, and each pair of cells in each row, column, or subsquare must not both be assigned the same digit (expressed as, either the first one is false or the second one is false).

To solve the SAT problem, the solver first scans for a clause containing no variables. If there is one, then the problem is insoluble and you return. Otherwise, it scans for clauses containing exactly one variable. If there is such a thing, it makes that assignment and recurses. If not, it finds the clause with the smallest number of variables in it (skipping the 'not A or not B' clauses, because everything contains about the same number of those) and recurses twice, setting one of the variables to true and false.

The advantage of this approach is that the heuristics for noticing when there's only one possible value left on a row, column, subsquare, or an individual cell don't have to be special cased - they're all subsumed by the single simple deduction that if a clause only contains a single variable, then that assignment has to be made. As a result, adding for example a constraint that the diagonals have to have each digit occur only once is trivial, and making irregular shapes and adding novel types of constraints is also fairly easy, while it would be quite difficult in the more obvious but less flexible approaches to writing Sudoku solvers.
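
For instance, with the code below, the diagonal constraint is just two more cell groups appended to svals before solving (a sketch using the cell numbering defined there):

svals += [[10 * x for x in range(9)],        # main diagonal: cells 0, 10, ..., 80
          [8 * (x + 1) for x in range(9)]]   # anti-diagonal: cells 8, 16, ..., 72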

It would be possible to make this code a lot faster, even without changing the architecture - for example, one could build an index of what variables occur in what clauses, to get rid of all the linear scans. But for solving a small number of Sudokus, or even a largeish number, taking a few seconds each is no big deal, and this approach is much more about maintainability, flexibility, and extensibility. Interestingly, for a large number of real world problems the fastest known way to solve them is to translate to SAT and run a general SAT solver with appropriately tuned parameters on the result. Obviously this is a hack, but applying custom heuristics to the unencoded problem is so messy and difficult, and so much work has been done on fine-tuning the general purpose solvers, that more often than not the theoretically better approach can't compete in practice. Sudoku has some very special properties which tip the balance in favor of a very special custom bit of code, and in fact I made my SAT solver do a heuristic which is very Sudoku specific and skipped the common one that if a variable only appears in the positive or negative then you assign it, because that doesn't apply, but such customization being warranted is highly unusual, and even in this case the general SAT solver has some clear advantages, starting with it containing fewer lines of code.

I used exception handling to indicate when a solution is found, and add assignments to it in handle-and-reraise clauses, mostly to make the point that this is a perfectly valid and highly pragmatic way of using exceptions.

A flat file of the code can be found here, but I've included it verbatim below.

By the way, the method of testing I used on this is to run it on a Sudoku problem with no initial constraints, which is a simple one-line test and does a good job of hitting all the edge cases regardless of the implementation details.

# Raised when a satisfying assignment is found; the assignment accumulates
# in .s as the exception propagates back up the recursion.
class SolutionFound(BaseException):
	def __init__(self):
		self.s = []

# Simplify the clause list given that literal 'val' is now true: satisfied
# clauses are dropped and the opposite literal is removed from the rest.
def assign(clauses, val):
	n = []
	for c in clauses:
		if val in c:
			continue
		elif -val in c:
			n.append([x for x in c if x != -val])
		else:
			n.append(c)
	return n

# Try setting 'val' true; if a solution is found deeper in, record the
# assignment on the exception and keep it propagating.
def try_sat(clauses, val):
	try:
		solve_sat(assign(clauses, val))
	except SolutionFound as s:
		s.s.append(val)
		raise s

# Core solver: an empty clause means the current assignment is contradictory,
# no clauses left means everything is satisfied, a one-variable clause forces
# that assignment, and otherwise we branch on the smallest all-positive clause.
def solve_sat(clauses):
	if [] in clauses:
		return
	if len(clauses) == 0:
		raise SolutionFound()
	for x in clauses:
		if len(x) == 1:
			try_sat(clauses, x[0])
			return
	smallest = clauses[0]
	for c in clauses:
		if c[0] > 0 and (smallest[0] < 0 or len(smallest) > len(c)):
			smallest = c
	try_sat(clauses, smallest[0])
	try_sat(clauses, -smallest[0])

# Cell indices (0..80, row-major) of the nine rows, nine columns, and nine
# 3x3 subsquares.
svals = [list(range(x, x+9)) for x in range(0, 81, 9)] + \
		[list(range(x, x+81, 9)) for x in range(9)] + \
		[[x,x+1,x+2,x+9,x+10,x+11,x+18,x+19,x+20] for x in [0,3,6,27,30,33,54,57,60]]

# Add clauses saying exactly one of the variables in c is true: one
# at-least-one clause plus a 'not both' clause for every pair.
def group(c, p):
	p.append(c)
	for i in range(len(c)):
		for j in range(i):
			p.append([-c[i], -c[j]])

# For each digit (0..8), require that it appears exactly once among the cells
# in c.  Variable numbering: 100 * digit + cell + 1.
def groups(c, p):
	for i in range(9):
		group([x + 100 * i + 1 for x in c], p)

# Turn a satisfying assignment (the list of true variables) back into a 9x9
# grid of digits.
def decode(r):
	r2 = [[0 for x in range(9)] for y in range(9)]
	for v in r:
		if v > 0:
			pos = (v-1) % 100
			r2[pos // 9][pos % 9] = (v // 100) + 1
	return r2

# Build the full clause set, apply the given clues, and solve; returns a 9x9
# grid of digits, or None if the problem is unsatisfiable.
def solve_sudoku(problem):
	p = []
	for i in range(81):
		group(list(range(i+1, 901, 100)), p)
	for s in svals:
		groups(s, p)
	pre = []
	for x in range(9):
		for y in range(9):
			if problem[x][y]:
				m = x * 9 + y + 100 * (problem[x][y] - 1) + 1
				pre.append(m)
				p = assign(p, m)
	try:
		solve_sat(p)
	except SolutionFound as s:
		return decode(s.s + pre)
	return None

# The solver recurses a couple of stack frames per assignment, which can
# blow past the default recursion limit.
import sys
sys.setrecursionlimit(10000)

if __name__ == '__main__':
	tt = [[0] * 9 for i in range(9)]
	print('\n'.join(' '.join([str(p) for p in q]) for q in solve_sudoku(tt)))
	print(' ')
	tt[0][0] = 6
	tt[3][4] = 1
	print('\n'.join(' '.join([str(p) for p in q]) for q in solve_sudoku(tt)))

Tue, Sep. 29th, 2009, 12:19 pm
Signatures don't do what you think they do

Security people tend to think that we live in a secure world, one in which everyone is constantly auditing the behavior of everyone else, and the end result is widespread mutually ensured honesty. We don't even vaguely live in that world. We live in a trusting world, where most people are mostly good, and the need for auditing is much lower than it would be if everyone were greedy sociopathic automatons. I would not want to live in a world which worked that way, and it would probably not only be an unpleasant place to live, but an extremely unproductive one as well, as every attempt by anyone to get anything done would be destroyed by theft and corruption.

I say this not to engage in broad philosophizing but because I have a very concrete point to make about a very specific thing: signatures. People think of signatures as being a strong form of physical evidence, useful in court proceedings for proving that a particular person really did sign a particular thing. The fact that this belief is widespread does a good job of keeping people from denying that things they signed really are their signature, which is a good thing, but the claimed difficulty of forging signatures is simply not true. Anyone can practice forging a signature from a few samples for a few hours and be able to do a passable replica. Anyone with decent skill who practices a bit can get quite skilled. And people aren't very consistent about how they sign their own signatures, making even legitimate matches sometimes look fake. Thumbprints would be far better as a piece of evidence.

Despite that, signatures are still very important and good at what they're used for. What is that? It's to make it clear that someone knows when they're entering into a binding agreement. You can't be forced into an effective contract just because you said 'yeah, whatever' when asked if you want to participate, and you can't be forced into a contract by being tricked into signing a document which says something different than what you think it says. The theory of contracts is based on parties mutually agreeing to be contractually bound, and requires they all go through sufficient ceremony that it's clear when a contract has been entered into (sometimes merchants can get into binding contracts much more easily, but that's because they're expected to be more savvy; the law is big on protecting little old ladies from being suckered).

For example, take the use of signatures for receiving packages. There isn't even a contract entered into when a package is signed for, but the reasoning behind it is the same - it's to make clear that the person receiving the package knew they were receiving a package, so they can't claim later that there was a misunderstanding. To the extent that the signature has any evidentiary power in this case, it's mostly in that people generally by default put down their real name, and since the delivery person generally doesn't even know what the potential name of a recipient might be, it's hard for someone to lie later and claim that no package was delivered at all.

The hoopla around cryptographic signatures is largely misplaced. A signature made on a web page which clearly stated what was being agreed to, with the signature done by moving the mouse like a pen in a drawing area, would do a much better job of indicating what signatures are supposed to indicate, and would probably be much easier to back up in court later.

Now somebody please explain this to Bruce Schneier, because he doesn't get it.

Mon, Sep. 14th, 2009, 04:21 pm
Awful Programming Advice

I just came across a blog post going over that old saw of Object Oriented Design, a square being a subclass of a rectangle.

This advice is worse than useless. It's either wrongheaded or meaningless, depending on how literally you take it.

Taken literally, it would never make sense to make a full-blown class for such a trivial piece of functionality. There simply would be more lines of code taken up making declarations than could possibly be saved by convenience. Taken less literally, it's just gibberish, a completely nonsensical way of thinking about the problem, like teaching a drawing class where you cover pentagrams.

So what would be a sane way of building this functionality? Well, first you have to decide what the functionality actually is, since there wasn't any actual functionality in the first example. A reasonable set of functionality would be polygons. Polygons can be rotated, translated, and scaled, and you can take their union and intersection with other polygons, and calculate their area. This is a nontrivial set of functionality which it makes sense to encapsulate. How then to make rectangles and squares? The simplest way is to have two convenience functions - one which builds rectangles, and one which builds squares. Or just make the one convenience function accept a variable number of parameters and have it return a square if it only gets one.
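To make that concrete, here's a rough sketch (the names and the exact set of operations are made up for illustration, and union and intersection are left out since they're the genuinely hard part):

class Polygon:
    def __init__(self, points):
        self.points = list(points)   # (x, y) vertex tuples, in order

    def translate(self, dx, dy):
        return Polygon([(x + dx, y + dy) for x, y in self.points])

    def scale(self, factor):
        return Polygon([(x * factor, y * factor) for x, y in self.points])

    def area(self):
        # shoelace formula
        s = 0.0
        for (x1, y1), (x2, y2) in zip(self.points, self.points[1:] + self.points[:1]):
            s += x1 * y2 - x2 * y1
        return abs(s) / 2

def make_rectangle(width, height=None):
    # with a single argument this returns a square, as suggested above
    if height is None:
        height = width
    return Polygon([(0, 0), (width, 0), (width, height), (0, height)])

print(make_rectangle(2).area())   # 4

Rectangles and squares are just convenience constructors; there's no subclass anywhere in sight.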

But this example doesn't use any subclassing! I can hear the OOP gurus exclaiming now. How are people supposed to learn subclassing if you don't give them any examples of it? This is a deranged attitude. Subclassing is not an end in and of itself, it's a technique which is occasionally handy. And I'll let you in on a little secret - I personally almost never use subclassing. It's not that I one day decided that subclassing is bad and that one should avoid it, it's that as I got better at coming up with simple designs I wound up using it less and less, until eventually I almost stopped using it entirely. Subclassing is, quite simply, awkward. Any design which uses subclassing should be treated with skepticism. Any design which requires subclassing across encapsulation boundaries should be assumed to be a disaster.

Unfortunately this is hardly atypical of introductory object oriented design examples. Even the ones which are more real world tend to be awful. Take, for example, networking APIs. A typical example of an API is one which allows the creation of sockets, with blocking read calls and maybe-blocking write calls. The first few versions of Java had this sort of API exclusively. This approach is simple, seems logical to people unfamiliar with it, and is an unmitigated disaster in practice. It leads to a ton of threads floating around, with ridiculous numbers of race conditions, and awful performance because of all the threads swapping in and out. Such awfulness unfortunately hasn't stopped it from being the default way people are shown how to do things.

So what would be a better API? This is something I have a lot of experience with, so I'll give a brief summary. I'm glossing over some details here, but some of that functionality, like half-open sockets, is perhaps best not implemented.

The networking object constructor takes a callback function. Each socket is referred to using an identifier (yes, that's right, an identifier, they don't warrant individually having objects).

The methods of the networking object are as follows:

Start listening on a port

Stop listening on a port

Make a new outgoing connection (returns the connection id)

Write data to a socket (returns the number of bytes written)

Check if a socket has buffered data to write

Read data from a socket

Close a socket

Here are the methods of the callback:

Incoming socket created

Socket closed

Socket has data to be read

Socket has flushed all write data from buffer

You've probably noticed that this API, while simple, isn't completely trivial, and there is considerable subtlety to what exactly all the methods mean. That is an important point. For functionality below this level of complexity, object orientation is generally speaking a bad idea, and trying to force simpler examples in the name of instruction results in poor and misleading instruction.
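As a rough illustration of the shape of the thing (all the names here are made up, and buffer management and error handling are glossed over):

class ConnectionHandler:
    """The callback object handed to the networking object's constructor."""
    def incoming_connection(self, connection_id): ...
    def connection_closed(self, connection_id): ...
    def data_ready(self, connection_id): ...
    def write_buffer_flushed(self, connection_id): ...

class Networking:
    def __init__(self, handler):
        self.handler = handler                    # a ConnectionHandler

    def start_listening(self, port): ...
    def stop_listening(self, port): ...
    def connect(self, host, port): ...            # returns a connection id
    def write(self, connection_id, data): ...     # returns bytes accepted
    def has_buffered_writes(self, connection_id): ...
    def read(self, connection_id): ...
    def close(self, connection_id): ...

Note that sockets are plain identifiers rather than objects, exactly as described above.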

Unfortunately not all asynchronous networking APIs are in agreement with my example, which really says something about the state of the art in software design. I daresay that if Twisted followed this example straightforwardly (it does something similar, but awkwardly and obfuscatedly) then Tornado would never have been written.

Tue, Sep. 8th, 2009, 01:42 pm
Logging decorators in Python

I've been doing a lot of debugging lately, and have been using this bit of code to great effect:

from traceback import extract_stack, format_list
ti = 0.0   # global 'virtual time', updated by the simulation code (the hack mentioned below)

class rec:
    def __init__(self):
        self._his = []

    def __setattr__(self, key, value):
        # record every assignment (except to the history list itself) along
        # with the virtual time and the stack at the point of assignment
        if key != '_his':
            self._his.append((key, value, ti, extract_stack()[:-1]))
        self.__dict__[key] = value

    def dump(self):
        for key, value, t, stack in self._his:
            print('%s = %s' % (key, value))
            print('time %f' % t)
            print(''.join(format_list(stack)))
 
To use this, I make a struct-like class subclass rec, call rec's init from its init, and then call dump() when I want to know what's happened to the particular object, and it displays all assignments which have happened to it, and the virtual time when they happened (I'm running a lot of simulations) and where the assignment took place.
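For example (Packet is a made-up struct-like class, just to show the pattern):

class Packet(rec):
    def __init__(self, src, dst):
        rec.__init__(self)
        self.src = src
        self.dst = dst

p = Packet('a', 'b')
p.dst = 'c'
p.dump()   # prints each assignment made to p, with its virtual time and stack trace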

Extremely useful, but missing in some key areas:

It should really be a decorator, rather than a base class (a rough sketch of what that might look like is at the end of this post)

If someone retrieves a primitive (list or dict), then it should return a wrapped version of that, so if someone says, for example, myobj.stuff[3] = 4 then the logging should record exactly that. Right now such assignments are simply not logged, which sucks.

The virtual time should be less of a hack than a global. Maybe getting a virtual time by associating off the current thread would work. This is less general-purpose than the other functionality here, but I find it quite useful.

Can anybody please flesh out this functionality, or point me to something which does it already? I'd like for this to be part of my standard debugging toolbox, the first thing I go to when a simulation I'm running hits a complex and hard to track down bug.
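To make the first item on that list concrete, here's roughly the shape a decorator version might take (a sketch only - 'logged' is a made-up name, it still leans on the global virtual time hack, and it doesn't touch the other two items):

from traceback import extract_stack, format_list
ti = 0.0   # the same global 'virtual time' hack as above

def logged(cls):
    # Wrap __setattr__ so assignments are recorded without the class having
    # to subclass rec or call its __init__.
    original_setattr = cls.__setattr__

    def __setattr__(self, key, value):
        if key != '_his':
            self.__dict__.setdefault('_his', []).append(
                (key, value, ti, extract_stack()[:-1]))
        original_setattr(self, key, value)

    def dump(self):
        for key, value, t, stack in self.__dict__.get('_his', []):
            print('%s = %s' % (key, value))
            print('time %f' % t)
            print(''.join(format_list(stack)))

    cls.__setattr__ = __setattr__
    cls.dump = dump
    return cls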

Tue, Aug. 11th, 2009, 09:43 pm
Tracker Ball Puzzle


People seemed to not understand that these puzzles are collaborations between me and Oskar, even though he clearly explains it in the videos, so I did the presentation in this one.

Thu, Jul. 30th, 2009, 10:26 pm
3D printing, interview with Bram

There's a new interview with me about 3d printing, puzzles, and a few bits of other random stuff.

Fri, Jul. 17th, 2009, 08:25 am
Oskar's Mixup Cube


This puzzle looks like a Rubik's Cube, but it allows some moves which clearly shouldn't be.

Thu, Jul. 9th, 2009, 03:24 pm
Someone at Mozilla Foundation needs to be fired

Somebody at Mozilla decided they need lots of 'true' random numbers.

My patience for this subject completely ran out about five years ago, so this post is going to show a complete lack of diplomacy. I would like to emphasize, in advance, that this is my honest, reasoned opinion, not said in anger, and that if you ask my opinion again in the future I'll say the exact same thing.

Once a computer has collected a small number of 'true' random bits (maybe it's 128, maybe it's 256, but regardless it's small) there's no need whatsoever for it to block on collecting more 'random' numbers. A pseudorandom number generator based on AES will be able to generate random numbers based on that seed until the end of the universe, and no one constrained by the laws of physics and math will ever be able to tell the difference between that and 'true' random numbers. This is extremely well established cryptography. To require 'true' random numbers is, to use an apt analogy, wankery. It does not, and cannot, do anything to improve security, and it mostly just causes huge amounts of pain. It is (and I repeat myself, because I have a hunch people will think I'm glossing over some nuance here) of no benefit whatsoever.
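To illustrate how simple doing it right is, here's a sketch (this assumes the third-party pyca 'cryptography' package; the class name and interface are made up, and a production generator would also want reseeding and forward secrecy):

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

class AesCtrPrng:
    def __init__(self, seed=None):
        # gather a small amount of real entropy exactly once
        key = seed if seed is not None else os.urandom(32)
        self._keystream = Cipher(algorithms.AES(key), modes.CTR(b'\x00' * 16)).encryptor()

    def random_bytes(self, n):
        # encrypting zeros under CTR mode just returns the AES keystream
        return self._keystream.update(b'\x00' * n)

prng = AesCtrPrng()
print(prng.random_bytes(16).hex())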

My advice to the Mozilla foundation (and again, this is my reasoned opinion, not said in anger, and I won't be changing my mind later): find out who was responsible for this policy of requiring lots of 'true' random numbers, and fire them. Fire them today. They have demonstrated gross incompetence, a total lack of understanding of the very most basic concepts in security.

Some people might think that if I knew more about who was behind this and what their specific motivations are, then that might change my mind. That is incorrect. The security field is filled with people who to non-experts seem very impressive and knowledgeable, especially when they're advocating, and even more so demanding, very painful and difficult things in the name of security. Most of these people are frauds. I have had it with paying homage to the concept of impartiality when discussing these people's opinions. If someone spouts a bunch of technical mumbo-jumbo to bring the conversation to a place which a lay person has trouble understanding, then they may be able to make the argument be based on pure rhetoric, but gross incompetence is still gross incompetence, and despite having found an effective way to bullshit their way through, they're still wrong.

Tue, Jul. 7th, 2009, 11:30 am
Ludology in City of Heroes

City of Heroes has had some interesting issues with its gameplay, involving a character named Twixt using some tactics which made everybody hate him.

Several years ago I happened to be seated next to the designer of City of Heroes at an event where he won something. He was a pudgy guy, wearing big round glasses, with a white city of heroes t-shirt and a blue cape on. We got into a conversation about his game, and I asked what it was that made it compelling, and he said that it's every kid's fantasy of being a superhero, and it was very obvious that he'd based the game's design on his own. I asked him if City of Heroes is compelling as a pure abstract game, and the interesting response was that he didn't understand the question. After a few minutes of conversation he got what I was asking, and his answer, which really perplexed me at the time, was that it was a good question, but he didn't know.

Consider a game with the following semantics: You sit, unmoving, for two hours, with no user feedback, no buttons to push, nothing, completely passive, while the game plays out in front of you, exactly the same way as it would for anybody else. This sounds like a terrible game, but it's exactly what movies are, and movies are very popular and get little criticism that they're terrible games.

The Twixt problem was caused not so much by any one person behaving unreasonably as by the game engine having a problem. There's a battle tactic which is quite effective but has the effect of wiping out an enemy without even giving them a chance to play, making it not much fun for them, and it doesn't even earn you much credit. Because City of Heroes is more fantasy than game, players have a convention of not using this tactic, because that maximizes the fun of play. This is done at the expense of an individual's success if you take the game as a sport, but since the game isn't a sport, people don't worry about that too much. Real sports don't involve dressing up as superheroes (except for figure skating, but that isn't a real sport). What really should be done is that the rules should be modified so that the particular tactic isn't so nasty. It's a general rule of game design that all players should get a chance to play and have fun, even if they aren't very good, and tactics which allow a better player to win without the weaker player even having a chance to try to retaliate are no fun.

Mon, Jul. 6th, 2009, 04:07 pm
Bandwidth fundamentals

A random person asks about something they read on Wikipedia:

Example from wiki below:

http://en.wikipedia.org/wiki/Bram_Cohen

quote from site:

MojoNation allows people to break up confidential files into encrypted chunks and distribute those pieces on computers also running the software. If someone wanted to download a copy of this encrypted file, he would have to download it simultaneously from many computers. This concept, Cohen thought, was perfect for a file sharing program, since programs like KaZaA take a long time to download a large file because the file is (usually) coming from one source (or "peer"). Cohen designed BitTorrent to be able to download files from many different sources, thus speeding up the download time, especially for users with faster download than upload speeds. Thus, the more popular a file is, the faster a user will be able to download it, since many people will be downloading it at the same time, and these people will also be uploading the data to other users.

This explanation was lifted from an actual news article, which doesn't necessarily mean it's true. In fact, it's somewhere between grossly misleading and wrong.

There's a classic fallacy: because one person who stands up during a concert gets a better view, everybody standing up during a concert would get a better view too. This of course is not true - they wind up slightly worse off by all standing, because they all compete with each other for a view. The same thing happens with downloading from a server. In general, web servers will give about the same rate to every client downloading from them, so if you open many more connections than everybody else you get a greater proportion of the bandwidth and hence a better rate. But you do so simply by taking bandwidth from other downloaders. The overall supply of upload is unchanged, it's simply being shuffled around. If everybody does the same thing it results in overall slightly worse performance and you're basically back where you started, but with a bunch of headaches tacked on.
 
So why does BitTorrent perform so well? Quite simply, because it does a better job of finding more places to do uploading. Any peer which is downloading is in general willing to upload as well, and their uplink is usually unutilized, so if you can get a peer to start uploading as soon as it starts downloading, and keep uploading as long as possible, and saturate its link while it's uploading, then overall performance will be better. It doesn't necessarily help to transfer over more connections, or make more different things available at the same time, or use error correcting codes. In fact, all of those are a complex tradeoff between benefits and costs, with the net result being that small amounts of them can help reliability and robustness, but in general it's good to keep things simple and be polite to the network.

On the internet, the formula is bytes downloaded = bytes uploaded. It's that simple.

Mon, Jul. 6th, 2009, 02:50 pm
The Uncanny Cube



From deep in the uncanny valley of the Rubik's Cube, it's the Uncanny Cube. At first blush this appears to be a slight variant, but it is quite profoundly and perversely different.

Fri, Jul. 3rd, 2009, 09:41 am
Bram's Cube



This is Bram's Cube, an idea I'm very fond of. It's very interesting to solve, since the middle layer and everything else can be thought of independently and solved on their own, but that scrambles the part you weren't thinking of.

Mon, Jun. 15th, 2009, 07:55 am
New Puzzles - Bram's Black Hole and Fairly Twisted


Bram's Black Hole is a concept which occurred to me a few minutes after the first time I ever played with a puzzle similar to Peter's Black Hole.


Fairly Twisted is a twisty puzzle with no symmetry.

Mon, Jun. 8th, 2009, 02:10 pm
Puzzle Ring tweaking

It turns out that when CADding a puzzle ring, some issues turn up. Specifically, when a band goes over one band then immediately under another, it frequently doesn't have enough space to squeeze through between them.

I've come up with some techniques for dealing with this, and drawn the following diagrams, which I think are kind of pretty; I'm proud of the diagramming technique involved :-)


         ___
_______ /   \ _______
       \     \
      / \   / \
___   |  |  |  |  ___
   \ /   \ /   \ /
    \     /     \
___/ \   / \   / \___
      |  |  |  |  
      \ /   \ /
_______\     \_______
        \___/ 


         ___   __________
_______ /   \ /          ___________
       \     /          / \
      / \   / \    __   |  \
___   |  |  |  |  /  \ /    | ______
   \ /   \ /   \ /    /     \
    /     /     \    / \   / \
___/ \   / \   / \   |  |  |  |  ___
      |  |  |  |  \ /   \ /   \ /
      \ /   \ /    \     /     \
______ \     \    / \   / \   / \___
        |   / \__/   |  |  |  |
        \   |        \ /   \ /
___________/          \     \_______
          \__________/ \___/


              __   ___   __
       ___   /  \ /   \ /  \   ___
_____ /   \ /    \     /    \ /   \ ________
     \     \    / \   / \    /     /
    / \   / \   |  |  |  |  / \   / \   
___/   \ /   \ /   \ /   \ /   \ /   \ _____
        \     \     \     /     /     /
       / \   / \   / \   / \   / \   / \
____   |  |  |  |  |  |  |  |  |  |  |  L___
    \ /   \ /   \ /   \ /   \ /   \ /
     /     /     /     \     \     \
____/ \   / \   / \   / \   / \   / \   ____
       |  |  |  |  |  |  |  |  |  |  |  |
       \ /   \ /   \ /   \ /   \ /   \ /
___     /     /     /     \     \     \_____
   \   / \   / \   / \   / \   / \   /
    \ /   \ /   |  |  |  |  \ /   \ /
_____/     /    \ /   \ /    \     \________
      \___/ \    /     \    / \___/
             \__/ \___/ \__/

Sun, May. 31st, 2009, 10:13 pm
Kite wind power

There are a number of different projects going on to harness wind power using kites. Most of them are utter lunacy, involving so many gizmos and unproven techniques that you might as well be trying to get power from lightning. That said, I have an approach which I think is plausible, but before even idly thinking about working on it I'd like to know if the whole concept of kite-based power even works on paper. Gathering the necessary information and running the numbers is a bit beyond my off-the-top-of-my-head skills, so I'd greatly appreciate it if anyone would help me calculate the following:

For several points of interest on earth (particularly plausible/notable/typical places) calculate the following for both ground level and an altitude of half a kilometer:

What is the minimum, average, and max wind velocity? For the average wind velocity what would be the pull of a kite with an area of a hundred square meters? What would be the weight of the kevlar necessary to hold down something producing that much force at the maximum wind velocity? After factoring in holding up all that kevlar, could a kite stay up at minimum wind speed? How much power could be generated by its remaining upwards force? How much of that would be lost in the electrical generator? How does that amount of power output compare to the per capita max and average power consumption in the United States?
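The aerodynamic part of this is just the standard dynamic pressure formula, so here's a rough back-of-envelope calculator for anyone who wants to plug in real numbers (it assumes a simple reel-out generation scheme and an effective force coefficient you have to supply - it's a sketch of the arithmetic, not an answer):

RHO_AIR = 1.225   # kg/m^3 at sea level; lower at altitude

def kite_pull(wind_speed, area, force_coefficient):
    # dynamic pressure times area times an effective force coefficient
    return 0.5 * RHO_AIR * wind_speed ** 2 * area * force_coefficient

def reel_out_power(wind_speed, area, force_coefficient):
    # Tension is set by the apparent wind (wind speed minus reel-out speed);
    # power is tension times reel-out speed, which is maximized at a reel-out
    # speed of one third of the wind speed.
    reel_speed = wind_speed / 3.0
    tension = kite_pull(wind_speed - reel_speed, area, force_coefficient)
    return tension * reel_speed

# made-up example inputs: 100 m^2 kite, 10 m/s wind, coefficient of 1.0
print(kite_pull(10.0, 100.0, 1.0))        # newtons of pull
print(reel_out_power(10.0, 100.0, 1.0))   # watts, before generator losses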

My suspicion is that flying kites at an altitude of more than a few hundred meters is simply not worth it, and that the optimal kite size in practice is around a thousand square meters, and that in an area where the wind never dies down a kite system compares quite favorably to a turbine system, once you get all the engineering problems worked out.

Also, I have a dumb question about controls - At what altitude do the sorts of control systems used for power kites simply stop working due to even very tensioned materials having a lot of slop at that length?

Sun, May. 31st, 2009, 09:52 pm
Maybe goto isn't so bad after all.

I've used this pattern a lot over the last few days:

for eggs in spam:
    for jello in eggs:
        if jello is the one I want:
            break
    else:
        continue
    break
else:
    return
do stuff with eggs and jello


People familiar with my code will find this amount of complex control flow unsurprising. But the reasoning here is different from my past practices. In the past I always optimized to have as few lines of code as possible, regardless of how upside down and backwards it made everything be organized, but that could be done much better in this case like so:

for eggs in spam:
    for jello in eggs:
        if jello is the one I want:
            do stuff with eggs and jello
            return


There are two reasons I find this construct objectionable. First is that it leads to a lot of code having a lot of indentation (the 'do stuff' is in many cases dozens of lines) which isn't good for readability. Second is that the natural flow of the code is 'scan for the thing I want, and then do something with it' which is the flow of the structure I've been using. I suppose one could write it like this:

def my_little_func():
    for eggs in spam:
        for jello in eggs:
            if jello is the one I want:
                return eggs, jello
    return None, None

eggs, jello = my_little_func()
if eggs is None:
    return
do stuff with eggs and jello


But that's hardly elegant and forces the flow to jump around in terms of the visual layout of the code, which goes against what I'm trying to accomplish.

At the risk of starting a religious war, I'd like to point out that the most readable and versatile way of doing what I want is like so:

for eggs in spam:
    for jello in eggs:
        if jello is the one I want:
            goto x
return
x: do stuff with eggs and jello


But short of that the following may be the best option currently available. I might switch to it:

for eggs in spam:
    for jello in eggs:
        if jello is the one I want:
            after_func(eggs, jello)
            return

def after_func(eggs, jello):
    do stuff with eggs and jello


But it still doesn't look as nice as the version with a goto.
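For anyone who finds the placeholder versions hard to follow, here's the first pattern above filled in with concrete stand-in values so it can actually be run ('the one I want' is just 'greater than ten'):

def find_and_use(spam):
    for eggs in spam:
        for jello in eggs:
            if jello > 10:
                break
        else:
            continue
        break
    else:
        return None
    # do stuff with eggs and jello
    return (eggs, jello)

print(find_and_use([[1, 4], [9, 16, 7], [25]]))   # -> ([9, 16, 7], 16)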

Thu, May. 28th, 2009, 10:00 pm
Geary Cube


This is the Geary Cube puzzle, another Bram and Oskar collaboration. It's externally identical to a Rubik's Cube, but has internal gearing which forces opposite faces to turn in opposite directions simultaneously. It's my favorite subgroup of the Rubik's Cube (you can try it yourself with a regular Rubik's Cube) and easily makes a bunch of pretty patterns. It's overall easier than a regular Rubik's Cube is, but the solution technique is very different and interesting and not related to the standard ways one solves twisty puzzles.

Fri, Apr. 17th, 2009, 04:41 pm
CodeCon 2009 now streaming live

Live streaming of CodeCon 2009 is now up. It's linked from the CodeCon 2009 front page.

Thu, Apr. 16th, 2009, 06:02 pm
CodeCon 2009 starting Tomorrow

CodeCon 2009 starts tomorrow. We're just finishing up getting ready, and it's going to be great. See everybody there!

Wed, Apr. 8th, 2009, 12:37 pm
I'm twittering

I've signed up for a twitter account, and have been posting there a lot more than here. I also linked my Facebook and Twitter accounts, and started using bit.ly. The facebook reposts get a lot more comments, because I have more friends there, and oh yeah, Twitter doesn't have comments yet. Please fix.

In the future, I'll post headline links on the twitter stream for all my regular blog posts, and continue to post here for thoughts which require more than one sentence to convey.

Sun, Apr. 5th, 2009, 10:03 am
The End is Nigh

Science fiction authors have dreams of the world's last days, and our distant descendants watching the sun flame out about four billion years from now. They're living in fantasy land. There's no way we're going to last even close to that long.

You see, the sun is getting hotter as it approaches flameout, and the earth is slowing down in its spin on its axis as tidal forces drag it down. As the earth's spin slows down, the cycle between day and night will become longer. A mere billion years from now the earth will stop spinning completely, and the light side will turn into smoldering embers while the dark side will become a truly frozen wasteland. Long before that the day/night cycle will become so long that peak daytime temperature will surpass the boiling point of water, causing moisture to get into the upper atmosphere and get blown away by the solar wind, depleting the oceans while cooking all the water-based life forms during the day.

There is only one way to avoid this. We must force the earth's temperature down until the oceans freeze, thus stopping tidal forces and allowing the earth to continue with its current length of day/night cycle indefinitely and allowing us to live an extended, if slightly chilly, existence. Anyone opposed to this plan is engaged in strictly short term thinking.

Wed, Apr. 1st, 2009, 08:31 am
Codecon

Remember that today's the last day to get the early bird discount to Codecon 2009, so if you haven't signed up yet, do it now.

Fri, Mar. 27th, 2009, 12:12 pm
Rotacubes


This is the Rotacubes puzzle, a collaboration between me and Oskar van Deventer. Nobody could make out how this one worked from pictures/text descriptions, so here's a video.

Mon, Mar. 23rd, 2009, 12:18 pm
CodeCon 2009 Program Posted

The Program for CodeCon 2009 is now up, and registration is at the early rate of $75 for all three days until April 1st, after which the cost goes up.

An explanation of CodeCon, from the site:
CodeCon is a conference of working demonstrations. Three days of quick presentations of actual working code, shown by people who wrote them, all for under $100. Past highlights include early presentations of BitTorrent, PGP, and SiteAdvisor.

Presentations this year include:

BioHack! Track:
Code Track:
To learn more about CodeCon, you can look at the sites from previous years:

2006
2005
2004
2003
2002


Sat, Feb. 28th, 2009, 11:14 pm
version control tidbits

I wrote up an explanation of when patience diff gives significantly different results, and why they're important.

The trickiest case for line based merging is when the number of blank lines between two functions changes. Were the new lines inserted at the beginning or the end? If two different people add a single line, should that clean merge? What if one person adds one blank line and another person adds two? Should that be a conflict of one blank line on one side and two on another, or a conflict of no lines on one side and a blank line on the other with a single line added, or no conflict and merge to the original length plus two, or no conflict and merge to the original length plus three? How about if a function was added in the middle of a section of blank lines and the number of blank lines after it was changed - should the new lines be at the end of the new function or the beginning of the old one? What if lines were removed, where should they have been removed from?

There aren't clear answers to these questions. Any answer you come up with will involve some tradeoffs, and it's a judgement call what behavior really is best. The one clear lesson is that you shouldn't change the number of blank lines between functions willy-nilly, because it will confuse the version control system.

A difficult distributed merge scenario which hasn't been written up anywhere is one I like to call the circular staircase. Three different branches are all made off of trunk, and all make changes to the same section of code. Think of the branches as all sitting around a circular table. Each branch then simultaneously pulls the most recent version of the branch to their right, and resolves the conflict in favor of their own version. This situation is weird because any two of the three branches will now merge together cleanly, but pulling in the third one should cause a conflict. I won't give the full explanation as to why that is here, because my point is that the scenario is completely counterintuitive and inscrutable. As soon as the relationship between branches ceases to be a tree, it becomes impossible for users to understand what's going on. The lesson is that you should have clear relationships between your branches, and not do anything goofy. At this point I'm in favor of the version control system not allowing you to do goofy things, and keeping track of the coherent relationships between branches which you do have. I have more ideas than were given in my last post on this subject, but more on that at a later time.

Another interesting case is the following:
     a
    / \
   b   c
  / \ /
 c   b
  \ /
   ?

If we want to support cherry-picking, we have a problem here. On the one hand, the change to c has already been applied and then rejected on the right hand side. On the other hand, the change to c on the left happened 'after' the change on the right, so by the staircase criterion the new value should be c.

I think that if you want to support implicit cherry-picking you need to suck it up and accept that this scenario creates the weird action at a distance of merging to b. I don't really want to argue it all that much though, because the horse has already been beaten to death, and basically nobody in the real world is asking for implicit cherry-picking, because nobody has proposed a good UI for it, and nobody other than a handful of experts understands how it behaves, and because it directly conflicts with implicit undo.

Wed, Feb. 25th, 2009, 09:58 am
Trigears video


When I last posted about this several people guessed that it's easy. It's not, it's quite difficult, and the logic behind the solution is very interesting. It also has no planar equivalent. 

Tue, Feb. 17th, 2009, 12:06 pm
Bram explains unix time



Here's me explaining unix time. A thought on appearing on TV - if somebody puts a video camera on you, it's best to babble on about all your thoughts on a matter without concern for time, because they're going to edit it down later.

Fri, Feb. 13th, 2009, 03:17 pm
CodeCon deadline coming up

The deadline for CodeCon submissions is coming up this Sunday! If you've got a cool project with actual working code, you should submit. CodeCon is designed to only require a few minutes to put together a submission if you've got something to show, and the amount of time to present is short enough that all you need to do is cover your project to fill it. Get your submissions in!

Tue, Feb. 10th, 2009, 05:09 pm
Corrections

Believe it or not, you can't believe everything you read on wikipedia.

I currently live in the San Francisco area.

I have three children.

I've never heard of badger and badger.

Also, the article about my asperger's said that I was in college for one year. I was actually there for two years, but flunked most of my classes my last term (I knew I was leaving and never wanted to come back, so didn't see much point in doing any schoolwork).

Thu, Jan. 29th, 2009, 10:17 am
Sea Snakes

Earlier this week (I think it was on Monday) I was riding the ferry to work, and saw a sea snake in the water. It was a few feet long (my guess would be about four, but it was at a distance so that could be way off) and black. A minute or so later I saw another one, and then got the amazing sight of a cluster of what I'm guessing was over a hundred of them. They were arranged into approximately parallel groupings in regions, with an overall effect of looking like a line drawing of an octopus.

I hadn't seen a sea snake in the water here before, so didn't think much of it, but then I looked it up online and it turns out hardly anyone else has seen them up here either! There's a species of sea snake native to California, but they live way down south, and aren't even all that common down there. Sightings up here are extremely rare.

Unfortunately I didn't take a picture. Were I used to using my phone camera, and had I any idea how unusual a sighting it was, I'd definitely have taken some footage.

Thu, Jan. 15th, 2009, 04:21 pm
Linked Cranks Puzzle



Tue, Jan. 13th, 2009, 03:55 pm
CodeCon Call For Papers out

For everybody who's been wondering and long anticipating when the next CodeCon is going to happen, here's the announcement:


CodeCon 2009
April 17-19, 2009
San Francisco CA, USA
www.codecon.org

Call For Presentations

CodeCon is the premier showcase of cutting edge software development. It
is an excellent opportunity for programmers to demonstrate their work and
keep abreast of what's going on in their community.

All presentations must include working demonstrations, ideally accompanied
by source code. Presentations must be done by one of the active developers
of the code in question. We emphasize that demonstrations be of *working*
code.

We hereby solicit papers and demonstrations.

    * Papers and proposals due: February 15, 2009
    * All Authors notified: March 1, 2009

Possible topics include, but are by no means restricted to:

    * community-based web sites - forums, weblogs, personals
    * development tools - languages, debuggers, version control
    * file sharing systems - swarming distribution, distributed search
    * security products - mail encryption, intrusion detection, firewalls
    * malware analysis - detection, compensation, and mitigation of
      emerging threats

--

As a new feature this year, CodeCon will be presenting a Biohack! track.
While we will continue our tradition of presenting only one talk at a
time, a portion of one of the days' talks will be reserved for interesting
biotechnology hacking projects. A key requirement for these presentations
is ease of reproduction with minimal access to expensive laboratory
equipment.

Example topics include:

    * Purifying DNA using common household items
    * Developing genetically-modified bacteria in a kitchen laboratory
    * Using specially-designed software to assist in bioengineering
    * The use of simple bioengineering techniques to solve real-world
      problems.

Ideal Biohack! Track submissions will have a strong emphasis on the
"hack" portion of the talk -- in the last few years, there has been a
strong growth in the community of biology hackers; we aim to bring these
hackers together to discuss their techniques for inexpensive, at home
experimentation in biological engineering research.

--

Presentations will be 30 minutes long, with an additional 15 minutes
allocated for Q&A. Overruns will be truncated.

Submission details:

Submissions are being accepted immediately. Acceptance dates are
February 7th and March 1st. After the first acceptance date, submissions
will be either accepted, rejected, or deferred to the second acceptance
date.

The conference language is English.

The conference venue is open to all ages.

Ideally, technical demonstrations should be usable by attendees with
802.11b connected devices either via a web interface, or locally on
Windows, UNIX-like, or MacOS platforms. Cross-platform applications are
most desirable. Biohacking demonstrations should be viewable with a
presenter-provided camera, or prepared movies for projection.


To submit, send mail to submissions-2009@codecon.org including the
following information:

    * Project name
    * Code track or Biohack! track
    * url of project home page
    * tagline - one sentence or less summing up what the project does
    * names of presenter(s) and urls of their home pages, if they have any
    * one-paragraph bios of presenters, optional, under 100 words each
    * project history, under 150 words
    * what makes the project novel -- how it differs from similar projects
    * what will be done in the project demo, under 200 words
    * slides to be shown during the presentation, if applicable
    * future plans

General Chairs: Jonathan Moore and Bram Cohen
Program Chairs: Jered Floyd and Len Sassaman

Program Committee:

    * Jon Callas, PGP, USA
    * Bram Cohen, BitTorrent, USA
    * Roger Dingledine, The Tor Project, USA
    * Jered Floyd, Permabit, USA
    * Ben Laurie, Google, UK
    * Nick Mathewson, The Tor Project, USA
    * David Molnar, University of California, Berkeley, USA
    * Jonathan Moore, Mosuki, USA
    * Meredith L. Patterson, Osogato, USA
    * Andrew S. Peek, Integrated DNA Technologies, USA
    * Len Sassaman, Katholieke Universiteit Leuven, BE
    * Cliff Skolnick
    * Paul Syverson, Naval Research Laboratory, USA
    * [Others may be added]

Sponsorship:

If your organization is interested in sponsoring CodeCon, we would love to
hear from you. In particular, we are looking for sponsors for social meals
and parties on any of the three days of the conference, as well as
sponsors of the conference as a whole and donors of door prizes. If you
might be interested in sponsoring any of these aspects, please contact the
conference organizers at codecon2009@codecon.org

Press policy:

CodeCon provides a limited number of passes to qualifying press.
Complimentary press passes will be evaluated on request. Everyone is
welcome to pay the low registration fee to attend without an official
press credential.

Questions:

If you have questions about CodeCon, or would like to contact the
organizers, please mail codecon2009@codecon.org. Please note this address
is only for questions and administrative requests, and not for workshop
presentation submissions.

