Log in

No account? Create an account

Mon, Sep. 14th, 2009, 04:21 pm
Awful Programming Advice

I just came across a blog post going over that old saw of Object Oriented Design, a square being a subclass of a rectangle.

This advice is worse than useless. It's either wrongheaded or meaningless, depending on how literally you take it.

Taken literally, it would never make sense to make a full-blown class for such a trivial piece of functionality. There simply would be more lines of code taken up making declarations than could possibly be saved by convenience. Taken less literally, it's just gibberish, a completely nonsensical way of thinking about the problem, like teaching a drawing class where you cover pentagrams.

So what would be a sane way of building this functionality? Well, first you have to decide what the functionality actually is, since there wasn't any actual functionality in the first example. A reasonable set of functionality would be polygons. Polygons can be rotated, translated, and scaled, and you can take their union and intersection with other polygons, and calculate their area. This is a nontrivial set of functionality which it makes sense to encapsulate. How then to make rectangles and squares? The simplest way is to have two convenience functions - one which builds rectangles, and one which builds squares. Or just make the convenience function accept a variable number of parameters, and if it only gets one to return a square.

But this example doesn't use any subclassing! I can hear the OOP gurus exclaiming now. How are people supposed to learn subclassing if you don't give them any examples of it? This is a deranged attitude. Subclassing is not an end in and of itself, it's a technique which is occasionally handy. And I'll let you in on a little secret - I personally almost never use subclassing. It's not that I one day decided that subclassing is bad and that one should avoid it, it's that as I got better at coming up with simple designs I wound up using it less and less, until eventually I almost stopped using it entirely. Subclassing is, quite simply, awkward. Any design which uses subclassing should be treated with skepticism. Any design which requires subclassing across encapsulation boundaries should be assumed to be a disaster.

Unfortunately this is hardly atypical of introductory object oriented design examples. Even the ones which are more real world tend to be awful. Take, for example, networking APIs. A typical example of an API is one which allows the creation of sockets, with blocking read calls and maybe-blocking write calls. The first few versions of Java had this sort of API exclusively. This approach is simple, seems logical to people unfamiliar with it, and is an unmitigated disaster in practice. It leads to a ton of threads floating around, with ridiculous numbers of race conditions, and awful performance because of all the threads swapping in and out. Such awfulness unfortunately hasn't stopped it from being the default way people are shown how to do things.

So what would be a better API? This is something I have a lot of experience with, so I'll give a brief summary. I'm glossing over some details here, but some of that functionality, like half-open sockets, is perhaps best not implemented.

The networking object constructor takes a callback function. Each socket is referred to using an identifier (yes, that's right, an identifier, they don't warrant individually having objects).

The methods of the networking object are as follows:

Start listening on a port

Stop listening on a port

make a new outgoing connection (returns the connection id)

write data to a socket (returns the number of bytes written)

check if a socket has buffered data to write

read data from a socket

close a socket

Here are the methods of the callback:

incoming socket created

socket closed

socket has data to be read

socket has flushed all write data from buffer

You've probably noticed that this API, while simple, isn't completely trivial and there is considerable subtlety to what exactly all the methods mean. That is an important point. For functionality of less that this level of complexity object orientation is generally speaking a bad idea, and trying to force simpler examples in the name of instruction results in poor and misleading instruction.

Unfortunately not all asynchronous networking APIs are in agreement with my example, which really says something about the state of the art in software design. I daresay that if Twisted followed this example straightforwardly (it does something similar, but awkwardly and obfuscatedly) then Tornado would never have been written.

Tue, Sep. 15th, 2009 02:56 pm (UTC)

Twisted could in principle work, in fact someone's already implemented swapping out the Tornado networking layer with Twisted (at some cost to performance). The main problem is that Twisted looks like a convoluted monster, and it's extremely unclear how to get started from just looking at it.

Tue, Sep. 15th, 2009 10:02 pm (UTC)

Do you think there's room to do much better than Twitter does? How?

Thu, Sep. 17th, 2009 11:35 pm (UTC)

How does twitter do networking?

Fri, Sep. 18th, 2009 07:10 am (UTC)

Twisted. Much better than Twisted does.

There's almost nothing Twitter does that there isn't room to do much better.