I've seen a lot of discussion of great programmers, usually centering on how to find them, but usually what people really want to know is how to become one. Since I'm widely considered to be a great programmer, I'll give some advice.
First of all, there's raw coding ability. For this, practice makes perfect. Implementing lots of algorithms from, say, Introduction to Algorithms can help sharpen your technical abilities, but really the important thing is to have some experience. Anyone with enough natural talent will get good at basic raw coding.
There are only two coding skills that completely self-taught programmers mostly miss out on: proper encapsulation and unit tests. For proper encapsulation, you should organize your code so that changes which require modifying code in more than one module are as rare as possible. For unit tests, you should write them to be pass/fail so that all of them can be run as a comprehensive suite. And now you know everything you need to know about those two things. Anyone who is taught the above guidelines, and decides they really want to learn those skills, will with sufficient practice become good at them.
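As a minimal sketch of those two guidelines, assuming Python and its standard unittest module: the Inventory class and its methods are made up for illustration. Callers only use add and count, so the internal dictionary could be swapped for something else without touching any other module, and every test is pass/fail so the whole lot runs as one suite.

```python
import unittest


class Inventory:
    """Hypothetical module: callers use add/count and never touch _items."""

    def __init__(self):
        self._items = {}  # internal representation, free to change later

    def add(self, name, quantity=1):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._items[name] = self._items.get(name, 0) + quantity

    def count(self, name):
        return self._items.get(name, 0)


class InventoryTest(unittest.TestCase):
    # Each test either passes or fails; nobody has to read output by eye.

    def test_add_accumulates(self):
        inv = Inventory()
        inv.add("widget", 2)
        inv.add("widget")
        self.assertEqual(inv.count("widget"), 3)

    def test_unknown_item_counts_as_zero(self):
        self.assertEqual(Inventory().count("gadget"), 0)

    def test_rejects_non_positive_quantity(self):
        with self.assertRaises(ValueError):
            Inventory().add("widget", 0)


def run_suite():
    """Run every test in InventoryTest as one suite; True means all passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(InventoryTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("all passed" if run_suite() else "failures")
```

The tests only go through the public methods, which is what keeps them valid if the internals are rewritten.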
Coding skill is all well and good, and you can't become a great programmer without it, but it's far from everything. I'm decent at raw coding, but I know many people who are better, and some of them are abysmal programmers. I in particular can't deal with being tasked with fixing up spaghetti code. My brain simply locks down and refuses to make any modifications which it isn't convinced will work, which is of course impossible when the source material is an incurably bug-ridden mess.
What truly separates the great programmers from the journeyman programmers is architecture. What's puzzling is that architecture appears to be one of the simplest parts of the whole process, requiring in most cases little more than some pencil and paper calculations and a willingness to change.
The simplest architectural problems to solve are the ones which for lack of a better theory most people ascribe to emotional or psychological problems. These are decisions for which there's no rational justification whatsoever. For example, writing a non-speed-critical program (which is most of them) in C or C++. A few years ago you could justify that because the other languages didn't have such extensive libraries, but today it's ludicrous. Another one is building one's protocol as a layer on top of webdav. And another one is building a transactional system for retrieving any subsection of any point in the history of an arbitrarily large file in constant time when that isn't part of project requirements. Yes, I'm making fun of subversion here. It's a great example of a project permanently crippled by dumb architectural decisions.
Half of these 'emotional' architectural decisions are dogmatically using a past practice in situations where it's inapplicable. The other half are working on interesting problems which have little or no utility in the finished product. Once decisions like these have been made, questioning them can become a political impossibility. If someone new comes in to a project with many man-years on it, and in their first week learns that there's a networking call which includes a parameter as to whether it should be blocking or non-blocking, and immediately declares that the entire codebase is a mess and difficult if not impossible to maintain, they'll almost certainly be correct and justified, but their opinion will likely be disregarded as brash and ill-informed. After all, they haven't spent the kind of time on the codebase that everybody else has. I've actually had this happen to me, and while others have claimed that there are more political ways of approaching such problems, my experience has been that once the truth becomes unthinkable a couple of people need to get fired before any improvement can be made.
My advice about technically unjustifiable architectural decisions is to not do them. If you find yourself doing them, you probably need to get laid or see a shrink or have a beer.
But what if you're emotionally well-adjusted, and want to get better at software architecture? Logging more hours at work will get you nowhere. When I wrote BitTorrent multiple other people were working on the exact same problem, most of them with a big head start and a lot more resources, and yet I still won easily. The problem was that most of them simply could not have come up with BitTorrent's architecture. Not with 20 code monkeys working under them. Not with a decade to work on it. Not after reading every available book on networking protocols. Not ever.
Clearly this isn't because BitTorrent's architecture is terribly difficult to understand. The entire approach can be understood without any really hard thinking in about an hour, with the possible exception of the state machine for the wire protocol, and even that is extremely simple as state machines go. The real difficulty in coming up with something like BitTorrent is that it involves fundamentally rethinking all of your basic approaches. This is very difficult for humans to do. We attack any new problem we encounter with techniques we already know, and try small modifications if difficulties turn up.
My suggestion for learning software architecture is to practice. Obviously you can't practice it by doing hundreds of projects, because each one of them takes too long, but you can easily design a hundred architectures for problems which only exist on paper, and where you strive to just get the solution to work on paper. Start by modifying the requirements of a problem you're working on. What if the amount of bandwidth or CPU was a hundredth of what it currently is? What if it were a thousand times? A million? What if you had a thousand times as much data? A million? A billion? What if the users were untrusted and you had to either prevent them from damaging the system or have a means of fixing things when they did? It doesn't matter if these scenarios are totally unrealistic; what matters is that they're different and that when you try to find architectures for handling them you take the inputs just as seriously as if you were about to start writing a system with those requirements for work. Try to find as many different approaches as you can, and come up with scenarios in which the stranger ones would be better.
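The pencil-and-paper part of this exercise can also be done in a few lines of code. What follows is only a sketch under made-up assumptions: the Scenario fields, the baseline numbers, and the idea that one pass over the data is limited by either bandwidth or CPU are all invented for illustration. The point is just that changing a requirement by a few orders of magnitude can move the bottleneck, which is the cue to go looking for a different architecture.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Scenario:
    """Made-up requirements for a paper design; every number is an assumption."""
    data_bytes: int        # how much data has to be moved and processed
    bandwidth_bps: int     # bytes per second the link can carry
    cpu_ops_per_sec: int   # operations per second the machine can do
    ops_per_byte: int      # processing cost for each byte


def bottleneck(s):
    """Return (limiting resource, seconds) for one pass over the data."""
    transfer = s.data_bytes / s.bandwidth_bps
    compute = s.data_bytes * s.ops_per_byte / s.cpu_ops_per_sec
    return ("bandwidth", transfer) if transfer >= compute else ("cpu", compute)


BASELINE = Scenario(data_bytes=10**9, bandwidth_bps=10**7,
                    cpu_ops_per_sec=10**9, ops_per_byte=5)

# The same problem with the requirements pushed around, as the text suggests.
VARIATIONS = {
    "baseline": BASELINE,
    "bandwidth x1000": replace(BASELINE, bandwidth_bps=BASELINE.bandwidth_bps * 1000),
    "cpu /100": replace(BASELINE, cpu_ops_per_sec=BASELINE.cpu_ops_per_sec // 100),
    "data x1000000": replace(BASELINE, data_bytes=BASELINE.data_bytes * 10**6),
}

if __name__ == "__main__":
    for name, scenario in VARIATIONS.items():
        resource, seconds = bottleneck(scenario)
        print(f"{name:>16}: limited by {resource}, about {seconds:g} s")
```

With these particular numbers the baseline is bandwidth-bound, while giving it a thousand times the bandwidth or a hundredth of the CPU makes it CPU-bound, so a design tuned for one case is the wrong design for the other.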
Learning these skills takes time, but is definitely worth it. I couldn't have come up with Codeville's architecture without first having spent a lot of time working on voting algorithms. Not that voting algorithms have anything to do with version control, but the process of coming up with example scenarios and defining the behavior which should happen in each of them carries over very well.
December 21 2004, 08:03:16 UTC 7 years ago
Re: Hey Bram
That depends what sort of programmer you wish to become. If you want to dabble in computers because computers seem important to you, I suggest learning Python. If you want to learn computers because you're fascinated by their inner workings, I suggest learning C (not C++, that has a lot of extraneous cruft). Most people fall into the first category.
BitTorrent took two years of full time work to get to the not sucking stage, and another year to become reasonably mature.
December 25 2004, 22:42:59 UTC 7 years ago
Re: Hey Bram
Python is a good introductory programming language, not least of which because programmers with very good taste like to use it. That means the code you encounter on the net that's written in Python is usually of pretty good quality.
Plus, it's very common to find Python code that has unit tests.
January 19 2008, 17:32:13 UTC 4 years ago
Learning C (not C++)?
If you're fascinated by the inner workings of the computer, learn assembly language! Until you understand what an individual instruction can or cannot do, you won't really understand the details. C is not really that much closer than VisualBasic.
I wrote plain C for years, before C++ was common. Yet I'd say it's kinda dumb to learn C and no C++ at all.
I've interviewed hundreds of programmers. I get quite a few who say, "The only C++ features I really use are vector and string (and sometimes hash_map.)". This is a very logical attitude -- the overwhelming majority of programs use strings and lists. Do you really want to roll your own -- do you really think you can do better than the STL?
The other essential part of C++ is destructors. As a long-time C programmer I can tell you that the largest single source of C errors is resource leaks, the second being errors involving char* pointers and stdlib (e.g. not allocating that extra byte for the \0 -- which might work fine 99% of the time and yet cause random intermittent errors...)
The one interesting thing you learn from learning C is how to deal with pointers. Pointers are Hard -- there's some question in my mind as to whether they're Hard Useful or Hard Dangerous (you can certainly shoot yourself in the foot much more easily with pointers) -- but once you have them mastered you can often generate some astonishingly fast code.
Anonymous
December 21 2004, 05:59:04 UTC 7 years ago
Codeville architecture
So how about a post about Codeville architecture, then, eh? ;-)
Your competito^Wcolleagues are curious!
(I guess the answer might be "then go to CodeCon", but while Graydon and I talked about it some, he's busy then and I was busy now (and don't really have $80 to throw at it anyway), so no Monotone submission. Maybe some other time.)
-- Nathaniel Smith <njs@pobox.com>
December 21 2004, 08:11:11 UTC 7 years ago
Re: Codeville architecture
Codeville's documentation, especially its architectural documentation, has lagged far behind its implementation, mostly because implementation is a higher priority than documentation, especially when we haven't even hit 1.0 yet, and there's still the occasional significant change.
What I'd really like to see is a paper comparing the architectures of Darcs, Monotone, and Codeville, although I'm not sure that there's a single person who's grokked two out of those three systems.
Are you local to the San Francisco area?
Anonymous
December 21 2004, 10:38:35 UTC 7 years ago
Re: Codeville architecture
I noticed that ;-). Really, I'd love to see just the one-piece-of-8.5x11 version.
Yeah, interesting idea. I grok Monotone relatively deeply, obviously, but I haven't had a chance to actually play with Darcs or Codeville yet. I think I have a pretty good idea of what the ideas in Darcs are, at least the 1000-foot view, but not Codeville at all. I suspect writing such a paper would be a fascinating exercise for the author -- this space is so rich in ideas that refuse to stay put; it seems like every time I turn around I realize there's yet another completely different way of looking at some piece of the problem domain. Some days I think, say, Arch/Darcs/Monotone are deep-down basically identical, some days completely different at every level... trying to pin down a comparison on paper sounds mind-bending :-). I also bet that if 3 people wrote such papers, you'd get 3 completely different conclusions... it's interesting that AFAICT all the VCS comparison pages out there are just completely useless. How to usefully parametrize the space of tools is as much of a research question as everything else in the area...
Yeah, sometimes -- IIRC we actually met very briefly a few years ago, at some EFF thing at the UC law building. Why?
- Nathaniel
December 21 2004, 10:19:27 UTC 7 years ago
Age and development
I can't help wondering -- I'm 16, and in high school in San Diego, CA. I've been programming for perhaps 2 years, and have gotten fairly good at it for a kid, but nothing compared to most of the programmers I meet and talk to. After studying code in college, for example, does writing applications (especially with GUIs) get easier? Or should I give up for "not having the gift"? I know you're not a counselor, but answering this question would mean quite a lot to me.
- David
Anonymous
December 21 2004, 10:49:42 UTC 7 years ago
Re: Age and development
http://www.norvig.com/21-days.html
Anonymous
December 21 2004, 15:52:43 UTC 7 years ago
Re: Age and development
I learned how to program GUIs while I was still in high school. In fact, I had only known C++ for 5 months at the time. I think the key is the API you choose. I learned the BeOS API which was (is?) quite easy to wrap one's head around. I would recommend Qt nowadays.
As for the college question, nothing I learned in college really applies to GUI programming. College exposed me to many other languages and programming paradigms. While those are good experience, they aren't quite the same as learning an API.
So really, what I'm trying to say is that you are as ready to learn GUI programming now as you ever will be. Studying programming in college is just 4 years of practice with fundamental theory thrown in for good measure.
Anonymous
December 22 2004, 01:24:29 UTC 7 years ago
Bram...
Bram, How would I go about learning C? would you recommend any books? What are you confined to with C? as I know VB stops you from doing many things. If i took python as an option, is it possible to make such a complex and vast program such as bittorrent? and finally, Which software is needed to compile C?
Thanks,
Andy
December 22 2004, 01:39:56 UTC 7 years ago
Re: Bram...
I'm not Bram, but I'd recommend the Absolute Beginner's Guide to C. It worked for me :)
Anonymous
December 25 2004, 14:23:44 UTC 7 years ago
Re: Bram...
I'm not Bram, but I'll answer your questions...
> What are you confined to with C?
The only thing you can't do in C is take advantage of architecture-specific processor instructions (and some compilers _can_ do this). There are no restrictions but no safety nets either.
Advantages: Very fast, no restrictions, highly portable
Disadvantages: You have to keep track of your memory. This is a pain, a royal pain. Also, it's difficult to build abstractions if you aren't used to the C way of doing abstraction.
> as I know VB stops you from doing many things.
This is why VB sucks.
> If i took python as an option,
I strongly recommend you learn python if you just want to make programs that work. C is considered a lower level language than python because it requires you to have a good understanding of how computers work and the details of making the computer work take up code and mental space that could be devoted to getting your work done.
I defer to Bram's earlier comment. Learn python if you want your computer to do stuff, learn C if you want to learn about your computer and do stuff in the process. Don't learn C++.
If you've programmed before I recommend 'dive into python' (available online). There are numerous resources available for beginners, but I learned python after I'd been programming a while so I have no recommendations there.
Advantages: short code, clean abstraction, multiple programming styles (procedural, object-oriented, functional, aspect-oriented (with decorators)), excellent library, good resources
Disadvantages: comparatively slow, Not good as resume fodder
> is it possible to make such a complex and vast program such as bittorrent?
The original bittorrent implementation (bram's) was written in python.
> and finally, Which software is needed to compile C?
You need a compiler (and a linker, but that's usually included)! In windows, if you don't want to fork over the cash for Visual Studio (which rocks, buy VS6 used or bum it off a friend) you can get the free gcc using cygwin. If you run Linux, gcc should be available on the system.
Anonymous
December 26 2004, 17:40:56 UTC 7 years ago
C in Windows
If you want a free development environment for Windows, check out Bloodshed Dev-C++. It does C or C++ and is reasonably user-friendly for starting out with either language. It's gcc underneath, but you aren't trying to use the (shudder) DOS prompt.
Anonymous
December 28 2004, 17:02:13 UTC 7 years ago
Re: Bram...
Again, I != Bram, but I would definitely suggest "The C Programming Language", 2nd Edition by Brian Kernighan and Dennis M. Ritchie. It's considered "the C bible" by most accounts, and it is what I was given in college. It's a no-nonsense book designed to teach the language. Carry it with you wherever you go; I do.
Anonymous
December 25 2004, 05:38:53 UTC 7 years ago
Subversion
Subversion also has a non-webdav server "svnserve" which uses the TCP based svnserve protocol. It's much faster. Many open source project repositories use plain svnserve. Not sure what you meant with your comments on transactions.
December 25 2004, 06:35:22 UTC 7 years ago
Re: Subversion
Well, they've now gone halfway to admitting that using webdav has been a failure. The other half would be to stop supporting the webdav version. Unfortunately for them, the webdav view of the world was the basis for how files are handled, as a result of which subversion doesn't support file renames, and never will. Don't believe the feature list. Subversion does 'renames' as a copy and a delete of the old version, as a result of which if one person moves a file and another one modifies it the change will be dropped, which is even worse behavior than cvs has.
The whole transactional file store thing is covered in Tom Lord's post diagnosing subversion. One thing not covered in that post is that the way that data structure is built on top of berkeleydb is also comically stupid, even if you assume that it's a worthwhile thing to build, which it isn't.
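To make the rename complaint concrete, here is a toy model of the two behaviors. It is a sketch only, not how Subversion or any real version control system is implemented: the function names, the dictionary-as-repository, and the rule that an edit to a vanished path is silently dropped are all assumptions chosen to mirror the scenario described above (one person moves a file while another modifies it).

```python
def merge_copy_delete(base, renames, edits):
    """Rename modelled as copy + delete, keyed by path.

    base    -- dict of path -> content at the common starting point
    renames -- list of (old_path, new_path) from one person
    edits   -- list of (path, new_content) from another person,
               using paths as they were at the starting point
    Returns (resulting tree, list of paths whose edits were dropped).
    """
    tree = dict(base)
    for old, new in renames:
        tree[new] = tree[old]   # copy
        del tree[old]           # delete
    dropped = []
    for path, content in edits:
        if path in tree:
            tree[path] = content
        else:
            dropped.append(path)  # nothing left at that path to apply to
    return tree, dropped


def merge_with_identity(base, renames, edits):
    """Rename modelled as changing the path of a stable file id."""
    ids = {path: n for n, path in enumerate(sorted(base))}  # path -> id
    paths = {n: path for path, n in ids.items()}            # id -> current path
    contents = {ids[path]: text for path, text in base.items()}
    for old, new in renames:
        paths[ids[old]] = new
    for path, content in edits:
        contents[ids[path]] = content  # the edit follows the file, not the path
    return {paths[n]: contents[n] for n in paths}


if __name__ == "__main__":
    base = {"foo": "v1"}
    renames = [("foo", "bar")]   # one person moves foo to bar
    edits = [("foo", "v2")]      # another person modifies foo
    print(merge_copy_delete(base, renames, edits))
    print(merge_with_identity(base, renames, edits))
```

In the first model the moved file keeps its old content and the edit has nowhere to land; in the second the edit ends up in the file at its new path.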
December 25 2004, 18:45:00 UTC 7 years ago
Re: Subversion
If you're basing your opinion of Subversion on Tom Lord's "Diagnosing" post, I hope you've also read my response, http://web.mit.edu/ghudson/thoughts/und
What's comically stupid about the way we built the data structure on top of Berkeley DB? And why are you convinced that Subversion will never support true renames?
Anonymous
March 6 2005, 05:23:53 UTC 7 years ago
Subversion rename problems not due to WebDAV.
The reason for Subversion's current (and regrettable) lack of true rename support has nothing to do with WebDAV. It has to do with the decision to use the same style working copies as CVS, followed by a decision to allow that working copy architecture more influence than it should have had.
If a system that does directory versioning supports renames, the obvious question is, what happens when someone commits one side of a rename but not the other? E.g.:
$ svn mv foo bar
$ svn commit bar # note that foo is not mentioned
Our answer was to break the rename down into a copy and a delete, so that if you commit just one side of it, you get either the copy or the delete -- which are already supported operations.
Now, I think this decision was a mistake. First of all, the repository should support true renames anyway, so that doing a rename without a working copy (which is totally supported) Does The Right Thing. And secondly, even in the working copy, I think it would have been better to simply disallow committing one side of a rename without the other.
My intention here is not to defend the lack of true renames, but only to point out that WebDAV had absolutely nothing to do with it.
December 26 2004, 07:29:26 UTC 7 years ago
Anyway, go ahead and finish writing Codeville. I think you'll find that a version control system that scales beyond basement projects is a lot tougher than you think, and that the vast majority of the user base doesn't have the same priorities as you do. If you're right and I'm wrong, I'm sure you'll be able to revisit this thread in a few years and be proud.
(In your review of the google results, you seem to have missed, among many other posts, http://svn.haxx.se/dev/archive-2003-0
November 4 2005, 13:50:29 UTC 6 years ago
How are architectural decisions made?
Hello there,
I try to grasp the meaning of your statement about "emotional or psychological problems" and being "emotionally well-adjusted" when doing architectural decisions.
Could you please elaborate on what you mean with that. How do emotions lead to technically unjustifiable architectural decisions?
Cheers
Daniel
Germany
PS: Maybe I just did not get your point because Englisch is not my native tongue. But I tried hard :-)
November 4 2005, 19:05:43 UTC 6 years ago
Re: How are architectural decisions made?
The most common problem is that people can't admit when they're wrong.
January 19 2008, 17:57:44 UTC 4 years ago
Two skills?
While this is true, I believe that the top two skills that self-taught programmers are missing are documentation and naming on the one hand, and decomposition on the other. Good naming and good documentation can only be learned from working with others. Just yesterday I was reviewing someone's class for our new team (call it team Foo).
He called it FooEngine. I pointed out that we're already in a "Foo" directory -- and this is general code not specific to our team -- so he changed it to Engine. If you were a stranger and you saw a class called Engine, would you have the faintest idea what it was doing? It actually exports individual lines of a CSV file, with retries... if we weren't going to break it up into three classes (see below) then I'd call it ExportCsvLinesWithRetry.
Decomposition is another thing you rarely learn on your own. If only one person is working on the code at a time, you have to pay a moderate cost for decomposing your design into disconnected parts, and you don't see the benefit. But a key to cooperation with others and to maintainable software in general is to decompose the design into the smallest reasonable pieces.
In the case above, when you split up the Engine into three parts, it turned out that there were good solutions to each of the three parts already in the codebase. So in fact nothing might need to be written, but more likely, we'll add some useful new functionality into a couple of places where other people can use it.
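A rough sketch of what that kind of split might look like, in Python and with plain functions rather than classes. The comment does not say what the three parts were, so the division below (parsing CSV rows, a generic retry helper, and an export step that combines them) is a guess, and every name here is hypothetical.

```python
import csv
import io
import time


def read_csv_rows(text):
    """Part 1: parse CSV text into rows. Knows nothing about export or retries."""
    return list(csv.reader(io.StringIO(text)))


def with_retry(action, attempts=3, delay=0.0):
    """Part 2: call action() until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the last error
            time.sleep(delay)


def export_rows(rows, send, attempts=3):
    """Part 3: hand each row to send(), reusing the generic retry helper."""
    for row in rows:
        with_retry(lambda row=row: send(row), attempts=attempts)


if __name__ == "__main__":
    sent = []
    failures_left = [1]  # make the first send fail once to exercise the retry

    def flaky_send(row):
        if failures_left[0] > 0:
            failures_left[0] -= 1
            raise IOError("temporary failure")
        sent.append(row)

    export_rows(read_csv_rows("a,b\n1,2\n"), flaky_send)
    print(sent)
```

Each piece has a name that says what it does, and the retry helper in particular is useful to code that has nothing to do with CSV files.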