Programming

Software is Uniquely Chaotic

In my continuing series on why software is special, motivating writing a book about it, this post discusses how software is chaotic.

Here I refer to the mathematical definition of chaos, which I will define as: "A chaotic system is one in which small changes in the initial conditions can cause large and unpredictable changes in the system's iterated state." This is based on the mathematical definition(s), but simplified for our purposes. It's not just a word, it's a quasi-formal concept.

Software is Uniquely Complicated

In 2007, with a well-loaded Linux desktop installation, my /usr/bin is 257 megabytes, with debugging off and dynamically-linked libraries not contributing to that count. My particular copy of the Linux kernel 2.6.19 with certain Gentoo patches has 202,381,268 bytes of C code alone. If I'm computing this correctly, at a constant 100 words per minute (5 chars/word), that's 281 24-hour days just to re-type the C code in the kernel.

The Programming Construction Metaphor

I've gone on before about how distrustful of metaphors I am, and it seems like every year I'm getting more distrustful of them. Either deal with the thing as it is, or just give up understanding it. Metaphors lead to the beginning of understanding, but no farther.

Programmers aren't immune to the metaphor sickness, and if there's one metaphor you can expect to see trotted out at the earliest available opportunity, it's the "programming as construction" metaphor. This metaphor has been skillfully deconstructed many many times before, but I'm going to deconstruct it from the opposite angle... what if construction was like software engineering?

Programming is Uniquely Difficult

Engineers of other disciplines often take offense at the claim that software is uniquely difficult. They do have a point. As pointed out by Fred Brooks in hyper-classic The Mythical Man Month, one reason software is hard because software is so uniquely easy. We fundamentally built on top of components that have reliability literally in the 99.999999999% range and beyond; a slow 2GHz CPU that "merely" failed once every trillion operations would still fail on average in eight hours at full load, which would be considered highly unreliable in a server room.

Why Do We Care What's Special About Programming?

The first step to attaining wisdom is to understand why special "programming wisdom" is needed in the first place. It's easy to get the idea that software is easy to create, because it is partially true. Computers get more powerful every year, and we trade in on that power to make programming easier. Every year results in more and better libraries. Changing software is very easy, and it's relatively easy to test compared to an equivalently-complex real-world object.

Big Haskell Projects List

On programming.reddit.com, I have referred to a list of Big Haskell Projects I'm keeping. I've been keeping it in my head, because it's been short, but I thought I should go ahead and start actually keeping one.

Haskell as a language intrigues me, but I can't help but notice that there aren't a lot of large projects that use it. My intuition suggests that this could be difficult, because I suspect the type system may become increasingly unwieldy. I once asked about this directly, and the results were less than impressive. I'm asking for examples of large projects because concrete results trump my intuition.

In the interests of honesty and transparency, I'm actually going to keep this list in a public place and try to keep it updated as people suggest things until either A: It satisfies me and I start trying to learn Haskell or B: It becomes too much of a time sink, which would basically mean that there are a lot of large projects. That is, my goal is not to keep a list in perpetuity, but just until the point is made.

Update March 8, 2007: This has been up for three weeks now and attracted some attention from some people who really ought to know. I'm willing to say now that pending further updates, I see no compelling reason to believe that Haskell is practical for larger projects. Furthermore, Haskell arrogance is totally unjustified by evidence. It may someday be proven a compelling choice for Real Programming, but there isn't even any significant evidence of that, let alone enough to justify arrogance.

My New Favorite Spam

I've seen this sort of spam before, but never with such purity, usually only with the Subject or something left unprocessed. Date: Sat, 17 Feb 2007 16:42:06 -0500 Received: from 192.168.0.%RND_DIGIT (203-219-%DIGSTAT2-%STATDIG.%RND_FROM_DOMAIN [203.219.%DIGSTAT2.%STATDIG]) by mail%SINGSTAT.%RND_FROM_DOMAIN (envelope-from %FROM_EMAIL) (8.13.6/8.13.6) with SMTP id %STATWORD for <%TO_EMAIL>; %CURRENT_DATE_TIME Message-Id: <%RND_DIGIT[10].%STATWORD@mail%SINGSTAT.%RND_FROM_DOMAIN> From: "%FROM_NAME" <@FROM_EMAIL> %TO_CC_DEFAULT_HANDLER Subject: %SUBJECT Sender: "%FROM_NAME" <%FROM_EMAIL> Mime-Version: 1.0 Content-Type: text/html Date: %CURRENT_DATE_TIME %MESSAGE_BODY Also note that the bit starting with %TO_CC_DEFAULT_HANDLER was the beginning of the messaged body, which is also wrong.

Something's rotten in the state of [web] Devmark...

I've got a Django review in the pipe, and it's generally positive. It's still cooking both so I can gather more experience, and while some fact-checking occurs.

But there's something still missing that I can't put my finger on. I think cutting away even more of the general cruft of making web apps is bringing it out. I've been programming on the web for nearly ten years now, and something's not right. Despite the fact that I have gotten generally good at programming, and I keep refactoring and refactoring, I keep writing the same web page over and over again: I've got a tree of heterogeneous objects that I need to render, which itself a view of a graph-like structure, I allow the user to manipulate it somehow, and I have to propagate those changes back to the database. And for some damn reason, no matter what I do, there's always something super-special about this form that requires special treatment and requires further extension of the frameworks or libraries I'm using.

(Note this programming post meanders a lot; part of the point is that I don't entirely know where I'm going with this.)