Debug It!

by Paul Butcher

I saw this on Nate Matias’s bookshelf when I was stowing conference supplies in his office at MIT. Nobody tells me about good books in my field, so I take note of what people seem to be reading. Finally, I found some time.

It’s a good book. Debugging is a skill; half the questions I see at StackOverflow could have been resolved fairly quickly if the writers had better debugging skills. Butcher walks through the basics with patience and tact, showing inexperienced programmers how to find bugs and pointing out some kinds of problems that are tricky to find.

This is a book for the inexperienced, and it stops when things are getting interesting. Yes, asynchronous and multithreaded code is hard to debug. We knew that! And yes, doctor, we know that if it hurts when we do that, it might be a good idea to avoid doing that. Sometimes we can’t avoid it. Sometimes, the bugs appear when we’re already hip deep in the Big Muddy. Sometimes, the Big Fool says, “Management decided to architect it this way. Fix the damn bug.”

There are two kinds of tricky bugs. The first is the masquerade party: something is going wrong, it’s clear what the problem is, but that can’t happen. I had one of these publishing this post. A newly-rewritten method, ConstantTextSize, was crashing because when it called CreateAttributedString because, whatever it was getting from CreateAttributedString wasn’t an AttributedString. That’s nice, but it turns out that CreateAttributedString only returns AttributedStrings. I spent a lot of time proving that it could only create an attributed string. And then proving that the object it was complaining about was, in fact, an attributed string. So, why the crash? Because the text encoding was fouled up and we were looking beyond the end of the string, and an attributed string made up of garbage is not an attributed string as far as the system is concerned, no sir! So, we marched out of that river and now, when you use a whole slew of mathematical symbols in a post, Tinderbox won’t holler at you the way it hollered at me.

Masquerade parties take time but they’re kind of fun. The other tricky bug is the Epic Storm, and it’s no fun at all. Debug It! book avoids bogging down in war stories, but in a way that also falsifies the experience of debugging. I really hated Ellen Ullman’s The Bug, but it did get one thing right: an epic bug is a terrible thing. Everyone is screaming and shouting and generally emoting all over the place. Survival depends on finding the thing. Somehow, whatever is wrong keeps going wrong. You think it’s fixed, you pop the corks, you ship. And then it happens again. And again.

Over the years, we’ve had some awful epic bugs. They had names. Storyspace Ate My Links was the worst, but the Tinderbox unit tests have plenty of methods named for the great customers who filed the reports. I think we’ve only had two category five epic bugs in the last few years, though; test-driven development has helped, and perhaps experience and whisky have helped keep things calmer than in the old days. (One is fixed, the other is still open but we think it lies deep in the Carbon libraries where no program went before, and where no program is ever going to go again.)

One of the great contributions of Martin Fowler’s Refactoring was that it gave a name to one of the best approaches to addressing an intractable bug. If you can’t find the problem, and localization strategies can’t quite pin it down, you can just refactor the hell out of the neighborhood. This may or may not fix the bug, but at least it improves the code. Before refactoring, bug hunts left tracks all over the program in the form of hooks and temporaries and diagnostic writes. Nowadays, bug hunts leave mown lawns and cleaner design.