Tuesday, November 27, 2007

What's that Data Structures Class for?

I assume that when computer science students go through college, they all take a required course in data structures. If I were designing a course like this, I'd make them learn how a variety of useful data structures worked. Certainly if you read a book on data structures you'll learn this sort of thing.

How many programmers actually need this information? In today's world, there are a lot of libraries out there that have a reasonable implementation of AVL tree, hash table, and B tree. Certainly some people need to learn to write these things. Why does your typical student programmer care about different ways to handle collisions when inserting into a hash table?

Okay, I'll admit that programming without this knowledge for me would feel a bit like skiing naked. It'd be a bit too weird to choose an algorithm because someone told me that I should always use AVL trees because their worst case performance and therefore their predictability in the face of ignorance is better. For me, I'd rather be able to use a sorted array to improve locality of reference even if I don't know that there's a problem anywhere. I'm sure that at least 95% of the time that I've used an immutable sorted array it hasn't made a difference. I certainly don't check with a profiler unless a real problem exists.

Every so often performance does matter, though. It sometimes matters a lot. In that case, you might say that you need a programmer who knows to check the hash tables to make sure they aren't behaving horribly. However, a decent profiler is likely to tell you a lot of useful information without having any knowledge of data structures. Since these cases are rare for typical programmers, wouldn't they be just fine if they knew a collection of data structures that they could swap in until they got something that worked better?

I can't remember many times when I've had to swap out a data structure because of performance issues. The last time was inserting into the end of a very large STL vector. The copies were too expensive due to C++'s tendency to copy things too often. (Even this case will be fixed with the next C++ standard.) Anyway, STL has some other data structures that can be used. I was able to replace my data structure and things immediately improved. I can't remember enough details to know what knowledge I needed. It's also possible that I guess correctly more often than an ignorant programmer would. Who knows how many times they might need to swap things randomly?

A C# programmer would have it even easier. The System.Collection namespace in .NET doesn't have a lot of different options. It'd be pretty easy to try all the possibilities pretty quickly. If none of the options solves the problem, it's entirely possible that there's something that could be done elsewhere.

Memory and speed performance are pretty much the only times you might care about the differences between a hash table and an AVL tree. A few years from now, application programmers may just add a bit of concurrency if they want more speed. Few web applications run low on memory. Are data structures classes useful anymore for typical programmers?

I've left a lot of unanswered questions in this post. I'm really curious about the answers. I'd love to hear from you.

This originally appeared in PC-Doctor's blog.

Wednesday, November 7, 2007

Model View Controller and Beautiful Code

I've been reading Beautiful Code. It's a fun book with chapters written by a few well-known programmers about some beautiful code that they've written or seen. They all have wildly different views on what makes code beautiful, so it's pretty entertaining. It's fun seeing what programmers I've heard of think is beautiful. Brian Kernighan, for example, wrote about some string processing code. I found that extremely entertaining and ironic. Yukihiro Matsumoto had an impressively concise description of what it takes to make beautiful code. It was refreshingly in character with the Ruby philosophy.

Most of the authors were completely unknown to me. However, even though the most obscure is more famous than me, I'm still going to tell you about some beautiful code that I run into frequently.

The Model View Controller (MVC) design pattern is, perhaps, one of the oldest out there. It's been used for almost 30 years now, and it is still an extremely clean way to divide up the responsibilities of a user interface. When I see it implemented well, it's extremely satisfying.

For those of you who grew up using Microsoft's GUI frameworks, you may have never seen this done well. .NET Forms and MFC make it difficult to implement it correctly. (In fact, this topic was inspired by me working simultaneously on a Rails and a MFC project.) Ruby on Rails, by the way, revolves around the MVC design pattern.

What's so beautiful about the pattern?

It cleanly separates three different activities. The separation is clean enough that the pattern is relatively easy to use. It's usually easy to figure out whether some functionality should go in the model, the view, or the controller.

Furthermore, even in a highly interactive GUI, the program becomes easily testable. The view is the only part that interacts directly with the user, and, other than interacting with the user, the models and controllers have all of the functionality. (This statement becomes less true if you're writing custom controls such as a text editor.) If you test the models and controllers, then you've tested a large fraction of the code.

I always smile when I see some MVC code that works well.

This originally appeared on PC-Doctor's blog.