Tag Archives: coding

Data Structures Are Dead…Long Live Data Structures!

Eons ago, when I was first learning how to code, we spent a fair amount of time focusing on data structures at the heart of development. In that time, both memory and fixed storage were expensive and scarce, so by learning how the base structures worked (and learning how to code them up from scratch), we could pick the most economical version and save an extra kilobyte of memory or two — a precious commodity!

These days, neither memory nor storage are terribly expensive nor scarce (although in our networked environment, there are still reasons to conserve). So it may not be much surprise that there are many programmers who couldn’t write an implementation of a doubly-linked list or b-tree.

Yet it’s not stopping them from being productive, great developers. Why? Because in today’s framework and higher-level language-driven world, those lower-level structures are often abstracted away or handled behind the scenes. Take Cocoa, for instance, which has classes for arrays, sets, dictionaries, and more — all with optimizations and methods to do querying, modification, and more built in.

Sounds like unbridled progress to many, and I certainly agree with the ‘progress’ part. But unbridled? I’m not so sure. I continue to think there’s value in knowing how the fundamental data structures work — for at least two reasons:

First, in knowing the internals, you can make more informed decisions about which to use when. With Cocoa, for instance, Apple has released a whole set of tech docs about the best uses for each of the “collection” types. It’s helpful and well-written, but mostly superfluous if you have the underpinnings needed to internalize that kind of thing.

Second, there’s still a role for optimization. Hardware and networks continue to get faster and frameworks continue to get more efficient — but it is poor form to rely on those increases when you don’t have to. And to optimize, there’s really no substitute for knowing the basics of how things work.

So maybe we’re seeing the slow erosion of understanding about low-level data structures like stacks, queues, and heaps. And for some, maybe that’s OK. But I’m not ready to see them go just yet.

The Paradox of Black Boxes

When I was a young programmer, I was developing an application for a client — it may have been for Windows 3.1 or Mac System 6, I don’t remember — and my code just wouldn’t work. I banged my head against it for quite a while, and just couldn’t see any errors.

So I made an intellectual leap. One of youth and hubris, in part. I concluded that there was a bug in the system’s API. Somehow, in the millions of lines of code that comprise the (then-new) operating system, a bug had been placed that was causing my code to fail.

Incensed, I dashed off a bug report and posted indignantly to a programming community (probably on CompuServe or AppleLink). My unhappiness at the bug was trumped only by my self-righteousness at finding someone else’s.

But of course, you know the end of this story. My indignation and feeling of superiority evaporated quickly when the programming community noted no such error in THEIR code, which took me back to my own — where I found my problem. My happiness at solving the problem was eclipsed by my embarrassment.

This may be an anecdote borne of, as I mentioned, youth and hubris, but it also derives from another phenomenon: the black box. My code was relying on someone else’s code, so when mine didn’t work, it was intuitive to blame the mystery “other” code.

This happens all the time today — likely more than at any time in recent memory — because of the rise of modularized code. It is trivial today to take code libraries from multiple places, call a remote API or two, and package it up as a new application. In fact, in some cases, it increases one’s efficiency by several orders of magnitude.

This is the virtue of the black box. We can drop one in whenever we need specific functionality and we don’t have to do much to make it work.

The difficulty, however, is in provenance. Who knows how that code was written, or how well it works? You can get a sense of the latter often by doing some web searching, but if you are using a narrow-niche kind of code where few use it, or you are using a “black box” in a corner-case/niche-y way, you may not be able to rely on the commons.

So what’s the answer? Surely it’s not to build everything yourself from scratch. If that were true, we’d all be releasing our own versions of iPhone operating systems to go with our applications, and it would be worse than the Android fragmentation! (Kidding, kidding. Mostly.) Rather, it’s to be judicious when using external libraries and APIs, to do so where you can degrade gracefully if something breaks, and to code defensively, handling any errors or anomalies that could involve the foreign code.

Because even though it’s unlikely, it’s possible that next time, it really WILL be the black box code that’s broken, and not my application. And that will make my younger self smile.

Finally! finally in PHP 5.5

PHP 5.5.0 was released yesterday, and there are a handful of helpful new features. Among them, there is one that looks the best to me — the try-catch block has the finally block added. After all, if you’re doing exception handling in your code with try-catch (and if you’re not, you really should be), there are often scenarios in which you want to execute some arbitrary code at the end of the block, regardless of whether an error got thrown or not.

In some cases, you can put it outside the try-catch block (at the end, of course), but there are at least two circumstances where that’s less-than-optimal. First, if the program flow is altered in the try or catch, yet you want to tack on some code at the end, you’re in luck. But in our case, most of the time, it’s a readability issue. We try to group code conceptually when we write, much as we do when writing in English. That means we use sentences and paragraphs. Well, when coding, we do the same thing. And the addition of the finally block will make that even easier.

Another nifty functional help is the addition of the list form in foreach structures.If you work with nested arrays, you can now unpack them in the foreach construct, exposing the underlying variables much more easily. Like the finally block addition, it’s not as though you couldn’t code around this limitation, but when you don’t have to, code gets more readable.

In a deja-vu moment for C programmers, you can now dereference arrays and string literals to access specific items and characters. So in PHP 5.5, ‘Square’[1] equates to ‘q’. Not too shabby.

You can now pass a function call to empty() to evaluate it, which is another keystroke-saver when coding. So if you have a function foo() that may return data or may return, say, false, you can evaluate empty(foo()) and see if false came back. Again, a minor assist, but an assist nonetheless.

There are other features in this release, too, including a new Zend optimization caching extension, some enhancements to the GD graphics library, a new yield keyword that allows you to do simple iteration without using as much memory, a class function that gives you the unqualified class name, and more.

It’s good to see PHP continuing to evolve and grow — it has come a long way since I first used it fifteen years ago!

Retro Geekiness: Sorting Out Sorting

As a child about thirty years ago, when I decided to transition from computer programming hobbyist to student, one of the big topics that represented that transition was algorithms. These “recipes for the computer,” as I was taught at the time, were the key — the secret to unlocking the potential of this great technology.

One of the first areas of algorithmic exploration I embarked on was sorting algorithms. Which, although it was probably a pretty haphazard selection, turned out to be a good choice. Even now, three decades later, knowing how sorting algorithms work and when to choose the right one comes up all the time.

But back then, it was a brave new world to me, and I was just trying to figure out how they worked at a basic level. It was a struggle — I knew that some were faster than others in certain cases, some really sucked all the time (using the vernacular of the time), but I was having a little trouble grasping the nuances.

Then one day, while I was idly flipping through my school district’s central film and video catalog (the depth of my geekiness as a child is unending), I saw — what’s this? — a video about programming! I quickly badgered the librarian to request it, and when it came, I beheld: Sorting Out Sorting.

This 30-minute video was created entirely on the computer in 1980 — an impressive feat by itself — by the University of Toronto. And it remains, I think, the best teaching material on sorting algorithms in existence. When I started teaching computer science, I went on to show it to several classes of students over the years, and while it’s certainly dated (oh, man, is it dated), the information remains stellar.

There really is no substitution for seeing the algorithms in action. These days, there are websites like this one that show you much of the same animations in a browser window, but the movie gives you more in-depth detail about each one.

This is by no means a film review website, but this is one I’d give two-thumbs up. Put some legwarmers on, turn down the Huey Lewis & The News for a minute, and check it out.