A Less-Definite Article

“The.” Such a meaningless word. Such a cause of trouble for those of us who rely on the alphabet.

Take a look at my iTunes library. What’s the name of the band that originally recorded “Hey Jude”?

You say you know. I’m telling you that you don’t. I always have to take a guess at it…if I’m looking for it in my iTunes library. Thanks to the plurality of early iTunes users who submitted CD track listings to the CDDB while stoned, the Beatles catalogue is split between three different bands:

“The Beatles”



“Beatles, The”.

And why, yes, this does completely **** up browsability! I’m forced to weed these things out and fix them manually.

The “the” problem screws lots of things up. It’s “Spider-Man,” not “The Spider-Man.” But purists will insist that it’s supposed to be “The Batman.” And as big a fan of this band as I am, I’m not 100% sure if it’s “Foo Fighters” or “The Foo Fighters” until I consult a canonical source.

(Usually, I call up Dave Grohl. “How the hell did you get this number?” he shouts. “You know perfectly goddamned well that the court order forbids you from ever calling me or any other member of Foo Fighters!” And there’s my answer.)

All of this is simply part of our daily burden as free-thinking members of this planet’s alpha species. It’s on my mind tonight thanks to a conversation that Marco Arment has been having on Twitter about how his lovely new podcatcher app sorts show titles.

I don’t think that’s the right way to go. I’m looking at my list of podcast subscriptions and I reckon that by this scheme, about a third of the shows I regularly listen to will be clumped under “T.” I reckon that this is why so many music apps (like Google Play Music) will display “The Beatles” but sort it as though the band name starts with “B”. I reckon if I use the word “reckon” a third and fourth time and point that out, it’ll sound like I used it over and over again to be entertaining, when in truth, I was just lazy.
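The "display one thing, sort by another" trick is small enough to sketch. What follows is a minimal illustration in Python, not how Google Play Music or any particular app actually does it. The helper name `sort_key` and the sample list are made up for this example, and it only handles a leading English "the".

```python
import re

# Hypothetical helper for illustration: matches one leading "the" plus the
# whitespace after it, in any capitalization.
_LEADING_THE = re.compile(r"^the\s+", re.IGNORECASE)


def sort_key(title: str) -> str:
    """Return the string used for ordering; the displayed title is untouched."""
    return _LEADING_THE.sub("", title.strip()).casefold()


# Sample titles taken from the post; the list itself is just an example.
shows = ["The Bugle", "Spider-Man", "The Beatles", "Foo Fighters", "Beatles, The"]

# Each name is printed exactly as stored, but ordered as if "The" weren't there.
for name in sorted(shows, key=sort_key):
    print(name)
```

With this key, "The Beatles" and "Beatles, The" land next to each other under "B" and "The Bugle" follows them, while a name like "Them" is left alone because the pattern requires whitespace after "the".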

If you’re a stickler, you could just say “common rules of indexing command that ‘the’ be treated as though it were the last word in a business name or title.”

(Go on and check out the Chicago Manual of Style’s Q&A page about alphabetizing. It’s a hoot, and reminds me that if I took a job as a librarian, I’d last about three weeks before I shot myself in the foot to get a discharge stateside.)

But all of this skips over the real point, when designing software. Rules should be damned: the choice just has to make sense and it has to be consistent. The developer needs to ask “where will people expect to find ‘The Beatles’?” and act accordingly.

At some point, he or she just has to make the choice that feels right. Then, send baby out into traffic and see how well that choice works.

This is a good example of what I think of as a “big endian/little endian” problem. These terms have nothing to do with how data is stored in address space; I’m referring to the original Jonathan Swift idea of a society in which people who slice open their hard-boiled eggs from the little end can’t understand the people who slice them open from the big end because, obviously, their way is totally the right way to do this. The other way seems so bizarre that those people might as well be of some other species or something.

So: you can argue endlessly about the “right” way. But it’s almost (not quite) an arbitrary choice. By trying to satisfy people who will never agree with the “other” way of doing things, you’ll just screw things up for everyone. It’s best to just have a point of view and stick with it until user feedback makes you second-guess your choice.

Alphabetizing things will never work smoothly, anyway.

John Hodgman and Jesse Thorn refer to their show as “The Judge John Hodgman Podcast” on-mic, but it’s canonically listed without the definite article. Where do I find Elvis Mitchell’s swell entertainment interview show, “The Treatment”? Is it under “T” for “Treatment,” or “T” for “The”?

Trick question! Both words start with the letter “T”!

AHA! DOUBLE trick-question! Because it’s listed as “KCRW’s ‘The Treatment’”!

My mental eye paints White-Out over the “The” in a podcast title almost every time. But I never think of my favorite podcast as anything other than “The Bugle.”

This sort of problem goes way, way back. When I was a kid, Marvel Comics inflicted the first of what would become a decades-long string of abusive editorial decisions by renaming the comic “Peter Parker, The Spectacular Spider-Man” as “Spectacular Spider-Man.” Well, crap. Now where do I file these issues? I checked the indicia. Marvel didn’t start a new numbering scheme and this was still Volume 1.

Nerdy kids who grew up in the Eighties are united by two traumatic events that affected all of us: “Spectacular Spider-Man,” and the Challenger disaster. I am convinced that if I were to send a one hundred item questionnaire to 500 comic book fans, the answer to Question 1 (“How did you choose to sort your Spectacular Spider-Man comics?”) would let me predict the answers to many questions about the respondent’s views on politics and ethics, after all 500 sets of answers were submitted to proper analysis.

(My introduction to formal data structures came when I wrote an app to keep track of my comics. Through high school and college, I solved so many problems and added so many features to it. But I never figured out an elegant way to handle a comic that runs for 131 consecutively-numbered issues across three or four titles.)
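For what it's worth, here is one way that renamed-comic problem might be modeled, offered as a sketch rather than the elegant answer that never turned up. The idea is to treat the numbering as the series and each cover title as a range of issue numbers within it. The class names `Series` and `TitleRun` are hypothetical, and the issue boundaries in the sample data are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TitleRun:
    """A span of issue numbers published under one cover title."""
    title: str
    first_issue: int
    last_issue: int


@dataclass
class Series:
    """One continuously numbered series, whatever it was called at the time."""
    shelf_name: str        # the single name the whole run is filed under
    runs: list[TitleRun]   # cover titles, each covering a range of issues

    def title_for(self, issue: int) -> str:
        """Look up the cover title that a given issue number carried."""
        for run in self.runs:
            if run.first_issue <= issue <= run.last_issue:
                return run.title
        raise ValueError(f"issue {issue} is not in this series")


# Placeholder data: the issue ranges below are invented, not Marvel's real ones.
series = Series(
    shelf_name="Spectacular Spider-Man",
    runs=[
        TitleRun("Peter Parker, The Spectacular Spider-Man", 1, 100),
        TitleRun("Spectacular Spider-Man", 101, 131),
    ],
)

print(series.title_for(42))
print(series.title_for(120))
```

The filing question gets answered once, in `shelf_name`, and every issue still knows what was printed on its cover.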

What I’m saying is that alphabetizing things is a big mess…maybe the biggest mess there is, if ranked as a ratio of “how difficult this problem is” to “how difficult it appears to be.” I always expect, and hope, that “the” is invisible for sorting purposes…but I can forgive a developer for doing what makes sense to him or her.

All of this reminds me of a brilliant name for a band, which I came up with when I was a teen: “Miscellaneous M.” It guaranteed that the band would get its own divider in every store’s CD department even if it only released one album. The only way this scheme could possibly fail would have been if the entire market for physical media were to collapse over a short period of time.