12/07/2009

IF(mp3=digital, createnewrecord, ctrl+A, Del)

I can't believe what a nerd I am. Look at this post I just wrote, and thought was a good idea! Can you believe the nerdy title I gave it? Wow. Anyway, posting anyway, as an example of the weird analytical stuff I actually think about during the day.

So I did this really stupid thing about a year and a half ago. While working as a karaoke DJ (this wasn’t the stupid part, okay?) I decided to copy over the external hard drive of DJ tunes to my own hard drive. I knew it wouldn’t be the best music of course, but I thought, here’s a chance to get all those classic party songs they put on those monthly mainstream DJ compilations, and well, I just never could turn down an opportunity to make my music collection more encyclopedic.

Big mistake.

Not only did I severely overestimate the ratio of “classic party songs” to pure crap, I also forgot to take into account that the guy whose hard drive it was is one of the most unorganized, non-encyclopedic people I’ve ever met. My music library became clogged with unlabeled, mislabeled, and duplicate tracks, most of which I didn’t want anyway, with their titles written in all caps. To the tune of about 200 gigs.

If you are not acquainted with the true depths of my analytical neurosis, let’s just say that such a poorly organized “library” has been a heavy weight bearing on my database soul for the past year and a half.

But never fear, dear reader, because I am working through it. Little by little, I am making my way through the genres and deleting, re-categorizing, consolidating, stripping, and re-writing the metadata. I first did “alternative”, “punk/hardcore”, “classical”, and “jazz” so at least I could listen to some music without going crazy. I got rid of the “other” and “uncategorized” categories little by little, and eventually consolidated “hip-hop”, “hip hop/rap”, “gangsta rap”, “rap/hip-hop”, and “hip-hop/R&B”. Last night, I finally finished “pop”. The only ones left are “rock/pop” and “rock”, which are large, but by this time I am being brutal with my deletion, so I hope to finish this week. If I don’t immediately recognize the name, it goes. If they have a single song I don’t like, it goes. If they have a single song with a Christmas theme… Ctrl+A, Del.
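
For the curious, the genre consolidation is the kind of thing that could be scripted. Here is a minimal Python sketch of the idea, not what I actually did (I did it by hand). The alias table and the function names `canonical_genre` and `tidy_title` are made up for illustration.

```python
import string

# Hypothetical alias table: messy genre tags on the left, one canonical name on the right.
# None means "a human has to decide where this goes".
GENRE_ALIASES = {
    "hip-hop": "hip-hop",
    "hip hop/rap": "hip-hop",
    "gangsta rap": "hip-hop",
    "rap/hip-hop": "hip-hop",
    "hip-hop/r&b": "hip-hop",
    "other": None,
    "uncategorized": None,
}


def canonical_genre(raw):
    """Return the consolidated genre, or None if the track needs manual review."""
    key = " ".join(raw.lower().split())  # normalize case and stray whitespace
    return GENRE_ALIASES.get(key, key)   # unknown genres pass through, lowercased


def tidy_title(raw):
    """Convert an ALL-CAPS title to capitalized words; leave mixed-case titles alone."""
    return string.capwords(raw) if raw.isupper() else raw


print(canonical_genre("Gangsta Rap"))
print(tidy_title("DON'T MATTER"))
```

A lookup table is about as dumb as it gets, but it makes every consolidation decision explicit and repeatable, which is the whole point.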

Throughout this process, I have had much time to lament how horrible the music player programs are at sorting music. I use iTunes primarily (iPhone user). But for sorting purposes, I also tried Songbird, MediaMonkey, Windows Media Player, and Winamp. They differ a little bit, but without buying extra modules, there really isn’t any improvement. The best thing one can do is to sort by a metadata category, and just brute force your way through it. Even so-called “duplicate” finders are pretty weak, with no way to specify how closely a supposed duplicate has to match its pair. And then, they are remarkably proprietary. iTunes is notorious (at least among the people who discuss music library databases online) for not letting anyone touch its library files directly. There are some AppleScripts out there for making some changes, but amazingly, it is very hard to re-organize a music library any other way than through the player’s own browser.
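
To make the duplicate complaint concrete: what I want is a finder that reports a similarity score and lets me choose the cutoff. A rough sketch in Python, using the standard library's `difflib`, might look like this. The function names and the 0.85 threshold are my own assumptions, and it only compares artist and title text, not the audio.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase and collapse whitespace so 'AKON ' and 'Akon' compare as equal."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Return a score from 0.0 (nothing in common) to 1.0 (identical after normalizing)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_duplicates(tracks, threshold=0.85):
    """tracks is a list of (artist, title) tuples.

    Returns (score, track_a, track_b) for every pair scoring at or above
    the threshold, best matches first.
    """
    pairs = []
    for a, b in combinations(tracks, 2):
        score = similarity(" - ".join(a), " - ".join(b))
        if score >= threshold:
            pairs.append((score, a, b))
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


library = [
    ("Akon", "Smack That"),
    ("AKON", "SMACK THAT"),
    ("Miles Davis", "So What"),
]
for score, a, b in likely_duplicates(library):
    print(round(score, 2), a, b)
```

Comparing every pair is quadratic, so this would crawl on 200 gigs of tracks at once, but going one genre at a time it would be bearable.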

Just so we’re clear, I’m talking about the music library, which is different than the actual mp3s on your hard drive. The library is basically a database file, in some derivative of XML, for organizing the track names, numbers, artwork, actual file locations, and other metadata for display through the player’s browser window. There is a re-write process between the file itself and the database (what iTunes calls “organizing”, or maybe “mediaTunes” now?) that will adjust the actual metadata of the mp3 to cohere with the library database.
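
To illustrate what I mean by a library file being a database, here is a small Python sketch that reads a toy XML library and groups the tracks by genre. The XML layout is invented for this example and is much simpler than any real player's library file; `load_tracks` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# A made-up, simplified library file. Real players use their own formats.
LIBRARY_XML = """
<library>
  <track id="1">
    <name>So What</name>
    <artist>Miles Davis</artist>
    <genre>jazz</genre>
    <location>file:///music/miles_davis/so_what.mp3</location>
  </track>
  <track id="2">
    <name>SMACK THAT</name>
    <artist>AKON</artist>
    <genre>other</genre>
    <location>file:///music/unsorted/track07.mp3</location>
  </track>
</library>
"""


def load_tracks(xml_text):
    """Parse the toy library into a list of dicts, one per track."""
    root = ET.fromstring(xml_text.strip())
    tracks = []
    for node in root.findall("track"):
        record = {child.tag: child.text for child in node}
        record["id"] = int(node.get("id"))
        tracks.append(record)
    return tracks


tracks = load_tracks(LIBRARY_XML)

# Group track names by genre: the library is just records you can query.
by_genre = {}
for track in tracks:
    by_genre.setdefault(track["genre"], []).append(track["name"])
print(by_genre)
```

Note that nothing here touches the mp3 files themselves. The library only points at them through the location field.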

Now, I know the reason for this is that so few people have the disposition for categorization that I have, but all the same, the databases available for media organization are abysmal. I don’t really see why: it is easy enough to add XML interpretation to a program. Your word processor can probably do it. But I guess in the effort to make media players as “cleanlined” as possible (i.e. iPod/iTunes-like), these features are abandoned in favor of tools that let the program do all the work.

And I’m not interested in trashing the iTunes mentality, because through it all, they’ve still put together an excellent media player. Sure, it’s a bit heavy for a media player program. And it has a tendency to do things “automatically” that really screw up, like losing user-uploaded artwork while trying to auto-download replacements, and we don’t even need to get into the DRM stuff. But for what is basically a front end for their music store, it is still pretty damn usable for someone like me, who has only bought maybe two things from the iTunes Store ever.

For example, I love the Smart Playlists. This is the sort of functionality I’m talking about. These are basically database queries, where you can define ranges of the metadata variables like “times played” or “date last played”, then add randomization and a cap on the total number of records. I have several personal “radio stations” made from these tools, and they work great. Of course, there is not as much flexibility as I would like. The same thing goes for the Genius function, which is basically a personalization query, based on variables iTunes doesn’t disclose. Of course, you can’t edit this, and for someone with 100+ gigs of mp3s and a computer 5 years old, it kind of gums up the works. But it’s the right idea.
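
Here is roughly what one of those “radio stations” looks like when written as an actual database query. This is a sketch in Python with SQLite, not how iTunes implements Smart Playlists; the `tracks` table, its columns, and the `smart_playlist` function are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE tracks (
           title TEXT, artist TEXT, genre TEXT,
           play_count INTEGER, last_played TEXT)"""
)
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?, ?, ?)",
    [
        ("So What", "Miles Davis", "jazz", 3, "2009-01-15"),
        ("Blue in Green", "Miles Davis", "jazz", 40, "2009-11-30"),
        ("Naima", "John Coltrane", "jazz", 1, None),  # never played
        ("Smack That", "Akon", "pop", 0, None),
    ],
)


def smart_playlist(conn, genre, max_plays, not_played_since, limit):
    """Pick up to `limit` random tracks in a genre that are rarely played
    and have not been played since the given ISO date (YYYY-MM-DD)."""
    cursor = conn.execute(
        """SELECT title, artist FROM tracks
           WHERE genre = ?
             AND play_count <= ?
             AND (last_played IS NULL OR last_played < ?)
           ORDER BY RANDOM()
           LIMIT ?""",
        (genre, max_plays, not_played_since, limit),
    )
    return cursor.fetchall()


print(smart_playlist(conn, "jazz", 5, "2009-06-01", 10))
```

The ranges are the WHERE clause, the randomization is ORDER BY RANDOM(), and the record cap is LIMIT. Dates are stored as ISO strings so that plain text comparison sorts them correctly.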

The thing I realized, while deleting 50+ copies of duplicate shit-club mixes of Akon’s three biggest songs of 2007, was that despite the hysteria about intellectual property insinuating that a song is infinitely replicable, and a mere collection of digital bits, we still don’t look at our music files as data. There is an aspect of the commodity in every mp3; it takes on more than what it is. An mp3, to a consumer, is purely the music experience, not the possession of data which can create the music experience. My DJ associate with bad file habits thinks to himself, I want this song in my music collection, and adds it in, with no thought of where it will go. When he wants to play the song, he searches for that particular track, and plays it. There is no browsing, no querying, no organization. The more duplicate tracks he has, with different spellings and different data in different categories, the more likely he’ll find an instance of it when he searches for it in the search bar. The entire analytical process is, Want->Get. It’s the purest sort of production/consumption there is.

This is good for record companies, who try to instill the fear that if they can’t make money, then you won’t have any more mp3s. Actually, with DRM, they’re probably right. But it isn’t true: being able to drag and drop an entire collection of mp3s proves the point. An mp3 is only data. Music has long since passed the point of expressive performance, and has entered the realm of digital data, along with many other aspects of our lives. Now expressive performance, the actual production and consumption, lives within the differences of binary digits.

So what are you going to do? Well, as any database administrator will tell you—stop doing that! That is, having poor data habits. We know to back up our data, and to be careful where we get our data, but now we need to learn to organize it. A well-kept database is a useful database. Only one item of data per variable, each record separate, no duplicates, proper linking conventions. Clean query programming. It’s just what makes sense.
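
As a sketch of what those rules could look like for a music library, here is a tiny SQLite schema driven from Python. The table layout and the `open_library` and `add_track` helpers are hypothetical, and the case-insensitive matching is my own choice, aimed squarely at the all-caps problem.

```python
import sqlite3

# One value per column, one record per track, artists linked by id,
# and uniqueness rules so duplicates are refused at the door.
SCHEMA = """
CREATE TABLE artists (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE COLLATE NOCASE
);
CREATE TABLE tracks (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL COLLATE NOCASE,
    artist_id INTEGER NOT NULL REFERENCES artists(id),
    genre     TEXT NOT NULL,
    UNIQUE (title, artist_id)
);
"""


def open_library():
    """Create an empty in-memory library with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_track(conn, title, artist, genre):
    """Insert a track; return True if added, False if it was a duplicate."""
    conn.execute("INSERT OR IGNORE INTO artists (name) VALUES (?)", (artist,))
    artist_id = conn.execute(
        "SELECT id FROM artists WHERE name = ?", (artist,)
    ).fetchone()[0]
    try:
        conn.execute(
            "INSERT INTO tracks (title, artist_id, genre) VALUES (?, ?, ?)",
            (title, artist_id, genre),
        )
    except sqlite3.IntegrityError:
        return False
    return True


conn = open_library()
print(add_track(conn, "Smack That", "Akon", "pop"))
print(add_track(conn, "SMACK THAT", "AKON", "pop"))
```

With the constraints in the schema, the database itself refuses the fiftieth copy of the same song, no matter how it is capitalized.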

Of course, no 13-year-old just starting their mp3 collection is going to do this. You just throw ‘em in a file as you download them. So instead of instituting my Universal Rules of Epistemological Fortitude, as I would like to do, I instead look to the media players. I want MS Access, with a media player function. I want to write combo boxes for my playlists. I want SQL queries in my mp3 searches. I want to add IF/THEN statements to my iPhone syncing. Maybe with some top-down redesign of software, we could start treating our mp3s as what they are: valuable data.
