Unlocking Your Hidden Data Potential
I have too much data, and no good way to get at it. But if developers could make some slight adjustments, it'd open up a world of possibilities.
January 25, 2010
I seem to have some really interesting data on my PC and my PDA, but I can't get at it. I'll bet you do, too, and with a few small changes in the desktop and mobile app world, we could probably learn some pretty interesting things.
I'm not talking about business data; I've got all the reports I need in that area. No, I'm talking about all the personal databases that I—and, probably, you—accumulate constantly. What sort of "personal databases" and queries am I talking about? Well, for example, my calendar. It's a database, but I can't seem to easily get it to tell me how many hours I've spent in airports last year. My Contacts is also a database, but I can't get it to help me clean out the useless entries by telling me which contacts I've not even looked at in at least five years. Adobe Lightroom is a database of my photos, but it won't tell me if, out of all relatively few really good photos that I've taken in the past few years, whether I took those photos with just one or two particular lenses, or if there's no relationship between lenses and picture quality in my case. Heck, it might even be fun to correlate the nutrition log I keep on my PDA to Lightroom to see if picture quality's more a function of what I ate that morning. Yes, friends, there's gold in them ther' bytes, and those examples just scratch the surface. I think that the notion of being able to mine our own data is not an impossible one, and here's how I think it could be possible.
History of Proprietary/Open Debate
Of course, I don't mine those databases, because it's either too much trouble or simply impossible. Their developers locked my data into proprietary formats, and they did that because, well, it's easiest to do things that way. Now, that's not a slam at those folks, it's just history repeating itself. When I started writing apps on mainframes in 1973, most people saved their application data in files formatted in some proprietary, home-grown manner. That worked fine... until some other program needed to use that data. As time went on, third-party tool vendors created programs that simplified writing apps that could easily store data in a standard, easily-accessed manner; such database management systems (DBMSs) were an advance, but they were pretty expensive, and ended up creating little compatibility universes of their own: writing code to extract data from a Pick database looked nothing at all like the code needed to pull data from DB2 database. A 1970 paper suggested a standard database query language—what eventually became SQL—but affordable SQL implementations didn't appear for nearly 30 years, like MySQL and Microsoft's cut-down versions of SQL Server: MSDE and, later, SQL Server Express Edition. (By now, you've probably divined the crazy part. Yes, I think SQL's the answer. No, it's not crazy. Read on.)
Nowadays, many business-class applications store their data on something SQLish and they do that for three very good reasons. First, letting a DBMS manage your app's data is, in the long run, just plain easier for the application's developer. (Trust me, I've done it both ways.) Second, including DBMSs in many apps is cheap or free and actually isn't all that hard to code (it's usually just one SQL table). Third, providing the customer access to his data via SQL says to that customer, "It's your data... use it as you like!"
Simple Steps to Unlocking Data
Any application that keeps track of me contains data that is of interest to me, but the app's developer couldn't possibly guess what I might want to do with that data at some point, so in a world where there are excellent, free database engines that support SQL, there is no real reason why an application designer shouldn't allow that application's users to optionally store that app's data on a cheap SQL server, so that I can use any of a number of SQL query tools to poke around that data.
Now suppose that app vendors start offering us the option to store data on a SQL server, then we'll need a SQL server to hold that data, so how where are we going to get a cheap, simple SQL server? I can think of two places. Either download a prebuilt virtual machine with MySQL already running on it (search "turnkey mysql appliance") and run that VM under VMWare's free Player, or download SQL Server Express 2005 from Microsoft's site and follow a set of instructions at my website at www.minasi.com/newsletters/nws0509.htm to get a SQL engine running on your desktop. (Neither alternative is childishly simple, but they're both doable and as time goes on, "no-brainer" SQL appliances will continue to get better.)
Finally, once we've created this nifty new world of SQL ubiquity, how would we query those SQL databases? MySQL has a set of free tools that include a Web-based SQL query tool. SQL Server Express edition has a command-line query interface, but probably the easiest front-end to Microsoft's various versions of SQL Server is Access, if you've got a copy of that handy, or Excel, with some work. Additionally, a web search on "SQL query front-end" will yield a goodly number of hits.
So, what do you say, app vendors? With your assistance, PC users could soon become some of the most self-aware folks on the planet, and how often does one hear of the possibility of achieving such a lofty goal with a relatively small amount of effort? Of course, it could lead instead to creating a class of people afflicted with the new disease of "data solipsism," but hey, every technology has its potential downside, eh?
About the Author
You May Also Like