Mike Schaeffer's Blog

June 21, 2006

Whither XML?

I was skimming Vincent Maraia's (he has a blog!) book, The Build Master: Microsoft's Software Configuration Management Best Practices, and ran across the following quote:

Thus, if you want to learn one language that will cover many tools and technologies no matter what platform you are working on, that language is XML.

On the surface, this is a true statement. Since its introduction, XML has shown up virtually everywhere: it has been used for everything from configuration files to RPC protocols. Better still, all these XML documents share the same understandable syntax and can be parsed by the same standard tools (which exist virtually everywhere). If you want to work with XML files, chances are your favorite text editor has built-in XML support; if you want something more structured, Excel has a powerful XML import capability, as do most databases. However, as nice as all this is, it glosses over one fundamental fact: XML, by itself, doesn't mean anything.

Saying that a document is in XML is basically the same as saying it's in CSV: it tells you the format that contains the data, but it doesn't tell you anything about the data itself. There are still a bunch of unresolved questions: What tags are supported? How are attributes parsed? What do the tags actually mean? These are the types of questions you'll find yourself asking about five seconds after receiving an XML document in a new format. While some of these questions can be answered by a Schema or DTD, the last question, the key question, isn't addressed by a schema at all. As anybody who has had to reverse engineer an otherwise unknown XML document can tell you: even if you know a document is in XML, the things you really care about are still left unspecified.
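To make that concrete, here's a small sketch in Python using the standard library's xml.etree.ElementTree. The two documents and the helper name leaf_values are made up for illustration; they aren't from any real format. The point is that the same generic parser handles both documents happily and still tells you nothing about what the values mean.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents, invented for this example.
INVOICE = "<invoice><amount>1500</amount><date>03/04/2006</date></invoice>"
READING = "<reading><amount>1500</amount><date>03/04/2006</date></reading>"


def leaf_values(xml_text):
    """Parse any well-formed XML; return (root tag, {child tag: child text})."""
    root = ET.fromstring(xml_text)
    return root.tag, {child.tag: child.text for child in root}


invoice_tag, invoice_fields = leaf_values(INVOICE)
reading_tag, reading_fields = leaf_values(READING)

# The generic parser treats both documents exactly the same way...
print(invoice_tag, invoice_fields)
print(reading_tag, reading_fields)

# ...but nothing in the syntax says whether 1500 is dollars, cents, or
# millivolts, or whether 03/04/2006 is March 4 or April 3. That knowledge
# lives outside the document, in whatever defines each format.
```

Both calls succeed and return identical field dictionaries, which is exactly the problem: the parsing step is the easy part, and the interpretation is left entirely to the reader.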

So, XML isn't really the 'language' Mr. Maraia states it to be, which is why his comment is so optimistic. While it is true that XML is useful and universal, you'll also need to learn the schemas for the documents you'll be working with; that's where the bulk of the work will be. (Syntax is generally an easy thing to learn, and developing syntax processing code is a well understood branch of computer science.) So, while you should learn XML, if learning XML itself is the kind of decision you have to mull over, you probably aren't prepared for the steps you'll be taking immediately after learning it. (This is particularly true if you're using XML to configure build tools, as in Maraia's book. If you're in that role and are having trouble with XML, you should just quit now.)

Erik Naggum, of comp.lang.lisp fame, summed this up quite nicely:

Structure is nothing if it is all you got. Skeletons spook people if they try to walk around on their own. I really wonder why XML does not.

Tags:tech
June 21, 2006

The abominable dialog box...

Every once in a while, my Windows XP laptop decides to display this little jewel of interaction design:

[Image: Update Dialog Box]

If you haven't seen it before, this dialog box basically means that Windows has downloaded a system update that needs to restart the system to install. Once it appears you have three choices:

  • Do nothing - The system will wait for five minutes, filling in the progress bar, and then forcibly restart your computer.
  • Click 'Restart Now' - Your computer will restart now.
  • Click 'Restart Later' - The dialog box is dismissed and will reappear in 5-10 minutes.
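Those three choices can be written down as a tiny model. This is only a sketch of the behavior as described above, not anything taken from Windows itself: the names Choice and handle are made up, and the 10 minute figure is just the upper end of the "5-10 minutes" range.

```python
from enum import Enum


class Choice(Enum):
    DO_NOTHING = "do nothing"
    RESTART_NOW = "restart now"
    RESTART_LATER = "restart later"


COUNTDOWN_MINUTES = 5      # the progress bar fills for five minutes
MAX_REAPPEAR_MINUTES = 10  # assumed: upper end of "5-10 minutes"


def handle(choice):
    """Return (event, delay in minutes) for one appearance of the dialog."""
    if choice is Choice.RESTART_NOW:
        return ("restart", 0)
    if choice is Choice.DO_NOTHING:
        return ("restart", COUNTDOWN_MINUTES)
    # 'Restart Later' only dismisses the dialog; it comes back.
    return ("dialog reappears", MAX_REAPPEAR_MINUTES)


for choice in Choice:
    print(choice.value, "->", handle(choice))
```

Two of the three branches end in a restart, and the third just puts you back at the same dialog a few minutes later.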

In short, once this dialog box appears, you're screwed; Windows is so dead set on the immense value of whatever unknown update it's downloaded that it's going to restart your computer and close every open application, document, and network connection whether you like it or not. It doesn't matter if you're surfing the web or working on a cure for cancer: this dialog box basically means that 1) whatever you're doing is less important than the system update and 2) you aren't allowed to decide otherwise for yourself.

To see why this is true, consider what the dialog box does not have:

  • There is no way to see a list of what updates are being installed.
  • There is no way to control how long the system restart is deferred.
  • There is no way to launch the page of the system control panel that controls automatic update.

None of this stuff is rocket science. My guess is that no more than 2-3 person-weeks of developer time would be enough to put together a first-rate and fully documented implementation of all three enhancements: the first already exists elsewhere, and the second and third are both small bits of functionality. Because of this, I'm prepared to guess at the reason features like these weren't implemented: a conscious decision was taken not to. Microsoft essentially decided that users who didn't disable automatic updates from the control panel would not be allowed to make their own decisions about when patches are applied. For some reason it made more sense to preemptively restart users' computers. Is this kind of knowingly invasive behavior really justified by the risk?

June 13, 2006

Excel links...

Recently, I was asked for advice on how to learn Excel macro programming. I'm not the expert, but I read sites and books written by folks who are. Here are some useful links.

Websites:

Books:

The last book is particularly interesting: it focuses on the C/C++ API to Microsoft Excel. Since Excel 5, when Microsoft introduced VBA and the Excel COM object model, this programming mechanism seems to have fallen out of vogue, but it is still being maintained, and represents a way to write really fast and secure Excel add-ins.

May 24, 2006

A good couple of weeks for laptops...

Sorry for the long delay between posts... it's been a busy couple of months.

  • The "One Laptop Per Child" Prototype - This is the result of the famous $100 laptop project at MIT. It looks cheap, simple, and bulletproof. When the machine is closed, it's sealed well enough to tolerate rain and sand storms. The machine has also been designed with an unusual display that offers good contrast in sunlight and 200dpi resolution for reading electronic books. For developing countries, this seems like the most valuable use case: a cheap way to disseminate lots of textbooks and information to folks who would not otherwise have access. The display can also switch into a more conventional color mode for running Fedora Linux. To be frank, the unique attributes of these machines seem like they'd be useful to almost any laptop user. The OLPC laptop can be pre-ordered in the US for $300, triple the price: the extra money goes to fund two laptops for children in the developing world.
  • Dell D620 - I saw one of these in the Penn bookstore the other day and came away impressed. Not only did Dell do the usual CPU/chipset upgrades, they switched to a widescreen display that'll work better in an airplane seat. The price seems pretty much the same as the older D610, so this looks like a pretty good value for a metal chassis 'professional' laptop.
  • Samsung Q1-SSD - This is the first laptop with a fully solid state disk: the usual rotating media has been replaced with a 32GB flash drive. This quiets the machine down and provides good disk bandwidth for faster boot times. I'm also willing to bet that 'disk' access times are better by orders of magnitude. This is Korean-market-only right now, but it's hopefully a positive sign of things to come: 32GB is enough space to do lots of interesting things, and if this allows smaller, quieter, faster, lower-power computers, I can hardly wait.
  • Apple MacBook - A 13" widescreen laptop running Mac OS X with a Core Duo, priced starting at $1099. This seems like a bargain.
Tags:tech