Microformats: breathing life into web content

An interesting conversation came up at work about embedding XML documents in web pages using namespaces, and in my opinion, it underscored why microformats make sense. Since the late 1990s, there have been many efforts to standardize the way information is described using XML. While these definitions have been useful for many applications, their usefulness typically fails to translate to the web for a couple of reasons.

Case in point: look at MathML. The first version was designed by a W3C committee in 1999, and it has been used successfully in many applications. Yet even after seven years, the popular version of Internet Explorer still requires a third-party plug-in to view it. This means that if an organization wants to store math-related content as MathML, yet publish it in a web format supported by the major browsers, it must first transform the MathML into something browser-friendly like a PNG or GIF.

This scenario points out two of the bigger problems with XML on the web:

  • Web applications often fail to deliver content that retains the structure found implicitly in XML and databases. Rather, web applications typically take structured data and transform it into web-friendly (unstructured) formats like HTML and GIF.
  • Web applications that do deliver structured content typically rely on browsers having the capability to display it. For example, for MathML to display properly, the browser must either support it out of the box (as Firefox does) or have a plug-in installed (as Internet Explorer requires).

Why are these problems worth overcoming? Look at Google. Its search algorithm exploited one of the few bits of structured data available in plain-vanilla HTML: the hyperlink. Give programs the ability to easily extract meaning from a web page and you get something indistinguishable from magic.

Microformats stand as a possible solution to these problems. They build on the existing popularity of XML-based, web-friendly formats such as XHTML and RSS, and they do so in a way that makes the technology accessible to the average web developer who knows only HTML and CSS.

With microformats, data is both structured and web-friendly at once. So instead of embedding XML documents within a web page, consider the benefits of hiding them.
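
To make "structured and web-friendly at once" a little more concrete, here is a minimal sketch in Python. The page fragment uses class names from the hCard microformat (vcard, fn, org, url); a browser simply renders it as ordinary text and a link, while a program can pull the contact details back out. The sample markup, the HCardParser class, and the extract_hcards function are illustrative assumptions, not part of any microformats library. The sketch also assumes well-formed XHTML (every tag closed) and no nested vcards.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with hCard class names.
SAMPLE = """
<div class="vcard">
  <span class="fn">Jane Doe</span> works at
  <span class="org">Example Corp</span>.
  <a class="url" href="http://example.com/">Home page</a>
</div>
"""

# hCard properties whose value is the element's text content.
TEXT_PROPERTIES = {"fn", "org"}


class HCardParser(HTMLParser):
    """Collects a few hCard properties from elements inside class="vcard".

    Assumes well-formed XHTML (every start tag has a matching end tag).
    """

    def __init__(self):
        super().__init__()
        self.cards = []   # one dict per vcard found
        self._stack = []  # per open element: "vcard", a property name, or None

    def _in_card(self):
        return "vcard" in self._stack

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        marker = None
        if "vcard" in classes:
            self.cards.append({})
            marker = "vcard"
        elif self._in_card():
            # The url property takes its value from the href attribute.
            if "url" in classes and attrs.get("href"):
                self.cards[-1]["url"] = attrs["href"]
            for name in classes:
                if name in TEXT_PROPERTIES:
                    marker = name
                    self.cards[-1].setdefault(name, "")
        self._stack.append(marker)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attach text to the innermost enclosing text property, if any.
        for marker in reversed(self._stack):
            if marker in TEXT_PROPERTIES:
                self.cards[-1][marker] += data
                break
            if marker == "vcard":
                break


def extract_hcards(html):
    """Return a list of dicts, one per vcard found in the markup."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return [{key: value.strip() for key, value in card.items()}
            for card in parser.cards]


if __name__ == "__main__":
    print(extract_hcards(SAMPLE))
```

The same markup serves both audiences: nothing here needs a plug-in or special browser support, and the extraction logic relies only on class attributes that are already valid XHTML.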

2 thoughts on “Microformats: breathing life into web content”

  1. My company, Design Science, makes MathPlayer, which is the 3rd party plugin for IE that displays MathML that is probably being referred to in this article.

    While I have nothing against Microformats at all, I do feel the need to dispute the implication that reliance on 3rd party plugins is a “bad thing”. The same people that feel that plugins are bad probably also argue that bloated, do-everything apps like Microsoft Office are bad. One gets the feeling that the underlying opinion is really that Microsoft is bad (not that there’s anything wrong with that).

    Although there may be valid reasons for believing that builtin support is better (don’t have to download and install the plugin for one thing), there is another, more important (IMHO), reason that plugins may be better.

    While the builtin MathML support in Firefox is adequate, it doesn’t include Content MathML support and it doesn’t interface with screen readers for accessibility via math-to-speech. Our MathPlayer does include these things. My point, though, is that by providing a rich, powerful plugin mechanism, IE makes it possible for any closed source or open source company or individual to provide a plugin that does a better job, or does it in a way that the consumer prefers. In other words, it is an enabling technology.

    Bottom line: plugins are better than builtin, IMHO.

    Paul Topping,
    Design Science

  2. Hi Paul, thanks for the great comment. My point regarding browsers and structured web content isn’t that the plug-in option is inferior to built-in, but rather that *neither* relying on plug-ins nor relying on built-in support is practical in the common case. In the current paradigm, trying to publish structured content while supporting major browsers is a non-starter: either the user’s browser supports the format out of the box (unlikely) or you require 100% of your users to download a plug-in (impractical).

    Microformats attempt to address this with an alternative approach. They use the existing syntax of XHTML as the means of providing the semantics of the structured content. What this means in practice is that web browsers are already set up to display them without any extra code or plug-ins.

    Does this mean we’ll see an hMath format? Unlikely. The Microformat community embraces the approach of solving the 80% case quickly. This means focusing on putting structure around the basic types of information that we see commonly on the web, such as people, events, reviews, etc.
