
The Birth of Our System for Describing Web Content

Over a weekend in 1995, a small group gathered in Ohio to unleash the power of the internet by making it navigable.

In 1995, most librarians were using MARC (MAchine-Readable Cataloguing) to create metadata for their library catalogues. MARC records are complex, extremely long, and require deep expertise to create. Such elaborate descriptions could never work at scale for the entire web. Automated approaches weren’t on the table back then, and it soon became clear to all attendees, even those who had shown up thinking they might be tweaking an existing system, that the metadata standard for the web would have to be something entirely new. It had to be simple enough for anyone to label their own documents as they posted them online, but still meaningful and specific enough for other people and machines to find and index them. It also had to be consistent: for the half a million items already online, and the millions and billions more that everyone knew were coming, there would need to be one agreed-upon way of adding the metadata tags, with the same kinds of information in the tags themselves.

Creating these labels involved figuring out not just what would be needed to find files that were already online, but also what might be needed later as web content continued to snowball. There was no formalised voting or veto process to come up with the system; each piece of metadata was created through consensus, compromise and, occasionally, real fights. Much of the argument, in fact, concerned a future that no one could fully predict.

For example, many attendees didn’t anticipate that automated search engines were coming, though some of the more technical people saw them on the horizon and were pushing requirements for improved geolocated discovery. As Miller says: ‘I remember introducing the [geolocation] coverage element and getting a lot of blowback. I made the point that coverage is going to be local as well as global, like: Find a restaurant near me. We were trying to push the envelope so that we would be ready when other technologies advanced and other services became available.’ Other attendees saw geospatial data as something put in to assuage a person or community, and they weren’t sure it made sense given the need to keep the system lean.