Technology Review - Published By MIT

March 2007

A Smarter Web

Continued from page 2

By John Borland


Dewey envisioned all human knowledge as falling along a spectrum whose order could be represented numerically. Even if arbitrary, his system gave context to library searches; when seeking a book on Greek history, for example, a researcher could be assured that other relevant texts would be nearby. A book's location on the shelves, relative to nearby books, itself aided scholars in their search for information.

As the Web gained ground in the early 1990s, it naturally drew the attention of Miller and the other latter-day Deweys at OCLC. Young as it was, the Web was already outgrowing attempts to categorize its contents. Portals like Yahoo forsook topic directories in favor of increasingly powerful search tools, but even these routinely produced irrelevant results.

Nor was it just librarians who worried about this disorder. Companies like Netscape and Microsoft wanted to lead their customers to websites more efficiently. Berners-Lee himself, in his original Web outlines, had described a way to add contextual information to hyperlinks, to offer computers clues about what would be on the other end.

This idea had been dropped in favor of the simple, one-size-fits-all hyperlink. But Berners-Lee didn't give it up altogether, and the idea of connecting data with links that meant something retained its appeal.

On the Road to Semantics

By the mid-1990s, the computing community as a whole was falling in love with the idea of metadata, a way of providing Web pages with computer-readable instructions or labels that would be invisible to human readers.

To use an old metaphor, imagine the Web as a highway system, with hyperlinks as connecting roads. The early Web offered road signs readable by humans but meaningless to computers. A human might understand that "FatFelines.com" referred to cats, or that a link led to a veterinarian's office, but computers, search engines, and software could not.

Metadata promised to add the missing signage. XML--the code underlying today's complicated websites, which describes how to find and display content--emerged as one powerful variety. But even XML can't serve as an ordering principle for the entire Web; it was designed to let Web developers label data with their own custom "tags"--as if different cities posted signs in related but mutually incomprehensible dialects.
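The "mutually incomprehensible dialects" problem can be made concrete with a minimal sketch. The two bookstore snippets, their tag names, and the read_price helper below are hypothetical illustrations, not taken from any real site: each store labels the same book with its own custom XML tags, so software written for one store's vocabulary finds nothing in the other's.

```python
import xml.etree.ElementTree as ET

# Two hypothetical bookstores describe the same book with their own custom tags.
SITE_A = "<book><title>A History of Greece</title><price>25.00</price></book>"
SITE_B = "<item><name>A History of Greece</name><cost>25.00</cost></item>"


def read_price(xml_text, price_tag):
    """Return the text of the first child element named price_tag, or None."""
    root = ET.fromstring(xml_text)
    element = root.find(price_tag)
    return element.text if element is not None else None


# Code written against site A's vocabulary works on site A...
price_a = read_price(SITE_A, "price")

# ...but finds nothing on site B, which calls the same thing "cost".
price_b = read_price(SITE_B, "price")

# Only a hand-written, per-site mapping recovers the value.
price_b_mapped = read_price(SITE_B, "cost")
```

Both documents are well-formed XML, and a human can see they say the same thing. The software cannot, unless someone supplies a mapping for every site it visits.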

In early 1996, researchers at the MIT-based World Wide Web Consortium (W3C) asked Miller, then an Ohio State graduate student and OCLC researcher, for his opinion on a different type of metadata proposal. The U.S. Congress was looking for ways to keep children from being exposed to sexually explicit material online, and Web researchers had responded with a system of computer-readable labels identifying such content. The labels could be applied either by Web publishers or by ratings boards. Software could then use these labels to filter out objectionable content, if desired.
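The filtering idea can be sketched in a few lines. This is an illustration under stated assumptions, not the label format of the actual proposal: the Label fields, the example URLs, and the allowed_urls function are all hypothetical. It shows only the general mechanism of labels attached to pages by a publisher or a ratings board, with software deciding what to show.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """A hypothetical computer-readable content label for one page."""
    url: str
    rated_by: str   # e.g. "publisher" or the name of a ratings board
    explicit: bool  # True if the rater marked the page as sexually explicit


def allowed_urls(urls, labels, block_unlabeled=False):
    """Return the URLs that pass the filter, preserving their order.

    Pages labeled explicit are dropped. Unlabeled pages are kept unless
    block_unlabeled is True.
    """
    by_url = {label.url: label for label in labels}
    allowed = []
    for url in urls:
        label = by_url.get(url)
        if label is None:
            if not block_unlabeled:
                allowed.append(url)
        elif not label.explicit:
            allowed.append(url)
    return allowed


# Hypothetical labels and pages, for illustration only.
labels = [
    Label("http://example.com/kids", rated_by="publisher", explicit=False),
    Label("http://example.com/adult", rated_by="ratings-board", explicit=True),
]
pages = [
    "http://example.com/kids",
    "http://example.com/adult",
    "http://example.com/unknown",
]

lenient = allowed_urls(pages, labels)
strict = allowed_urls(pages, labels, block_unlabeled=True)
```

The "if desired" in the text corresponds to the fact that the filter is applied by the reader's software: the labels only describe the content, and each user decides how strictly to act on them.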

Miller, among others, saw larger possibilities. Why, he asked, limit the descriptive information associated with Web pages to their suitability for minors? If Web content was going to be labeled, why not use the same infrastructure to classify other information, like the price, subject, or title of a book for sale online? That kind of general-purpose metadata--which, unlike XML, would be consistent across sites--would be a boon to people, or computers, looking for things on the Web.


This article is from the March/April 2007 issue of Technology Review.


Comments

  • Dewey
    james.c.robertson on 04/09/2007 at 2:53 PM
    A gentle suggestion that OCLC (and librarians) aren't "obsessed with organizing and accessing information", but perhaps "dedicated" to it, instead.  The word "obsessed" reinforces a specific stereotype.
    Also, OCLC only took over ownership of the Dewey Decimal System in 1988.  While Dewey is still used widely in public library systems, most larger systems (like research libraries and universities) use the Library of Congress Classification System -- millions of books classified in the LC system are in the OCLC database.
  • Web Is Not Packaged Software
    jabailo on 06/16/2007 at 9:15 PM
    The whole term "Web n.0" is based on naming conventions from packaged software in the 1980s (Word 2.0, Windows 3.11 ).    It's completely wrong for web and internet technologies.   There is no "release date" for these things -- they emerge.
  • very instresting
    ???? ????? ????? on 02/05/2008 at 12:27 AM
    Posts:
    1
    Very intresting article Thanks