Semantic Web?

In your litigation story, you stated that aD was successful because, among other things, they started working on the problem of web/db before anyone else noticed it. According to Tim Berners-Lee the Semantic Web will be the Next Big Thing, allowing software to make intelligent decisions using RDF and AI (whatever that is). In your opinion, is the 'Semantic Web' at the point the web/db problem was back in '93? I.e., is it mature enough to start investing time into developing tools and applications for it?

-- Ryan Campbell, June 4, 2001

Answers

Here's something I wrote in a report to my friends about my experience attending the Web conference in Hong Kong (May 2001):

The Web Consortium as AI Laboratory

Tim Berners-Lee opened the conference with a talk in which he said that his hope was to see the 21st century Web as a creative environment. He proceeded to lay out so many and such complex standards that it was apparent that the ability to construct the next generation of Web services will be limited to 4 engineers at IBM, 2 at Sun, and 3 at Microsoft. Anyone else who wants to join the club will have to spend a year just reading standards documents.

The initial Web standards were simple. HTTP is simple enough that any competent programmer can write a basic server in a day or two. HTML is simple enough that programmers were able to build their first page within 30 minutes and non-programmers weren't far behind. In fact, the initial Web standards were so simple that academic computer scientists predicted that the system wouldn't work.

With the push to build "the semantic Web", i.e., an Internet in which many documents are machine-understandable, the Web Consortium has leapt into the 1960s and the challenge of Artificial Intelligence. Instead of Web Consortium members talking about whether or not to facilitate adding a caption to a photograph, you hear words like "ontology" thrown around. The standards and solutions in this area are so immature that there were hardly even any academic papers on the subject, much less practical systems. "Enabling knowledge representation on the Web by extending RDF Schema", http://www.cs.vu.nl/~mcaklein/papers/www10/, is a good indicator of where things are.

[For an introduction to the ideas of the semantic Web, check out http://philip.greenspun.com/research/shame-and-war (written in 1994). Then check out http://www.w3.org/2001/sw/ for the modern version.]

The good news is that Web standards work is no longer about finding a workable subset of ancient formatting and hypertext markup languages. The bad news is that progress in Web standards, formerly dilatory, may slow to the historical pace of progress in AI (i.e., glacial).

Given that there are no standards for class hierarchies, I think that there is no way for a small company to make any headway in semantic Web applications. Maybe a company like GE could dictate standards to its vendors and customers and have an interesting subcommunity using the semantic Web. But I don't see any opportunity for the equivalent of an amazon.com or a photo.net. On the other hand, it is probably a good opportunity for tools vendors.

-- Philip Greenspun, June 17, 2001

I'm a longtime XML skeptic but I recently stumbled on a domain where I think XML (or a simpler equivalent) can make a big difference within a very short time: timelines.

All we need is a standard format for representing dates, and a registry for 'celebrity IDs', and search engines will be able to collate all the events in a range of dates, and all the events with a given celeb. And that's just the starting point; from there it gets really interesting.
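
A minimal sketch of that kind of collation, under stated assumptions: the `<timeline>`/`<event>` markup, the ISO 8601 `date` attribute, the `who` attribute, and the `celeb:NNNN` IDs are all hypothetical, and the events themselves are made-up placeholders. No such standard or registry is implied to exist.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import date

# Two sites publishing events in the same hypothetical format.
# Every element name, attribute, ID, and event here is invented.
SITE_A = """<timeline>
  <event date="2001-05-01" who="celeb:0001">Keynote talk (made-up entry)</event>
  <event date="2001-06-04" who="celeb:0002">Question posted (made-up entry)</event>
</timeline>"""

SITE_B = """<timeline>
  <event date="2001-05-03" who="celeb:0001">Panel discussion (made-up entry)</event>
  <event date="2001-08-28" who="celeb:0003">Reply posted (made-up entry)</event>
</timeline>"""


def parse_events(xml_text):
    """Return a list of (date, celebrity_id, description) tuples."""
    root = ET.fromstring(xml_text)
    return [
        (date.fromisoformat(e.get("date")), e.get("who"), (e.text or "").strip())
        for e in root.iter("event")
    ]


def collate(sources):
    """Merge the events from several sites into one list sorted by date."""
    return sorted(ev for src in sources for ev in parse_events(src))


def events_between(events, start, end):
    """Keep only the events whose date falls in [start, end]."""
    return [ev for ev in events if start <= ev[0] <= end]


def events_by_celebrity(events):
    """Group events by the shared celebrity ID."""
    index = defaultdict(list)
    for ev in events:
        index[ev[1]].append(ev)
    return dict(index)


all_events = collate([SITE_A, SITE_B])
may_2001 = events_between(all_events, date(2001, 5, 1), date(2001, 5, 31))
by_celeb = events_by_celebrity(all_events)
```

The collation only works because both sites share the date format and the ID scheme; the code itself is trivial once that agreement exists.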

-- Jorn Barger, June 15, 2001


I think there are two issues here, XML and the Semantic Web. Many people are doing interesting things with XML, including GE, and timelines would not be too hard to implement.

The difficult thing is designing a system by which twenty different groups can each have different XML schemas for timelines, but your program can still query and aggregate data from all of them because it knows how to decode the different XML formats and make sense of things. That is what the Semantic Web is all about: overcoming the failure of humans to agree on a standard way of representing information.
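
To make the difficulty concrete, here is a rough sketch with only two formats rather than twenty. Both XML shapes, the `decode_a`/`decode_b` helpers, and the `DECODERS` registry are hypothetical names invented for illustration; this is the hand-written approach, not a Semantic Web solution.

```python
import xml.etree.ElementTree as ET

# Two hypothetical groups describe the same kind of fact in different shapes.
GROUP_A = '<timeline><event date="2001-06-04" label="Question posted"/></timeline>'
GROUP_B = (
    "<history><item><when>2001-06-17</when>"
    "<what>Answer posted</what></item></history>"
)


def decode_a(root):
    """Group A keeps the date and title in attributes."""
    return [
        {"date": e.get("date"), "title": e.get("label")}
        for e in root.iter("event")
    ]


def decode_b(root):
    """Group B keeps the date and title in child elements."""
    return [
        {"date": i.findtext("when"), "title": i.findtext("what")}
        for i in root.iter("item")
    ]


# One hand-written decoder per known format, keyed by the root element name.
DECODERS = {"timeline": decode_a, "history": decode_b}


def aggregate(documents):
    """Normalize every document to a common shape, then sort by date."""
    merged = []
    for text in documents:
        root = ET.fromstring(text)
        decoder = DECODERS.get(root.tag)
        if decoder is None:
            raise ValueError(f"no decoder registered for <{root.tag}>")
        merged.extend(decoder(root))
    # ISO 8601 date strings sort correctly as plain text.
    return sorted(merged, key=lambda ev: ev["date"])


events = aggregate([GROUP_A, GROUP_B])
```

Every additional group means another decoder written by a person who has read that group's schema, which is the cost the paragraph above describes.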

I think Philip is right, it's like AI, and we can't expect a silver bullet any time soon. Of course, anyone who's worked on "Enterprise Application Integration" can tell you that even big companies who have dutifully followed the advice and "best practices" of big consulting firms, and implemented "best of breed" software to automate their business processes, can't agree on simple things like what to call the price on a purchase order. Despite the best efforts of Tim Berners-Lee and company, integration projects and middleware probably have a brighter future in the 21st century than the Semantic Web.

-- Ben Ballard, June 19, 2001


This article made me giggle. I work on a web application for the medical industry. I spent the last week importing an insurance company's COBOL file. People say XML is replacing EDI formats like X12, while these people are struggling to move to X12. By the time they're providing RDF, our blood will be full of nanobots and we won't need doctors anymore.

-- Dennis Peterson, August 28, 2001

As a Semantic Web hacker (and proud of it) I'm glad to say that the Semantic Web doesn't tackle impossible AI problems, has standards much simpler than the Web services stacks that the big companies seem to produce in 500-page increments, and has already developed a simple language for class hierarchies, descriptions, etc.

I'm working on some explanatory material that should help you get up to speed with the Semantic Web concepts in an hour or two. It's not very difficult, and I'm sure anyone who can learn SQL can figure out how to write some RDF rather quickly.

In the end, what the Semantic Web comes down to is a way of sharing database content over the Web. There's a lot of power in doing a JOIN between two websites. If we start getting a lot of data out there, that's a pretty cool thing, IMO.
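
A small sketch of what such a JOIN might look like, assuming each site publishes its facts as (subject, predicate, object) triples in the spirit of RDF. The URIs, predicate names, values, and the `lookup`/`join` helpers are all made up for illustration; a real system would fetch and parse RDF rather than use in-memory lists.

```python
# Facts from two hypothetical sites. Both happen to use the same URI
# to identify the same thing, which is what makes the join possible.
SITE_ONE = [
    ("http://example.org/id/book1", "title", "Example Book"),
    ("http://example.org/id/book2", "title", "Another Book"),
]
SITE_TWO = [
    ("http://example.org/id/book1", "price", "19.95"),
    ("http://example.org/id/book3", "price", "5.00"),
]


def lookup(triples, predicate):
    """Map each subject to its value for the given predicate."""
    return {s: o for s, p, o in triples if p == predicate}


def join(left, left_pred, right, right_pred):
    """Inner join of two triple sets on the shared subject identifier."""
    left_vals = lookup(left, left_pred)
    right_vals = lookup(right, right_pred)
    return [
        (s, left_vals[s], right_vals[s])
        for s in left_vals
        if s in right_vals
    ]


rows = join(SITE_ONE, "title", SITE_TWO, "price")
# rows == [("http://example.org/id/book1", "Example Book", "19.95")]
```

Only the subject that both sites describe survives the join, just as with an inner JOIN between two SQL tables on a shared key.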



-- Aaron Swartz, August 28, 2001