
All Roads Lead to RDF
by Edd Dumbill
August 11, 2004
August. Season of long, hot, lazy days, and not much else.
While the vacation of all things XML continues, those
eager RDF folks are still hard at work. Consequently, the XML-Deviant
this week focuses largely on RDF-related topics.
RDF: The Natural Conclusion of Web Services?
For my main topic this week I am reaching again into the world of
weblogs. Web services have never made for great mailing list
discussions, but there are often quite thoughtful pieces to be found
on the topic on the Web. One of the more prolific writers on web
services has been Mark Nottingham. He is currently employed by BEA and has been an
active figure in the development of web services specifications.
On his weblog, Nottingham has been pursuing an occasional series
called "XML Heresies," and it is his latest installment that drew my attention this week. In "The 'Document' in Document-Oriented Messaging," he discusses exactly
what is inside the payload of a web service.
As Nottingham points out, document-oriented messaging is
characterized by specifying a protocol in terms of what goes over
the wire, rather than the code that handles the protocol. These
days, that content is likely to be in XML. The next
natural step then is to look for a way of constraining these
messages. Nottingham highlights EBNF, DTDs, W3C XML Schema, Relax NG,
and OWL as all being suitable technologies.
Nottingham then raises the question that concerns the current
state of web service specification:
The whole idea of web services is to give people a protocol
construction toolkit that allows them to easily specify messages,
suck them into code, and start working with them easily, on most
any platform. So, why is it then that web services went shopping
for these things and came back home with the XML Infoset as
described by XML Schema, of all things?
Now, aside from general aesthetic concerns, what's the problem
with using Infoset and Schema? There's simply too much in there,
says Nottingham.
Think of it from an information theoretic standpoint; if the
various Information Items and properties of an Infoset are each
capable of carrying information, we've got a pretty big footprint
to work with, and Schema doesn't give very precise tools for
sorting the signal from the noise. Because each different tool
chooses a different, incomplete portion of the Infoset to model,
interoperability is hard.
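As a small illustration of that last point (my sketch, not Nottingham's), here are two parsers from Python's standard library reading the same document. By default, xml.etree.ElementTree discards the comment, while xml.dom.minidom keeps it as a node. The sample document and the function names are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical message; the comment is a legitimate Infoset item.
DOC = '<order><!-- rush --><item sku="42"/></order>'

def etree_children(doc):
    """Children of the root as ElementTree's default parser reports them."""
    return [child.tag for child in ET.fromstring(doc)]

def minidom_children(doc):
    """Children of the root as minidom reports them."""
    root = minidom.parseString(doc).documentElement
    return [node.nodeName for node in root.childNodes]

if __name__ == "__main__":
    print(etree_children(DOC))    # element children only
    print(minidom_children(DOC))  # comment node plus element
```

Neither tool is wrong. Each has simply chosen its own slice of the Infoset, so any code that depends on the comment works with one and not with the other.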
Nottingham observes that both Atom
and WSDL 2.0 eschew
W3C XML Schema for describing their wire formats. The latter
is particularly ironic, of course, given the general expectation on
web services developers to use W3C XML Schema.
Something more than W3C XML Schema is needed, and something that
addresses data modeling rather than merely syntax constraints.
Nottingham writes:
Don't get me wrong; XML is a great foundation for syntax, but
data models that directly map to it (such as the Infoset, PSVI,
XQDM, etc.) are a horrible basis for a generic, interoperable
protocol toolkit...
...The real trick, IMO, is getting the advantages of XML -- like
platform neutrality, versioning, extensibility, nested data
structures, self-description, and human readability -- without the
complexity of the Infoset or the problems of XML Schema. A
simpler, higher-level data model that has a mapping onto the
Infoset while still providing these things could do the job.
Nottingham's next proposal is the one that should surprise most
followers of web services. He points out there are two ways
forward. The first way is to subset W3C XML Schema, an approach
that the WS-I group seems to have started. The second way is to start over. But
reinventing from the ground up might not be necessary:
...we might be able to just switch horses. A little while back, I
made a
direct comparison between the two stacks that the W3C is
developing; one based on the Infoset, the other on the RDF data
model. It's pretty clear to me that the RDF data model is simpler;
the next step, I think, is to see if and how it (along with OWL)
provides the purported benefits of XML, such as nesting,
extensibility, and versioning. The first of these is pretty easy
(it's a directed graph, so it's arguably superior); the latter two
are beginning to be explored. Stay tuned.
Wow. However, this actually seems to make a lot of sense.
Nottingham's notion may not be too surprising to XML document-heads who have wondered at the bizarre monster that is W3C XML
Schema, nor to the semantic webbers who have marveled at the rush
to cram all data into XML's tree-shaped structures.
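To see why nesting comes almost for free in a directed graph, here is a minimal sketch that uses plain Python tuples as triples. It does not use any RDF library, and the ex: names and the describe helper are hypothetical. Nesting is just a matter of following edges from one node to the next, and unlike in an XML tree, a node can be reached from more than one parent.

```python
# Each triple is (subject, predicate, object). All names are made up.
TRIPLES = [
    ("ex:order1", "ex:shipTo", "ex:addr1"),
    ("ex:order1", "ex:billTo", "ex:addr1"),  # same node, second incoming edge
    ("ex:addr1", "ex:city", "Galway"),
    ("ex:addr1", "ex:country", "Ireland"),
]

def describe(node, triples, seen=frozenset()):
    """Return a nested dict of everything reachable from node."""
    if node in seen:  # guard against cycles, which a graph may contain
        return node
    seen = seen | {node}
    subjects = {s for s, _, _ in triples}
    out = {}
    for s, p, o in triples:
        if s == node:
            # Recurse only when the object is itself the subject of triples.
            out[p] = describe(o, triples, seen) if o in subjects else o
    return out

if __name__ == "__main__":
    print(describe("ex:order1", TRIPLES))
```

The shared address shows up under both properties without being duplicated in the data, which is the kind of structure a strict tree has to fake with ID references.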
So what reaction did Nottingham's piece receive?
James Tauber agrees,
and sets out a bulleted list of his beliefs of where XML and RDF
stand in relation to each other. Interesting in particular are his
conclusions:
I therefore believe that when one develops a vocabulary (or
"application" in the SGML sense of the term) it should
include:
- a schema for the RDF in something like OWL
- a schema for the XML in something like RELAX NG
- a mapping between the two (and RELAX NG should support inclusion of this mapping)
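The third item is the least familiar, so here is a rough sketch of what such a mapping could look like. It is my own illustration rather than anything Tauber proposes: the MAPPING table, the ex: property names, and the sample document are all hypothetical, and a real mapping language would need to handle far more than child elements containing text.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from XML element names to RDF property names.
MAPPING = {"name": "ex:name", "homepage": "ex:homepage"}

# Hypothetical instance document in the XML syntax of the vocabulary.
DOC = """<person id="alice">
  <name>Alice</name>
  <homepage>http://example.org/</homepage>
  <shoeSize>10</shoeSize>
</person>"""

def xml_to_triples(doc, mapping):
    """Map each known child element of the root to a (s, p, o) triple."""
    root = ET.fromstring(doc)
    subject = "ex:" + root.get("id")
    return [
        (subject, mapping[child.tag], (child.text or "").strip())
        for child in root
        if child.tag in mapping  # elements without a mapping are skipped
    ]

if __name__ == "__main__":
    for triple in xml_to_triples(DOC, MAPPING):
        print(triple)
```

Keeping the mapping as data rather than code is the point: the XML schema constrains the syntax, the RDF schema describes the model, and the table is the only place where the two meet.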
Back on Nottingham's site, Randy Charles Morin speaks up for W3C XML
Schema:
XML Schema hits the 80-20 mark. End of story.
Nottingham politely rebuts Morin:
Sorry, I need more convincing than you saying it's good
enough. Lots of people -- including myself -- have done the work and
found XML Schema lacking, so much so that they're looking for
something better.
Sean McGrath, who's not without a good deal of web-services integration experience, is a little less reserved:
W3C XML Schema hits the 80/20 mark for schema languages the same
way that a boiled egg hits the 80/20 mark for a balanced diet ...
If you want to see what a real 80/20 point looks like
in the field of schema languages, look at Relax NG.
Analysing the problem on a broader scale, Bill de hOra doesn't
think there is a silver bullet -- the real interoperability issues
in web services are to do with humans -- and says the bigger
benefits in RDF lie elsewhere:
RDF technologies will be useful insofar as they'll help drive
the interop problem up the stack. But there will continue to be an
interop problem since people won't even agree on vocabulary, never mind semantics.
But here's the thing -- RDF versus XML, or RDF as some kind of
surrogate for XML, are xml-dev permathreads that must die. Really
where RDF could have significant impact is not swapping out the
XML stack, but in the business logic and mapping rules we've been
busy embedding into in-systems programming languages for the last
two decades -- in that sense it aligns nicely with data-directed
languages like Schematron, SQL, and from way back Prolog (before
it got tarnished with the AI brush).
I've not had room here to include the whole text of all the
contributions, so I do recommend reading the pages concerned. It's
good to know that there is still lots of debate and questioning
taking place in web services.
The Growth of FOAF
The Friend-of-a-friend
(FOAF) project is an RDF vocabulary for creating machine-readable
home pages. (Read Leigh Dodds' Introduction to FOAF on XML.com.) Created by Dan Brickley and Libby Miller, it
has proved among other things a useful testing ground for ideas and
software for the Semantic Web. As a center of development FOAF is
coming of age this year by hosting two events for those working on
or around the project.
The first of these is FOAF Camp, an informal
gathering in the Netherlands on Aug. 19-20. Conducted in a
self-organizing way similar to O'Reilly's FooCamp, FOAF Camp promises
to be a forum for fun and creative discussion about FOAF.
The second event is more formal, taking the shape of an academic
workshop. The 1st Workshop on Friend of a Friend, Social Networking and the Semantic Web is part of the Semantic Web Advanced Development Europe
initiative. It will be held Sept. 1-2 in Galway, Ireland.
I've been fortunate enough to be on the program committee for this
workshop, and can predict a rich and valuable meeting.
Births, Deaths, Marriages
The latest announcements from XML-DEV.
- XML 2004 Program Released
Find out who's speaking at the main USA event for the XML
community. New format places tutorials on either side of the
conference, rather than in the preceding two days. Call for
participation is still open for late-breaking news presentations, and
sponsorship and exhibitors. XML 2004 takes place in Washington DC,
at the Marriott Wardman Park Hotel, Nov. 15-19.
- Stylus Studio 5 Home Edition
Not entirely sure who wants to take an XML IDE home.
"Stylus Studio 5 Home Edition is specifically designed for
learning or working with XML in educational, training, or home
settings, and is available now for only $49 (USD) for a
single-user license." Look out for forthcoming Barbie
Edition.
- Expat 1.95.8 Released
New release of venerable XML parser adds "suspendability." According to the announcement, this "allows for
parsing a document in chunks without having to use a separate
thread, and it makes it also possible to build a pull parser on
top of Expat."
- Nature RSS Newsfeeds
Nature Publishing Group embrace RSS 1.0 in style, adding
in metadata from the PRISM standard.
Scrapings
Soundtrack of their lives: it's
fun to program in XSLT!... fancy
a 3,000-message flame war?... Messages to XML-DEV this week: 40
(vacation season), Len rating 7.5% (blame the blog)
... humor prematurely curtailed due to vacation. I'm off to Jordan for
a week; I hope you enjoy your travels, too.