
Getting in Touch with XML Contacts
by John E. Simpson
March 31, 2004
Q: How do I record contact information in XML?
I am trying to develop an address book kind of application. The
contact information will be maintained in XML format. Is there any
standard DTD for contacts?
A: Great question, especially since it's firmly grounded in common
sense. An address book would seem to be one of the simplest XML
applications to develop from scratch. But if it's so simple, surely
someone must have already tackled it. Why reinvent the wheel? And as
it happens, you've got several options. Which you select is a matter
of preference, compatibility with other standards, and perhaps
compatibility with the other parts of your application.
vCard in XML
First, as far back as 1998 -- the year XML 1.0 became a W3C
Recommendation -- Frank Dawson submitted a proposal to the Internet
Engineering Task Force (IETF) for a "vCard in XML" standard. As you
may know, a vCard is an "electronic business card," suitable for
exchanging information between, for example, two e-mail
correspondents. (Indeed, many e-mail application programs allow you to
set up and attach vCards to your messages.) The vCard standard
consists of two documents promoted by IETF and the Internet Mail
Consortium: "A MIME
Content-Type for Directory Information" and "vCard MIME Directory
Profile."
MIME is the Multipurpose Internet Mail Extensions standard, also an
IETF specification, which in its current form dates back to 1996. The
simplest way to think of MIME in this context is that it allows you to
attach to an e-mail message some other content, such as a text file,
an image, or even a vCard.
There's nothing inherently XML-based about the vCard specification
itself. (Most applications, for that matter, don't represent vCards in
XML format.) But Dawson, who also contributed to the aforementioned
two documents, independently devised a DTD for representing vCard
data. You can find a
copy of it, together with an abstract and other supporting
materials, at Robin Cover's invaluable "XML Cover Pages" site.
All of these documents date back to 1998, which is ancient history
in terms of XML. Why might you be interested in such a cobwebbed
standard?
The answer is that the vCard in XML standard has been adopted by
the Jabber Software Foundation for use in their flagship Jabber project -- an open-source
instant-messaging protocol. Dozens of IM clients are now available
supporting Jabber's various protocols, including their version of vCard
in XML. (Note that this is a de-facto standard: although it hasn't
been officially blessed by the Jabber Software Foundation, it's in
widespread use among Jabber clients.)
A Jabber vCard is contained in a Jabber XML wrapper element
(including instructions for sending/retrieving the vCard itself),
called iq. Here's a sample vCard-only portion of
such an exchange, taken from the specification (actual addresses
altered for obvious reasons):
<vCard xmlns='vcard-temp'>
  <FN>JosephUser</FN>
  <N>
    <GIVEN>Joseph</GIVEN>
    <FAMILY>User</FAMILY>
    <MIDDLE/>
  </N>
  <NICKNAME>joe</NICKNAME>
  <EMAIL>
    <INTERNET/>
    <PREF/>
    <USERID>joseph@notareal.org</USERID>
  </EMAIL>
  <JABBERID>joe@notareal.org</JABBERID>
</vCard>
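To see how an address-book application might consume such a vCard, here is a minimal sketch in Python's standard xml.etree.ElementTree module. It is an illustration only: the helper name read_jabber_vcard and the keys of the dictionary it returns are assumptions made for this example, not part of the Jabber specification. The point to notice is that every element inherits the default namespace vcard-temp, so lookups must be namespace-qualified.

```python
import xml.etree.ElementTree as ET

# The vCard sample from the column, held in a string for parsing.
SAMPLE = """<vCard xmlns='vcard-temp'>
  <FN>JosephUser</FN>
  <N>
    <GIVEN>Joseph</GIVEN>
    <FAMILY>User</FAMILY>
    <MIDDLE/>
  </N>
  <NICKNAME>joe</NICKNAME>
  <EMAIL>
    <INTERNET/>
    <PREF/>
    <USERID>joseph@notareal.org</USERID>
  </EMAIL>
  <JABBERID>joe@notareal.org</JABBERID>
</vCard>"""

# Every element in the sample inherits the default namespace 'vcard-temp'.
NS = {"v": "vcard-temp"}


def read_jabber_vcard(xml_text):
    """Return a flat dict of the sample's fields (hypothetical helper)."""
    card = ET.fromstring(xml_text)
    email = card.find("v:EMAIL", NS)
    has_email = email is not None
    return {
        "full_name": card.findtext("v:FN", "", NS),
        "given": card.findtext("v:N/v:GIVEN", "", NS),
        "family": card.findtext("v:N/v:FAMILY", "", NS),
        "nickname": card.findtext("v:NICKNAME", "", NS),
        "email": email.findtext("v:USERID", "", NS) if has_email else "",
        # Empty child elements such as <PREF/> act as flags.
        "email_preferred": has_email and email.find("v:PREF", NS) is not None,
        "jabber_id": card.findtext("v:JABBERID", "", NS),
    }


contact = read_jabber_vcard(SAMPLE)
print(contact["given"], contact["family"], contact["email"])
```

A real client would first unwrap the vCard from the surrounding iq element, which this sketch leaves out.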
The W3C vCard in XML/RDF Note
In 2001, IPR Systems Pty Ltd
submitted a Note to the
W3C, formally outlining the use of XML as a vCard standard. Like
other Notes, this one -- its full title is "Representing vCard Objects
in RDF/XML" -- has no official status; you might consider such Notes
"strawman"-style proposals or extended comments on other
proposals. Still, depending on how much detail you want to provide in
your contacts-management application, and how concerned you are with
meshing your approach with the larger world of standards, it might be
worth taking a look at.
Like Jabber's vCard in XML approach, the vCard in XML/RDF proposal
(which I'll henceforth refer to simply as vCard/RDF) embeds vCard-type
information in a larger XML document. The wrapper here, though, isn't
an application-specific one (like Jabber's IM protocol). Instead, it's
a general-purpose Resource
Description Framework (RDF) document. RDF is a full-blown W3C Recommendation;
its purpose is to encode metadata about Internet resources. In
vCard/RDF's case, the resource in question is the vCard itself.
What might you want to know about a vCard, other than the contact
information which it includes? At the very least, you might want to
know what (or rather, who) a given vCard is about. My vCard might
tell you how to get in touch with me by various means: postal and
e-mail addresses, phone numbers, and so on. But that contact
information doesn't lay out for you everything you might need to know
about me; in short, it doesn't describe
me.
vCard/RDF attacks this problem by combining, in a given document,
information in the RDF namespace with information in the vCard
namespace. Here's an example, taken from the vCard/RDF Note
(the RDF-namespace elements and attributes are the ones carrying the
rdf: prefix):
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">
  <rdf:Description rdf:about="http://qqqfoo.com/staff/corky">
    <vCard:FN>Corky Crystal</vCard:FN>
    <vCard:N rdf:parseType="Resource">
      <vCard:Family>Crystal</vCard:Family>
      <vCard:Given>Corky</vCard:Given>
    </vCard:N>
    <vCard:EMAIL rdf:parseType="Resource">
      <rdf:value>corky@qqqfoo.com</rdf:value>
      <rdf:type rdf:resource="http://www.w3.org/2001/vcard-rdf/3.0#internet"/>
    </vCard:EMAIL>
    <vCard:ORG rdf:parseType="Resource">
      <vCard:Orgname>qqqfoo.com Pty Ltd</vCard:Orgname>
      <vCard:Orgunit>
        <rdf:seq>
          <rdf:li>Commercialisation Division</rdf:li>
          <rdf:li>Engineering Office</rdf:li>
          <rdf:li>Java Unit</rdf:li>
        </rdf:seq>
      </vCard:Orgunit>
    </vCard:ORG>
  </rdf:Description>
</rdf:RDF>
In general, most of the RDF markup is used to describe constraints
on how the contact information is structured or what sort of resource
a particular datum is. (For instance, the three
rdf:li elements are to be used in the order shown
when referring to "Corky Crystal's" work unit; this constraint is
imposed by making those elements children of an
rdf:seq element.) Aside from that markup, however,
note in particular the rdf:Description element:
- Everything about this contact is contained within
rdf:Description's scope. While the simple
rdf:RDF element does perfunctory duty as the
document's true root, rdf:Description might be
considered its heart and soul.
- The rdf:about attribute points to a resource outside this
document which really tells you about Corky -- not how to get in
touch with Corky, but who Corky is.
(Of course, an application which cares only about contacting
Corky would be free to ignore this information. But it's great to
have it available, and mixing the vCard markup with RDF is what
makes that availability possible.)
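The two points above can be made concrete with a short sketch, again in Python's xml.etree.ElementTree rather than a full RDF parser. It treats the document purely as namespaced XML, which is an assumption that suits a simple contact application but ignores RDF's other possible serializations. The function name describe and the dictionary keys are invented for this illustration; the sample is a trimmed copy of the one in the column, with element names spelled exactly as they appear there.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"
NS = {"rdf": RDF, "vCard": VCARD}

# Trimmed copy of the column's sample (names spelled as in the column).
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:vCard="{VCARD}">
  <rdf:Description rdf:about="http://qqqfoo.com/staff/corky">
    <vCard:FN>Corky Crystal</vCard:FN>
    <vCard:EMAIL rdf:parseType="Resource">
      <rdf:value>corky@qqqfoo.com</rdf:value>
    </vCard:EMAIL>
    <vCard:ORG rdf:parseType="Resource">
      <vCard:Orgunit>
        <rdf:seq>
          <rdf:li>Commercialisation Division</rdf:li>
          <rdf:li>Engineering Office</rdf:li>
          <rdf:li>Java Unit</rdf:li>
        </rdf:seq>
      </vCard:Orgunit>
    </vCard:ORG>
  </rdf:Description>
</rdf:RDF>"""


def describe(xml_text):
    """Return one dict per rdf:Description element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    cards = []
    for desc in root.findall("rdf:Description", NS):
        units = desc.findall("vCard:ORG/vCard:Orgunit/rdf:seq/rdf:li", NS)
        cards.append({
            # Namespaced attributes are keyed as "{namespace-uri}localname".
            "about": desc.get(f"{{{RDF}}}about"),
            "full_name": desc.findtext("vCard:FN", "", NS),
            "email": desc.findtext("vCard:EMAIL/rdf:value", "", NS),
            # findall() returns matches in document order, preserving
            # the sequence of the rdf:li children.
            "org_units": [(li.text or "").strip() for li in units],
        })
    return cards


cards = describe(SAMPLE)
print(cards[0]["about"], cards[0]["org_units"])
```

An application that only wants to contact Corky can read the email key and skip about entirely; one that wants to know who Corky is can follow the URI.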
It's also interesting to compare this vCard/RDF sample with
the Jabber vCard above. Even without considering the namespace
prefixes, the vCard/RDF Note doesn't seem to be consistently tied
to the element names from the earlier standard: EMAIL is EMAIL in
both, but Jabber's FAMILY becomes vCard/RDF's Family.
A commercial alternative
In researching this column, I came across an existing commercial
contact-management package which touts XML-readiness as a feature. The
application is called GoldMine, from FrontRange Solutions. (I don't
claim, of course, that this is the only such package. If you know of
others, feel free to use the "Comment on this Article" link
below.)
While GoldMine isn't just a contact manager, managing contacts
seems to be at the heart of the other things the product does. The
last several versions have offered an import from/export to XML
feature, specifically for transferring contact data between GoldMine
itself and other applications or data sources. All that's required for
importing to GoldMine is that the data conform to the expected
structure. (Exported data presumably conforms to the structure without
further user involvement.)
The structure in question is codified in an XML Schema
document. You will probably search the FrontRange Web site in vain for
this Schema -- I certainly did -- but I was able to obtain a copy of it through the generosity of
FrontRange's marketing organization. While you asked specifically for
a DTD, it's worth taking a look at the GoldMine Schema for insights
into how a commercially successful product solves the problem
(including, not insignificantly, how to handle multiple
contacts in the scope of a large-scale application).
Tying it together
So the instinct implied in your question was right: you're nowhere
near the first to consider using XML as a structured-data format for
contact information. But you might consider broadening the question's
scope a bit, by imagining something a bit more elaborate than a
"closed-shop" contact-management system: how might you build a tool
for translating contact information from one of these standards (or
any others you can find) to any one of the others?
The obvious platform for such a tool is XSLT. I don't have space
left in this month's column to detail every issue you'd want to
(ahem) address, should you decide to tackle this bigger project.
Still, here are a few points to consider:
- Do you want a single stylesheet for handling all the combinations
of input (source) and output (result) data structures?
- A single stylesheet might be parameter-driven (specify the input
and output data types, e.g. "vCardRDF" and "GoldMine," at
runtime).
- Multiple stylesheets -- one for each input/output combination
-- might be simpler to tackle at first. But they might be harder to
maintain and keep consistent over time. (Plus, they wouldn't be able
to take advantage of structural duplication; for instance, regardless
of whether you're transforming to Jabber or vCard/RDF, an EMAIL
element is an EMAIL element.)
- Each standard includes not only required elements and attributes,
but optional ones as well. How will you handle this optionality? For
example, vCard/RDF allows for the inclusion within the vCard of
text-encoded binary data, such as an image. Is there some other way
of including (or at least referencing) this possibly useful data in
vCards conforming to the other standards?
- Are the data types of specific element contents and attribute
values consistent across standards? How will you handle differences?
- Is there some way to leverage your newfound awareness of the
various data structures to provide output to formats besides other
XML-based contact managers? As an example, think of feeding the
contact information through a stylesheet to generate an XSL-FO
document; this might be suitable for printing to Rolodex-type
hardcopies, or even being passed to a text reader for audible
output.
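As a rough illustration of the parameter-driven idea from the list above, here is a sketch written in Python rather than XSLT (a deliberate substitution, made only so the example is self-contained and easy to run). It reads each format into a neutral dictionary and writes it back out, with the source and target formats chosen by name at runtime. Everything named here is an assumption for the sake of the example: the format keys "jabber" and "vCardRDF", the functions read_jabber, write_vcard_rdf, and translate, the neutral dictionary's keys, and the rdf:about URI passed in at the end. Only one reader and one writer are filled in.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VCARD_RDF = "http://www.w3.org/2001/vcard-rdf/3.0#"
JABBER = "vcard-temp"

JABBER_SAMPLE = """<vCard xmlns='vcard-temp'>
  <FN>JosephUser</FN>
  <N><GIVEN>Joseph</GIVEN><FAMILY>User</FAMILY><MIDDLE/></N>
  <EMAIL><INTERNET/><PREF/><USERID>joseph@notareal.org</USERID></EMAIL>
</vCard>"""


def read_jabber(xml_text):
    """Jabber vCard -> neutral dict (hypothetical intermediate form)."""
    ns = {"v": JABBER}
    card = ET.fromstring(xml_text)
    return {
        "full_name": card.findtext("v:FN", "", ns),
        "given": card.findtext("v:N/v:GIVEN", "", ns),
        "family": card.findtext("v:N/v:FAMILY", "", ns),
        "email": card.findtext("v:EMAIL/v:USERID", "", ns),
    }


def write_vcard_rdf(contact, about):
    """Neutral dict -> vCard/RDF text. The caller supplies `about`,
    because the Jabber sample has no counterpart to rdf:about."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("vCard", VCARD_RDF)
    resource = {f"{{{RDF}}}parseType": "Resource"}
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description",
                         {f"{{{RDF}}}about": about})
    ET.SubElement(desc, f"{{{VCARD_RDF}}}FN").text = contact["full_name"]
    name = ET.SubElement(desc, f"{{{VCARD_RDF}}}N", dict(resource))
    # Same data, different spelling: FAMILY/GIVEN become Family/Given.
    ET.SubElement(name, f"{{{VCARD_RDF}}}Family").text = contact["family"]
    ET.SubElement(name, f"{{{VCARD_RDF}}}Given").text = contact["given"]
    email = ET.SubElement(desc, f"{{{VCARD_RDF}}}EMAIL", dict(resource))
    ET.SubElement(email, f"{{{RDF}}}value").text = contact["email"]
    return ET.tostring(root, encoding="unicode")


# Format names are looked up at runtime, like stylesheet parameters.
READERS = {"jabber": read_jabber}
WRITERS = {"vCardRDF": write_vcard_rdf}


def translate(xml_text, source, target, **writer_options):
    contact = READERS[source](xml_text)
    return WRITERS[target](contact, **writer_options)


# The rdf:about URI below is made up for the example.
result = translate(JABBER_SAMPLE, "jabber", "vCardRDF",
                   about="http://notareal.org/staff/joe")
print(result)
```

Going through a neutral form means each new format needs one reader and one writer instead of a converter for every pair, which is one answer to the single-versus-multiple stylesheet question. The cost is that anything the neutral form does not model, such as the PREF flag here, is silently dropped.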
The important thing, I think, is not to confine your imagination to
the relatively static context of "an XML document"
-- even a bunch of XML documents. As always with XML, the most
important questions are not those dealing with the data as such,
but those dealing with what to do with the data once
it's in XML form -- not only what the data might be, but what it
might just come to mean.