P2P and XML in Business
by Brian Buehling
July 11, 2001
Following the growth of business-to-business exchanges and supply
chain management systems, the emergence of peer-to-peer (P2P)
computing is likely to become another deployment arena for XML
technology. Whether exchanging user messages, application state, or
processing instructions, relaying information effectively is a
critical component of any P2P application. By using XML, system
designers can establish rules for peer interaction that allow
developers to build applications independently. By facilitating
this communication, XML plays an important role in P2P application
design.
What is P2P?
As with any technology emerging under the media spotlight, P2P as a
whole is open to misinterpretation. Much of the confusion surrounding
the term "peer-to-peer" arises from companies applying the label to
dozens of distinct types of system. For instance, SETI@home, the
well-known distributed computing project designed to analyze data with
the hope of finding extraterrestrial life, has little in common with
the infamous Napster music community. Similarly, Groove Networks'
collaboration system cannot be directly compared to the Jabber Open
Source project that focuses on instant messaging. Yet these disparate
systems are all touted as key elements of the P2P movement.
As a result, one is hard pressed to find a common technical thread
among these P2P applications. Complicating matters further, there
exists no field monopolizing these initiatives, as notable
contributions to P2P technology have been made in many areas. Nor is
there a single industry sector driving the effort. Network equipment
manufacturers, open source projects, educational institutions, and
scores of unaffiliated independent programmers have all played an
important role in furthering the development of P2P systems.
Without a suitable definition in terms of technology or
contributors, the industry is left to describe P2P in terms of the
intent of its supporters. Framed this way, peer-to-peer is best
defined as the set of technologies targeted at better utilizing
resources that are networked together. Defining peer-to-peer as any
system designed with the explicit intent to take advantage of
under-utilized networking, disk space, processing, or user resources
at the edges of the Internet is the best way to accurately depict the
underlying movement while still encompassing all aspects of the
technology.
Does P2P make sense?
The timing of this new interest in peer-to-peer technology is
notable. Just when IT managers have begun to adapt to the shift
from client-server applications to web-based application services,
users are showing newfound interest in exploiting dormant resources
on their desktops connected to networks. In fact, users are beginning
to demand more control over their computing resources every day.
Whether creating chat rooms with colleagues or sharing files with
clients directly, users want the ability to use applications without
relying on IT departments to set up user accounts or create virtual
private networks to support them. For years, IT administrators have
been pressured to consolidate IT support operations by locking down
corporate desktops and centralizing computing resources. Now they are
being told that their systems are too rigid and don't allow users
enough control. Not surprisingly, the demand for new peer systems has
been met with harsh resistance.
Many IT managers thought that their jobs would be getting more
bearable as decreasing server costs allowed them to meet the budgetary
constraints of their departments. The pendulum seems to be swinging
once again as the indirect costs of under-utilized desktop computing
resources have offset the hardware savings of server-centric IT
systems. This current shift highlights the continuing oscillation
from central to distributed control of computer systems. Those who
witnessed the prior shifts, from mainframes to client-server
applications and more recently from client-server applications to
server-centric ASP architecture, should find the rationale behind P2P
architectures vaguely familiar. Looking at computing architecture over
the course of the last quarter century, one sees that the P2P movement
is just the most recent phase of this centralized-distributed
cycle.
Despite the historical and theoretical justifications of P2P
systems, the costs associated with developing, deploying, and
supporting client applications are not insignificant. So before
starting a P2P crusade within an organization, one should be certain
it makes economic sense. Although there is much discussion concerning
this topic, any viable P2P system should offer benefits that cannot be
achieved by relying on another computing architecture that is less costly
to maintain.
XML and Peer-to-Peer Technology
After determining that P2P technology is appropriate for an IT
project, there are several design challenges that will have to be
solved before any development can begin. Since pure P2P systems have
no central servers for dispatching information between peers, devising
a mechanism for peers to communicate is a critical aspect of P2P design.
Efficiently distributing and storing application data for peer
access is not a trivial task either, since data often has to reside
locally on the peer for processing. Managing updates to the peer
application components themselves is also of paramount concern, as even
a simple bug fix can lead to a distribution nightmare. It is no
coincidence that these are the areas in P2P technology that benefit the
most from XML.
Messaging
XML offers an ideal mechanism to transfer short, structured
messages between peer applications. XML can be easily customized for
specific P2P systems and readily transmitted over today's Internet
protocols. XML data can be encrypted using existing technologies,
making it well suited to secure messages. There are already
several implementations of XML-based messaging schemes, including SOAP
and XML-RPC.
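To make the idea of a short, structured peer message concrete, here is
a minimal sketch in Python using only the standard library. The
message layout (a "message" element with "from", "to", and "type"
attributes and a "body" child) and the peer names are assumptions
invented for this illustration. They are not taken from SOAP, XML-RPC,
or any existing P2P system.

```python
import xml.etree.ElementTree as ET


def build_message(sender, recipient, kind, body):
    """Serialize a short peer message to an XML string.

    The element and attribute names are hypothetical, chosen for this sketch.
    """
    msg = ET.Element("message", {"from": sender, "to": recipient, "type": kind})
    ET.SubElement(msg, "body").text = body
    return ET.tostring(msg, encoding="unicode")


def parse_message(xml_text):
    """Parse a peer message back into a plain dictionary."""
    msg = ET.fromstring(xml_text)
    return {
        "from": msg.get("from"),
        "to": msg.get("to"),
        "type": msg.get("type"),
        "body": msg.findtext("body", default=""),
    }


# One peer builds the message; the other parses what arrives on the wire.
wire = build_message("peer-a", "peer-b", "chat", "Status report is ready.")
received = parse_message(wire)
```

Because both peers agree only on the message layout, each side could
be written by a different developer, which is the independence the
article describes. Transport and encryption are deliberately left out
of the sketch.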
Data Storage
Utilizing XML to cache application data locally in P2P systems
offers several advantages. Caching data in XML allows for more
flexibility and easier retrieval than custom or unstructured formats,
and it has a much smaller overhead than installing a relational
database on each peer. Developers can take advantage of XML handlers
to search, validate, retrieve, and manipulate the data needed to
support the peer application. This approach will reduce the overall
complexity of the P2P system. In many cases XML stores are easier to
implement than storing unstructured data directly in the file system
and require fewer system resources to operate than relational
databases.
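The following Python sketch shows what a small local XML cache might
look like on a peer. The file name, the "cache" and "file" elements,
and their attributes are all hypothetical. The point is only that a
standard XML parser already provides searching and retrieval, so the
peer needs neither a custom file format nor a database engine.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def save_cache(path, entries):
    """Write (name, size, owner) tuples to an XML cache file."""
    root = ET.Element("cache")
    for name, size, owner in entries:
        ET.SubElement(root, "file", {"name": name, "size": str(size), "owner": owner})
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def files_owned_by(path, owner):
    """Return the names of cached files shared by the given peer.

    Uses ElementTree's limited XPath support. The owner value is pasted
    into the expression, so it must not contain quote characters.
    """
    root = ET.parse(path).getroot()
    return [f.get("name") for f in root.findall(f"./file[@owner='{owner}']")]


# Store a few made-up entries in a temporary directory, then query them.
cache_path = os.path.join(tempfile.mkdtemp(), "peer_cache.xml")
save_cache(
    cache_path,
    [("a.txt", 120, "peer-a"), ("b.txt", 300, "peer-b"), ("c.txt", 50, "peer-a")],
)
owned = files_owned_by(cache_path, "peer-a")
```

Here the query returns the two entries owned by "peer-a" in document
order. A real peer would also need to consider file locking and cache
size, which this sketch ignores.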
Application Deployment
XML can also be used to help manage the deployment of the
application components to peers in the network -- often one of the
most difficult challenges of P2P systems. With the potential of
having millions of peers interacting, having an effective process to
distribute software updates is essential to the long-term success of
any P2P system. One XML-based solution to this problem is Open Software
Description (OSD). OSD files allow system architects to define the
application components required for peer applications along with the
location to download these components and any component
dependencies. Effectively integrating OSD files into a P2P deployment
strategy shifts the burden of software upgrades from the user to the
P2P application itself. Each peer can verify that it has the most
recent software components and automatically download upgrades if
needed.
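As a rough illustration of the upgrade check described above, the
Python sketch below reads a simplified, OSD-style descriptor and
decides whether a peer needs to download a newer component. The
descriptor is loosely modeled on OSD and has not been validated against
the specification. The package names, version numbers, and download
URL are made up, and the comparison logic is an assumption about how a
peer might use such a file.

```python
import xml.etree.ElementTree as ET

# Simplified, OSD-style descriptor. All names, versions, and URLs are invented.
DESCRIPTOR = """
<SOFTPKG NAME="example.peer.core" VERSION="1,2,0,0">
  <TITLE>Example Peer Core</TITLE>
  <IMPLEMENTATION>
    <CODEBASE HREF="http://updates.example.com/peer-core-1.2.zip"/>
  </IMPLEMENTATION>
  <DEPENDENCY>
    <SOFTPKG NAME="example.peer.net" VERSION="1,0,3,0"/>
  </DEPENDENCY>
</SOFTPKG>
"""


def parse_version(text):
    """Turn a comma-separated version such as '1,2,0,0' into a tuple of ints."""
    return tuple(int(part) for part in text.split(","))


def check_for_update(descriptor_xml, installed_version):
    """Return (download_url, dependencies) if the descriptor is newer, else None."""
    pkg = ET.fromstring(descriptor_xml.strip())
    if parse_version(pkg.get("VERSION")) <= parse_version(installed_version):
        return None  # the peer already has this version or a later one
    url = pkg.find("./IMPLEMENTATION/CODEBASE").get("HREF")
    deps = [(d.get("NAME"), d.get("VERSION")) for d in pkg.findall("./DEPENDENCY/SOFTPKG")]
    return url, deps


update = check_for_update(DESCRIPTOR, "1,1,5,0")   # older install: update offered
current = check_for_update(DESCRIPTOR, "1,2,0,0")  # same version: nothing to do
```

The sketch stops at deciding what to fetch. Downloading, verifying,
and installing the component, along with resolving each dependency in
turn, would be the job of the peer application itself.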
Conclusion
It seems likely that P2P technology will have an influence on many
aspects of the IT industry. Whether or not it can live up to the
aggressive hype with which companies are promoting it remains to be
seen. Whatever the outcome, XML will continue to play an instrumental
role in its future.