From P2P to Web Services: Trust
by Andy Oram
April 14, 2004
In last week's article ("From P2P to Web
Services: Identification and Addressing"), I examined the ways in
which the development of web services might learn some lessons from
the peer-to-peer phenomenon of a few years ago. I focused on
identification and addressing. In this article I conclude my
examination by focusing on trust.
Trust
I like the idea that an application can search for a business in
UDDI and then automatically connect to that business to conduct a
deeper search. That's a P2P activity: each business can be responsible
for maintaining its own information and can provide a fuller and more
up-to-date response than a central repository is likely to do. In
fact, products and services have already been marketed that offer
distributed searches, without the benefit of web services.
But proponents of the ebXML framework, and the literature on UDDI,
call for more than search. They explicitly advocate a model where your
application completes the search, makes a choice, and automatically
enters into a business arrangement with the winner. It's seamless; it
takes the human out of the loop.
Some early proponents of ebXML thought it could lead to what Doug
Bunting, an XML Standards Architect at Sun Microsystems, calls
"instantaneous commerce"; others have been more restrained in their
expectations. Once again, if you're just refilling your pens, you
might not encounter trouble automating everything. But it's now
recognized that, if you're doing anything more complex, your
application cannot muster enough judgment to make the choice without a
human being. No IT manager wants to implement an application that
automatically chooses a vendor, find out that the product is a
failure, and then have to answer to the accounting department
when they ask, "Who chose that vendor in the first place?"
ebXML allows the simple, pen-refiller sort of purchases to go
through more quickly and can also serve as a grease to speed up
delivery in cases where parties need a more formal legal
agreement. Additional, upcoming standards may further streamline the
negotiations between the lawyers and managers on each side. But none
of them solve the problem of trusting whom you deal with.
P2P Solution
P2P researchers recognized the benefit of capturing trust in a
measure called reputation. The most familiar instance of reputation
online can be found in Internet auction sites such as eBay. There's
fraud, but it's kept down to a tolerable level. It would be
oversimplifying to attribute the trust on eBay to its feedback
mechanism. The main mechanisms holding back fraud rely firmly on
real-world enforcement, such as those provided by credit card
companies. That's a social infrastructure. And even that is probably
not enough. eBay's success relies, like so much in life, on the
fundamental decency of the average person. Most people are honest.
Unfortunately, reports in the press indicate that sophisticated
abuses of Internet auction sites are increasing, just like credit card
fraud, identity theft, and unsolicited email. Reputation systems are
nowhere near ironclad enough to resist deliberate subversion. Someone
can carry out many trivial transactions to establish a good
reputation, and then cheat on a transaction that really
matters. Someone can get his friends to submit dozens of testimonials
regarding his good name. Someone can even combine the two attacks,
carrying out trivial transactions with his friends so they are allowed
into the system to submit ratings (as on eBay, where ratings are
accepted only from people who carry out transactions).
The flip side of getting bogus ratings is the problem of not
getting enough ratings. Most people don't want to take the trouble to
submit a rating, so it's tempting to create incentives for them to
post ratings. But these incentives can become perverse incentives, if
people try to win them by submitting ratings on things about which
they don't really know anything.
Most problems of reputation systems boil down to this: who rates
the raters? How can you trust ratings? Even when people are honest,
their natural differences in temperament throw off the
ratings. Researchers have introduced meta-rating systems, but the more
they try to control for these differences, the more complex and
unwieldy their systems become, with no demonstrable resolution to the
original problems.
Reputation is easier to solve when it involves a matter of taste.
People who bought one book from Amazon were also likely to buy a
certain other book; that's a small but useful fact. If you enjoy many
of the same movies as someone else, you're likely to enjoy the next
movie that this person rates highly. That's the system that underlay a
service called Movie Critic, an early collaborative filtering system
on the Web promoted by O'Reilly & Associates. Collaborative filtering
is useful, but it deals with taste more than with trust. Furthermore,
it depends on aggregating many data points and therefore on lots of
transactions taking place.
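To make the movie example concrete, here is a minimal sketch of
user-based collaborative filtering in Python. It is not Movie Critic's
algorithm, which this article does not describe; the ratings, names,
and the choice of cosine similarity are all assumptions made for
illustration.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale; the names and numbers are invented.
ratings = {
    "ann":  {"Alien": 5, "Brazil": 4, "Casablanca": 1},
    "ben":  {"Alien": 5, "Brazil": 5, "Casablanca": 1, "Dune": 5},
    "cara": {"Alien": 1, "Brazil": 2, "Casablanca": 5, "Dune": 1},
}

def similarity(a, b):
    """Cosine similarity over the movies both users rated (0.0 if none)."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return 0.0
    dot = sum(ratings[a][m] * ratings[b][m] for m in shared)
    norm_a = sqrt(sum(ratings[a][m] ** 2 for m in shared))
    norm_b = sqrt(sum(ratings[b][m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)

def predict(user, movie):
    """Similarity-weighted average of other users' ratings for a movie."""
    pairs = [(similarity(user, other), r[movie])
             for other, r in ratings.items()
             if other != user and movie in r]
    total = sum(weight for weight, _ in pairs)
    if not total:
        return None  # nobody comparable has rated this movie
    return sum(weight * score for weight, score in pairs) / total

print(predict("ann", "Dune"))
```

Ann's tastes track Ben's far more closely than Cara's, so Ben's high
rating for the unseen movie dominates the prediction. With only three
users the estimate is crude, which is the dependence on many data
points in miniature.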
How do we trust the people and organizations that we hear about for
the first time, often located thousands of miles away? There is a
small, intense community of researchers who deal with reputation
online, tirelessly advancing and combining research on such issues as
proofs of work, group dynamics, digital cash, game theory, and so on.
It's a heady mix and a wonderful field to watch, but it has made very
little progress beyond what you see in eBay.
The problems of reputation and distributed systems are made more
complicated when the requirements for the systems embody some
contradictory goals, such as preserving anonymity while preventing
denial-of-service attacks, ensuring that everyone's input is tallied
correctly, and limiting the number of times people can submit ratings.
Incidentally, these problems are commonly found with electronic voting
systems (particularly ones that allow remote voting), and explain why
their use is inherently risky no matter how ideally they are designed
and implemented.
Distributed monitoring systems currently depend on cooperating
processes: the authority that collects statistics and dispenses
commands must trust the other systems to provide accurate statistics
and to act on those commands. There is little or no provision for
processes that are not trusted.
Reputation researchers admit they don't have the answers, but they
do have one existing system they admire -- Advogato. This is a system for
coordinating contributions to free software from multiple programmers
who may not know one another. In the reputation part of the Advogato
system, each programmer assigns a degree of trust to other programmers
she's worked with or whose code she's evaluated. Furthermore, to the
degree you trust one programmer, the system assumes you have some
trust for the programmers she trusts, that is, the trust ratings are
transitive.
In real life, trust might not be transitive. I might entrust my
teenage kids with access to my telephone or my Internet account, but
not entrust the same resources to their friends. Advogato recognizes
that trust weakens as it goes through intermediaries.
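The idea of trust that weakens with each intermediary can be sketched
in a few lines of Python. This is not Advogato's actual trust metric,
which is built on network flow; it is a deliberately simplified
stand-in that multiplies ratings along a chain, and every name and
number in it is invented.

```python
# Hypothetical direct trust ratings between 0 and 1 (invented values).
trust = {
    "alice": {"bob": 0.9},
    "bob":   {"carol": 0.8},
    "carol": {"dave": 0.5},
}

def derived_trust(source, target, seen=frozenset()):
    """Best product of direct ratings along any path from source to target.

    Each extra intermediary multiplies in another factor below 1, so
    trust can only shrink as the chain gets longer.
    """
    if source == target:
        return 1.0
    best = 0.0
    for friend, weight in trust.get(source, {}).items():
        if friend in seen:
            continue  # do not walk in circles
        via_friend = weight * derived_trust(friend, target, seen | {source})
        best = max(best, via_friend)
    return best

print(derived_trust("alice", "carol"))
print(derived_trust("alice", "dave"))
```

With these ratings, Alice's derived trust in Carol comes out near 0.72
and her trust in Dave near 0.36: each step through an intermediary
costs something, and someone with no chain of contacts at all gets
zero.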
One key feature of Advogato likely disqualifies it as a model for
global commerce and commercial web services: it depends on a chain of
personal contacts. Because your trust extends step by step through the
system, it can grow only incrementally. Introducing someone new to the
system requires her to build up trust relationships with individuals.
This is not scalable; it is not suitable for millions of people or
businesses.
Now there is a raft of "social networking" sites that try to scale
trust relationships. But it remains to be seen whether people really
benefit by forming relationships with people they don't know on the
social networks. I know that I get friendly with a lot of people
because they're interesting for one specific reason, but that doesn't
mean they're fit for some other purpose defined by another person who
is looking for a friend or colleague. In fact, I don't vouch that the
people I know are fit for anything at all. In a nutshell, social
networking faces a problem of metadata. How do you formalize
relationships? How do you say that something relevant to one
relationship is relevant to a different relationship with a different
person?
Some P2P applications can deal with the problem of reputation in
the same way as some deal with the problem of addressing, namely, by
ignoring it. If a system includes massive redundancy, as with
file-sharing networks, bad actors don't matter. They're considered
damage that the system can route around. And a user who receives bad
data can throw it away and try again. SETI@Home, an early model for
grid computing, uncovers users who submit false data by sending the
same data to multiple users and checking whether their results
match.
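The redundancy check described above might look something like the
following Python sketch. It is not SETI@Home's code; the function
name, the quorum parameter, and the client names are hypothetical,
chosen only to show the technique of comparing results from several
users.

```python
from collections import Counter

def verify_work_unit(results, quorum=2):
    """Accept the most common result if at least `quorum` clients agree.

    `results` maps a client name to the result that client returned for
    one work unit. Returns (accepted_result, suspect_clients);
    accepted_result is None when no answer reaches the quorum.
    """
    if not results:
        return None, []
    answer, votes = Counter(results.values()).most_common(1)[0]
    if votes < quorum:
        return None, []  # no agreement; the unit should be sent out again
    suspects = sorted(c for c, r in results.items() if r != answer)
    return answer, suspects

# The same work unit was sent to three clients; one disagrees.
print(verify_work_unit({"client_a": 42, "client_b": 42, "client_c": 7}))
```

The cost is obvious: every unit of work is done several times over.
That is affordable only in systems with the massive redundancy this
approach takes for granted.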
Trust applies not only to the data you get back, but also to the data
you give out. When grid computing enters the corporation, most companies
keep it internal in order to protect their data. Even so, new problems
of trust arise. Consider a scenario suggested by security researcher
William McCarty of Azusa Pacific University. Imagine that chunks of
information are sent to Joyce's computer in sales. Joyce, who has been
trusted up to now with sales information, is now also being entrusted
with any other data being processed on her computer as part of the
grid. Furthermore, if a malicious intruder should get access to the
program running on the grid, he may end up with access to sales
data.
Back to the Security Basics
Where do trust and reputation fit in modern computing? The computer
security field has built up an understanding that before you deploy
any technology that connects you to others, you should determine what
you're expecting of them and whether they're trustworthy. The simplest
security system is one in which two people trust each other. This is called
pairwise trust. It can be illustrated by encrypted email.
Alice sends Bob an email encrypted with Bob's public key. Bob sends
Alice an email encrypted with Alice's public key. They don't need to
trust any third party. However, this works only if they've previously
exchanged keys through an outside channel. It's not safe for Alice to
just send Bob her public key over email because anybody could pose as
Alice and send Bob a public key, then correspond with Bob indefinitely
until Bob discovers the ruse. So unless Alice and Bob have exchanged
keys in person or corresponded through postal mail, they're going to
depend on a third party.
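One common way to use an outside channel is to compare a short
fingerprint of the key rather than the key itself. The Python sketch
below assumes that Alice has read her key's fingerprint to Bob in
person or printed it on a business card; the key bytes and function
names are placeholders, not any mail program's real API.

```python
import hashlib
import hmac

def fingerprint(public_key: bytes) -> str:
    """Digest of a key, short enough to read aloud or print on a card."""
    return hashlib.sha256(public_key).hexdigest()

def key_matches(received_key: bytes, trusted_fingerprint: str) -> bool:
    """True if a key that arrived by email matches the fingerprint
    obtained through the outside channel."""
    return hmac.compare_digest(fingerprint(received_key), trusted_fingerprint)

# Placeholder key material; a real key would come from the mail software.
alice_key = b"-----ALICE PUBLIC KEY (placeholder)-----"
imposter_key = b"-----SOMEONE ELSE'S KEY (placeholder)-----"

# Bob got this value from Alice directly, not over email.
from_alice_in_person = fingerprint(alice_key)

print(key_matches(alice_key, from_alice_in_person))
print(key_matches(imposter_key, from_alice_in_person))
```

The check is only as good as the outside channel. If the fingerprint
also arrives by email, an imposter can substitute both the key and the
fingerprint.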
Now there's a slight weakening of trust. Alice and Bob depend on
the third party being secure, so that no one can substitute the wrong
key, and on the third party being honest. Nevertheless, this scenario
is fairly trustworthy and is used all the time. Alice or Bob can buy
a certificate for an encryption key fairly cheaply from VeriSign. So long as the
correspondent's mailer knows how to contact VeriSign, the email is
guaranteed to be from the address it claims to be from.
Jabber provides several possibilities for encrypting instant
messaging. You can use SSL, which is an uncertain form of security
because it's not end-to-end. You can also encrypt elements of the
conversation using some form of digital signature, just like
email. Jabber also presents the same start-up problem as with email:
unless you use a third party, you need some outside way to exchange
keys so each side knows they're communicating with the right
person.
Unlike the relatively lightweight approach of email encryption and
Jabber, the encrypted instant messaging announced in June by America
Online in conjunction with VeriSign is considerably more heavyweight.
It provides certified
digital signatures as well as encryption, which is good, but this
design choice has three important consequences. First, it's available
only to companies who sign up for VeriSign's services, not to
individuals. Second, it costs money, though ten dollars a month for a
business is a pretty trivial price. Third, it may perpetuate AOL's
lock-in on instant messaging, putting up another barrier to third
parties who want to offer competing services that can handle the AOL
instant messaging protocol.
When email or instant messaging uses a third party, that party doesn't
need to be a commercial authority. A colleague can vouch for Alice and Bob,
and a chain of colleagues can be set up this way, the famous web of
trust used in PGP. Each time you add another person, the web gets a
little weaker. The third party solution also lies behind SSL -- where
a certificate authority is used to prevent man-in-the-middle
attacks -- and behind Kerberos.
The third party solution is safest in a contained environment. But
putting a third-party solution like Kerberos behind SAML allows the
solution to provide single sign-on for Web sites. Single sign-on is
kind of a Holy Grail of web services, so far as the user experience
goes. Single sign-on can be accomplished by authenticating the user at
the first site she logs in to, or by letting sites pass credentials to
some trusted third party. It's the responsibility of such a third
party to authenticate the user and offer her a token that can be
passed to all the other various sites that she goes to, whether it's
to buy something or set up a vacation or use government services.
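The token-passing idea can be sketched as follows. This is not how
SAML or Kerberos encode their messages; it is a minimal illustration
in Python that assumes the third party and the participating sites
share a secret, and the function names, token layout, and secret are
all invented for the example.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumption: the third party and every participating site hold this secret.
SHARED_SECRET = b"demo-secret-not-for-production"

def issue_token(user, lifetime_seconds=3600, now=None):
    """Run by the third party after it has authenticated the user."""
    now = time.time() if now is None else now
    claims = {"user": user, "expires": now + lifetime_seconds}
    payload = json.dumps(claims).encode()
    signature = hmac.new(SHARED_SECRET, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "." +
            base64.urlsafe_b64encode(signature).decode())

def verify_token(token, now=None):
    """Run by each site; returns the user name, or None if the token
    is malformed, has been altered, or has expired."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None
    expected = hmac.new(SHARED_SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    claims = json.loads(payload)
    return claims["user"] if claims["expires"] > now else None

token = issue_token("joyce")
print(verify_token(token))
```

A shared secret keeps the sketch short, but it means any participating
site could mint tokens as well as check them. A deployment that cared
about that distinction would have the third party sign with a private
key that the sites cannot use.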
Single sign-on certainly introduces new risks, but users already
create similar risks informally. You know how many people use the same
password to log in to a dozen different sites. It's not considered a
good idea, but they do it out of ignorance or laziness. (Sites that
require a fixed form of input, such as a credit card number, compound
the problem by institutionalizing the bad practice.) Someone who
breaks the password on one site can make a good guess that it will let
him in to other sites as well. So single sign-on is probably more
secure than the status quo.
Before we go further and look at the technologies, it's very
important to look at the social infrastructure that will enable single
sign-on. It's a big responsibility to authenticate someone: it
involves making sure to check her identity carefully before giving out
a secret key, and to use robust algorithms and protocols to verify the
secret key when she logs in, and to lock up identifying information
carefully against intrusion. A travel agent probably doesn't want that
kind of responsibility, nor does a university or other likely
destinations. This sort of function is likely to be centralized and
contracted out. That's why Microsoft was hoping to make big bucks off
of Passport. Much of the work now being done by standards committees,
such as the Liberty Alliance, is precisely in harmonizing the security
of different parties who want to collaborate so they can trust each
other.
Whenever you participate in some kind of token-passing scheme in
which you trust the authentication performed by another site, you have
to check out how well that site is run. As you know, a tiny flaw or a
failure to upgrade fast enough can leave a system compromised. And if
you accept authentications from another site, you have to worry about
two sites: your own and the other one you trust.
Single sign-on is an improvement over the insecure situation we
have now where users fall into the habit of using the same password
everywhere. Fundamentally, given the difficulties of letting users
join a system securely -- verifying them when they sign up and get
their secret keys -- and the difficulties of doing authentication in
a robust manner, it's a good process to leave up to the
experts. Someone who does it for a business is probably going to do it
better than you.