Protocol Design: Reliability and Security
by Itamar Shtull-Trauring
August 25, 2004
Protocols are part of the infrastructure used to implement a
service, such as delivering an email message or requesting a web page. These
services have security and reliability requirements that need to be
implemented at different layers of the infrastructure. For example,
emails should not be dropped along the way, and web pages should
arrive without being tampered with by malicious third parties. When designing a
protocol it is important to understand what guarantees the protocol
and the infrastructure layers beneath it must provide, and at what
layer they should be implemented.
Reliability, Security, and Other Guarantees
Some of the guarantees that protocols might want to provide include
privacy, integrity, identity, reliability, and freshness. Privacy
requires that third parties won't be able to get information about the
contents of the messages, their destination, or other similar
properties that depend on the requirements of the protocol. Integrity
means the contents of messages will not get changed or
corrupted. Identity lets the various sides of a conversation have
some knowledge of who they are communicating with.
Reliability means
that data arrives and is not dropped along the way. Freshness means
that data that arrives is known to be up-to-date and not, for example, a
copy of an old message. Depending on the service the protocol is
intended to provide, some or all of these guarantees may be required
or supported as options.
TCP
What guarantees does TCP provide? TCP uses checksums to detect
accidental corruption of the data it transmits, and makes sure the data arrives
at the destination IP address only once and in the correct order. TCP
provides almost no privacy. At the very least any router along the
path between the two IP addresses can read the contents of the TCP
connection.
Malicious attackers, by controlling the routers or by
other means, can change the contents of data in the TCP connection,
reroute the connection to a different destination that appears to have
the correct IP address, and so on. It's also worth noting that TCP's
concept of identity (an IP address) is tied to a specific connection,
whereas most protocols want identity in terms of users or application-specific addresses (e.g. instant-messaging nicknames or email
addresses).
TLS
The Transport Layer Security (TLS) protocol, the latest version of
the SSL standard, was developed in order to provide stronger
guarantees than TCP alone can. TLS was designed as a layer on top of
TCP that has similar semantics to TCP (i.e. a reliable, ordered stream
of data) while providing additional guarantees. As a result, protocols
that run on top of TCP can be modified to run on top of TLS with
minimal or no changes. For example, HTTPS URLs on the Web are loaded by
running HTTP over SSL or TLS (which are running over TCP).
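The layering described above can be sketched with Python's standard socket and ssl modules. This is only an illustration, not a complete HTTP client: the helper names open_stream, make_client_context, and fetch_head are hypothetical, and the point is that the protocol code in fetch_head is the same whether the stream it receives is plain TCP or TLS over TCP.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Return a TLS client context that verifies the server's certificate."""
    # create_default_context() turns on certificate and hostname checking.
    return ssl.create_default_context()


def open_stream(host: str, port: int, use_tls: bool) -> socket.socket:
    """Open a reliable, ordered byte stream to host:port, optionally over TLS."""
    tcp_sock = socket.create_connection((host, port), timeout=10)
    if not use_tls:
        return tcp_sock
    # TLS is layered on top of the already established TCP connection.
    return make_client_context().wrap_socket(tcp_sock, server_hostname=host)


def fetch_head(stream: socket.socket, host: str) -> bytes:
    """Send a minimal HTTP/1.0 HEAD request; identical for TCP and TLS streams."""
    request = f"HEAD / HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    stream.sendall(request)
    chunks = []
    while True:
        data = stream.recv(4096)
        if not data:  # the peer closed its side of the stream
            break
        chunks.append(data)
    return b"".join(chunks)
```

Assuming a reachable server, fetch_head(open_stream(host, 443, use_tls=True), host) and fetch_head(open_stream(host, 80, use_tls=False), host) run exactly the same protocol code over the two kinds of stream.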
What additional guarantees does TLS provide? TLS provides some
amount of privacy, as the contents of messages are encrypted. It
provides a more robust concept of identity, by using public/private
keys to identify the end points of the connection. TLS uses
cryptographically secure checksums (message authentication codes) to detect data that has been
tampered with, corrupted, or replayed, thus providing integrity and
freshness beyond what TCP provides.
Even though TLS provides extra services that TCP cannot, TLS
doesn't and can't provide services that even basic protocols
require. The problem is that both TCP and TLS are tied to a
connection, but protocols may implement services that need to work
beyond the span of a connection.
A file transfer protocol, for
example, should verify that the file was actually stored correctly on
the destination machine's storage, as delivering the non-corrupted
file to the destination is the goal of the protocol. Unfortunately
this is something that TLS cannot possibly provide, because storing a
file to disk happens outside the scope of the TCP connection.
The protocol, on the other hand, can provide integrity and reliability by
asking, once the transfer has completed, for a checksum of the stored
file and comparing it to the checksum of the local file. If the
checksums don't match, the file can simply be transferred again. Even
though the communication is done over a reliable connection, TCP and
TLS can't provide these guarantees; they can only make guarantees about the data
they transmit within the scope of their connection.
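The verify-and-retransmit step described above might look like the following minimal sketch. The callables send_file and remote_checksum stand in for whatever requests a real file transfer protocol would define, and FlakyRemote is a hypothetical in-memory destination that corrupts the first copy it stores. SHA-256 is an assumed choice of checksum.

```python
import hashlib
from typing import Callable


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest, used here as the file checksum."""
    return hashlib.sha256(data).hexdigest()


def transfer_with_verification(
    data: bytes,
    send_file: Callable[[bytes], None],
    remote_checksum: Callable[[], str],
    max_attempts: int = 3,
) -> int:
    """Send data, compare checksums, resend on mismatch. Returns attempts used."""
    expected = sha256_of(data)
    for attempt in range(1, max_attempts + 1):
        send_file(data)
        # Ask the destination for the checksum of what it actually stored.
        if remote_checksum() == expected:
            return attempt
    raise IOError(f"checksum mismatch after {max_attempts} attempts")


class FlakyRemote:
    """Hypothetical destination that truncates the first file it stores."""

    def __init__(self) -> None:
        self.stored = b""
        self.calls = 0

    def store(self, data: bytes) -> None:
        self.calls += 1
        # Simulate a storage fault on the first attempt only.
        self.stored = data[:-1] if self.calls == 1 else data

    def checksum(self) -> str:
        return sha256_of(self.stored)
```

For the check to be end to end, the destination should compute its checksum from the file as read back from storage, not from the bytes it received over the connection.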
Even when it comes to privacy, TLS is not always sufficient. When
transferring email using SMTP, messages get passed from SMTP server to SMTP
server, until they reach the destination server, which delivers them to the
destination user's mailbox. The SMTP mail-delivery model is similar to that
used by postal mail, where the letter gets passed by truck to the mail
sorting center and passed on by stages until it reaches the postal carrier, who will
deliver it to the appropriate mailbox.
The problem is that even if the
connections between the SMTP servers use TLS rather than plain TCP, the
SMTP servers will still be able to read the contents of the email, because TLS
decrypts the data before handing it to the SMTP layer. TLS provides privacy
from third parties: they can't read the email because the TLS protocol
encrypts its data. On the other hand, TLS cannot provide privacy from the
SMTP servers. The problem is that the privacy requirements for transferring
an email are not tied to a specific TCP connection, and that is all
that TLS can provide.
Besides privacy, TLS cannot provide other services we might want
for email. TLS cannot indicate the identity of the sender of an
email; it can only prove the identity of the end points of a specific connection. As
with file transfer, TLS can't provide reliability to email
delivery. Nor can TLS provide freshness or integrity, as SMTP servers
can duplicate, modify, or delay emails they are relaying if they wish.
Layers Upon Layers
How then would identity and privacy be added to
SMTP? Or, more specifically, how would one ensure that the recipient
can verify the identity of the sender of an email and that only the
intended recipient can read it? The common solution is not,
interestingly enough, implemented in SMTP. Instead, it is implemented
on top of SMTP, in the contents of the messages.
PGP and S/MIME, the
two main alternatives, use public-key-based identities to encrypt and
sign the email itself, unlike TLS, which encrypts only the
communication that transfers the emails. The email is already encrypted
before delivery to the SMTP server. As a result, the privacy, identity, and
integrity are all implemented in such a way that the SMTP protocol,
and the servers that implement it, need not be involved. Unlike the guarantees of TLS,
these services are preserved even in the face of malicious or
malfunctioning SMTP servers.
Given the use of PGP or S/MIME, using TLS
for SMTP communication does not add much. All it does is prevent third
parties that are eavesdropping on connections between SMTP servers from discovering the sender
and recipient email addresses of emails, at the price of putting a
potentially large computational burden on the servers in order to
implement TLS's cryptographic functionality.
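The idea of carrying the guarantee inside the message, rather than in the connection, can be illustrated with a small sketch. It is a deliberate simplification and an assumption on several points: PGP and S/MIME use public-key signatures and encryption, while this example substitutes an HMAC over a shared secret (from Python's standard library) and covers integrity only, not privacy or sender identity. The function names and the relay functions are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = 32  # length in bytes of an HMAC-SHA256 tag


def seal(key: bytes, body: bytes) -> bytes:
    """Sender side: prepend an authentication tag to the message body."""
    return hmac.new(key, body, hashlib.sha256).digest() + body


def open_sealed(key: bytes, message: bytes) -> bytes:
    """Recipient side: return the body, or raise ValueError if it was altered."""
    tag, body = message[:TAG_LEN], message[TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message was modified in transit")
    return body


def honest_relay(message: bytes) -> bytes:
    """A relay that forwards the message unchanged."""
    return message


def tampering_relay(message: bytes) -> bytes:
    """A malicious or faulty relay that alters the message it forwards."""
    return message + b" (edited)"
```

With a key known only to sender and recipient, open_sealed(key, honest_relay(seal(key, body))) returns the original body, while a message passed through tampering_relay is rejected. The relays need no knowledge of the scheme, just as SMTP servers need no knowledge of PGP or S/MIME.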
As with privacy and integrity, reliability (knowing whether an
email was delivered) can't be implemented on the connection
level. Notifications of failed delivery (known as "bounces") and
message receipts are sent as additional, specially formatted email
messages carried over the same SMTP delivery mechanisms.
Protocols need to provide a number of guarantees regarding
reliability, integrity, and so on. When designing a protocol it is
important to understand where these services are best implemented,
what problems and potential attackers or failures they are intended to
guard against, and what tradeoffs are involved in implementing
them.
While some of these services can be implemented at the lower
levels of the protocol stack, in many cases they are better
implemented or must be implemented at higher levels. This principle is
known as the end-to-end argument and is fundamental to the design of the
Internet.