Designing Schemas for Business to Business E-Commerce
by Leigh Dodds
June 15, 2000
In a fast-paced session at XML Europe, Arofan Gregory, Lead Scientist
and Manager of the XML Common Business Library, provided an overview
of the role of XML Schemas in e-commerce and gave some guidelines
for good design.
The Role of Schemas
In the introduction to his presentation, Arofan Gregory stressed the
importance of XML schemas in developing robust, extensible business-to-business (B2B)
e-commerce applications. Gregory believes the lack of formal validation in
EDI messaging leaves the framework open to "abuse". Custom extensions
to standards lead to "tag bloat", which means that the same information
can frequently be expressed in multiple ways.
Use of XML brings the key advantage of providing DTD-based validation using generic
tools. However, DTDs do not have the data-typing features which are essential
for EDI applications. This is an area in which XML schema languages
excel, making them an enabler for XML/EDI application design.
Schema languages not only provide the rich data typing associated with ordinary
programming languages, but also include the capacity for new types to
be defined. This means that typing can be customised for particular application
contexts, e.g. to enforce particular number formats, field lengths, etc.
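As a rough illustration of user-defined types, the sketch below constrains a number format and a field length. It is written in the W3C XML Schema syntax as later standardized, which is an assumption: the namespace URI postdates this session. The type and element names (AmountType, PartNumberType, UnitPrice, PartNumber) are hypothetical and are not taken from xCBL.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: a non-negative amount with at most two decimal places -->
  <xs:simpleType name="AmountType">
    <xs:restriction base="xs:decimal">
      <xs:fractionDigits value="2"/>
      <xs:minInclusive value="0"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical type: an upper-case alphanumeric code of at most 12 characters -->
  <xs:simpleType name="PartNumberType">
    <xs:restriction base="xs:string">
      <xs:maxLength value="12"/>
      <xs:pattern value="[A-Z0-9]+"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Elements that use the custom types -->
  <xs:element name="UnitPrice" type="AmountType"/>
  <xs:element name="PartNumber" type="PartNumberType"/>

</xs:schema>
```

A DTD could only declare both elements as character data; here a generic validating parser can reject a price such as 12.345 or a part number that is too long.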
The second key advantage of schema languages is the provision for
breaking a schema into separate components. This encourages reuse of
existing definitions, leading to a greater chance that interoperability
can be achieved. Reuse of schema definitions was a central theme in
Gregory's presentation.
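A minimal sketch of this kind of componentization, again assuming the later W3C XML Schema syntax, might keep shared definitions in one file and import them wherever they are needed. The file names, namespace URIs, and type names below are hypothetical.

```xml
<!-- common-types.xsd: hypothetical shared component -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/common"
           elementFormDefault="qualified">
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="Street" type="xs:string"/>
      <xs:element name="City" type="xs:string"/>
      <xs:element name="Country" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

```xml
<!-- order.xsd: hypothetical message schema that reuses the shared component -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:c="http://example.com/common"
           targetNamespace="http://example.com/order"
           elementFormDefault="qualified">
  <!-- Pull in the shared definitions rather than redefining them -->
  <xs:import namespace="http://example.com/common"
             schemaLocation="common-types.xsd"/>
  <xs:element name="Order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ShipTo" type="c:AddressType"/>
        <xs:element name="BillTo" type="c:AddressType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Any other message schema can import the same file, so two trading partners who agree on the shared component describe addresses in exactly one way.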
Design Guidelines
Throughout the presentation, and in the accompanying
conference paper, Gregory provided many useful guidelines for creating
well-designed e-commerce schemas. As an example
of what not to do, Gregory observed that early efforts to apply
XML simply redefined the EDI message using an XML syntax. This
yields little advantage as no extra semantics have been added.
Gregory commented that there is a lot to be gained from looking at
existing EDI frameworks and standards. EDI applications have been in
use for decades, so the analysis to define data types has been
well-tested in the field. Existing type definitions can be used as a
starting point when developing your schemas. The key issue is to get the
data types clearly defined and agreed with trading partners. Structural
differences are less important, as they can be removed by
transformations.
Data types should be defined with extension and refinement in mind. To
this end, the core structures should be minimal; additional constraints
can be applied through additive refinement of the type. Gregory stressed
that subtractive refinement (starting with a complex type definition
and removing unnecessary constraints) is less useful and makes types
harder to reuse.
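One possible reading of additive refinement, sketched here in the later W3C XML Schema syntax, is to start from a deliberately small core type and derive richer types by extension. This is an interpretation rather than Gregory's own example, and the names PartyType, SupplierType, SupplierID, and TaxNumber are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Minimal core type: only what every trading partner needs -->
  <xs:complexType name="PartyType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Additive refinement: keeps the core content and appends new elements -->
  <xs:complexType name="SupplierType">
    <xs:complexContent>
      <xs:extension base="PartyType">
        <xs:sequence>
          <xs:element name="SupplierID" type="xs:string"/>
          <xs:element name="TaxNumber" type="xs:string" minOccurs="0"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

Because SupplierType only adds to PartyType, software written against the core type can still read the Name of any supplier, which is the reuse benefit that a subtractive approach tends to lose.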
The schema itself should be designed from the perspective of the
business process, and not a particular application. This gives a greater
degree of flexibility and better future-proofing as it makes no
assumptions about either who or what will be processing the messages.
As use of XML schemas grows, the potential for reuse expands. Alongside
the Common Business Library (xCBL), Gregory highlighted several
other efforts worthy of a closer look. BizTalk provides a great repository of
schemas, and RosettaNet is an excellent source of information on process
and data set analysis. However, Gregory singled out ebXML as the activity
most likely to yield successful results and
close the gap between the XML and EDI communities.
With respect to particular schema languages, Gregory clearly supported
the efforts of the W3C Schema Working Group. He also confirmed that
while xCBL currently holds schema components in SOX and XDR formats,
these will shortly be supplemented with XML Schema definitions.