XML and the World-Wide Web Consortium Leverage Action Project
by Brian Matthews
The World-Wide Web is based on some very simple technologies.
In particular, the Hypertext Markup Language (HTML) is a simple
language for describing documents. However, HTML is severely limited
as an information management medium. HTML's mix of structure and
presentation means that reformatting the data to give different
views is hard. Further, the lack of domain-specific data modelling
in HTML has made accurate searching for information on the Web
difficult and has made it hard to interact with databases. Thus
the very features which led to the widespread acceptance of HTML
are limiting the utility of the Web itself.
In response, the World-Wide Web Consortium (W3C) has developed
the Extensible Markup Language (XML) (http://www.w3.org/XML/).
XML is not intended as a replacement for HTML, but rather as a
more flexible alternative for the representation of data across
the Web. XML is intended to allow new data formats to be defined
while maintaining the universality of HTML.
XML is based on the existing Standard Generalised Markup Language
(SGML). The key concept brought to XML from SGML is that of a
Document Type Definition (DTD). This is a declaration of the correct
markup structure for a class of XML documents against which documents
can be validated. The DTD thus defines the logical structure of a
class of valid documents, and applications can rely on that structure
when manipulating a document.
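As a minimal sketch of this idea, the Python fragment below parses a small document that carries its DTD as an internal subset. The element names (glossary, entry, term, definition) and the conforms helper are hypothetical, chosen only for illustration. Python's standard-library parser checks well-formedness but does not validate against a DTD, so the declared content model is mirrored here by a hand-written check; a validating parser would normally do this work.

```python
import xml.etree.ElementTree as ET

# Hypothetical document type: a glossary holds one or more entries,
# each made of exactly one term followed by one definition.
DOCUMENT = """<!DOCTYPE glossary [
  <!ELEMENT glossary (entry+)>
  <!ELEMENT entry (term, definition)>
  <!ELEMENT term (#PCDATA)>
  <!ELEMENT definition (#PCDATA)>
]>
<glossary>
  <entry>
    <term>DTD</term>
    <definition>Document Type Definition</definition>
  </entry>
</glossary>"""


def conforms(root):
    """Hand-written stand-in for validation against the DTD above."""
    # The root must be a glossary with at least one child (entry+).
    if root.tag != "glossary" or len(root) == 0:
        return False
    # Every child must be an entry containing term, then definition.
    return all(
        entry.tag == "entry"
        and [child.tag for child in entry] == ["term", "definition"]
        for entry in root
    )


root = ET.fromstring(DOCUMENT)
print(conforms(root))
```

Because the structure is declared once in the DTD, any application handling this class of documents can make the same assumptions about where a term or a definition will be found.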
XML can therefore be used to define new document markup that is
closer to the intended use of the document, in a flexible yet universally
interpretable way. DTDs can then be written for a wide variety of
application domains and data formats.
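To illustrate what markup closer to the intended use of a document buys in practice, the sketch below keeps glossary data in a hypothetical domain-specific vocabulary (glossary, entry, term, definition are invented names) and derives two different views from it. The function names are likewise assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain-specific markup: the tags describe what the
# data is, not how it should be displayed.
SOURCE = """<glossary>
  <entry><term>XML</term><definition>Extensible Markup Language</definition></entry>
  <entry><term>SGML</term><definition>Standard Generalised Markup Language</definition></entry>
</glossary>"""


def as_plain_text(root):
    """View 1: one 'term: definition' line per entry."""
    return "\n".join(
        f"{entry.findtext('term')}: {entry.findtext('definition')}"
        for entry in root.iter("entry")
    )


def as_html_list(root):
    """View 2: an HTML definition list built from the same data."""
    dl = ET.Element("dl")
    for entry in root.iter("entry"):
        ET.SubElement(dl, "dt").text = entry.findtext("term")
        ET.SubElement(dl, "dd").text = entry.findtext("definition")
    return ET.tostring(dl, encoding="unicode")


root = ET.fromstring(SOURCE)
print(as_plain_text(root))
print(as_html_list(root))
```

Both views are generated from the same source, so the presentation can change without touching the data. The same element names also make it possible to search on terms alone.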
The W3C-LA project, which involves INRIA and RAL together with the
W3C offices at SICS, GMD, CWI, and FORTH, has been exploring the use
of XML within several different demonstrators:
- MathML - an XML-based method of representing mathematics across
the Web has been implemented in the Amaya reference browser at
INRIA
- Hyperglossaries - RAL has been collaborating with the Virtual
Hyperglossary Group in using XML to provide glossary information
for terms within Web documents
- RDF - RAL is producing a demonstration of the use of the W3C's
XML-based metadata description language RDF within a workflow
application
- SMIL - RAL, in collaboration with CWI, has produced a demonstrator
of the vendor-neutral Synchronized Multimedia Integration Language,
which uses the XML standard to transmit multimedia across the
Web
- Schematic Graphics Markup Language - RAL has submitted a proposal
to the W3C to provide an XML-based standard for representing graphical
objects using a schematic representation, as opposed to a binary
format. This proposal has been prototyped in the Amaya browser
at INRIA
- XML Browser - Work is under way to produce a general XML browser
built within Amaya
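To give a flavour of what such XML-based formats look like to an application, the sketch below reads a small presentation MathML fragment for the expression x + 1 and lists its identifier, operator and number tokens. It is an illustration only and is not drawn from the Amaya implementation. The namespace URI is the one used by the MathML recommendations, while the tokens helper is a hypothetical name introduced here.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Presentation MathML for "x + 1": mi = identifier, mo = operator,
# mn = number, mrow = horizontal grouping.
FRAGMENT = (
    f'<math xmlns="{MATHML_NS}">'
    "<mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow>"
    "</math>"
)


def tokens(root):
    """Return (element name, text) pairs for the token elements."""
    found = []
    for element in root.iter():
        # ElementTree stores tags as "{namespace}local"; keep the local part.
        local_name = element.tag.split("}", 1)[-1]
        if local_name in ("mi", "mo", "mn"):
            found.append((local_name, element.text))
    return found


print(tokens(ET.fromstring(FRAGMENT)))
```

The markup records the role of each symbol rather than a picture of the formula, so a browser can render it and other software can process it.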
The common driving force behind these initiatives is the desire
to transmit and present new kinds of information across the WWW
in a flexible and open way. They also demonstrate the widely differing
application domains that XML can serve, and its potential to enhance
the capacity of the WWW. Further information on W3C and W3C-LA
activities can be obtained by contacting the W3C at INRIA, or the
W3C offices established at RAL, SICS, GMD, CWI, and FORTH.
Please contact:
Brian Matthews - CLRC
Tel: +44 1235 44 6648
E-mail: b.m.matthews@rl.ac.uk