
1. Introduction

Designing and maintaining a Web site is a difficult task. On the Internet, it is far easier to find inconsistent information than a well-maintained site. Our goal is to study and build the tools needed to design, produce, and maintain complex and coherent Web sites. Most work in this domain concerns the syntactic structure of Web sites, which has led to XML. But only a small part of the semantics can be handled by syntactic constraints, and we would like to address all the semantics of a Web site.

After reviewing the current possibilities for representing the semantics of Web pages, and more generally of Web sites (using HTML and XML), we present our main objective: supporting designers and webmasters in specifying and verifying the semantics of Web sites. The problem of adding semantics is increasingly addressed by the working groups of the W3C (see RDF [W3C-RDF1999,W3C-RDF-Schema1999], XML Schema [W3C-XML-Schema2000]...) and by the ontological approach coming from AI research (SHOE [Heflin, Hendler, and Luke1999], On2broker [Fensel et al.1998b]). But the main motivation of that work is to improve information retrieval by providing better indexing.

Our motivation is slightly different, as we want to help in designing, specifying, and checking Web sites. Very little work addresses the semantic verification of Web pages. Two examples are WebMaster [van Harmelen and van der Meer1999,van Harmelen and Fensel1999] and the work by PCR99, which uses attribute grammars.

Our approach is inspired by previous work on the semantics of programming languages. It draws a parallel between the syntax of programming languages and the structure of Web sites (or semi-structured documents), and between the semantics of programs and the semantics of Web sites, and applies notions of types and semantic rules to documents on the Web. To achieve this goal, we have used the Centaur system (a generic programming environment generator, http://www.inria.fr/croap/centaur/centaur.html) and its semantics specification formalism, Typol, to construct a prototype Web site verification system based on inference rules written in natural semantics [Despeyroux1987,Kahn1987,Despeyroux1988,Borras et al.1988].

We illustrate this method by applying it to two examples of Web sites: a thematic directory (like Yahoo) and an institutional site. The use of natural semantics clearly shows the difference between syntactic checking (for example, verifying a page against a DTD, as an XML validator does), which is context free, and a semantic computation, which is context dependent. The thematic directory example shows the possibility of using external resources (thesauri, ontologies).
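To make the contrast between context-free and context-dependent checking concrete, here is a minimal sketch in Python. It is not the Typol specification used in the prototype. The element names, the content model ALLOWED_CHILDREN, the thesaurus NARROWER, and both checking functions are hypothetical and chosen only for illustration. The syntactic check looks at each element and its children in isolation, while the semantic check carries the enclosing category down the tree as an environment, much as an inference rule would.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: which child tags each element may contain.
ALLOWED_CHILDREN = {
    "directory": {"category"},
    "category": {"category", "link"},
    "link": set(),
}

# Hypothetical thesaurus (an external resource): term -> directly narrower terms.
NARROWER = {
    "science": {"physics", "biology"},
    "physics": {"optics"},
}


def syntax_errors(element):
    """Context-free check: an element is judged only by its tag and its children."""
    errors = []
    for child in element:
        if child.tag not in ALLOWED_CHILDREN.get(element.tag, set()):
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        errors.extend(syntax_errors(child))
    return errors


def semantic_errors(element, parent_topic=None):
    """Context-dependent check: the enclosing topic is passed down as an environment."""
    errors = []
    topic = parent_topic
    if element.tag == "category":
        topic = element.get("name")
        # A sub-category must be a narrower term of the category that contains it.
        if parent_topic is not None and topic not in NARROWER.get(parent_topic, set()):
            errors.append(f"'{topic}' is not a narrower term of '{parent_topic}'")
    for child in element:
        errors.extend(semantic_errors(child, topic))
    return errors


# A directory that is structurally valid but misplaces "biology" under "physics".
SITE = """
<directory>
  <category name="science">
    <category name="physics">
      <category name="biology"/>
    </category>
  </category>
</directory>
"""

root = ET.fromstring(SITE)
print(syntax_errors(root))
print(semantic_errors(root))
```

In this sketch the sample directory passes the syntactic check, since every element appears where the content model allows it. The misplaced sub-category is only detected once the check knows which category encloses it and consults the thesaurus.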


Thierry Despeyroux
Thu May 4 16:00:23 MEST 2000