Seminar by Paolo Papotti, Università Roma Tre, Italy
"Core Mappings: Schema Mapping Revolution"
November 3, 2009, 14:00, room G008

Abstract

Schema mappings are high-level specifications that describe the
relationship between database schemas. They are an important tool in
several areas of database research and have a central role in data
exchange and data integration.

Research has investigated mappings from two perspectives. On the one
hand, there are studies of practical tools for schema mapping
generation (e.g., Clio at IBM Almaden). These works focus on
algorithms to generate mappings based on visual specifications
provided by users. On the other hand, there are theoretical studies
of data exchange. These study how to generate a solution – i.e., a
target instance – given a set of mappings. In this context, the notion
of the core of a data exchange solution has been formally identified as
an optimal solution. However, until recently, the only way to produce
core solutions was to post-process an intermediate materialization,
since no mapping system supported core computation.

In this talk I will start with a short history of schema mapping
systems in recent times. I will then present algorithms that have
contributed to bridging the gap between the practice of mapping
generation and the theory of data exchange. I will focus on techniques
to generate "core schema mappings", that is, mappings that are able to
materialize core solutions without post-processing computation. I will
show that by using core schema mappings on top of common runtime
engines, it is possible to achieve performance orders of magnitude
better than computing the core as a post-processing step.
Finally, I will discuss an application of schema mappings in the
specific context of the automatic extraction and integration of data
from the Web. The talk ends with a discussion of current and future
lines of research.