The October 2013 issue of the ACM Transactions on the Web includes an article of ours on the bottom-up design of domain models for connected Web data sources. This problem is growing in importance as a wealth of data services becomes available on the Web, and as building and querying Web applications that effectively integrate Web content matters more and more. However, schema integration and ontology matching aimed at registering data services often require a knowledge-intensive, tedious, and error-prone manual process. In the paper we tackle this issue as described below.
The paper has been authored by Stefano Ceri, Silvia Quarteroni and myself within the research project Search Computing.
The full paper is available for download on the ACM Digital Library (free of charge, courtesy of the ACM Author-izer service) through this URL:
http://dl.acm.org/citation.cfm?id=2493536
This is the summary of the contribution:
We present a bottom-up, semi-automatic service registration process that refers to an external knowledge base and uses simple text processing techniques in order to minimize and possibly avoid the contribution of domain experts in the annotation of data services. The first by-product of this process is a representation of the domain of data services as an entity-relationship diagram, whose entities are named after concepts of the external knowledge base matching service terminology rather than being manually created to accommodate an application-specific ontology. Second, a three-layer annotation of service semantics (service interfaces, access patterns, service marts) describing how services “play” with such domain elements is also automatically constructed at registration time. When evaluated against heterogeneous existing data services and with a synthetic service dataset constructed using Google Fusion Tables, the approach yields good results in terms of data representation accuracy.
We subsequently demonstrate that natural language processing methods can be used to decompose and match simple queries to the data services represented in three layers according to the preceding methodology with satisfactory results. We show how semantic annotations are used at query time to convert the user’s request into an executable logical query. Globally, our findings show that the proposed registration method is effective in creating a uniform semantic representation of data services, suitable for building Web applications and answering search queries.
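To give a flavour of what matching service terminology against an external knowledge base can look like, here is a toy Python sketch. It is not the algorithm from the paper: the tiny `KNOWLEDGE_BASE` dictionary, the function names (`tokenize`, `match_concept`, `annotate`) and the sample parameter names are all assumptions made up for illustration, and the matching is a plain token overlap rather than the text processing actually used in our work.

```python
import re

# Hypothetical stand-in for an external knowledge base:
# each concept name maps to a set of lowercase labels (assumed data).
KNOWLEDGE_BASE = {
    "Movie": {"movie", "film"},
    "Actor": {"actor", "performer"},
    "Theater": {"theater", "theatre", "cinema"},
}


def tokenize(name):
    """Split a parameter name like 'movieTitle' or 'theatre_name' into lowercase tokens."""
    # Insert a space at lowercase/digit -> uppercase boundaries (camelCase).
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    # Split on any run of non-alphanumeric characters and drop empty pieces.
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def match_concept(name):
    """Return the first concept sharing a token with the name, or None if there is no match."""
    tokens = set(tokenize(name))
    for concept, labels in KNOWLEDGE_BASE.items():
        if tokens & labels:
            return concept
    return None


def annotate(params):
    """Map each service parameter name to a knowledge-base concept (or None)."""
    return {p: match_concept(p) for p in params}


if __name__ == "__main__":
    # Parameter names of an imaginary movie-search service.
    print(annotate(["movieTitle", "theatre_name", "zipCode"]))
```

In this sketch, parameters that match no concept are simply left unannotated (`None`); a real registration process would need a richer knowledge base and a way to rank competing matches.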
The BibTeX reference is as follows:
@article{QBC2013,
author = {Quarteroni, Silvia and Brambilla, Marco and Ceri, Stefano},
title = {A bottom-up, knowledge-aware approach to integrating and querying web data services},
journal = {ACM Trans. Web},
issue_date = {October 2013},
volume = {7},
number = {4},
month = nov,
year = {2013},
issn = {1559-1131},
pages = {19:1--19:33},
articleno = {19},
numpages = {33},
url = {http://doi.acm.org/10.1145/2493536},
doi = {10.1145/2493536},
acmid = {2493536},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {Web data integration, Web data services, Web services, natural language Web query, service querying, structured Web search},
}
To stay updated on my activities, you can subscribe to the RSS feed of my blog or follow my Twitter account (@MarcoBrambi).