Enterprise knowledge graphs (EKGs) expressed in the Resource Description Framework (RDF) consolidate and semantically integrate heterogeneous data sources into a comprehensive dataspace. However, to make an external relational data source accessible through an EKG, an RDF view of the underlying relational database, called an RDB2RDF view, must be created. The RDB2RDF view should be materialized when live access to the data source is not possible, or when the data source restricts the query forms it accepts or the number of results it returns. In this case, a mechanism for keeping the materialized view data up to date is also required. The purpose of this paper is to address the problem of the efficient maintenance of externally materialized RDB2RDF views.
This paper proposes a formal framework for the incremental maintenance of externally materialized RDB2RDF views, in which the server computes and publishes changesets that capture the difference between two states of the view. The EKG system can then download the changesets and synchronize the externally materialized view. The changesets are computed based solely on the update and the source database state and require no access to the content of the view.
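The synchronization step described above can be illustrated with a minimal sketch. It assumes a changeset is simply a pair of triple sets, one to remove and one to add; the `Changeset` class, the `apply_changeset` function, and the `ex:` identifiers are hypothetical names chosen for illustration and are not taken from the paper.

```python
from dataclasses import dataclass

# A triple is modeled as a (subject, predicate, object) tuple of strings.
Triple = tuple[str, str, str]


@dataclass(frozen=True)
class Changeset:
    """Hypothetical changeset: triples to remove and triples to add."""
    removed: frozenset[Triple] = frozenset()
    added: frozenset[Triple] = frozenset()


def apply_changeset(view: set[Triple], cs: Changeset) -> set[Triple]:
    """Return the new view state: drop the removed triples, then add the new ones."""
    return (view - cs.removed) | cs.added


# The consumer holds a local copy of the view and applies a downloaded changeset.
view = {
    ("ex:emp/1", "ex:name", "Ana"),
    ("ex:emp/1", "ex:dept", "ex:dept/10"),
}
cs = Changeset(
    removed=frozenset({("ex:emp/1", "ex:dept", "ex:dept/10")}),
    added=frozenset({("ex:emp/1", "ex:dept", "ex:dept/20")}),
)
view = apply_changeset(view, cs)
```

In this sketch the consumer never queries the source database; it only needs the published changesets, applied in the order in which they were produced.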
The central result of this paper shows that changesets computed according to the formal framework correctly maintain the externally materialized RDB2RDF view. The experiments indicate that the proposed strategy supports live synchronization of large RDB2RDF views and that the time taken to compute the changesets with the proposed approach was almost three orders of magnitude smaller than that of partial rematerialization and three orders of magnitude smaller than that of full rematerialization.
The main idea that differentiates the proposed approach from previous work on incremental view maintenance is to exploit the object-preserving property of typical RDB2RDF views so that the solution can deal with views with duplicates. The algorithms for the incremental maintenance of relational views with duplicates published in the literature require querying the materialized view data to compute the changesets precisely. By contrast, the approach proposed in this paper requires no access to view data. This is important when the view is maintained externally, because accessing a remote data source may be too slow.
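The following sketch illustrates, under simplifying assumptions, why duplicates are the difficulty and how a changeset can be derived from source states alone. It is not the algorithm of the paper: the schema (employee, project, area), the `ex:` identifiers, and the functions `triples_for` and `changeset_for_update` are hypothetical. The assumption is that every triple about an employee is derived only from source rows of that employee, so the triples of an affected employee can be recomputed from the source before and after the update and then compared.

```python
# A triple is modeled as a (subject, predicate, object) tuple of strings.
Triple = tuple[str, str, str]
# A source row: (employee id, project id, research area of the project).
Row = tuple[int, str, str]


def triples_for(emp_id: int, rows: list[Row]) -> set[Triple]:
    """All view triples about one employee, derived only from that employee's rows.

    Two projects in the same area derive the same triple twice; using a set
    collapses these duplicate derivations.
    """
    subject = f"ex:emp/{emp_id}"
    return {(subject, "ex:worksInArea", area) for (e, _proj, area) in rows if e == emp_id}


def changeset_for_update(
    emp_id: int, before: list[Row], after: list[Row]
) -> tuple[set[Triple], set[Triple]]:
    """Return (removed, added) by comparing the employee's triples in both source states."""
    old = triples_for(emp_id, before)
    new = triples_for(emp_id, after)
    return old - new, new - old


before = [(1, "p1", "AI"), (1, "p2", "AI"), (1, "p3", "DB")]

# Deleting p1 leaves another derivation of the "AI" triple, so nothing changes.
removed_a, added_a = changeset_for_update(1, before, [(1, "p2", "AI"), (1, "p3", "DB")])

# Deleting p3 removes the only derivation of the "DB" triple.
removed_b, added_b = changeset_for_update(1, before, [(1, "p1", "AI"), (1, "p2", "AI")])
```

A counting-based approach would instead need the number of derivations stored with each view triple, which is the kind of view access that is costly when the view is held remotely.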
This work was partly funded by FAPERJ under grant E-26/200.834/2021; by CAPES under grants 88881.310592 – 2018/01; and by CNPq under grant 305587/2021-8. The support of the Universidade Autónoma de Lisboa Luis de Camões is also gratefully acknowledged.
Vidal, V., Magalhães Pequeno, V., Moura Arruda Júnior, N. and Casanova, M.A. (2022), "Publication and maintenance of RDB2RDF views externally materialized in enterprise knowledge graphs", International Journal of Web Information Systems, Vol. 18 No. 5/6, pp. 255-285. https://doi.org/10.1108/IJWIS-02-2022-0043
Emerald Publishing Limited
Copyright © 2022, Emerald Publishing Limited