Semantic Disclosure Control: semantics meets data privacy

Montserrat Batet (Internet Interdisciplinary Institute (IN3), Universitat Oberta de Catalunya, Barcelona, Spain)
David Sánchez (Department of Computer Science and Mathematics, CYBERCAT – Center for Cybersecurity Research of Catalonia, Universitat Rovira i Virgili, UNESCO Chair in Data Privacy, Tarragona, Spain)

Online Information Review

ISSN: 1468-4527

Article publication date: 11 June 2018

Abstract

Purpose

To overcome the limitations of purely statistical approaches to data protection, this paper proposes Semantic Disclosure Control (SeDC): an inherently semantic privacy protection paradigm that, by relying on state-of-the-art semantic technologies, rethinks privacy and data protection in terms of the meaning of the data.

Design/methodology/approach

The need for data protection mechanisms able to manage data from a semantic perspective is discussed and the limitations of statistical approaches are highlighted. Then, SeDC is presented by detailing how it can be enforced to detect and protect sensitive data.

Findings

So far, data privacy has been tackled from a statistical perspective; that is, available solutions focus only on the distribution of the data values. This contrasts with the semantic way in which humans understand and manage (sensitive) data. As a result, current solutions present limitations both in preventing disclosure risks and in preserving the semantics (utility) of the protected data.

Practical implications

SeDC captures more general, realistic and intuitive notions of privacy and information disclosure than purely statistical methods. As a result, it is better suited to protect heterogeneous and unstructured data, which are the most common in current data release scenarios. Moreover, SeDC preserves the semantics of the protected data better than statistical approaches, which is crucial when using protected data for research.

Social implications

Individuals are increasingly aware of the privacy threats that the uncontrolled collection and exploitation of their personal data may pose. In this respect, SeDC offers an intuitive notion of privacy protection that users can easily understand. It also naturally captures the (non-quantitative) privacy notions stated in current legislation on personal data protection.

Originality/value

In contrast to statistical approaches to data protection, SeDC assesses disclosure risks and enforces data protection from a semantic perspective. As a result, it offers more general, intuitive, robust and utility-preserving protection of data, regardless of their type and structure.

Acknowledgements

This work was partly supported by the European Commission (projects H2020-644024 “CLARUS” and H2020-700540 “CANVAS”) and by the Spanish Government (projects TIN2014-57364-C2-R “SmartGlacis”, TIN2015-70054-REDC “Red de excelencia Consolider ARES” and TIN2016-80250-R “Sec-MCloud”). The opinions expressed in this paper are those of the authors and do not necessarily reflect the views of UNESCO. M. Batet is supported by a postdoctoral grant from the Ministry of Economy and Competitiveness (MINECO) (FPDI-2013-16589).

Citation

Batet, M. and Sánchez, D. (2018), "Semantic Disclosure Control: semantics meets data privacy", Online Information Review, Vol. 42 No. 3, pp. 290-303. https://doi.org/10.1108/OIR-03-2017-0090

Publisher

Emerald Publishing Limited

Copyright © 2018, Emerald Publishing Limited
