Existing algorithms for predicting suicide risk rely solely on data from electronic health records, but such models could be improved by incorporating publicly available socioeconomic data, such as financial, legal, life event and sociodemographic data. The purpose of this study is to understand the complex ethical and privacy implications of incorporating sociodemographic data within the health context. This paper presents results from a survey exploring the general public’s knowledge of, and concerns about, such publicly available data and the appropriateness of using it in suicide risk prediction algorithms.
A survey was developed to measure public opinion about privacy concerns with using socioeconomic data across different contexts. The survey presented respondents with multiple vignettes describing scenarios situated in medical, private business and social media contexts, and asked them to rate their level of concern about each scenario and to identify the factor that contributed most to that concern. Specific to suicide prediction, the survey presented respondents with various data attributes that could potentially be used in a suicide risk algorithm and asked them to rate how concerned they would be if each attribute were used for this purpose.
The authors found considerable concern across the various contexts represented in their vignettes, with the greatest concern in vignettes that focused on the use of personal information within the medical context. Specific to the question of incorporating socioeconomic data within suicide risk prediction models, the results of this study show a clear concern from all participants about data attributes related to income, crime and court records, and assets. Data about one’s household were also particularly concerning for the respondents, suggesting that even if one might be comfortable with their own data being used for risk modeling, data about other household members is more problematic.
Previous studies on the privacy concerns that arise when integrating data from various contexts of people’s lives into algorithmic and related computational models have approached these questions within individual contexts. This study differs in that it captured the variation in privacy concerns across multiple contexts. This study also specifically assessed the ethical concerns related to a suicide prediction model and determined people’s awareness of the publicness of select data attributes, as well as which of these data attributes generated the most concern in such a context. To the best of the authors’ knowledge, this is the first study to pursue this question.
This material is based upon work supported by the National Science Foundation REU site grant no. IIS-1950826 “Data Science Across the Disciplines.” The authors also thank Dr Jordan Smoller and his colleagues at Harvard Medical School and in the Psychiatric and Neurodevelopmental Genetics Unit (PNGU) at Massachusetts General Hospital for their feedback and support.
Zimmer, M. and Logan, S. (2021), "Privacy concerns with using public data for suicide risk prediction algorithms: a public opinion survey of contextual appropriateness", Journal of Information, Communication and Ethics in Society, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/JICES-08-2021-0086
Emerald Publishing Limited
Copyright © 2021, Emerald Publishing Limited