Data Reuse and the Problem of Group Identity

Publication date: 30 June 2017


Reusing existing data sets of health information for public health or medical research has much to recommend it. Much data repurposing in medical or public health research or practice involves information that has been stripped of individual identifiers, but some does not. In some cases, there may have been consent to the reuse, but in other cases consent may be absent and people may be entirely unaware of how the data about them are being used. Data sets are also being combined and may contain information with very different sources, consent histories, and individual identifiers. Much of the ethical and policy discussion about the permissibility of data reuse has centered on two questions: for identifiable data, the scope of the original consent and whether the reuse is permissible in light of that scope; and for de-identified data, whether there are unacceptable risks that the data will be reidentified in a manner that is harmful to any data subjects. Prioritizing these questions rests on a picture of the ethics of data use as primarily about respecting the choices of the data subject. We contend that this picture is mistaken; data repurposing, especially when data sets are combined, raises novel questions about the impacts of research on groups and their implications for individuals regarded as falling within these groups. These impacts suggest that the controversies about de-identification or reconsent for reuse are to some extent beside the point. Serious ethical questions are also raised by the inferences that may be drawn about individuals from the research and the resulting risks of stigmatization. These risks may arise even for individuals who were not part of the original data set being repurposed: data reuse, repurposing, and recombination may have damaging effects on people not included within the original data sets.
These issues of justice for individuals who might be regarded as indirect subjects of research are not even raised by approaches that consider only the implications for, or the agreement of, the original data subject. This chapter argues that health information should be available for reuse, but in a way that does not yield surprises, produce direct harm to individuals, or violate warranted trust.




We are grateful to Alexis Juergens for assistance with the research for this chapter. We are also grateful for support from the S. J. Quinney College of Law fund for excellence in faculty research and teaching.


