Technology: Interaction data may allow identification of anonymized individuals over time
Nature Communications
January 26, 2022
Records of people’s interactions could be used to identify individuals in anonymized datasets across long periods of time, suggests a study published in Nature Communications. The findings indicate that current practices for handling this type of data may not meet the anonymization standards set by the European Union’s General Data Protection Regulation.
Fine-grained interaction data is collected by messaging apps, mobile phone carriers, social media providers and other apps in order to operate their services or for research purposes. It has been used to study the interaction patterns of individuals, forecast the spatial spread of epidemics, and investigate the effects of friendships on political mobilisation. Under current data protection regulations, this data can be shared and sold without the user’s consent, provided it is anonymized.
Yves-Alexandre de Montjoye, Ana-Maria Cretu and colleagues found that people’s interaction data remains stable over long periods of time and that this could be used to identify individuals in anonymized datasets. The authors developed a deep learning-based model, which they trained to identify individuals based on their interaction network, and applied it to a dataset of over 40,000 individuals collected over different periods of time. The model was able to identify 52% of individuals based on their 2-hop interaction network (interactions with individuals twice removed from the target individual). Using only an individual’s direct contacts, the model could identify people 15% of the time. Because interactions remain stable over time, the authors were also able to identify 24% of people after 20 weeks using their 2-hop interaction network. When the model was applied to a Bluetooth close-proximity dataset of 587 people, it could identify individuals more than 26% of the time. However, the authors note that they do not believe their model would be applicable to contact tracing protocols, such as Google and Apple’s Exposure Notification.
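To make the idea of a 2-hop interaction network concrete, the short Python sketch below collects everyone within a given number of hops of a target individual from a list of pairwise interactions. It is an illustration only and not the authors’ model: the function name `k_hop_neighbourhood`, the toy `interactions` list and the pseudonyms in it are all hypothetical, and the study’s deep learning pipeline is not reproduced here.

```python
from collections import deque


def k_hop_neighbourhood(edges, target, k=2):
    """Return everyone within k hops of `target`, excluding the target itself.

    `edges` is an iterable of (person_a, person_b) pairs, each treated as an
    undirected interaction. This is a plain breadth-first search, used here
    only to illustrate what a k-hop interaction network contains.
    """
    # Build an undirected adjacency map from the interaction pairs.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Breadth-first search that stops expanding once depth k is reached.
    distance = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)

    return {person for person in distance if person != target}


# Hypothetical toy data: pseudonymous interaction records.
interactions = [("ana", "ben"), ("ben", "cai"), ("cai", "dee"), ("ana", "eli")]
one_hop = k_hop_neighbourhood(interactions, "ana", k=1)  # direct contacts only
two_hop = k_hop_neighbourhood(interactions, "ana", k=2)  # contacts of contacts too
```

In this toy example the 2-hop neighbourhood adds one person beyond the direct contacts, which hints at why the larger neighbourhood gives a more distinctive pattern to match against than direct contacts alone.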
The authors argue their results demonstrate that anonymized and disconnected interaction data may be identifiable over long periods of time, which has implications for compliance with privacy legislation. They suggest that security measures including access controls and privacy-enhancing systems could be used to protect against such re-identification.
doi:10.1038/s41467-021-27714-6