A collection of ethical requirements for conducting research using hacked data is presented in a Perspective published in Nature Machine Intelligence.
In recent years, research in computational sciences and machine learning has accelerated as large amounts of data have become publicly available. Datasets that have become available through hacking — such as the WikiLeaks datasets or data leaked from the dating website Ashley Madison in 2015 — can be unique and valuable resources for scientific research. However, ethical dilemmas in using human data where no individual consent has been given need to be urgently addressed.
Marcello Ienca and Effy Vayena discuss and critically evaluate the advantages and disadvantages of using hacked data, drawing upon historical examples of scientific misconduct, as well as current research ethics guidelines. The authors conclude that although it may be lawful for researchers to use hacked data if they are publicly available, responsible research practices still require clear ethical justification for doing so. They therefore propose six ethical and procedural requirements that need to be addressed. For example, researchers need to demonstrate the uniqueness of the hacked dataset in question, and also show that there is no viable alternative method to collect similar data. Researchers should additionally conduct a risk–benefit assessment and take measures to ensure that individual privacy is preserved.
By proposing this set of ethical requirements, the authors intend to stimulate a debate in the scientific community to clarify when — if at all — hacked data can be used in research, and under what conditions.
This press release refers to a Nature Machine Intelligence Perspective piece, not a Nature Machine Intelligence research article. Perspectives are intended to provide a forum for authors to discuss models and ideas from a personal viewpoint. They are peer reviewed.