Ethical Considerations in the Use of Big Data, AI, and Real-Time Information for Prediction of Behavioral Health Outcomes

January 2022

Jordan Smoller, Harvard T.H. Chan School of Public Health
Matthew Nock, Harvard Faculty of Arts and Sciences

Psychiatric and behavioral conditions are responsible for an enormous burden of disability, morbidity, and mortality; however, options for accurate prediction and effective intervention remain limited. Advances in big data analytics and mobile technology are rapidly creating new opportunities in psychiatry to improve prediction of behavioral health outcomes. Increasingly, it is possible to access data from multiple sources and build machine learning models that predict whether and when these critical outcomes may occur. This capability raises novel ethical dilemmas as we consider using such models to intervene and attempt prevention. Notably, dilemmas arise regarding (1) what data are accessed and linked across sources, for example, the implications of linking medical records with datasets containing arrest records or social media posts; (2) when data are accessed and over what time horizons they are used to predict outcomes; and (3) how data are modeled, given that black-box, uninterpretable models may inherit and pass along biases present in their training datasets. To balance the potential value of data-driven prediction against its potential costs, ethical violations, and inequities, it is important to create a forum for dialogue and generative exchange in this space. As these forms of data and modeling techniques rapidly become more common in academia, medicine, and industry, now is an ideal time to bring together experts across disciplines to develop a fuller accounting of the ethical dilemmas, practical challenges, and potential solutions and guidelines for pursuing this important work.