A new artificial intelligence (AI) tool can forecast a person's risk of developing more than 1,000 diseases, in some cases providing a prediction decades in advance.
The model, called Delphi-2M, uses health records and lifestyle factors to estimate the likelihood that a person will develop diseases such as cancer, skin diseases and immune conditions up to 20 years ahead of time. Although Delphi-2M was trained only on one data set from the United Kingdom, its multi-disease modelling could one day help clinicians to identify high-risk people, allowing for the early roll-out of preventive measures. The model is described in a study published today in Nature.
The tool's ability to model multiple diseases in one go is "astonishing," says Stefan Feuerriegel, a computer scientist at the Ludwig Maximilian University of Munich in Germany, who has developed AI models for medical applications. "It can generate entire future health trajectories," he says.
Researchers have already developed AI-based tools to predict a person's risk of developing certain conditions, including some cancers and cardiovascular disease. But most of these tools estimate the risk of only one disease, says study co-author Moritz Gerstung, a data scientist at the German Cancer Research Center in Heidelberg. "A health-care professional would have to run dozens of them to deliver a comprehensive answer," he says.
To address this, Gerstung and his colleagues modified a type of large language model (LLM) called a generative pre-trained transformer (GPT), which forms the underpinning of AI chatbots such as ChatGPT. When asked a question, GPTs provide outputs that, according to their training on vast volumes of data, are statistically probable.
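To give a rough sense of what "statistically probable" means here, the sketch below turns a set of raw scores into probabilities and picks the most likely next item. It is an illustration only, not the Delphi-2M code: the event names and scores are made up, and the `softmax` helper is a generic technique assumed for the example rather than anything described in the study.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the maximum for numerical stability
    exps = {token: math.exp(s - peak) for token, s in scores.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}

# Hypothetical scores a model might assign to the next event in a record.
# These names and numbers are invented for illustration.
scores = {"hypertension": 2.0, "type 2 diabetes": 1.0, "no new diagnosis": 0.5}

probabilities = softmax(scores)
most_likely = max(probabilities, key=probabilities.get)

for token, p in probabilities.items():
    print(f"{token}: {p:.2f}")
print("Most probable next event:", most_likely)
```

A real model would compute such scores from the whole preceding sequence, and it would typically sample from the probabilities rather than always taking the top item.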
The authors designed their modified LLM to forecast a person's likelihood of developing 1,258 diseases on the basis of their past medical history. The model also incorporates the person's age, sex, body mass index and health-related habits, such as tobacco use and alcohol consumption. The researchers trained Delphi-2M on data from 400,000 participants of the UK Biobank, a long-term biomedical monitoring study.
For most diseases, Delphi-2M's predictions matched or exceeded the accuracy of those of current models that estimate the risk of developing a single illness. The tool also performed better than a machine-learning algorithm that uses biomarkers (levels of specific molecules or compounds in the body) to predict the risk of several diseases. "It worked astonishingly well," says Gerstung.
Delphi-2M worked best when forecasting the trajectories of conditions that follow predictable patterns of progression, such as some types of cancer. The model calculated the probability of a person developing each illness for a time period of up to two decades, depending on the information included in their medical records.
Gerstung and his colleagues tested Delphi-2M on health data from 1.9 million people in the Danish National Patient Registry, a national database that has tracked hospital admissions for almost half a century. The authors found that the model's predictions for people in the registry were only slightly less accurate than they were for participants in the UK Biobank. This demonstrates that the model could still make somewhat reliable predictions when applied to data sets from national health systems other than the one it was trained on, says Gerstung.
Delphi-2M is an "intriguing" contribution to the burgeoning field of modelling multiple diseases at once, but it has its limitations, says Degui Zhi, a bioinformatics researcher who develops AI models at the University of Texas Health Science Center at Houston. For instance, the UK Biobank data captured only participants' first brush with a disease. The number of times someone has had an illness is "important for the modelling of personal health trajectories," says Zhi.
Gerstung and his colleagues will evaluate Delphi-2M's accuracy on data sets from several countries to expand its scope. "Thinking about how this information can be combined for developing even more precise algorithms will be important," he says.