Doctors, as even they will tell you, are no match for the complexity of the human body. Heart attacks in particular are hard to anticipate. Now, scientists have shown that computers capable of teaching themselves can perform even better than standard medical guidelines, significantly improving prediction accuracy.
If implemented, the new method could save thousands or even millions of lives a year.
“I can’t stress enough how important it is,” says Elsie Ross, a vascular surgeon at Stanford University in Palo Alto, California, who was not involved with the work, “and how much I really hope that doctors start to embrace the use of artificial intelligence to assist us in care of patients.”
Each year, nearly 20 million people die from the effects of cardiovascular disease, including heart attacks, strokes, blocked arteries, and other circulatory system malfunctions. In an effort to predict these cases, many doctors use guidelines similar to those of the American College of Cardiology/American Heart Association (ACC/AHA). Those are based on eight risk factors—including age, cholesterol level, and blood pressure—that physicians effectively add up.
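In spirit, such a guideline boils down to a weighted sum of factors compared against a threshold. The toy score below illustrates only that additive form; the factors, weights, and cutoff are invented for illustration and bear no relation to the real ACC/AHA equations.

```python
# Toy additive risk score: weights and threshold are made up purely to
# show the "add up the risk factors" shape of a guideline, nothing more.
def toy_risk_score(age, total_cholesterol, systolic_bp, is_smoker):
    score = 0.05 * age + 0.01 * total_cholesterol + 0.02 * systolic_bp
    if is_smoker:
        score += 1.0
    return score

high_risk = toy_risk_score(62, 240, 150, True) > 8.0  # invented cutoff
print("flagged as high risk" if high_risk else "not flagged")
```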
But that’s too simplistic to account for the many medications a patient might be on, or other disease and lifestyle factors. “There’s a lot of interaction in biological systems,” says Stephen Weng, an epidemiologist at the University of Nottingham in the United Kingdom. Some of those interactions are counterintuitive: A lot of body fat can actually protect against heart disease in some cases. “That’s the reality of the human body,” Weng says. “What computer science allows us to do is to explore those associations.”
In the new study, Weng and his colleagues compared use of the ACC/AHA guidelines with four machine-learning algorithms: random forest, logistic regression, gradient boosting, and neural networks. All four techniques analyze lots of data in order to come up with predictive tools without any human instruction. In this case, the data came from the electronic medical records of 378,256 patients in the United Kingdom. The goal was to find patterns in the records that were associated with cardiovascular events.
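To give a flavor of what such a comparison involves, here is a minimal sketch in Python using scikit-learn. The four model classes stand in for the techniques named in the study, but the library, classes, and parameter choices here are illustrative assumptions, not the team's actual pipeline.

```python
# Illustrative stand-ins for the four techniques named in the study;
# not the authors' actual code or settings.
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

models = {
    "random forest": RandomForestClassifier(n_estimators=500),
    "logistic regression": LogisticRegression(max_iter=1000),
    "gradient boosting": GradientBoostingClassifier(),
    "neural network": MLPClassifier(hidden_layer_sizes=(50,), max_iter=500),
}
```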
First, the artificial intelligence (AI) algorithms had to train themselves. They used about 78% of the data—some 295,267 records—to search for patterns and build their own internal “guidelines.” They then tested themselves on the remaining records. Using record data available in 2005, they predicted which patients would have their first cardiovascular event over the next 10 years, and checked the guesses against the 2015 records. Unlike the ACC/AHA guidelines, the machine-learning methods were allowed to take into account 22 more data points, including ethnicity, arthritis, and kidney disease.
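A hedged sketch of that train-and-test protocol, continuing the setup above; the synthetic data here are placeholders for the real patient records, which are not public.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder data standing in for the real records: 30 columns roughly
# matching the 8 ACC/AHA factors plus the 22 extra data points.
X = rng.normal(size=(10_000, 30))
y = rng.integers(0, 2, size=10_000)

# ~78% of records for training, the rest held out for testing,
# mirroring the protocol described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.78, random_state=0
)

for model in models.values():    # `models` from the sketch above
    model.fit(X_train, y_train)  # each builds its own internal "guidelines"
```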
All four AI methods performed significantly better than the ACC/AHA guidelines. Using a statistic called AUC (area under the receiver operating characteristic curve), in which a score of 1.0 signifies perfect prediction and 0.5 is no better than chance, the ACC/AHA guidelines hit 0.728. The four new methods ranged from 0.745 to 0.764, Weng's team reports this month in PLOS ONE. The best one—neural networks—correctly predicted 7.6% more events than the ACC/AHA method, and it raised 1.6% fewer false alarms. In the test sample of about 83,000 records, that amounts to 355 additional patients whose lives could have been saved. That's because prediction often leads to prevention, Weng says, through cholesterol-lowering medication or changes in diet.
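Continuing the sketch, AUC can be computed from each model's predicted event probabilities on the held-out records. (With the random placeholder data the scores will hover near 0.5; the figures above come from the real patient records.)

```python
from sklearn.metrics import roc_auc_score

# AUC: 1.0 is perfect discrimination, 0.5 is chance-level.
for name, model in models.items():
    event_prob = model.predict_proba(X_test)[:, 1]  # predicted event probability
    print(f"{name}: AUC = {roc_auc_score(y_test, event_prob):.3f}")
```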
“This is high-quality work,” says Evangelos Kontopantelis, a data scientist at the University of Manchester in the United Kingdom who works with primary care databases. He says that dedicating more computational power or more training data to the problem “could have led to even bigger gains.”
Several of the risk factors that the machine-learning algorithms identified as the strongest predictors are not included in the ACC/AHA guidelines, such as severe mental illness and taking oral corticosteroids. Meanwhile, none of the algorithms considered diabetes, which is on the ACC/AHA list, to be among the top 10 predictors. Going forward, Weng hopes to include other lifestyle and genetic factors in computer algorithms to further improve their accuracy.
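How a model surfaces its strongest predictors depends on the method. For the random forest in the sketch above, one common approach (an assumption here, not necessarily what Weng's team did) is to read off impurity-based feature importances after fitting:

```python
import numpy as np

rf = models["random forest"]
feature_names = [f"factor_{i}" for i in range(X.shape[1])]  # placeholder names
# Print the ten inputs the fitted forest leaned on most heavily.
for i in np.argsort(rf.feature_importances_)[::-1][:10]:
    print(feature_names[i], f"{rf.feature_importances_[i]:.4f}")
```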
Kontopantelis notes one limitation of the work: Machine-learning algorithms are black boxes, in that you can see the data that go in and the decisions that come out, but you can't grasp what happens in between. That makes it difficult for humans to tweak an algorithm, and hard to predict how it will behave in a new scenario.
Will physicians soon adopt similar machine-learning methods in their practices? Doctors really pride themselves on their expertise, Ross says. “But I, being part of a newer generation, see that we can be assisted by the computer.”