RCAC, College of Pharmacy partner on AI tool to identify adverse drug effects from social media posts
Adverse drug effects are thought to be vastly underreported, but a team from Purdue’s College of Pharmacy is working with the Rosen Center for Advanced Computing (RCAC) to use artificial intelligence to change that.
RCAC senior research data scientist Sarah Rodenbeck partnered with Kyle Hultgren, the director of Purdue’s Center for Medication Safety Advancement and a clinical assistant professor of pharmacy practice, to design and implement a natural language processing (NLP) tool that identifies social media posts about medication use and recognizes those that mention harm or side effects associated with the medication.
The tool identified potential adverse drug events (ADEs) in social media posts and gave researchers a way to interact with both the collected data and data from the Food and Drug Administration’s (FDA) Adverse Event Reporting System (FAERS). The model achieved 93% accuracy in distinguishing ADE tweets from non-ADE tweets.
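For readers unfamiliar with the metric, accuracy in a two-class setting like this is simply the share of posts whose predicted label matches a human-assigned reference label. The sketch below is a minimal illustration, not the team's evaluation code; the function name and the ten example labels are hypothetical and are not the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of posts whose predicted label matches the reference label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("need at least one labeled post")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels for ten posts: 1 = ADE, 0 = non-ADE (not the study's data).
reference = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]

# Nine of the ten predictions match the reference labels.
print(f"accuracy = {accuracy(reference, predicted):.0%}")  # accuracy = 90%
```

When ADE posts are much rarer than non-ADE posts, accuracy alone can look high even if many ADE posts are missed, so measures such as precision and recall are often reported alongside it.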
ADEs are thought to be underreported because not everyone tells their doctor about them. A side effect may be mild, for example, or people may simply not think to report it to a medical professional.
Social media users seldom use precise medical terminology. Instead, their posts are filled with slang, pop culture references, jokes, and sarcasm, and many NLP models struggle to interpret this type of text correctly. The tool RCAC developed, however, was able to interpret such language.
“By identifying ADEs in social media we hope to augment the view researchers have about how medications are used and experienced,” said Rodenbeck. “Combined with other data, this could eventually be used to identify additional side effects not already included on a drug label or to discover potential manufacturing problems. This could result in anything from additional safety studies all the way to partnering with manufacturers to improve drug design.”
The NLP model was part of a larger effort to improve and enhance the HubZero SafeRX tool, with Rajesh Kalyanam and Shawn Rice from RCAC also playing vital roles. Rice worked on updating the HubZero SafeRX database ingestion tool and Kalyanam was responsible for deploying the model.
This study illustrates how technology like GPUs and high-performance computing can be used to benefit a previously underrepresented domain such as pharmacy. In addition to the interdisciplinary partnership between the College of Pharmacy and RCAC, this work makes use of Anvil, Purdue's recently built $10 million NSF-funded supercomputer.
Anvil was used both to train the NLP model and to run binary classification inference on the tweets collected each month. Training ran on Anvil's GPUs, while inference with the trained model runs on Anvil's CPUs. A startup allocation of 100,000 service units on Anvil is being used to collect Twitter data and run inference on a regular basis, with jobs submitted manually through Anvil's Slurm scheduler.
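To give a sense of what a manually submitted monthly inference job might look like, the sketch below generates the text of a Slurm batch script for a CPU-only run. It is an illustration under stated assumptions, not the team's actual workflow: the function name, the classify_tweets.py script, the file names, the account name, and the default partition are all hypothetical placeholders.

```python
def build_inference_job(month, account, partition="shared", cpus=4, hours=2):
    """Return the text of a Slurm batch script for one month's CPU inference run.

    Every value here is a placeholder: the account, partition, resource sizes,
    and the classify_tweets.py script are assumptions made for illustration.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name=ade-inference-{month}",
        f"#SBATCH --account={account}",
        f"#SBATCH --partition={partition}",
        "#SBATCH --nodes=1",
        "#SBATCH --ntasks=1",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --time={hours:02d}:00:00",
        "",
        # Hypothetical inference script and file names.
        f"python classify_tweets.py --input tweets_{month}.jsonl "
        f"--output predictions_{month}.csv",
    ]
    return "\n".join(lines) + "\n"


# Build the script for a hypothetical month and allocation name.
script = build_inference_job("2023-01", account="my-allocation")
print(script)
```

Saved to a file, a script like this would be handed to the scheduler with the sbatch command, which is the manual submission step described above.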
“I really want to emphasize how important this collaboration has been for my team,” said Hultgren. “Working with RCAC has truly made a powerful impact on our work and our ability to understand more about how the US population is interacting with prescription medications. Ultimately, it is collaborations like this that will help us make a clinical impact in our communities as we strive to reduce the harm associated with medication use.”
This project was supported by funding from the FDA.
The researchers intend to propose phase 2 of this initiative early in the new year, based on the FDA's feedback. Phase 2 of the study will focus on classifying ADEs using hierarchies of medical terms rather than just identifying them.