#30 - From Reddit To Rescue: Real-Time Signals Of The Opioid Crisis

Code & Cure

Decoding health in the age of AI

Hosted by an AI researcher and a medical doctor, this podcast unpacks how artificial intelligence and emerging technologies are transforming how we understand, measure, and care for our bodies and minds.

Each episode unpacks a real-world topic to ask not just what’s new, but what’s true—and what’s at stake as healthcare becomes increasingly data-driven.

If you're curious about how health tech really works—and what it means for your body, your choices, and your future—this podcast is for you.

We’re here to explore ideas—not to diagnose or treat. This podcast doesn’t provide medical advice.

February 05, 2026 • Vasanth Sarathy & Laura Hagopian

What if the earliest warning sign of an opioid overdose surge isn’t locked inside a delayed report, but unfolding in real time on Reddit? In this episode, we explore how social media conversations, especially pseudonymous, community-led forums, can reveal emerging overdose risks before traditional surveillance systems catch up.

We unpack research that analyzed more than a decade of posts to show how even simple drug mentions sharpened forecasts of overdose death rates. The signal was especially strong for fentanyl, exposing where existing public health tools lag and why online communities often see danger first. Along the way, we explain the mechanics in plain language: how time-series models respond faster than surveys, why subreddit structure filters noise, and how historical archives enable rigorous validation.

But it doesn’t stop at counting mentions. We dig into what happens when posts are classified by lived experience: overdose stories, sourcing concerns, or test strip discussions. We also examine what broke during COVID, when behavior and access shifted overnight, and how to detect those regime changes before models start to fail.

The takeaway is urgent and practical. Social data won’t replace public health surveillance, but it can make it fast enough to save lives. We share a field-ready playbook for turning online signals into timely interventions, and show how feedback from the same communities can explain why a response worked—or didn’t—so teams can adapt quickly. If you care about real-time epidemiology, harm reduction, and responsible AI in healthcare, this conversation connects raw text to real-world impact.

Reference:

Monitoring the opioid epidemic via social media discussions
Delaney A Smith et al.
Nature NPJ Digital Health (2025)

Credits:

Theme music: Nowhere Land, Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0
https://creativecommons.org/licenses/by/4.0/

SPEAKER_00 0:01

What if the earliest warning signs of the opioid epidemic aren't in hospitals or public health reports, but buried in Reddit posts at 2 a.m.?

SPEAKER_01 0:19

Hello and welcome to Code and Cure, the podcast where we discuss decoding health in the age of AI. My name is Vasanth Sarathy, and I'm an AI researcher and cognitive scientist. And I'm here with Laura Hagopian.

Limits Of Traditional Surveillance

SPEAKER_00 0:32

I'm an emergency medicine physician and I work in digital health.

SPEAKER_01 0:35

Today we're going to talk about the opioid epidemic, and specifically different ways of potentially monitoring it. Right?

SPEAKER_00 0:46

I mean, oftentimes this is done by the CDC, and they have some limitations. I think one of the biggest problems with these public health agencies is that they're not fast. These reports don't happen in real time. They can even take months to come out.

SPEAKER_01 1:04

Yeah. And this is based off of a paper, right? One of the interesting things I found was that the techniques people use now to monitor for this sort of thing don't leverage all the available data out there. And I think that's kind of the point of this podcast, too. But as I learned about that, I learned how little, in fact, they use the potential of available data. So, for example, they do surveys with community epidemiologists, or they rely on transcriptions or whatever of forensic lab reports.

SPEAKER_00 1:45

Or they cover certain geographies, but not all of them, because that would be too much information, right?

SPEAKER_01 1:50

Right, right, right. So it's sparse, it doesn't have full coverage, and there's also a huge time lag, right?

SPEAKER_00 1:58

And there's potential for bias to be introduced in this type of survey format, too.

SPEAKER_01 2:04

That's true. Yeah, that's right. And to me at least, I find it very interesting that this data is supposedly feeding into a predictive model of some sort to monitor what's happening and potentially figure out when the next surge of these cases is going to come. And a model is gonna be very lacking if that's all the data it has going in.

Rethinking Methods With AI

SPEAKER_00 2:29

I mean, it's interesting because you think, okay, what do you learn in school? What do you learn in epidemiology? I don't have a master's in public health. However, there's the traditional way of doing things, right? And now that you have AI, you can kind of flip that script if you want. Yeah. Right. And there is this concept that, like, hey, we have all this extra data out there in the world now. Can we harness it? Do we need to use the traditional methods, or are there other methods out there that can mine the data and give us that same information and maybe do it in a way that, well, at least in some ways, could be better, right? In this situation, it's like, well, if we know that opioid overdoses are going to be happening more, you know, that the opioid epidemic is getting worse, then we can do something about it. Yeah. And so I think that's kind of the key here: hey, can we leverage different types of information that aren't traditional and use them to make decisions, to make changes, to intervene?

Why Social Media Matters

Reddit’s Structure And Anonymity

SPEAKER_01 3:43

Yeah, and we've talked about this sort of thing before in the podcast too, where we use data in different and unexpected ways. Here it's not super unexpected necessarily, but what they're doing is utilizing social media data, right? And that was interesting too. There was a whole section on the different social media platforms and how the data can be different across the board. Like for instance, Twitter is really great at providing geolocation information, telling you where people are when they're saying things, but it is restricted in terms of how much text one can type, how much they can express, right? So there's that issue. And it's kind of open-ended. Whereas there are other social media platforms like Reddit, where you can write longer posts, but it's also community driven. So for those who don't know, Reddit has what they call subreddits, which are essentially little communities, little groups within Reddit for different topics. There's one for politics, there's one for pen and ink drawings, and there's one for various other things. I think there's even one for opioid help.

SPEAKER_00 4:57

Yeah, a subreddit for addiction is one of the ones they called out in the article.

SPEAKER_01 5:02

Right, right, right. And each of these subreddits is essentially a little community that you can join once you're part of the Reddit user base, and there's a moderator, there are rules, things you can and cannot say in that community. And Reddit accounts are somewhat anonymous. Your Reddit handle is whatever you want it to be, so it's not your name or anything. And so you get to be a little bit anonymous, but at the same time, there's a community of people, and it's not just the open world, it's people who are willing to follow the rules of that community when conversing with each other.

SPEAKER_00 5:36

And I think that sort of pseudo-anonymity is really important in this situation because there's a lot of stigma associated with opioid use, and so having the ability to anonymously discuss that with a community of people is part of the appeal.

SPEAKER_01 5:54

Yeah.

Building A Large-Scale Dataset

SPEAKER_00 5:55

And one of the other things that's interesting about Reddit, and it's true on some other social media platforms as well, is that the historical activity is still there. So sometimes I'll look something up on the internet and a subreddit from, you know, 2006 will pop up with an answer to that question. So the historical information is there, which means that they were able to study what happened in the past over 10 or more years, what sort of drug discussions were happening, and they tried to use geolocation as much as they were able to here as well, and then ask, okay, can this help predict what might happen in the future? Or in real time?

SPEAKER_01 6:42

Yeah, no, that's right. And what's interesting also is that with social media, you get first-person reports. So, unlike things like a forensic lab report or a survey performed on epidemiologists, here you have first-person reports discussing that drug. And that is potentially extremely valuable, right? I think that's the other piece that social media brings that other forms of data don't. And you can do this at a very large scale. I think in this study they selected over one and a half million people who they followed over 10 years across different locations as those people typed on Reddit, and they tracked that conversation and tried to extract mentions, which is whenever someone said something about a particular drug. And I think there's a list of drugs that they were specifically following too, right?

Forecasting Models And Lags

SPEAKER_00 7:36

Yeah, exactly. And it's interesting because it's not that they had to say, oh, I used a drug or someone overdosed on the drug. It was literally that they just commented about the drug. It was just a comment, simply that, and that still was enough to help predict opioid overdoses.
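
To make the idea of counting simple drug mentions concrete, here is a minimal Python sketch. The term list, the `count_mentions` helper, and the `(month, text)` input shape are all assumptions made for illustration; the episode does not describe the study's actual lexicon or pipeline.

```python
import re
from collections import Counter

# Hypothetical term list: the drugs and aliases the study tracked are not
# spelled out in the episode, so these entries are placeholders.
DRUG_TERMS = {
    "fentanyl": ["fentanyl", "fent"],
    "heroin": ["heroin"],
}

# One case-insensitive, whole-word pattern per drug.
PATTERNS = {
    drug: re.compile(
        r"\b(?:" + "|".join(re.escape(a) for a in aliases) + r")\b",
        re.IGNORECASE,
    )
    for drug, aliases in DRUG_TERMS.items()
}


def count_mentions(posts):
    """Count mentions per (month, drug).

    posts: iterable of (month, text) pairs, e.g. ("2020-01", "some post").
    A mention is any occurrence of a listed term, whatever the context.
    """
    counts = Counter()
    for month, text in posts:
        for drug, pattern in PATTERNS.items():
            counts[(month, drug)] += len(pattern.findall(text))
    return counts
```

Nothing here tries to decide whether the author used the drug; as in the conversation above, any mention counts, and the monthly totals become the time series that feeds a forecast.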

SPEAKER_01 7:53

Yeah, and the thing that we're trying to predict was the CDC's overdose death rates, right? And those are typically predicted by the things that I mentioned earlier, the surveys and such. And you use what's called a time-series forecasting model to do something like this, which is nothing all that fancy. It's used in financial prediction and all kinds of other prediction models. It just looks at windows of the data as it moves and tracks how it changed over time. And it's able to predict, based on these changes, what the most likely next change is going forward. The survey data was fine for a lot of that, but it obviously lagged, right? It wasn't frequent enough. And then when they were able to overlay the Reddit data on top, the number of mentions of these drugs and such, it improved the quality of predictions quite a bit, which I thought was super interesting.
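
The overlay described here can be sketched as a lagged regression: predict this period's death rate from last period's death rate, with last period's mention count as an optional extra input. This is a deliberately simple stand-in, not the model from the paper; the function names and the one-step lag are assumptions, and the solver does not guard against a singular system.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, targets):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def fit_forecaster(deaths, mentions=None):
    """Fit deaths[t] ~ intercept + deaths[t-1] (+ mentions[t-1] if given)."""
    rows, targets = [], []
    for t in range(1, len(deaths)):
        row = [1.0, deaths[t - 1]]
        if mentions is not None:
            row.append(mentions[t - 1])
        rows.append(row)
        targets.append(deaths[t])
    return fit_ols(rows, targets)


def predict_next(coefs, last_death, last_mention=None):
    """One-step-ahead forecast from the most recent observations."""
    features = [1.0, last_death]
    if last_mention is not None:
        features.append(last_mention)
    return sum(c * f for c, f in zip(coefs, features))
```

Fitting once without `mentions` and once with it, then comparing forecast errors on held-out periods, is one way to check whether the social media signal adds anything beyond the death-rate history alone.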

Real-Time Signals And Harm Reduction

SPEAKER_00 8:48

Yeah, so basically, the comments about opioids increased at or around the same time that the overdose death rates increased. Yeah. Except now, because of Reddit, if you were gonna use this in the future, you would have that information in real time rather than with the six-month lag that you might get from the CDC, for example.

SPEAKER_01 9:10

Exactly. That's the big value here, yes.

Beyond Mentions: Mining Nuance

SPEAKER_00 9:13

And when you have that information, this is where my mind goes, you can do something about it, right? It's like if you know that, for example, fentanyl is being mentioned more. This is a synthetic opioid, and it's one of the ones from the study that did very well with Reddit. And if you know that it's being mentioned, and you know it's fast acting, you know it's very strong, people overdose on it more often, then that's an early warning sign that you need to do something about it. And in this case, it could be like, hey, let's distribute a lot of naloxone or Narcan in the community. Let's make sure everyone has some on hand so that if someone does overdose, we can try to reverse it, right? Right. Or maybe we could have distribution of drug testing strips so we can understand if something that we thought was heroin is actually fentanyl and is stronger than we think it would be. So there are a lot of harm reduction methods out there to actually intervene and try to reduce the opioid death rate if you have some sort of reason to think that it's going up. And in this case you would, right? Because Reddit users are saying a lot about it.

SPEAKER_01 10:28

Right. And I think that to me, there's a conceptually interesting piece too, which is we're not hearing about the specific usage or the behaviors of those people, right? Who are using the drugs. We're just observing linguistic mentions of those drugs in the Reddit posts.

SPEAKER_00 10:47

And that's all it took.

SPEAKER_01 10:48

Well, that's all it took to improve the quality of the prediction, but could we do even better? Is there something more being said, something more nuanced, that potentially suggests more in the Reddit posts themselves?

SPEAKER_00 11:00

Oh, this is a good question.

SPEAKER_01 11:01

So, let's say you tried a particular intervention and it didn't work, right? You did some intervention and nothing changed. Maybe the posts actually have information about why nothing changed, to give you the idea for the next intervention, as opposed to just randomly trying different things, right? So I do wonder if there's more in there in those posts.

SPEAKER_00 11:24

That you could zoom in on and say, like, can we sort of code the posts? Like, oh, this one talked about an overdose, this one mentioned Narcan, this one XYZ, and see which of those pieces correlates even more.

SPEAKER_01 11:38

Yeah, because some posts might just be about relaying somebody's experience with using it. Some posts might not be relaying a personal experience, but maybe they have a question about it, right? It could be anything. And so I think even differentiating the type of content would be hugely valuable. I mean, I'm guessing it'll be valuable at least from an interventional perspective, if you're trying methods to reduce that, right?
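
The "coding" of posts discussed here could start as simply as keyword rules. The sketch below is purely illustrative: the categories and patterns are assumptions, not a scheme from the paper, and a real system would more likely use a trained classifier or an LLM.

```python
import re

# Hypothetical categories and patterns, chosen only to mirror the examples
# raised in the conversation (overdose stories, Narcan, test strips, questions).
CATEGORY_PATTERNS = {
    "overdose": re.compile(r"\boverdos\w*", re.IGNORECASE),
    "naloxone": re.compile(r"\b(?:naloxone|narcan)\b", re.IGNORECASE),
    "test_strips": re.compile(r"\btest(?:ing)? strips?\b", re.IGNORECASE),
    "question": re.compile(r"\?"),
}


def code_post(text):
    """Return the set of category labels whose pattern appears in the post."""
    return {
        label
        for label, pattern in CATEGORY_PATTERNS.items()
        if pattern.search(text)
    }
```

Counting posts per label per month would then give several time series instead of one, which could be checked individually against overdose death rates.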

When COVID Shifts The Baseline

SPEAKER_00 12:01

Yeah, I mean, there have gotta be ways to improve the model even more, essentially, is what you're saying, by taking a closer look. And I think that makes a lot of sense. In fact, when they were looking at this model, one of the things that affected it for the worse, I think, was COVID. Which is interesting, right? I mean, I guess it's not particularly surprising because when you have a monitoring system, you sort of expect it to keep plugging along at the status quo, right? Yeah. And then when something major comes in and changes the world, or changes the world view, or changes people's behaviors, or is just kind of a shock to the system, then it makes sense that the model's not going to perform as well because it's operating in a different world than it was.

SPEAKER_01 12:50

Yeah, it's a very sudden shock to the system. And a sudden shock to the system, if you think about COVID itself, it caused people to stay at home more, it caused people to have issues, mental health issues, you know, it did a lot of other things that affected not just in this case, potentially not just the usage of the drugs, but maybe even how people talked about it.

SPEAKER_00 13:10

Or the access to it, right? There's so much going on there.

SPEAKER_01 13:13

Yeah. And that changed the underlying model completely. If you think about the prediction as sort of this time window that you're moving across over time to see, okay, what's my next likely number of mentions or my estimated CDC measures for overdose death rates and such, maybe that just dramatically changes when you have a shock in the system, and now you can't rely on averaging it based on what you've seen before.

SPEAKER_00 13:42

Yeah, I mean a model is only as good as the inputs into it, right? And in this case, there's something that's affecting the inputs. Yeah. Right? Because the whole world has changed.

Model Limits And Drug Differences

SPEAKER_01 13:54

Yeah, no, exactly. Exactly. So that's I think an interesting thing, and that's something maybe the content of the text might suggest, right? That there is, in fact, a step change or a big change that's happened. I mean, we know when an event happens, obviously, in the real world, but that's something to keep in mind, how the distribution of the data itself changes. And maybe the content of the text might give you more info there too.
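
One simple way to notice the kind of step change discussed here is to watch the forecast errors themselves and flag any period where the error sits far outside its recent range. The sketch below uses a trailing-window z-score; the `flag_shifts` name, the window length, and the threshold are assumptions for illustration, not something taken from the paper.

```python
from statistics import mean, stdev


def flag_shifts(errors, window=6, threshold=3.0):
    """Return indices where a forecast error is an outlier versus recent history.

    errors: forecast errors (actual minus predicted), one per period.
    A period is flagged when its error lies more than `threshold` standard
    deviations from the mean of the preceding `window` errors.
    """
    flagged = []
    for t in range(window, len(errors)):
        recent = errors[t - window:t]
        spread = stdev(recent)
        if spread == 0:
            continue  # no variation in the window, so no z-score is defined
        z = (errors[t] - mean(recent)) / spread
        if abs(z) > threshold:
            flagged.append(t)
    return flagged
```

Because the shifted values enter the trailing window right away, this tends to flag the onset of a shock rather than every period after it, which is usually the moment a team would want to stop trusting the old model and refit.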

SPEAKER_00 14:16

Yeah. I do think in general, it makes a lot of sense to mine whatever data is out there to get that public health information though, um, and to do it in a more real-time fashion.

SPEAKER_01 14:31

Yeah.

Turning Predictions Into Action

SPEAKER_00 14:32

And I think this is what it helped do. It worked really well for fentanyl, the synthetic opioid. It didn't work quite as well for heroin, interestingly, or for the combined category of natural and semi-synthetic opioids. But I do think, like you said, maybe the model could be tweaked for certain things, like not just comments, but specific types of mentions, like mention of overdose, mention of using Narcan, et cetera. So I think there are adjustments that could be made. But in general, the whole idea is, hey, there's so much data out there, and it's not data that we've necessarily thought of as public health data in the past. But because we now have this larger compute power, we can aggregate all of it, and it's publicly available. Hey, we should be using this too and seeing if it can give us some of this information that can then lead us to intervene in a more time-sensitive fashion.

SPEAKER_01 15:29

Yes. That's huge. I mean, this is the public health challenge, right? We have the data now, so we should be using it, is what you're saying.

Finding The Why Behind Spikes

SPEAKER_00 15:37

We should be using it, and we should also be figuring out how we can use it best. Yeah. Right? Because like you were saying, maybe the comments about opioids are one thing and they point you in the right direction. But maybe if we looked very specifically at types of comments, that would help even more. Yeah. And so it's like, well, how can we harness that data to get the information faster, better, et cetera? And once we can create these predictions, what do we do about it? And I think that "what do we do about it" question is really important. Yeah, I agree. Because predicting something means nothing if you can't do anything about it, right? It's like, oh, well, that was interesting. Next. But in this case, there are things. And so you'd still have to have the public health infrastructure in place to actually go get those Narcan kits, make sure that they're freely available, distribute them at the public health department, et cetera. And so that piece is the really important piece, where the human needs to be there doing those interventions and making sure that that happens in real time too, because the knowledge is not enough.

SPEAKER_01 16:48

But also I think along the same lines, if you look back at the posts, you should be able to say, okay, why is it that there is this increase right now, right? Why is that happening? Is there more in those Reddit posts themselves that tells you why something is happening the way it is, right? So it's not...

SPEAKER_00 17:08

Are you volunteering to look through those 1.7 million users?

SPEAKER_01 17:11

No, but, you know, I think that is an excellent point. I mean, I think the paper itself suggested using various AI and LLM-based approaches to mine the text for more meaning. But my point is that it's not just the interventions, sort of what can you do with the predictions, but why were the predictions the way they were to begin with? Is there something systemically wrong in our system that would give us ideas for interventions based on a pattern that we observe? You know, and so that's kind of what I was getting at with this piece.

Next Steps And Closing Thoughts

SPEAKER_00 17:41

Yeah, no, that makes sense too. It's like um kind of getting back to this reasoning piece of hey, how can we understand this better? How can we understand the prediction that's happening better? Yeah, the why behind it, and that can help us A, improve our model and B intervene at that level too, like in the before times.

SPEAKER_01 18:03

Right, right, exactly. I'm excited by this research direction that the authors have been taking here, and I'm looking forward to seeing what they do next with it, because there's a lot to be done here.

SPEAKER_00 18:14

Yeah, and I think social media has so much information in it, and it can absolutely help monitor public health data so that we can intervene. So we will leave you with that, and we will see you next time on Code and Cure. Thank you for joining us.

Laura Hagopian

Host

Vasanth Sarathy

Host