Code & Cure

#27 - Sleep’s Hidden Forecast

Vasanth Sarathy & Laura Hagopian

What if one night in a sleep lab could offer a glimpse into your long-term health? Researchers are now using a foundation model trained on hundreds of thousands of hours of sleep data to do just that. By predicting the next five seconds of a polysomnogram, the model learns the rhythms of sleep and, with minimal fine-tuning, begins estimating risks for conditions like Parkinson’s, dementia, heart failure, stroke, and even some cancers.

We break down how it works: during a sleep study, sensors capture brain waves (EEG), eye movements (EOG), muscle tone (EMG), heart rhythms (ECG), and breathing. The model compresses these multimodal signals into a reusable format, much like how language models process text. Add a small neural network, and suddenly those sleep signals can help predict disease risk up to six years out. The associations make clinical sense: EEG patterns are more telling for neurodegeneration, respiratory signals flag pulmonary issues, and cardiac rhythms hint at circulatory problems. Still, the scale of what’s possible from a single night’s data is remarkable.

We also tackle the practical and ethical questions. Since sleep lab patients aren’t always representative of the general population, we explore issues of selection bias, fairness, and external validation. Could this model eventually work with consumer wearables that capture less data but do so every night? And what should patients be told when risk estimates are uncertain or only partially actionable?

If you're interested in sleep science, AI in healthcare, or the delicate balance of early detection and patient anxiety, this episode offers a thoughtful look at what the future might hold, and the trade-offs we’ll face along the way.

Reference: 

A multimodal sleep foundation model for disease prediction
Rahul Thapa
Nature (2026)

Credits: 

Theme music: Nowhere Land, Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0
https://creativecommons.org/licenses/by/4.0/

SPEAKER_01:

Most people think of sleep as something that happens when the day ends. But what if sleep is actually the body's nightly diagnostic, quietly signaling disease years in advance?

SPEAKER_00:

Hello and welcome to Code and Cure, where we discuss health in the age of AI. My name is Vasanth Sarathy, and I'm a cognitive scientist and AI researcher. And I'm here with Laura Hagopian.

SPEAKER_01:

I'm an emergency medicine physician and I work in digital health. And I had a really good night of sleep last night.

SPEAKER_00:

Yeah, me too, actually. And my watch tells me that I had good REM sleep as well, uh, good periods of REM sleep. It did recommend that I sleep eight and a half hours, which is ridiculous.

SPEAKER_01:

That feels like a lot of sleep to get.

SPEAKER_00:

Yeah, I think I get like seven hours of sleep at night, sometimes. I mean, sometimes a little bit less, but eight and a half, like that's a lot of hours. And with kids, I don't know how that's even possible.

SPEAKER_01:

The normal clinical recommendation for adults is seven to eight hours a night.

SPEAKER_00:

So I don't know. My my watch thinks I need more sleep. So it must be right. It's my watch, right?

SPEAKER_01:

It can read your mind. It knows everything about you.

SPEAKER_00:

That's right. That's right, that's right. But sleep is super important. And I think this paper that we're going to discuss today is really cool. Really, really, really cool because it tells you how important sleep is in ways that you wouldn't have even thought before. Um, and people have measured sleep. You know, that's I think I think we'll we'll sort of talk about this paper from the context.

SPEAKER_01:

More than your watch measures sleep, by the way. Way more, right?

SPEAKER_00:

That's that's the other piece that I I started to appreciate and learn more about. So so maybe we'll start there, um, Laura. Like maybe we can talk about how, you know, just like data, right? What is sleep data?

SPEAKER_01:

Well, so in this study, um I'm gonna I'm gonna give a little punchline. But in this study, they basically like took sleep data and they were able to figure out that they could predict diseases, right? And so the question is like what sleep data? And what they were looking at was polysomnograms. So um a polysomnogram is often used to check for sleep disorders if you're like having trouble sleeping, if you're not sleeping well, or um it can be done later on if you've been diagnosed with a sleep disorder to see if there's improvement. And they check a lot of things on you, okay? It's more than like your wearable or your watch will do.

SPEAKER_00:

They connect a lot more things to your body to measure things.

SPEAKER_01:

Yeah, exactly. Like you're gonna get uh measures of your brain activity.

SPEAKER_00:

Okay.

SPEAKER_01:

Um, what's going on, and that can be like the sleep stages. Okay.

SPEAKER_00:

It's an EEG.

SPEAKER_01:

Yeah, exactly. Uh whether you're in REM or whatever. They're gonna detect your eye movements. Wow. Okay, which also correlates with your sleep stages. Uh it's gonna detect your muscle activity. Like, are your legs moving? Um, do you have tension in your muscles? All that kind of stuff. It's going to detect your heart, your heart rate, your heart rhythm, um, how your heart is working. It's gonna detect your breathing too. So, you know, what's what's your blood oxygen level? Is the air flowing in and out of your lungs well? All of that kind of stuff. And so when you're when we say, oh, you're just measuring sleep, like you're not. You're measuring so many things at the same time because it's like, well, how do they all work together? What's the airflow like? Uh are you snoring? What's your blood oxygen level? How's your heart working? What are your leg movements like? What's going on with your brainwaves and your eyes? And so when someone has a polysomnogram, like there's so much data there.

SPEAKER_00:

I know, but like what do they use the data for right now? Like what who who gets a polysomnogram? Can I go get one?

SPEAKER_01:

I mean, do you need one? So it's usually people who aren't sleeping well. Like, one of the most common reasons I've seen is like, hey, someone thinks that they have sleep apnea. They're snoring a lot. Uh, you know, their partner might say, like, oh, they're like gasping for air and waking up in the middle of the night, and then they go get tested and find out if they have sleep apnea. There's a ton of different reasons why you can get one, but diagnosing a condition is one of the most common reasons. Oh, interesting. Yeah. What you're bringing up is actually important because this study was done on people who had polysomnograms, right? Not everybody gets a polysomnogram. In the general population, like not everybody just goes and gets one of these tests.

SPEAKER_00:

Yeah.

SPEAKER_01:

Someone gets one of these tests because they were having trouble sleeping, or there was like suspicion for you know, sleep apnea or some other disorder. So, right from the get-go, you have a selection bias here, right? Yeah. Of who made it into this study. And they made it into this study because they had a suspected issue with their sleep.

SPEAKER_00:

Got it, got it, got it. And I saw that they collected more than half a million hours of sleep data from over 65,000 patients.

SPEAKER_01:

Yeah.

SPEAKER_00:

Across different medical centers, right? Different cohorts. Yeah. Exactly. Which, it seems to me, is a pretty large amount of data. But like you said, it's potentially limited to people who have polysomnograms. So, you know, there's that. But still, it's a ton of data. And each of these items of data captures all of these different modalities. Um, and by modalities, I just mean brain waves or eye movements or heart or breathing. Yeah, exactly. So it's capturing all of that over time, little snippets of time. And so that's incredible. A night's worth of sleep, right? Right. Oftentimes one night's worth of sleep. Exactly. And so what this paper was trying to do was that it was talking about this notion of a foundation model. And I think that's interesting, right? Building a sleep foundation model.

SPEAKER_01:

Well, I don't even know what a foundation model is. So tell me what is a foundation model. Glad you asked. You set me up really well for that.

SPEAKER_00:

No, I didn't. I didn't. Yeah. I may have even waved my arm a little bit to suggest you say something at that moment. But yeah, no, a foundation model is just a fancy word for a model that gets a lot of data and finds general patterns. Now that feels very general. So I mean that feels like an LLM. Correct. Large language model. But it's not language, is it? But that's it. So it used to be that you had this world of large language models, which we've talked about extensively, ChatGPT and so on. Um, and people started expanding that idea where in a large language model, you give it a bunch of text, right? Trillions of pieces of internet text, and you have it learn how humans write and speak. And the idea with the large language model was you predict the next word. Now, people extended that idea for vision and for images and for other things, other modalities. And you couldn't call it a large language model anymore. So someone came up with the name foundation models. Uh, actually, it's a group of researchers at Stanford who came up with that name. Now, the reason it's also called a foundation model is not just to capture the fact that it's more than just language, but also that it's a foundation for lots of other tasks. So, like an LLM, at its core, all it's doing is predicting the next word. But we use it for writing emails. We use it for checking if, you know, our tone is right in something or whatever, right? We're using it for all these different tasks. So it serves as a foundation for other things. So that's kind of how I want people to think about this as well. Um, what's cool about this now, and let's get into a little bit more of the nitty-gritty here, is that they're taking in all of these pieces of data, all of the different modalities, the brain waves and the muscle movements and the eye movements and all of that.
And they're finding a way to aggregate that and feed it into the foundation model. And the idea is to predict the next time window. So they take in five chunks of five-second time windows. Um, and you think of each chunk of a time window as being like a word, right? And the idea is you have a sequence of those and you predict the next one.
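
To make the "each five-second window is like a word" idea concrete, here is a minimal Python sketch of how next-window training pairs could be built from a single signal with no human labels. It is an illustration only, not the paper's pipeline: the function names, the 2 Hz toy sampling rate, and the single-channel signal are all assumptions made for the example, and the context length of five windows simply follows the description in the conversation.

```python
from typing import List, Tuple

Window = List[float]


def split_into_windows(signal: List[float], sample_rate_hz: int,
                       window_seconds: int = 5) -> List[Window]:
    """Cut a 1-D signal into consecutive, non-overlapping windows.

    Any leftover samples that do not fill a whole window are dropped.
    """
    size = sample_rate_hz * window_seconds
    n_full = len(signal) // size
    return [signal[i * size:(i + 1) * size] for i in range(n_full)]


def next_window_pairs(windows: List[Window],
                      context_len: int = 5) -> List[Tuple[List[Window], Window]]:
    """Build (context, target) pairs: context_len windows in, the next window out.

    The target comes from the recording itself, so no annotation is needed.
    """
    pairs = []
    for start in range(len(windows) - context_len):
        context = windows[start:start + context_len]
        target = windows[start + context_len]
        pairs.append((context, target))
    return pairs


# Toy recording (hypothetical): 35 seconds sampled at 2 Hz, values 0.0 .. 69.0.
signal = [float(i) for i in range(70)]
windows = split_into_windows(signal, sample_rate_hz=2)
pairs = next_window_pairs(windows)

print(len(windows), "windows,", len(pairs), "training pairs")
```

A real polysomnogram would have many channels at much higher sampling rates, but the pairing logic is the same: the "label" for each example is just the stretch of sleep that came next.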

SPEAKER_01:

So, like based on how you're sleeping right now, I can predict how you're gonna be sleeping in the next five seconds. That's that's what it's doing. So it's we're not at the point where it's like predicting disease.

SPEAKER_00:

No, nothing. None of that. And in fact, there are other terms that people use for this, like self-supervised, because you don't need human annotations. You don't need for a human to say, well, that was good sleep, that was bad sleep, that means this patient was sick, whatever. Doesn't matter. All it's trying to do is take all of the data and tell you what the next five seconds of the EEG is gonna look like.

SPEAKER_01:

So there's no like synthesis of it, right? There's no like, there's no like, oh, this is what this means. No, there's no meaning. Yeah, I don't know. There's no meaning in part, and it's just like objective. Here's what we predict will happen next.

SPEAKER_00:

Right. And that's what it is with language models too, right? There's no synthesis, it's just predicting the next word. Um, and inside of the language model, there's some, you know, stuff that's happening. The model is getting trained, and after it's trained, it's now compressed its understanding of how humans speak to the point where you give it a new word, it's able to tell you the next word, right? That's the same idea here: you've given it enough sleep patterns, enough sleep data, that now it's discovered the general patterns of human sleep across these different modalities. So I think that is really cool. And just as it was surprising for us that the LLMs, whose only job was to predict the next word, also encoded all of this other information, right? Being able to write poems and all these other things that were kind of implicit in how humans speak to each other. Similarly, here the patterns of sleep also contain useful information about a person's health. Now, when I say it that way, it seems unsurprising.

SPEAKER_01:

It does, I was gonna say it's not surprising to me, right? Like um a lot of times, for example, you might say, Oh, in early uh early cognitive impairment. Yeah. Like if someone's gonna go on and develop dementia or whatever, that I would expect a sleep disturbance to occur. That's usually an early sign of it. Or um, I mean, we've talked about mood a lot recently. Like if someone has a depressed mood, they may be sleeping more, for example, um, or maybe waking up earlier. So I'm not like from a clinical standpoint, I'm not surprised that sleep can help encode different disease processes because you see issues with sleep with certain disease processes, and not just, you know, I'm not just talking about like insomnia or sleep apnea that are sleep-related diseases. I'm talking about other diseases, right? And uh when you look at all the data in the polysomnogram, it's like, oh, okay, well, it's got, you know, brain data, it's got breathing data, it's got heart data. And so you, you know, it it makes sense that it could predict potentially diseases in so many different categories.

SPEAKER_00:

Yeah. No, exactly. And and I think that that's, yes, in hindsight, it seems pretty straightforward, but I think that that's the basis for it being a foundation for all those other tasks for staging sleep or detecting uh sleep apnea or predicting disease or whatever else, right? It's one model that is the foundation for many different tasks. Um, and I think that that's you know, the paper they call it the language of sleep. It encodes the language of sleep. And it's sort of that, right? It encodes all the patterns of sleep. And I think that that's kind of beautiful. That's kind of great. That's kind of a great use of something like this, right? At some level.

SPEAKER_01:

Well, you're taking it, it's not surprising to me from the clinical standpoint, but I don't think it's being used for that that much either, right? It's not being used to be like, hey, is this person? You're not doing a polysomnogram to be like, are you gonna develop dementia? Like, that's not why we tend to order them. We tend to order them because someone's not sleeping well and we want to figure out what's going on. So it's like, hey, there's all this other data encoded in here that could potentially be predictive of what happens to someone. And in the study, they said, oh, you know, who's gonna develop X disease within six years of the polysomnogram?

SPEAKER_00:

Yeah, and they in fact even had a separate data set that they had never shown the model before. And it was able to use the foundation model as a basis for predicting disease in that other data set, which is really cool, because it shows that it can transfer its knowledge. It's got enough internal knowledge to be able to transfer that and use it in a completely different data set setting, which is pretty cool.

SPEAKER_01:

Yeah. So I want to rewind for a second though, because we you were discussing okay, what a foundation model is and how its job is to predict like the next five seconds of sleep. So, how did we leap from there to being able to predict a disease?

SPEAKER_00:

Oh, yeah. So once the foundation model is trained, just think of it as a black box where you feed it some time series data for the different modalities. You just feed it sleep data, and all its job is to output the next five seconds, and then you put that five seconds back in the input, and then you get five more seconds, and so on and so forth. So you can generate a whole bunch of additional sleep data. But that doesn't seem to have anything to do with predicting disease, right? What the researchers do, and this is what people do with large language models too, is they add on a neural network at the end of it, and they fine-tune that neural network model. So they use the sleep data, they put this other neural network at the end of it, and by fine-tuning, I mean they freeze the original foundation model. They make no changes to that, but they change and update the neural network at the end. You can sort of think of it as like a hat. They just kind of put a hat on this thing, and that's all they change and update. And that's enough. By leveraging the foundation model's base knowledge, that neural network is now very good at predicting whatever task you want it to predict. So the output of this whole system, the foundation model plus this extra neural network, will give you not the next five seconds, but a prediction for a certain disease, for example. And you can put in whatever you want, right?
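
The "hat on a frozen model" recipe described here can be sketched in a few lines of Python. This is a toy under loud assumptions, not the study's method: `frozen_backbone` is a hand-written stand-in for the pretrained foundation model (it just computes a mean and a "roughness" value), the head is a plain logistic regression rather than the neural network used in the paper, and the four example recordings and their labels are invented. The only point it illustrates is the division of labour: the backbone is never updated, and only the small head on top is trained.

```python
import math
from typing import List, Tuple


def frozen_backbone(recording: List[float]) -> List[float]:
    """Stand-in for the pretrained model: turns a recording into an embedding.

    It has no trainable parameters here, which mimics a frozen backbone.
    """
    mean = sum(recording) / len(recording)
    roughness = sum(abs(b - a) for a, b in zip(recording, recording[1:]))
    roughness /= len(recording) - 1
    return [mean, roughness]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def head_output(weights: List[float], bias: float, emb: List[float]) -> float:
    return sigmoid(sum(w * e for w, e in zip(weights, emb)) + bias)


def train_head(embeddings: List[List[float]], labels: List[int],
               lr: float = 0.5, epochs: int = 1000) -> Tuple[List[float], float]:
    """Fit only the head (logistic regression) with full-batch gradient descent."""
    n, d = len(embeddings), len(embeddings[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for emb, label in zip(embeddings, labels):
            error = head_output(weights, bias, emb) - label
            for j in range(d):
                grad_w[j] += error * emb[j]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Invented toy recordings: label 1 = "developed the condition", 0 = "did not".
recordings = [[1.0, 1.2, 1.0, 1.2], [0.9, 1.0, 0.9, 1.0],
              [0.0, 2.0, 0.0, 2.0], [0.2, 1.8, 0.2, 1.8]]
labels = [0, 0, 1, 1]

# The backbone is only called, never modified; the head is the only thing trained.
embeddings = [frozen_backbone(r) for r in recordings]
weights, bias = train_head(embeddings, labels)

calm, restless = [1.0, 1.1, 1.0, 1.1], [0.1, 1.9, 0.1, 1.9]
risk_calm = head_output(weights, bias, frozen_backbone(calm))
risk_restless = head_output(weights, bias, frozen_backbone(restless))
print(f"calm: {risk_calm:.2f}  restless: {risk_restless:.2f}")
```

Swapping the labels for a different disease would mean training a new head while reusing the same backbone, which is what makes the approach cheap to repeat across many conditions.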

SPEAKER_01:

Yeah. So here they said, okay, let's see if we can predict like 130 different diseases.

SPEAKER_00:

Yeah.

SPEAKER_01:

And it, I mean, it's not perfect, right? But it did a really good job with certain conditions like Parkinson's disease, uh, you know, hypertensive heart disease, hemorrhage inside the brain. It did a good job with several kinds of cancer, like prostate and breast. Um, it was highly accurate for death. It was accurate for heart failure, chronic kidney disease, dementia, stroke. So all of these, it's like, wow, there's so much data encoded inside of there. Yes. And what was interesting to me and not surprising was when they started to look under the hood, because it sounds like they could, at least to some degree. There were different pieces of data from the polysomnogram that were more predictive for certain conditions. So, like, for example, your brain waves, your EEG, is gonna be more predictive for things like Parkinson's and dementia, brain conditions. Which makes sense, yeah. Right. And like the respiratory signals are gonna be more predictive of respiratory disorders. Um, your heart, your EKG signals, they're gonna be more predictive of a circulatory disease. So there's so much data that's going in there, and clinically it makes sense that the prediction you get out correlates with, you know, the type of data that you would expect it to.

SPEAKER_00:

Yeah. And I I think that that that's it. But I think that because you'll get the sleep patterns from so many different modalities, you can predict a lot of different things, not just the brain, you know, yes, the brain data was more sensitive to the brain diseases, other, you know, other way around, but you know, you you're you're recording other things, so that's why you're able to track all these other diseases.

SPEAKER_01:

So it's like multimodal. So it's like, okay, maybe something was respiratory dominant, but the other signals also played a role in some of this prediction.

SPEAKER_00:

Yeah, and I think it would be surprising if we've discovered some diseases that we didn't expect uh to have correlations with certain modalities. You know, I I don't know if that's something that's potential future work, but it feels like that could be very interesting. Um, you know, I think from a future perspective, I would also be interested to see, okay, what is the correlation between um, you know, these sleep, these big um, you know, all these different modalities from like something like your wearable. Like, is there a way to do these sorts of predictions at a maybe a lower scale level uh or boost the performance of the your your wearable devices, which are recording much less, right?

SPEAKER_01:

Yeah, they're recording way less. But you're getting, in a way, you're getting you're getting a different amount of data though. It's not just like one night on your wearable. You can wear your wearable every night, but you're not getting the amount of data and the amount like you know, you're not getting the brain waves, for example. Yes. Um that you'd be getting with a polysomnogram. So it'd be, yeah, interesting.

SPEAKER_00:

I also think that it is worth noting that you do get a limited cohort here, because everybody who's doing this test, all the PSG patients, were all patients. They all had a test for a reason.

SPEAKER_01:

A test for a reason. They had a sleep issue or a sleep disturbance of some sort. Somebody ordered this test on them.

SPEAKER_00:

So it'd be it'd be interesting to expand this foundation model to people, um, you know, just healthy people, normal people who are volunteering for this and and being able to record their sleep data as well, um, and seeing if that helps with other that improves the prediction rates in general. Um, you know, I would imagine the foundation model would be more robust if it's also got healthy people sleep data, right?

SPEAKER_01:

Yeah, I mean, like, you know, from a scientific standpoint, you're like, well, I I can only apply this study to the types of people who are studied. You can't just like apply it to the general population. Yeah, yeah. Um, so I agree with that.

SPEAKER_00:

Yeah.

SPEAKER_01:

So I I have a question for you.

SPEAKER_00:

Uh-oh.

SPEAKER_01:

Well, like, would you want to know? Wait, like, say you had a polysomnogram. Would you want something afterwards that was like, here's your risk for 130 different diseases?

SPEAKER_00:

I mean, if I can do something about it, then yes, I would want to know.

SPEAKER_01:

Yeah, I mean, I think that's fair. There are some of them where you could maybe do something about it. Like uh, they mentioned hypertensive heart disease, for example. Um, and it's like, hey, there's something you could do to prevent that, like taking medication for blood pressure, um, you know, decreasing the salt in your diet, exercising more, whatever. So, but but then there's other stuff where like it's if it's just gonna happen, like um dementia or Parkinson's disease, do you like want to know six years ahead of time?

SPEAKER_00:

Yeah, maybe, because you know, here's what I mean, in my perspective at least. If it predicts something six years down the line for me, I might make lifestyle choices right now. Now, they might be hypotheses, they might not work out, but I might be able to say, hey, I'm going to try doing something, right? Some kind of lifestyle choice, medication, whatever, for the next year. And then a year later, I would like to run the sleep study again, or run it through the foundation model again, and see if that's still true. If it's still predicting six years from now, or five at that point, and if I've improved my chances or not. And you know, that's how I would think about it. But I don't know.

SPEAKER_01:

I feel like I'm the opposite. I would be like, uh, if I if I like I just want to like live my life and not feel anxious about this thing that's gonna happen that I have like no control over.

SPEAKER_00:

But that's my point. If you have no control over it, then definitely I don't want to know. But my belief is that all of these things I do have potential control over because everything is based on my food, my lifestyle, my, you know, just medications that I but some of it's based on like genetics and age and things that you like can't control. Yeah, yeah. Well, that's fair.

SPEAKER_01:

That's I think that's tough. And the other thing that strikes me is like well, say you got this result back, and it was like, you have a 93% chance of developing Parkinson's disease. You're like, well, am I in the 7% that won't get it? And like, how am I just gonna have that hanging over my head for the next six years, wondering if I am going to develop it or not?

SPEAKER_00:

Yeah.

SPEAKER_01:

That's like, and that's one of the ones that it predicted with high accuracy. But there are some where it's like, hey, um, you know, your chance of chronic kidney disease is 82%. You're like, well, what if I'm in the 18% that doesn't get it and now I'm just gonna worry about it for six years?

SPEAKER_00:

But that's my point. If you can do something and then a year later recheck it and your numbers have gone down, then that whatever thing you did is gonna make you feel better and keep up with that, right? Or maybe you did nothing. I don't know. That would be interesting too, if you did nothing and your numbers changed. Because, you know, presumably if your sleep doesn't change, then the outcome won't change either. But if for some reason your sleep changes and all those changes are recorded, then it's not causal, right?

SPEAKER_01:

The sleep is just like correlational.

SPEAKER_00:

Yeah, yeah, yeah. But the sleep changing suggests that something else might have changed in your life.

SPEAKER_01:

Yeah, maybe. I do think there's this component of like if you test enough, if you test for 130 diseases, something's gonna pop back, like, oh, you have a high chance of X and Y out of the 130, right? Like it's just gonna happen. And if that happened to me, I'd be like, I don't know that I wanted to know that, especially for something that like I could not have an impact on.
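
The "test for 130 diseases and something will pop" intuition can be checked with a short calculation. The Python sketch below is a back-of-the-envelope illustration only: the 5% false-positive rate per disease and the assumption that the 130 predictions are independent are both made up for the example and do not come from the study.

```python
def prob_at_least_one_false_flag(n_tests: int, false_positive_rate: float) -> float:
    """Chance that at least one of n independent tests raises a false alarm."""
    return 1.0 - (1.0 - false_positive_rate) ** n_tests


def expected_false_flags(n_tests: int, false_positive_rate: float) -> float:
    """Average number of false alarms across n tests."""
    return n_tests * false_positive_rate


N_DISEASES = 130            # number of conditions mentioned in the episode
ASSUMED_FP_RATE = 0.05      # hypothetical per-disease false-positive rate

p_any = prob_at_least_one_false_flag(N_DISEASES, ASSUMED_FP_RATE)
avg_flags = expected_false_flags(N_DISEASES, ASSUMED_FP_RATE)
print(f"P(at least one false flag) = {p_any:.4f}, expected false flags = {avg_flags:.1f}")
```

Under those assumed numbers, a healthy person would almost certainly see at least one alarming result, which is the statistical side of the worry about over-testing raised here.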

SPEAKER_00:

Yeah.

SPEAKER_01:

I think, you know, there's a lot of like anxiety people can develop about their health. Sure, it might change the course of your life. You might decide, hey, like before I develop Parkinson's, I'm gonna go on this big vacation that I've been waiting for. Um, but on the other hand, then you're like, oh, I've got Parkinson's disease hanging over my head, and I wouldn't have known that for five years, and maybe I would have rather like lived in peace.

SPEAKER_00:

I don't know. Yeah, I that's fair. I mean, I think both views on that are completely fair and it's hard to know.

SPEAKER_01:

Yeah, it's something that, you know, maybe a provider could know without a patient knowing. I don't know. I don't that's sort of a weird dynamic then too. It's it's an interesting question. There's like there's I think there's such a thing as like knowing too much though, and maybe not wanting all that information, like too much testing.

SPEAKER_00:

Yeah. Yeah. I'm not sure. You know, I'd be curious also to ask people who do have those conditions if they would have liked to know sooner, you know, in hindsight. And if not, were they happy being ignorant until the point of the disease and then just dealing with it as it comes, or would they have preferred to know five years before, knowing that there's really not much they could have done anyways.

SPEAKER_01:

Well, I think you you do highlight like an important difference. Like there are some of these where you could maybe do something to prevent them, and others where you just like maybe probably can't. Yeah. And there's a there's a distinction there because I think I would want to know if I could prevent something from happening if I, you know, maybe just needed to take a medication or whatever. Um it's very different if I can't prevent something from happening.

SPEAKER_00:

Yeah. I mean, regardless, I think this was a very interesting approach. And the idea of having a model that generalizes sleep patterns, which indicate other things, I think opens the doors to a lot more potential uses for a foundation model like this. Even outside of predicting disease, there could be other use cases for something like this.

SPEAKER_01:

And there's a ton of data captured in the polysomnogram. I think this like question of, oh, can we do more with it than just like predict a sleep disorder is a really good one. Cause like from here, the answer is like probably yes. Like there's so much data encoded in there in that one night of sleep that you're getting, because you're getting respiratory, circulatory, brain, like eye movement, all this other stuff is getting fed into this system. And it's like, well, now that we have the ability to do computations on such large amounts of data with the click of a button, like, why not? Why shouldn't we try? And I think that's that's a great use.

SPEAKER_00:

Yeah, yeah, yeah.

SPEAKER_01:

I think with that we can we can close out here and um yeah. Thanks for joining. We'll see you next time on Code and Cure.

unknown:

Bye bye.