Code & Cure

#38 - Using AI Can Make You Look More Guilty In Court

Vasanth Sarathy & Laura Hagopian


What happens when AI spots a dangerous finding on a scan and the radiologist disagrees? In theory, “human in the loop” sounds like the safeguard that keeps patients safe. In practice, it raises a far more uncomfortable question: when clinicians override AI, are they exercising sound judgment or exposing themselves to legal risk?

We explore how AI image-reading tools are reshaping radiology and why performance metrics like “96% accurate” can be misleading in real clinical settings. False positives and false negatives do not carry the same consequences, and rare diseases can sharply reduce the real-world value of even highly capable models once prevalence and positive predictive value are taken into account. As these systems flag more normal scans, a new form of defensive medicine can emerge—one where repeatedly rejecting AI recommendations begins to feel professionally dangerous, especially when those recommendations are documented in the patient record.

We also examine a study that placed laypeople in the role of jurors during malpractice scenarios involving missed diagnoses such as brain bleeds and lung cancer. The findings are revealing: when AI detects the pathology and the radiologist does not, jurors are more likely to assign blame. But when both the AI and the radiologist miss the finding, the physician gains little protection. The episode closes with what may actually reduce harm, including better education about the limitations of AI and a clearer understanding of these systems as imperfect clinical decision support—not a flawless second expert beside the clinician.

References:

Bernstein et al. "Randomized Study of the Impact of AI on Perceived Legal Liability for Radiologists." NEJM AI.


Credits:

Theme music: Nowhere Land, Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0
https://creativecommons.org/licenses/by/4.0/

Risk And Human In The Loop

SPEAKER_01

What's riskier today? Missing a diagnosis or disagreeing with AI?

SPEAKER_00

Hello and welcome back to Code and Cure, the podcast where we discuss decoding health in the age of AI. My name is Vasanth Sarathy. I'm a cognitive scientist and AI researcher, and I'm here with Laura Hagopian, an emergency medicine physician who works in digital health. On many of our episodes so far, we've talked a lot about having a human in the loop.

SPEAKER_01

Yeah.

SPEAKER_00

Right? So a human in the loop just means that while an AI system is in operation, providing information or sometimes attempting to make a decision, we want a human there confirming it: checking the box and saying, yes, the AI did the right thing, or checking the AI's reasoning, or making sure the AI's output is consistent with what they're looking for. Or saying, no, it didn't do the right thing, right?

SPEAKER_01

And then, oh, we need to either change the AI or do something different, right? That's part of being in the loop: being able to reject what the AI says, too.

SPEAKER_00

That's right. And the reason is exactly that: to be able to reject it, because AI systems are imperfect. They have all these issues, and we're trying to get better at them. But I thought they were magic! Well, that's the thing, right? They're being used in so many different settings now, and becoming more and more autonomous, so this question of human in the loop is a very serious one. And often the human-in-the-loop solution is cited as a quick fix: hey, let's just do that, because it would resolve any issues. Use the AI as a tool, like any other tool, and ultimately the human is accountable, so having them in the loop should be good enough. Well, that's where the study we're going to talk about today turns things on their head a little bit. It really asks: what does it mean to be a human in the loop? And what does accountability mean in this context?

SPEAKER_01

I do want to step back, too, because when it comes to AI, I'm still a layperson, right? I say, oh, AI is involved, and I sort of automatically trust it. Oh, it's going to know more, it's going to do better. It's an algorithm; there's no human there to introduce error. And you have to take a step back and ask: wait, can the AI be error prone? How was it programmed? How is it checked? What are the checks and balances? Are there things it might be missing? Are there things it might be flagging that it shouldn't? It's really easy to just trust it, because you think: it's a machine, it's ones and zeros, it's going to do the right thing.

Why AI Feels Like Magic

SPEAKER_00

And especially when the task is machine-like, right? Reading a bunch of numbers, looking at code, looking at an image: these all seem like things you can just send off digitally, have it run some computations, and get back the right answer, because computation is what machines do really well, right?

SPEAKER_01

And so it should be like magic. That's kind of how it computes in my brain, but it's not, right? It's easy, as a layperson, to perceive these kinds of tasks, like reading a radiology image, which we'll get into, as things it should just be able to do, and do perfectly.

SPEAKER_00

Right, exactly.

SPEAKER_01

And the truth is that it's not. We know that it's not, but it's easy from the outside to just assume that it would be.

SPEAKER_00

Yes. And in past episodes, we've talked a lot about an AI system being good at something in terms of accuracy, right? The idea that it gets things right. You train an AI model, then you evaluate it and come back with some number: it's 96% accurate, or whatever. But what does that number mean? It lumps together all the possibilities: the cases the AI correctly flagged, the cases it correctly cleared, and the cases it got wrong in either direction. So there's a lot of nuance to even something as simple-sounding as "96% accurate." Of all the things it should have flagged, how many did it actually flag? And of the things it did flag, how many were actually right? This study really digs into that and connects it beautifully with the human-in-the-loop aspect. And we're in the world of radiology. In the world of radiology, you have images, right?
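
A quick illustration of that nuance, with made-up numbers (none of these figures come from the study): a model can score "96% accurate" while catching nothing at all, if the finding is rare in the evaluation set.

# Illustrative numbers only: 1,000 scans, of which 40 truly show the finding.
total_scans = 1000
scans_with_finding = 40
scans_without_finding = total_scans - scans_with_finding  # 960

# A useless model that calls every scan "normal":
true_negatives = scans_without_finding  # 960 normals correctly cleared
false_misses = scans_with_finding       # all 40 real findings missed
accuracy = true_negatives / total_scans

print(f"accuracy: {accuracy:.0%}")                    # 96%
print(f"findings caught: 0 of {scans_with_finding}")  # catches nothing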

SPEAKER_01

And there's a lot of AI image-reading technology out there.

SPEAKER_00

Right, right.

SPEAKER_01

Right. And the truth is that some of it can be misleading.

SPEAKER_00

Yes.

SPEAKER_01

So even though we want to believe it, we know that some of it's wrong. Yeah. And then you could have radiologists almost be influenced by that wrong information.

SPEAKER_00

Yeah. Right?

Accuracy Breaks On Rare Cases

SPEAKER_01

Especially if they know it's going to be put in the patient's file, for example. Yes. So even if you have really, really accurate AI algorithms for radiology reads of things like x-rays and CT scans, they're not perfect. And there are examples out there of tools that were very sensitive and very specific, seemingly a great use case for AI, that started wrongly flagging lots and lots of cases, because the condition was so rare. So even though the tool is very good at what it was built to do, when the condition is really rare, the positive predictive value, despite high accuracy, is not that great.
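
A sketch of that prevalence effect using Bayes' rule, with assumed numbers (the 96% sensitivity and specificity and the 1-in-1,000 prevalence are illustrative, not figures from the episode):

# Assumed performance: the tool flags 96% of real cases (sensitivity)
# and correctly clears 96% of normal cases (specificity).
sensitivity = 0.96
specificity = 0.96
prevalence = 0.001  # a rare condition: 1 in 1,000 scans

# Of all flagged scans, what fraction truly have the condition (the PPV)?
true_flags = sensitivity * prevalence                # real cases flagged
false_alarms = (1 - specificity) * (1 - prevalence)  # normals flagged anyway
ppv = true_flags / (true_flags + false_alarms)

print(f"positive predictive value: {ppv:.1%}")  # roughly 2.3%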

SPEAKER_00

And I'm glad you brought up that rare situation, because that's exactly where we say, oh, have the human in the loop, because they have the expertise and intuition to identify the one feature that made the rare thing really important, right? And to override what the AI is saying, because the AI is thinking in averages, not in rarities. And the human sitting there says, wait a minute, no, there's this thing I can see that is very rare, but in this case it matters more than anything else.

SPEAKER_01

And I don't want to miss it.

The Fear Of Disagreeing

SPEAKER_00

I don't want to miss it. So that's always touted as the value of having the human in the loop, right?

SPEAKER_01

But here's the thing, too. Say I were a radiologist, which I'm not, and I saw an AI read of something and didn't agree with it. If I actively disagreed with the read, but I knew, hey, this is going into the patient's chart, man, that would make me feel, ooh, am I going to be liable for this? It just feels like it doesn't look good if I disagree with it. It would maybe be easier for me to just agree with it. Especially: what if I got sued?

SPEAKER_00

Yeah, right. Exactly, exactly.

SPEAKER_01

Then the chart has both reads, the AI read and my read. And if they disagree, what's going to happen to me?

SPEAKER_00

Yeah, and building on the idea of it feeling kind of weird, it would also depend on the result. If you knew the result was that you were right all along, you'd be more confident. You'd be like, great, the AI got it wrong; it didn't take all these factors into account, I did, and I got it right. But if you got it wrong, then all of a sudden it could be perceived very differently, right? And that's the whole context of this study.

SPEAKER_01

For sure. And I think in particular, if I as a radiologist thought something was normal and the AI called it abnormal, that's the situation where I would feel the most uncomfortable. Because I'd wonder, number one, am I getting it wrong? But if I do believe I'm right, am I still increasing my own liability risk, especially if both versions are going into the chart? It's almost like a penalty for using AI: it becomes easier for the radiologist to just agree with it, even though it might be wrong, which could lead to problems for the patient. They get diagnosed with something they don't actually have, or they keep getting testing to see what's going on. It costs the medical system money, and so on. That could snowball in its own way, right? But getting it wrong is also a problem.

SPEAKER_00

Yes.

SPEAKER_01

And so I think it's this weird situation where you're trying to figure out: okay, I am the human in the loop, but I also don't want my liability increased, and I still think the AI tools are useful.

SPEAKER_00

Yes.

SPEAKER_01

And I think that's what this paper really dives into. So I think we can get into what they actually studied.

A Jury Study On AI

SPEAKER_00

Let's do it.

SPEAKER_01

So what they did was take a bunch of laypeople, meant to be representative of what a jury could look like: adults in the United States, 18 years or older. And they gave them a couple of case vignettes. In both vignettes, the radiologist missed what was happening.

SPEAKER_00

Yes.

SPEAKER_01

One of the case vignettes was a bleed within the brain, and the second involved lung cancer. In both cases, they said, hey, the radiologist missed the finding, and that resulted in something bad: it resulted in death, or in a permanent disability.

SPEAKER_00

This is a hypothetical situation.

SPEAKER_01

Hypothetical, yes, a hypothetical situation, but the radiologist missed it. And then they had different scenarios. In the base scenario, the control, they didn't say anything about AI.

SPEAKER_00

And they just provided the vignette and said the radiologist missed it. Yeah.

SPEAKER_01

And then there was a question that went out to the juror: did the radiologist meet their duty of care to the patient?

SPEAKER_00

Which is the measure of whether or not a radiologist is liable for malpractice, right?

SPEAKER_01

Exactly. Yeah. And so they could either side with the plaintiff, finding the radiologist liable, or side with the defendant, the radiologist. Right. And they compared this against the control scenario, where no AI was used. Then they ran different iterations. The first major variation was whether or not an AI found pathology there. Did the AI identify something wrong?

SPEAKER_00

Okay. So in that condition, the AI found that something was wrong, because there was something wrong in this vignette, right? And the radiologist missed it.

SPEAKER_01

If that was true, yes. That was the second condition.

SPEAKER_00

Right.

SPEAKER_01

And if the AI found a pathology and, in effect, disagreed with the radiologist, people were like, oh, that's not good. They were more likely to side with the plaintiff and less likely to side with the radiologist.

SPEAKER_00

Yes. So basically, in a sense, they're saying, hey, radiologist, you should have listened to the AI. It obviously knows more than you.

SPEAKER_01

Right.

SPEAKER_00

Okay. And that's already a little problematic, right? A little bit.

SPEAKER_01

It's also not surprising to me. It's not surprising that people would say, hey, the AI flagged it, you should have seen it, because this was actually true.

SPEAKER_00

Yeah.

SPEAKER_01

At the end of the day, this was actually abnormal.

When AI Disagrees You Pay

SPEAKER_00

Yeah. Or, at the very least, wondered: okay, why did the AI say that? And dug deeper. The digging deeper would have been the correct practice; the standard of care would have been to dig deeper, and as a result, you would have found it. That's the implication here. I'm putting words into their mouths, but that's the idea.

SPEAKER_01

It's like, hey, if the AI contradicted the radiologist's decision and it was later confirmed that the AI was right, jurors were more likely than in the no-AI control scenario to say, yes, the radiologist messed up here.

SPEAKER_00

Yes, yes.

SPEAKER_01

So what about the opposite scenario? There was still pathology, because every case had a missed brain bleed or lung mass that the radiologist didn't catch. Yeah. But in the second scenario, the AI also missed it.

SPEAKER_00

Yes. What then?

SPEAKER_01

Yeah. So this is interesting, because it wasn't any better. It was just the same as the control. It's not like it helped the radiologist out in the eyes of these so-called jurors. Yes. It was the same as if they hadn't used AI at all. Right. So AI hurt them when it disagreed with them, and it didn't help them when it agreed with them. Right. It's a one-way street there.

SPEAKER_00

But that is kind of weird, right? Because people are then saying, hey, well, you didn't listen to the AI, but it was wrong anyway. I mean, these people have the benefit of hindsight, right? They already know the right answer. And that is realistic: in a real trial, jurors do know the right answer. So you're looking back at an event that happened and asking, should the radiologist have dug deeper in this scenario? And they're saying, not really, they're fine. The AI didn't disagree with them, didn't give them any reason to think otherwise, so they're fine.

SPEAKER_01

Yeah. So in general, this first part of the study says: if you miss a pathology that the AI finds, you're more liable. Yes. But the flip isn't true. Right.

SPEAKER_00

Right.

SPEAKER_01

And so from a liability perspective, the AI is not helping you out here.

SPEAKER_00

No, no, it's hurting you.

SPEAKER_01

It's potentially hurting you. Right. And so the concept is: hey, you're actually going to be judged more harshly in court for having used AI, which is being integrated everywhere, right? It's like: outlook not good.

SPEAKER_00

Yeah, right, right.

Teaching False Alarms And Misses

SPEAKER_01

Outlook not good if you've used AI. So that's important as a base scenario. Then what they did in further scenarios was say, hey, we're going to educate the jurors about how AI is used and what its problems are, because it's not magic, right? In one version, they said: actually, the AI produces false alarms. It's not perfect; it may flag things that are actually totally normal. Yes. And in another version, they did the flip. They said: hey, AI sometimes has false misses. It can miss things that are actually there. So in both of these scenarios, they were basically pointing out to the jury that the AI is not perfect.

SPEAKER_00

Yes. And just to pause there for a second: people might hear the terms false positive and false negative. These are related to the notion of accuracy I was talking about earlier, and they're very confusing concepts. I think false misses and false alarms are a better way to think about it.

SPEAKER_01

That's why I used those terms, because I was getting confused. It's one minus the PPV, or whatever.

SPEAKER_00

Yeah, these are all quite confusing. So just to reiterate: a false alarm means the AI said there was a brain bleed when there actually wasn't. And a false miss is when the AI says there's no brain bleed, but there actually was. Right. One is more consequential than the other, in a weird way, but having too many false alarms makes the person using the AI trust it less, because it's raising alarms for everything. It's saying, oh, there's a brain bleed here and a brain bleed there, and at some point you're like, it's crying wolf. What do I actually believe?
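
Since the terminology trips everyone up, here is a minimal mapping of the episode's "false alarm" and "false miss" onto the standard labels (the function and its booleans are purely illustrative):

# Map an AI read against ground truth onto the four standard outcomes.
def outcome(ai_flags_bleed: bool, bleed_present: bool) -> str:
    if ai_flags_bleed and bleed_present:
        return "true positive (correct flag)"
    if ai_flags_bleed and not bleed_present:
        return "false alarm (false positive)"
    if not ai_flags_bleed and bleed_present:
        return "false miss (false negative)"
    return "true negative (correct all-clear)"

print(outcome(True, False))  # false alarm: AI said bleed, there was none
print(outcome(False, True))  # false miss: AI said no bleed, there was one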

SPEAKER_01

And if that goes in the chart, you could see how a radiologist who sees this over and over again might think, maybe I should start agreeing with it. It could cause a sort of defensive-medicine bias, where you don't want to keep rejecting what the AI says; you'd rather minimize the cost of the error than the number of errors. That's called an adaptive bias. And it makes the radiologist more vulnerable in that way.
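
A toy expected-cost sketch of that adaptive bias, with entirely made-up numbers: when overriding a correct AI is far more costly to the radiologist than agreeing with an incorrect one, agreeing minimizes their personal expected cost even when they believe the AI is probably wrong.

# Made-up costs, loosely shaped by the study's liability findings:
# overriding an AI that turns out to be right is the expensive outcome.
p_ai_right = 0.3                # radiologist's belief the AI flag is real
cost_override_when_right = 100  # missed finding, with the AI flag on record
cost_override_when_wrong = 0    # override vindicated
cost_agree_when_right = 0       # agreed, and the finding was real
cost_agree_when_wrong = 10      # unnecessary follow-up testing

exp_cost_override = (p_ai_right * cost_override_when_right
                     + (1 - p_ai_right) * cost_override_when_wrong)  # ~30
exp_cost_agree = (p_ai_right * cost_agree_when_right
                  + (1 - p_ai_right) * cost_agree_when_wrong)        # ~7

# The radiologist thinks the AI is probably wrong (70%), yet agreeing is
# "cheaper" for them personally: minimizing cost of error, not error count.
print(f"override: {exp_cost_override:.0f}, agree: {exp_cost_agree:.0f}")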

SPEAKER_00

Yes. And think about it for a second. Let's go back to our conditions. In one of them, they had basically the same situation as before: the AI got it right and the radiologist got it wrong. But they told the jury that the AI gets things wrong, that it has a fair number of false alarms.

SPEAKER_01

Exactly. Like, half the time.

SPEAKER_00

Right. And in those cases, they found the radiologist less culpable, less at fault.

SPEAKER_01

Exactly.

SPEAKER_00

Because they were like, okay, if you're the radiologist, it's not fair for us to expect you to always listen to this AI that's clearly problematic. Exactly. Right.

SPEAKER_01

Exactly. I think that's the main point. It's like, hey, we know we're using AI as tools in radiology. Yes. That's a given; it's already happening, it's out there. Yes. But when we think about medical malpractice and liability coming down the line, which it will and it is, then you have to think about how the AI tools are going to affect the perception of a jury.

SPEAKER_00

Right.

SPEAKER_01

And as we said at the start, a lot of people think this is magical: if the AI was right and the radiologist was wrong, then it's the radiologist's fault. That's the base scenario, and you know that's going to be a problem. So the question becomes, how can we use these AI tools and mitigate that risk in court? Because what we don't want is more malpractice lawsuits, or more people being found liable, when the underlying situation hasn't really changed. We're just using a new tool, which is supposed to be helpful.

SPEAKER_00

Yeah, right, right, right.

SPEAKER_01

And I think they started to answer that question in part by showing that the AI is not magical, and by attorneys potentially coming in and saying: hey, did you know what the false positive rate is? This is how often there are false alarms; this is how often there are false misses. Because the jury really needs to be able to interpret the AI not as magic, but as an imperfect tool that was used.

Tool Or Partner And Closing

SPEAKER_00

Exactly. Exactly.

SPEAKER_01

And at the end of the day, I think this concept of an AI penalty is a real one. You're using AI, and it can be a helpful tool, but if it increases your liability risk, then there's a penalty that goes along with it. So it's our job as providers, as attorneys, and as the public to understand that and to bring it up. And if there is a malpractice case, to bring in the information that will help mitigate some of that AI penalty, by presenting the data that exists and interpreting it, which is hard. We're stumbling sometimes even talking about false alarms and false misses and all of this. But it's really important to interpret the AI within that context, and not to assume it's always right, because we know it's not.

SPEAKER_00

Right. And I think also, from a perception perspective, we talked about it being magical versus not magical. In this context, there's also the question: is it a tool or is it a partner? What do people perceive the AI as doing alongside the radiologist? Should the AI be viewed as another radiologist? Because if there were two radiologists looking at the same scan and they disagreed, there might be a conversation between them to figure out where exactly the disagreement lies.

SPEAKER_01

Right.

SPEAKER_00

But in the AI case, that is not what's happening at all. There's no conversation, and even if there were, it wouldn't be the same in that sense. And the problem is the perception of the jury looking at this quote-unquote magical AI tool and thinking it's another radiologist. It's not, right? It's a flawed tool, one with issues, that may or may not be helpful in certain circumstances. And it's still up to the radiologist to make that decision for themselves. You're right: we have to reduce the magical thinking about AI tools and realize that these are the kinds of issues involved, and that this is what might have influenced the radiologist.

SPEAKER_01

Yeah. And if you have an attorney presenting a case and defending a radiologist in court, they've got to bring this kind of information in. Yes. To make sure the jury's perception is that it's not magical. Right. Because this is who you're being judged by: members of the public who may or may not know much about AI tools. Right. And so this has to be in the armament of things you discuss.

SPEAKER_00

Exactly.

SPEAKER_01

Well, I think we can end here. Thank you so much for joining us this week.

SPEAKER_00

Thank you again.