Welcome to 2022, where I’m going to ease into things with another edition of “Revising and extending remarks previously made on Twitter.” Specifically, this thread about an article in the New York Times about pre-natal tests for rare disorders that have high false positive rates.
As I said in the tweet linked there, I went to read this expecting that I’d end up doing an easy dunk tweet about the base rate problem; see, for example, Kevin Drum (not the cheapest dunk, by any means, just the one that was in my RSS feed this morning). It was pretty obvious from the title and graphic what the stats issue involved was, and the NYT has a long track record of running parenting articles that mostly convince me that everyone they interviewed is faintly awful. Somewhat to my surprise, though, I came away from it with a bit more sympathy than I expected, because we went through a version of this about fourteen years ago.
Back when Kate was pregnant with SteelyKid, she had a first-trimester appointment with her doctor that we expected to be pretty routine, so she went solo (I had a class that day, or a meeting, or something; I forget what). She called me from the parking lot afterwards all giddy and excited because they had done an early ultrasound and she’d seen the heartbeat. Then an hour or two later she called me again, on the verge of a breakdown, because the doctor had called back asking us to come in for a consult because something was wrong.
Kate’s appointment had just been with a tech who took pictures, not the doctor who interpreted them, thus the whipsaw effect: when the doctor looked at the pictures in a quantitative way, the relative size of some region in the images indicated an elevated risk of some really horrible genetic conditions. I remember them saying there was something like a one-in-ten chance of some abnormality. Instead of happily announcing the pregnancy to all and sundry, we were scheduled for a whole battery of new tests of various types.
Happily, these all came up clear, and SteelyKid was born happy and healthy and is now a pretty awesome teenager. The story ends well, not sorry for the spoilers.
(The last of the tests was with a pediatric cardiologist who looked a bit like James Cromwell. The referral form he’d gotten was missing key information, so when he came in he asked “Why are you here?” We explained the story, and he said “Ah, got it, I know what I’m looking for.” Then he did some wizardry with high-resolution Doppler ultrasound looking at blood flow in the heart (it was seriously cool tech…). After he shut down the machines, he turned back to us and said “Why are you here, again?” then smiled, and said “There’s absolutely nothing wrong.” He wins the bedside manner prize for this whole experience…)
As is my wont, I spent a bunch of time in this period fuming about the medical establishment for putting us through all this, but on a more rational, intellectual level, I know they were doing the right thing. A one-in-ten chance of one of the conditions they listed is a huge and scary increase over the base rate for parents of our age and background, which I recall as something like one-in-800 (for the least severe of the various options). Once they had the information suggesting that eighty-fold increase in the odds of Bad Things, it would be medically irresponsible not to investigate further.
At the same time, though, one-in-ten is not a very likely outcome. It’s a huge increase in the odds of Bad Things, but still, 90% of the time, it’s going to be a false alarm. Which puts it pretty much in the same ballpark as those stories in the New York Times piece.
On a purely numerical, rational level, this is, in fact, just the base rate problem: given a sufficiently rare condition, a positive result from even a test with a remarkably low false-positive rate will be wrong more often than not. The conditions those tests in the Times story are looking for are even rarer than the stuff we were worried about with SteelyKid, so it’s not surprising that a large percentage of the positive results turn out to be false positives.
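For anyone who wants to see the arithmetic behind the base rate problem, here is a minimal sketch using Bayes' rule. The one-in-800 prevalence is the figure I recall from our own case; the 90% sensitivity and 1% false-positive rate are purely illustrative assumptions, not the specs of any actual screening test, and the function name is just something I made up for the example.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Probability that a positive result is a true positive (Bayes' rule)."""
    # Fraction of everyone tested who has the condition AND tests positive.
    true_positives = sensitivity * prevalence
    # Fraction of everyone tested who is healthy but tests positive anyway.
    false_positives = false_positive_rate * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Illustrative numbers only: the prevalence is the base rate recalled in
# this post; sensitivity and false-positive rate are assumed for the example.
prevalence = 1 / 800
sensitivity = 0.90
false_positive_rate = 0.01

ppv = positive_predictive_value(prevalence, sensitivity, false_positive_rate)
print(f"Chance a positive result is real: {ppv:.1%}")
print(f"Chance it is a false alarm:       {1 - ppv:.1%}")
```

With those assumed inputs, a positive result turns out to be real only about one time in ten, even though the hypothetical test mislabels just 1% of healthy pregnancies. Make the condition rarer and that fraction drops further, which is the whole story in a nutshell.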
At the same time, the fact that these conditions are so rare and devastating makes even a one-in-ten chance of having them incredibly alarming from a medical perspective. It would be wildly irresponsible not to inform the parents and investigate further once you have those results; there’s probably a spirited argument to be had about whether it would be ethical to not do the tests.
(For the record, that’s what we opted for with The Pip, three years later: given the experience with SteelyKid, we specifically asked them not to do the first-trimester screening test that had so alarmed us. They can identify all the relevant conditions later in pregnancy with tests that have lower false-positive rates.)
The problem is, this isn’t just a question of numbers, but of parental psychology. Specifically, you have the base rate problem colliding with loss aversion, which has a powerful impact on how those odds are perceived. Low odds of a Very Bad Thing loom much larger than the higher chance of a positive outcome, particularly one that you’ve already been expecting.
I say this with the confidence that comes from experience. Our doctors were very responsible, clearly informing us that we were looking at one-in-ten odds of bad things, and I’m someone who works with numbers for a living. I know about base rates, and I know that one-in-ten isn’t a very likely outcome. And I can still vividly remember the sleepless nights between that initial consult and the final “all clear.” Those weeks fucking sucked for reasons that had nothing to do with stats and probability.
So, as I said on Twitter, I don’t think this is actually an easy dunk— there’s a genuinely hard problem here. On the one hand, these are really rare conditions, and the false-positive rates here are necessarily going to lead to a lot of expecting-parental angst that will turn out to be nothing in the end. On the other, these are really awful conditions, and it would be borderline unethical not to inform parents of the elevated risk, even if that risk is still very low in absolute terms. Threading that needle to provide the necessary information while minimizing undue anxiety is not easy.
I’ve seen a lot of glib Twitter dunks about how this just shows the need for more statistical literacy, but as someone who’s well above the median in numeracy, I can tell you that the psychology here plays a huge role. It’s one thing to know the math and another to sit in a medical office waiting for those results. The Times piece reads a bit like they want you to come away thinking these tests shouldn’t be used at all, but I don’t think that really works, either: given how rare these conditions are, a positive result that’s right one time in ten is actually remarkably good, and given how horrible they are, I think doing the tests is entirely justified. This is a genuinely difficult situation.
The only clear villain in this piece is whoever writes the ad copy for those tests, touting their accuracy in terms that are just about optimized for generating the greatest possible level of parental alarm. Fuck those guys.
Just so this ends on a more upbeat note, here’s SteelyKid now (well, back in the fall, anyway…):
(I told you the story would end well…)
So, yeah, um, Happy New Year. If you like this, here are some buttons:
(Though I fervently hope that most of the content you’ll get in your inbox that way involves less personally painful memories…) The comments to this post will also be open, should you be so moved.