In the last few months there have been tectonic shifts in the world of cancer screening. Peer-reviewed journals, lay press outlets, and prominent medical groups have all published new material with far-reaching implications. This post is about mammography and the import of these recent events, including an unfortunate review in the New England Journal of Medicine. In the coming weeks I will post about PSA testing.
For three decades public opinion and public policy on mammograms have been perfectly synchronized, and perfectly wrong. What started with a seemingly minor error in study interpretation snowballed into a prickly political movement with lobbyists, deep pockets, and a blind faith in early detection. Unfortunately, the original error, one I first described in Hippocrates’ Shadow, led to years of misunderstanding and misinterpretation. But recent study reports have fueled a new interest in the truth about mammography.
Dr. Laura Esserman published the most powerful and important of these articles in the Journal of the American Medical Association in 2009.[i] Dr. Esserman and her colleagues explore U.S. public health data, and point out hard truths. For starters they note that cancer diagnoses increased, and never came back down, after mammography screening was introduced. This is not minor. An effective screening test shouldn’t increase cancer diagnoses. It should allow doctors to find the same cancers, but earlier. There may be a bump when screening begins, but the numbers should soon return to their baseline. Mammograms, however, permanently increased these numbers, which means they identify ‘cancers’ that never would have been found otherwise. That’s not early detection, that’s extra detection.
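The difference between early detection and extra detection can be sketched with a toy model. All numbers below are invented for illustration, not drawn from the actual public health data: a pure early-detection screen produces a transient bump in annual diagnoses (cancers pulled forward in time) and then returns to baseline, while an overdiagnosing screen raises the annual count permanently.

```python
# Toy model (all numbers hypothetical): annual cancer diagnoses in a
# population where 100 cancers per year are destined to surface clinically.
BASELINE = 100        # clinically destined cancers per year
LEAD_TIME = 2         # an effective screen finds each cancer 2 years early
OVERDIAGNOSIS = 30    # extra 'cancers' per year that never would have surfaced

def effective_screening(year, start=5):
    """Pure early detection: a one-time bump when screening begins,
    then incidence returns to baseline (same cancers, found earlier)."""
    if year < start:
        return BASELINE
    if year < start + LEAD_TIME:
        return 2 * BASELINE  # this year's cancers plus cancers pulled forward
    return BASELINE          # back to baseline

def overdiagnosing_screening(year, start=5):
    """Extra detection: incidence rises when screening begins
    and never returns to baseline."""
    extra = OVERDIAGNOSIS if year >= start else 0
    return effective_screening(year, start) + extra

for year in range(10):
    print(year, effective_screening(year), overdiagnosing_screening(year))
```

With these made-up parameters, the effective screen settles back to 100 diagnoses per year after the bump, while the overdiagnosing screen settles at 130 and stays there, which is the pattern Esserman describes in the real data.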
This would be fine if the extra detection came with a life-saving benefit, i.e. if the dangerous cancers were being found early enough to make a life-saving difference. But Esserman points out that deaths never dropped (as they should have) after the introduction of mammography.
Less than a year after the Esserman paper, a study published in the New England Journal of Medicine reported breast cancer mortality statistics from Norway before, during, and after the introduction of mammography.[ii] Norway’s experience is notable because they used mammography in selected regions only. Breast cancer deaths dropped where mammography was phased in. Encouraging. But deaths also dropped where mammography wasn’t available, and the drop was of equal magnitude. This strongly suggests that breast cancer treatments, and not mammography screening, were responsible for the decrease.
A few months later, in March 2011, Dr. Peter Gøtzsche published a study of mammography. Gøtzsche, best known for authoring one of the first major reviews to find no mortality benefit with mammography, has now published dozens of follow-up papers including three this year. His March publication re-examined the mammogram data from yet another angle and again concluded no benefit.[iii] Gøtzsche is an expert in biostatistics and study methodology and his papers tend to be squeaky clean, which is often infuriating to his many detractors and challengers. Despite the often very emotional responses to Gøtzsche’s work,[iv],[v] one finding is clear: in randomized trials there is no evidence of an overall mortality benefit to screening mammograms.[vi]
The most recent addition, and arguably most crippling to the popular view of mammography, is an Archives of Internal Medicine study published online in October (as of this posting it is not yet published in print).[vii] It is a rather blunt report demonstrating that the lives of women whose breast cancers are detected by mammography are rarely, if ever, saved by the mammogram. In other words, the time difference between detection by mammogram and detection clinically (as a lump) is virtually never the difference between life and death. This dismantles the fundamental claim of mammography and discredits the I-had-a-mammogram-and-it-saved-my-life argument, an anecdotal and therefore inarguable refrain that is ubiquitous in popular discussions of mammography.
These recent studies have gained surprising traction in the popular press, culminating in a New York Times piece in November that began,
“After decades in which cancer screening was promoted as an unmitigated good, as the best — perhaps only — way for people to protect themselves from the ravages of a frightening disease, a pronounced shift is under way.”[viii]
But change is hard and there is always a fist-shaking codger. In September the New England Journal of Medicine took on that role by publishing a review outlining the wonderful benefits of screening mammography.[ix]
I feel sheepish pointing out the Journal’s lazy predilection for reviews that flout evidence and ennoble false expertise. But I also feel obliged to point out how an educated author, writing for an educated editorial board, at an occasionally wonderful journal, went off the rails. It is a cautionary tale.
Dr. Ellen Warner, the author of the review, is an established expert on breast cancer screening and breast imaging. She begins her September 15th review article with a dubious claim: that the small improvement in breast cancer deaths seen over the past twenty years is due in equal parts to advances in treatment and increases in mammography. Her reference is a widely publicized NEJM article from 2006,[x] which drew its conclusions using complex computer models. Computer models, of course, need inputs in order to produce outputs. So what were the inputs that led a computer to believe that mammography saves lives? They were obviously not trial data, as Dr. Gøtzsche has shown. Nor were they public health data, as Dr. Esserman has shown.
I emailed the author of the original paper (the paper itself was vague on the point), Dr. Don Berry. He was gracious and prompt, and explained to me that the computers were fed data regarding the stage and type of breast tumors diagnosed by mammography, and compared these data to stage and type of tumors found by other means. The computers then projected the benefits of mammography based on these differences.
This is weird. Predicting the behavior of cancer cells is famously shaky, a great failing of most early detection programs. Prostate specific antigen (‘PSA’), chest x-ray screenings for lung cancer, and ultrasound screening for ovarian cancer, for instance, have all failed. In each case the problem was prognoses based on tumor data. If looking at cancer, either on imaging or under a microscope, could bring certainty then all of these tests would save lives. But predicting the behavior of complex cells inside of an infinitely complex human body is profoundly difficult. Tumor characteristics are therefore about as reliable as a fortune cookie. To take these fortune cookies and use them in another arena entirely, i.e. to determine whether mammograms save lives, is just kooky.
In addition, the biases in this type of data are too many to count. Even if we believed the predictive power of stage and size data, there are many reasons that mammogram-detected tumors differ from others. For one, they come disproportionately from the less aggressive subtypes, the kind of ‘cancer’ that would never spread or cause a problem. Tumors found between mammograms, by contrast, are aggressive and rapidly growing (that’s why they surfaced between screens rather than during one). This points to a deeper bias: mammograms preferentially catch slow-growing tumors precisely because those tumors linger longest in a detectable state, and slow-growing tumors also tend to be the most treatable. These are just a few reasons why mammogram-detected tumors enjoy a distinct and powerful false advantage in any comparison with other tumors.
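This slow-growing-tumor bias (known as length-time bias) is easy to demonstrate with a toy simulation. In this hypothetical sketch (all parameters invented), half of tumors are slow-growing and remain screen-detectable for years, while fast-growing tumors are detectable only briefly before surfacing as a lump:

```python
import random

def length_bias_demo(n=100_000, screen_interval=1.0, seed=0):
    """Toy simulation of length-time bias (hypothetical parameters).

    A tumor is screen-detected if a yearly screen falls within its
    'sojourn window', the period it is detectable before surfacing
    clinically as a lump.
    """
    rng = random.Random(seed)
    screen_slow = screen_fast = clinic_slow = clinic_fast = 0
    for _ in range(n):
        slow = rng.random() < 0.5            # half of tumors are slow-growing
        sojourn = 4.0 if slow else 0.25      # years spent screen-detectable
        wait = rng.uniform(0.0, screen_interval)  # time until the next screen
        if wait < sojourn:                   # a screen catches it first
            if slow:
                screen_slow += 1
            else:
                screen_fast += 1
        else:                                # it surfaces clinically first
            if slow:
                clinic_slow += 1
            else:
                clinic_fast += 1
    frac_slow_screened = screen_slow / (screen_slow + screen_fast)
    frac_slow_clinical = clinic_slow / max(1, clinic_slow + clinic_fast)
    return frac_slow_screened, frac_slow_clinical
```

With these made-up numbers, roughly 80% of screen-detected tumors are slow-growing while the clinically detected group is almost entirely fast-growing, even though the two types are equally common in the population. Any comparison of the two groups will flatter screening regardless of whether screening saves anyone.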
Despite all of this, soon after the original 2006 Berry paper, Don Berry’s study group began working with the United States Preventive Services Task Force.[xi] Thus tumor stage data became the bedrock for the USPSTF mammography guidelines. And in a wonderful example of closing the loop, those guidelines are cited as supporting evidence for Dr. Warner’s NEJM review. In other words, in 2006 Dr. Berry’s group fed biased tumor data to a computer, the NEJM published the results, the USPSTF then used these results for its recommendations, and the NEJM then published a review by Dr. Warner citing both of these prior papers as proof. This is a game of evidence telephone—and the very first call was a prank.
Her reliance on Berry’s tumor data is just one of Dr. Warner’s contortions. She also repeatedly asserts that mammography reduces mortality. With the right qualifier she might be able to get away with this, but she fails to make the distinction between breast-cancer mortality and overall mortality. Most authors, keenly aware of this controversy, are careful to distinguish.
What controversy? I’ve blogged about this topic and also published an article on it in the Journal of the National Cancer Institute, and briefly it goes like this: despite half a million women enrolled, the mammogram trials show no overall mortality benefit, as Dr. Gøtzsche has shown. This result was shocking when it was first published in 2000 because early trials had claimed that mammograms reduce breast cancer mortality. But mammograms never reduced overall mortality. Overall, or all-cause, mortality is the only valid way to measure the effects of any medical intervention.
Why? As noted, mammograms lead to more diagnoses. Not just earlier diagnoses, more diagnoses. And while many of these extra ‘cancers’ are harmless, virtually all are treated. A fast-talking surgeon I knew once said, “when in doubt, cut it out.” And that’s just what he did. Thus women who have mammograms are more likely to have surgery, radiation, and chemotherapy. These treatments occasionally cause fatal complications. But if the only outcome we report from trials is ‘breast cancer mortality’ then fatal complications from unnecessary treatments are never counted.
The same is true for any medical treatment. In the 1980s promising therapies for heart patients were withdrawn after large reviews showed that all-cause mortality was higher.[xii] [xiii] [xiv] In one case cholesterol-lowering drugs reduced deaths from heart attacks, but increased deaths overall.[xv] The pills were abandoned. (These were the fibrate drugs, incidentally, a drug class on the comeback; medical memories run short.) Thus it is essential to count all deaths that occur in any trial of a medical treatment, so that side effects and unanticipated consequences are captured. In some cases these side effects negate the beneficial effect of the treatment.
The early confusion about how to count mortality outcomes led to a mistaken conclusion: that mammograms saved lives overall. But now we know better. In the massive Cochrane review published in 2000 by Gøtzsche (72 pages long, more than 450 references), breast cancer mortality was 15% lower in the mammography group. But all-cause mortality was the same in the two groups. In other words, mammograms didn’t save lives. Whatever life-saving benefit might have been attributable to mammography was offset by fatal consequences, such as complications from treatment.[xvi]
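The arithmetic behind this offset is worth spelling out. The numbers below are hypothetical, not taken from the Cochrane review; they show only how a 15% relative drop in breast cancer deaths can disappear entirely in the all-cause count:

```python
# Hypothetical arithmetic (numbers invented for illustration, not trial data):
# deaths per 100,000 women over ten years.
breast_cancer_deaths_control = 400
other_deaths_control = 9_600

# Screening arm: a 15% relative reduction in breast cancer deaths...
breast_cancer_deaths_screened = round(400 * 0.85)  # 340, i.e. 60 fewer
# ...but overdiagnosis means more surgery, radiation, and chemotherapy,
# and with them the occasional fatal complication.
treatment_complication_deaths = 60
other_deaths_screened = other_deaths_control + treatment_complication_deaths

total_control = breast_cancer_deaths_control + other_deaths_control
total_screened = breast_cancer_deaths_screened + other_deaths_screened

print(total_control, total_screened)  # identical all-cause mortality
```

A trial reporting only breast cancer mortality would call this a 15% win; a trial counting all deaths would correctly call it a wash. That is why all-cause mortality is the outcome that matters.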
Thus the only way for Dr. Warner to retain even a soupçon of truth in her claims that mammography reduces mortality would have been to specifically state that she meant breast cancer mortality, and not overall mortality. She didn’t, making her statements incontrovertibly wrong.
Finally, Dr. Warner makes an argument that I have not heard before. She curtly declares that randomized trials of mammography screening are “irrelevant” because of changes in the disease and its treatment. She is, of course, entitled to this opinion and she may even be right. However this is an opinion that instantly renders most medical treatments irrelevant, since very few are based on recent trials. And in fact some mammography trials have just closed, or are ongoing, making this a bizarre proposal that reeks of an excuse to ignore their findings. Worse still, if Dr. Warner truly believed that there were no relevant trial data available then she could not logically claim a proven mortality benefit with mammography, something she does repeatedly. One cannot eat the mammography cake and have it too.
Breast cancer is a disease of overwhelming importance. Resources should be devoted to extinguishing it. Reviews like Dr. Warner’s are a travesty precisely because they ensure that the focus of research and innovation will be misplaced. Mammography is a failed intervention; if breast cancer screening is to be fruitful, we must look elsewhere. Moreover, money squandered on mammography shamefully deflects resources from the exciting progress in breast cancer treatment.
I am frustrated. The NEJM review was more than an opportunity to correct the record on breast cancer screening. It was a chance to rejoin science and society at a time when policy and data are increasingly distant, and increasingly in need of each other. Yes, there is controversy. Yes, some will be angry. But that is our own fault. It is time to confront our errors and correct our future, and it should be science, scientists, and peer-reviewed journals that lead the way.
[xii] Echt DS, Liebson PR, Mitchell LB, et al. Mortality and morbidity in patients receiving encainide, flecainide, or placebo. The Cardiac Arrhythmia Suppression Trial. N Engl J Med. 1991;324(12):781–788.
[xiii] The Cardiac Arrhythmia Suppression Trial II Investigators. Effect of the antiarrhythmic agent moricizine on survival after myocardial infarction. N Engl J Med. 1992;327(4):227–233.
[xiv] Lechat P, Packer M, Chalon S, et al. Clinical effects of beta-adrenergic blockade in chronic heart failure: a meta-analysis of double-blind, placebo- controlled, randomized trials. Circulation. 1998;98(12):1184–1191.