Commentary
Bias in the Design, Interpretation, and Publication of Industry-Sponsored Clinical Research
By Dale Hammerschmidt, M.D.
Physicians have an obligation to be skeptical consumers of research, especially as they make care decisions based on the findings.
During the past year, articles in the lay press have called attention to problems in the published evidence that physicians use to make clinical care decisions. Industry-sponsored research has come under particular scrutiny because of concern that the profit motive may lead to distortions in the evidence gathered, in the choice of evidence to publish, and in the interpretations of evidence that are offered. That concern is legitimate, and one can easily construct worst-case scenarios in which misleading evidence is deliberately promulgated to the harm of patients. But I also think that the profit motive is neither automatically iniquitous nor the only source of bias and distortion in research.
There are two basic dangers in interpreting industry-sponsored research on health care: first, failing to recognize the inherent biases in such studies and, second, discounting the value of such research because of concern about those biases. Industry-sponsored research, even when it is well designed and rigorously carried out, often has as its focus the demonstration of product licensability, the establishment of one or more marketing indications, or the identification of a market niche in which the product may outperform its competitors. This has implications for study design, subject selection, publication strategy, and post-publication dissemination of results. The problem is that there is no “bright line” between a deceptive research strategy and one that is simply likely to demonstrate a product’s advantage.
Another danger is making the facile assumption that the biases are unique to industry-sponsored research. The incentive structure in academic research may be different, but it can lead to study designs that ask a specific question particularly well while sacrificing the study’s ability to generate information that can be more confidently generalized. The problem is that if information resulting from these studies is perceived as broadly applicable, when in fact it is not, there is a risk that it will be too broadly used. For example, a drug may be thought to be ideal for an elderly patient with many comorbidities when, in fact, it has only been tested on younger patients with a single medical problem.
In October of 2003, a lead article in the New England Journal of Medicine reported a comparison of a new heparin-like drug and old-fashioned unfractionated heparin for the treatment of uncomplicated deep venous thrombosis. The study was designed as a noninferiority trial, seeking to show that the new drug was at least as good as the established drug. Noninferiority was its conclusion: Within the limits set by the sponsor and the investigators, the new drug was not found to be any worse than plain old heparin. The new drug had some advantages: it was much easier to give in an outpatient setting, so “noninferiority” could easily translate to “very useful stuff.” Upon reflection, a “yeah-but” or two arose. By the time this paper appeared, there was already quite a bit of evidence that low-molecular-weight heparins were as good as ordinary heparin in this setting and could be given without admitting the patient to the hospital; this had become the standard in our hospital and others several years earlier. So one could restate the conclusion of the paper as: “The new drug is not demonstrably worse than what we used to do five or six years ago.” It also makes one ask how the new drug would compare with our current practice.
This study is a favorite example not because it is egregious but specifically because it is not. The study was scientifically sound, and it appears to have been carried out rigorously and in an ethically appropriate way. It’s just that it asked the question of greatest value to the manufacturer, rather than the question of greatest value to the clinician. Clinicians need to know how a new drug fits into the larger picture: Who should receive it? When, if ever, will it be better than other similar drugs on the market? Are there circumstances in which it will be inferior to the drugs we already have? Without answers to those questions, the drug may not be used wisely and patient care may suffer.
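For readers who want to see the arithmetic behind a noninferiority conclusion, the following is a minimal sketch, not the analysis used in the trial discussed above. It assumes the outcome is an event rate (for example, recurrent clots), uses a simple normal-approximation confidence interval, and declares noninferiority when the upper end of that interval stays below a prespecified margin. The function names, the counts, and both margins are hypothetical and chosen only for illustration.

```python
import math

def risk_difference_ci(events_new, n_new, events_std, n_std, z=1.96):
    """Return (difference, lower, upper) for the event-rate difference new - standard.

    Uses a normal-approximation (Wald) interval; z=1.96 gives roughly 95% coverage.
    """
    p_new = events_new / n_new
    p_std = events_std / n_std
    diff = p_new - p_std
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
    return diff, diff - z * se, diff + z * se

def is_noninferior(upper_bound, margin):
    """Noninferior if even the pessimistic end of the interval stays inside the margin."""
    return upper_bound < margin

# Hypothetical counts, NOT the figures from the trial discussed in the text.
diff, lower, upper = risk_difference_ci(
    events_new=40, n_new=1000, events_std=45, n_std=1000
)

print(f"difference = {diff:+.4f}, 95% CI = ({lower:+.4f}, {upper:+.4f})")
print("noninferior at margin 0.030:", is_noninferior(upper, 0.030))
print("noninferior at margin 0.010:", is_noninferior(upper, 0.010))
print("shown to be better than standard:", upper < 0)
```

With these made-up numbers the same data pass a generous margin and fail a strict one, and in neither case is the new drug shown to be better. The conclusion therefore depends on the limits chosen in advance by the sponsor and investigators, and on which comparator the new drug is measured against.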
The Power of Positive Results
Another type of problem in both industry-sponsored and academic research is the selective publication of data that are useful to the person or corporation publishing them. If only the favorable data see the light of day, then the efficacy of a new therapy will be systematically overestimated; a more balanced (and often less optimistic) view may take some time to emerge. This is hardly a new problem and is certainly not limited to industry-sponsored research. Sir William Osler made the wry observation that “One should treat as many patients as possible with a new drug while it still has the power to heal.”1
This tendency to publish only positive results has several origins, only one of which is the narrow self-interest of the sponsor. A positive result is inherently more interesting than a negative one, and it may more readily be demonstrated to reasonable certainty. When the results are encouraging, it is easier to find the enthusiasm to prepare a manuscript (and to do it well), and it is easier to convince a journal editor to accept such a manuscript for publication. I spent 15 years as a journal editor and share in communal guilt for fostering this bias. Even if such a bias is not sinister, it may have a serious adverse impact on clinical practice and even on the development of guidelines for care. As we try to base clinical practice more and more on evidence rather than opinion or anecdote, flaws and biases in that evidence grow in their ability to work mischief.
A particularly instructive example of how this plays out is a study published early this year by Turner et al.2 Responding to concerns about the efficacy and risk of antidepressant medications, these authors obtained study reviews from the U.S. Food and Drug Administration and compared them with the corresponding articles published in medical journals. They found that only about two-thirds of the 74 studies reported to the FDA had been published. Of those that had been published in medical journals, all but one showed benefit from the drugs. Not surprisingly, meta-analyses based on the published data made the individual drugs look quite a bit better than they looked in meta-analyses using the entire FDA datasets. Almost simultaneously, Barbui and colleagues analyzed paroxetine trials for major depression, specifically trying to include as many unpublished trials as possible.3 They found that although paroxetine was associated with better control of depressive symptoms than placebo, it was also associated with more frequent discontinuation for side effects and a higher incidence of suicidal thought. If they took staying on the drug as an endpoint (reflecting efficacy and tolerability), it was hard to show the benefit of paroxetine.
In a newly released study, van Luijn and colleagues found that most of the comparative clinical trials of new medications are published only after the drugs are marketed and that many of the late-appearing publications contain information important to assessing the risks and benefits of the agents.4
Meta-analysis, in which the results of multiple trials are combined, is a powerful tool for making sense out of disparate results. In the simplest example, several nominally negative trials may be combined to increase statistical power and thereby determine whether a trend they share is meaningful. But as Noble has pointed out, the utility of meta-analysis depends not only on the ability to compare trials but also on their completeness.5 Meta-analyses that depend on only favorable subsets of data can lead to a false impression that high-quality evidence exists, when in fact it does not. If a meta-analysis is performed using only published data, and if that analysis is used to set policy or practice guidelines, the result may reflect publication bias more than it reflects wisdom.
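Both points can be illustrated with a small worked example. The sketch below assumes the simplest pooling method, a fixed-effect inverse-variance average, which is one common approach and not necessarily the method used in the studies cited here. The five trials, their effect sizes, and their standard errors are entirely hypothetical; positive effects favor the drug, and a z-score above 1.96 is treated as conventionally significant.

```python
import math

def pool_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling; returns (estimate, standard error)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)

def summarize(trials):
    """Pool a list of (effect, standard error) pairs; returns (estimate, se, z)."""
    effects = [effect for effect, _ in trials]
    std_errors = [se for _, se in trials]
    estimate, se = pool_fixed_effect(effects, std_errors)
    return estimate, se, estimate / se

# Hypothetical (effect, standard error) pairs, invented for illustration only.
all_trials = [(0.30, 0.20), (0.25, 0.18), (0.35, 0.22), (0.02, 0.20), (-0.05, 0.19)]

# Suppose only the three encouraging trials reach print.
favorable_trials = all_trials[:3]

# Each trial on its own: none reaches z > 1.96.
individual_z = [effect / se for effect, se in all_trials]

est_fav, se_fav, z_fav = summarize(favorable_trials)
est_all, se_all, z_all = summarize(all_trials)

print("individual z-scores:", [round(z, 2) for z in individual_z])
print(f"favorable subset: estimate = {est_fav:.3f}, z = {z_fav:.2f}")
print(f"all five trials:  estimate = {est_all:.3f}, z = {z_all:.2f}")
```

In this invented dataset, pooling the three encouraging trials turns three individually inconclusive results into a conventionally significant one, which is the legitimate strength of the method. Adding back the two unpublished trials shrinks the estimate and removes the significance, which is the completeness problem in miniature.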
Weighing the Evidence
Another bias deserves mention but needs little exposition. When testing a new therapy, researchers deliberately choose as subjects those patients who have the best chance of responding and who have the fewest confounding factors. Quite simply, the question “Does it work at all?” comes before the question “How well does it work in typical settings?” Even without publication bias per se, early results may be misleadingly optimistic.
The last two decades have seen the rise of evidence-based medicine, which calls on physicians to examine the process by which clinical decisions are made, to be aware of the evidence that influences the clinical decisions they make, and to use the available evidence to make the decisions that are best for their patients. Evidence-based medicine can only work if the available evidence is of high quality and is applicable to the clinical decision at hand. If flawed evidence is used to generate practice guidelines, considerable harm may result.
This certainly raises ethical concerns. But I would suggest that there is more to this than the claim raised in news reports, namely that industry sponsors taint research results simply by designing a study to demonstrate a postulated strength of a product. Instead, the ethical weight begins to accrue when the results are used in a misleading way. Additionally, I would ask whether those who use published data to make care decisions and to prepare clinical guidelines do not have an ethical obligation to do so cautiously and wisely.
How might that obligation be discharged? There are, I believe, two basic defenses against the biases I have mentioned. The first is enthusiastically practicing the fine art of skepticism. We all need to recognize when a bias is likely to exist, and we need to refrain from embracing an outcome for what we would like it to mean. The second is having broad access to data in their raw and unvarnished form, whether or not they have been published. Many journals are now requiring that clinical trials be registered and that their data be available for alternative analyses; the same sort of transparency for unpublished industry-sponsored trials would be a step forward.
Finally, I think it is often good to resist the temptation to be judgmental. The incentive structure in which all types of research are proposed, carried out, and promulgated has plenty of power to distort, without the need to impute unsavory motives.
Dale Hammerschmidt is a hematologist/oncologist at the University of Minnesota, who serves as the department of medicine’s due diligence officer for human subjects research. For 15 years he was an editor of the Journal of Laboratory and Clinical Medicine.
References
1. Alpert JS. The triumph of hope over experience. Am J Med. 2006;119(12):1003-1004.
2. Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. Selective publication of antidepressant trials and its influence on apparent efficacy. N Engl J Med. 2008;358(3):252-260.
3. Barbui C, Furukawa TA, Cipriani A. Effectiveness of paroxetine in the treatment of acute major depression in adults: a systematic re-examination of published and unpublished data from randomized trials. CMAJ. 2008;178(3):296-305.
4. van Luijn JC, Stolk P, Gribnau FW, Leufkens HG. Gap in publication of comparative information on new medicines. Br J Clin Pharmacol. 2008; Feb 21 [Epub ahead of print].
5. Noble JH Jr. Meta-analysis: methods, strengths, weaknesses, and political uses. J Lab Clin Med. 2006;147(1):7-20.