Evidence-based medicine requires use of the best research studies in taking care of patients. Systematic reviews are designed to combine multiple research studies to produce answers to important questions and therefore can be important to the practice of evidence-based medicine. Everyone loves systematic reviews: physicians who need answers and may not have the time to read individual studies, and also patients, who are increasingly well educated and review the medical information they can find online.
Medical journals also like publishing systematic reviews, as these are often cited by other research studies and raise the prestige of a journal by boosting what is called the impact factor. There has been a surge in the publication of systematic reviews in all fields, including otolaryngology – head and neck surgery and my specialty of sleep surgery. Although the public thinks everything published in a medical journal is of high quality, the reality is that this is not always the case. I personally have been uncomfortable with the systematic reviews in our field that claim to provide definitive answers for questions that I felt were not clearly answered by the available studies. I explained my concerns to my good friend Martin Burton, the Director of Cochrane UK and one of the world’s foremost experts in systematic review, and he encouraged me to take a closer look for myself.
AMSTAR 2 Criteria for Systematic Reviews
Experts in systematic review have developed methods for assessing whether a systematic review was performed properly, initially with the AMSTAR (A MeaSurement Tool to Assess Systematic Reviews) in 2007, which was revised and released as the AMSTAR 2 in 2017. These experts identified 16 research methods that should be part of a high-quality systematic review (and of any accompanying meta-analysis, or analysis of data from the multiple individual studies). The goal of the AMSTAR 2 is to determine whether readers should have confidence in the findings of a systematic review, as not all published systematic reviews are of high quality.
Applying the AMSTAR 2 Criteria
Together with colleagues (including a resident and medical students) at the Keck School of Medicine of USC, I found 499 papers that described themselves as systematic reviews, published from 2012 to 2017 in the 10 otolaryngology journals with the highest impact factor. We used the AMSTAR 2 criteria to assess their quality and had some important findings that were recently published in the medical journal Otolaryngology – Head and Neck Surgery. Overall, it was surprising how large the gap was between accepted methods and reality in published systematic reviews in otolaryngology and sleep surgery.
- Many studies stated that they were systematic reviews but were not actually systematic reviews.
- There were 236 actual systematic reviews that evaluated interventions (medications or surgery), and 19 of these were in sleep surgery.
- 99% of the systematic reviews (and 100% of those in sleep surgery) provided critically low confidence in the results, based on their methods.
- Giving all studies credit for following two of the critical methods (even though most did not indicate this in the publication), 18/19 systematic reviews in sleep surgery still provided low or critically low confidence in the results.
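For readers curious how a rating such as "critically low" comes about, here is a minimal sketch of the overall confidence rule as I understand it from the published AMSTAR 2 guidance: some of the 16 methods are treated as critical, and the rating depends on how many critical and non-critical methods a review fails to follow. The function name and the idea of passing in simple counts are my own illustrative assumptions; the AMSTAR 2 itself is a checklist, not software.

```python
def amstar2_confidence(critical_flaws: int, noncritical_weaknesses: int) -> str:
    """Return an overall confidence rating from counts of unmet methods.

    Hypothetical helper for illustration only. The thresholds follow the
    rating scheme described in the AMSTAR 2 guidance as I understand it.
    """
    if critical_flaws < 0 or noncritical_weaknesses < 0:
        raise ValueError("counts cannot be negative")
    if critical_flaws > 1:
        # More than one critical method not followed
        return "critically low"
    if critical_flaws == 1:
        # Exactly one critical method not followed
        return "low"
    if noncritical_weaknesses > 1:
        # No critical flaws, but several non-critical weaknesses
        return "moderate"
    # No critical flaws and at most one non-critical weakness
    return "high"


# Made-up counts, not data from our study:
print(amstar2_confidence(critical_flaws=3, noncritical_weaknesses=4))
# Giving the same review credit for two critical methods:
print(amstar2_confidence(critical_flaws=1, noncritical_weaknesses=4))
```

With these made-up counts, the first call prints "critically low" and the second prints "low", which shows why giving reviews credit for two critical methods can move a rating up one level without making it good.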
Why Is This Happening?
I firmly believe there are a few reasons that explain the disappointing quality of systematic reviews in otolaryngology and in sleep surgery, in particular. First, medical journals have not demanded that researchers adhere to these methodologies prior to publication (for example, in instructions for authors), as there is a tremendous desire just to publish them. Second, the peer review process requires that colleagues have the necessary expertise to evaluate a manuscript properly. The problem is that systematic review is an entirely different research field, requiring a unique skill set that is not often found in otolaryngology or sleep surgery (or in many other fields). In short, the desire to perform and publish systematic reviews has outpaced the training and foundation needed to do them following accepted methods.
Third, there are many interesting research questions but not enough high-quality individual studies to answer these questions in a systematic review properly. The result is that someone develops a question and then wants to complete a systematic review that probably should not have been performed at all. The authors can have good intentions and ideas (a good question to answer) and try their best, but ultimately the review should not be performed if the individual studies do not support doing a high-quality systematic review. This issue is actually not even captured by the AMSTAR 2 criteria but can be another major weakness of a systematic review. I think this occurs often in sleep surgery, as there are not enough high-quality studies in sleep surgery for many of the specific questions that exist. In published sleep surgery systematic reviews, I often see mistakes in the research methods (not following the AMSTAR 2 criteria) and, in addition, the desire to combine studies that should not be combined because they are so different (often evaluating completely different interventions). Combining different types of studies just so the systematic review will look like there are more studies addressing the question is not good science.
Where Do We Go From Here?
Medical journals need to stop publishing systematic reviews until we have enough high-quality studies to answer a question properly and can use the right research methods, with enough people to perform peer review adequately. This may mean that none are published in sleep surgery for the next 5-10 years, but we need to be willing to accept this. Taking shortcuts is a disservice to everyone involved, leaving us with results in which we can have no confidence. In short, good intentions are not good enough when it comes to publishing these reviews that can have a major impact on the care of patients.