Prominent hospital rating systems are highly influential, but they also can generate contradictory results that do not always align with clinicians' own assessments, a group of authors write in NEJM Catalyst, highlighting the shortcomings of several major hospital rating systems.
The Catalyst article was authored by eight health care experts, including six individuals whom the authors describe as "physician scientists with methodological expertise in health care quality measurement." These six served as the group's hospital rating system "evaluators."
The authors note that in recent years, hospital rating systems have grown in both number and influence, but it remains "unclear whether current rating systems are meeting stakeholders' need." Namely, the authors write that different rating systems give conflicting ratings and that there's often a "disconnect" between the institutions that ratings recognize as leaders and those that clinicians recognize as major referral centers.
To provide a clearer picture of hospital rating systems, the authors launched their "rating the raters" initiative. They focused on four major hospital rating systems:
The authors established six major categories to assess each rating system:
For their project, the authors relied on both "objective and subjective criteria" to develop a "point-by-point analysis of the strengths and weaknesses of each rating system." The authors gave each rating system the opportunity to review the fact sheets and provide input as well as correct any errors.
Evaluators were also asked to assign each rating system a letter grade, ranging from A to F, based on the analysis, and those grades were averaged. Evaluators had in-person interviews with "leaders and/or methodologists from each of the rating systems" to clarify any issues and learn more about their systems.
The grades were as follows:
In the NEJM Catalyst article, the authors also provided a summary of common problems among the rating systems. For instance, the authors noted five problems that were present in each of the rating systems examined.
1) Reliance on limited data. According to the authors, most of the rating systems use administrative data collected for billing rather than clinical purposes. Typically, the data are limited to patients 65 and older who are part of the Medicare Fee-for-Service program. These data "lack adequate granularity to produce valid risk adjustment," the authors write.
2) Lack of robust data audits. "[W]hen the rating systems generate their own data through surveys, these data are not always made available publicly for analysis to allow for independent assessment of validity and reliability," the authors write. They also note that many of the rating systems rely on self-reported hospital data that are not subject to audits.
3) Varying methods for compiling and weighting composite measures. The authors note that each rating system used a different method for developing composite measures, which causes hospitals' overall scores or grades to vary greatly. In addition, the authors note "there is often limited rationale for the selection and weighting of different elements in the composite," and in some cases the chosen weights differ from how a stakeholder would view them.
4) Difficulty managing outcomes measurement at small hospitals. The authors note that small hospitals typically have less reliable performance estimates due to their lower volumes. To account for that, the authors write most system methodologies "smooth or shrink rates essentially pushing lower-volume hospitals toward the mean." That makes it extremely hard for small hospitals to be recognized as top or bottom performers, according to the authors.
5) No formal peer review. While each of the examined rating systems used expert panels to some degree, the authors note that these panels typically "provide input intermittently and without detailed methodological review."
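The "smooth or shrink" adjustment described in the fourth problem can be illustrated with a small numerical sketch. This is not the method of any rating system discussed in the article. It is a generic shrinkage estimate, and the function name, the `prior_strength` value, the overall rate, and the hospital counts are all hypothetical, chosen only to show how a low-volume hospital gets pulled toward the mean.

```python
def shrink_rate(events, cases, overall_rate, prior_strength):
    """Blend a hospital's observed rate with the overall rate.

    prior_strength is a hypothetical tuning value: the number of
    "pseudo-cases" at the overall rate added to each hospital's data.
    The fewer real cases a hospital has, the more the pseudo-cases dominate.
    """
    return (events + prior_strength * overall_rate) / (cases + prior_strength)


# Hypothetical inputs, for illustration only.
OVERALL_RATE = 0.05      # assumed average complication rate across hospitals
PRIOR_STRENGTH = 100     # assumed weight given to the overall rate

# (events, cases) for two made-up hospitals.
small = (0, 20)          # observed rate 0.0%, but only 20 cases
large = (40, 2000)       # observed rate 2.0%, with 2,000 cases

small_adjusted = shrink_rate(*small, OVERALL_RATE, PRIOR_STRENGTH)
large_adjusted = shrink_rate(*large, OVERALL_RATE, PRIOR_STRENGTH)

print(f"small hospital: observed {small[0] / small[1]:.3f}, adjusted {small_adjusted:.3f}")
print(f"large hospital: observed {large[0] / large[1]:.3f}, adjusted {large_adjusted:.3f}")
```

In this made-up example the small hospital has the better observed rate, yet after adjustment it sits closer to the overall average than the large hospital does, which is consistent with the authors' point that small hospitals struggle to be recognized as top or bottom performers.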
The authors also discussed potential financial conflicts, such as hospitals paying rating systems to display their performance. The authors write that such practices could "create unfortunate incentives" and raise the "concern that the business of selling these ratings leads to a model that encourages multiple rating systems to intentionally identify different 'best hospitals.'"
The authors identified four ways the rating systems could improve:
Ben Harder, chief of health analysis at U.S. News, said that looking at rating systems is important. "The systems themselves deserve to have their tires kicked and people really scrutinizing them," he said.
However, some raters felt the authors' methods were flawed. Leah Binder, group president and CEO for Leapfrog, said, "Th[is] piece conflates two of Leapfrog's programs in a way that vastly misrepresents both, and makes demonstrably false statements about the intensive audit process Leapfrog conducts for over 2000 hospitals every year."
Similarly, Mallorie Hatch, director of data science for Healthgrades, said the authors "misrepresented" the methodology for Healthgrades' overall hospital award. Hatch also said that Healthgrades' "feedback was not incorporated" in the article (Bilimoria et al., NEJM Catalyst, 8/14; Goldberg, Crain's Chicago Business, 8/14).