Aggregate Confusion: The Divergence of ESG Ratings

Florian Berg, Julian F Kölbel, Roberto Rigobon
Review of Finance, Volume 26, Issue 6, November 2022, Pages 1315–1344, https://doi.org/10.1093/rof/rfac033

ESG (environmental, social, and governance) rating agencies have become influential institutions in today’s financial markets, affecting a wide range of important decisions. Yet, ESG ratings from different providers diverge, introducing uncertainty into those decisions and creating a challenge for decision-makers.

This paper investigates why ESG ratings diverge. The analysis is based on data from six prominent ESG rating agencies:

  • Sustainalytics
  • Moody’s ESG (formerly known as Vigeo-Eiris)
  • S&P Global (formerly known as RobecoSAM)
  • Refinitiv (formerly known as Asset4)
  • MSCI
  • KLD (discontinued 2017)

We document pairwise correlations between these ESG ratings that range from 38% to 71%. The divergence is illustrated in Figure 1 (shown below): while ESG ratings rarely provide diametrically opposing assessments, the dispersion makes it difficult to distinguish firms that are ESG leaders from average performers.
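As a rough illustration of what a pairwise correlation between raters means, the sketch below computes the Pearson correlation for every pair of raters over the same set of firms. The rater names and scores are invented for the example and are not taken from the paper's data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(ratings):
    """Correlation for every unordered pair of raters."""
    return {
        (a, b): pearson(ratings[a], ratings[b])
        for a, b in combinations(sorted(ratings), 2)
    }


# Hypothetical scores for the same five firms from three made-up raters.
ratings = {
    "rater_a": [72, 55, 63, 40, 81],
    "rater_b": [65, 60, 50, 45, 70],
    "rater_c": [58, 35, 70, 52, 66],
}

correlations = pairwise_correlations(ratings)
for pair, rho in correlations.items():
    print(pair, round(rho, 2))

# The lowest and highest pairwise values give the reported range.
print("range:", round(min(correlations.values()), 2),
      "to", round(max(correlations.values()), 2))
```

With six raters, as in the study, the same function would return 15 pairs.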

To explain the reasons for divergence, we use a dataset that consists of the aggregate ratings along with the full set of underlying indicators. We categorize all indicators into a common taxonomy of 64 attributes and re-estimate the original ratings based on this taxonomy. This allows us to compare different rating methodologies in one coherent framework.
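The idea of mapping rater-specific indicators onto a common taxonomy can be sketched as follows. The indicator names, attribute names, and the choice to average indicators that fall under the same attribute are assumptions made for this illustration. The summary above does not spell out these details.

```python
from collections import defaultdict

# Hypothetical mapping from one rater's indicators to common attributes.
indicator_to_attribute = {
    "co2_emissions_score": "GHG Emissions",
    "emission_reduction_policy": "GHG Emissions",
    "employee_turnover": "Employee Development",
    "board_independence": "Board",
}


def attribute_scores(indicator_values, mapping):
    """Collapse a firm's indicator values into attribute-level scores.

    Indicators mapped to the same attribute are averaged (an assumption
    of this sketch); indicators absent from the mapping are ignored.
    """
    grouped = defaultdict(list)
    for indicator, value in indicator_values.items():
        attribute = mapping.get(indicator)
        if attribute is not None:
            grouped[attribute].append(value)
    return {attr: sum(vals) / len(vals) for attr, vals in grouped.items()}


# Hypothetical indicator values for one firm from one rater.
firm_indicators = {
    "co2_emissions_score": 0.6,
    "emission_reduction_policy": 0.8,
    "employee_turnover": 0.4,
    "board_independence": 0.9,
    "unmapped_indicator": 0.1,
}

scores = attribute_scores(firm_indicators, indicator_to_attribute)
print(scores)
```

Once every rater's indicators are expressed in the same attributes, their scores for a given attribute can be compared directly.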

We decompose the divergence into three components: scope, measurement, and weights. Scope divergence means that ratings are based on different sets of attributes. Measurement divergence means that rating agencies measure the same attribute using different indicators. Weights divergence means that rating agencies assign different weights to attributes when aggregating them to a single rating. We find that measurement is responsible for 56% of the overall divergence, scope for 38%, and weights for 6%.
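To make the three components concrete, the sketch below splits the difference between two raters' scores for a single firm into scope, measurement, and weights terms. It is a simplified arithmetic identity that assumes each rating is a weighted sum of attribute scores. It is not the estimation procedure behind the percentages reported above, and all attribute names, scores, and weights are hypothetical.

```python
def rating(scores, weights):
    """A rating as a weighted sum of attribute scores (assumed form)."""
    return sum(weights[j] * scores[j] for j in scores)


def decompose(scores_a, weights_a, scores_b, weights_b):
    """Split rating(A) - rating(B) into scope, measurement, and weights."""
    common = scores_a.keys() & scores_b.keys()
    only_a = scores_a.keys() - common
    only_b = scores_b.keys() - common

    # Scope: attributes that only one of the two raters covers.
    scope = (sum(weights_a[j] * scores_a[j] for j in only_a)
             - sum(weights_b[j] * scores_b[j] for j in only_b))

    # Measurement: different scores on shared attributes,
    # valued at the average of the two raters' weights.
    measurement = sum(
        0.5 * (weights_a[j] + weights_b[j]) * (scores_a[j] - scores_b[j])
        for j in common
    )

    # Weights: different weights on shared attributes,
    # applied to the average of the two raters' scores.
    weights = sum(
        (weights_a[j] - weights_b[j]) * 0.5 * (scores_a[j] + scores_b[j])
        for j in common
    )
    return {"scope": scope, "measurement": measurement, "weights": weights}


# Hypothetical attribute scores and weights for one firm.
scores_a = {"GHG Emissions": 0.7, "Board": 0.9, "Lobbying": 0.2}
weights_a = {"GHG Emissions": 0.5, "Board": 0.3, "Lobbying": 0.2}
scores_b = {"GHG Emissions": 0.4, "Board": 0.8, "Water": 0.6}
weights_b = {"GHG Emissions": 0.4, "Board": 0.4, "Water": 0.2}

parts = decompose(scores_a, weights_a, scores_b, weights_b)
gap = rating(scores_a, weights_a) - rating(scores_b, weights_b)
print(parts)
print("sum of parts:", sum(parts.values()), "rating gap:", gap)
```

The three terms add up exactly to the rating gap because, for each shared attribute, the difference of the two weighted scores equals the average weight times the score difference plus the weight difference times the average score.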

We also detect a rater effect, where a rater’s overall view of a firm influences the measurement of specific categories. This suggests that measurement divergence is not merely random noise but also has structural causes, which calls for further research into why this is the case.

ESG rating divergence does not imply that measuring ESG performance is a futile exercise. However, it highlights that measuring ESG performance is challenging, that attention to the underlying data is essential, and that the use of ESG ratings and metrics must be carefully considered for each application. Investors can use our methodology to reconcile diverging ratings and focus their research on those categories where ratings disagree. For regulators, our study points to the potential benefits of harmonizing ESG disclosure and establishing a taxonomy of ESG categories. Harmonizing ESG disclosure would help to provide a foundation of reliable data. A common taxonomy of ESG categories would make it easier to contrast and compare ratings.

Figure 1
