One of the received truths from the era of corporate scandals earlier this decade is that corporate governance matters. As a result, a high-profile part of the current assessment of any company is whether or not the company practices “good” governance. Even though the evaluation of any particular company’s governance has an eye-of-the-beholder aspect, several different commercial enterprises have emerged in recent years, each offering to provide their subscribers with objective governance ratings.


In the space of just a few short years, these governance ratings have become ubiquitous. They are now a critical part of company evaluations for investors, regulators, the financial press, and even D&O insurance underwriters. The quick acceptance of these ratings suggests that they meet a widely perceived need. However, their wide acceptance notwithstanding, it is still worth asking what exactly these ratings actually tell us about the companies and future company performance.


In a June 26, 2008 paper entitled “Rating the Ratings: How Good Are Commercial Governance Ratings?” (here), Stanford Law Professors Robert Daines and Ian Gow, and Stanford Business School Professor David Larcker examine four leading ratings firms’ ratings and analyze “the association between these ratings and future firm performance and undesirable outcomes such as accounting restatements and shareholder litigation.”


The authors reach a number of provocative conclusions, including in particular their finding that “the level of predictive validity for these ratings is well below the threshold necessary to support the bold claims made for them” by the commercial ratings firms.


The authors examined the corporate governance ratings produced by Audit Integrity, RiskMetrics (previously Institutional Shareholder Services), Governance Metrics International, and The Corporate Library. The authors compiled ratings for U.S. firms for each of the four ratings services, covering the period from late 2005 to early 2007. The analysis was primarily focused on ratings available as of December 31, 2005, as that was the earliest date at which the authors established “a sizeable cross-section of ratings across the four ratings firms.”


The authors first looked at whether the various ratings were at least consistent with each other. The authors noted that “if, as seems to be often posited, there is an agreed upon definition of ‘good governance’ and each of these commercial measures seeks to measure it, then we would expect these measures to be highly correlated.”


However, the authors found that there is “surprisingly little cross-sectional correlation among the ratings.” Indeed, the ratings are “close to being uncorrelated.”


In particular, the authors found that in certain instances, the various ratings rated specific companies dramatically differently. The authors concluded that “either the ratings are measuring very different corporate governance constructs and/or there is a high degree of measurement error (i.e., scores that are not reliable) in the rating process across the firms.”


With respect to future outcomes, the authors found that three of the ratings have “a very modest ability to predict accounting restatements” and two of the ratings have “a very modest ability to predict class action lawsuits.” The authors further concluded that at least one rating firm’s ratings exhibited “virtually no predictive validity.” Overall, the authors concluded that “the level of predictive validity even for the best ratings is well below the threshold necessary to support the bold claims by the corporate rating firms.”


The authors’ observation about the lack of agreement between the four ratings is, to me at least, unsurprising, as the various ratings clearly aim to measure different things, based on different visions of “good governance.” Even though “good governance” is a widely used term, there is no consensus definition. As the authors themselves note, “defining good governance and distinguishing good governance from bad governance has proved…elusive.”


The authors’ conclusions about the ratings’ relative lack of predictive power undoubtedly will be disputed by the ratings firms themselves. From my perspective, the authors’ overall conclusion about the ratings’ lack of strong predictive power is unsurprising, particularly as it relates to predicting securities class action litigation.


In my prior life running a D&O underwriting facility, my colleagues and I spent a great deal of time and effort attempting to determine what factors might predict securities litigation. We had conjectured early on that corporate governance might afford a useful tool in segmenting litigation risk. Over many years’ time, we concluded that corporate governance alone was not sufficiently predictive of securities litigation risk, and that certain other criteria (including company size, industry, and age) were much more highly correlated with securities litigation risk.


Because of this experience, my colleagues and I were always somewhat skeptical of commercial governance-based securities litigation prediction tools. In my own experience, these tools were at their best when used negatively, that is, when identifying companies to avoid, but they were less helpful when used to determine which risks to accept, which is of course how D&O underwriters earn their keep.


The authors’ conclusions are more or less consistent with my own experience on these points. However, the real value of the authors’ thorough examination is that it will likely start a dialogue on these issues. It may well be that a different analysis or a different approach might support a different conclusion about the predictive power of the ratings.


Indeed, the authors themselves expressly acknowledge that they might not have used the “right model” to measure the ratings, and that “given the right model specification,” the ratings “might well prove to be significant and informative.” The authors state that, to the degree this is true, then the ratings firms should “disclose the ‘right’ model” and disclose “how well their ratings predict future performance using the ‘right’ model.” This disclosure “would enable investors to evaluate the net benefits produced by their purchase of the ratings.”


The authors’ interesting analysis and discussion undoubtedly will provoke debate, particularly in the corporate governance community itself. I would welcome responsible comments from the representatives of the ratings firms who might wish to respond on this blog to the authors’ conclusions about the ratings’ predictive power. (PLEASE see below for responses.)


A June 30, 2008 Stanford Law School press release describing the article can be found here. A June 26, 2008 Fortune report discussing the article can be found here.


Very special thanks to the authors for their permission to quote their article on this blog.

UPDATE: In response to my invitation to the governance rating firms to respond to the authors’ study, Ric Marshall, the Chief Analyst at The Corporate Library, and Kimberly Gladman, the Director of Research and Ratings at The Corporate Library, submitted this response:

Thank you for your invitation to respond to the recent Stanford study regarding the predictive value of governance ratings, including those of The Corporate Library (TCL).
The study found that TCL’s ratings have a statistically significant ability to predict accounting restatements and future operating performance. In addition, at the extremes (very poor or very good ratings), the study found that our ratings correlated with future alpha.  The authors state that these relationships are modest, and suggest that they may be inadequate to make our ratings useful. Our clients, however, who come from a wide range of industries, do find value in our ratings. For example, a portfolio manager who has used our ratings over the past four years in stock selection has told us that doing so has contributed substantially to his returns; our insurance clients regularly and successfully employ our ratings to identify companies that warrant greater due diligence and may present higher risk.
We understand that, as you point out in your recent posting about the study, it is not always effective to use governance indicators alone as a guide to financial or litigation risk.  Indeed, most of our clients combine our ratings with a number of other tools, and our own Securities Litigation Risk Analyst Ratings (which were not examined in the Stanford study) combine governance data with a number of financial and industry-related variables. We also agree with your assessment that governance indicators are often most useful in identifying areas of concern, rather than strengths; this is the essence of our approach. Client feedback has shown that taking governance ratings into account, especially in cases where doing so helps chiefly to avoid problems, brings substantial benefits to their businesses.  The Stanford study suggests that corporations themselves tend to agree: according to interviews the authors conducted with boards of directors, they often use low governance ratings as a red flag indicating that they should step up monitoring.
The authors’ surprise at how “little cross-sectional correlation” they found among these ratings reveals the study’s chief flaw, which is the assumption that, “there is an agreed upon definition of ‘good governance’ and each of these commercial measures seeks to measure it.” This is not the case at all, as each of the four rating services reviewed have a different focus, and are employed in different ways by a wide range of commercial clients. While Riskmetrics and GMI do both measure ‘good governance’ against specific standards, our own focus is on identifying governance weaknesses and thereby companies for clients to avoid. We have always taken issue with the notion that any one governance model can be the most effective for every company.

Special thanks to Ric and Kimberly for taking the time to provide a detailed written response.

FURTHER UPDATE: Jack Zwingli, the CEO of Audit Integrity, has also provided a detailed response to the academic study. Jack’s response can be found here.