Will Goodhart's Law come to USNWR's Hein-based citation metrics?

Economist Charles Goodhart has an old saying attributed to him: “When a measure becomes a target, it ceases to be a good measure.” In the office setting, or in government, or anywhere, really, if you identify a measure as a target, people begin to behave in a way to maximize the value of that measure, and the measure loses its value—because it’s no longer accurately measuring what we hoped it would measure.

The recent announcement from U.S. News and World Report that it would begin to incorporate Hein Online citation data into a rankings formula offers much to discuss. Currently, the “quality” of a law school faculty is roughly measured by a survey of 800 law school faculty, asking them to rank schools on a 1-to-5 scale (which likely results in artificially low and highly compressed rankings).

This isn’t a remarkable proposition. In the first Maclean’s ranking of Canadian law schools in 2007, Professor Brian Leiter helped pioneer a rankings system that included faculty citations. Every few years, a ranking of law school faculty by Professor Greg Sisk, building off Professor Leiter’s method, is released.

The significance, however, is that it is USNWR doing it. The USNWR rankings seem to have outsized influence in affecting law school behavior.

That said, USNWR has announced that, for the moment (i.e., the rankings released in 2019), the ranking will be separate and apart from the existing USNWR ranking. Much like USNWR separately ranks schools by racial diversity, debt loads, or part-time programs, this would be an additional factor. That means it may have much less impact on how schools respond. I imagine prospective applicants will still likely rely primarily on the overall score.

In the future, it may (this is an open question, so let’s not freak out too quickly yet!) be used in the overall rankings. On the whole, it may well be a good change (but I imagine many will disagree even on this hesitant claim!). Rather than the subjective, perhaps too sticky, assessments of faculty voting, this provides an objective (yes, even if imperfect!) metric of faculty productivity and influence. In the event that it is used in the overall score in the future, as one more small component of an overall ranking, it strikes me as appropriate.

There are downsides to any citation metric, of course, so we can begin with that. What to measure, and how, will never be agreed-upon terms. This metric is not exempt from those downsides. To start, USNWR announced that it would use the “previous five years” of citations on Hein. “This includes such measures as mean citations per faculty member, median citations per faculty member and total number of publications.”
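To make the three quoted measures concrete, here is a minimal sketch of how they could be computed from per-faculty records. USNWR has not published its formula, so this is only an illustration: the `FacultyRecord` type, the `summarize` function, and every number in the roster are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass
class FacultyRecord:
    name: str
    citations: int      # citations received in the five-year window (made up)
    publications: int   # pieces published in the same window (made up)


def summarize(records):
    """Compute the three measures named in the announcement for one school."""
    cites = [r.citations for r in records]
    return {
        "mean_citations": mean(cites),
        "median_citations": median(cites),
        "total_publications": sum(r.publications for r in records),
    }


# Hypothetical five-person faculty; none of these figures come from Hein.
roster = [
    FacultyRecord("A", 4, 2),
    FacultyRecord("B", 10, 3),
    FacultyRecord("C", 25, 5),
    FacultyRecord("D", 41, 4),
    FacultyRecord("E", 220, 6),
]

summary = summarize(roster)
print(summary)
```

On this invented roster the mean (60) sits well above the median (25), because a single heavily cited professor pulls the average up.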

To name a few weaknesses, then: Schools with disproportionately more junior faculty will be penalized; so, too, will schools with significant interdisciplinary publications or books that may not appear in Hein. (It’s not clear how co-authored pieces, or faculty with appointments at more than one institution, would be treated.)

But I’m more concerned with this Goodhart principle: once we start measuring it, what impact might it have on faculty behavior? A few come to mind.

First, there is the temptation to recast senior would-be emeritus faculty with sufficient scholarly output as still full-time faculty members, among other ways of trying to recategorize faculty. Perhaps these are marginal changes that all schools will simultaneously engage in, and the results will wash out.

Second, there is a risk of creating inflated or bogus citations. This is a much more realistic risk. Self-citation is one way that scholars might try to overstate their own impact. Men tend to self-cite at disproportionately higher rates than women. Journals that have excessive self-citations are sometimes punished in impact factor rankings. Pressure to place pieces in home institution journals may increase.

Hein currently measures self-citations, which would be a nod in this direction. Some self-citations are assuredly acceptable. But there may be questions about some that are too high. The same might be true if colleagues started to cite one another at unusually high rates, or if they published puff pieces in marginal journals available on Hein with significant citations to themselves and their colleagues.

My hope (ed.: okay, the rosy part begins?) is that law professors will act without regard to the new measure, and that law school administrators seeking to improve their USNWR ranking do not pressure law faculty to change their behavior. And perhaps Hein and USNWR will ensure that their methodology prevents such gaming.

The same holds true for books or interdisciplinary journals that don’t appear on Hein. My hope is that schools continue to value them apart from what value they receive as a component of the rankings. (And it’s worth noting that this scholarship will continue to receive recognition to the extent that the “peer review” voting reflects the output of faculty, output of all types.) (Another aside—perhaps this offers Hein some leverage in seeking to license some interdisciplinary journal content in the years ahead….)

This hope is, perhaps, overly optimistic. But I think the school that starts to pressure its faculty to change the kind of scholarship they are doing would receive significant backlash in the legal community. In contrast, the pressure will probably be greater on those faculty who are currently not producing scholarship or not receiving citations to their work, which is a different kind of pressure.

It will be much easier for schools to “game” the median citations by finding the single faculty member in the middle and trying to climb the ladder that way. Median is probably a better metric, in my view (because the mean can be disproportionately inflated by an outlying faculty member), but it is also more likely to be susceptible to Goodhart’s Law. Mean citations would be a tougher thing to move as dramatically or with such precision.
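A small worked example may help show why the median is the easier target. The sketch below uses an invented nine-person faculty and adds 15 citations to the one professor sitting at the median; the citation counts, the size of the boost, and the helper names are all assumptions made for illustration.

```python
from statistics import mean, median


def add_citations(counts, index, extra):
    """Return a copy of counts with `extra` citations added to one person."""
    updated = list(counts)
    updated[index] += extra
    return updated


def pct_change(before, after):
    """Percentage change from `before` to `after`."""
    return 100 * (after - before) / before


# Hypothetical five-year citation counts, already sorted (made-up numbers).
faculty = [0, 3, 8, 12, 20, 35, 60, 110, 400]

# With an odd-sized, sorted list, the median professor is the middle entry.
middle = len(faculty) // 2
boosted = add_citations(faculty, middle, 15)

median_shift = pct_change(median(faculty), median(boosted))
mean_shift = pct_change(mean(faculty), mean(boosted))

print(f"median: {median(faculty)} -> {median(boosted)} ({median_shift:.1f}% change)")
print(f"mean:   {mean(faculty):.1f} -> {mean(boosted):.1f} ({mean_shift:.1f}% change)")
```

In this made-up case the same 15 citations lift the median from 20 to 35, a 75 percent jump, while the mean rises by only about 2 percent. The gain is capped, though: the median cannot climb past the next professor up the list unless that person's count rises too.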

The news from USNWR also indicates it will measure the total number of publications. It’s an output metric in addition to the “influence” metrics of mean and median citations. That could benefit the more junior cohorts of faculties, which tend to produce at a higher rate as they strive for tenure. (One such illustration is here.)

Finally, this could have an impact on how resources are allocated at institutions. In the past, scholarly output was not a part of the USNWR rankings formula. If it becomes part of the formula in the future, it will become a more valuable asset for institutions to fund and support, and a different way of valuing faculty members.

There are lots of unknowns about this process, and I look forward to seeing the many reactions (sure to come!), in addition to the final formula used and what those rankings look like. And these are all tentative claims that may well overstate or understate certain things—I only observe a few things in a short period of time for now! Time will tell.