Life in America these days has become a vast numbers racket. That is, most Americans are, cannily or not, ensnared in the numbers game called metrics, or what Jerry Muller in his latest book terms the “metrics fixation.” This fixation is founded on the assumption that “If you cannot measure it, you cannot improve it.” In virtually every sector of American life, the tyranny of numbers, of measurement, has become our daily cross. It is an affliction, a merciless goad, and a godawful bore. In the language of its proponents (and they are legion), metrics is “measured performance” in the name of “accountability” and “transparency.” Sound familiar? To be fair, Muller’s compact study is not a screed against all metrics. He concedes that, properly administered, metrics can beget desirable results. Yet his primary concern is to demonstrate that the negative results, more often than not, outweigh the positive ones. In education, in corporate America, in our medical institutions, in policing and the military, in government and foreign policy—just to single out the most prominent areas of abuse—metrics tends to measure what is most easily measurable, but what is often of lesser importance, and to oversimplify what is complex; it tends to measure “inputs” rather than “outcomes,” process rather than product; by standardization, it degrades the quality of the information it generates. Even worse, it promotes “gaming” of the system, or outright cheating.
Shall we begin with the No Child Left Behind (NCLB) fiasco? Proposed by George W. Bush in 2001 and signed into law in January 2002, NCLB was from the outset a “standards-based” reform which established “measurable goals” as the key to promoting improvements in individual student performance. More specifically, it was intended, as its full title indicated, “to close the achievement gap with accountability, flexibility and choice . . . ” Under the Act any public school receiving federal funding was required to administer annual testing to ensure rising levels of proficiency in math, reading, and science between grades three and eight. While each state was authorized to establish its own testing standards and methods of measurement, federal oversight mandated a variety of penalties for schools that failed to demonstrate yearly improvement. Special emphasis was given to closing the long-standing achievement gap between blacks and Hispanics on the one hand, and whites (and Asians) on the other. After 14 years of implementation only slight average improvements in test scores were achieved, and even those were no higher than the rates of improvement documented in the decade preceding implementation. Even worse, as many critics noted, the punitive nature of the testing regime (which threatened underperforming teachers and principals with diminished salaries or even dismissal) resulted all too often in the phenomenon known as “teaching to the test,” meaning that teachers naturally tended to focus classroom time narrowly on the skills and knowledge most demanded by the tests, while neglecting other important (but not so easily measurable) skills. After a few years, not surprisingly, many states began to dumb down achievement goals to accommodate struggling districts whose test scores were lowering state averages. Scores for minority students on average demonstrated no appreciable improvement after eight years of testing.
As Muller wryly notes, “[I]n the absence of discernable progress in results, the [amount of] resources devoted to ongoing measurement becomes itself a sign of moral earnestness.”
At the college and university level, the fixation with metrics has created a somewhat different set of problems, but measurement and accountability are still the operative catchwords. Himself a university professor, Muller is acutely aware of the dilemma faced by higher education. As ever larger numbers of applicants clamor for admission, driven by dreams of upward mobility and a democratic ethos that demands equality of opportunity for every citizen, our colleges scramble to accommodate students who are ill-prepared to meet the traditional demands of postsecondary pedagogy. Admissions standards are quietly lowered (or “tweaked”) in the competition for head counts, and remedial courses are offered that pretend to prepare students who, in many cases, have graduated high school with what, just a generation ago, would have been considered ninth- or tenth-grade levels of expertise in math and English. Yet all of these colleges, public and private, depend in part for their funding on ranking systems, often by states which mandate “outcome-based” remuneration. In plain English, what this means is that “success rates” must be measured in the only way that bureaucrats and politicians understand: quantitatively. If 30 students sign up for a course and only 17 receive a passing grade, then the success rate is roughly 57 percent, which is abysmal, and the blame falls squarely on the shoulders of the instructor. No consideration is given to the level of preparation of those who failed (or any other contributing factor, since none of these is easily measurable). Faculty members, a whopping number of whom these days are adjunct instructors with little or no job protection, can easily see the writing on the proverbial wall: Pass more students or lose your position (or remain stranded at your present pay grade). Of course, this is just the tip of the iceberg. Ill-conceived metrics are pervasive in academe: in the accreditation process, in tenure evaluation, and more.
Perhaps more notably, if only because so many millions of lives are at stake, the ill-conceived use of metrics is even more alarming in the medical field. Most egregious has been the introduction, since the 1990s, of “pay-for-performance” (P4P) metrics and the widespread use of physician “report cards.” As the cost of medical care and health insurance in the U.S. has risen precipitously, medical providers have responded with a slew of metrics that, as proponents argue, will allow patients to become knowledgeable consumers. Accountability and transparency will ensure that these ailing consumers will find their way to the specialists with the “highest performance scores,” and insurers will “flock to hospitals and providers who supply the best care at the lowest price.” Given that healthcare has become a highly competitive business, especially with the emergence of enormous conglomerates like the Hospital Corporation of America, cutting costs and driving up those performance scores has become a pressing matter. Yet, as Muller shows, studies of schemes like P4P and medical report cards (studies that tout their positive impact) are often seriously flawed. They have tended to evaluate “process and intermediate outcomes rather than final outcomes.” A report by the RAND Corporation, which surveyed scientific studies of such “brand management” schemes, found that the studies with the most rigorous methodologies revealed little or no improvement in final outcomes (patient recovery rates) where P4P and physician rankings were factored in. One problem with P4P is that, just as in academe, it rewards only what is measurable and shifts the doctor’s gaze toward “what can be measured rather than what is important.” P4P also promotes risk aversion. If, for example, a surgeon’s report-card ranking depends on declining mortality rates among the patients he treats, he will be more reluctant to risk operating on patients whose probability of death is higher.
Some states, like New York, issue public reports on post-operative mortality rates (in coronary bypass cases, among others), making transparent the percentage of patients who survived for more than 30 days after surgery. In these instances, reporting does indeed produce declining rates of mortality, not because care has improved but because surgeons simply decline to operate on patients whose likelihood of survival for 30 days or more is low. Alternatively, physicians may use aggressive and costly procedures to keep post-op patients alive just long enough to meet the 30-day requirement. This sort of gaming of the system also occurs in response to reporting on readmission rates. For example, Medicare requires acute-care hospitals to report rates of readmission for patients with life-threatening conditions (heart disease, COPD, strokes, etc.). The assumption is that readmission rates reflect quality of care, an assumption that may be simplistic. In 2012 Medicare went beyond reporting and began to impose penalties for higher readmission rates. In some cases hospitals responded with genuinely improved care, but in many instances Medicare’s punitive measures have provided an incentive for deception. For example, patients are placed on “observation status” rather than formally readmitted; or they are billed as “out-patients” or treated in the E.R. As Muller makes clear, such manipulation of the system is extensive.
In the world of business and industry, one might assume, pay-for-performance metrics should find a more appropriate arena. After all, this is the sector where the mania for metrics first gained traction. Muller grants some truth to this assumption but with a caveat (here quoting a study of workers and job satisfaction): “Extrinsic rewards become an important determinant of job satisfaction only among workers for whom intrinsic rewards are relatively unavailable.” If the work is repetitive, if it involves little or no creativity or initiative, if it requires a minimum of individual effort, or if it involves sales, then compensation (extrinsic rewards) tied to measured performance can be an effective way to boost productivity. The problem is that, especially in recent decades, the jobs most Americans do are not so standardized; they consist of a variety of activities that can’t be easily reduced to metrics. Moreover, if metrics are used by management to rank employees on a comparative scale, the long-term effect may be to rob employees of much of their intrinsic motivation. One 2006 study that surveyed some 200 human-resource managers found that employee rankings (especially when tied to compensation) led to “lower productivity, inequity, skepticism, decreased employee engagement, reduced collaboration, damage to morale, and mistrust in leadership.” That metrics of this type are often based on information gathered through various forms of managerial surveillance is especially troubling, as anyone who has worked in corporate America for any length of time is well aware. Metrics-driven tunnel vision in the business world often produces unfortunate but unintended consequences. In some cases, however, the question of intentionality can be ambiguous, especially in the financial realm. As has been well publicized, in 2011 the banking giant Wells Fargo set up pay-for-performance quotas encouraging middle-level employees to sign up customers for additional services.
In fact, the quotas were unrealistically high, and branch bankers by the thousands resorted to “low-level fraud,” enrolling customers without their knowledge in online or debit card accounts. The rest is recent history: Wells Fargo fired over 5,000 employees and was massively fined by the Consumer Financial Protection Bureau. Muller suggests that such “malfeasance” was not the intention of Wells Fargo management, but is that so certain? Could they really have been so naive as not to see that their “measured performance” scheme would trigger some employees to fudge the distinction between aggressive sales and outright fraud?
Beyond the realms of education, healthcare, and business, the metrics bandwagon is crowded with boosters in policing, in the military, in government bureaucracy at every level, in nonprofits, and even in foreign-aid programs. While there is insufficient space to cover all of these here, some attention must be given to Muller’s brief but incisive account of the historical origins of the metrics fixation. While there were precursors across the Atlantic, in the U.S. the first major step toward measured performance in the workplace was taken by Frederick Winslow Taylor, who introduced the notion of “scientific management” in 1911. Hence “Taylorism” is generally associated with time and motion studies intended to streamline the manufacturing process and to standardize production. A key point is that “Taylorism was based on trying to replace the implicit knowledge of the workmen with mass-production methods developed, planned, monitored and controlled by managers.” Increasingly, during the first half of the 20th century and especially after World War II, major corporations like General Motors began to subordinate every aspect of production to managerial principles, while the workers themselves were subjected to an unprecedented routinization. Another key influence emerged from the major schools of business which, beginning in the 1950s, reshaped the managerial ethos as “a distinct set of skills and techniques, focused upon a mastery of quantitative methodologies.” Managerial gurus like Robert McNamara (who began his career as a professor at Harvard Business School) pushed the idea that a successful general manager need not possess any particular knowledge of an industry, but “should be adept at calculating costs and profit margins.” The working assumption here is that the particularities of private corporations, government bureaucracies, universities, etc. are “less important than the similarities.” Thus, “If you can’t count it, you can’t manage it” became the first commandment of the new managerial cult. Of course, other factors played a role in fueling the rise of managerialism. One of these, perhaps the most significant, has been the erosion of traditional modes of authority. As social trust diminishes, as it has tended to do in our egalitarian and highly mobile society, faith in metrics rises. In a meritocratic system, executives and managers especially are more insecure in their positions. The numbers racket, with its aura of objectivity, functions as a security blanket. Numbers are “‘hard,’ and thus a safer bet for those disposed to doubt their own judgments.”
Finally, in view of Muller’s extensive and accurate understanding of the role of managerialism in fostering the metrics fixation, it is decidedly odd that he never once mentions James Burnham, the political philosopher whose pioneering work on the managerial “revolution” has been the foundation for virtually everything written about the topic since the 1940s. Had Muller paid more attention to Burnham, he might have recognized that the erosion of traditional sources of authority is not simply something that emerged “in the culture,” but has been in large part the result of a direct attack on those authorities by a managerial elite driven to establish its own power and control. Yet its claim to legitimacy, based on ever-increasing efficiency and profits (and rising numbers of happy, pacified consumers of goods and services), necessarily separates this elite from any broader loyalty to the nation as a whole, what we sometimes call the “commonwealth,” not to mention the thousands of local communities of which the nation is composed. The true wealth (and health) of any nation is unmeasurable. It will not be enough, as Muller himself seems to suggest, to tinker with the numbers racket until its abuses are eliminated. No, it is the managers themselves, understood as a self-perpetuating class of experts, who are the problem.
[The Tyranny of Metrics, by Jerry Z. Muller (Princeton, NJ: Princeton University Press) 240 pp., $24.95]