Bad Proxy Variables and Institutional Ethnocentric Oversimplification

By Karl T. Muth - 03 December 2013
Karl Muth talks generally about proxy variables and then focuses in on race as a proxy variable for social and economic class, arguing it is a troubling one and (mis-)used far too often, particularly in higher education.

Often, in any empirical area of the social sciences, whether it’s public health or development or economics or legal studies, we find ourselves in a less-than-desirable situation: the variable we really want to measure is expensive to measure or impossible to measure or impracticable to measure. So we find a proxy variable. Sometimes these work, and sometimes they don’t.

I have a colleague who has been following the rate of development in Singapore for some time. You might think this is easy to study, but it’s not. Singapore has an enormous bureaucracy around building and often permits are issued for buildings that are never built, or buildings are planned so far in advance that financing falls through when the time comes to break ground, or a change in local politics means one project is favoured over the project that was on the horizon for years. As a result, she uses satellite photography and checks, block-by-block, which areas have been built. We call this a direct observation (some will quibble with this and say that it’s indirect, as the satellite is actually making the observation, but I’ll set those technical arguments aside for another time). By looking at these photos, she can see the exact pace of building, but building is different from development.

However, my friend also realises that this is flawed in that a one-city-block parking garage is counted as the same development as a one-city-block office building or apartment building. To correct for this, she examines power grid loads and utility data for various areas. Since a parking garage uses little power, it does not register in these statistics. However, areas like Google’s server hive in Singapore or the local stadium draw substantial amounts of power. The special lighting system that illuminates the city for nighttime Formula One racing draws an incredible amount of power – but is not in itself an indicator of “development” in any broad sense. As you can see, electricity use is a tempting proxy for development, but perhaps not a very good one.

Economists and other social scientists rate proxy variables based on their correlation with the thing being studied. This is in itself dubious methodologically in some cases, however, since the very reason for using a proxy variable is that the value of the thing being studied is not known. A contemporary example is happiness. While there are various indices of happiness, they are all an assembled cocktail of proxy variables. It isn’t that living in a place with low infant mortality causes euphoria, it’s that this is a factor that is generally included in happiness index calculations (partly because it hints at other things, like the availability of good hospital care).
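To make the rating step concrete, here is a minimal sketch of scoring two candidate proxies by their correlation with a target. It assumes the target has been measured directly on a small calibration sample, which is exactly the assumption that is hard to satisfy in practice. All numbers and names (`development`, `power_use`, `built_blocks`) are hypothetical and for illustration only; Pearson correlation is one possible choice of score, not necessarily the one any given study uses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical calibration sample: five districts where the target
# ("development") was somehow measured directly.
development = [1.0, 2.0, 3.0, 4.0, 5.0]
power_use = [2.0, 2.5, 9.0, 3.5, 4.0]     # distorted by one power-hungry site
built_blocks = [1.1, 2.3, 2.9, 4.2, 4.8]  # made-up satellite-style counts

# Rate each candidate proxy by its correlation with the target.
scores = {
    "power_use": pearson(power_use, development),
    "built_blocks": pearson(built_blocks, development),
}
best = max(scores, key=scores.get)
print(scores, best)
```

The score is only as trustworthy as the calibration sample: outside the districts where the target was actually measured, the correlation is an extrapolation.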

The other problem with proxy variables is that their utility changes over time. This can be seen in the Dow Jones Industrial Average, which is essentially a proxy variable for the performance of American equities. To make this role clear, the constituent companies whose stocks are used to create the Dow are called “Dow components” since they are components in a weighted average calculation. Since a basket of a dozen American companies cannot possibly reflect what is happening in the US equity markets generally, the Dow was expanded to thirty components. Every single original Dow component has been replaced except for General Electric in an effort to keep the Dow a faithful proxy variable for the performance of American equities. And it does a surprisingly good job, as can be seen by comparing S&P 500 performance (a basket of five hundred firms) with the far smaller Dow (thirty firms). However, the Dow’s good tracking performance (“tracking performance” is the operative metric for financial proxy variables) is due to its careful curation and the replacement of some companies with others over time.
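One common way to put a number on tracking performance is tracking error: the standard deviation of the difference between the proxy's periodic returns and the benchmark's. The sketch below assumes that definition (the text above does not say which measure is meant), and the index levels are invented for illustration, not real Dow or S&P 500 data.

```python
from statistics import stdev


def periodic_returns(levels):
    """Simple period-over-period returns from a series of index levels."""
    return [later / earlier - 1 for earlier, later in zip(levels, levels[1:])]


def tracking_error(proxy_levels, benchmark_levels):
    """Sample standard deviation of the return differences (proxy minus benchmark)."""
    if len(proxy_levels) != len(benchmark_levels):
        raise ValueError("both series must cover the same periods")
    if len(proxy_levels) < 3:
        raise ValueError("need at least three levels (two returns)")
    proxy_returns = periodic_returns(proxy_levels)
    benchmark_returns = periodic_returns(benchmark_levels)
    differences = [p - b for p, b in zip(proxy_returns, benchmark_returns)]
    return stdev(differences)


# Hypothetical index levels over four periods (not real market data).
narrow_index = [100.0, 102.0, 101.0, 104.0]
broad_index = [1000.0, 1021.0, 1009.0, 1041.0]

print(tracking_error(narrow_index, broad_index))
```

A lower value means the narrow basket moves more closely in step with the broad one. Because the measure works on returns, the two indices do not need to be quoted on the same scale.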

Let’s examine a more controversial proxy variable application relevant to American policymakers. In estimating disadvantage in the university admissions context, race is used as a proxy (either as a matter of policy or in practice apart from written policy). Few would contend that campus diversity is not a valuable thing; many, including myself, would assert that campus diversity is an amazing aspect of university life and that being on a campus with people from dozens of countries, ethnic groups, and backgrounds is a valuable part of the educational experience. It is, in fact, one of the things that attracted me to studying at the London School of Economics.

However, race’s value as a proxy for opportunity and class privilege has changed wildly over the past fifty years. Though many factors have contributed to this, three are particularly relevant in the context of American and British universities. For the purposes of this discussion, I’m focusing on elite universities with selective admissions. And, in the spirit of full disclosure, I serve on the Admissions Committee of a graduate department at the University of Chicago, one such institution; these comments are my own opinions and not made in that capacity.

The three driving factors I identify as having eroded the value of race as a proxy for privilege are: 1) enormous financial windfalls to minority groups, 2) immigration of a type and scale that did not exist in the 1960s when these policies were set, 3) the inadequate and overly simplistic bifurcation of minorities into “under-represented minorities” and (though no one says this) “over-represented minorities.”

Taking these in order, the increase in windfalls to minority groups is incredible in scale and frequency. The odds that an applicant today from China or Nigeria will be from not only a wealthy family in China or Nigeria, but a wealthier family than most in the U.S. or U.K., are significant. On the first day of law school, I sat next to a woman from a very wealthy family in Nigeria and we quickly became friends. One of the things I contemplated several times was the simplicity of race as a problematic proxy for privilege or access and the fact that many of her peers had social connections and financial resources that most Americans – white, black, or otherwise – could only dream of. The creation of millionaires by the thousands in Brazil, China, India, Singapore, and other places is another example of race being decoupled from financial access to a degree it was not in the 1960s. While these numbers are small in the greater scheme of things, they are large enough to be significant since it is children from these minority families who are more likely to be applicants at top overseas schools in the U.S. and U.K. Even excluding the question of international students, the entire global culture of race, opportunity, and wealth has changed. Most new billionaires are Chinese. The wealthiest man on earth is Mexican. The commodity mineral that has seen the largest increase in value in the last five years is almost exclusively controlled by black Africans (coltan).

Turning to immigration, the type of immigration occurring today is unprecedented in world history. The concept of race and “affirmative action” in the United States did not consider the type of student migration we see today and did not examine the possibility that an African applicant might have a substantial advantage (often) over African-American applicants, for instance (both in terms of wealth and in terms of education and pre-collegiate opportunity). These fixed, mid-Twentieth-Century concepts of race failed to recognise that the Asian population might be hyper-mobile and that wealthy Chinese families would routinely send their children to be educated in the U.S. or U.K. with no intent they would stay there after graduation. The assumption by both U.S. and U.K. universities that foreigners would stay in the country of their education is clearly antiquated (many students today would turn down a job in New York for an investment banking job in Hong Kong or Singapore), but it takes time for institutions to adjust.

Finally, the concept of an “under-represented” minority in the U.S. (and more recently the U.K.) is usually taken relative to the total population. By this measure, in the United States, Asians are “over-represented” in the field of medicine but “under-represented” in the field of professional ice hockey (an absurd argument to illuminate the issue at hand: given that both the top U.S. medical schools and the National Hockey League draw from all over the world, shouldn’t 18% of both doctors and hockey players be Han Chinese, as this is the worldwide proportion?). What does this tell us? Precisely nothing. Nothing useful, anyway. By this measure, the Taiwanese Chinese are overrepresented relative to mainland Chinese in manufacturing of high-performance semiconductors per capita. Should we petition M.I.T. and Caltech until they create special programs to make sure mainland Chinese are receiving opportunities to study making computer chips? Probably not. Let’s look at a thought experiment. Suppose there were no underrepresented minorities. Suppose that a third of people graduating from the University of Chicago’s law, medical, and business schools were black (Chicago is 32.9% black). What would this indicate to us about the level of racism in Chicago? Or the state of the medical profession? Or the quality of care patients were receiving? Without additional data, it would tell us… nothing.

Of course, I have no illusions that we live in a post-race or non-discriminatory society. My parents were discriminated against as a mixed-race couple looking for a home in the 1970s. And I’ve seen friends from a variety of racial backgrounds endure far worse discrimination than I have ever been subject to. But the time for convenient proxies at the institutional level has ended: There are better, more nuanced, and more descriptive ways to examine a person’s experiences, resources, and credentials than to pull out a Pantone wheel.

Here, we return to an early assertion in this blog: use of proxy variables is appropriate where it is impossible or impracticable to ascertain the actual thing one wants to measure. But if what universities want to measure is privilege, we have ways to more-or-less directly measure that. If what employers want to know is whether a person would add diversity to the workplace, there are far better metrics than race. We don’t need a proxy variable. We don’t need to keep claiming (erroneously) that race is a good proxy for opportunity or wealth or access.

There have been shouts from the left in recent years that interviews (the interview process having been aimed at attacking exactly the more nuanced questions addressed here) at Cambridge and Oxford are disguised racism. From what little is publicly available on the topic, I disagree. But the issue came to the forefront of British political discussion in 2010, when the Guardian reported (and Cambridge and Oxford did not refute) that twenty-one Oxbridge colleges did not offer admission to a single black student (Oxford did object to and disprove the article’s assertion that Merton College had not offered admission to any black student in five years). In the year in question, Oxford accepted one black student across all its colleges. Statistics showed that Oxford’s students were 89% from upper social classes that year (how these social class statistics were compiled, I do not know).

What the school failed to mention in its reply to the Guardian, which I think is highly relevant, is that only thirty-five black candidates applied to Oxford in the year studied. Oxford is made up of thirty-eight colleges. Some will say, “Well, most black students would look at the statistics and apply to Harvard or Stanford or Chicago instead, seeing the low success rate of black applicants.” But you don’t get into Cambridge or Oxford or any other school of that tier because you see yourself as “another minority applicant.” You see yourself as a mix of identities – and I use the second person partially autobiographically here, as I write this as someone who has experience being a mixed-race university applicant – only a few of which have any relationship to race. And the son or daughter of an Asian industrialist or a South American oil family would never think of him- or herself that way, anyway (I say this having met, and studied alongside, such sons and daughters). Having the nerve to apply is, in itself, perhaps, a mark of privilege.

The question is not one of race. The question is one of privilege. I know several non-white graduates of Oxford colleges. But they are not rags-to-riches stories. They are uniformly, unanimously riches-to-riches stories. In fact, I might argue that using race to distinguish between these applicants and white applicants who went to the same boarding schools, whose parents sat on the same corporate boards, and who kept sports cars in London for weekends is hardly a pursuit of “diversity.” It isn’t that these people did not deserve to go to Oxford – they are, without exception, brilliant. But their race has little to do with it. It’s that at any top university finding people with off-the-charts, way-beyond-the-top-one-percent economic or social privilege is like trying to find a needle in a needlestack. Some of these people are white. But their level of privilege and access is the core issue, not their superficial membership in any one group.

Some people have made efforts – and progress – on this front, but it is slow going. My dear friend Ian Simmons, who has long been an advocate for increased access and meritocracy at top institutions (first as a student at Harvard and now for years as an alumnus), has often noted that Harvard has vastly increased the number of its undergraduate students who qualify for financial aid (likely a better metric than Oxbridge’s “social class” statistic) but that there remains work to be done. Likewise, I’m happy to report that the University of Chicago, where I attended the same program my father did decades earlier, was a more diverse place than when my father attended (on not only race, but also other criteria). However, elite education is still inaccessible to most and unapproachable to many. In many ways, the small number of students admitted to Cambridge and Oxford from less-wealthy, less-connected backgrounds concerns me slightly less than the huge number of similar students who must have been discouraged from even applying.

But to lump the black student who grew up in Paris with the black student who grew up in Detroit because they happen to be of the same shade is offensive to both. To classify the Chinese student from one of the wealthiest families in Hong Kong in a category with the Chinese student from a poor family in East Los Angeles is to insult each and both. To suggest that checking a “Hispanic” checkbox should include both the child of an investment banker in Barcelona and the child of a migrant worker in Texas is to create a useless classification. These overbroad racial and ethnic taxonomies are not proxies for privilege or past discrimination – they are meaningless categories.

Where to from here?

Let’s get to a point where diversity doesn’t mean “multicoloured similar people” but actually means people who have within them different experiences, backgrounds, and challenges. Let’s stop pretending that white people are wealthy and black people are poor and realise that people are not statistical samples and demographic coagulations, but individuals. Let’s stop pretending that race is a monolithic construct or anything other than arbitrary buckets created by lazy demographers who are now dead. Let’s be honest about the dimensions of class, privilege, access, intergenerational wealth, and geography that we really care about. Let’s admit that we’ve reached a point where race is an irretrievably poor proxy for measuring something like privilege and look at that as an achievement.

Let’s stop using proxy variables like race when the actual things that need to be measured are dead center in front of us: wealth, access, political influence, and so on. It will take effort to create the kinds of diversity that matter – more effort than it has taken to attempt to discover tiny shining flakes of heterogeneity in rivers of sameness. But this effort is worth it. And needed urgently.

