Fairness, evidence, and predictive equality

In-person exams in the UK were canceled this year because of the pandemic, so results were given using a modeling system that looked at "the ranking order of pupils and the previous exam results of schools and colleges". I don't know exactly how the modeling system took the previous results of schools and colleges into account, but I'm going to assume that students from schools with a worse track record on exams were predicted to have lower grades. This has, understandably, caused a lot of controversy.

I think this might be a good example of a case where using information feels unfair even if it makes our decision more accurate. It's very likely that a school's previous exam performance helps us make better predictions about how its current students will perform. Yet it feels quite unfair to give people from lower performing schools worse grades than those from higher performing schools if everything else about them is the same.

To take a similar kind of case, suppose a judge's goal is to get people who have committed a crime to show up to court in a way that minimizes costs to defendants and the public. How should she take into account statistical evidence about defendants?

First, let's consider spurious correlations in the data that are not predictive. Suppose we divide defendants into many small groups, such as "red-headed Elvis fans born in April". If we do this, we'll find that lots of these groups have higher than average rates of not showing up for court, simply because there are so many small groups that some will look bad by chance. But if these are mostly statistical artifacts that aren't driven by any underlying confounders, the judge would do better by her own lights if she mostly just ignored them.
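
As a rough illustration, here's a minimal simulation sketch with made-up numbers (a 20% no-show rate that is identical for everyone, and 1,000 arbitrary groups of 15 defendants each) showing how many groups will look noticeably worse than average through chance alone:

```python
import random

random.seed(0)

BASE_RATE = 0.20   # assumed no-show rate, identical for every defendant
N_GROUPS = 1000    # many small, arbitrary groups ("red-headed Elvis fans born in April")
GROUP_SIZE = 15

# Group membership carries no information here: everyone misses court with the
# same probability, so any group that looks worse than average is pure noise.
elevated = 0
for _ in range(N_GROUPS):
    misses = sum(random.random() < BASE_RATE for _ in range(GROUP_SIZE))
    if misses / GROUP_SIZE >= 1.5 * BASE_RATE:  # group looks 50%+ worse than average
        elevated += 1

print(f"{elevated} of {N_GROUPS} groups look substantially worse than average by chance")
```

On these assumptions, a sizable fraction of the groups end up looking substantially worse than average even though nothing distinguishes their members, which is why a judge who acted on every such pattern would mostly be reacting to noise.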

Things get trickier when the correlations are predictive. For example, suppose that night shift workers are less likely to show up to court on average. Their court date is always set for a time when they aren't working, so being a night shift worker doesn't seem to be a direct cause of not showing up to court. But the correlation is predictive. Given this, the judge would do better by the standards above if she increases someone's bail amount when she finds out they're a night shift worker. This is true even if most night shift workers would show up to court.
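
To see why a merely predictive correlation can still matter for the judge's goals, here's a toy expected-cost sketch. All of the numbers are invented for illustration: a 10% baseline no-show rate, a 25% rate among night shift workers, and assumed costs of no-shows and of higher bail.

```python
# Toy numbers, all invented for illustration: the judge trades off the public
# cost of a missed court date against the burden that high bail places on a
# defendant who does show up.
COST_OF_NO_SHOW = 10_000   # assumed cost of a missed court date
BURDEN_OF_HIGH_BAIL = 800  # assumed burden of high bail on a compliant defendant
HIGH_BAIL_EFFECT = 0.5     # assume high bail halves the chance of not showing up

def expected_cost(p_no_show: float, high_bail: bool) -> float:
    p = p_no_show * (HIGH_BAIL_EFFECT if high_bail else 1.0)
    burden = BURDEN_OF_HIGH_BAIL if high_bail else 0.0
    return p * COST_OF_NO_SHOW + (1 - p) * burden

for label, p_no_show in [("average defendant", 0.10), ("night shift worker", 0.25)]:
    low, high = expected_cost(p_no_show, False), expected_cost(p_no_show, True)
    print(f"{label}: low bail costs {low:.0f}, high bail costs {high:.0f}")
```

With these made-up numbers, raising bail only pays off for night shift workers, even though most of them (75%) would have shown up anyway, which is exactly the structure of the case above.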

As in the UK grades case, this feels intuitively unfair to night shift workers.

One principle that might be thought to ground our intuition for why this is unfair is the following:

Causal fairness principle (CFP): It's fair to factor properties of people into our decision-making if and only if those properties directly cause an outcome that we have reason to care about.[1]

This principle looks plausible and would explain why the grades case and the night shift workers case both feel unfair. Night shift work doesn't seem to cause not showing up to court, and going to a low performing school doesn't directly cause getting a lower grade. But I think this principle is inconsistent with our intuitions in other cases.

To see why, suppose that night shift workers are more likely to live along poor bus routes. This means that they often miss their court appointment because their bus was running late or didn't show up. And this explains the entire disparity between night shift workers and others: a night shift worker who doesn't live along a poor bus route will show up to court as often as the average person, and a non-night shift worker who does live along a poor bus route will show up to court at the same (lower) rate as night shift workers who live along poor bus routes.
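
Here's a small sketch of this story with invented numbers (say, 60% of night shift workers and 20% of everyone else live along poor bus routes): once you condition on the bus route, night shift status adds nothing, but on its own it still looks predictive.

```python
# Hypothetical show-up rates matching the story above: conditional on living
# along a poor bus route, night shift work makes no difference.
SHOW_UP_RATE = {
    # (night_shift, poor_bus_route): probability of showing up to court
    (True,  True):  0.70,
    (False, True):  0.70,  # non-night-shift workers on poor routes do just as badly
    (True,  False): 0.90,  # night shift workers on good routes do just as well
    (False, False): 0.90,
}

def marginal_show_up(night_shift: bool, p_poor_route: float) -> float:
    """Show-up probability given only night-shift status, averaging over the
    (assumed) chance of living along a poor bus route."""
    return (p_poor_route * SHOW_UP_RATE[(night_shift, True)]
            + (1 - p_poor_route) * SHOW_UP_RATE[(night_shift, False)])

print(marginal_show_up(True, 0.6))   # 0.78: night shift work *looks* predictive...
print(marginal_show_up(False, 0.2))  # 0.86
# ...but only because it is correlated with the causally relevant factor, the route.
```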

The judge receives this new information and responds by increasing the bail of anyone who lives along a poor bus route. By CFP her decision would be fair, since it only takes into account properties that are direct causes of the outcome she cares about. (And the outcomes will be better relative to her goals, because this heuristic gets at the underlying causal facts better than the night shift worker heuristic does.) But I think her decision is intuitively unfair.

In response to this case, we might adjust CFP to say that a decision is fair only if the causal factors in question are currently within the control of the agent.

This addition makes some intuitive sense because factors outside of an agent's control are often not going to be responsive to whatever incentives we are trying to create. In this case, however, where the agent lives is at least partially within their control, even if moving would be very financially difficult for them. The behavior of people who live along poor bus routes is also likely to be responsive to incentives: they are more likely to leave earlier to get to court if failing to show up means forfeiting a high bail amount.

We also often think that it's fair to consider causally relevant factors that are outside someone's control when making decisions. Suppose you're deciding whether to hire someone as a lawyer or not and you see that one of the applicants is finishing a medical degree rather than a law degree. It seems fair to take this into account when making your decision about whether to hire them, even if we suppose that the candidate currently has no control over the fact that they will have a medical degree rather than a law degree, e.g. because they can't switch in time to get a law degree before the position starts.

These are reasons to be skeptical of CFP in the "if" direction (if a property is causally relevant then it's fair to consider it) but I believe we also have reasons to be skeptical of the principle in the "only if" direction (it's only fair to consider a property if it's causally relevant).

To see why, consider a case in which the judge asks a defendant "are you going to show up to your court date?" and the defendant replies "no, I have every intention of fleeing the country". Should the judge take this utterance into account when deciding how to set bail? This utterance is evidence that the defendant has an intention to flee the country, and having this intention is the thing that's likely to cause them to not show up to their court date. The utterance itself doesn't cause the intention and it won't cause them to flee the country: the utterance is just highly correlated with the defendant having an intention to flee (because this intention is likely the cause of the utterance). So CFP says that it's unfair for the judge to take this utterance into account when making her decision. That doesn't seem right.

To avoid this, we might try to weaken CFP and say that it's fair to take properties of someone into account only if having those properties is evidence that a person has another property that's causally relevant to the outcome. But this weakens the original principle excessively, since even the most spurious of correlations will be evidence that a person has a property that's causally relevant to the outcomes we care about. This includes race, gender, etc. since in an unequal society many important properties will covary with these properties. In an ideal world, we would only get evidence that someone has a causally relevant property when the person actually has the causally relevant property. But we don't live in an ideal world.

Perhaps we can get around some of these problems by moving to a more graded notion of fairness. This would allow us to amend the principle above as follows:

Graded causal fairness principle (GCFP): Factoring a piece of evidence about someone into our decision-making is fair to the degree that it is evidence that the person has properties that directly cause an outcome that we have reason to care about[2]

Since coincidental correlations will typically be weaker evidence of a causally-relevant property than correlations that are the result of a confounding variable, GCFP will typically say that it's less fair to take into account properties that could just be coincidentally correlated with the outcome we care about.

Although this seems like an improvement, GCFP still doesn't capture a lot of our intuitions about fairness. To see this, consider again the case of night shift workers. Suppose that we don't yet know why night shift work is so predictive of not showing up to court. By GCFP, it would be fair for the judge to assign night shift workers higher bail as long as the correlation between night shift work and not showing up to court was sufficiently strong, since a stronger correlation is evidence that there's an underlying causally relevant factor. Once again, though, I think a lot of people would not consider this to be fair.

Let's throw in a curve ball. Imagine that two candidates are being interviewed for the same position. Both seem equally likely to succeed, but each of them has one property that is consistently correlated with poor job performance. The first candidate is fluent in several languages, which has been found to be correlated with underperformance for reasons not yet known (getting bored in the role, perhaps). The second candidate got a needs-based scholarship in college, which has also been found to be correlated with underperformance for reasons not yet known (getting less time to study in college, perhaps).

Suppose the candidates both want the job equally and that these properties are equally correlated with poor performance. The company can hire both of the candidates, one of them, or neither. How unfair does it feel if the company hires the person fluent in many languages but not the person who received a needs-based scholarship to college? How unfair does it feel if the company hires the person who received a needs-based scholarship to college but not the person who is fluent in many languages?

I don't know if others share my intuitions, but even if it feels unfair for the company to hire only one of the candidates instead of both or neither, the situation in which they reject the candidate who received a needs-based scholarship feels worse to me than the situation in which they reject the candidate who is fluent in several languages.

One possible explanation for this is that we implicitly believe in a kind of "predictive equality".

We often need to make decisions based on facts about people that are predictive of the properties that are causally relevant to our decision but aren't themselves causally relevant. We probably don't feel so bad about this if the property in question is not generally disadvantageous, i.e. over the course of a person's life the property is just as likely to land them on the winning end of predictive decisions as on the losing end.

Let's use the term "predictively disadvantageous properties" to refer to properties that need not be bad in themselves (they could be considered neutral or positive) but that are generally correlated with worse predicted outcomes. It often feels unfair to base our decisions on predictively disadvantageous properties because we can foresee that these properties will more often land someone on the losing end of predictive decisions.

Consider a young adult who was raised in poverty. They are likely predicted to have a higher likelihood of defaulting on a loan, more difficulty maintaining employment, and worse physical and mental health than someone who wasn't raised in poverty. Using their childhood poverty as a predictor of outcomes is therefore likely to result in decisions fairly consistently being made about them in ways that assume worse outcomes for them. And it can be hard to do well—to get a loan to start a business, say—if people believe you're less likely to flourish.

Cullen O'Keefe put this in a way that I think is useful (and I'm now paraphrasing): we want to make efficient decisions based on all relevant information, but we also want risks to be spread fairly across society. We could get both by just making the most efficient decisions and then redistributing the benefits of these decisions. But many people will have control only over one of these things: e.g. hirers have control over the decisions but not what to do with the surplus.

In order to balance efficiency and the fair distribution of risks, hirers can try to improve the accuracy of their predictions but also make decisions and structure outcomes in a way that mitigates negative compounding effects of predictively disadvantageous properties.

For example, imagine you're an admissions officer considering whether to accept someone to a college and you know that students from disadvantaged areas tend to drop out more. It would probably be bad to simply pretend that this isn't the case when deciding which students to accept. Ignoring higher dropout rates could result in applicants from disadvantaged areas taking on large amounts of student debt that they will struggle to pay off if they don't complete the course.[3] But it might be good in the long-term if you err on the side of approving people from disadvantaged areas in more borderline cases, and if you try to find interventions that reduce the likelihood that these students will drop out.

Why should we think that doing this kind of thing is socially beneficial in the long-term? Because even if predictions based on features like childhood poverty are more accurate, failing to improve the prospects of people with predictively disadvantageous properties can compound their harms and create circumstances that it's hard for people to break out of. Trying to improve the prospects of those with predictively disadvantageous properties gives them the opportunity to break out of a negative prediction spiral: one that they can find themselves in through no fault of their own.

But taking actions based on predictively disadvantageous properties doesn't always seem unfair. Consider red flags of an abusive partner, like someone talking negatively about all or most of their ex-partners. Having a disposition to talk negatively about ex-partners is not a cause of being abusive; it's predictive of being abusive. This makes it a predictively disadvantageous property, since it's correlated with worse predicted outcomes. But being cautious about getting into a relationship with someone who has this property doesn't seem unfair.

Maybe this is just explained by the fact that we want to make decisions that lead to better outcomes in the long-term. Long-term, encouraging colleges to admit fewer students from disadvantaged areas is likely to entrench social inequality, which is bad. Long-term, encouraging people to avoid relationships with those who show signs of being abusive is likely to reduce the number of abusive relationships, which is good.

How can we tell if our decisions will lead to better outcomes in the long-term? This generally requires asking things like whether our decision could help to detach factors that are correlated with harmful outcomes from those harmful outcomes (e.g. by creating the right incentives), whether they could help us isolate causal from non-causal factors over time, and whether the goals we have specified are the right ones. The short but unsatisfactory answer is: it's complicated.

Thanks to Rob Long for a useful conversation on this topic and for recommending Ben Eidelson's book, which I haven't managed to read but will now recklessly recommend to others. Thanks also to Rob Long and Cullen O'Keefe for their helpful comments on this post.


1. I added the "have reason to care about" clause because if a judge cared about "being a woman and showing up to court" then gender would be causally relevant to the outcome we care about and therefore admissible, but it seems ad hoc and unreasonable to care about this outcome.

2. An ideal but more complicated version of this principle would likely talk about the weight that we give to a piece of evidence rather than just whether it is a factor in our decision.

3. Thanks to Rob Long for pointing out this kind of case.