Re: Reconsideration of HUD's Implementation of the Fair Housing Act's Disparate Impact Standard, Docket No. FR-6111-P-02
Upturn writes to provide comments in response to the above-docketed notice of proposed rulemaking (“NPRM”) concerning proposed changes to the disparate impact standard (the “proposed rule”) as interpreted by the U.S. Department of Housing and Urban Development (“HUD”).
Upturn is a 501(c)(3) non-profit organization that advances equity and justice in the design, governance, and use of digital technology. Upturn's staff has years of experience working in partnership with the nation’s leading civil rights and public interest organizations, and has developed unique expertise at the intersection of civil rights, law, and computer science.
The proposed rule would undermine crucial housing protections for vulnerable communities. It would eviscerate the ability of HUD and other plaintiffs to address discriminatory effects arising from the use of algorithmic models (hereinafter "models"), even though such models are "increasingly commonly used" in determining people's eligibility for a range of housing opportunities. It would effectively create a "special exemption" for parties who use such models, despite those models' capacity for significant discriminatory effects. In sum, the proposed rule will likely result in harm to the very groups that the Fair Housing Act ("FHA") seeks to protect, and is flatly incompatible with HUD's legal obligation to affirmatively further fair housing.
It is important to emphasize that the disparate impact doctrine is the most effective legal tool — and often the only legal tool — with which to combat discrimination arising from the use of models, in housing markets or elsewhere. Without the disparate impact doctrine, opaque, automated decisions will effectively rise above the law when there is no clear evidence of disparate treatment.
Upturn has endorsed a separate comment signed by a range of individuals and organizations with expertise in the fields of computer science, statistics, and digital and civil rights. We write separately here to further underscore some basic, technical facts that HUD must reconcile with any modification to its existing disparate impact rule.
1. Models can produce significant discriminatory effects, even when they do not rely on factors that are “substitutes or close proxies for protected classes.”
The proposed rule would allow a defendant using a model to defeat a disparate impact claim by “[p]rovid[ing] the material factors that make up the inputs used in the challenged model and show[ing] that these factors do not rely in any material part on factors that are substitutes or close proxies for protected classes under the Fair Housing Act and that the model is predictive of credit risk or other similar valid objective.”
At a basic level, this new proposed defense is overbroad because models that do not take "substitutes or close proxies for protected classes" as inputs can nevertheless have discriminatory effects. This is well understood in the context of fair lending. For example, in 2003, Congress ordered the Federal Reserve Board ("FRB") and the Federal Trade Commission, in consultation with HUD, to conduct a study of the effects of credit history scoring, including negative or differential effects on protected classes. This substantial report was commissioned despite the fact that consumer credit files — the inputs for the models at issue — clearly did not contain "substitutes or close proxies" for protected classes. Nevertheless, Congress was interested in the effects of credit history scoring as a general matter. And rightly so: the FRB's report found, among other things, that the length of an individual's credit history served as a proxy for age in ways that could not be easily addressed. The proposed rule appears to entirely disregard the need for this type of inquiry, which is only becoming more important as models grow more common and more complex.
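To make the FRB's finding concrete, we offer a minimal sketch in Python. All data, names, and thresholds below are invented purely for illustration: a scoring rule that never sees age, and takes no obvious substitute for it as an input, can still produce sharply age-skewed outcomes, because credit history length is structurally bounded by age.

```python
import random

random.seed(0)

# Hypothetical applicant pool. Age is never used by the scoring rule,
# but it bounds credit history length: an applicant cannot accrue more
# years of history than years of adulthood.
applicants = []
for _ in range(10_000):
    age = random.randint(18, 80)
    history_years = min(age - 18, max(0.0, random.gauss(12, 6)))
    applicants.append({"age": age, "history_years": history_years})

# A scoring rule whose only input is credit history length.
def approve(applicant):
    return applicant["history_years"] >= 8

young = [a for a in applicants if a["age"] < 30]
older = [a for a in applicants if a["age"] >= 30]

print(f"approval rate, under 30:  {sum(map(approve, young)) / len(young):.0%}")
print(f"approval rate, 30 and up: {sum(map(approve, older)) / len(older):.0%}")
# Age is not an input, and history length is not an obvious substitute
# for it, yet approval outcomes skew sharply by age.
```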
More significantly, the proposed defense is flawed because it fails to acknowledge the many phases of model development, including problem definition, data collection and labeling, model selection and training, data partitioning, and deployment. It is widely understood that statistical models can inherit biases against protected classes at each of these steps, even when protected class attributes are never considered. Often, the discrimination that arises within statistical models is not obvious: it can come from "subtle correlations discovered by training algorithms, and [is] therefore difficult to detect." A rich and growing technical literature expounds on these and related issues.
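The point about subtle correlations can be demonstrated with a deliberately stylized sketch (synthetic data; the two features are abstract stand-ins, not real credit factors). Each feature below is, on its own, statistically independent of the protected class, so neither could be called a substitute or close proxy; yet an off-the-shelf learner discovers their interaction, reconstructs the protected class, and reproduces a biased historical labeling practice.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical protected attribute; never provided to the model.
group = rng.integers(0, 2, n)

# Two abstract features. Each is marginally independent of `group`;
# only their interaction encodes it.
z = rng.integers(0, 2, n)
feature_a = z
feature_b = (z == group).astype(int)
X = np.column_stack([feature_a, feature_b])

# Historical labels reflect a biased practice: group 1 was approved
# far less often, for reasons unrelated to merit.
label = rng.random(n) < np.where(group == 0, 0.6, 0.3)

model = DecisionTreeClassifier(max_depth=3).fit(X, label)
pred = model.predict(X)

for g in (0, 1):
    print(f"group {g}: predicted approval rate {pred[group == g].mean():.0%}")
# Prints roughly 100% for group 0 and 0% for group 1: the learner
# recovers the protected class from the interaction of two
# individually "clean" inputs and amplifies the historical disparity.
```

No inspection of either input factor in isolation would flag this model; the disparity lives in the training data and the learned interaction, which is precisely why the proposed defense's exclusive focus on "material factors that make up the inputs" is misplaced.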
These are not merely academic observations. There are many notable examples of models exhibiting a range of discriminatory effects in the real world. For example:
In the domain of criminal justice, the risk assessment instrument COMPAS has shown demonstrable bias against black individuals, resulting in longer prison sentences and harsher terms of supervision. COMPAS does not take race or ethnicity as an input, yet it disproportionately and incorrectly labels black individuals as highly likely to commit future crimes. No single input feature may be responsible for this disparity; it more likely arises from interactions among inputs, compounded by biases in how the underlying data were sampled.
Today's facial recognition technologies are often significantly better at recognizing white faces than black and brown faces. This phenomenon has less to do with inputs to the model, but rather the skewed training data that includes a much larger volume of white, male faces than any other race or gender. This is a clear example of how machine learning can create biased models as a result of incomplete or nonrepresentative training data.
As HUD is well aware, Facebook's ad platform "optimizes" the delivery of housing advertisements, above and beyond the targeting criteria specified by an advertiser. This optimization can lead to racially skewed delivery patterns even though Facebook does not collect data about its users' race. HUD has alleged that this practice violates the FHA. However, it is difficult to envision how HUD could succeed in establishing a prima facie case under the strictures of this proposed rule.
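The facial recognition failures described above stem from a general, easily demonstrated mechanism. In the following hypothetical sketch (synthetic data, not any vendor's actual system), a single model is trained on data that over-represents one group by 19 to 1, then evaluated on balanced held-out data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def sample(n, w):
    """Stylized data for one group: the relationship between features
    and the true label is governed by that group's weights w."""
    X = rng.normal(0, 1, size=(n, 4))
    y = (X @ w + rng.normal(0, 0.5, n) > 0).astype(int)
    return X, y

w_a = np.array([1.0, 1.0, -1.0, -1.0])   # majority group's pattern
w_b = np.array([1.0, -1.0, 1.0, -1.0])   # minority group's pattern

# Training set skewed 95/5 toward group A, as with face datasets
# dominated by white, male faces.
Xa, ya = sample(9_500, w_a)
Xb, yb = sample(500, w_b)
model = LogisticRegression().fit(np.vstack([Xa, Xb]),
                                 np.concatenate([ya, yb]))

# Balanced held-out evaluation.
for name, w in (("group A", w_a), ("group B", w_b)):
    Xt, yt = sample(5_000, w)
    print(f"{name}: accuracy {(model.predict(Xt) == yt).mean():.0%}")
# Accuracy is high for the over-represented group and near chance for
# the under-represented one. The failure is caused by skewed training
# data, not by any identifiable "proxy" input.
```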
In sum, a model's discriminatory effects cannot be diagnosed merely by examining its inputs. On the contrary, to properly assess a model for discriminatory effects, an investigator will likely need to first understand the model's purpose, and then consult design documentation, training data, executable code, test results, and other artifacts.
2. Models can be complex, opaque, and difficult to assess — so it's unreasonable to offer blanket safe harbors based on vague or superficial criteria.
The proposed rule would allow a defendant using a model to insulate themselves from disparate impact claims by showing that, in part, “the challenged model is produced, maintained, or distributed by a recognized third party that determines industry standards” or “the model has been subjected to critical review and has been validated by an objective and unbiased neutral third party.”
These proposed defenses amount to safe harbors, which are severely out of step with adjacent regulatory regimes. For example, in the context of consumer credit, Regulation B defines requirements for "an empirically derived, demonstrably and statistically sound, credit scoring system." This definition has become a helpful yardstick for the industry. However, unlike this proposed rule, Regulation B does not effectively create a blanket safe harbor. Rather, if a consumer credit scoring system conforms to the "empirically derived, demonstrably and statistically sound" definition, then that system is allowed to consider an applicant's age directly — something a system would not otherwise be able to consider under the Equal Credit Opportunity Act ("ECOA"). In the context of employment, the Uniform Guidelines on Employee Selection Procedures offer a nonbinding rule of thumb, known as the "4/5ths rule," to help determine when an employer might be at risk of a disparate impact claim. However, this rule is merely a heuristic: it does not determine whether an employer will ultimately be found liable for deploying a discriminatory selection procedure. Importantly, neither regime provides entities with the kind of blanket immunity contemplated by this proposed rule.
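For concreteness, the 4/5ths rule is a simple arithmetic screen. The following minimal sketch uses invented selection numbers:

```python
# The Uniform Guidelines' 4/5ths rule of thumb, applied to
# hypothetical selection data.
def adverse_impact_ratio(selected_a, total_a, selected_b, total_b):
    """Ratio of the lower group's selection rate to the higher's."""
    rate_a = selected_a / total_a
    rate_b = selected_b / total_b
    return min(rate_a, rate_b) / max(rate_a, rate_b)

# Hypothetical: 50 of 100 applicants in one group selected (50%),
# 30 of 100 applicants in another group selected (30%).
ratio = adverse_impact_ratio(50, 100, 30, 100)
print(f"impact ratio: {ratio:.2f}")                  # 0.60
print("flags possible adverse impact:", ratio < 0.8)  # True
# A ratio below 0.8 flags the procedure for further scrutiny, but it
# neither establishes nor precludes liability. It is a screening
# heuristic, not a safe harbor.
```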
Finally, as a practical and legal matter, FHA-covered entities should not enjoy immunity from disparate impact claims simply because they rely on a model provided by a third party. Such a policy would disincentivize covered entities from fully understanding and testing the models that they commission and deploy. Moreover, such an approach would create an accountability black hole: not all model vendors are clearly covered by the FHA, and many are likely to consider their models proprietary. Today, many prominent vendors, including FICO, will not indemnify their customers (e.g., lenders), because those customers are ultimately responsible for their own compliance with antidiscrimination laws. There is no reason to disturb this sensible structure of accountability.
3. There is no widely recognized definition of “substitutes or close proxies for protected classes.”
In two separate instances under the proposed rule, defendants can defeat a disparate impact claim in part by demonstrating that the challenged model does not "rel[y] in any material part on factors that are substitutes or close proxies for protected classes under the Fair Housing Act."
Critically, the proposed rule does not define what constitutes a factor that might be a "close prox[y]," nor does it offer any guidance on how to determine, as a statistical matter, whether a given factor is a close proxy for a protected class covered under the Fair Housing Act. This absence of clarity has predictable consequences: courts across the country, when faced with these affirmative algorithmic defenses, will be enlisted en masse to adjudicate statistical disputes. Judges are ill-equipped to arbitrate such disputes, and the experts on whom courts would rely would likely be unable to offer a consensus view, needlessly drawing the judiciary into line-drawing exercises about what does and does not count as a close proxy.
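A short, synthetic example illustrates why experts would struggle to converge. Below, two facially reasonable statistical definitions of "close proxy" (marginal correlation with the protected class, versus the ability of factors to jointly reconstruct the protected class) give opposite answers about the same factors; all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Synthetic protected attribute and three hypothetical input factors.
group = rng.integers(0, 2, n)
zip_income = rng.normal(0, 1, n) + 0.4 * group           # modest marginal link
device_type = rng.integers(0, 2, n)                      # no marginal link
tenure = np.where(device_type == 1, group, 1 - group)    # link only in combination

# Candidate definition 1: a proxy is a factor with a high marginal
# correlation with the protected class.
for name, x in (("zip_income", zip_income),
                ("device_type", device_type),
                ("tenure", tenure)):
    r = np.corrcoef(x, group)[0, 1]
    print(f"{name}: marginal correlation with protected class = {r:+.2f}")

# Candidate definition 2: a proxy is anything from which the protected
# class can be recovered. Two factors that are each uncorrelated with
# the class reconstruct it perfectly in combination.
reconstructed = np.where(device_type == 1, tenure, 1 - tenure)
print(f"joint reconstruction accuracy: {(reconstructed == group).mean():.0%}")
# Definition 1 flags zip_income and clears tenure; definition 2 finds
# that tenure and device_type together are a perfect proxy. Both
# definitions are statistically defensible, and the rule chooses neither.
```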
4. Assessing models for discriminatory effects often requires access to data about individuals' protected class status.
The proposed rule states that HUD does not “encourag[e] the collection of data with respect to race, color, religion, sex, handicap, familial status, or national origin” and that the “absence of any such collection efforts shall not result in any adverse inference against a party.”
This provision risks discouraging covered entities from developing robust processes to test their models for discriminatory effects. There is an ongoing debate about when protected class data should be collected and for what purposes. For example, in the context of consumer credit, the FRB twice considered amending Regulation B to permit the voluntary collection of protected class data from non-mortgage loan applicants in order to surface discriminatory lending decisions. To the extent HUD wishes to clarify its stance on how and when FHA-covered entities should or should not collect data about protected attributes, it should do so in a separate proceeding.
It's important to recognize that the collection of protected class data for antidiscrimination purposes has a long and important history. Banks, hospitals, housing providers, and employers routinely collect or infer race data as a key tool to measure and remediate discriminatory effects. Indeed, federal laws and regulations often mandate such collection for the purpose of measuring and addressing disparities. One example is the Home Mortgage Disclosure Act, which has made important data mutually accessible to lending institutions and community organizations — resulting in positive outcomes for banks and borrowers alike.
Looking to the future, machine learning practitioners have emphasized that awareness of sensitive attributes, such as race and gender, can be critical to detecting and remediating bias in complex models. HUD should be open to exploring how these emerging methods might help affirmatively further fair housing, rather than preemptively closing the door on them.
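A final sketch shows why. The basic measurements of a disparate impact inquiry are per-group comparisons, and every one of them requires knowing who belongs to which group. All data below are synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000

# Synthetic audit file: a model's decisions, actual outcomes, and,
# crucially, each individual's protected class status.
group = rng.integers(0, 2, n)
qualified = rng.random(n) < 0.5
approved = rng.random(n) < np.where(group == 0, 0.55, 0.42)  # skewed model

def audit(approved, qualified, group):
    """Per-group approval and error rates: the elementary measurements
    of a disparate impact inquiry. None is computable without `group`."""
    for g in np.unique(group):
        m = group == g
        sel = approved[m].mean()
        fnr = (~approved[m] & qualified[m]).sum() / qualified[m].sum()
        print(f"group {g}: approval rate {sel:.0%}, "
              f"denial rate among the qualified {fnr:.0%}")

audit(approved, qualified, group)
# Remove the `group` column and the audit becomes impossible: there is
# no disparity to measure without knowing who belongs to which group.
# Discouraging this data collection means discouraging testing itself.
```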