2024 ACM Symposium on Computer Science and Law
This paper examines an approach to algorithmic discrimination that seeks to blind predictions to protected characteristics by orthogonalizing inputs. The approach uses protected characteristics (such as race or sex) during a model's training phase but masks them at deployment. The rationale is that including these characteristics in training prevents correlated features from acting as proxies, while assigning them uniform values at deployment ensures that decisions do not vary with group status.
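A minimal sketch of this orthogonalization recipe is shown below, assuming a tabular loan dataset with a hypothetical outcome column "default" and illustrative protected-attribute columns; the column names, the L1-penalized logistic estimator, and the choice of masking value are assumptions made for illustration, not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): train with protected characteristics
# included, then hold them at a uniform value when scoring new applicants.
import pandas as pd
from sklearn.linear_model import LogisticRegression


def train_with_protected(df: pd.DataFrame, features: list, protected: list) -> LogisticRegression:
    """Fit an L1-penalized (lasso-style) logistic model on all features,
    including the protected characteristics."""
    X = df[features + protected]
    y = df["default"]
    model = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    model.fit(X, y)
    return model


def predict_masked(model: LogisticRegression, df: pd.DataFrame,
                   features: list, protected: list):
    """At deployment, overwrite every protected column with a single uniform
    value so the prediction cannot vary with an applicant's group status."""
    X = df[features + protected].copy()
    for col in protected:
        X[col] = X[col].mean()  # one possible uniform value; a fixed constant also works
    return model.predict_proba(X)[:, 1]
```

Any single value applied to every applicant achieves the blinding; the training-sample mean is used here purely for concreteness.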
Using a loan-default prediction exercise based on HMDA mortgage data and German credit data, the paper highlights the limitations of this orthogonalization strategy. Applying a lasso model, it demonstrates that the selection of, and the weights placed on, protected characteristics are inconsistent across fitted models. At the deployment stage, when uniform values for race or sex are supplied to the model, these variations across models lead to meaningful differences in outcomes and resulting disparities.
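The sketch below gives a hedged illustration of this instability on synthetic data (a stand-in, not the paper's HMDA or German credit analysis): an L1-penalized model is refit on bootstrap resamples of the same data, and both the coefficient placed on the protected attribute and the deployment-stage prediction made under a uniform value for that attribute are compared across fits. All variable names and parameter values are assumptions.

```python
# Synthetic demonstration: lasso-style feature selection can allocate weight
# differently across resamples, so masked deployment predictions differ too.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
race = rng.integers(0, 2, n)                 # hypothetical protected attribute
proxy = race + rng.normal(0, 0.5, n)         # feature highly correlated with it
income = rng.normal(0, 1, n)
logits = -1.0 + 0.8 * proxy - 0.6 * income   # outcome driven by the proxy
y = rng.binomial(1, 1 / (1 + np.exp(-logits)))
X = np.column_stack([income, proxy, race])

coefs_on_race, masked_preds = [], []
x_new = np.array([[0.0, 1.0, race.mean()]])  # one applicant, race set to a uniform value
for b in range(20):
    idx = rng.integers(0, n, n)              # bootstrap resample
    m = LogisticRegression(penalty="l1", solver="liblinear", C=0.05)
    m.fit(X[idx], y[idx])
    coefs_on_race.append(m.coef_[0][2])
    masked_preds.append(m.predict_proba(x_new)[0, 1])

# Because the penalty can shift weight between race and the correlated proxy
# from one resample to the next, the masked score for the same applicant
# varies depending on which fitted model is deployed.
print("coef on race:", np.round(coefs_on_race, 3))
print("masked prediction range:", round(min(masked_preds), 3), "-", round(max(masked_preds), 3))
```

In this toy setting the instability comes only from resampling noise; in real credit data, differences in specification, preprocessing, or regularization strength can play the same role.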
The core challenge is that orthogonalization assumes the model accurately estimates the relationship between protected characteristics and outcomes, and that this relationship can be isolated and neutralized at deployment. In reality, when correlations among features are pervasive and predictions are constrained by regularization, feature selection can be unstable, driven by what happens to predict most efficiently rather than by the true underlying relationships. This analysis casts doubt on continued reliance on input scrutiny as a strategy in discrimination law and cautions against the myth of algorithmic colorblindness.
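To make this mechanism concrete, consider a stylized penalized regression (an illustration under assumed notation, not the paper's specification) in which a proxy feature $p_i$ nearly duplicates the protected characteristic $r_i$:

$$
\hat{\beta} = \arg\min_{\beta}\; \sum_{i} \bigl(y_i - \beta_r r_i - \beta_p p_i - \beta_x^\top x_i\bigr)^2 \;+\; \lambda\bigl(\lvert\beta_r\rvert + \lvert\beta_p\rvert + \lVert \beta_x \rVert_1\bigr)
$$

When $p_i \approx r_i$, many splits of the combined group effect between $\beta_r$ and $\beta_p$ fit nearly equally well, and the $\ell_1$ penalty tends to concentrate the weight on one of the two somewhat arbitrarily. Masking $r_i$ with a uniform value at deployment then neutralizes only whatever share of the effect happened to land on $\beta_r$, which mirrors the inconsistent selection and weights documented above.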