with Emily Black and Zara Hall, FAccT ’24

Recent regulatory efforts, including Executive Order 14110 and the AI Bill of Rights, have focused on mitigating discrimination in AI systems through both novel and traditional applications of anti-discrimination law. While these initiatives rightly emphasize fairness testing and mitigation, we argue that they pay insufficient attention to the robustness of bias measurement and mitigation, and that without such robustness these frameworks cannot effectively achieve their goal of reducing discrimination in deployed AI models. This oversight is particularly concerning given growing evidence in the algorithmic fairness literature that current bias mitigation and fairness optimization methods are unstable and brittle. This instability heightens the risk of what we term discrimination-hacking, or d-hacking: a scenario in which selecting models for favorable fairness metrics on a specific sample, whether inadvertently or deliberately, yields fairness performance that is misleading and fails to generalize. We call this effect d-hacking because systematically searching among many models for the one that appears least discriminatory parallels p-hacking in social science research, where selectively reporting outcomes that appear statistically significant produces misleading conclusions. In light of these challenges, we argue that AI fairness regulation should not only call for fairness measurement and bias mitigation but also specify methods for ensuring that solutions to discrimination in AI systems are robust. To make this case, the paper (1) synthesizes evidence of d-hacking in the computer science literature and provides experimental demonstrations of d-hacking, (2) analyzes how current legal frameworks, both recent AI regulation proposals and traditional U.S. discrimination law, treat robust fairness and non-discriminatory behavior, and (3) outlines policy recommendations for preventing d-hacking in high-stakes domains.
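
The selection effect behind d-hacking can be illustrated with a minimal simulation, which is a hypothetical sketch rather than the paper's actual experiments: all data, model choices, and helper names (`make_data`, `dp_gap`) below are assumptions for illustration. The idea is to train many near-equivalent models, pick the one with the best fairness metric on one sample, and then check that metric on fresh data.

```python
# Hypothetical sketch of d-hacking: select among many models by their
# fairness metric on one sample, then test whether the apparent fairness
# generalizes to a fresh sample from the same distribution.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic data: a binary protected attribute plus noisy features."""
    group = rng.integers(0, 2, n)                      # protected attribute (0/1)
    x = rng.normal(size=(n, 5)) + 0.3 * group[:, None]
    y = (x[:, 0] + 0.5 * group + rng.normal(size=n) > 0).astype(int)
    return x, y, group

def dp_gap(model, x, group):
    """Demographic-parity gap: |P(yhat=1 | g=1) - P(yhat=1 | g=0)|."""
    yhat = model.predict(x)
    return abs(yhat[group == 1].mean() - yhat[group == 0].mean())

x, y, g = make_data(4000)
x_tr, x_val, y_tr, y_val, g_tr, g_val = train_test_split(
    x, y, g, test_size=0.25, random_state=1)
x_test, y_test, g_test = make_data(4000)               # fresh sample

# Train many near-equivalent models (bootstrap resamples of the training
# data) and keep the one with the smallest fairness gap on the validation
# sample -- the d-hacking step.
models = []
for _ in range(50):
    idx = rng.integers(0, len(x_tr), len(x_tr))        # bootstrap resample
    models.append(LogisticRegression(max_iter=1000).fit(x_tr[idx], y_tr[idx]))

best = min(models, key=lambda m: dp_gap(m, x_val, g_val))
print(f"selected model, validation gap: {dp_gap(best, x_val, g_val):.3f}")
print(f"selected model, fresh-test gap: {dp_gap(best, x_test, g_test):.3f}")
# The selected model's validation gap is a minimum over 50 noisy estimates,
# so it is biased downward; the gap on fresh data is typically larger.
```

Because the reported gap is the minimum over many noisy estimates rather than an unbiased measurement, the selected model's fairness on the selection sample systematically overstates its fairness on new data, which is the same selection pathology that drives p-hacking.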