Debiasing convolutional neural networks via Meta Orthogonalization
Date
2020-09-04
Authors
David, Kurtis Evan Alejo
Abstract
As deep learning is deployed in a growing number of applications, we must consider possible shortcomings of these models, such as bias towards protected attributes in datasets. In this work, we focus on debiasing convolutional neural networks (CNNs) through our proposed Meta Orthogonalization algorithm. We draw on past work on debiasing word embeddings and on the interpretability literature to force image concepts learned by a CNN to be orthogonal to a bias direction. We show empirically, through a suite of controlled bias experiments, that this improves the fairness of CNNs to a degree comparable to adversarial debiasing. We hope that this leads to new directions in debiasing and understanding deep learning models.
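
To make the idea of concepts being "orthogonal to a bias direction" concrete, the following is a minimal sketch and not the Meta Orthogonalization algorithm itself. It assumes that concepts and the bias direction are available as plain vectors, and it uses two illustrative helpers that do not come from the thesis: `orthogonality_penalty`, which averages the squared cosine similarity between each concept vector and the bias direction, and `project_out`, a projection step in the style of word-embedding debiasing. All names and the toy vectors are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def orthogonality_penalty(concepts, bias_direction):
    """Mean squared cosine similarity between each concept and the bias direction.

    The value is 0 when every concept vector is orthogonal to the bias direction
    and 1 when every concept vector is parallel to it.
    """
    b = unit(bias_direction)
    return sum(dot(unit(c), b) ** 2 for c in concepts) / len(concepts)


def project_out(concept, bias_direction):
    """Remove the component of a concept vector that lies along the bias direction."""
    b = unit(bias_direction)
    scale = dot(concept, b)
    return [a - scale * bi for a, bi in zip(concept, b)]


# Toy data (hypothetical): three concept vectors and one bias direction.
concepts = [[1.0, 2.0, 0.5], [0.0, 1.0, -1.0], [3.0, 0.0, 1.0]]
bias_direction = [1.0, 1.0, 0.0]

before = orthogonality_penalty(concepts, bias_direction)
debiased = [project_out(c, bias_direction) for c in concepts]
after = orthogonality_penalty(debiased, bias_direction)
```

In this sketch `before` is clearly positive while `after` is zero up to floating-point error. A training-time method would more plausibly add a penalty of this kind to the loss so that the network learns such representations, rather than projecting vectors after the fact; that reading is an assumption based on the abstract alone.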