🏁 Concept Forgetting via Label Annealing
In submission, 2024
Recommended citation: https://openreview.net/forum?id=eTba2zVEdO#discussion
The effectiveness of current machine learning models relies on their ability to grasp the diverse concepts present in their training data. However, biased or noisy data can inadvertently cause these models to favor certain concepts, undermining their ability to generalize and provide utility. Consequently, modifying a trained model so that its predictions are independent of such concepts becomes imperative for responsible deployment. We refer to this problem as concept forgetting. The primary aim of this study is to develop methodologies for forgetting specific undesired concepts from a pre-trained classification model's predictions. To achieve this goal, we present an iterative algorithm called Label ANnealing (LAN), which employs a two-stage method at each iteration. In the first stage, pseudo-labels are assigned to the samples by annealing, or redistributing, the original labels based on the current iteration's model predictions over all samples in the dataset; in the second stage, the model is fine-tuned on the dataset with these pseudo-labels. We illustrate the effectiveness of the proposed algorithm across various models and datasets, demonstrating favorable results while maintaining computational efficiency and preserving the model's generalization capabilities. Compared to the baseline method, at a given accuracy our method reduces concept violation by approximately 26.6-28.5% on MNIST, 28.5-33.3% on CIFAR-10, 28.5-35.7% on miniImageNet, and 17.6-76.4% on CelebA. ⏩ Full Paper
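To make the two-stage loop concrete, here is a minimal PyTorch sketch of one LAN iteration, assuming a standard classifier and a fixed-order (non-shuffled) data loader. The function name `label_annealing_step`, the blending coefficient `alpha`, and the specific annealing rule (interpolating each one-hot label toward the model's current prediction) are illustrative assumptions based on the abstract, not the paper's exact redistribution scheme, which would also use the annotations of the concept being forgotten.

```python
# Illustrative sketch of one LAN iteration (assumptions noted above):
#   Stage 1: anneal original labels toward current model predictions.
#   Stage 2: fine-tune the model on the resulting soft pseudo-labels.
import torch
import torch.nn.functional as F

def label_annealing_step(model, loader, optimizer, num_classes,
                         alpha=0.5, device="cpu"):
    # Stage 1: compute predictions over the whole dataset and blend them
    # with the original one-hot labels. `alpha` (a hypothetical knob)
    # controls how far labels are redistributed toward the predictions.
    model.eval()
    pseudo_labels = []
    with torch.no_grad():
        for x, y in loader:  # loader must iterate in a fixed order
            probs = F.softmax(model(x.to(device)), dim=1)
            one_hot = F.one_hot(y.to(device), num_classes).float()
            pseudo_labels.append((1.0 - alpha) * one_hot + alpha * probs)

    # Stage 2: fine-tune on the soft pseudo-labels with a KL objective.
    model.train()
    for (x, _), target in zip(loader, pseudo_labels):
        optimizer.zero_grad()
        log_probs = F.log_softmax(model(x.to(device)), dim=1)
        loss = F.kl_div(log_probs, target, reduction="batchmean")
        loss.backward()
        optimizer.step()
```

Repeating this step over several iterations (optionally decaying `alpha`) mirrors the iterative annealing described above; the paper's actual rule presumably shapes the redistribution so that predictions become independent of the undesired concept, so consult the full paper for the exact scheme.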