Filtering as Weak Unlearning for Black-Box Generative Models
Generative models can output unwanted and even harmful images. To tackle this problem, a recent line of work explores how to unlearn such content so that the model stops generating these outputs. However, in several downstream tasks these models are used as black boxes (i.e., the model's architecture, parameters, and underlying training data are unknown). In this setting, unlearning the model becomes much harder, and blocking is the only practical way to prevent undesired outputs from being shown. In this work, we devise a way to block a generative model's outputs by filtering in latent space.
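To illustrate the general idea of latent-space blocking (not the paper's specific method), the following is a minimal sketch: each generated sample's latent representation is compared against embeddings of blocked concepts, and samples that lie too close to any blocked direction are rejected. All names, the cosine-similarity criterion, and the threshold value here are illustrative assumptions.

```python
import numpy as np


def cosine_sim(a, b):
    """Cosine similarity between two latent vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def filter_latents(latents, blocked_directions, threshold=0.8):
    """Keep only latents whose similarity to every blocked concept
    direction stays below the threshold (hypothetical criterion)."""
    kept = []
    for z in latents:
        if all(cosine_sim(z, d) < threshold for d in blocked_directions):
            kept.append(z)
    return kept


# Toy demonstration with random 64-dimensional latents.
rng = np.random.default_rng(0)
blocked = [rng.standard_normal(64)]  # one blocked concept direction
latents = [rng.standard_normal(64) for _ in range(10)] + [blocked[0]]
safe = filter_latents(latents, blocked)
```

Because the filter operates only on latent vectors, it requires no access to the model's architecture or parameters, which is what makes this kind of blocking viable in the black-box setting.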