**Background:** The goal of this work is to identify the image features that are relevant to a neural network's prediction and to convey that information to the user by displaying a counterfactual image animation.

**The Latent Shift Method:** This method works with any pretrained encoder/decoder and any differentiable classifier. No special considerations are needed during model training. The approach aims for the exact opposite of an adversarial attack while using the same idea: perturb the input image so that the classifier reduces its prediction. If they just compute $\frac{\partial f}{\partial x}$ and move the pixels directly, they get an imperceptible difference, as in an adversarial attack. Using a decoder regularizes the transformation so that it only yields valid images.

The encoder takes the input image and encodes it into a latent representation $z$. The decoder then reconstructs the image, and the reconstruction is fed into the classifier. The gradient of the classifier output is computed with respect to $z$. Subtracting the gradient from $z$ and reconstructing the image generates a counterfactual.

https://i.imgur.com/iuZGUTH.gif

They found that changing the prediction by 30% produces good-looking images, so they run an iterative search along the vector defined by the gradient in the latent space until the prediction is reduced by 30%.

From this sequence, a 2D image similar to a traditional attribution map can be constructed by taking the maximum pixel-wise difference between every image in the sequence and the unperturbed reconstruction.

https://i.imgur.com/V3PCgXZ.png

The results look great!

https://i.imgur.com/DBki84c.gif

https://i.imgur.com/kFfQNKD.gif

To validate whether this approach can help spot false positive predictions, two radiologists were asked to evaluate how confident they were in a model's predictions.
For each image, radiologists viewed the prediction in two ways: using traditional methods or using the Latent Shift images. The traditional methods include the image gradient, guided backprop, and integrated gradients. The Latent Shift counterfactual includes the animation as well as the 2D version.

https://i.imgur.com/TlUBhzL.png

Ideally, the confidence ratings would all be 5 for true positives and all be 1 for false positives. What they observe, however, is that many false positives still cause high confidence in the model predictions, though not as much as the true positives.

Comparing the two methods, they find that for true positives the Latent Shift counterfactuals show a significant increase in confidence, which is good.

> 0.15±0.95 confidence increase using the Latent Shift method (p=0.01).

For false positives they find an increase in confidence, but it is not significant.

> 0.04±1.06 increase which is not significant (p=0.57)

**Conclusions:**

- Latent Shift's ability to generate counterfactuals is pretty good!
- Vanilla autoencoders are sufficient for some pathologies.
- StyleGAN and higher quality models should improve performance.
- IoU analysis may not be the best fit.
- Explainable AI methods can have an impact on the user confidence in the model.

(Disclaimer: I am the author of this work)

Project Website: https://mlmed.org/gifsplanation/
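
**A toy sketch of the procedure:** To make the Latent Shift steps summarized above concrete, here is a minimal, self-contained sketch in plain Python. It is not the project's implementation. The linear "decoder", the logistic "classifier", their weights, and all function names are hypothetical stand-ins chosen so the gradient can be written by hand; a real setup would use a pretrained autoencoder, a neural classifier, and automatic differentiation, and would obtain $z$ from the encoder. The sketch also assumes that "reduced by 30%" means a relative drop, that the gradient is computed once at the original $z$, and that the search uses a fixed step size.

```python
import math

# Hypothetical toy models: 4 "pixels" decoded from a 2-d latent code.
DECODER = [[1.0, 0.0],
           [0.0, 1.0],
           [0.5, 0.5],
           [1.0, -1.0]]
CLF_W = [0.8, -0.4, 0.6, 0.2]
CLF_B = 0.1


def decode(z):
    """Map a latent vector z to a flat image (list of pixel values)."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in DECODER]


def classify(x):
    """Logistic classifier: returns a prediction in (0, 1) for image x."""
    logit = sum(w * xi for w, xi in zip(CLF_W, x)) + CLF_B
    return 1.0 / (1.0 + math.exp(-logit))


def grad_wrt_latent(z):
    """Analytic gradient of classify(decode(z)) with respect to z."""
    p = classify(decode(z))
    # Chain rule: dp/dz_j = p * (1 - p) * sum_i w_i * D[i][j]
    return [p * (1.0 - p) * sum(CLF_W[i] * DECODER[i][j]
                                for i in range(len(CLF_W)))
            for j in range(len(z))]


def latent_shift(z, target_drop=0.3, step=0.5, max_steps=200):
    """Step along the negative latent gradient until the prediction
    has dropped by target_drop (relative to the unperturbed one)."""
    grad = grad_wrt_latent(z)          # computed once, at the original z
    base = classify(decode(z))
    frames = [decode(z)]               # frames[0] is the unperturbed reconstruction
    lam = 0.0
    for _ in range(max_steps):
        if classify(frames[-1]) <= (1.0 - target_drop) * base:
            break
        lam += step
        shifted = [zi - lam * gi for zi, gi in zip(z, grad)]
        frames.append(decode(shifted))
    return frames, lam


def attribution_map(frames):
    """Per-pixel maximum absolute difference to the unperturbed reconstruction."""
    base = frames[0]
    return [max(abs(f[i] - base[i]) for f in frames) for i in range(len(base))]


# Usage: the frames form the counterfactual "animation"; the map is the 2D summary.
frames, lam = latent_shift([1.0, 0.5])
print(lam, classify(frames[0]), classify(frames[-1]))
print(attribution_map(frames))
```

Because the shift happens in latent space and every frame passes through the decoder, each intermediate image stays within what the decoder can produce, which is the regularization the summary describes. The step size and the stopping threshold are tunable knobs in this sketch, not values taken from the paper beyond the 30% figure.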
