Leo Schwinn (M.Sc.), Dr. Dario Zanca, Prof. Dr. Björn Eskofier
05/2021 – 11/2021
Deep neural networks (DNNs) are nowadays used in a variety of applications, such as image classification, speech recognition, and autonomous driving. However, recent studies have shown that DNNs are vulnerable to adversarial examples. In image classification, adversarial examples are images obtained by applying small perturbations that are barely perceptible to humans but cause the network to predict a different output with high confidence [2, 1].
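The idea of such a perturbation can be illustrated with a minimal sketch of a fast-gradient-sign-style attack on a toy linear classifier standing in for a DNN; all weights, inputs, and the hinge-style gradient below are illustrative assumptions, not part of the thesis setup:

```python
import numpy as np

# Toy linear classifier standing in for a DNN (illustrative values only).
w = np.array([1.0, -2.0, 0.5])      # fixed "model" weights
x = np.array([0.5, 0.1, 0.2])       # clean input, true label y = +1
y = 1.0

def predict(x):
    """Sign of the linear score, i.e. the predicted class (+1 or -1)."""
    return 1.0 if w @ x > 0 else -1.0

# Gradient of a hinge-style loss w.r.t. the input (analytic for a linear model).
grad_x = -y * w

# Fast gradient sign step: a small L_inf-bounded perturbation of the input.
eps = 0.3
x_adv = x + eps * np.sign(grad_x)
```

Here the perturbation is bounded by `eps` in every pixel, yet it is enough to flip the prediction on the clean input; with a real DNN, `grad_x` would be obtained by backpropagation through the network.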
There are several approaches to increase the robustness of a DNN. The network can, for example, be trained with adversarial training, where the training data is augmented with adversarial examples until the network assigns the correct classes to them [5, 7]. Rakin et al. proposed a method that combines adversarial training with Parametric Noise Injection (PNI), where additional trainable Gaussian noise is injected into each layer [4]. A possible drawback of this approach is that the noise parameters tend to become quite small, making the noise injection less effective over time [3]. In preliminary experiments we also noticed that perturbing the adversarial attacks during the training of the network leads to weaker attacks and thus to defenses that are less robust against strong attacks.
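A minimal NumPy sketch of the PNI idea, with the function name and values as assumptions (the actual method of Rakin et al. trains alpha jointly with the network weights, which is omitted here), also makes the mentioned drawback concrete: once a trained alpha shrinks to zero, the injection has no effect at all.

```python
import numpy as np

rng = np.random.default_rng(42)

def pni_forward(activation, alpha):
    """Perturb a layer activation with Gaussian noise scaled by the
    (normally trainable) parameter alpha, in the spirit of PNI."""
    noise = rng.standard_normal(activation.shape)
    return activation + alpha * noise

a = rng.standard_normal((4, 8))              # stand-in layer activation
out_noisy = pni_forward(a, alpha=0.25)       # effective noise injection
out_degenerate = pni_forward(a, alpha=0.0)   # shrunken alpha: no injection left
```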
In this thesis, a noise injection approach for adversarial training will be implemented. This includes training a neural network, or using an already trained robust network, and afterwards training a learnable pixel-wise noise injection layer in front of the network. It will be evaluated whether this approach can increase the robustness of a DNN against adversarial attacks. Additionally, human visual saliency information is combined with the noise injection to determine whether this can further improve the robustness of neural networks. With saliency information, noise can be added to each pixel individually depending on whether it belongs to the foreground or the background of the image. This approach aims to weaken adversarial attacks in the background of the image while leaving the foreground mostly untouched. For this, the knowledge about visual attention from the COCO data set [6] is used, and in a second step the results based on the COCO annotations are compared to an approach that obtains the saliency information with an automatic method, e.g. [8].
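The proposed layer can be sketched as follows; the function name, the pixel-wise noise scale `sigma` (fixed here, learnable in the thesis), and the exact modulation rule are assumptions for illustration, not the final thesis design:

```python
import numpy as np

rng = np.random.default_rng(7)

def saliency_noise_layer(image, sigma, saliency):
    """Add per-pixel Gaussian noise, attenuated where saliency is high.

    image:    H x W input image
    sigma:    H x W pixel-wise noise scale (learnable in the proposed approach)
    saliency: H x W map in [0, 1], e.g. from COCO annotations or a saliency model
    """
    noise = rng.standard_normal(image.shape)
    # Background pixels (low saliency) receive strong noise to weaken attacks
    # there; foreground pixels (high saliency) are mostly left untouched.
    return image + (1.0 - saliency) * sigma * noise

H, W = 8, 8
image = rng.standard_normal((H, W))
sigma = np.full((H, W), 0.5)
saliency = np.zeros((H, W))
saliency[2:6, 2:6] = 1.0   # toy binary foreground box

out = saliency_noise_layer(image, sigma, saliency)
```

In this toy setting the foreground box passes through unchanged, while every background pixel is perturbed; with a graded saliency map the transition would be smooth instead of binary.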
[1] Madry, Aleksander, et al.: Towards Deep Learning Models Resistant to Adversarial Attacks. 2019.
[2] Goodfellow, Ian J., et al.: Explaining and Harnessing Adversarial Examples. 2015.
[3] Jeddi, Ahmadreza, et al.: Learn2Perturb: an End-to-end Feature Perturbation Learning to Improve Adversarial Robustness. 2020.
[4] Rakin, Adnan Siraj, et al.: Parametric Noise Injection: Trainable Randomness to Improve Deep Neural Network Robustness against Adversarial Attack. 2018.
[5] Schott, Lukas, et al.: Towards the first adversarially robust neural network model on MNIST. 2018.
[6] Lin, Tsung-Yi, et al.: Microsoft COCO: Common Objects in Context. 2015.
[7] Athalye, Anish, et al.: Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. 2018.
[8] Itti, Laurent, et al.: A model of saliency-based visual attention for rapid scene analysis. 1998.