Towards Universal Adversarial Examples and Defenses

Authors

Adnan Rakin, Arizona State University
Ye Wang, Mitsubishi Electric Research Laboratories
Shuchin Aeron, Tufts University
Toshiaki Koike-Akino, Mitsubishi Electric Research Laboratories (MERL)
Pierre Moulin, University of Illinois at Urbana-Champaign
Kieran Parsons, Mitsubishi Electric Research Laboratories

Abstract

Adversarial examples have recently exposed the severe vulnerability of neural network models. However, most existing attacks require some form of information about the target model (i.e., its weights, its architecture, or the ability to query it) to improve the efficacy of the attack. We leverage the information-theoretic connections between robust learning and generalized rate-distortion theory to formulate a universal adversarial example (UAE) generation algorithm. Our algorithm trains an adversarial generator offline to minimize the mutual information between the label and the perturbed data. At inference time, our UAE method generates effective adversarial examples efficiently, without high computational cost. These adversarial examples in turn enable the development of universal defenses through adversarial training. Our experiments demonstrate promising gains in the training efficiency of conventional adversarial training.
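To make the quantity being minimized concrete, the following is a minimal toy sketch, not the authors' algorithm. It only estimates the empirical mutual information between a discrete feature and a label, before and after a hand-picked perturbation. The function name `mutual_information`, the toy data, and the perturbation are all assumptions made for illustration; the sketch uses no generator network and ignores the distortion budget that a rate-distortion formulation would impose.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Empirical mutual information in bits from (feature, label) samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(pairs)
    joint = Counter(pairs)
    p_feature = Counter(x for x, _ in pairs)
    p_label = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (p_feature[x] * p_label[y]))
    return mi


# Toy data: a one-bit feature that perfectly reveals a binary label.
clean = [(0, 0)] * 50 + [(1, 1)] * 50

# Hand-picked perturbation (an assumption): flip the feature on half of the
# samples in each class, so the feature no longer depends on the label.
perturbed = [(0, 0)] * 25 + [(1, 0)] * 25 + [(1, 1)] * 25 + [(0, 1)] * 25

mi_clean = mutual_information(clean)
mi_perturbed = mutual_information(perturbed)
print(f"I(label; clean feature)     = {mi_clean:.3f} bits")
print(f"I(label; perturbed feature) = {mi_perturbed:.3f} bits")
```

In this toy case the perturbation removes all label information from the feature. A practical attack would have to trade that reduction off against a bound on how far the perturbed data may move from the original.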
