Testing Deep Image Classifiers Using Generative Machine Learning

Abstract

Although deep neural networks (DNNs) attain excellent performance on the specific tasks they are trained for, this performance often seems to be obtained by relying on easier-to-learn proxies for the truly relevant concepts. The problem with proxies is that they cannot be relied on in new situations: there, the proxy departs from the true concept. And DNNs will very likely be deployed in new situations, because the world is always changing and training data are never exhaustive. Unfortunately, existing approaches are limited in their ability to diagnose the bad proxies that a DNN relies on. Nor can we accurately characterise a model’s generalisation performance on kinds of data beyond its training task. These limitations in turn prevent us from developing systems that do not rely on bad proxies. To improve this situation, this thesis introduces two new test generation procedures for image classification DNNs. These improve on existing approaches by identifying more kinds of inputs for which a particular DNN gives incorrect outputs. This is achieved by exploiting generative machine learning to solve the test oracle problem in new ways. The first procedure trains a generative network to directly output test cases that identify failures. The second dynamically perturbs the activation values of a pretrained generative network as it generates new examples; the perturbations adjust the features of the generated data so that they also induce failures in the DNN being evaluated. Besides the primary contribution of these algorithms, this thesis also presents an empirical finding: standard adversarial training, which aims to increase model robustness, surprisingly decreases DNNs’ ability to generalise correctly to changes in high-level features such as object position, orientation, shape or colour.
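The second procedure described above can be illustrated with a minimal sketch. This is not the thesis's actual method: the networks here are tiny random-weight stand-ins rather than trained models, the perturbation is found by naive random search rather than the thesis's dynamic adjustment, and a real system would also need an oracle to confirm that the perturbed output remains a valid example of its class. The sketch only shows the core loop: perturb a generator's hidden activations until the generated example makes the classifier under test change its answer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: a tiny "pretrained generator" (latent -> image)
# and a "classifier under test", both with random weights for illustration.
W1 = rng.normal(size=(8, 16))   # generator layer 1 (latent -> hidden)
W2 = rng.normal(size=(16, 4))   # generator layer 2 (hidden -> 4-pixel "image")
Wc = rng.normal(size=(4, 2))    # classifier weights (2 classes)

def generate(z, activation_shift=0.0):
    # Perturbing the hidden activations changes the *features* of the
    # generated example, not just adding pixel-level noise to the output.
    h = np.tanh(z @ W1) + activation_shift
    return np.tanh(h @ W2)

def classify(x):
    return int(np.argmax(x @ Wc))

z = rng.normal(size=8)
original_label = classify(generate(z))

# Search over small activation perturbations until the classifier's
# output flips: the perturbed generation is a candidate failure case.
failure = None
for _ in range(1000):
    shift = rng.normal(scale=0.5, size=16)
    candidate = generate(z, activation_shift=shift)
    if classify(candidate) != original_label:
        failure = candidate
        break
```

In the thesis's setting the search would be guided (not random) and constrained so that a human or oracle still assigns the perturbed example its original class, making the classifier's changed answer a genuine failure.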

Publication
Testing deep image classifiers using generative machine learning
Isaac Dunn
Century Fellow