The Center for Trustworthy Machine Learning is focused on three interconnected, parallel thrusts that address the principal ways ML systems are attacked: inference-time attacks, training-time attacks, and abuse of ML capabilities.
- The first thrust explores inference-time security: defending a trained model against adversarial inputs. This effort focuses on developing formally grounded measures of robustness to adversarial examples (defenses), as well as algorithms for generating them (attacks).
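To make the attack side of this thrust concrete, here is a minimal sketch of one well-known adversarial example generation algorithm, the Fast Gradient Sign Method (FGSM), applied to a toy logistic-regression model. This is an illustrative example only, not the Center's method; the model, weights, and `fgsm_perturb` helper are all hypothetical.

```python
import numpy as np

def fgsm_perturb(x, grad_wrt_x, eps):
    """FGSM: take one step of size eps in the sign of the loss gradient."""
    return x + eps * np.sign(grad_wrt_x)

# Toy logistic-regression model with loss = log(1 + exp(-y * w.x))
w = np.array([1.0, -2.0, 0.5])   # hypothetical fixed weights
x = np.array([0.2, 0.1, -0.3])   # a benign input
y = 1.0                          # its true label

def loss(x):
    return np.log1p(np.exp(-y * w.dot(x)))

# Gradient of the loss w.r.t. the input: d(loss)/dx = -y * sigmoid(-y*w.x) * w
margin = y * w.dot(x)
grad_x = -y * (1.0 / (1.0 + np.exp(margin))) * w

# The perturbed input incurs a strictly higher loss than the original
x_adv = fgsm_perturb(x, grad_x, eps=0.1)
```

A defense in this setting would aim to certify that no perturbation within the eps-ball can change the model's prediction.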
- The second thrust explores robustness during training. Here, the main goal is to develop rigorously grounded measures of robustness to attacks that corrupt the training data, achieved by developing new training techniques that resist such manipulation.
- The third thrust explores the broader security implications of sophisticated ML algorithms. The Center PIs study generative ML models, such as models that produce fake content or data, and develop ways to distinguish such content from real content. They also explore mechanisms to prevent the theft (extraction) of a machine learning model by an adversary who interacts with it.
The Center is also building and distributing an extensive evaluation platform that lets the Center PIs and other investigators test the effectiveness of new attacks and defenses on a variety of datasets.
Details of the Center portal, including upcoming tools and open-source code, will be announced later.