Google announced TensorFlow Privacy, a library for its TensorFlow machine learning framework intended to make it easier for developers to train AI models with strong privacy guarantees. It’s available in open source, and requires “no expertise in privacy” or underlying mathematics, Google says. Moreover, developers using standard TensorFlow mechanisms shouldn’t have to change their model architectures, training procedures, or processes.
It follows hot on the heels of TensorFlow 2.0 alpha, which was also announced today.
“Modern machine learning is increasingly applied to create amazing new technologies and user experiences, many of which involve training machines to learn responsibly from sensitive data, such as personal photos or email,” Google wrote in a Medium post. “We intend for TensorFlow Privacy to develop into a hub of best-of-breed techniques for training machine-learning models with strong privacy guarantees.”
TensorFlow Privacy operates on the principle of differential privacy, according to Google, a statistical technique that aims to maximize a model’s accuracy while limiting what it can reveal about any individual user’s data. To ensure this, it optimizes models using a modified stochastic gradient descent (the iterative method for optimizing the objective functions in AI systems) that clips each update induced by a training data example, averages the clipped updates together, and adds noise to the final average.
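The clip, average, and add-noise procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration of the general technique, not TensorFlow Privacy’s own code or API: the function name `dp_sgd_step` and the parameters `l2_norm_clip` and `noise_multiplier` are assumptions chosen for this example, as is the use of Gaussian noise scaled to the clipping bound.

```python
import numpy as np


def dp_sgd_step(weights, per_example_grads, lr=0.1, l2_norm_clip=1.0,
                noise_multiplier=1.1, rng=None):
    """One illustrative differentially private SGD step (hypothetical helper).

    per_example_grads has shape (batch_size, num_weights): one gradient
    ("update") per training example.
    """
    rng = np.random.default_rng() if rng is None else rng
    grads = np.asarray(per_example_grads, dtype=float)

    # 1. Clip each per-example gradient so its L2 norm is at most l2_norm_clip.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, l2_norm_clip / np.maximum(norms, 1e-12))
    clipped = grads * scale

    # 2. Sum the clipped gradients and add Gaussian noise whose standard
    #    deviation is proportional to the clipping bound (an assumption here).
    noise = rng.normal(0.0, noise_multiplier * l2_norm_clip, size=weights.shape)

    # 3. Average over the batch and take an ordinary gradient step.
    noisy_mean = (clipped.sum(axis=0) + noise) / len(grads)
    return weights - lr * noisy_mean


# Usage: two examples, one with a large gradient that gets clipped.
new_weights = dp_sgd_step(
    np.zeros(2),
    [[3.0, 4.0], [0.3, 0.4]],
    lr=1.0,
    rng=np.random.default_rng(0),
)
```

Clipping bounds how much any single example can move the weights, and the noise masks whatever influence remains. With `noise_multiplier=0.0` the step above reduces to plain SGD on the clipped gradients, which for this input yields weights of about [-0.45, -0.6].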
TensorFlow Privacy can prevent the memorization of rare details, Google says, and guarantee that two machine learning models are indistinguishable whether or not a user’s data was used in their training.
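The indistinguishability guarantee is usually stated formally as follows. The article does not give a formula, so this is the standard textbook definition of (ε, δ)-differential privacy rather than a quote from Google, and the symbols M (the training procedure), D and D′ (two training sets that differ only in one user’s data), and S (any set of possible trained models) are notation introduced here.

```latex
\Pr\bigl[M(D) \in S\bigr] \;\le\; e^{\varepsilon}\,\Pr\bigl[M(D') \in S\bigr] + \delta
```

The smaller ε and δ are, the less the trained model can depend on whether any one user’s data was included.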
“Ideally, the parameters of trained machine-learning models should encode general patterns rather than facts about specific training examples,” Google wrote. “Especially for deep learning, the additional guarantees can usefully strengthen the protections offered by other privacy techniques.”
TensorFlow Privacy comes after the open source debut of Intel’s HE-Transformer, a “privacy-preserving” tool that allows AI systems to operate on sensitive data. It’s a backend for nGraph, Intel’s neural network compiler, and is based on Microsoft Research’s Simple Encrypted Arithmetic Library (SEAL).