Distribution distance measures in generative and privacy models

Diesendruck, Maurice

Distribution distance measures provide a useful class of tools for generative and privacy models. In both cases, the goal is to simulate a data distribution without revealing too much about individual points. While early generative models focused on matching data component-wise, the models in this work incorporate distribution metrics to provide population-level information during training. Doing so reduces overfitting and increases the model's ability to generalize. Maximum mean discrepancy and energy distance are two such metrics that are easily defined and implemented over samples, and that provide meaningful results on a range of data sets and data types. This work presents three main contributions: (1) a novel use of importance weights to modify the output distribution of a generative model, (2) an application and evaluation of a generative model for medical data privacy, and (3) a novel method for private data synthesis using support points and differential privacy.
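
To illustrate how maximum mean discrepancy and energy distance can be computed directly from samples, the following is a minimal sketch in plain Python. It is not the implementation used in this work. The Gaussian kernel, the bandwidth value, the synthetic two-dimensional data, and all function names are assumptions chosen for illustration. Both estimators are the simple biased (V-statistic) forms.

```python
import math
import random


def mean_pairwise(a, b, f):
    """Average of f(p, q) over all pairs with p in a and q in b."""
    return sum(f(p, q) for p in a for q in b) / (len(a) * len(b))


def energy_distance(x, y):
    """Sample estimate of 2 E|X - Y| - E|X - X'| - E|Y - Y'| (Euclidean norm)."""
    return (2.0 * mean_pairwise(x, y, math.dist)
            - mean_pairwise(x, x, math.dist)
            - mean_pairwise(y, y, math.dist))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased sample estimate of squared MMD.

    Assumption: a Gaussian kernel with a fixed, hypothetical bandwidth.
    """
    def kernel(p, q):
        return math.exp(-math.dist(p, q) ** 2 / (2.0 * bandwidth ** 2))

    return (mean_pairwise(x, x, kernel)
            + mean_pairwise(y, y, kernel)
            - 2.0 * mean_pairwise(x, y, kernel))


def gaussian_sample(rng, n, mean):
    """Draw n two-dimensional points with unit-variance Gaussian coordinates."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(n)]


# Synthetic data for illustration only.
rng = random.Random(0)
x = gaussian_sample(rng, 100, 0.0)
y_same = gaussian_sample(rng, 100, 0.0)     # same distribution as x
y_shifted = gaussian_sample(rng, 100, 2.0)  # mean shifted away from x

print("energy distance, same distribution:   ", energy_distance(x, y_same))
print("energy distance, shifted distribution:", energy_distance(x, y_shifted))
print("squared MMD, same distribution:       ", mmd_squared(x, y_same))
print("squared MMD, shifted distribution:    ", mmd_squared(x, y_shifted))
```

Both quantities are zero when a sample is compared with itself and grow as the two samples move apart. Each is built from averages over the whole sample rather than from individual points, which is the population-level signal described above.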