Part of Advances in Neural Information Processing Systems 32 (NeurIPS 2019)
Roi Livni, Yishay Mansour
A basic question in learning theory is to identify whether two distributions are identical when we have access only to examples sampled from each distribution. This basic task is considered, for example, in the context of Generative Adversarial Networks (GANs), where a discriminator is trained to distinguish between a real-life distribution and a synthetic distribution. Classically, we use a hypothesis class $H$ and claim that the two distributions are distinct if for some $h \in H$ the expected value on the two distributions is (significantly) different.

Our starting point is the following fundamental question: is it beneficial for the hypothesis to depend on more than a single random example? To address this challenge we define $k$-ary based discriminators, which have a family of Boolean $k$-ary functions $\mathcal{G}$. Each function $g \in \mathcal{G}$ naturally defines a hyper-graph, indicating whether a given hyper-edge exists. A function $g \in \mathcal{G}$ distinguishes between two distributions if the expected value of $g$, on a $k$-tuple of i.i.d. examples, differs (significantly) between the two distributions.

We study the expressiveness of families of $k$-ary functions compared to the classical hypothesis class $H$, which corresponds to $k=1$. We show a separation in expressiveness between $(k+1)$-ary and $k$-ary functions. This demonstrates the great benefit of having $k \geq 2$ as distinguishers. For $k \geq 2$ we introduce a notion similar to the VC-dimension and show that it controls the sample complexity. We proceed to provide upper and lower bounds as a function of our extended notion of VC-dimension.
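To make the $k$-ary discriminator concrete, the following is a minimal illustrative sketch (not the paper's construction) of estimating $\mathbb{E}[g(X_1,\dots,X_k)]$ on $k$-tuples of i.i.d. samples. It uses an assumed 2-ary "collision" function $g(x,y) = \mathbb{1}[x = y]$, whose expectation is the collision probability of a distribution, and two hypothetical example distributions `p` and `q`; all names here are for illustration only.

```python
import random

def estimate_discriminator(g, sample, k, trials=100_000, seed=0):
    """Monte Carlo estimate of E[g(X_1, ..., X_k)] where the X_i are
    drawn i.i.d. via the callable `sample`."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += g(*(sample(rng) for _ in range(k)))
    return total / trials

# 2-ary "collision" discriminator: fires when two i.i.d. draws coincide.
g_collision = lambda x, y: int(x == y)

# Two example distributions (uniform over supports of different sizes).
p = lambda rng: rng.randrange(2)   # uniform over {0, 1}
q = lambda rng: rng.randrange(4)   # uniform over {0, 1, 2, 3}

est_p = estimate_discriminator(g_collision, p, k=2)  # ≈ 0.5
est_q = estimate_discriminator(g_collision, q, k=2)  # ≈ 0.25
```

Since the two estimates differ significantly, this single 2-ary function witnesses that `p` and `q` are distinct; the $k=1$ analogue would instead compare $\mathbb{E}[h(X)]$ for unary $h \in H$.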