One of the primary theoretical tools in machine learning is the Vapnik-Chervonenkis dimension (VC dimension), which measures the maximum number of data points that a hypothesis set can shatter, that is, label in all possible ways. This concept is used primarily in assessing how well classification algorithms can be trained from data, and it is established that finite VC dimension guarantees uniform versions of the various laws of large numbers, in the sense of e.g. Dudley (2014). An important result of Laskowski (1992) showed that finite VC dimension corresponds to a logical notion independently developed by Shelah, known as the non-independence property, and in subsequent decades much work has been done on finite VC dimension within model theory under the aegis of so-called NIP theories (cf. Simon, 2015).
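To make the notion of shattering concrete, the following is a small illustrative sketch (not from the literature cited here) that brute-forces the VC dimension of interval classifiers on a finite grid; the function and variable names are hypothetical, chosen only for this example:

```python
from itertools import combinations

def shatters(hypotheses, points):
    """Check whether the hypothesis set realizes every 0/1 labeling of `points`."""
    realized = {tuple(h(x) for x in points) for h in hypotheses}
    return len(realized) == 2 ** len(points)

def vc_dimension(hypotheses, domain, max_d=5):
    """Largest d <= max_d such that some d-subset of `domain` is shattered.

    Brute force: since any subset of a shattered set is shattered, it
    suffices to record the largest k at which some k-subset is shattered.
    """
    d = 0
    for k in range(1, max_d + 1):
        if any(shatters(hypotheses, s) for s in combinations(domain, k)):
            d = k
    return d

# Interval classifiers h_{a,b}(x) = 1 iff a <= x <= b, endpoints on a small grid.
grid = list(range(8))
intervals = [lambda x, a=a, b=b: int(a <= x <= b) for a in grid for b in grid]

print(vc_dimension(intervals, grid))  # intervals on the line have VC dimension 2
```

Any two points can be labeled in all four ways by suitable intervals, but for three points p &lt; q &lt; r no interval realizes the labeling (1, 0, 1), so the brute-force search returns 2.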

However, despite this deep connection to logic, little has been done on the computable model theory of VC dimension (one recent exception being Andrews and Guingona, 2016). The basic questions here are the following: (1) how computationally difficult is it to detect that one is in a setting with finite VC dimension? and (2) if one is in such a setting, how hard is it to compute the precise VC dimension? The current paper aims to answer these questions and discuss their implications.

References

Andrews, U. and Guingona, V. (2016). A local characterization of VC-minimality. Proceedings of the American Mathematical Society, 144(5):2241–2256.

Dudley, R. M. (2014). Uniform central limit theorems, volume 142 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, New York, second edition.

Laskowski, M. C. (1992). Vapnik-Chervonenkis classes of definable sets. Journal of the London Mathematical Society, 45(2):377–384.

Simon, P. (2015). A Guide to NIP Theories. Lecture Notes in Logic. Cambridge University Press, Cambridge.