In this course, we provide a mathematical analysis of what statistical learning is and why it works. Starting from different models of data, we proceed to information-theoretic measures of successful learning, the PAC (probably approximately correct) learning paradigm, and rates of learning, developing elements of large deviation theory along the way. We then turn to the decomposition of errors (model, statistical/generalization, and training) in optimization-based ERM (empirical risk minimization) algorithms. We also discuss fully connected neural networks and the adaptive enlargement of hypothesis spaces in order to learn successfully with universal approximators.
Successful participation in the course requires a solid mathematical background in probability theory.