Deep Learning is (nearly) everywhere. Its success stories are no longer confined to academia but reach into everyday life. Despite this tremendous progress, several important questions remain to be addressed: not only how to improve performance in existing applications or open up new ones, but also how to strengthen the trust of human operators in the system's predictions.

In this project, we want to address two important aspects of this problem. The first: given a trained model and a query sample, is the model actually applicable to that sample, i.e. can it be expected to give a reasonable answer? Imagine a self-driving car in the rain that was trained only on sunny days, or a model interpreting satellite imagery acquired in winter that has so far only seen summer images. These models will still make predictions. They won't complain, they won't refuse. They will provide an answer - an answer most likely to be spectacularly wrong. In this project we will investigate methods that aim to create models which also provide an estimate of their own applicability.
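In its simplest form, such an applicability estimate can be derived from the model's own confidence: a widely used baseline flags a sample as out-of-distribution when the maximum softmax probability is low. The sketch below illustrates the idea with plain NumPy; the threshold value is a hypothetical placeholder that would in practice be tuned on held-out in-distribution data.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def is_out_of_distribution(logits, threshold=0.7):
    """Flag a sample as out-of-distribution when the model's maximum
    softmax probability falls below `threshold`. The threshold here is
    illustrative, not a recommended value."""
    confidence = softmax(logits).max(axis=-1)
    return confidence < threshold

# Peaked logits -> confident, treated as in-distribution;
# near-uniform logits -> uncertain, flagged as OOD.
in_dist = np.array([6.0, 0.5, 0.3])
out_dist = np.array([1.1, 1.0, 0.9])
```

This maximum-softmax-probability baseline is deliberately simple; the project would compare it against more elaborate approaches from the literature.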

The second aspect we want to address is the "Why?". Why did the model make a certain decision? Which parts of the input were relevant to that decision?
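One simple way to answer "which part of the input mattered?" is occlusion-based attribution: replace one input feature at a time with a neutral baseline value and record how much the score of the predicted class drops. The sketch below demonstrates this on a toy linear classifier; the function and variable names are illustrative, not part of any specific library.

```python
import numpy as np

def occlusion_relevance(predict, x, baseline=0.0):
    """Score each feature of `x` by how much the predicted class's
    score drops when that feature is replaced by `baseline`.
    `predict` maps an input vector to a vector of class scores."""
    scores = predict(x)
    cls = int(np.argmax(scores))          # the model's decision
    relevance = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        occluded = x.copy()
        occluded[i] = baseline            # knock out one feature
        relevance[i] = scores[cls] - predict(occluded)[cls]
    return relevance

# Toy linear model: feature 0 dominates the score of class 0.
W = np.array([[2.0, 0.1],
              [0.1, 2.0]])
x = np.array([1.0, 0.2])
rel = occlusion_relevance(lambda v: W @ v, x)
```

For images, the same idea is applied with sliding patches instead of single features; gradient-based methods such as saliency maps answer the same question more cheaply and would be covered in the literature review.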

Participants will conduct a literature review of both aspects and implement a framework that detects out-of-distribution samples (i.e. samples the model is not applicable to) and also provides insights into why a certain decision was made.