Course topics summary:

  • Multimodal interaction
    • Importance, definition, examples
    • Gesture as a mode of interaction
    • Audiovisual speech recognition systems
    • Combining modalities
  • Speech recognition
    • Basics and definitions
    • Sources of variability in speech
    • Acoustic and language modeling
    • Markov and Hidden Markov models
    • Real-life challenges: adaptation, distant (far-field) microphones
  • Speech production
    • Theory of speech production
    • Vocal tract and resonance frequencies
    • Feature extraction for speech, spectrograms
  • Intro to machine learning
    • Basics and definitions of AI and ML
    • Challenges with the standard software engineering approach
    • Feature extraction
    • Example machine learning model: Perceptron
    • Cost function and minimizing error
  • Machine translation
    • Basics of the statistical approach
    • Evaluation of translation systems
  • Seminars on various topics: metadata extraction from speech, deep learning, ...