Learning Outcomes
Recent advances in technology have led to rapid growth of big data,
creating a need for cost-efficient and scalable analysis algorithms. In
this course, concepts for the scalable analysis of big data sets will be
presented and applied using open-source technologies. Participants of
this module will gain an in-depth understanding of concepts and methods
as well as practical experience in the area of scalable data science.
The course is principally designed to impart: technical skills (50%),
method skills (30%), system skills (10%), and social skills (10%).
Content
The module focuses on mainstream distributed processing platforms and
paradigms and on how to employ them to solve challenging big data
problems using popular data mining methods. Students will learn how to
implement and apply various data mining algorithms, such as Naïve
Bayes, K-Means clustering, and PageRank, on various open-source systems
(e.g., Apache Hadoop, Apache Flink).
Description of Teaching and Learning Methods
This Integrated Course (Integrierte Veranstaltung, IV) consists of: (i)
lectures on key concepts, (ii) practical exercises (theoretical and
programming), and (iii) student-led presentations (including literature
search). Active participation in and contribution to all parts of this
course are essential.
Requirements for participation and examination
Desirable prerequisites for participation in the courses:
Knowledge of the computer science topics covered in TU Berlin modules in the Bachelor’s curriculum, particularly the database course (“Information Systems and Data Analysis”) or an equivalent, and excellent Java and SQL programming skills are strictly required. Basic knowledge of linear algebra, numerical analysis, probability, and statistics is strongly recommended. Furthermore, it is highly advisable that students have already completed (or are currently enrolled in) a machine-learning course. Since the course will be offered in English, fluency in English is also required.
- Instructor: Lennart Behme
- Instructor: Sergey Redyuk