Learning Outcomes

The global data volume is increasing dramatically each year. Understanding how to store, process, and manage these huge amounts of data efficiently is a key requirement for software engineers and data analysts in the modern IT world. This lab, which follows the topics of the corresponding lecture DBT (Database Technology), will teach students both the fundamentals of data processing in traditional single-node database systems and how to scale these techniques out to huge amounts of data in large-scale, distributed environments. During the implementation part of the lab, students will get hands-on experience with important data processing techniques by implementing several components of a relational database system and by using parallel programming platforms such as Apache Hadoop or Nephele/PACT.

Content

In the database technology lab, students will implement components of a relational database system and get hands-on experience with a parallel data processing platform. The actual components implemented may vary each year, but will include parsing, query optimization, the execution engine, index structures, and the storage system.

Description of Teaching and Learning Methods

Lectures are accompanied by exercises in small groups, in which students practice the theory taught in the lectures. In the project, the students will be split into teams and, working independently, will implement some components of a database system, with the goal of having a running demonstrator at the end of the semester.

Requirements for participation and examination

Desirable prerequisites for participation in the courses:

This course is the base course for Master's students with a focus on database systems and information management. Students should enroll in this course in the first semester of their Master's program. In contrast to TU Berlin's introductory database systems course, Informationssysteme und Datenanalyse (ISDA), which looks at database systems from an application programmer's point of view, this class focuses on the internals of database systems. To participate, students are required to have successfully completed a Bachelor's degree in Computer Science with a focus on database systems (e.g. completed the Datenbankpraktikum and Datenbankprojekt courses) and to have either previously completed DBT or be currently enrolled in DBT in the same semester. Knowledge of data modeling, relational algebra, and SQL, as well as a very good (!!) command of Java programming and the Git version control system, is mandatory for participation in the course. Note: These topics will not be covered in the course.