Courses
“Data integration is the 800-pound gorilla in the corner, and everyone’s got it in spades,” according to Mike Stonebraker, MIT professor and Turing Award laureate. For data scientists in the era of Big Data, the most challenging and time-consuming task is consolidating data from different sources while overcoming dirty data, heterogeneous data representations, and incomplete data. In this course, we will cover the entire pipeline of an information integration workflow, learning about existing integration architectures and algorithms for data cleansing, schema matching, and data fusion. Furthermore, we will discuss state-of-the-art systems and prominent use cases of information integration techniques.
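To give a flavor of one pipeline step named above, here is a minimal, hypothetical sketch of name-based schema matching: two source schemas expose differently named columns, and a simple token-based Jaccard similarity pairs each column with its best counterpart. The function names, column names, and threshold are illustrative assumptions, not course material.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the underscore-separated tokens of two column names."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb)

def match_schemas(cols_a, cols_b, threshold=0.5):
    """Pair each column of schema A with its most similar column in schema B,
    keeping only pairs above the similarity threshold."""
    matches = {}
    for ca in cols_a:
        best = max(cols_b, key=lambda cb: jaccard(ca, cb))
        if jaccard(ca, best) >= threshold:
            matches[ca] = best
    return matches

print(match_schemas(["customer_name", "birth_date"],
                    ["name_of_customer", "date_of_birth", "zip_code"]))
# → {'customer_name': 'name_of_customer', 'birth_date': 'date_of_birth'}
```

Real schema matchers combine many such signals (instance values, data types, constraints); this sketch only shows the name-similarity idea in isolation.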
- Trainer: Ziawasch Abedjan
- Trainer: Fatemeh Ahmadi
- Trainer: Mohamed Ahmed Abdelmaksoud Mohamed
In this course, students will develop solutions for large-scale data integration. Working in groups of up to four, they will reproduce an existing research prototype starting from the related paper and enhance it with their own ideas. Each group is accompanied by a mentor from the D2IP group to report and track progress. The students will learn to implement scalable algorithms, evaluate them systematically, read and interpret technical papers, and critically judge experimental results. At the same time, they will learn to deal with data heterogeneity problems at scale.
- Trainer: Ziawasch Abedjan
- Trainer: Luca Zecchini