From Joins to Aggregates and Optimization Problems

Dan Olteanu (University of Oxford).


Course Summary | About the lecturer | Location and schedule | Materials

Course summary:

The course introduces recent development on solving a host of computational problems in the database. The unifying theme underlying this development is the use of the structure of the problem to avoid redundancy in data representation and computation.

The first part overviews recent work on worst-case optimal join algorithms for two representation formalisms for relational data, the standard listing representation and the factorized representation.

The second part goes beyond join queries and looks at functional aggregate queries, which are ubiquitous across Computer Science and can express problems in logic, probabilistic graphical models, CSP, linear algebra, and databases. This part also sketches on-going development on the computation of queries under data updates, with a particular focus on counting triangles under updates in worst-case optimal time.

The third part shows how the development introduced in the first two parts can be applied to the in-database computation of optimization problems used in machine learning, such as learning regression and classification models. Time permitting, it also highlights on-going development on in-database computation for linear algebra such as the Gram-Schmidt QR decomposition and the low-rank (PCA) factorization of matrices defined by database joins.

Each of the three parts concludes with incomplete lists of exciting open research problems and of highly relevant work that builds on the concepts presented in this course, and a quiz.

While originally theoretical in nature, the ideas presented in this course have led to practical and very competitive tools reported by both academia and industry, such as the LeapFrog TrieJoin algorithm used by the LogicBlox commercial database system, the FDB algorithm for factorized query computation, and the AC/DC system for learning models over relational data.

About the lecturer:

Dan Olteanu is Professor of Computer Science at the University of Oxford and Computer Scientist at RelationalAI. He received his PhD from the University of Munich in 2005. He spends his time understanding hard computational challenges and designing simple and scalable solutions towards these challenges. He has published over 70 papers in the areas of database systems, AI, and theoretical computer science, contributing to XML query processing, incomplete information and probabilistic databases, factorized databases, scalable and incremental in-database optimization, and the commercial systems LogicBlox and RelationalAI. He co-authored the book "Probabilistic Databases" (2011). He has served as associate editor for PVLDB and IEEE TKDE, as track chair for IEEE ICDE'15, group leader for ACM SIGMOD'15, vice chair for ACM SIGMOD'17, and co-chair for AMW'18, and he is currently serving as associate editor for ACM TODS. He is the recipient of an ERC Consolidator grant (2016) and an Oxford Outstanding Teaching award (2009).

Location and schedule:

MIMUW building, room 5440

Thursday, November 22
14:15 - 15:45 lecture
15:45 - 16:00coffee and cake
16:00 - 17:00 lecture
Friday, November 23
14:15 - 15:45 lecture
15:45 - 16:00 coffee and cake
16:00 - 17:00 lecture
Saturday, November 24
10:15 - 11:45 lecture
11:45 - 12:00 coffee and cake
12:00 - 13:00 lecture

Materials

Videos: The video recordings of the lectures are now available here.


Assignment: problem set. Please submit the solutions by email to dan.olteanu at cs.ox.ac.uk by Sunday January 27. Also, in case there are any questions concerning the exam, feel free to ask the lecturer.