Wolfgang Gatterbauer

Associate Professor
Khoury College of Computer Sciences
440 Huntington Avenue
Northeastern University
Boston, MA 02115

+1 (617) 373-2462
Office: 450 West Village H

Northeastern Datalab

I am working on the theory of scalable data management. One of my goals is to extend the capabilities of modern data management systems in generic ways to allow them to support novel functionalities that seem hard at first. Examples of such functionalities are managing provenance, trust, explanations, and uncertain or inconsistent data. To support these functionalities, I am interested in understanding the fundamental algebraic properties that allow algorithms to scale with the size of data by leveraging structure in data: Given a large data or knowledge base, what types of questions can be answered efficiently? And what do we do about those that cannot?

For the hard questions, this line of work tries to find ways to change the objective in a way that qualitatively preserves the original motivation, yet installs those desirable algebraic properties (something we call "algebraic cheating"). Our work has shown that approaches that leverage those properties and optimize for the overall end-to-end goal can work with less training data and achieve remarkable speed-ups.

Thanks to NSF for supporting us under NSF Career Award IIS-1762268 and NSF Award IIS-1956096. Also thanks a lot to the anonymous reviewers and respective committees for selecting one of our SIGMOD 2024 papers for an honorable mention, our EDBT 2021 paper for the best paper award, our PODS 2021, SIGMOD 2017, VLDB 2015, and WALCOM 2017 papers among "best of conference", and two of our SIGMOD 2020 papers for reproducibility awards.

ORCID, Google Scholar, DBLP, arXiv, ACM profile.

Before academia, I worked as an associate for McKinsey & Co. My first university degree is a Dipl.-Ing. in Mechanical Engineering. I also won a bronze medal in the International Physics Olympiad (IPhO).

To Prospective students

Our DATA lab is growing and we are actively looking for students with strong foundations in algorithms, theory, discrete math, data management, and machine learning. Please visit our research opportunities and the topics page of my class "Principles of scalable data management." Note that I am a big fan of Ray Dalio's principles applied to research (please read this excerpt to see how we like to work in my group) and Barbara Minto's Pyramid Principle.

I have been working with or co-advising a number of students over the years, not always as their direct advisor. My current directly advised PhD students are: Nikos Tziavelis (Google PhD fellowship recipient 2022 and UC Santa Cruz faculty from fall 2024), Neha Makhija, and Agapi Rissaki.

If you are a prospective PhD student, please read this Data lab research opportunities page before sending me an email. In sharp contrast to prevailing norms, I value good results in standardized test scores and math or science competitions.