The Case for Multi-Task Zero-Shot Learning for Databases

International Workshop on Applied AI for Database Systems and Applications (AIDB) colocated with VLDB 2022


Recently, machine learning has been successfully applied to many database problems such as query optimization, physical design tuning, and cardinality estimation. However, the predominant paradigm for designing such learned database components is workload-driven learning, where a representative workload has to be executed on the database to gather training data. This costly procedure has to be repeated for every new database a model is trained on. To avoid this cost, zero-shot cost models were recently proposed, which are pretrained once and generalize to unseen databases out of the box. While the results for the task of cost estimation are promising, it is unclear how to generalize this approach to additional tasks beyond query latency prediction. In this paper, we therefore propose several directions to generalize zero-shot cost models to other tasks and validate our approaches in two case studies.
