"GPU"

Distributed GPU Joins on Fast RDMA-capable Networks

In this paper, we present a novel pipelined GPU join that accelerates distributed DBMSs by leveraging GPU resources on fast networks. A key insight is that we enable pipelined join execution by overlapping the network shuffling …

FA2: Fast, Accurate Autoscaling for Serving Deep Learning Inference with SLA Guarantees

Deep learning (DL) inference has become an essential building block in modern intelligent applications. Due to the high computational intensity of DL, it is critical to scale DL inference serving systems in response to fluctuating workloads to …