Jialin Ding

I am an Assistant Professor in the Computer Science Department at Princeton University. I am also an Amazon Scholar working on data systems within AWS.

I was previously an Applied Scientist at AWS, where I worked on autonomics in Amazon Redshift. I received my PhD from MIT, where I worked in the Data Systems Group and was partly supported by a Meta PhD Fellowship.

I research machine learning and optimization techniques for data systems, with a focus on instance-optimization, a new design paradigm for building data systems that can automatically self-optimize to achieve the best performance for any specific application or use case. I have leveraged instance-optimization to introduce novel designs for data storage layouts (1, 2, 3, 4), database indexes (5, 6, 7) and end-to-end data systems (8, 9). See here for a more detailed description of my past and future research directions.

[CV] [Google Scholar] [Twitter] [Research Statement] [Teaching Statement]

📧 jialind@princeton.edu | 🏢 194 Nassau St, Room 242 (directions), Princeton, NJ 08542

Graduate Students

Karan Tandon (co-advised with Ravi Netravali)
Jinghan Zeng (co-advised with Wyatt Lloyd)
Polly Ren (MSE)
Nicholas Yap (MSE)

Conference Publications

Parachute: Single-Pass Bi-Directional Information Passing.
Mihail Stoian, Andreas Zimmerer, Skander Krid, Amadou Latyr Ngom, Jialin Ding, Tim Kraska and Andreas Kipf.
VLDB 2025.
Automated Multidimensional Data Layouts in Amazon Redshift. [blog] [press release]
Jialin Ding, Matt Abrams, Sanghita Bandyopadhyay, Luciano Di Palma, Yanzhu Ji, Davide Pagano, Gopal Paliwal, Panos Parchas, Pascal Pfeil, Orestis Polychroniou, Gaurav Saxena, Aamer Shah, Amina Voloder, Sherry Xiao, Davis Zhang, Tim Kraska.
SIGMOD 2024 Industrial Track.
SageDB: An Instance-Optimized Data Analytics System. [talk]
Jialin Ding, Ryan Marcus, Andreas Kipf, Vikram Nathan, Aniruddha Nrusimha, Kapil Vaidya, Alexander van Renen and Tim Kraska.
VLDB 2023.
APEX: A High-Performance Learned Index on Persistent Memory.
Baotong Lu, Jialin Ding, Eric Lo, Umar Farooq Minhas and Tianzheng Wang.
VLDB 2022.
Self-Organizing Data Containers. [talk]
Samuel Madden, Jialin Ding, Tim Kraska, Sivaprasad Sudhir, David Cohen, Timothy Mattson and Nesime Tatbul.
CIDR 2022.
Tsunami: A Learned Multi-dimensional Index for Correlated Data and Skewed Workloads. [news] [talk]
Jialin Ding, Vikram Nathan, Mohammad Alizadeh and Tim Kraska.
VLDB 2021.
Instance-Optimized Data Layouts for Cloud Analytics Workloads. [talk]
Jialin Ding, Umar Farooq Minhas, Badrish Chandramouli, Chi Wang, Yinan Li, Ying Li, Donald Kossmann, Johannes Gehrke and Tim Kraska.
SIGMOD 2021.
ALEX: An Updatable Adaptive Learned Index. [talk] [seminar talk] [code]
Jialin Ding, Umar Farooq Minhas, Jia Yu, Chi Wang, Jaeyoung Do, Hantian Zhang, Yinan Li, Badrish Chandramouli, Johannes Gehrke, Donald Kossmann, David Lomet and Tim Kraska.
SIGMOD 2020.
Learning Multi-dimensional Indexes. [talk] [seminar talk]
Vikram Nathan*, Jialin Ding*, Mohammad Alizadeh and Tim Kraska.
SIGMOD 2020.
SageDB: A Learned Database System. [the morning paper]
Tim Kraska, Mohammad Alizadeh, Alex Beutel, Ed Chi, Jialin Ding, Ani Kristo, Guillaume Leclerc, Samuel Madden, Hongzi Mao and Vikram Nathan.
CIDR 2019.
Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries. [the morning paper] [blog]
Edward Gan, Jialin Ding, Kai Sheng Tai, Vatsal Sharan and Peter Bailis.
VLDB 2018.

Workshop Publications & Short Papers

TailorSQL: A NL2SQL System Tailored for Your Query Workload.
Kapil Vaidya, Jialin Ding, Sebastian Kosak, David Kernert, Chuan Lei, Xiao Qin, Abhinav Tripathy, Ramesh Balan, Balakrishnan Narayanaswamy and Tim Kraska.
AIDB Workshop @ VLDB 2025.
Utilizing Past User Feedback for More Accurate Text-to-SQL.
Matthias Urban, Jialin Ding, David Kernert, Kapil Vaidya and Tim Kraska.
HILDA Workshop @ SIGMOD 2025.
Learning Bit Allocations for Z-Order Layouts in Analytic Data Systems.
Jenny Gao, Jialin Ding, Sivaprasad Sudhir and Samuel Madden.
ML for Systems Workshop @ NeurIPS 2023.
The Case for Learned Spatial Indexes.
Varun Pandey, Alexander van Renen, Andreas Kipf, Ibrahim Sabek, Jialin Ding and Alfons Kemper.
AIDB Workshop @ VLDB 2020.
LISA: Towards Learned DNA Sequence Search.
Darryl Ho, Jialin Ding, Sanchit Misra, Nesime Tatbul, Vikram Nathan, Vasimuddin Md and Tim Kraska.
Systems for ML Workshop @ NeurIPS 2019. Oral Presentation.
Learning Multi-dimensional Indexes. [talk]
Vikram Nathan*, Jialin Ding*, Mohammad Alizadeh and Tim Kraska.
ML for Systems Workshop @ NeurIPS 2019. Oral Presentation.
Efficient Mergeable Quantile Sketches using Moments.
Edward Gan, Jialin Ding and Peter Bailis.
SysML 2018. Extended Abstract.
A Machine-Compiled Database of Genome-Wide Association Studies.
Volodymyr Kuleshov, Jialin Ding, Braden Hancock, Alexander Ratner, Christopher Ré, Serafim Batzoglou and Michael Snyder.
ISMB 2017. Short Paper.

Journal Publications

A Machine-compiled Database of Genome-wide Association Studies.
Volodymyr Kuleshov, Jialin Ding, Christopher Vo, Braden Hancock, Alexander Ratner, Yang Li, Christopher Ré, Serafim Batzoglou and Michael Snyder.
Nature Communications 2019.
MacroBase: Prioritizing Attention in Fast Data.
Firas Abuzaid, Peter Bailis, Jialin Ding, Edward Gan, Samuel Madden, Deepak Narayanan, Kexin Rong and Sahaana Suri.
TODS 2018.

Miscellaneous Publications

Cortex: Harnessing Correlations to Boost Query Performance.
Vikram Nathan, Jialin Ding, Tim Kraska and Mohammad Alizadeh.
CoRR 2020.

Teaching

COS 418 (Distributed Systems): Fall 2025
Teaching Assistant: 6.887 (Machine Learning for Systems), Fall 2021

Service

Program Committees:

VLDB: 2025 (Distinguished Reviewer Award), 2026, 2027
SIGMOD: 2026, 2027
VLDB Demo Track: 2022, 2023

Journal Reviewer:

VLDB Journal: 2023
TKDE: 2020

Miscellaneous:

Student Volunteer: VLDB 2021

Industry Experience

Amazon Scholar @ AWS, 2025-Present
Applied Scientist @ AWS, 2022-2025
Research Intern @ MSR Redmond, Summer 2020
Research Intern @ MSR Redmond, Summer 2018
Software Engineering Intern @ Google, Summer 2016
Software Engineering Intern @ Thumbtack, Summer 2015