I am a fourth-year CS PhD student in the MIT Data Systems Group, where I am advised by Prof. Tim Kraska. My research focuses on applying machine learning to database systems.
I also collaborate with Umar Farooq Minhas and the Database Group at Microsoft Research on learned data structures. Prior to MIT, I was an undergraduate at Stanford University, where I worked on data-intensive systems with Prof. Peter Bailis as part of Stanford DAWN. My research is partly supported by a Facebook Fellowship.
[CV] [Google Scholar]
APEX: A High-Performance Learned Index on Persistent Memory.
Baotong Lu, Jialin Ding, Eric Lo, Umar Farooq Minhas and Tianzheng Wang.
VLDB 2022.
Self-Organizing Data Containers.
Samuel Madden, Jialin Ding, Tim Kraska, Sivaprasad Sudhir, David Cohen, Timothy Mattson and Nesime Tatbul.
CIDR 2022.
Tsunami: A Learned Multi-dimensional Index for Correlated Data and Skewed Workloads. [news]
Jialin Ding, Vikram Nathan, Mohammad Alizadeh and Tim Kraska.
VLDB 2021.
Instance-Optimized Data Layouts for Cloud Analytics Workloads.
Jialin Ding, Umar Farooq Minhas, Badrish Chandramouli, Chi Wang, Yinan Li, Ying Li, Donald Kossmann, Johannes Gehrke and Tim Kraska.
SIGMOD 2021.
Cortex: Harnessing Correlations to Boost Query Performance.
Vikram Nathan, Jialin Ding, Tim Kraska and Mohammad Alizadeh.
CoRR 2020.
The Case for Learned Spatial Indexes.
Varun Pandey, Alexander van Renen, Andreas Kipf, Ibrahim Sabek, Jialin Ding and Alfons Kemper.
AIDB Workshop @ VLDB 2020.
ALEX: An Updatable Adaptive Learned Index. [talk] [seminar talk] [code]
Jialin Ding, Umar Farooq Minhas, Jia Yu, Chi Wang, Jaeyoung Do, Hantian Zhang, Yinan Li, Badrish Chandramouli, Johannes Gehrke, Donald Kossmann, David Lomet and Tim Kraska.
SIGMOD 2020.
Learning Multi-dimensional Indexes. [talk] [seminar talk]
Vikram Nathan*, Jialin Ding*, Mohammad Alizadeh and Tim Kraska.
SIGMOD 2020.
LISA: Towards Learned DNA Sequence Search.
Darryl Ho, Jialin Ding, Sanchit Misra, Nesime Tatbul, Vikram Nathan, Vasimuddin Md and Tim Kraska.
Systems for ML Workshop @ NeurIPS 2019. Oral Presentation.
Learning Multi-dimensional Indexes. [talk]
Vikram Nathan*, Jialin Ding*, Mohammad Alizadeh and Tim Kraska.
ML for Systems Workshop @ NeurIPS 2019. Oral Presentation.
SageDB: A Learned Database System. [the morning paper]
Tim Kraska, Mohammad Alizadeh, Alex Beutel, Ed Chi, Jialin Ding, Ani Kristo, Guillaume Leclerc, Samuel Madden, Hongzi Mao and Vikram Nathan.
CIDR 2019.
A Machine-compiled Database of Genome-wide Association Studies.
Volodymyr Kuleshov, Jialin Ding, Christopher Vo, Braden Hancock, Alexander Ratner, Yang Li, Christopher Ré, Serafim Batzoglou and Michael Snyder.
Nature Communications 2019.
Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries. [the morning paper] [blog]
Edward Gan, Jialin Ding, Kai Sheng Tai, Vatsal Sharan and Peter Bailis.
VLDB 2018.
Efficient Mergeable Quantile Sketches using Moments.
Edward Gan, Jialin Ding and Peter Bailis.
SysML 2018. Extended Abstract.
MacroBase: Prioritizing Attention in Fast Data.
Firas Abuzaid, Peter Bailis, Jialin Ding, Edward Gan, Samuel Madden, Deepak Narayanan, Kexin Rong and Sahaana Suri.
TODS 2018.
A Machine-Compiled Database of Genome-Wide Association Studies.
Volodymyr Kuleshov, Jialin Ding, Braden Hancock, Alexander Ratner, Christopher Ré, Serafim Batzoglou and Michael Snyder.
ISMB 2017. Short Paper.
jialind@mit.edu