Publications

(2025). VQ-LLM: High-performance Code Generation for Vector Quantization Augmented LLM Inference. HPCA 2025.
(2024). JUNO: Optimizing High-Dimensional Approximate Nearest Neighbour Search with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping. ASPLOS 2024.