I am a fifth-year Ph.D. candidate in Computer Science at the Georgia Institute of Technology, working in SSLAB, co-advised by Taesoo Kim and Anand Iyer. Before Georgia Tech, I graduated from The Chinese University of Hong Kong with a Bachelor's degree in Computer Science. My research interests include systems for deep graph learning and machine learning in general. I am currently exploring systems aspects of training/serving dynamic GNNs and Graph-based RAG for LLMs.
While large-scale training has enabled models with unprecedented capabilities, it also introduces significant risks, from misuse in malicious settings (e.g., autonomous vulnerability exploitation) to regulatory non-compliance. We are developing verifiable ML training pipelines built on Confidential Computing (CC). CC provides a hardware-based mechanism for verifying the training process, protecting both the model provider's and the data provider's sensitive data while allowing third-party verifiers to audit how a model was trained. A toy sketch of the auditing idea appears below.
March 2024 - Present
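To make the auditing goal concrete, here is a toy, software-only sketch of one verification primitive: a tamper-evident, hash-chained log of training steps that a verifier can replay. This is purely illustrative and not the project's design; the actual pipeline relies on hardware-backed attestation from Confidential Computing, and all names below are hypothetical.

```python
# Illustrative only: a software-only stand-in for one ingredient of verifiable
# training, a tamper-evident log of training steps that a third party can replay.
import hashlib
import json

def log_step(prev_digest: str, step: int, metrics: dict) -> str:
    """Chain each training step's record to the previous digest."""
    record = json.dumps({"prev": prev_digest, "step": step, "metrics": metrics},
                        sort_keys=True)
    return hashlib.sha256(record.encode()).hexdigest()

# Trainer side: build the chain while training (metrics here are made up).
digest = "0" * 64
for step in range(3):
    digest = log_step(digest, step, {"loss": round(1.0 / (step + 1), 4)})
print("final digest:", digest)

# Verifier side: replaying the same records must reproduce the same final digest;
# any altered step changes every subsequent digest.
```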
Machine learning inference platforms continue to face high request rates and strict latency constraints. Existing solutions largely focus on compressing models to substantially lower compute costs with mild accuracy degradation. We explore an alternative (but complementary) technique that trades off accuracy and resource costs on a per-input granularity: early exit models, which selectively allow certain inputs to exit a model from an intermediate layer. We present the first system that makes early exit models practical for realistic inference deployments. Our key insight is to split and replicate blocks of layers in models in a manner that maintains a constant batch size throughout execution, while accounting for resource requirements and communication overheads. Evaluations show that we improve the goodput of early-exit model inference for autoregressive LLMs (2.8-3.8x) and compressed models (1.67x). A minimal sketch of the early-exit idea follows below.
May 2023 - Sep. 2024
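A minimal sketch of the early-exit mechanism itself (not our serving system), assuming a simple PyTorch classifier: an intermediate exit head lets confident inputs skip the remaining layers, trading a small amount of accuracy for lower per-input compute. All module and parameter names are illustrative.

```python
import torch
import torch.nn as nn

class EarlyExitMLP(nn.Module):
    """Toy two-block classifier with one intermediate exit head."""
    def __init__(self, dim=128, num_classes=10, threshold=0.9):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.block2 = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.exit1 = nn.Linear(dim, num_classes)  # intermediate exit head
        self.exit2 = nn.Linear(dim, num_classes)  # final exit head
        self.threshold = threshold                # confidence needed to exit early

    def forward(self, x):
        h = self.block1(x)
        logits1 = self.exit1(h)
        conf, _ = logits1.softmax(dim=-1).max(dim=-1)
        if bool((conf >= self.threshold).all()):  # whole batch confident: skip block2
            return logits1
        return self.exit2(self.block2(h))

model = EarlyExitMLP()
out = model(torch.randn(4, 128))
print(out.shape)  # torch.Size([4, 10])
```

In a real deployment, per-input exits fragment batches; our system's block splitting/replication keeps batch sizes constant despite such exits.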
Existing systems for processing static GNNs either do not support dynamic GNNs or are inefficient in doing so. In this project, we are building a system that supports dynamic GNNs efficiently. Based on the observation that existing proposals for dynamic GNN architectures combine techniques for structural and temporal information encoding independently, we propose novel techniques that enable optimizations across the two, benefiting tasks such as traffic forecasting, anomaly detection, and epidemiological forecasting. A minimal sketch of the structural-plus-temporal pattern appears below.
May 2021 - Dec. 2023
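A minimal sketch of the common dynamic-GNN pattern the project targets, assuming plain PyTorch: a structural encoder (one round of dense neighbor aggregation) composed with a temporal encoder (a GRU over each node's embeddings across graph snapshots). The names and dimensions are illustrative, not the system's API.

```python
import torch
import torch.nn as nn

class SimpleDynamicGNN(nn.Module):
    """Toy dynamic GNN: per-snapshot structural encoding, then temporal encoding."""
    def __init__(self, in_dim=16, hid_dim=32):
        super().__init__()
        self.struct = nn.Linear(in_dim, hid_dim)                     # structural encoder
        self.temporal = nn.GRU(hid_dim, hid_dim, batch_first=True)   # temporal encoder

    def forward(self, adjs, feats):
        # adjs:  list of [N, N] adjacency matrices, one per time step
        # feats: list of [N, in_dim] node-feature matrices, one per time step
        per_step = []
        for A, X in zip(adjs, feats):
            H = torch.relu(self.struct(A @ X))   # aggregate neighbors, then transform
            per_step.append(H)
        seq = torch.stack(per_step, dim=1)       # [N, T, hid_dim]
        out, _ = self.temporal(seq)              # encode each node's history
        return out[:, -1]                        # latest embedding per node

N, T = 5, 3
adjs = [torch.eye(N) for _ in range(T)]          # self-loop-only graphs for the demo
feats = [torch.randn(N, 16) for _ in range(T)]
print(SimpleDynamicGNN()(adjs, feats).shape)     # torch.Size([5, 32])
```

Because the structural and temporal stages are composed independently in designs like this, there is room for cross-stage optimization, which is what the system exploits.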
Anand Iyer, Mingyu Guan, Yinwei Dai, Rui Pan, Swapnil Gandhi, Ravi Netravali
In Proceedings of the 30th Symposium on Operating Systems Principles (SOSP)
Austin, TX, USA, Nov 2024
Mingyu Guan, Jack W. Stokes, Qinlong Luo, Fuchen Liu, Purvanshi Mehta, Elnaz Nouri, Taesoo Kim
arXiv preprint
arXiv:2402.13496, Feb 2024
Mingyu Guan, Anand Padmanabha Iyer, Taesoo Kim
In Proceedings of the 5th ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA) (GRADES-NDA)
Philadelphia, PA, USA, June 2022