Yinmin Zhong

Ph.D. Student

Peking University

zhongyinmin [at] pku.edu.cn

About Me

I am a third-year Ph.D. student in computer science in the Computer Systems Research Group at Peking University, where I am advised by Xin Jin. Before that, I received my B.S. in Computer Science from Peking University.

I have a broad interest in building efficient systems for training and serving deep learning models, currently with a primary focus on large language models (LLMs).

I am also an enthusiastic self-learner with interests spanning many areas of computer science. I have built a website to share my self-learning experiences and resources.

Interests
  • Machine Learning Systems
  • Distributed Systems
  • Large Language Models
Education
  • Peking University

    Ph.D. in Computer Science, Sep 2022 - Present

  • Peking University

    B.S. in Computer Science, Sep 2018 - Jun 2022

Experience
  • StepFun System Team

    Research Intern, June 2024 - Present

  • ByteDance AML Team

    Research Intern, Aug 2023 - May 2024

  • Alibaba DAMO Academy

    Research Intern, Sep 2021 - Sep 2022

  • AI Innovation Center, Peking University

    Software Engineer Intern, Sep 2020 - Mar 2021

Publications

RLHFuse: Efficient RLHF Training for Large Language Models with Inter- and Intra-Stage Fusion.
DistTrain: Addressing Model and Data Heterogeneity with Disaggregated Training for Multimodal Large Language Models.
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving.
LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism.
Aquifer: Transparent Microsecond-scale Scheduling for vRAN Workloads.
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs.
DistMind: Efficient Resource Disaggregation for Deep Learning Workloads.
Fast Distributed Inference Serving for Large Language Models.
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving.
ElasticFlow: An Elastic Serverless Training Platform for Distributed Deep Learning.