MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs.
Ziheng Jiang, Haibin Lin, Yinmin Zhong, Qi Huang, Yangrui Chen, Zhi Zhang, Yanghua Peng, Xiang Li, Cong Xie, Shibiao Nong, Yulu Jia, Sun He, Hongmin Chen, Zhihao Bai, Qi Hou, Shipeng Yan, Ding Zhou, Yiyao Sheng, Zhuo Jiang, Haohan Xu, Haoran Wei, Zhang Zhang, Pengfei Nie, Leqi Zou, Sida Zhao, Liang Xiang, Zherui Liu, Zhe Li, Xiaoying Jia, Jianxi Ye, Xin Jin, Xin Liu.
In NSDI 2024.
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving.
Zhuohan Li, Lianmin Zheng, Yinmin Zhong, Vincent Liu, Ying Sheng, Xin Jin, Yanping Huang, Zhifeng Chen, Hao Zhang, Joseph E. Gonzalez, Ion Stoica.
In OSDI 2023.
FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion.
Liwen Chang, Wenlei Bao, Qi Hou, Chengquan Jiang, Ningxin Zheng, Yinmin Zhong, Xuanrun Zhang, Zuquan Song, Ziheng Jiang, Haibin Lin, Xin Jin, Xin Liu.
Preprint.