Xudong Liao

PhD Candidate, HKUST, Hong Kong SAR, China

I am a Ph.D. candidate at the Hong Kong University of Science and Technology (HKUST), advised by Prof. Kai Chen. Before that, I received my B.Eng. in Software Engineering from Wuhan University (Outstanding Graduate) in 2020.

In my research projects, I focus on:

  • developing application-oriented optimizations for distributed systems, such as Herald. These systems improve performance by exploiting application-specific characteristics; Herald, for example, exploits embedding access patterns in DLRM training.
  • building performant congestion control (CC) schemes with reinforcement learning, including Astraea, Spine, MOCC, and Jury. This work is driven by my goal of making deep reinforcement learning (DRL)-based CC schemes fair, efficient, and practical for real-world deployment.

I was fortunate to be advised by Prof. Yanjiao Chen during my time at Wuhan University. I am also fortunate to collaborate closely with Prof. Guyue Liu from Peking University and Dr. Zhizhen Zhong from MIT on several recent projects.

Research Interests

  • Machine Learning Systems
  • Optical Networks
  • Congestion Control
  • Datacenter Networking

news

Apr 25, 2025 Pallas accepted to ATC 2025!
Jan 10, 2024 Astraea accepted to EuroSys 2024!
Dec 07, 2023 Herald accepted to NSDI 2024!
Feb 24, 2023 G3 accepted to SIGMOD 2023!
Nov 30, 2022 Spine accepted to CoNEXT 2022!

selected publications

* equal contribution

  1. arXiv
    mFabric: An Efficient and Scalable Fabric for Mixture-of-Experts Training
    Xudong Liao, Yijun Sun, Han Tian, Xinchen Wan, Yilun Jin, Zilong Wang, Zhenghang Ren, Xinyang Huang, Wenxue Li, Kin Fai Tse, Zhizhen Zhong, Guyue Liu, Ying Zhang, Xiaofeng Ye, Yiming Zhang, and Kai Chen
    arXiv:2501.03905, 2025
  2. ATC
    Towards Optimal Rack-scale μs-level CPU Scheduling through In-Network Workload Shaping
    Xudong Liao, Han Tian, Xinchen Wan, Chaoliang Zeng, Hao Wang, Junxue Zhang, Mengyu Ma, Guyue Liu, and Kai Chen
    In 2025 USENIX Annual Technical Conference (ATC 2025), 2025
  3. OSDI
    Enabling Efficient GPU Communication over Multiple NICs with FuseLink
    Zhenghang Ren, Yuxuan Li, Zilong Wang, Xinyang Huang, Wenxue Li, Kaiqiang Xu, Xudong Liao, Yijun Sun, Bowen Liu, Han Tian, Junxue Zhang, Mingfei Wang, Zhizhen Zhong, Guyue Liu, Ying Zhang, and Kai Chen
    In Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2025), 2025
  4. EuroSys
    Achieving Fairness Generalizability for Learning-based Congestion Control with Jury
    Han Tian, Xudong Liao, Decang Sun, Chaoliang Zeng, Yilun Jin, Junxue Zhang, Xinchen Wan, Zilong Wang, Yong Wang, and Kai Chen
    In Proceedings of the 20th ACM European Conference on Computer Systems (EuroSys 2025), 2025
  5. INFOCOM
    A Generic and Efficient Communication Framework for Message-level In-Network Computing
    Xinchen Wan, Luyang Li, Han Tian, Xudong Liao, Xinyang Huang, Chaoliang Zeng, Zilong Wang, Xinyu Yang, Ke Cheng, Qingsong Ning, Guyue Liu, Layong Luo, and Kai Chen
    In Proceedings of the IEEE International Conference on Computer Communications (INFOCOM 2025), 2025
  6. EuroSys
    Astraea: Towards Fair and Efficient Learning-based Congestion Control
    Xudong Liao*, Han Tian*, Chaoliang Zeng, Xinchen Wan, and Kai Chen
    In Proceedings of the 19th ACM European Conference on Computer Systems (EuroSys 2024), 2024
  7. NSDI
    Accelerating Neural Recommendation Training with Embedding Scheduling
    Chaoliang Zeng*, Xudong Liao*, Xiaodian Cheng, Han Tian, Xinchen Wan, Hao Wang, and Kai Chen
    In Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 2024), 2024
  8. SIGMOD
    Scalable and Efficient Full-Graph GNN Training for Large Graphs
    Xinchen Wan, Kaiqiang Xu, Xudong Liao, Yilun Jin, Kai Chen, and Xin Jin
    In Proceedings of the ACM on Management of Data (SIGMOD 2023), 2023
  9. CoNEXT
    Spine: An Efficient DRL-Based Congestion Control with Ultra-Low Overhead
    Han Tian*, Xudong Liao*, Chaoliang Zeng, Junxue Zhang, and Kai Chen
    In Proceedings of the 18th International Conference on Emerging Networking EXperiments and Technologies (CoNEXT 2022), 2022
  10. EuroSys
    Multi-Objective Congestion Control
    Yiqing Ma, Han Tian, Xudong Liao, Junxue Zhang, Weiyan Wang, Kai Chen, and Xin Jin
    In Proceedings of the 17th European Conference on Computer Systems (EuroSys 2022), 2022