Yanghua Peng
Title
Cited by
Year
Optimus: an efficient dynamic resource scheduler for deep learning clusters
Y Peng, Y Bao, Y Chen, C Wu, C Guo
Proceedings of the Thirteenth EuroSys Conference, 1-14, 2018
Cited by 245 · 2018
A generic communication scheduler for distributed DNN training acceleration
Y Peng, Y Zhu, Y Chen, Y Bao, B Yi, C Lan, C Wu, C Guo
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 16-29, 2019
Cited by 169 · 2019
Deep learning-based job placement in distributed machine learning clusters
Y Bao, Y Peng, C Wu
IEEE INFOCOM 2019-IEEE Conference on Computer Communications, 505-513, 2019
Cited by 64 · 2019
Online job scheduling in distributed machine learning clusters
Y Bao, Y Peng, C Wu, Z Li
IEEE INFOCOM 2018-IEEE Conference on Computer Communications, 495-503, 2018
Cited by 64 · 2018
Dynamic scaling of virtualized, distributed service chains: A case study of IMS
J Duan, C Wu, F Le, AX Liu, Y Peng
IEEE Journal on Selected Areas in Communications 35 (11), 2501-2511, 2017
Cited by 30 · 2017
DL2: A deep learning-driven scheduler for deep learning clusters
Y Peng, Y Bao, Y Chen, C Wu, C Meng, W Lin
IEEE Transactions on Parallel and Distributed Systems 32 (8), 1947-1960, 2021
Cited by 29 · 2021
deTector: a Topology-aware Monitoring System for Data Center Networks
Y Peng, J Yang, C Wu, C Guo, C Hu, Z Li
2017 USENIX Annual Technical Conference (USENIX ATC 17), 55-68, 2017
Cited by 26 · 2017
Preemptive all-reduce scheduling for expediting distributed DNN training
Y Bao, Y Peng, Y Chen, C Wu
IEEE INFOCOM 2020-IEEE Conference on Computer Communications, 626-635, 2020
Cited by 23 · 2020
Elastic parameter server load distribution in deep learning clusters
Y Chen, Y Peng, Y Bao, C Wu, Y Zhu, C Guo
Proceedings of the 11th ACM Symposium on Cloud Computing, 507-521, 2020
Cited by 6 · 2020
BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing
T Liu, Y Chen, D Li, C Wu, Y Zhu, J He, Y Peng, H Chen, H Chen, C Guo
arXiv preprint arXiv:2112.08541, 2021
Cited by 2 · 2021
dPRO: A Generic Profiling and Optimization System for Expediting Distributed DNN Training
H Hu, C Jiang, Y Zhong, Y Peng, C Wu, Y Zhu, H Lin, C Guo
arXiv preprint arXiv:2205.02473, 2022
2022
dPRO: A Generic Performance Diagnosis and Optimization Toolkit for Expediting Distributed DNN Training
H Hu, C Jiang, Y Zhong, Y Peng, C Wu, Y Zhu, H Lin, C Guo
Proceedings of Machine Learning and Systems 4, 2022
2022
Communication acceleration for distributed deep learning training (分布式深度学习训练的通信加速)
Y Jiang, Y Peng, Y Zhu, C Guo
Communications of the CCF 17 (9), 18-25, 2021
2021
Accelerating distributed DNN training in AI clouds
Y Peng
HKU Theses Online (HKUTO), 2020
2020