Mr.BiQ: Post-training non-uniform quantization based on minimizing the reconstruction error Y Jeon, C Lee, E Cho, Y Ro Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 45 | 2022 |
Multi-dimensional parallel training of Winograd layer on memory-centric architecture B Hong, Y Ro, J Kim 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture …, 2018 | 22 | 2018 |
RingLeader: Efficiently Offloading Intra-Server Orchestration to NICs J Lin, A Cardoza, T Khan, Y Ro, BE Stephens, H Wassel, A Akella 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2023 | 16 | 2023 |
FFN-SkipLLM: A hidden gem for autoregressive decoding with adaptive feed forward skipping A Jaiswal, B Hu, L Yin, Y Ro, S Liu, T Chen, A Akella arXiv preprint arXiv:2404.03865, 2024 | 7 | 2024 |
Post-training weighted quantization of neural networks for language models SJ Kwon, D Lee, Y Jeon, B Kim, BS Park, Y Ro | 3 | 2021 |
Ghost routing to enable oblivious computation on memory-centric networks Y Ro, S Jin, J Huh, J Kim 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture …, 2021 | 2 | 2021 |
Q-Rater: Non-convex optimization for post-training uniform quantization B Kim, D Lee, Y Ro, Y Jeon, SJ Kwon, B Park, D Oh arXiv preprint arXiv:2105.01868, 2021 | 2 | 2021 |
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design R Cai, Y Ro, GW Kim, P Wang, B Ehteshami Bejnordi, A Akella, Z Wang Advances in Neural Information Processing Systems 37, 116126-116148, 2024 | | 2024 |
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design R Cai, Y Ro, GW Kim, P Wang, BE Bejnordi, A Akella, Z Wang arXiv preprint arXiv:2410.19123, 2024 | | 2024 |
Electronic device and control method therefor B Kim, D Lee, K Sejung, RO Yeonju, P Baeseong, J Yongkweon US Patent App. 18/131,164, 2023 | | 2023 |
Lowering the Pre-training Tax for Gradient-based Subset Training: A Lightweight Distributed Pre-Training Toolkit Y Ro, Z Wang, V Chidambaram, A Akella International Conference on Machine Learning, 29130-29142, 2023 | | 2023 |
Dataset Efficient Training with Model Ensembling Y Ro, C Xu, A Ciborowska, S Bhattacharya, F Li, M Foltin Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | | 2023 |
Sequential Encryption of Sparse Neural Networks Toward Optimum Representation of Irregular Sparsity. B Park, SJ Kwon, D Lee, D Oh, B Kim, Y Jeon, Y Ro CoRR, 2021 | | 2021 |
Optimizing Transformer Inference with Selective Distillation: Layerwise Conversion to Linear Attention Y Ro, Z Zhang, V Chidambaram, A Akella | | |