OPT: Open pre-trained transformer language models S Zhang, S Roller, N Goyal, M Artetxe, M Chen, S Chen, C Dewan, ... arXiv preprint arXiv:2205.01068, 2022 | 2632 | 2022 |
Few-shot learning with multilingual language models XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ... arXiv preprint arXiv:2112.10668, 2021 | 206* | 2021 |
SemDeDup: Data-efficient learning at web-scale through semantic deduplication A Abbas, K Tirumala, D Simig, S Ganguli, AS Morcos arXiv preprint arXiv:2303.09540, 2023 | 140 | 2023 |
OPT-IML: Scaling language model instruction meta learning through the lens of generalization S Iyer, XV Lin, R Pasunuru, T Mihaylov, D Simig, P Yu, K Shuster, T Wang, ... arXiv preprint arXiv:2212.12017, 2022 | 93 | 2022 |
D4: Improving LLM pretraining via document de-duplication and diversification K Tirumala, D Simig, A Aghajanyan, A Morcos Advances in Neural Information Processing Systems 36, 53983-53995, 2023 | 79 | 2023 |
MEGABYTE: modeling million-byte sequences with multiscale transformers L Yu, D Simig, C Flaherty, A Aghajanyan, L Zettlemoyer, M Lewis Proceedings of the 37th International Conference on Neural Information …, 2023 | 70* | 2023 |
Understanding in-context learning via supportive pretraining data X Han, D Simig, T Mihaylov, Y Tsvetkov, A Celikyilmaz, T Wang arXiv preprint arXiv:2306.15091, 2023 | 40 | 2023 |
OPT: Open pre-trained transformer language models. arXiv 2022 S Zhang, S Roller, N Goyal, M Artetxe, M Chen, S Chen, C Dewan, ... arXiv preprint arXiv:2205.01068, 2023 | 30 | 2023 |
Open vocabulary extreme classification using generative models D Simig, F Petroni, P Yanki, K Popat, C Du, S Riedel, M Yazdani arXiv preprint arXiv:2205.05812, 2022 | 21 | 2022 |
Text characterization toolkit (TCT) D Simig, T Wang, V Dankers, P Henderson, K Batsuren, D Hupkes, ... Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the …, 2022 | 7* | 2022 |
Evaluating end-to-end entity linking on domain-specific knowledge bases: Learning about ancient technologies from museum collections S Cadavid-Sanchez, K Kacem, RAM Frade, J Boehm, T Chaney, ... arXiv preprint arXiv:2305.14588, 2023 | 1 | 2023 |
Turning Flows into Trees: Graph Analytics for Aerodynamic Flows D Simig, P Kelly | | 2016 |
Natural Language to Neural Programs D Simig | | |