Flamingo: a visual language model for few-shot learning JB Alayrac, J Donahue, P Luc, A Miech, I Barr, Y Hasson, K Lenc, ... Advances in neural information processing systems 35, 23716-23736, 2022 | 2004 | 2022 |
Training compute-optimal large language models J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... arXiv preprint arXiv:2203.15556, 2022 | 965 | 2022 |
Scaling language models: Methods, analysis & insights from training gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 768 | 2021 |
Improving language models by retrieving from trillions of tokens S Borgeaud, A Mensch, J Hoffmann, T Cai, E Rutherford, K Millican, ... International conference on machine learning, 2206-2240, 2022 | 669 | 2022 |
Gemini: a family of highly capable multimodal models G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023 | 550 | 2023 |
Cyprien de Masson d’Autume JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... | 83 | 2021 |
An empirical analysis of compute-optimal large language model training J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... Advances in Neural Information Processing Systems 35, 30016-30030, 2022 | 67 | 2022 |
Gemma: Open models based on gemini research and technology G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ... arXiv preprint arXiv:2403.08295, 2024 | 50 | 2024 |
Cyprien de Masson d’Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew J JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, HF Song, J Aslanides, ... Johnson, Blake A. Hechtman, Laura Weidinger, Iason Gabriel, William S. Isaac …, 2021 | 48 | 2021 |
Mikołaj Binkowski, Ricardo Barreira, Oriol Vinyals, Andrew Zisserman, and Karén Simonyan. Flamingo: a visual language model for few-shot learning JB Alayrac, J Donahue, P Luc, A Miech, I Barr, Y Hasson, K Lenc, ... Advances in Neural Information Processing Systems 35, 23716-23736, 2022 | 40 | 2022 |
Unified scaling laws for routed language models A Clark, D de Las Casas, A Guy, A Mensch, M Paganini, J Hoffmann, ... International conference on machine learning, 4057-4086, 2022 | 34 | 2022 |
Flamingo: a visual language model for few-shot learning, 2022 JB Alayrac, J Donahue, P Luc, A Miech, I Barr, Y Hasson, K Lenc, ... URL https://arxiv.org/abs/2204.14198 | 23 | |
Flamingo: A visual language model for few-shot learning. arXiv 2022 JB Alayrac, J Donahue, P Luc, A Miech, I Barr, Y Hasson, K Lenc, ... arXiv preprint arXiv:2204.14198 | 20 | |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... arXiv preprint arXiv:2403.05530, 2024 | 18 | 2024 |
Scaling Language Models: Methods, Analysis & Insights from Training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, HF Song, J Aslanides, ... arXiv, 2021 | 18 | 2021 |
Flamingo: a visual language model for few-shot learning. arXiv preprint arXiv:2204.14198 JB Alayrac, J Donahue, P Luc, A Miech, I Barr, Y Hasson, K Lenc, ... | 17 | 2022 |
Driessche J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... G. vd, Damoc, B., Guy, A., Osindero, S., Simonyan, K., Elsen, E., Rae, JW …, 2022 | 16 | 2022 |
Driessche, G. vd, Lespiau, J S Borgeaud, A Mensch, J Hoffmann, T Cai, E Rutherford, K Millican B., Damoc, B., Clark, A., et al, 2206-2240, 2021 | 16 | 2021 |
Training compute-optimal large language models. arXiv 2022 J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... arXiv preprint arXiv:2203.15556, 2022 | 15 | 2022 |
Scaling language models: Methods, analysis & insights from training gopher. arXiv 2021 JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 10 | 2021 |