An empirical analysis of compute-optimal large language model training J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... Advances in Neural Information Processing Systems 35, 30016-30030, 2022 | 1774* | 2022 |
Scaling language models: Methods, analysis & insights from training gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 1077 | 2021 |
Improving language models by retrieving from trillions of tokens S Borgeaud, A Mensch, J Hoffmann, T Cai, E Rutherford, K Millican, ... International conference on machine learning, 2206-2240, 2022 | 1016 | 2022 |
Skilful precipitation nowcasting using deep generative models of radar S Ravuri, K Lenc, M Willson, D Kangin, R Lam, P Mirowski, M Fitzsimons, ... Nature 597 (7878), 672-677, 2021 | 826 | 2021 |
Stabilizing transformers for reinforcement learning E Parisotto, F Song, J Rae, R Pascanu, C Gulcehre, S Jayakumar, ... International conference on machine learning, 7487-7498, 2020 | 426 | 2020 |
Adversarial video generation on complex datasets A Clark, J Donahue, K Simonyan arXiv preprint arXiv:1907.06571, 2019 | 320* | 2019 |
High fidelity speech synthesis with adversarial networks M Bińkowski, J Donahue, S Dieleman, A Clark, E Elsen, N Casagrande, ... arXiv preprint arXiv:1909.11646, 2019 | 305 | 2019 |
The DeepMind JAX Ecosystem I Babuschkin, K Baumli, A Bell, S Bhupatiraju, J Bruce, P Buchlovsky, ... URL http://github.com/deepmind 24, 25, 2020 | 166* | 2020 |
Unified scaling laws for routed language models A Clark, D de Las Casas, A Guy, A Mensch, M Paganini, J Hoffmann, ... International conference on machine learning, 4057-4086, 2022 | 154* | 2022 |
V-MPO: On-policy maximum a posteriori policy optimization for discrete and continuous control HF Song, A Abdolmaleki, JT Springenberg, A Clark, H Soyer, JW Rae, ... arXiv preprint arXiv:1909.12238, 2019 | 122 | 2019 |
Transformation-based adversarial video prediction on large-scale data P Luc, A Clark, S Dieleman, DL Casas, Y Doron, A Cassirer, K Simonyan arXiv preprint arXiv:2003.04035, 2020 | 71 | 2020 |
GPT-4o system card A Hurst, A Lerer, AP Goucher, A Perelman, A Ramesh, A Clark, AJ Ostrow, ... arXiv preprint arXiv:2410.21276, 2024 | 31 | 2024 |
TF-Replicator: Distributed machine learning for researchers P Buchlovsky, D Budden, D Grewe, C Jones, J Aslanides, F Besse, ... arXiv preprint arXiv:1902.00465, 2019 | 25 | 2019 |
Podracer architectures for scalable reinforcement learning M Hessel, M Kroiss, A Clark, I Kemaev, J Quan, T Keck, F Viola, ... arXiv preprint arXiv:2104.06272, 2021 | 24 | 2021 |
Unsupervised authorial clustering based on syntactic structure A Daks, A Clark Proceedings of the ACL 2016 Student Research Workshop, 114-118, 2016 | 9 | 2016 |
Generative Adversarial Networks with Temporal and Spatial Discriminators for Efficient Video Generation A Clark, J Donahue, K Simonyan US Patent App. 17/613,694, 2022 | 3 | 2022 |
Recurrent unit for generating or processing a sequence of images P Luc, A Clark, S Dieleman, K Simonyan US Patent App. 17/797,198, 2023 | 1 | 2023 |
Training conditional computation neural networks using reinforcement learning A Clark, A Mensch US Patent App. 18/076,978, 2023 | | 2023 |
A Contextual Discretization framework for compressing Recurrent Neural Networks A Clark, VU Prabhu, J Whaley | | 2017 |
Open Library of Bioscience S Ravuri, K Lenc, M Willson, D Kangin, R Lam, P Mirowski, M Fitzsimons, ... | | |