Towards accountability for machine learning datasets: Practices from software engineering and infrastructure B Hutchinson, A Smart, A Hanna, E Denton, C Greer, O Kjartansson, ... Proceedings of the 2021 ACM Conference on Fairness, Accountability, and …, 2021 | 139 | 2021 |
Crowd-sourced speech corpora for Javanese, Sundanese, Sinhala, Nepali, and Bangladeshi Bengali O Kjartansson, S Sarin, K Pipatsrisawat, M Jansche, L Ha | 48 | 2018 |
Open-source multi-speaker speech corpora for building Gujarati, Kannada, Malayalam, Marathi, Tamil and Telugu speech synthesis systems F He, SHC Chu, O Kjartansson, CE Rivera, A Katanova, A Gutkin, ... | 38 | 2020 |
Open-source multi-speaker corpora of the English accents in the British Isles I Demirsahin, O Kjartansson, A Gutkin, C Rivera Proceedings of the Twelfth Language Resources and Evaluation Conference …, 2020 | 26 | 2020 |
A step-by-step process for building TTS voices using open source data and framework for Bangla, Javanese, Khmer, Nepali, Sinhala, and Sundanese K Sodimana, K Pipatsrisawat, L Ha, M Jansche, O Kjartansson, P De Silva, ... | 25 | 2018 |
Rapid development of TTS corpora for four South African languages D van Niekerk, C van Heerden, M Davel, N Kleynhans, O Kjartansson, ... | 23 | 2017 |
Almannaromur: An open Icelandic speech corpus J Guðnason, O Kjartansson, J Jóhannsson, E Carstensdóttir, ... Spoken Language Technologies for Under-Resourced Languages, 2012 | 22 | 2012 |
Data cards: Purposeful and transparent dataset documentation for responsible AI M Pushkarna, A Zaldivar, O Kjartansson 2022 ACM Conference on Fairness, Accountability, and Transparency, 1776-1826, 2022 | 18 | 2022 |
Developing an open-source corpus of Yoruba speech A Gutkin, I Demirsahin, O Kjartansson, CE Rivera, K Túbòsún | 16 | 2020 |
Crowdsourcing Latin American Spanish for low-resource text-to-speech A Guevara-Rukoz, I Demirsahin, F He, SHC Chu, S Sarin, K Pipatsrisawat, ... | 15 | 2020 |
Building open Javanese and Sundanese corpora for multilingual text-to-speech JAE Wibawa, S Sarin, CF Li, K Pipatsrisawat, K Sodimana, O Kjartansson, ... | 14 | 2018 |
Building statistical parametric multi-speaker synthesis for Bangladeshi Bangla A Gutkin, L Ha, M Jansche, O Kjartansson, K Pipatsrisawat, R Sproat Procedia Computer Science 81, 194-200, 2016 | 14 | 2016 |
Open-source high quality speech datasets for Basque, Catalan and Galician O Kjartansson, A Gutkin, A Butryna, I Demirsahin, C Rivera Proceedings of the 1st Joint Workshop on Spoken Language Technologies for …, 2020 | 12 | 2020 |
Burmese speech corpus, finite-state text normalization and pronunciation grammars with an application to text-to-speech YM Oo, A Theeraphol, CF Li, P De Silva, S Sarin, K Pipatsrisawat, ... | 9 | 2020 |
Google crowdsourced speech corpora and related open-source resources for low-resource languages and dialects: an overview A Butryna, SHC Chu, I Demirsahin, A Gutkin, L Ha, F He, M Jansche, ... arXiv preprint arXiv:2010.06778, 2020 | 5 | 2020 |
Towards accountability for machine learning datasets A Hanna, A Smart, B Hutchinson, C Greer, E Denton, M Mitchell, ... | 2 | 2021 |
Data Cards: Purposeful and Transparent Dataset Documentation for Responsible AI A Zaldivar, M Pushkarna, O Kjartansson | | 2022 |
Málrómur J Guðnason, O Kjartansson, J Jóhannsson, E Carstensdóttir, ... The Árni Magnússon Institute for Icelandic Studies, 2014 | | 2014 |