Tacotron: Towards end-to-end speech synthesis Y Wang, RJ Skerry-Ryan, D Stanton, Y Wu, RJ Weiss, N Jaitly, Z Yang, ... arXiv preprint arXiv:1703.10135, 2017 | 2056* | 2017 |
Style tokens: Unsupervised style modeling, control and transfer in end-to-end speech synthesis Y Wang, D Stanton, Y Zhang, RJS Ryan, E Battenberg, J Shor, Y Xiao, ... International Conference on Machine Learning, 5180-5189, 2018 | 803 | 2018 |
Towards end-to-end prosody transfer for expressive speech synthesis with Tacotron RJ Skerry-Ryan, E Battenberg, Y Xiao, Y Wang, D Stanton, J Shor, ... International Conference on Machine Learning, 4693-4702, 2018 | 589 | 2018 |
Predicting expressive speaking style from text in end-to-end speech synthesis D Stanton, Y Wang, RJ Skerry-Ryan 2018 IEEE Spoken Language Technology Workshop (SLT), 595-602, 2018 | 131 | 2018 |
Location-relative attention mechanisms for robust long-form speech synthesis E Battenberg, RJ Skerry-Ryan, S Mariooryad, D Stanton, D Kao, ... ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 113 | 2020 |
Uncovering latent style factors for expressive speech synthesis Y Wang, RJ Skerry-Ryan, Y Xiao, D Stanton, J Shor, E Battenberg, ... arXiv preprint arXiv:1711.00520, 2017 | 83 | 2017 |
Temporal ranking scheme for desktop searching S Raub, A Dingle, D Stanton US Patent 7,529,739, 2009 | 54 | 2009 |
Effective use of variational embedding capacity in expressive end-to-end speech synthesis E Battenberg, S Mariooryad, D Stanton, RJ Skerry-Ryan, M Shannon, ... arXiv preprint arXiv:1906.03402, 2019 | 51 | 2019 |
Semi-supervised generative modeling for controllable speech synthesis R Habib, S Mariooryad, M Shannon, E Battenberg, RJ Skerry-Ryan, ... arXiv preprint arXiv:1910.01709, 2019 | 50 | 2019 |
A systematic comparison of phrase table pruning techniques R Zens, D Stanton, P Xu Proceedings of the 2012 Joint Conference on Empirical Methods in Natural …, 2012 | 49 | 2012 |
Combined title prefix and full-word content searching D Stanton, S Raub, A Dingle US Patent 7,617,197, 2009 | 25 | 2009 |
Speaker generation D Stanton, M Shannon, S Mariooryad, RJ Skerry-Ryan, E Battenberg, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 23 | 2022 |
Non-saturating GAN training as divergence minimization M Shannon, B Poole, S Mariooryad, T Bagby, E Battenberg, D Kao, ... arXiv preprint arXiv:2010.08029, 2020 | 13 | 2020 |
Fix it where it fails: Pronunciation learning by mining error corrections from speech logs Z Kou, D Stanton, F Peng, F Beaufays, T Strohman 2015 IEEE International Conference on Acoustics, Speech and Signal …, 2015 | 13 | 2015 |
Variational embedding capacity in expressive end-to-end speech synthesis ED Battenberg, D Stanton, RJW Skerry-Ryan, S Mariooryad, DT Kao, ... US Patent 11,222,621, 2022 | 7 | 2022 |
Document translation including pre-defined term translator and translation model JJ Chin, D Stanton, VS Thadkal, J Yin US Patent 9,116,886, 2015 | 3 | 2015 |
Learning the joint distribution of two sequences using little or no paired data S Mariooryad, M Shannon, S Ma, T Bagby, D Kao, D Stanton, ... arXiv preprint arXiv:2212.03232, 2022 | 1 | 2022 |
Controlling Expressivity In End-to-End Speech Synthesis Systems D Stanton, ED Battenberg, RJW Skerry-Ryan, S Mariooryad, DT Kao, ... US Patent App. 18/314,556, 2023 | | 2023 |
Variational Embedding Capacity in Expressive End-to-End Speech Synthesis ED Battenberg, D Stanton, RJW Skerry-Ryan, S Mariooryad, DT Kao, ... US Patent App. 18/302,764, 2023 | | 2023 |
Generative semi-supervised learning with a neural seq2seq noisy channel S Mariooryad, M Shannon, S Ma, T Bagby, DTH Kao, D Stanton, ... ICML 2023 Workshop on Structured Probabilistic Inference & Generative …, 2023 | | 2023 |