Detai Xin
Verified email at ipc.i.u-tokyo.ac.jp
Title
Cited by
Year
UTMOS: UTokyo-SaruLab system for VoiceMOS Challenge 2022
T Saeki, D Xin, W Nakata, T Koriyama, S Takamichi, H Saruwatari
arXiv preprint arXiv:2204.02152, 2022
Cited by 180, Year 2022
NaturalSpeech 3: Zero-shot speech synthesis with factorized codec and diffusion models
Z Ju, Y Wang, K Shen, X Tan, D Xin, D Yang, Y Liu, Y Leng, K Song, ...
arXiv preprint arXiv:2403.03100, 2024
Cited by 130, Year 2024
Disentangled speaker and language representations using mutual information minimization and domain adaptation for cross-lingual TTS
D Xin, T Komatsu, S Takamichi, H Saruwatari
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by 37, Year 2021
RALL-E: Robust codec language modeling with chain-of-thought prompting for text-to-speech synthesis
D Xin, X Tan, K Shen, Z Ju, D Yang, Y Wang, S Takamichi, H Saruwatari, ...
arXiv preprint arXiv:2404.03204, 2024
Cited by 24, Year 2024
Cross-Lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space
D Xin, Y Saito, S Takamichi, T Koriyama, H Saruwatari
Interspeech, 2947-2951, 2020
Cited by 24, Year 2020
Improving speech prosody of audiobook text-to-speech synthesis with acoustic and textual contexts
D Xin, S Adavanne, F Ang, A Kulkarni, S Takamichi, H Saruwatari
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by 16, Year 2023
Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis
D Xin, Y Saito, S Takamichi, T Koriyama, H Saruwatari
Interspeech, 1614-1618, 2021
Cited by 15, Year 2021
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech
D Yang, T Koriyama, Y Saito, T Saeki, D Xin, H Saruwatari
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by 14, Year 2023
Exploring the effectiveness of self-supervised learning and classifier chains in emotion recognition of nonverbal vocalizations
D Xin, S Takamichi, H Saruwatari
arXiv preprint arXiv:2206.10695, 2022
Cited by 12, Year 2022
JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions
D Xin, J Jiang, S Takamichi, Y Saito, A Aizawa, H Saruwatari
IEEE Access, 2024
Cited by 10, Year 2024
Coco-Nut: Corpus of Japanese utterance and voice characteristics description for prompt-based control
A Watanabe, S Takamichi, Y Saito, W Nakata, D Xin, H Saruwatari
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023
Cited by 9, Year 2023
Laughter synthesis using pseudo phonetic tokens with a large-scale in-the-wild laughter corpus
D Xin, S Takamichi, A Morimatsu, H Saruwatari
arXiv preprint arXiv:2305.12442, 2023
Cited by 9, Year 2023
JNV corpus: A corpus of Japanese nonverbal vocalizations with diverse phrases and emotions
D Xin, S Takamichi, H Saruwatari
Speech Communication 156, 103004, 2024
Cited by 5, Year 2024
Mid-attribute speaker generation using optimal-transport-based interpolation of Gaussian mixture models
A Watanabe, S Takamichi, Y Saito, D Xin, H Saruwatari
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by 5, Year 2023
BigCodec: Pushing the limits of low-bitrate neural speech codec
D Xin, X Tan, S Takamichi, H Saruwatari
arXiv preprint arXiv:2409.05377, 2024
Cited by 4, Year 2024
How generative spoken language modeling encodes noisy speech: Investigation from phonetics to syntactics
J Park, S Takamichi, T Nakamura, K Seki, D Xin, H Saruwatari
arXiv preprint arXiv:2306.00697, 2023
Cited by 3, Year 2023
Building speech corpus with diverse voice characteristics for its prompt-based representation
A Watanabe, S Takamichi, Y Saito, W Nakata, D Xin, H Saruwatari
arXiv preprint arXiv:2403.13353, 2024
Year 2024
Coco-Nut: Corpus of connecting Nihongo utterance and text toward prompt-based control of voice characteristics
A Watanabe, S Takamichi, Y Saito, D Xin, H Saruwatari
Proceedings of the Meeting of the Acoustical Society of Japan (CD-ROM) 2023, 3-9, 2023
Year 2023
Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation
D Xin, S Takamichi, T Okamoto, H Kawai, H Saruwatari
arXiv preprint arXiv:2204.10561, 2022
Year 2022
Emotional Speech with Nonverbal Vocalizations: Corpus Design, Synthesis, and Detection
D Xin