
GTR-Voice: Articulatory Phonetics Informed Controllable …
June 15, 2024 · With this framework, we record a high-quality speech dataset named GTR-Voice, featuring 20 Chinese sentences articulated by a professional voice actor across 125 distinct GTR combinations.
GTR-Voice: Articulatory Phonetics Informed Controllable …
June 15, 2024 · Specifically, we identify three fundamental dimensions of speech expression at the articulation level, namely Glottalization, Tenseness, and Resonance (GTR). Under this framework, we designed and recorded a high-quality expressive speech dataset comprising 125 distinct GTR types of voice uttered by a single professional voice actor.
Articulatory Phonetics Informed Controllable Expressive Speech ...
June 15, 2024 · With this framework, we record a high-quality speech dataset named GTR-Voice, featuring 20 Chinese sentences articulated by a professional voice actor across 125 distinct GTR combinations.
(PDF) Articulatory Phonetics Informed Controllable Expressive …
June 15, 2024 · Specifically, we define a framework with three dimensions: Glottalization, Tenseness, and Resonance (GTR), to guide the synthesis at the voice production level. With this framework, we record a...
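Session 2 | INTERSPEECH 2024 Paper Preview Talks (University of Rochester Audio Information …
July 4, 2024 · With this framework, we recorded a high-quality speech dataset named GTR-Voice, comprising 20 Chinese sentences spoken by a professional voice actor and covering 125 distinct GTR combinations. We validated the framework and the GTR annotations through automatic classification and listening tests, and demonstrated precise controllability along the GTR dimensions on two fine-tuned expressive TTS models ...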
[PDF] Articulatory Phonetics Informed Controllable Expressive …
This work defines a framework with three dimensions: Glottalization, Tenseness, and Resonance (GTR), to guide the synthesis at the voice production level, and demonstrates precise controllability along the GTR dimensions on two fine-tuned expressive TTS models.
GTR-Voice Subjective Evaluation - demo.gtr-voice.com
1. You will hear 18~19 groups of voices, each group contains a reference voice and three test voices. 2. The text content of the reference voice is different from the text content of the test voice. 3. Click the triangle button on the left side of the audio block with the mouse to play the audio. 4.
Paper tables with annotated results for Articulatory Phonetics …
With this framework, we record a high-quality speech dataset named GTR-Voice, featuring 20 Chinese sentences articulated by a professional voice actor across 125 distinct GTR combinations.
This study introduces a novel GTR framework and dataset to improve control over expressive speech synthesis by focusing on Glottalization, Tenseness, and Resonance. Experimental results show controllability in expressive TTS, and user studies confirm that GTR-based models capture articulatory nuances across various speech dimensions.