
Vevo: Controllable Zero-Shot Voice Imitation with
We present the performance of Vevo-Style in the zero-shot style imitation task, focusing on widely studied styles such as accent and emotion. Notably, Vevo-Style employs a zero-shot manner (i.e., using just a few seconds of speech) to achieve style imitation, which is …
amphion/Vevo - Hugging Face
We present our reproduction of Vevo, a versatile zero-shot voice imitation framework with controllable timbre and style. We invite you to explore the audio samples to experience Vevo's capabilities firsthand. We have included the following pre-trained Vevo models at Amphion: Vevo-Timbre: It can conduct style-preserved voice conversion.
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised ...
Feb 11, 2025 · However, existing methods rely heavily on annotated data, and struggle with effectively disentangling timbre and style, leading to challenges in achieving controllable generation, especially in zero-shot scenarios. To address these issues, we propose Vevo, a versatile zero-shot voice imitation framework with controllable timbre and style.
Amphion/models/vc/vevo/README.md at main - GitHub
We present our reproduction of Vevo, a versatile zero-shot voice imitation framework with controllable timbre and style. We invite you to explore the audio samples to experience Vevo's capabilities firsthand. We have included the following pre-trained Vevo models at Amphion: Vevo-Timbre: It can conduct style-preserved voice conversion.
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised ...
Jan 22, 2025 · TL;DR: We propose a versatile zero-shot voice imitation framework, with controllable timbre and style. The imitation of voice, targeted on specific speech attributes such as timbre and speaking style, is crucial in speech generation.
Mara Sattei - Shot (Live Performance) | Vevo - YouTube
Jan 26, 2022 · Mara Sattei live with “Shot” in an exclusive performance for Vevo.
StableVC: Style Controllable Zero-Shot Voice Conversion with ...
Dec 6, 2024 · Experiments demonstrate that our proposed StableVC outperforms state-of-the-art baseline systems in zero-shot VC and achieves flexible control over timbre and style from different unseen speakers. Moreover, StableVC offers approximately 25x and 1.65x faster sampling compared to autoregressive and diffusion-based baselines.
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised ...
Mar 7, 2025 · However, existing methods rely heavily on annotated data, and struggle with effectively disentangling timbre and style, leading to challenges in achieving controllable generation, especially in zero-shot scenarios. To address these issues, we propose Vevo, a versatile zero-shot voice imitation framework with controllable timbre and style.
Connected Papers | Find and explore academic papers
Showing paper suggestions for "Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement". Choose a paper to build a graph: Search powered by Semantic Scholar
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised ...
We introduce Vevo, a versatile zero-shot voice imitation framework featuring controllable timbre and style. Vevo consists of two primary stages: content-style modeling via an autoregressive transformer, and acoustic modeling via a flow matching transformer.