how to train a cosyvoice2 Supervised speech tokenizer with FSQ? #896

Open
weedge opened this issue Jan 17, 2025 · 2 comments

weedge commented Jan 17, 2025

This is similar to the question in issue #121.

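How should the training data be constructed? Is the format the same as the fine-tuning data format for SenseVoice-Large ASR? With FSQ added (https://github.com/google-research/google-research/tree/master/fsq | https://github.com/lucidrains/vector-quantize-pytorch#finite-scalar-quantization), does the model need to be trained from scratch?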

@weedge weedge changed the title how to trainning a cosyvoice2 Supervised speech tokenizer with FSQ? how to train a cosyvoice2 Supervised speech tokenizer with FSQ? Jan 17, 2025
aluminumbox (Collaborator) commented

You can implement the FSQ training code yourself by following the report; it is basically the same as training ASR.
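
For readers unfamiliar with FSQ, the quantization step discussed in this thread can be sketched roughly as below. This is a minimal pure-Python illustration of generic finite scalar quantization, not the CosyVoice2 training code. The function names `fsq_quantize` and `codes_to_index`, the tanh bounding, the restriction to odd level counts, and the mixed-radix token index are all assumptions made for illustration. In actual training the rounding step has no useful gradient, so it is typically paired with a straight-through estimator, which this sketch leaves out.

```python
import math


def fsq_quantize(z, levels):
    """Quantize each dimension of z to an integer code.

    Hypothetical helper: dimension j is bounded with tanh and rounded to one
    of levels[j] integers in [-(levels[j] - 1) / 2, (levels[j] - 1) / 2].
    Only odd level counts are handled, to keep the sketch short.
    """
    if len(z) != len(levels):
        raise ValueError("z and levels must have the same length")
    codes = []
    for value, num_levels in zip(z, levels):
        if num_levels < 1 or num_levels % 2 == 0:
            raise ValueError("this sketch only supports odd level counts")
        half = (num_levels - 1) // 2
        # tanh maps any real value into (-1, 1); scaling by half bounds the
        # result so that rounding always lands on a valid code.
        codes.append(round(math.tanh(value) * half))
    return codes


def codes_to_index(codes, levels):
    """Map per-dimension codes to a single token id (mixed-radix number)."""
    index, base = 0, 1
    for code, num_levels in zip(codes, levels):
        half = (num_levels - 1) // 2
        index += (code + half) * base  # shift the code into [0, num_levels - 1]
        base *= num_levels
    return index


# Three dimensions with 3 levels each give an implicit codebook of 27 tokens.
levels = [3, 3, 3]
codes = fsq_quantize([0.0, 5.0, -5.0], levels)
token_id = codes_to_index(codes, levels)
```

Unlike a learned VQ codebook, the codebook here is implied by `levels`, so the quantizer itself has no parameters to train; only the surrounding encoder and projection layers are learned.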

weedge commented Jan 17, 2025

@aluminumbox OK, thanks. The paper mentions a pitch loss, doesn't it?

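Comparison experiments were run by adding a pitch loss as a constraint when training the FSQ-based speech tokenizer; this approach was found to perform better on downstream TTS tasks.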

Is there a background paper for the pitch loss, or any earlier research on it? I searched but could not find a source :(
