how to train a cosyvoice2 Supervised speech tokenizer with FSQ? #896

weedge · 2025-01-17T05:28:16Z

This is similar to the question in issue #121.

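How should the training data be constructed? Is the format the same as the fine-tuning data format for SenseVoice-Large ASR? When adding FSQ (https://github.com/google-research/google-research/tree/master/fsq | https://github.com/lucidrains/vector-quantize-pytorch#finite-scalar-quantization), does the model need to be trained from scratch?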

aluminumbox · 2025-01-17T06:46:32Z

You can implement the FSQ training code yourself by following the report; it is basically the same as training ASR.
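For readers who want a starting point, here is a minimal sketch of the FSQ quantization step only, not the official training code. It assumes each projected dimension is bounded to [-K, K] with `tanh` and rounded, and that the token index is a base-(2K+1) mixed-radix number over the D dimensions. The function names `fsq_quantize` and `fsq_index` and the default `K=1` are illustrative assumptions. In actual training the rounding would sit inside an autograd framework with a straight-through estimator, which is not shown here.

```python
import math


def fsq_quantize(z, K=1):
    """Bound each value to (-K, K) with tanh, then round to the nearest integer.

    z is one low-dimensional vector (a list of floats), assumed to come from a
    down-projection of an encoder hidden state.
    """
    return [int(round(K * math.tanh(v))) for v in z]


def fsq_index(codes, K=1):
    """Turn quantized codes into one token index.

    Each code is shifted from [-K, K] to [0, 2K] and read as a digit in
    base 2K+1 (assumed convention; check it against the report).
    """
    base = 2 * K + 1
    return sum((c + K) * base ** j for j, c in enumerate(codes))


codes = fsq_quantize([2.0, -2.0, 0.1, 0.0])
token = fsq_index(codes)
```

With `K=1` and D dimensions, the implied codebook size is `3 ** D`, for example 6561 when D is 8.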

weedge · 2025-01-17T14:31:07Z

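@aluminumbox OK, thanks. The paper mentions a pitch loss?

It says a comparison experiment was run by adding a pitch loss as a constraint during training of the FSQ-based speech tokenizer, and that this approach performed better on downstream TTS tasks.

Is there a background paper for the pitch loss, or any prior work? I searched but could not find a source :(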

weedge changed the title ~~how to trainning a cosyvoice2 Supervised speech tokenizer with FSQ?~~ how to train a cosyvoice2 Supervised speech tokenizer with FSQ? Jan 17, 2025
