Is there a way or are there plans to introduce emotions? #187

Open
juangea opened this issue Aug 30, 2024 · 4 comments

Comments

@juangea

juangea commented Aug 30, 2024

Hi there.

I'm using Melo and the quality is very good; it works very well in Spanish. However, the output is emotionless and feels dead. That's great for some uses, but for others it is too monotone. Is there a way to introduce emotion, or are there any plans to do so?

Thanks!

@juangea changed the title from "Is there a way or Are there plans to introduce emotions?" to "Is there a way or are there plans to introduce emotions?" on Aug 30, 2024
@dezynetechnologies

dezynetechnologies commented Oct 10, 2024

I guess you can use tone coloring from OpenVoice. We have used MeloTTS to create AI-generated voices from a clear recording of a single speaker, around 40-45 minutes long. We trained for around 1000 epochs.

@juangea
Author

juangea commented Oct 10, 2024

How could we use tone coloring?

Does the reference audio change both the tone and the emotion? I thought it only cloned the voice, not the emotion.

For us, Melo gives good results in terms of speech clarity. The problem is that the intonation is too emotionless and monotone, and that's what we are trying to solve.

Does the voice you trained sound more natural?

@SayanoAI

All open source TTS models will be emotionless because the data they are trained on is emotionless (e.g. LibriSpeech). Closed-source models (like those from OpenAI and ElevenLabs) can get away with using copyrighted data or paying people for emotional voice data to train their models.

@juangea
Author

juangea commented Oct 16, 2024

I don't agree with that assumption.

There are techniques to introduce emotion, like the one used by F5. Bark also has emotion in it; I'm not sure about its dataset, but it's open source.

Emotion can be introduced by providing a reference and then transferring that reference to the inference output, so the dataset is not so important with these techniques.

Toucan does something similar too.
