Now fluentspeech.py also manages validation set. #271
Hi Umberto, the thing is that in CL you are not supposed to have a valid set from the beginning. You are supposed to produce it at each task from the training data.
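A minimal sketch of the per-task split described above, in plain Python. It is not Continuum's API: the function name `split_task_train_val`, the `tasks` dictionary, and the 20% validation fraction are all assumptions made for illustration.

```python
import random


def split_task_train_val(task_indices, val_fraction=0.1, seed=0):
    """Split one task's training indices into (train, val) lists.

    Hypothetical helper: shuffles a copy of the indices with a seeded RNG
    and holds out the first `val_fraction` of them for validation.
    """
    rng = random.Random(seed)
    shuffled = list(task_indices)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical stream of three tasks, each owning ten training sample ids.
tasks = {t: list(range(t * 10, (t + 1) * 10)) for t in range(3)}

splits = {}
for task_id, indices in tasks.items():
    # The validation set is carved out of the task's own training data,
    # so no validation data exists before the task is seen.
    splits[task_id] = split_task_train_val(indices, val_fraction=0.2, seed=task_id)
```

Each entry of `splits` holds a `(train, val)` pair for one task; the test set is left untouched and stays the same for everyone.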
Hmm, I see. The FluentSpeech Commands dataset provides the valid_data.csv file together with the train and test sets, so I can't extract a portion of the train set because the valid set already exists. Well, I'll see how to pull it off anyways 👍 Thank you Timothée :)
Is it possible to merge train and valid to create the training data?
I think that, by and large, it is possible. Yet if we merge train and valid and then split again, the original validation samples will not be the same as the new ones brought by
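To make the concern concrete, here is a small sketch of merging and re-splitting, under stated assumptions: the sample ids are placeholders standing in for rows of the dataset's train and valid CSV files, and `merge_and_resplit` is a hypothetical helper, not part of Continuum or of fluentspeech.py.

```python
import random

# Hypothetical sample ids standing in for rows of the train and valid CSV files.
original_train = [f"train_{i}" for i in range(80)]
original_valid = [f"valid_{i}" for i in range(20)]


def merge_and_resplit(train, valid, val_fraction=0.2, seed=0):
    """Pool train and valid, shuffle, and hold out a new validation split."""
    rng = random.Random(seed)
    pool = list(train) + list(valid)
    rng.shuffle(pool)
    n_val = int(len(pool) * val_fraction)
    return pool[n_val:], pool[:n_val]


new_train, new_valid = merge_and_resplit(original_train, original_valid)

# Samples that sit in both the original and the re-drawn validation set.
overlap = set(new_valid) & set(original_valid)
print(f"{len(overlap)} of {len(original_valid)} original valid samples kept")
```

The new validation set has the same size as the original one, but a random re-split gives no guarantee that it contains the same samples, which is why results on it are not directly comparable with results on the authors' split.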
@TLESORT I think we can allow having a defined val dataset; I think a few other datasets in Continuum provide one. It can be useful when everyone compares on the same val set.
Yes, we can potentially accept it. But first, having the same valid set is not supposed to be a requirement. The mandatory thing is to have the same test set. I also think we already have some datasets with valid sets, but it is against my will ;D :P So, we can merge this, but we need to keep in mind that some particularities in the scenario make it not perfectly rigorous...
Well, the split into train, test and valid was made by the authors who created the corpus, and I don't know how they crafted the different sets. Since I'm the first to use FSC in a CL scenario, I think it could be ok to proceed this way, and I understand your rigor on this matter. So, you have the last word about this.
@umbertocappellazzo I'm merging this PR, and I've created an issue to discuss what you've just talked about, as it's quite a different matter (#272).
@umbertocappellazzo note that it has been deployed in version 1.2.4. :) Thanks for your contrib!
Hi, I modified the fluentspeech.py file so that it can deal with the validation set as well (it was missing).
Cheers