Reconstruct mel spectrogram from librosa #19
Replies: 3 comments
-
Hi, thanks a lot. I am glad you like it 🙂 If I understand you correctly, you would like to reconstruct a Mel spectrogram you obtained from a wav file using librosa. However, the demo (in this cell; is its output what you want?) also extracts the Mel spectrogram from the raw audio using librosa: SpecVQGAN/feature_extraction/demo_utils.py, lines 348 to 353 (at eee222d), calls get_spectrogram(), which is implemented in SpecVQGAN/feature_extraction/extract_mel_spectrogram.py, lines 166 to 187 (at eee222d). The transforms you need to apply to convert the sound samples to a Mel spectrogram are in SpecVQGAN/feature_extraction/extract_mel_spectrogram.py, lines 141 to 151 (at eee222d).

Just make sure your Mel spectrogram is extracted with the same parameters and that you apply the same transforms (log, clipping, etc.; see the lines above).
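For reference, a rough sketch of that pipeline in Python (the parameter values and post-processing constants below are only illustrative; copy the exact ones from extract_mel_spectrogram.py):

```python
import librosa
import numpy as np

def extract_mel(wav_path, sr=22050, n_fft=1024, hop_length=256, n_mels=80):
    """Sketch of the extraction in feature_extraction/extract_mel_spectrogram.py.
    The defaults above are placeholders; use the exact values from the repo."""
    y, _ = librosa.load(wav_path, sr=sr)
    # mel spectrogram from the raw waveform via librosa
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)
    # post-processing chain (lower-threshold, log10, rescale, clip to [0, 1]);
    # double-check these constants against the transforms defined in the file
    mel = np.log10(np.maximum(mel, 1e-5))
    mel = np.clip((mel * 20 - 20 + 100) / 100, 0.0, 1.0)
    return mel
```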
-
Also, check whether the Neural Audio Codec colab demo makes it any clearer.
-
Hello, thank you very much for these! Will check them out! :D
-
Hello! First of all, thanks for this wonderful repo. I would just like to ask how to reconstruct the Mel spectrogram I generated with librosa. I can do this via VQGAN using this code:
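Roughly, the helper along the lines of the taming-transformers reconstruction example (a sketch; `model` is assumed to be a loaded VQGAN/VQModel):

```python
import torch

def preprocess_vqgan(x):
    # VQGAN expects inputs in [-1, 1], so map from [0, 1]
    return 2. * x - 1.

@torch.no_grad()
def reconstruct_with_vqgan(x, model):
    # encode to the quantized latent, then decode back to an image
    z, _, [_, _, indices] = model.encode(x)
    xrec = model.decode(z)
    return xrec
```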
Here, xrec is the reconstructed image (from the VQGAN).
I also add a preprocessing step before reconstructing, using this code (the same one as in DALL-E's VQ-VAE):
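Roughly, the DALL-E-style preprocessing (a sketch of the helper from the DALL-E notebook; `target_image_size` is illustrative):

```python
import PIL
import torch
import torchvision.transforms as T
import torchvision.transforms.functional as TF
from dall_e import map_pixels  # from the openai/DALL-E package

target_image_size = 256  # illustrative

def preprocess(img):
    # resize the shorter side to target_image_size, center-crop,
    # convert to a batched tensor, and map pixels as DALL-E expects
    s = min(img.size)
    if s < target_image_size:
        raise ValueError(f'min dim for image {s} < {target_image_size}')
    r = target_image_size / s
    s = (round(r * img.size[1]), round(r * img.size[0]))
    img = TF.resize(img, s, interpolation=PIL.Image.LANCZOS)
    img = TF.center_crop(img, output_size=2 * [target_image_size])
    img = torch.unsqueeze(T.ToTensor()(img), 0)
    return map_pixels(img)
```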
In the end, I just call these two functions to reconstruct the image:
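That is, something like this (the file name is just a placeholder):

```python
from PIL import Image

img = Image.open('mel_spectrogram.png')                    # placeholder input image
x = preprocess(img)                                        # DALL-E style preprocessing -> [0, 1] tensor
xrec = reconstruct_with_vqgan(preprocess_vqgan(x), model)  # reconstruction from the VQGAN codebook
```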
I was wondering how I could use your model instead to reconstruct in a similar way. I checked the demo and saw that it extracts the audio from a video; I am wondering how I can directly reconstruct a Mel spectrogram generated with librosa.
Thank you very much in advance :D