This project focuses on using machine learning models to predict user moods based on audio and metadata from songs. By leveraging features like track popularity, danceability, energy, and more, the project aims to classify songs into mood categories for use in music therapy.
- `Original/` - Contains the raw dataset (`therapeutic_music_enriched.csv`) along with other versions of the dataset, including the Y.M.I.R (Yielding Melodies for Internal Restoration) dataset.
- `XGBoost/` - Contains the XGBoost model and associated files (e.g., `xgboost_model.json`).
- `LGBM/` - Contains the LightGBM model and associated files (e.g., `best_lgbm_model.txt`).
- `CatBoost/` - Contains the CatBoost model and associated files (e.g., `catboost_model.json`).
- `Random-Forest/` - Contains the Random Forest model and associated files (e.g., `random_forest_model.joblib`).
- `CNN/` - Contains the trained CNN model (`best_cnn_model.pth`) and preprocessing files (`cnn_scaler.pkl`, `cnn_label_encoder.pkl`).
- `BERT/` - Contains the BERT model training script and files.
- `BART/` - Contains the BART model training script and files.
- `DEBERTA/` - Contains the DeBERTa model training script and files.
- `ELECTRA/` - Contains the ELECTRA model training script and files.
- `INDIC-BERT/` - Contains the Indic-BERT model training script and files.
- `Naive-Bayes/` - Contains the Naive Bayes classifier training script and files.
- `Ensemble-Voting/` - Contains the script for ensemble learning using majority voting.
Because the trained models are too large to store in the repository and upload to GitHub, we have provided their training scripts instead. Run the scripts below to obtain the trained models along with their accuracy, F1 score, and other metrics.
- `XGBoost/XGBoost_Training.py` - Script to train and save the XGBoost model.
- `LGBM/LGBM_Training.py` - Script to train and save the LightGBM model.
- `CatBoost/CatBoost_Training.py` - Script to train and save the CatBoost model.
- `Random-Forest/Random-Forest_Training.py` - Script to train and save the Random Forest model.
- `CNN/CNN_Training.py` - Script to train and save the CNN model.
- `BERT/BERT_Training.py` - Script to train and save the BERT model.
- `BART/BART_Training.py` - Script to train and save the BART model.
- `DEBERTA/DEBERTA_Training.py` - Script to train and save the DeBERTa model.
- `ELECTRA/ELECTRA_Training.py` - Script to train and save the ELECTRA model.
- `INDIC-BERT/INDIC-BERT_Training.py` - Script to train and save the Indic-BERT model.
- `Naive-Bayes/NaiveBayes_Training.py` - Script to train and save the Naive Bayes classifier.
- `Ensemble-Voting/Ensemble-Voting.py` - Script to combine predictions from XGBoost, LightGBM, CatBoost, Random Forest, and CNN using majority voting. To run this, you must first train those five models.
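The majority-voting step can be sketched as follows (a minimal illustration of the idea, not the actual `Ensemble-Voting.py` implementation; the mood labels and per-model prediction lists are hypothetical):

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label predictions by majority vote.

    predictions_per_model: a list of prediction lists, one per model,
    all of the same length (one label per sample).
    """
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        votes = [preds[i] for preds in predictions_per_model]
        # most_common(1) gives the label with the most votes;
        # ties are broken by first occurrence among the models.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical predictions from the five models for three songs:
xgb  = ["Happy", "Sad", "Calm"]
lgbm = ["Happy", "Sad", "Energetic"]
cat  = ["Happy", "Calm", "Calm"]
rf   = ["Sad", "Sad", "Calm"]
cnn  = ["Happy", "Sad", "Calm"]

print(majority_vote([xgb, lgbm, cat, rf, cnn]))
# ['Happy', 'Sad', 'Calm']
```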
```shell
$ git clone https://github.com/AetherSparks/Sentiment-Analysis-in-Music-Therapy.git
$ cd Sentiment-Analysis-in-Music-Therapy
$ python -m venv musicvenv
$ source musicvenv/bin/activate  # For Linux/Mac
$ musicvenv\Scripts\activate     # For Windows
```
Ensure all required libraries are installed by using the `requirements.txt` file:

```shell
$ pip install -r requirements.txt
```
Train each model by running its respective training script. Example:

```shell
$ python XGBoost/XGBoost_Training.py
$ python LGBM/LGBM_Training.py
$ python CatBoost/CatBoost_Training.py
$ python Random-Forest/Random-Forest_Training.py
$ python CNN/CNN_Training.py
$ python BERT/BERT_Training.py
$ python BART/BART_Training.py
$ python DEBERTA/DEBERTA_Training.py
$ python ELECTRA/ELECTRA_Training.py
$ python INDIC-BERT/INDIC-BERT_Training.py
$ python Naive-Bayes/NaiveBayes_Training.py
```
Combine predictions using the ensemble script:

```shell
$ python Ensemble-Voting/Ensemble-Voting.py
```
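Each training script reports metrics such as accuracy and F1 score. With scikit-learn, those are typically computed along these lines (a sketch with made-up labels, not code taken from the scripts):

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical true vs. predicted mood labels for six songs.
y_true = ["Happy", "Sad", "Calm", "Happy", "Sad", "Calm"]
y_pred = ["Happy", "Sad", "Calm", "Sad", "Sad", "Calm"]

acc = accuracy_score(y_true, y_pred)
# Macro-averaged F1 weights all mood classes equally, which matters
# when some moods are rarer than others in the dataset.
f1 = f1_score(y_true, y_pred, average="macro")

print(f"accuracy: {acc:.2f}")  # 5 of 6 correct -> 0.83
print(f"macro F1: {f1:.2f}")
```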
- **XGBoost** - Gradient boosting framework optimized for performance and efficiency.
- **LightGBM** - Fast, distributed gradient boosting framework.
- **CatBoost** - Gradient boosting on decision trees with categorical feature support.
- **Random Forest** - Ensemble learning method using decision trees.
- **Convolutional Neural Network (CNN)** - Deep learning model for analyzing numerical features.
- **BERT** - Transformer model for natural language processing tasks.
- **BART** - Denoising autoencoder for sequence-to-sequence tasks.
- **DeBERTa** - Enhanced BERT model for improved representation.
- **ELECTRA** - Transformer model pre-trained as a discriminator.
- **Indic-BERT** - BERT model optimized for Indian languages.
- **Naive Bayes** - Probabilistic classifier based on Bayes' theorem.
- **Ensemble Majority Voting** - Combines predictions from XGBoost, LightGBM, CatBoost, Random Forest, and CNN to improve accuracy.
- Path: `Original/therapeutic_music_enriched.csv`
- Features: `Track Popularity`, `Danceability`, `Energy`, `Key`, `Loudness`, `Mode`, `Speechiness`, `Acousticness`, `Instrumentalness`, `Liveness`, `Valence`, `Tempo`, `Duration (ms)`
- Target: `Mood_Label`
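Loading the dataset into a feature matrix and target vector with pandas might look like this (a sketch: the column names come from the list above, but the `load_dataset` helper is our own and is not part of the training scripts):

```python
import pandas as pd

# Feature and target columns of Original/therapeutic_music_enriched.csv.
FEATURES = [
    "Track Popularity", "Danceability", "Energy", "Key", "Loudness",
    "Mode", "Speechiness", "Acousticness", "Instrumentalness",
    "Liveness", "Valence", "Tempo", "Duration (ms)",
]
TARGET = "Mood_Label"

def load_dataset(csv_path):
    """Read the CSV and split it into features X and target y."""
    df = pd.read_csv(csv_path)
    return df[FEATURES], df[TARGET]
```

Each training script can then fit its model on `X` and `y`, typically after a train/test split.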
The following accuracies were achieved during testing:
| Model | Accuracy |
|---|---|
| XGBoost | 0.97 |
| LightGBM | 0.97 |
| CatBoost | 0.96 |
| Random Forest | 0.92 |
| CNN | 0.80 |
| BERT | 0.44 |
| BART | 0.42 |
| DeBERTa | 0.35 |
| ELECTRA | 0.45 |
| Indic-BERT | 0.46 |
| Naive Bayes | 0.40 |
| Ensemble | 0.965 |
In-depth scores for these models are saved in the `Results` directory.
https://drive.google.com/drive/folders/1Iia5wi49W-TZfKnQyKRpj2g0CzBd1yn3?usp=sharing