From ebe7999a05786d6a77125e4662ff0f63239a3261 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Enrique=20Hern=C3=A1ndez=20Calabr=C3=A9s?=
Date: Wed, 14 Jul 2021 09:51:07 +0000
Subject: [PATCH] Update README.md

---
 README.md | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a790205..441cc62 100644
--- a/README.md
+++ b/README.md
@@ -13,7 +13,14 @@ should probably proofread and complete it, then remove this comment. -->
 
 # wav2vec2-lg-xlsr-en-speech-emotion-recognition
 
-This model is a fine-tuned version of [jonatasgrosman/wav2vec2-large-xlsr-53-english](https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-english) on an unkown dataset.
+This model is a fine-tuned version of [jonatasgrosman/wav2vec2-large-xlsr-53-english](https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-english) for a Speech Emotion Recognition (SER) task.
+
+The dataset used to fine-tune the original pre-trained model is the [RAVDESS dataset](https://zenodo.org/record/1188976#.YO6yI-gzaUk). This dataset provides 1440 samples of recordings of actors performing 8 different emotions in English:
+
+```python
+emotions = ['angry', 'calm', 'disgust', 'fearful', 'happy', 'neutral', 'sad', 'surprised']
+```
+
 It achieves the following results on the evaluation set:
 - Loss: 0.5023
 - Accuracy: 0.8223