Update README.md

Hartmann 2021-08-29 09:17:08 +00:00 committed by huggingface-web
parent a51ab62183
commit 312b82913a
1 changed file with 4 additions and 2 deletions

@@ -58,3 +58,5 @@ Please find an overview of the datasets used for training below. All datasets co
|SemEval-2018, EI-reg (Mohammad et al. 2018) |Yes|-|Yes|Yes|-|Yes|-|
The datasets represent a diverse collection of text types. Specifically, they contain emotion labels for texts from Twitter, Reddit, student self-reports, and utterances from TV dialogues. As MELD (Multimodal EmotionLines Dataset) extends the popular EmotionLines dataset, EmotionLines itself is not included here.
The model is trained on a balanced subset of the datasets listed above (2,811 observations per emotion, i.e., nearly 20k observations in total). On a holdout test set, the evaluation accuracy is 66%, well above the random-chance baseline of 1/7 ≈ 14%.
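
The arithmetic behind these figures can be checked with a short Python sketch. The variable names are illustrative, and the class count of 7 is an assumption inferred from the 1/7 baseline rather than stated directly in this hunk.

```python
# Figures quoted in the README text; names here are illustrative only.
NUM_EMOTIONS = 7          # assumption: inferred from the 1/7 chance baseline
OBS_PER_EMOTION = 2_811   # balanced subset size per emotion
REPORTED_ACCURACY = 0.66  # holdout accuracy reported in the README

# Total size of the balanced training subset.
total_observations = NUM_EMOTIONS * OBS_PER_EMOTION

# With balanced classes, uniform random guessing is right 1/K of the time.
chance_accuracy = 1 / NUM_EMOTIONS

print(f"total observations: {total_observations:,}")                       # 19,677
print(f"chance baseline:    {chance_accuracy:.1%}")                        # 14.3%
print(f"lift over chance:   {REPORTED_ACCURACY / chance_accuracy:.2f}x")   # 4.62x
```

Because the subset is balanced, always predicting a single class would also score about 14%, so the 1/7 figure doubles as the majority-class baseline.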