Update README.md

This commit is contained in:
Hartmann 2021-08-29 09:17:08 +00:00 committed by huggingface-web
parent a51ab62183
commit 312b82913a
1 changed files with 4 additions and 2 deletions

View File

@ -46,7 +46,7 @@ Thanks to Samuel Domdey and chrsiebert for their support in making this model av
## Appendix 📚
Please find an overview of the datasets used for training below. All datasets contain English text. The table summarizes which emotions are available in each of the datasets.
Please find an overview of the datasets used for training below. All datasets contain English text. The table summarizes which emotions are available in each of the datasets.
|Name|anger|disgust|fear|joy|neutral|sadness|surprise|
|---|---|---|---|---|---|---|---|
@ -57,4 +57,6 @@ Please find an overview of the datasets used for training below. All datasets co
|MELD, Poria et al. (2019)|Yes|Yes|Yes|Yes|Yes|Yes|Yes|
|SemEval-2018, EI-reg (Mohammad et al. 2018) |Yes|-|Yes|Yes|-|Yes|-|
The datasets represent a diverse collection of text types. Specifically, they contain emotion labels for texts from Twitter, Reddit, student self-reports, and utterances from TV dialogues. As MELD (Multimodal EmotionLines Dataset) extends the popular EmotionLines dataset, EmotionLines itself is not included here.
The datasets represent a diverse collection of text types. Specifically, they contain emotion labels for texts from Twitter, Reddit, student self-reports, and utterances from TV dialogues. As MELD (Multimodal EmotionLines Dataset) extends the popular EmotionLines dataset, EmotionLines itself is not included here.
The model is trained on a balanced subset from the datasets listed above (2,811 observations per emotion, i.e., nearly 20k observations in total). The evaluation accuracy on a holdout test set is 66% (and significantly above the random-chance baseline of 1/7 = 14%).