Description ℹ
With this model, you can classify emotions in English text data. The model was trained on 6 diverse datasets (see Appendix below) and predicts Ekman's 6 basic emotions, plus a neutral class:
- anger 🤬
- disgust 🤢
- fear 😨
- joy 😀
- neutral 😐
- sadness 😭
- surprise 😲
The model is a fine-tuned checkpoint of DistilRoBERTa-base.
Application 🚀
a) Run the emotion model on a single text example with 3 lines of code, using Hugging Face's pipeline function on Google Colab.
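A minimal sketch of the single-text case, assuming the `transformers` library (with a PyTorch backend) is installed. The Hub model ID below is an assumption, since this README does not state it; replace it with the ID of the checkpoint you are using. The helper names `load_classifier` and `top_emotion` are hypothetical and only serve the illustration.

```python
# Assumption: this Hub model ID is not given in the README; adjust as needed.
MODEL_ID = "j-hartmann/emotion-english-distilroberta-base"


def load_classifier():
    """Build a text-classification pipeline that returns a score for every label."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    # top_k=None asks the pipeline for all class scores, not only the best one.
    return pipeline("text-classification", model=MODEL_ID, top_k=None)


def top_emotion(classifier, text):
    """Return the (label, score) pair with the highest score for one text."""
    # Passing a list yields one list of {"label", "score"} dicts per input text.
    scores = classifier([text])[0]
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]
```

Typical usage would be `classifier = load_classifier()` followed by `top_emotion(classifier, "I love this!")`; the first call downloads the model weights, so it needs network access.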
b) Run the emotion model on multiple examples and full datasets (e.g., .csv files) on Google Colab.
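For full datasets, one possible approach is to read the texts from a .csv file, classify them in batches, and write the predicted label back as a new column. The sketch below uses only the standard library for file handling. It assumes `classifier` is a callable that takes a list of strings and returns one list of `{"label", "score"}` dicts per text, as a Hugging Face text-classification pipeline created with `top_k=None` does. The function names, the `text` column name, and the `emotion` output column are hypothetical.

```python
import csv


def classify_texts(classifier, texts, batch_size=32):
    """Return the highest-scoring label for each text, processed in batches."""
    labels = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for scores in classifier(batch):
            labels.append(max(scores, key=lambda item: item["score"])["label"])
    return labels


def classify_csv(classifier, in_path, out_path, text_column="text", batch_size=32):
    """Read a CSV, add an 'emotion' column with the predicted label, and save it."""
    with open(in_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    labels = classify_texts(classifier, [row[text_column] for row in rows], batch_size)

    # Keep the original columns and append the prediction column at the end.
    fieldnames = (list(rows[0].keys()) if rows else [text_column]) + ["emotion"]
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for row, label in zip(rows, labels):
            writer.writerow({**row, "emotion": label})
    return labels
```

Batching manually keeps memory use bounded on large files; for very large datasets, streaming the rows instead of loading them all at once may be preferable.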
Contact 💻
Please reach out to jochen.hartmann@uni-hamburg.de if you have any questions or feedback.
Thanks to Samuel Domdey and chrsiebert for their support in making this model available.
Appendix 📚
Please find an overview of the datasets used for training below. All datasets contain English text. The table summarizes which emotions are available in each of the datasets.
Name | anger | disgust | fear | joy | neutral | sadness | surprise |
---|---|---|---|---|---|---|---|
Crowdflower (2016) | Yes | - | - | Yes | Yes | Yes | Yes |
Emotion Dataset, Elvis et al. (2018) | Yes | - | Yes | Yes | - | Yes | Yes |
GoEmotions, Demszky et al. (2020) | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
ISEAR, Vikash (2018) | Yes | Yes | Yes | Yes | - | Yes | - |
MELD, Poria et al. (2019) | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
SemEval-2018, EI-reg (Mohammad et al. 2018) | Yes | - | Yes | Yes | - | Yes | - |
The datasets represent a diverse collection of text types. Specifically, they contain emotion labels for texts from Twitter, Reddit, student self-reports, and utterances from TV dialogues. As MELD (Multimodal EmotionLines Dataset) extends the popular EmotionLines dataset, EmotionLines itself is not included here.