---
language: "en"
tags:
- distilroberta
- sentiment
- emotion
- twitter
- reddit

widget:
- text: "Oh wow. I didn't know that."
- text: "This movie always makes me cry.."
- text: "Oh Happy Day"

---
## Description ℹ

With this model, you can classify emotions in English text data. The model was trained on 6 diverse datasets (see Appendix) and predicts Ekman's 6 basic emotions, plus a neutral class:

1) anger 🤬
2) disgust 🤢
3) fear 😨
4) joy 😀
5) neutral 😐
6) sadness 😭
7) surprise 😲

The model is a fine-tuned checkpoint of DistilRoBERTa-base.
## Application 🚀

a) Run the emotion model on a single text example with 3 lines of code, using Hugging Face's pipeline command on Google Colab:

[Open in Colab](https://colab.research.google.com/github/j-hartmann/emotion-english-distilroberta-base/blob/main/simple_emotion_pipeline.ipynb)
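For readers who prefer to run the model locally, the single-text case can be sketched as below. This is a minimal sketch, not the notebook's exact code: it assumes the Hub model id matches the GitHub repository name in the notebook URL, that a recent `transformers` release is installed (for the `top_k=None` argument), and the helper names `load_classifier` and `top_emotion` are hypothetical.

```python
from typing import Dict, List

# Assumption: the Hub model id mirrors the repository name in the Colab links.
MODEL_ID = "j-hartmann/emotion-english-distilroberta-base"


def load_classifier():
    """Build a text-classification pipeline that returns scores for all 7 classes."""
    # Imported lazily so top_emotion() can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID, top_k=None)


def top_emotion(scores: List[Dict]) -> str:
    """Return the label with the highest score from one prediction."""
    return max(scores, key=lambda s: s["score"])["label"]


# Usage (downloads the model on first call):
#   classifier = load_classifier()
#   scores = classifier(["This movie always makes me cry.."])[0]
#   print(top_emotion(scores))
```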
b) Run the emotion model on multiple examples and full datasets (e.g., .csv files) on Google Colab:

[Open in Colab](https://colab.research.google.com/github/j-hartmann/emotion-english-distilroberta-base/blob/main/emotion_prediction_example.ipynb)
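The dataset case can be approximated outside Colab with a small batching loop. The sketch below is an illustration under stated assumptions rather than the notebook's implementation: the function name `predict_csv`, the `text` column name, and the output columns `emotion` and `score` are all hypothetical. The `classify` argument is any callable that maps a list of strings to a list of `{"label": ..., "score": ...}` dicts, which is the shape a Hugging Face text-classification pipeline returns by default for list input.

```python
import csv
from typing import Callable, Dict, List


def predict_csv(
    in_path: str,
    out_path: str,
    classify: Callable[[List[str]], List[Dict]],
    text_column: str = "text",  # hypothetical column name
    batch_size: int = 32,
) -> List[Dict]:
    """Classify one CSV column in batches and write the rows with label and score."""
    with open(in_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    # Send the texts to the classifier in fixed-size batches.
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        preds = classify([row[text_column] for row in batch])
        for row, pred in zip(batch, preds):
            row["emotion"] = pred["label"]
            row["score"] = pred["score"]

    # Keep the original columns and append the two prediction columns.
    fieldnames = list(rows[0].keys()) if rows else [text_column, "emotion", "score"]
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Passing a pipeline object as `classify` should work as long as it returns one top prediction per text; any other classifier with the same input and output shape can be swapped in.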
## Contact 💻

Please reach out to jochen.hartmann@uni-hamburg.de if you have any questions or feedback.

Thanks to Samuel Domdey and chrsiebert for their support in making this model available.
## Appendix 📚

Please find an overview of the datasets used for training below. All datasets contain English text. The table summarizes which emotions are available in each of the datasets.

|Name|anger|disgust|fear|joy|neutral|sadness|surprise|
|---|---|---|---|---|---|---|---|
|Crowdflower (2016)|Yes|-|-|Yes|Yes|Yes|Yes|
|Emotion Dataset, Elvis et al. (2018)|Yes|-|Yes|Yes|-|Yes|Yes|
|GoEmotions, Demszky et al. (2020)|Yes|Yes|Yes|Yes|Yes|Yes|Yes|
|ISEAR, Vikash (2018)|Yes|Yes|Yes|Yes|-|Yes|-|
|MELD, Poria et al. (2019)|Yes|Yes|Yes|Yes|Yes|Yes|Yes|
|SemEval-2018, EI-reg (Mohammad et al. 2018)|Yes|-|Yes|Yes|-|Yes|-|

The datasets represent a diverse collection of text types. Specifically, they contain emotion labels for texts from Twitter, Reddit, student self-reports, and utterances from TV dialogues. As MELD (Multimodal EmotionLines Dataset) extends the popular EmotionLines dataset, EmotionLines itself is not included here.