Compare commits

11 Commits

Author SHA1 Message Date
JB Polle 8f3abc1ef8 Adding ONNX file of this model (#6)
- Adding ONNX file of this model (a49525de74e3567169157cb8ff78fa95ad144823)


Co-authored-by: Ali Cenk Baytop <alicenkbaytop@users.noreply.huggingface.co>
2023-03-22 02:19:36 +00:00
JB Polle 1ba9d1ea33 Adding `safetensors` variant of this model (#5)
- Adding `safetensors` variant of this model (dfdf5400c640769c59ba34215c34a67778208cbf)


Co-authored-by: Safetensors convertbot <SFconvertbot@users.noreply.huggingface.co>
2023-03-17 03:34:09 +00:00
JB Polle bf7c40ef40 Update README.md 2022-10-12 23:16:21 +00:00
JB Polle c272484a77 Update README.md 2022-07-04 15:02:50 +00:00
Jean-Baptiste 37688147fa Add TF weights (#1)
- Add TF weights (e86cccf1381a1c6be82a0e4d5492f23d179a4b6d)


Co-authored-by: Joao Gante <joaogante@users.noreply.huggingface.co>
2022-06-10 13:15:48 +00:00
JB Polle 5a2a7df547 Update README.md 2022-04-04 01:12:34 +00:00
JB Polle 0456c35359 Update README.md 2022-04-04 01:09:51 +00:00
JB Polle 4c2bd0cda5 Update README.md 2022-01-16 19:13:27 +00:00
jeanpoll 94ebad4491 Merge branch 'main' of https://huggingface.co/Jean-Baptiste/roberta-large-ner-english 2022-01-05 22:53:02 -05:00
jeanpoll 983b42225e Update of the model with similar parameters 2022-01-05 22:41:00 -05:00
JB Polle 9dfa5850cd Update README.md 2022-01-05 21:31:52 +00:00
8 changed files with 42 additions and 19 deletions

1
.gitattributes vendored

@ -25,3 +25,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
model.safetensors filter=lfs diff=lfs merge=lfs -text

README.md

@ -6,10 +6,20 @@ widget:
- text: "My name is jean-baptiste and I live in montreal"
- text: "My name is clara and I live in berkeley, california."
- text: "My name is wolfgang and I live in berlin"
train-eval-index:
- config: conll2003
task: token-classification
task_id: entity_extraction
splits:
eval_split: validation
col_mapping:
tokens: tokens
ner_tags: tags
license: mit
---
# roberta-large-ner: model fine-tuned from roberta-large for NER task
# roberta-large-ner-english: model fine-tuned from roberta-large for NER task
## Introduction
@ -37,15 +47,15 @@ Train | Validation
-|-
17494 | 3250
## How to use camembert-ner with HuggingFace
## How to use roberta-large-ner-english with HuggingFace
##### Load camembert-ner and its sub-word tokenizer :
##### Load roberta-large-ner-english and its sub-word tokenizer :
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("Jean-Baptiste/roberta-large-ner")
model = AutoModelForTokenClassification.from_pretrained("Jean-Baptiste/roberta-large-ner")
tokenizer = AutoTokenizer.from_pretrained("Jean-Baptiste/roberta-large-ner-english")
model = AutoModelForTokenClassification.from_pretrained("Jean-Baptiste/roberta-large-ner-english")
##### Process text sample (from wikipedia)
@ -119,3 +129,6 @@ ORG|0.7655|0.6437|0.6993
LOC|0.8727|0.6180|0.7236
For those who could be interested, here is a short article on how I used the results of this model to train a LSTM model for signature detection in emails:
https://medium.com/@jean-baptiste.polle/lstm-model-for-email-signature-detection-8e990384fefa

config.json

@@ -12,19 +12,19 @@
   "hidden_size": 1024,
   "id2label": {
     "0": "O",
-    "1": "LOC",
-    "2": "PER",
-    "3": "MISC",
-    "4": "ORG"
+    "1": "PER",
+    "2": "ORG",
+    "3": "LOC",
+    "4": "MISC"
   },
   "initializer_range": 0.02,
   "intermediate_size": 4096,
   "label2id": {
-    "LOC": 1,
-    "MISC": 3,
+    "LOC": 3,
+    "MISC": 4,
     "O": 0,
-    "ORG": 4,
-    "PER": 2
+    "ORG": 2,
+    "PER": 1
   },
   "layer_norm_eps": 1e-05,
   "max_position_embeddings": 514,
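
The hunk above reorders the class ids: only `O` keeps its index, so any class ids stored under the old mapping decode to the wrong entity types under the new one. The following is a minimal sketch of that effect and of one way to translate old ids. The two dictionaries are copied from the two sides of the hunk; the helper `remap_ids` and the sample `old_ids` are hypothetical names introduced here for illustration, not part of the repository.

```python
# Mappings copied from the old and new sides of the id2label hunk.
OLD_ID2LABEL = {0: "O", 1: "LOC", 2: "PER", 3: "MISC", 4: "ORG"}
NEW_ID2LABEL = {0: "O", 1: "PER", 2: "ORG", 3: "LOC", 4: "MISC"}


def remap_ids(old_ids):
    """Translate class ids produced under the old mapping into new ids.

    Hypothetical helper: it goes old id -> label name -> new id.
    """
    new_label2id = {label: idx for idx, label in NEW_ID2LABEL.items()}
    return [new_label2id[OLD_ID2LABEL[i]] for i in old_ids]


# Hypothetical stored predictions under the old mapping: O, PER, O, LOC.
old_ids = [0, 2, 0, 1]

# Decoding the old ids directly with the new mapping gives wrong labels.
naive = [NEW_ID2LABEL[i] for i in old_ids]

# Remapping first preserves the original label names.
new_ids = remap_ids(old_ids)
fixed = [NEW_ID2LABEL[i] for i in new_ids]

print("naive:", naive)
print("fixed:", fixed)
```

Going through the label name rather than hard-coding an id permutation keeps the sketch valid if the mapping changes again.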

BIN
model.onnx (Stored with Git LFS) Normal file

Binary file not shown.

BIN
model.safetensors (Stored with Git LFS) Normal file

Binary file not shown.

BIN
pytorch_model.bin (Stored with Git LFS)

Binary file not shown.


@@ -1,6 +1,6 @@
 ,precision,recall,f1,entity
-0,0.9795249795249795,0.9862561847168774,0.9828790576633339,LOC
-1,0.9914318668643928,0.9927404718693285,0.9920857378400659,PER
-2,0.9292274446245273,0.9262250942380184,0.9277238403451995,MISC
-3,0.9627007895453308,0.966120218579235,0.9644074730669576,ORG
-4,0.9740825890497252,0.9766692954784437,0.9753719894698967,Overall
+0,0.9904511030622325,0.9925754825936314,0.9915121549237741,PER
+1,0.9628323385784048,0.969672131147541,0.966240130683365,ORG
+2,0.974924221548636,0.9725123694337549,0.9737168019815605,LOC
+3,0.9308278867102396,0.9203015616585891,0.925534795559166,MISC
+4,0.9728188879121981,0.9734490010515248,0.9731265700746845,Overall
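
Because the rows in the metrics file change order between the two versions, comparing them by row index would pair different entity types. The sketch below joins the old and new rows on the `entity` column and reports the change in F1. The CSV text is copied from the hunk above; the names `OLD_CSV`, `NEW_CSV`, and `f1_by_entity` are illustrative assumptions, and the file's real name is not shown in this comparison.

```python
import csv
import io

# Old and new file contents, copied from the two sides of the hunk.
OLD_CSV = """,precision,recall,f1,entity
0,0.9795249795249795,0.9862561847168774,0.9828790576633339,LOC
1,0.9914318668643928,0.9927404718693285,0.9920857378400659,PER
2,0.9292274446245273,0.9262250942380184,0.9277238403451995,MISC
3,0.9627007895453308,0.966120218579235,0.9644074730669576,ORG
4,0.9740825890497252,0.9766692954784437,0.9753719894698967,Overall
"""

NEW_CSV = """,precision,recall,f1,entity
0,0.9904511030622325,0.9925754825936314,0.9915121549237741,PER
1,0.9628323385784048,0.969672131147541,0.966240130683365,ORG
2,0.974924221548636,0.9725123694337549,0.9737168019815605,LOC
3,0.9308278867102396,0.9203015616585891,0.925534795559166,MISC
4,0.9728188879121981,0.9734490010515248,0.9731265700746845,Overall
"""


def f1_by_entity(text):
    """Parse the CSV text and return a dict mapping entity name to F1."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["entity"]: float(row["f1"]) for row in reader}


old_f1 = f1_by_entity(OLD_CSV)
new_f1 = f1_by_entity(NEW_CSV)

# Join on the entity name, not on the row index.
delta = {entity: new_f1[entity] - old_f1[entity] for entity in old_f1}

for entity, change in sorted(delta.items()):
    print(f"{entity:8s} {old_f1[entity]:.4f} -> {new_f1[entity]:.4f} ({change:+.4f})")
```

Joined this way, the reported F1 rises slightly for ORG and falls slightly for the other entity types and for the overall score.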


BIN
tf_model.h5 (Stored with Git LFS) Normal file

Binary file not shown.