Adding ONNX file of this model (#6)

- Adding ONNX file of this model (a49525de74e3567169157cb8ff78fa95ad144823) Co-authored-by: Ali Cenk Baytop <alicenkbaytop@users.noreply.huggingface.co>
Adding `safetensors` variant of this model (#5)
8 changed files with 42 additions and 19 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -25,3 +25,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zstandard filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+model.safetensors filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -6,10 +6,20 @@ widget:
 - text: "My name is jean-baptiste and I live in montreal"
 - text: "My name is clara and I live in berkeley, california."
 - text: "My name is wolfgang and I live in berlin"
+train-eval-index:
+- config: conll2003
+  task: token-classification
+  task_id: entity_extraction
+  splits:
+    eval_split: validation
+  col_mapping:
+    tokens: tokens
+    ner_tags: tags
+license: mit

 ---

-# roberta-large-ner: model fine-tuned from roberta-large for NER task
+# roberta-large-ner-english: model fine-tuned from roberta-large for NER task

 ## Introduction

@@ -37,15 +47,15 @@ Train | Validation
 -|-
 17494 | 3250

-## How to use camembert-ner with HuggingFace
+## How to use roberta-large-ner-english with HuggingFace

-##### Load camembert-ner and its sub-word tokenizer :
+##### Load roberta-large-ner-english and its sub-word tokenizer :

 ```python
 from transformers import AutoTokenizer, AutoModelForTokenClassification

-tokenizer = AutoTokenizer.from_pretrained("Jean-Baptiste/roberta-large-ner")
-model = AutoModelForTokenClassification.from_pretrained("Jean-Baptiste/roberta-large-ner")
+tokenizer = AutoTokenizer.from_pretrained("Jean-Baptiste/roberta-large-ner-english")
+model = AutoModelForTokenClassification.from_pretrained("Jean-Baptiste/roberta-large-ner-english")


 ##### Process text sample (from wikipedia)
@@ -119,3 +129,6 @@ ORG|0.7655|0.6437|0.6993
 LOC|0.8727|0.6180|0.7236


+
+For those who could be interested, here is a short article on how I used the results of this model to train a LSTM model for signature detection in emails:
+https://medium.com/@jean-baptiste.polle/lstm-model-for-email-signature-detection-8e990384fefa
--- a/config.json
+++ b/config.json
@@ -12,19 +12,19 @@
  "hidden_size": 1024,
  "id2label": {
    "0": "O",
-    "1": "LOC",
-    "2": "PER",
-    "3": "MISC",
-    "4": "ORG"
+    "1": "PER",
+    "2": "ORG",
+    "3": "LOC",
+    "4": "MISC"
  },
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "label2id": {
-    "LOC": 1,
-    "MISC": 3,
+    "LOC": 3,
+    "MISC": 4,
    "O": 0,
-    "ORG": 4,
-    "PER": 2
+    "ORG": 2,
+    "PER": 1
  },
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
--- a/model.onnx
+++ b/model.onnx
--- a/model.safetensors
+++ b/model.safetensors
--- a/pytorch_model.bin
+++ b/pytorch_model.bin
--- a/results.csv
+++ b/results.csv
@@ -1,6 +1,6 @@
 ,precision,recall,f1,entity
-0,0.9795249795249795,0.9862561847168774,0.9828790576633339,LOC
-1,0.9914318668643928,0.9927404718693285,0.9920857378400659,PER
-2,0.9292274446245273,0.9262250942380184,0.9277238403451995,MISC
-3,0.9627007895453308,0.966120218579235,0.9644074730669576,ORG
-4,0.9740825890497252,0.9766692954784437,0.9753719894698967,Overall
+0,0.9904511030622325,0.9925754825936314,0.9915121549237741,PER
+1,0.9628323385784048,0.969672131147541,0.966240130683365,ORG
+2,0.974924221548636,0.9725123694337549,0.9737168019815605,LOC
+3,0.9308278867102396,0.9203015616585891,0.925534795559166,MISC
+4,0.9728188879121981,0.9734490010515248,0.9731265700746845,Overall
--- a/tf_model.h5
+++ b/tf_model.h5
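
Note on the `config.json` hunk: the integer ids assigned to `LOC`, `PER`, `MISC` and `ORG` change between the two revisions, so any predictions stored as raw ids under the old mapping would decode to the wrong labels under the new one. The following is a minimal sketch, not part of the repository, that copies the two label maps from the hunk, checks that the new `id2label` and `label2id` are inverses of each other, and converts old ids to new ids. The helper names `is_inverse` and `remap_ids` are hypothetical.

```python
# Label maps copied from the config.json hunk in this diff.
OLD_ID2LABEL = {0: "O", 1: "LOC", 2: "PER", 3: "MISC", 4: "ORG"}
NEW_ID2LABEL = {0: "O", 1: "PER", 2: "ORG", 3: "LOC", 4: "MISC"}
NEW_LABEL2ID = {"LOC": 3, "MISC": 4, "O": 0, "ORG": 2, "PER": 1}


def is_inverse(id2label, label2id):
    """Return True if label2id maps every label back to its id (hypothetical helper)."""
    if len(id2label) != len(label2id):
        return False
    return all(label2id.get(label) == idx for idx, label in id2label.items())


def remap_ids(old_ids, old_id2label, new_label2id):
    """Translate ids stored under the old mapping into ids of the new mapping."""
    return [new_label2id[old_id2label[i]] for i in old_ids]


if __name__ == "__main__":
    print(is_inverse(NEW_ID2LABEL, NEW_LABEL2ID))
    print(remap_ids([0, 1, 2, 3, 4], OLD_ID2LABEL, NEW_LABEL2ID))
```

Code that decodes predictions through `model.config.id2label` at inference time does not need this; the remapping only matters where ids were persisted or hard-coded against the earlier revision.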
Author	SHA1	Message	Date
JB Polle	8f3abc1ef8	Adding ONNX file of this model (#6) - Adding ONNX file of this model (a49525de74e3567169157cb8ff78fa95ad144823) Co-authored-by: Ali Cenk Baytop <alicenkbaytop@users.noreply.huggingface.co>	2023-03-22 02:19:36 +00:00
JB Polle	1ba9d1ea33	Adding `safetensors` variant of this model (#5) - Adding `safetensors` variant of this model (dfdf5400c640769c59ba34215c34a67778208cbf) Co-authored-by: Safetensors convertbot <SFconvertbot@users.noreply.huggingface.co>	2023-03-17 03:34:09 +00:00
JB Polle	bf7c40ef40	Update README.md	2022-10-12 23:16:21 +00:00
JB Polle	c272484a77	Update README.md	2022-07-04 15:02:50 +00:00
Jean-Baptiste	37688147fa	Add TF weights (#1) - Add TF weights (e86cccf1381a1c6be82a0e4d5492f23d179a4b6d) Co-authored-by: Joao Gante <joaogante@users.noreply.huggingface.co>	2022-06-10 13:15:48 +00:00
JB Polle	5a2a7df547	Update README.md	2022-04-04 01:12:34 +00:00
JB Polle	0456c35359	Update README.md	2022-04-04 01:09:51 +00:00
JB Polle	4c2bd0cda5	Update README.md	2022-01-16 19:13:27 +00:00
jeanpoll	94ebad4491	Merge branch 'main' of https://huggingface.co/Jean-Baptiste/roberta-large-ner-english	2022-01-05 22:53:02 -05:00
jeanpoll	983b42225e	Update of the model with similar parameters	2022-01-05 22:41:00 -05:00
JB Polle	9dfa5850cd	Update README.md	2022-01-05 21:31:52 +00:00
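
Note on the `results.csv` hunk in the diff above: the new per-entity rows can be sanity-checked by recomputing the `f1` column from `precision` and `recall`. This sketch assumes `f1` is the usual harmonic mean, 2PR / (P + R); the source does not say how the columns were produced. The `Overall` row is left out because it is an aggregate whose computation is not described. The function names are hypothetical.

```python
import csv
import io

# Per-entity rows copied from the "+" side of the results.csv hunk.
RESULTS_CSV = """,precision,recall,f1,entity
0,0.9904511030622325,0.9925754825936314,0.9915121549237741,PER
1,0.9628323385784048,0.969672131147541,0.966240130683365,ORG
2,0.974924221548636,0.9725123694337549,0.9737168019815605,LOC
3,0.9308278867102396,0.9203015616585891,0.925534795559166,MISC
"""


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def max_f1_gap(csv_text):
    """Largest absolute gap between the reported f1 and the recomputed one."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return max(
        abs(f1_score(float(r["precision"]), float(r["recall"])) - float(r["f1"]))
        for r in rows
    )


if __name__ == "__main__":
    print(f"largest gap: {max_f1_gap(RESULTS_CSV):.2e}")
```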