Update README.md
parent 4fdb3ab1c8
commit 2ef9d8fee5
README.md (22 lines changed)
@@ -83,7 +83,7 @@ class KeyphraseExtractionPipeline(TokenClassificationPipeline):
 
 ```python
 # Load pipeline
-model_name = "DeDeckerThomas/keyphrase-extraction-distilbert-inspec"
+model_name = "ml6team/keyphrase-extraction-distilbert-inspec"
 extractor = KeyphraseExtractionPipeline(model=model_name)
 ```
 ```python
@@ -91,10 +91,11 @@ extractor = KeyphraseExtractionPipeline(model=model_name)
 text = """
 Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a text.
 Since this is a time-consuming process, Artificial Intelligence is used to automate it.
-Currently, classical machine learning methods, that use statistics and linguistics, are widely used for the extraction process.
-The fact that these methods have been widely used in the community has the advantage that there are many easy-to-use libraries.
-Now with the recent innovations in deep learning methods (such as recurrent neural networks and transformers, GANS, …),
-keyphrase extraction can be improved. These new methods also focus on the semantics and context of a document, which is quite an improvement.
+Currently, classical machine learning methods, that use statistics and linguistics,
+are widely used for the extraction process. The fact that these methods have been widely used in the community
+has the advantage that there are many easy-to-use libraries. Now with the recent innovations in NLP,
+transformers can be used to improve keyphrase extraction. Transformers also focus on the semantics
+and context of a document, which is quite an improvement.
 """.replace(
 "\n", ""
 )
@@ -106,10 +107,9 @@ print(keyphrases)
 
 ```
 # Output
-['Artificial Intelligence' 'GANS' 'Keyphrase extraction'
-'classical machine learning' 'deep learning methods'
-'keyphrase extraction' 'linguistics' 'recurrent neural networks'
-'semantics' 'statistics' 'text analysis' 'transformers']
+['artificial intelligence', 'classical machine learning methods',
+'keyphrase extraction', 'linguistics', 'statistics',
+'text analysis']
 ```
 
 ## 📚 Training Dataset
@@ -172,7 +172,7 @@ def preprocess_fuction(all_samples_per_split):
 ```
 
 ### Postprocessing
-For the post-processing, you will need to filter out the B and I labeled tokens and concat the consecutive B and Is. As last you strip the keyphrase to ensure all spaces are removed.
+For the post-processing, you will need to filter out the B and I labeled tokens and concat the consecutive Bs and Is. As last you strip the keyphrase to ensure all spaces are removed.
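Note on the post-processing step: this diff only shows the signature of `concat_tokens_by_tag`, not its body. As a rough illustration of the idea described in the paragraph (keep the B and I tokens, join each B with the Is that directly follow it, then strip), here is a minimal sketch. The function name `merge_bio_tokens`, the `(token, tag)` input shape, and the plain `"B"`/`"I"`/`"O"` tag strings are assumptions made for the example and are not taken from the README.

```python
from typing import Iterable, List, Tuple


def merge_bio_tokens(tagged_tokens: Iterable[Tuple[str, str]]) -> List[str]:
    """Join each B token with the I tokens that directly follow it.

    Hypothetical helper: assumes word-level tokens paired with "B", "I" or "O".
    """
    keyphrases: List[str] = []
    current: List[str] = []
    for token, tag in tagged_tokens:
        if tag == "B":
            # A new keyphrase starts; flush the previous one first.
            if current:
                keyphrases.append(" ".join(current).strip())
            current = [token]
        elif tag == "I" and current:
            # Continue the keyphrase that is currently open.
            current.append(token)
        else:
            # "O" (or a stray "I" with no open keyphrase) closes the phrase.
            if current:
                keyphrases.append(" ".join(current).strip())
            current = []
    if current:
        keyphrases.append(" ".join(current).strip())
    return keyphrases


example = [
    ("Keyphrase", "B"), ("extraction", "I"), ("is", "O"),
    ("a", "O"), ("technique", "O"), ("in", "O"),
    ("text", "B"), ("analysis", "I"), (".", "O"),
]
print(merge_bio_tokens(example))  # ['Keyphrase extraction', 'text analysis']
```

Joining with a single space only makes sense for word-level tokens. If the tagger emits subword pieces, the pieces would have to be merged back into words before this step.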
 ```python
 # Define post_process functions
 def concat_tokens_by_tag(keyphrases):
@@ -216,4 +216,4 @@ The model achieves the following results on the Inspec test set:
 For more information on the evaluation process, you can take a look at the keyphrase extraction evaluation notebook.
 
 ## 🚨 Issues
-Please feel free to contact Thomas De Decker for any problems with this model.
+Please feel free to start discussions in the Community Tab.