Update README.md
parent 4fdb3ab1c8
commit 2ef9d8fee5

README.md (24 lines changed)
@@ -83,18 +83,19 @@ class KeyphraseExtractionPipeline(TokenClassificationPipeline):
 ```python
 # Load pipeline
-model_name = "DeDeckerThomas/keyphrase-extraction-distilbert-inspec"
+model_name = "ml6team/keyphrase-extraction-distilbert-inspec"
 extractor = KeyphraseExtractionPipeline(model=model_name)
 ```
 ```python
 # Inference
 text = """
 Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a text.
-Since this is a time-consuming process, Artificial Intelligence is used to automate it.
-Currently, classical machine learning methods, that use statistics and linguistics, are widely used for the extraction process.
-The fact that these methods have been widely used in the community has the advantage that there are many easy-to-use libraries.
-Now with the recent innovations in deep learning methods (such as recurrent neural networks and transformers, GANS, …),
-keyphrase extraction can be improved. These new methods also focus on the semantics and context of a document, which is quite an improvement.
+Since this is a time-consuming process, Artificial Intelligence is used to automate it.
+Currently, classical machine learning methods, that use statistics and linguistics,
+are widely used for the extraction process. The fact that these methods have been widely used in the community
+has the advantage that there are many easy-to-use libraries. Now with the recent innovations in NLP,
+transformers can be used to improve keyphrase extraction. Transformers also focus on the semantics
+and context of a document, which is quite an improvement.
 """.replace(
 "\n", ""
 )
 
@@ -106,10 +107,9 @@ print(keyphrases)
 ```
 # Output
-['Artificial Intelligence' 'GANS' 'Keyphrase extraction'
-'classical machine learning' 'deep learning methods'
-'keyphrase extraction' 'linguistics' 'recurrent neural networks'
-'semantics' 'statistics' 'text analysis' 'transformers']
+['artificial intelligence', 'classical machine learning methods',
+'keyphrase extraction', 'linguistics', 'statistics',
+'text analysis']
 ```
 
 ## 📚 Training Dataset
 
@@ -172,7 +172,7 @@ def preprocess_fuction(all_samples_per_split):
 ```
 
 ### Postprocessing
-For the post-processing, you will need to filter out the B and I labeled tokens and concat the consecutive B and Is. As last you strip the keyphrase to ensure all spaces are removed.
+For the post-processing, you will need to filter out the B and I labeled tokens and concat the consecutive Bs and Is. As last you strip the keyphrase to ensure all spaces are removed.
 ```python
 # Define post_process functions
 def concat_tokens_by_tag(keyphrases):
@@ -216,4 +216,4 @@ The model achieves the following results on the Inspec test set:
 For more information on the evaluation process, you can take a look at the keyphrase extraction evaluation notebook.
 
 ## 🚨 Issues
-Please feel free to contact Thomas De Decker for any problems with this model.
+Please feel free to start discussions in the Community Tab.
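
Note on the Postprocessing hunk: the step it describes (keep the B and I labeled tokens, join each run of consecutive Bs and Is, then strip the result) can be sketched as below. This is only an illustration, not the README's `concat_tokens_by_tag`, whose body is cut off in this diff. The function name `group_bio_tokens`, the bare `"B"`/`"I"`/`"O"` tag strings, and the sample tokens are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def group_bio_tokens(tagged_tokens: Iterable[Tuple[str, str]]) -> List[str]:
    """Join consecutive B/I tokens into keyphrases (hypothetical helper).

    Assumes each item is a (token, tag) pair and that tags are the plain
    strings "B", "I" and "O"; a real model may use different label names.
    """
    keyphrases: List[str] = []
    current: List[str] = []

    def flush() -> None:
        # Close the keyphrase being built, if any, and strip outer whitespace.
        if current:
            keyphrases.append(" ".join(current).strip())
            current.clear()

    for token, tag in tagged_tokens:
        if tag == "B":
            # A B tag always starts a new keyphrase.
            flush()
            current.append(token)
        elif tag == "I" and current:
            # An I tag continues the keyphrase opened by the previous B.
            current.append(token)
        else:
            # An O tag (or an I with no preceding B) ends the current keyphrase.
            flush()
    flush()
    return keyphrases


# Sample input invented for this sketch.
example = [
    ("Keyphrase", "B"), ("extraction", "I"), ("is", "O"), ("a", "O"),
    ("technique", "O"), ("in", "O"), ("text", "B"), ("analysis", "I"),
]
print(group_bio_tokens(example))  # ['Keyphrase extraction', 'text analysis']
```

In this sketch an I tag that does not follow a B is dropped rather than promoted to the start of a keyphrase; that is a design choice of the example, not something the diff specifies.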