Update README.md

This commit is contained in:
xuming 2022-03-07 13:02:32 +00:00 committed by huggingface-web
parent 59e85aae8d
commit 01c2deec18
1 changed file with 39 additions and 3 deletions


@ -9,22 +9,36 @@ tags:
---
# shibing624/text2vec
This is a CoSENT (Cosine Sentence) model: it maps sentences to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search.
## Usage (text2vec)
Using this model becomes easy when you have [text2vec](https://github.com/shibing624/text2vec) installed:
```
pip install -U text2vec
```
Then you can use the model like this:
```python
from text2vec import SentenceModel

# two paraphrased Chinese sentences ("How do I change the bank card bound to Huabei?")
sentences = ['如何更换花呗绑定银行卡', '花呗更改绑定银行卡']
model = SentenceModel('shibing624/text2vec-base-chinese')
embeddings = model.encode(sentences)
print(embeddings)
```
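The two sentences above are paraphrases, so their embeddings should score close to 1 under cosine similarity. A minimal sketch of that scoring in plain NumPy, with toy vectors standing in for the real 768-dimensional embeddings returned by `model.encode`:

```python
import numpy as np

def cos_sim(a, b):
    # cosine similarity between two 1-D vectors: dot product over the
    # product of their Euclidean norms
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# with the real model you would call: cos_sim(embeddings[0], embeddings[1]);
# here two toy vectors stand in for the 768-dim sentence embeddings
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])
print(cos_sim(a, b))  # parallel vectors -> 1.0
```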
## Usage (HuggingFace Transformers)
Without [text2vec](https://github.com/shibing624/text2vec), you can use the model like this: first, pass your input through the transformer model, then apply the right pooling operation on top of the contextualized word embeddings.
Install transformers:
```
pip install transformers
```
Then load the model and predict:
```python
from transformers import BertTokenizer, BertModel
import torch
@ -50,6 +64,28 @@ sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask']
print("Sentence embeddings:")
print(sentence_embeddings)
```
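The hunk header above references `mean_pooling(model_output, encoded_input['attention_mask'])` but elides the helper's body. A minimal sketch of the conventional attention-mask-weighted mean pooling it describes, checked on toy tensors; this body is an assumption following common sentence-embedding practice, not the file's exact code:

```python
import torch

def mean_pooling(model_output, attention_mask):
    # average the token embeddings, weighting by the attention mask so that
    # padding tokens contribute nothing to the sentence embedding
    token_embeddings = model_output[0]  # first element holds all token embeddings
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

# toy check: batch of 1, three 2-dim token vectors, last token is padding
tokens = torch.tensor([[[1.0, 1.0], [3.0, 3.0], [9.0, 9.0]]])
mask = torch.tensor([[1, 1, 0]])
print(mean_pooling((tokens,), mask))  # padding ignored: mean of [1,3] -> [[2., 2.]]
```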
## Usage (sentence-transformers)
[sentence-transformers](https://github.com/UKPLab/sentence-transformers) is a popular library to compute dense vector representations for sentences.
Install sentence-transformers:
```
pip install -U sentence-transformers
```
Then load the model and predict:
```python
from sentence_transformers import SentenceTransformer
m = SentenceTransformer("shibing624/text2vec-base-chinese")
sentences = ['如何更换花呗绑定银行卡', '花呗更改绑定银行卡']
sentence_embeddings = m.encode(sentences)
print("Sentence embeddings:")
print(sentence_embeddings)
```
## Evaluation Results
For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [text2vec](https://github.com/shibing624/text2vec)