From 59577b379b6f549689d309cdda4e2a3a4548ac8f Mon Sep 17 00:00:00 2001
From: xuming
Date: Mon, 14 Mar 2022 06:41:46 +0000
Subject: [PATCH] Update README.md

---
 README.md | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index f35c0e0..6daeb50 100644
--- a/README.md
+++ b/README.md
@@ -7,8 +7,24 @@ tags:
 - sentence-similarity
 - transformers
 ---
-# shibing624/text2vec
-This is a CoSENT(Cosine Sentence) model: It maps sentences to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
+# shibing624/text2vec-base-chinese
+This is a CoSENT(Cosine Sentence) model: shibing624/text2vec-base-chinese.
+
+It maps sentences to a 768 dimensional dense vector space and can be used for tasks
+like sentence embeddings, text matching or semantic search.
+
+
+## Evaluation
+For an automated evaluation of this model, see the *Evaluation Benchmark*: [text2vec](https://github.com/shibing624/text2vec)
+
+- chinese text matching task:
+
+| Arch | Backbone | Model Name | ATEC | BQ | LCQMC | PAWSX | STS-B | Avg | QPS |
+| :-- | :--- | :---- | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
+| Word2Vec | word2vec | w2v-light-tencent-chinese | 20.00 | 31.49 | 59.46 | 2.57 | 55.78 | 33.86 | 10283 |
+| SBERT | xlm-roberta-base | paraphrase-multilingual-MiniLM-L12-v2 | 18.42 | 38.52 | 63.96 | 10.14 | 78.90 | 41.99 | 2371 |
+| CoSENT | hfl/chinese-macbert-base | text2vec-base-chinese | 31.93 | 42.67 | 70.16 | 17.21 | 79.30 | **48.25** | 2572 |
+
 
 ## Usage (text2vec)
 Using this model becomes easy when you have [text2vec](https://github.com/shibing624/text2vec) installed:
@@ -86,8 +102,6 @@ print("Sentence embeddings:")
 print(sentence_embeddings)
 ```
 
-## Evaluation Results
-For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [text2vec](https://github.com/shibing624/text2vec)
 
 ## Full Model Architecture
 ```