---
language: ja
thumbnail: https://github.com/rinnakk/japanese-pretrained-models/blob/master/rinna.png
license: apache-2.0
tags:
- feature-extraction
- ja
- japanese
- clip
---

![rinna-icon](./rinna.png)
This is a Japanese [CLIP (Contrastive Language-Image Pre-Training)](https://arxiv.org/abs/2103.00020) model trained by [rinna Co., Ltd.](https://corp.rinna.co.jp/).
Please see [japanese-clip](https://github.com/rinnakk/japanese-clip) for the other available models.
# How to use the model
1. Install the package
```shell
$ pip install git+https://github.com/rinnakk/japanese-clip.git
```
2. Run
```python
import io
import requests
from PIL import Image
import torch
import japanese_clip as ja_clip

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load this repository's model and the matching Japanese tokenizer.
model, preprocess = ja_clip.load("rinna/japanese-clip-vit-b-16", device=device)
tokenizer = ja_clip.load_tokenizer()

# Open an example image; use any dog photo so that the first label wins.
image_url = "https://example.com/dog.jpg"  # placeholder URL, replace with a real one
img = Image.open(io.BytesIO(requests.get(image_url).content))
image = preprocess(img).unsqueeze(0).to(device)

# Tokenize candidate Japanese labels: "dog", "cat", "elephant".
encodings = ja_clip.tokenize(
    texts=["犬", "猫", "象"],
    max_seq_len=77,
    device=device,
    tokenizer=tokenizer,
)

with torch.no_grad():
    image_features = model.get_image_features(image)
    text_features = model.get_text_features(**encodings)
    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", text_probs)  # prints: [[1.0, 0.0, 0.0]]
```
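Beyond zero-shot classification, the image and text encoders can be used directly for feature extraction, for example to rank a set of images against a Japanese text query. The sketch below is illustrative only: it reuses the `model`, `preprocess`, `tokenizer`, and `device` objects from the example above, and `image_paths` and the query string are hypothetical placeholders.
```python
# Illustrative sketch: text-to-image retrieval with the encoders loaded above.
# Assumes `model`, `preprocess`, `tokenizer`, and `device` from the previous example;
# `image_paths` and the query are hypothetical placeholders.
from PIL import Image
import torch
import japanese_clip as ja_clip

image_paths = ["img0.jpg", "img1.jpg", "img2.jpg"]  # placeholder file names
images = torch.stack([preprocess(Image.open(p)) for p in image_paths]).to(device)

# "a beach at sunset" as an example query
query = ja_clip.tokenize(texts=["夕日の海辺"], device=device, tokenizer=tokenizer)

with torch.no_grad():
    image_features = model.get_image_features(images)
    text_features = model.get_text_features(**query)

# Normalize both sides, score by cosine similarity, and rank the images.
image_features = image_features / image_features.norm(dim=-1, keepdim=True)
text_features = text_features / text_features.norm(dim=-1, keepdim=True)
scores = (image_features @ text_features.T).squeeze(-1)
ranking = scores.argsort(descending=True)
print([image_paths[i] for i in ranking])
```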
# Model architecture
The model uses a ViT-B/16 Transformer architecture as an image encoder and a 12-layer BERT as a text encoder. The image encoder was initialized from the [AugReg `vit-base-patch16-224` model](https://github.com/google-research/vision_transformer).
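As a rough schematic of this dual-encoder layout (not rinna's actual implementation), the sketch below pairs a ViT-B/16-sized image tower with a 12-layer BERT text tower and projects both into a shared embedding space; the hidden sizes and the 512-dimensional projection are illustrative assumptions.
```python
# Schematic dual encoder illustrating the layout described above.
# NOT the actual rinna implementation; the projection size and any sizes
# beyond "ViT-B/16" and "12-layer BERT" are illustrative assumptions.
import torch
from torch import nn
from transformers import ViTConfig, ViTModel, BertConfig, BertModel

class DualEncoder(nn.Module):
    def __init__(self, embed_dim: int = 512):  # embed_dim is an assumption
        super().__init__()
        self.image_encoder = ViTModel(ViTConfig(
            image_size=224, patch_size=16, hidden_size=768,
            num_hidden_layers=12, num_attention_heads=12,
        ))
        self.text_encoder = BertModel(BertConfig(num_hidden_layers=12))
        self.image_proj = nn.Linear(768, embed_dim)
        self.text_proj = nn.Linear(self.text_encoder.config.hidden_size, embed_dim)

    def forward(self, pixel_values, input_ids, attention_mask):
        # Encode each modality, then project into the shared embedding space.
        img = self.image_encoder(pixel_values=pixel_values).pooler_output
        txt = self.text_encoder(input_ids=input_ids, attention_mask=attention_mask).pooler_output
        return self.image_proj(img), self.text_proj(txt)
```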
# Training
The model was trained on [CC12M](https://github.com/google-research-datasets/conceptual-12m) with the captions translated into Japanese.
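CLIP-style training, as described in the paper linked above, optimizes a symmetric contrastive (InfoNCE) loss over the image-caption pairs in each batch. A minimal sketch of that objective (not rinna's training code; the temperature value is an assumption):
```python
# Minimal sketch of the symmetric CLIP contrastive loss over a batch of
# image-caption pairs, following the CLIP paper; not rinna's training code.
import torch
import torch.nn.functional as F

def clip_loss(image_features, text_features, temperature: float = 0.07):
    # Normalize both towers and compute the pairwise similarity matrix.
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)
    logits = image_features @ text_features.T / temperature  # temperature is illustrative

    # The i-th image matches the i-th caption; all other pairs are negatives.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.T, targets)
    return (loss_i2t + loss_t2i) / 2
```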
# License
[The Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0)
