Compare commits

..

10 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| mkshing | 577833e503 | minor fix on config.json | 2022-07-19 14:46:31 +09:00 |
| mkshing | 2707159f64 | release v0.2.0 models | 2022-07-19 14:43:12 +09:00 |
| mkshing | 055792af34 | add image encoder's information | 2022-05-16 11:05:47 +09:00 |
| mkshing | 58c914a486 | minor fix on readme | 2022-05-11 11:29:57 +09:00 |
| mkshing | 05b501981a | minor fix on readme | 2022-05-11 11:29:22 +09:00 |
| mkshing | 357abcbe9d | minor fix on readme | 2022-05-11 11:26:08 +09:00 |
| mkshing | a7dfb0a85f | minor fix on readme | 2022-05-11 11:06:58 +09:00 |
| mkshing | da2e08c95d | update README.md | 2022-05-11 10:44:33 +09:00 |
| mkshing | f093721140 | update README.md | 2022-05-11 10:39:40 +09:00 |
| mkshing | 062b689ee1 | Fix README | 2022-05-10 18:15:58 +09:00 |
3 changed files with 34 additions and 2018 deletions

README.md

@@ -1,7 +1,9 @@
---
language: ja
thumbnail: https://github.com/rinnakk/japanese-pretrained-models/blob/master/rinna.png
license: apache-2.0
tags:
- feature-extraction
- ja
- japanese
- clip
@@ -12,16 +14,22 @@ tags:
![rinna-icon](./rinna.png)
-This repository provides a Japanese [CLIP (Contrastive Language-Image Pre-Training)](https://arxiv.org/abs/2103.00020) model. The model was trained by [rinna Co., Ltd.](https://corp.rinna.co.jp/)
+This is a Japanese [CLIP (Contrastive Language-Image Pre-Training)](https://arxiv.org/abs/2103.00020) model trained by [rinna Co., Ltd.](https://corp.rinna.co.jp/).
Please see [japanese-clip](https://github.com/rinnakk/japanese-clip) for the other available models.
# How to use the model
1. Install the package
```shell
$ pip install git+https://github.com/rinnakk/japanese-clip.git
```
2. Run
```python
import io
import requests
@@ -53,3 +61,13 @@ with torch.no_grad():
print("Label probs:", text_probs) # prints: [[1.0, 0.0, 0.0]]
```
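The diff elides the middle of that snippet. For context, here is a minimal end-to-end sketch of the usage it describes, assuming the `japanese-clip` package's `ja_clip.load`, `ja_clip.load_tokenizer`, and `ja_clip.tokenize` helpers and the `rinna/japanese-clip-vit-b-16` model name; the image URL is a placeholder, and the exact identifiers should be checked against the japanese-clip README rather than taken as verbatim content of this card:

```python
import io
import requests
from PIL import Image
import torch
import japanese_clip as ja_clip

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the model, its image preprocessor, and the Japanese tokenizer.
# Model name and helper functions are assumptions based on the
# japanese-clip project; consult its README for the exact API.
model, preprocess = ja_clip.load("rinna/japanese-clip-vit-b-16", device=device)
tokenizer = ja_clip.load_tokenizer()

# Fetch an example image (placeholder URL) and make a batch of one.
img = Image.open(io.BytesIO(requests.get("https://example.com/dog.jpg").content))
image = preprocess(img).unsqueeze(0).to(device)

# Tokenize candidate Japanese labels: "dog", "cat", "elephant".
encodings = ja_clip.tokenize(
    texts=["犬", "猫", "象"],
    max_seq_len=77,
    device=device,
    tokenizer=tokenizer,
)

with torch.no_grad():
    image_features = model.get_image_features(image)
    text_features = model.get_text_features(**encodings)
    # Scaled similarity logits over the labels, softmaxed to probabilities.
    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", text_probs)  # e.g. [[1.0, 0.0, 0.0]] for a dog photo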
# Model architecture
The model uses a ViT-B/16 Transformer architecture as its image encoder and a 12-layer BERT as its text encoder. The image encoder was initialized from the [AugReg `vit-base-patch16-224` model](https://github.com/google-research/vision_transformer).
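For intuition, ViT-B/16 splits a 224x224 input into 16x16 patches (14 x 14 = 196 patch tokens), and both encoders project into a shared embedding space. A hypothetical sanity check, reusing the assumed `ja_clip` API from above (the input resolution and the existence of a shared projection are standard for CLIP-style models but are not stated in this card):

```python
import torch
import japanese_clip as ja_clip

model, preprocess = ja_clip.load("rinna/japanese-clip-vit-b-16", device="cpu")

with torch.no_grad():
    # ViT-B/16: 224x224 input, 16x16 patches -> (224 / 16) ** 2 = 196 tokens.
    dummy_image = torch.zeros(1, 3, 224, 224)
    image_emb = model.get_image_features(dummy_image)

# Expected: (1, d), where d is the shared image-text embedding dimension.
print(image_emb.shape)
```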
# Training
The model was trained on [CC12M](https://github.com/google-research-datasets/conceptual-12m) with its captions translated to Japanese.
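As background on the CLIP objective (from the CLIP paper's pseudocode, not specific to this card): training maximizes the similarity of matched image-caption pairs in a batch while minimizing it for all mismatched pairs, via a symmetric cross-entropy loss. A minimal sketch:

```python
import torch
import torch.nn.functional as F

def clip_loss(image_features, text_features, temperature=0.07):
    """Symmetric InfoNCE loss as described in the CLIP paper (Fig. 3)."""
    # L2-normalize so the dot product equals cosine similarity.
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)
    logits = image_features @ text_features.T / temperature  # (N, N)
    # Pair i-th image with i-th caption in both directions.
    labels = torch.arange(logits.size(0), device=logits.device)
    loss_i = F.cross_entropy(logits, labels)
    loss_t = F.cross_entropy(logits.T, labels)
    return (loss_i + loss_t) / 2
```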
# License
[The Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0)

File diff suppressed because it is too large

pytorch_model.bin (stored with Git LFS; binary file not shown)