diff --git a/README.md b/README.md
index e223b1c..5940137 100644
--- a/README.md
+++ b/README.md
@@ -25,6 +25,18 @@ Disclaimer: The model card adapts the model card from [here](https://huggingface
 ## Model Details
 
+UPDATE (10/03/22): We have updated the model! We found that the [laion/CLIP-ViT-B-32-laion2B-s34B-b79K](https://huggingface.co/laion/CLIP-ViT-B-32-laion2B-s34B-b79K) checkpoint performed better than the original OpenAI CLIP on fashion. We have therefore fine-tuned a newer (and better!) version of FashionCLIP (henceforth FashionCLIP 2.0), keeping the architecture the same. We postulate that the performance gains afforded by `laion/CLIP-ViT-B-32-laion2B-s34B-b79K` are due to its increased training data (5x the OpenAI CLIP data). Our [thesis](https://www.nature.com/articles/s41598-022-23052-9), however, remains the same -- fine-tuning `laion/CLIP` on our fashion dataset improved zero-shot performance across our benchmarks. See the table below comparing weighted macro F1 scores across models.
+
+
+| Model | FMNIST | KAGL | DEEP |
+| ------------- | ------------- | ------------- | ------------- |
+| OpenAI CLIP | 0.66 | 0.63 | 0.45 |
+| FashionCLIP | 0.74 | 0.67 | 0.48 |
+| Laion CLIP | 0.78 | 0.71 | 0.58 |
+| FashionCLIP 2.0 | __0.83__ | __0.73__ | __0.62__ |
+
+---
+
 FashionCLIP is a CLIP-based model developed to produce general product representations for fashion concepts. Leveraging the pre-trained checkpoint (ViT-B/32) released by [OpenAI](https://github.com/openai/CLIP), we train FashionCLIP on a large, high-quality novel fashion dataset to study whether domain-specific fine-tuning of CLIP-like models is sufficient to produce product representations that are zero-shot transferable to entirely new datasets and tasks. FashionCLIP was not developed for model deployment -- to do so, researchers will first need to carefully study its capabilities in relation to the specific context it is deployed within.
 
 ### Model Date
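
To illustrate the zero-shot use case described above, here is a minimal sketch using the Hugging Face `transformers` CLIP API. The checkpoint identifier `patrickjohncyh/fashion-clip`, the image file `product.jpg`, and the candidate labels are assumptions for illustration only, not part of this model card.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Assumed checkpoint identifier -- substitute the actual FashionCLIP repo if it differs.
MODEL_ID = "patrickjohncyh/fashion-clip"

model = CLIPModel.from_pretrained(MODEL_ID)
processor = CLIPProcessor.from_pretrained(MODEL_ID)

# Any fashion product photo; "product.jpg" is a placeholder path.
image = Image.open("product.jpg")
labels = ["a red dress", "a pair of sneakers", "a leather handbag"]

# Encode the image and candidate labels jointly, then score them.
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Softmax over image-text similarity logits gives zero-shot label probabilities.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(labels, probs[0].tolist())))
```

The same `CLIPModel` outputs also expose the image and text embeddings, which can be used directly as product representations for retrieval or clustering.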