Update README.md

2023-02-23 18:16:31 +00:00 · b8ef58397e
parent 1c047d6fea
commit b8ef58397e
1 changed file with 43 additions and 40 deletions
--- a/README.md
+++ b/README.md
@@ -44,7 +44,7 @@ The model card has been written in combination by the Hugging Face team and Inte
 | Version | 1 | 
 | Type | Computer Vision - Monocular Depth Estimation | 
 | Paper or Other Resources | [Vision Transformers for Dense Prediction](https://arxiv.org/abs/2103.13413) and [GitHub Repo](https://github.com/isl-org/DPT) | 
-| License | [MIT](https://github.com/isl-org/DPT/blob/main/LICENSE) |
+| License | Apache 2.0 |
 | Questions or Comments | [Community Tab](https://huggingface.co/Intel/dpt-large/discussions) and [Intel Developers Discord](https://discord.gg/rv2Gp55UJQ)|

 | Intended Use | Description |
@@ -53,6 +53,48 @@ The model card has been written in combination by the Hugging Face team and Inte
 | Primary intended users | Anyone doing monocular depth estimation | 
 | Out-of-scope uses | This model in most cases will need to be fine-tuned for your particular task.  |

+
+### How to use
+
+Here is how to use this model for zero-shot depth estimation on an image:
+
+```python
+from transformers import DPTFeatureExtractor, DPTForDepthEstimation
+import torch
+import numpy as np
+from PIL import Image
+import requests
+
+url = "http://images.cocodataset.org/val2017/000000039769.jpg"
+image = Image.open(requests.get(url, stream=True).raw)
+
+feature_extractor = DPTFeatureExtractor.from_pretrained("Intel/dpt-large")
+model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")
+
+# prepare image for the model
+inputs = feature_extractor(images=image, return_tensors="pt")
+
+with torch.no_grad():
+    outputs = model(**inputs)
+    predicted_depth = outputs.predicted_depth
+
+# interpolate to original size
+prediction = torch.nn.functional.interpolate(
+    predicted_depth.unsqueeze(1),
+    size=image.size[::-1],
+    mode="bicubic",
+    align_corners=False,
+)
+
+# visualize the prediction
+output = prediction.squeeze().cpu().numpy()
+formatted = (output * 255 / np.max(output)).astype("uint8")
+depth = Image.fromarray(formatted)
+```
+
+For more code examples, we refer to the [documentation](https://huggingface.co/docs/transformers/master/en/model_doc/dpt).
+
+
 | Factors | Description | 
 | ----------- | ----------- | 
 | Groups | Multiple datasets compiled together | 
@@ -102,45 +144,6 @@ protocol defined in [30]. Relative performance is computed with respect to the o
 | There are no additional caveats or recommendations for this model. |


-### How to use
-
-Here is how to use this model for zero-shot depth estimation on an image:
-
-```python
-from transformers import DPTFeatureExtractor, DPTForDepthEstimation
-import torch
-import numpy as np
-from PIL import Image
-import requests
-
-url = "http://images.cocodataset.org/val2017/000000039769.jpg"
-image = Image.open(requests.get(url, stream=True).raw)
-
-feature_extractor = DPTFeatureExtractor.from_pretrained("Intel/dpt-large")
-model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")
-
-# prepare image for the model
-inputs = feature_extractor(images=image, return_tensors="pt")
-
-with torch.no_grad():
-    outputs = model(**inputs)
-    predicted_depth = outputs.predicted_depth
-
-# interpolate to original size
-prediction = torch.nn.functional.interpolate(
-    predicted_depth.unsqueeze(1),
-    size=image.size[::-1],
-    mode="bicubic",
-    align_corners=False,
-)
-
-# visualize the prediction
-output = prediction.squeeze().cpu().numpy()
-formatted = (output * 255 / np.max(output)).astype("uint8")
-depth = Image.fromarray(formatted)
-```
-
-For more code examples, we refer to the [documentation](https://huggingface.co/docs/transformers/master/en/model_doc/dpt).

 ### BibTeX entry and citation info