Update README.md

This commit is contained in:
Benjamin Consolvo 2023-02-23 18:16:31 +00:00 committed by huggingface-web
parent 1c047d6fea
commit b8ef58397e
1 changed files with 43 additions and 40 deletions

View File

@ -44,7 +44,7 @@ The model card has been written in combination by the Hugging Face team and Inte
| Version | 1 |
| Type | Computer Vision - Monocular Depth Estimation |
| Paper or Other Resources | [Vision Transformers for Dense Prediction](https://arxiv.org/abs/2103.13413) and [GitHub Repo](https://github.com/isl-org/DPT) |
| License | [MIT](https://github.com/isl-org/DPT/blob/main/LICENSE) |
| License | Apache 2.0 |
| Questions or Comments | [Community Tab](https://huggingface.co/Intel/dpt-large/discussions) and [Intel Developers Discord](https://discord.gg/rv2Gp55UJQ)|
| Intended Use | Description |
@ -53,6 +53,48 @@ The model card has been written in combination by the Hugging Face team and Inte
| Primary intended users | Anyone doing monocular depth estimation |
| Out-of-scope uses | This model in most cases will need to be fine-tuned for your particular task. |
### How to use
Here is how to use this model for zero-shot depth estimation on an image:
```python
from transformers import DPTFeatureExtractor, DPTForDepthEstimation
import torch
import numpy as np
from PIL import Image
import requests
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
feature_extractor = DPTFeatureExtractor.from_pretrained("Intel/dpt-large")
model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")
# prepare image for the model
inputs = feature_extractor(images=image, return_tensors="pt")
with torch.no_grad():
outputs = model(**inputs)
predicted_depth = outputs.predicted_depth
# interpolate to original size
prediction = torch.nn.functional.interpolate(
predicted_depth.unsqueeze(1),
size=image.size[::-1],
mode="bicubic",
align_corners=False,
)
# visualize the prediction
output = prediction.squeeze().cpu().numpy()
formatted = (output * 255 / np.max(output)).astype("uint8")
depth = Image.fromarray(formatted)
```
For more code examples, we refer to the [documentation](https://huggingface.co/docs/transformers/master/en/model_doc/dpt).
| Factors | Description |
| ----------- | ----------- |
| Groups | Multiple datasets compiled together |
@ -102,45 +144,6 @@ protocol defined in [30]. Relative performance is computed with respect to the o
| There are no additional caveats or recommendations for this model. |
### How to use
Here is how to use this model for zero-shot depth estimation on an image:
```python
from transformers import DPTFeatureExtractor, DPTForDepthEstimation
import torch
import numpy as np
from PIL import Image
import requests
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
feature_extractor = DPTFeatureExtractor.from_pretrained("Intel/dpt-large")
model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")
# prepare image for the model
inputs = feature_extractor(images=image, return_tensors="pt")
with torch.no_grad():
outputs = model(**inputs)
predicted_depth = outputs.predicted_depth
# interpolate to original size
prediction = torch.nn.functional.interpolate(
predicted_depth.unsqueeze(1),
size=image.size[::-1],
mode="bicubic",
align_corners=False,
)
# visualize the prediction
output = prediction.squeeze().cpu().numpy()
formatted = (output * 255 / np.max(output)).astype("uint8")
depth = Image.fromarray(formatted)
```
For more code examples, we refer to the [documentation](https://huggingface.co/docs/transformers/master/en/model_doc/dpt).
### BibTeX entry and citation info