Update README.md
parent 1c047d6fea
commit b8ef58397e

README.md: 83 lines changed
@@ -44,7 +44,7 @@ The model card has been written in combination by the Hugging Face team and Inte
 | Version | 1 |
 | Type | Computer Vision - Monocular Depth Estimation |
 | Paper or Other Resources | [Vision Transformers for Dense Prediction](https://arxiv.org/abs/2103.13413) and [GitHub Repo](https://github.com/isl-org/DPT) |
-| License | [MIT](https://github.com/isl-org/DPT/blob/main/LICENSE) |
+| License | Apache 2.0 |
 | Questions or Comments | [Community Tab](https://huggingface.co/Intel/dpt-large/discussions) and [Intel Developers Discord](https://discord.gg/rv2Gp55UJQ)|
 
 | Intended Use | Description |
@@ -53,6 +53,48 @@ The model card has been written in combination by the Hugging Face team and Inte
 | Primary intended users | Anyone doing monocular depth estimation |
 | Out-of-scope uses | This model in most cases will need to be fine-tuned for your particular task. |
 
+
+### How to use
+
+Here is how to use this model for zero-shot depth estimation on an image:
+
+```python
+from transformers import DPTFeatureExtractor, DPTForDepthEstimation
+import torch
+import numpy as np
+from PIL import Image
+import requests
+
+url = "http://images.cocodataset.org/val2017/000000039769.jpg"
+image = Image.open(requests.get(url, stream=True).raw)
+
+feature_extractor = DPTFeatureExtractor.from_pretrained("Intel/dpt-large")
+model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")
+
+# prepare image for the model
+inputs = feature_extractor(images=image, return_tensors="pt")
+
+with torch.no_grad():
+    outputs = model(**inputs)
+    predicted_depth = outputs.predicted_depth
+
+# interpolate to original size
+prediction = torch.nn.functional.interpolate(
+    predicted_depth.unsqueeze(1),
+    size=image.size[::-1],
+    mode="bicubic",
+    align_corners=False,
+)
+
+# visualize the prediction
+output = prediction.squeeze().cpu().numpy()
+formatted = (output * 255 / np.max(output)).astype("uint8")
+depth = Image.fromarray(formatted)
+```
+
+For more code examples, we refer to the [documentation](https://huggingface.co/docs/transformers/master/en/model_doc/dpt).
+
+
 | Factors | Description |
 | ----------- | ----------- |
 | Groups | Multiple datasets compiled together |
@@ -102,45 +144,6 @@ protocol defined in [30]. Relative performance is computed with respect to the o
 | There are no additional caveats or recommendations for this model. |
 
 
-### How to use
-
-Here is how to use this model for zero-shot depth estimation on an image:
-
-```python
-from transformers import DPTFeatureExtractor, DPTForDepthEstimation
-import torch
-import numpy as np
-from PIL import Image
-import requests
-
-url = "http://images.cocodataset.org/val2017/000000039769.jpg"
-image = Image.open(requests.get(url, stream=True).raw)
-
-feature_extractor = DPTFeatureExtractor.from_pretrained("Intel/dpt-large")
-model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")
-
-# prepare image for the model
-inputs = feature_extractor(images=image, return_tensors="pt")
-
-with torch.no_grad():
-    outputs = model(**inputs)
-    predicted_depth = outputs.predicted_depth
-
-# interpolate to original size
-prediction = torch.nn.functional.interpolate(
-    predicted_depth.unsqueeze(1),
-    size=image.size[::-1],
-    mode="bicubic",
-    align_corners=False,
-)
-
-# visualize the prediction
-output = prediction.squeeze().cpu().numpy()
-formatted = (output * 255 / np.max(output)).astype("uint8")
-depth = Image.fromarray(formatted)
-```
-
-For more code examples, we refer to the [documentation](https://huggingface.co/docs/transformers/master/en/model_doc/dpt).
 
 ### BibTeX entry and citation info
 
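
A note on the visualization step in the relocated snippet: it scales the depth map by its maximum only, so the smallest value does not necessarily map to 0, and a map whose maximum is 0 would divide by zero. As a minimal sketch, assuming a min-max rescaling is acceptable for display, the hypothetical helper `depth_to_uint8` below (not part of the README or of `transformers`) stretches any finite 2-D array to the full 0-255 range. The array `fake_depth` is a made-up stand-in for `prediction.squeeze().cpu().numpy()`.

```python
import numpy as np


def depth_to_uint8(depth: np.ndarray) -> np.ndarray:
    """Rescale a depth map to 0..255 for display (hypothetical helper)."""
    depth = np.asarray(depth, dtype=np.float64)
    lo, hi = depth.min(), depth.max()
    if hi == lo:
        # Constant map: nothing to stretch, so return all zeros.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - lo) / (hi - lo)  # values now lie in [0, 1]
    return np.round(scaled * 255).astype(np.uint8)


# Made-up stand-in for the model output used in the README snippet.
fake_depth = np.array([[2.0, 4.0], [6.0, 10.0]])
formatted = depth_to_uint8(fake_depth)
```

The resulting array can be passed to `Image.fromarray(formatted)` in the same way as in the README snippet. Whether min-max scaling is preferable to max-only scaling depends on how the depth image will be consumed, so treat this as one option rather than a correction.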