Compare commits

...

10 Commits

Author SHA1 Message Date
Niels Rogge 6010ab2712 Update README.md 2022-08-06 10:24:23 +00:00
Niels Rogge 994fb52b50 Update license 2022-08-06 09:53:29 +00:00
joaogante 677af011c3 Add TF weights (#1)
- Add TF weights (8092da502e9df2107dda87e91cacf35d4da8222c)
2022-07-20 09:52:37 +00:00
Niels Rogge f5533fed06 Create README.md 2022-02-21 20:07:03 +00:00
Niels Rogge 684c733388 Update preprocessor_config.json 2021-11-17 14:43:07 +00:00
Niels Rogge ad66f92b30 Set reduce_zero_label to True 2021-11-09 15:25:00 +00:00
Niels Rogge 17174e826e Set do_pad to False by default 2021-11-09 14:23:26 +00:00
Niels Rogge 57e4a51cad Update model name 2021-10-15 10:02:54 +00:00
Niels Rogge 77fa31aeb8 Update model 2021-10-15 08:42:13 +00:00
Niels Rogge 409aa62386 Add new weights as decode head is modularized 2021-07-28 15:11:16 +02:00
5 changed files with 389 additions and 311 deletions

README.md Normal file

@@ -0,0 +1,77 @@
---
license: other
tags:
- vision
- image-segmentation
datasets:
- scene_parse_150
widget:
- src: https://huggingface.co/datasets/hf-internal-testing/fixtures_ade20k/resolve/main/ADE_val_00000001.jpg
example_title: House
- src: https://huggingface.co/datasets/hf-internal-testing/fixtures_ade20k/resolve/main/ADE_val_00000002.jpg
example_title: Castle
---
# SegFormer (b0-sized) model fine-tuned on ADE20k
SegFormer model fine-tuned on ADE20k at resolution 512x512. It was introduced in the paper [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) by Xie et al. and first released in [this repository](https://github.com/NVlabs/SegFormer).
Disclaimer: The team releasing SegFormer did not write a model card for this model, so this model card has been written by the Hugging Face team.
## Model description
SegFormer consists of a hierarchical Transformer encoder and a lightweight all-MLP decode head, achieving strong results on semantic segmentation benchmarks such as ADE20K and Cityscapes. The hierarchical Transformer encoder is first pre-trained on ImageNet-1k, after which the decode head is added and the whole model is fine-tuned end-to-end on a downstream dataset.
## Intended uses & limitations
You can use the raw model for semantic segmentation. See the [model hub](https://huggingface.co/models?other=segformer) to look for fine-tuned versions on a task that interests you.
### How to use
Here is how to use this model to segment an image from the COCO 2017 dataset into one of the 150 ADE20k classes:
```python
from transformers import SegformerFeatureExtractor, SegformerForSemanticSegmentation
from PIL import Image
import requests

feature_extractor = SegformerFeatureExtractor.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
model = SegformerForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits  # shape (batch_size, num_labels, height/4, width/4)
```
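The logits come out at 1/4 of the input resolution, so they usually need to be upsampled before taking a per-pixel argmax. A minimal sketch of that post-processing step, using a dummy tensor in place of `outputs.logits` (the shapes assume a 512x512 input and this model's 150 ADE20k classes):

```python
import torch
import torch.nn.functional as F

# Dummy stand-in for `outputs.logits`: SegFormer predicts at 1/4 of the
# input resolution, so a 512x512 input yields (batch, 150, 128, 128).
logits = torch.randn(1, 150, 128, 128)

# Upsample back to the input resolution, then take the highest-scoring
# class per pixel to obtain the segmentation map.
upsampled = F.interpolate(logits, size=(512, 512), mode="bilinear", align_corners=False)
seg_map = upsampled.argmax(dim=1)[0]  # (512, 512), values in [0, 149]
```

Each value in `seg_map` indexes into the `id2label` mapping of the model config (e.g. 0 is "wall", 2 is "sky").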
For more code examples, we refer to the [documentation](https://huggingface.co/transformers/model_doc/segformer.html).
### License
The license for this model can be found [here](https://github.com/NVlabs/SegFormer/blob/master/LICENSE).
### BibTeX entry and citation info
```bibtex
@article{DBLP:journals/corr/abs-2105-15203,
author = {Enze Xie and
Wenhai Wang and
Zhiding Yu and
Anima Anandkumar and
Jose M. Alvarez and
Ping Luo},
title = {SegFormer: Simple and Efficient Design for Semantic Segmentation with
Transformers},
journal = {CoRR},
volume = {abs/2105.15203},
year = {2021},
url = {https://arxiv.org/abs/2105.15203},
eprinttype = {arXiv},
eprint = {2105.15203},
timestamp = {Wed, 02 Jun 2021 11:46:42 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2105-15203.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```

config.json

@@ -1,6 +1,6 @@
{
"architectures": [
"SegFormerForImageSegmentation"
"SegformerForSemanticSegmentation"
],
"attention_probs_dropout_prob": 0.0,
"classifier_dropout_prob": 0.1,
@@ -27,310 +27,310 @@
256
],
"id2label": {
"0": "LABEL_0",
"1": "LABEL_1",
"2": "LABEL_2",
"3": "LABEL_3",
"4": "LABEL_4",
"5": "LABEL_5",
"6": "LABEL_6",
"7": "LABEL_7",
"8": "LABEL_8",
"9": "LABEL_9",
"10": "LABEL_10",
"11": "LABEL_11",
"12": "LABEL_12",
"13": "LABEL_13",
"14": "LABEL_14",
"15": "LABEL_15",
"16": "LABEL_16",
"17": "LABEL_17",
"18": "LABEL_18",
"19": "LABEL_19",
"20": "LABEL_20",
"21": "LABEL_21",
"22": "LABEL_22",
"23": "LABEL_23",
"24": "LABEL_24",
"25": "LABEL_25",
"26": "LABEL_26",
"27": "LABEL_27",
"28": "LABEL_28",
"29": "LABEL_29",
"30": "LABEL_30",
"31": "LABEL_31",
"32": "LABEL_32",
"33": "LABEL_33",
"34": "LABEL_34",
"35": "LABEL_35",
"36": "LABEL_36",
"37": "LABEL_37",
"38": "LABEL_38",
"39": "LABEL_39",
"40": "LABEL_40",
"41": "LABEL_41",
"42": "LABEL_42",
"43": "LABEL_43",
"44": "LABEL_44",
"45": "LABEL_45",
"46": "LABEL_46",
"47": "LABEL_47",
"48": "LABEL_48",
"49": "LABEL_49",
"50": "LABEL_50",
"51": "LABEL_51",
"52": "LABEL_52",
"53": "LABEL_53",
"54": "LABEL_54",
"55": "LABEL_55",
"56": "LABEL_56",
"57": "LABEL_57",
"58": "LABEL_58",
"59": "LABEL_59",
"60": "LABEL_60",
"61": "LABEL_61",
"62": "LABEL_62",
"63": "LABEL_63",
"64": "LABEL_64",
"65": "LABEL_65",
"66": "LABEL_66",
"67": "LABEL_67",
"68": "LABEL_68",
"69": "LABEL_69",
"70": "LABEL_70",
"71": "LABEL_71",
"72": "LABEL_72",
"73": "LABEL_73",
"74": "LABEL_74",
"75": "LABEL_75",
"76": "LABEL_76",
"77": "LABEL_77",
"78": "LABEL_78",
"79": "LABEL_79",
"80": "LABEL_80",
"81": "LABEL_81",
"82": "LABEL_82",
"83": "LABEL_83",
"84": "LABEL_84",
"85": "LABEL_85",
"86": "LABEL_86",
"87": "LABEL_87",
"88": "LABEL_88",
"89": "LABEL_89",
"90": "LABEL_90",
"91": "LABEL_91",
"92": "LABEL_92",
"93": "LABEL_93",
"94": "LABEL_94",
"95": "LABEL_95",
"96": "LABEL_96",
"97": "LABEL_97",
"98": "LABEL_98",
"99": "LABEL_99",
"100": "LABEL_100",
"101": "LABEL_101",
"102": "LABEL_102",
"103": "LABEL_103",
"104": "LABEL_104",
"105": "LABEL_105",
"106": "LABEL_106",
"107": "LABEL_107",
"108": "LABEL_108",
"109": "LABEL_109",
"110": "LABEL_110",
"111": "LABEL_111",
"112": "LABEL_112",
"113": "LABEL_113",
"114": "LABEL_114",
"115": "LABEL_115",
"116": "LABEL_116",
"117": "LABEL_117",
"118": "LABEL_118",
"119": "LABEL_119",
"120": "LABEL_120",
"121": "LABEL_121",
"122": "LABEL_122",
"123": "LABEL_123",
"124": "LABEL_124",
"125": "LABEL_125",
"126": "LABEL_126",
"127": "LABEL_127",
"128": "LABEL_128",
"129": "LABEL_129",
"130": "LABEL_130",
"131": "LABEL_131",
"132": "LABEL_132",
"133": "LABEL_133",
"134": "LABEL_134",
"135": "LABEL_135",
"136": "LABEL_136",
"137": "LABEL_137",
"138": "LABEL_138",
"139": "LABEL_139",
"140": "LABEL_140",
"141": "LABEL_141",
"142": "LABEL_142",
"143": "LABEL_143",
"144": "LABEL_144",
"145": "LABEL_145",
"146": "LABEL_146",
"147": "LABEL_147",
"148": "LABEL_148",
"149": "LABEL_149"
"0": "wall",
"1": "building",
"2": "sky",
"3": "floor",
"4": "tree",
"5": "ceiling",
"6": "road",
"7": "bed ",
"8": "windowpane",
"9": "grass",
"10": "cabinet",
"11": "sidewalk",
"12": "person",
"13": "earth",
"14": "door",
"15": "table",
"16": "mountain",
"17": "plant",
"18": "curtain",
"19": "chair",
"20": "car",
"21": "water",
"22": "painting",
"23": "sofa",
"24": "shelf",
"25": "house",
"26": "sea",
"27": "mirror",
"28": "rug",
"29": "field",
"30": "armchair",
"31": "seat",
"32": "fence",
"33": "desk",
"34": "rock",
"35": "wardrobe",
"36": "lamp",
"37": "bathtub",
"38": "railing",
"39": "cushion",
"40": "base",
"41": "box",
"42": "column",
"43": "signboard",
"44": "chest of drawers",
"45": "counter",
"46": "sand",
"47": "sink",
"48": "skyscraper",
"49": "fireplace",
"50": "refrigerator",
"51": "grandstand",
"52": "path",
"53": "stairs",
"54": "runway",
"55": "case",
"56": "pool table",
"57": "pillow",
"58": "screen door",
"59": "stairway",
"60": "river",
"61": "bridge",
"62": "bookcase",
"63": "blind",
"64": "coffee table",
"65": "toilet",
"66": "flower",
"67": "book",
"68": "hill",
"69": "bench",
"70": "countertop",
"71": "stove",
"72": "palm",
"73": "kitchen island",
"74": "computer",
"75": "swivel chair",
"76": "boat",
"77": "bar",
"78": "arcade machine",
"79": "hovel",
"80": "bus",
"81": "towel",
"82": "light",
"83": "truck",
"84": "tower",
"85": "chandelier",
"86": "awning",
"87": "streetlight",
"88": "booth",
"89": "television receiver",
"90": "airplane",
"91": "dirt track",
"92": "apparel",
"93": "pole",
"94": "land",
"95": "bannister",
"96": "escalator",
"97": "ottoman",
"98": "bottle",
"99": "buffet",
"100": "poster",
"101": "stage",
"102": "van",
"103": "ship",
"104": "fountain",
"105": "conveyer belt",
"106": "canopy",
"107": "washer",
"108": "plaything",
"109": "swimming pool",
"110": "stool",
"111": "barrel",
"112": "basket",
"113": "waterfall",
"114": "tent",
"115": "bag",
"116": "minibike",
"117": "cradle",
"118": "oven",
"119": "ball",
"120": "food",
"121": "step",
"122": "tank",
"123": "trade name",
"124": "microwave",
"125": "pot",
"126": "animal",
"127": "bicycle",
"128": "lake",
"129": "dishwasher",
"130": "screen",
"131": "blanket",
"132": "sculpture",
"133": "hood",
"134": "sconce",
"135": "vase",
"136": "traffic light",
"137": "tray",
"138": "ashcan",
"139": "fan",
"140": "pier",
"141": "crt screen",
"142": "plate",
"143": "monitor",
"144": "bulletin board",
"145": "shower",
"146": "radiator",
"147": "glass",
"148": "clock",
"149": "flag"
},
"image_size": 224,
"initializer_range": 0.02,
"label2id": {
"LABEL_0": 0,
"LABEL_1": 1,
"LABEL_10": 10,
"LABEL_100": 100,
"LABEL_101": 101,
"LABEL_102": 102,
"LABEL_103": 103,
"LABEL_104": 104,
"LABEL_105": 105,
"LABEL_106": 106,
"LABEL_107": 107,
"LABEL_108": 108,
"LABEL_109": 109,
"LABEL_11": 11,
"LABEL_110": 110,
"LABEL_111": 111,
"LABEL_112": 112,
"LABEL_113": 113,
"LABEL_114": 114,
"LABEL_115": 115,
"LABEL_116": 116,
"LABEL_117": 117,
"LABEL_118": 118,
"LABEL_119": 119,
"LABEL_12": 12,
"LABEL_120": 120,
"LABEL_121": 121,
"LABEL_122": 122,
"LABEL_123": 123,
"LABEL_124": 124,
"LABEL_125": 125,
"LABEL_126": 126,
"LABEL_127": 127,
"LABEL_128": 128,
"LABEL_129": 129,
"LABEL_13": 13,
"LABEL_130": 130,
"LABEL_131": 131,
"LABEL_132": 132,
"LABEL_133": 133,
"LABEL_134": 134,
"LABEL_135": 135,
"LABEL_136": 136,
"LABEL_137": 137,
"LABEL_138": 138,
"LABEL_139": 139,
"LABEL_14": 14,
"LABEL_140": 140,
"LABEL_141": 141,
"LABEL_142": 142,
"LABEL_143": 143,
"LABEL_144": 144,
"LABEL_145": 145,
"LABEL_146": 146,
"LABEL_147": 147,
"LABEL_148": 148,
"LABEL_149": 149,
"LABEL_15": 15,
"LABEL_16": 16,
"LABEL_17": 17,
"LABEL_18": 18,
"LABEL_19": 19,
"LABEL_2": 2,
"LABEL_20": 20,
"LABEL_21": 21,
"LABEL_22": 22,
"LABEL_23": 23,
"LABEL_24": 24,
"LABEL_25": 25,
"LABEL_26": 26,
"LABEL_27": 27,
"LABEL_28": 28,
"LABEL_29": 29,
"LABEL_3": 3,
"LABEL_30": 30,
"LABEL_31": 31,
"LABEL_32": 32,
"LABEL_33": 33,
"LABEL_34": 34,
"LABEL_35": 35,
"LABEL_36": 36,
"LABEL_37": 37,
"LABEL_38": 38,
"LABEL_39": 39,
"LABEL_4": 4,
"LABEL_40": 40,
"LABEL_41": 41,
"LABEL_42": 42,
"LABEL_43": 43,
"LABEL_44": 44,
"LABEL_45": 45,
"LABEL_46": 46,
"LABEL_47": 47,
"LABEL_48": 48,
"LABEL_49": 49,
"LABEL_5": 5,
"LABEL_50": 50,
"LABEL_51": 51,
"LABEL_52": 52,
"LABEL_53": 53,
"LABEL_54": 54,
"LABEL_55": 55,
"LABEL_56": 56,
"LABEL_57": 57,
"LABEL_58": 58,
"LABEL_59": 59,
"LABEL_6": 6,
"LABEL_60": 60,
"LABEL_61": 61,
"LABEL_62": 62,
"LABEL_63": 63,
"LABEL_64": 64,
"LABEL_65": 65,
"LABEL_66": 66,
"LABEL_67": 67,
"LABEL_68": 68,
"LABEL_69": 69,
"LABEL_7": 7,
"LABEL_70": 70,
"LABEL_71": 71,
"LABEL_72": 72,
"LABEL_73": 73,
"LABEL_74": 74,
"LABEL_75": 75,
"LABEL_76": 76,
"LABEL_77": 77,
"LABEL_78": 78,
"LABEL_79": 79,
"LABEL_8": 8,
"LABEL_80": 80,
"LABEL_81": 81,
"LABEL_82": 82,
"LABEL_83": 83,
"LABEL_84": 84,
"LABEL_85": 85,
"LABEL_86": 86,
"LABEL_87": 87,
"LABEL_88": 88,
"LABEL_89": 89,
"LABEL_9": 9,
"LABEL_90": 90,
"LABEL_91": 91,
"LABEL_92": 92,
"LABEL_93": 93,
"LABEL_94": 94,
"LABEL_95": 95,
"LABEL_96": 96,
"LABEL_97": 97,
"LABEL_98": 98,
"LABEL_99": 99
"airplane": 90,
"animal": 126,
"apparel": 92,
"arcade machine": 78,
"armchair": 30,
"ashcan": 138,
"awning": 86,
"bag": 115,
"ball": 119,
"bannister": 95,
"bar": 77,
"barrel": 111,
"base": 40,
"basket": 112,
"bathtub": 37,
"bed ": 7,
"bench": 69,
"bicycle": 127,
"blanket": 131,
"blind": 63,
"boat": 76,
"book": 67,
"bookcase": 62,
"booth": 88,
"bottle": 98,
"box": 41,
"bridge": 61,
"buffet": 99,
"building": 1,
"bulletin board": 144,
"bus": 80,
"cabinet": 10,
"canopy": 106,
"car": 20,
"case": 55,
"ceiling": 5,
"chair": 19,
"chandelier": 85,
"chest of drawers": 44,
"clock": 148,
"coffee table": 64,
"column": 42,
"computer": 74,
"conveyer belt": 105,
"counter": 45,
"countertop": 70,
"cradle": 117,
"crt screen": 141,
"curtain": 18,
"cushion": 39,
"desk": 33,
"dirt track": 91,
"dishwasher": 129,
"door": 14,
"earth": 13,
"escalator": 96,
"fan": 139,
"fence": 32,
"field": 29,
"fireplace": 49,
"flag": 149,
"floor": 3,
"flower": 66,
"food": 120,
"fountain": 104,
"glass": 147,
"grandstand": 51,
"grass": 9,
"hill": 68,
"hood": 133,
"house": 25,
"hovel": 79,
"kitchen island": 73,
"lake": 128,
"lamp": 36,
"land": 94,
"light": 82,
"microwave": 124,
"minibike": 116,
"mirror": 27,
"monitor": 143,
"mountain": 16,
"ottoman": 97,
"oven": 118,
"painting": 22,
"palm": 72,
"path": 52,
"person": 12,
"pier": 140,
"pillow": 57,
"plant": 17,
"plate": 142,
"plaything": 108,
"pole": 93,
"pool table": 56,
"poster": 100,
"pot": 125,
"radiator": 146,
"railing": 38,
"refrigerator": 50,
"river": 60,
"road": 6,
"rock": 34,
"rug": 28,
"runway": 54,
"sand": 46,
"sconce": 134,
"screen": 130,
"screen door": 58,
"sculpture": 132,
"sea": 26,
"seat": 31,
"shelf": 24,
"ship": 103,
"shower": 145,
"sidewalk": 11,
"signboard": 43,
"sink": 47,
"sky": 2,
"skyscraper": 48,
"sofa": 23,
"stage": 101,
"stairs": 53,
"stairway": 59,
"step": 121,
"stool": 110,
"stove": 71,
"streetlight": 87,
"swimming pool": 109,
"swivel chair": 75,
"table": 15,
"tank": 122,
"television receiver": 89,
"tent": 114,
"toilet": 65,
"towel": 81,
"tower": 84,
"trade name": 123,
"traffic light": 136,
"tray": 137,
"tree": 4,
"truck": 83,
"van": 102,
"vase": 135,
"wall": 0,
"wardrobe": 35,
"washer": 107,
"water": 21,
"waterfall": 113,
"windowpane": 8
},
"layer_norm_eps": 1e-06,
"mlp_ratios": [
@@ -354,6 +354,7 @@
3,
3
],
"reshape_last_stage": true,
"sr_ratios": [
8,
4,
@@ -366,5 +367,6 @@
2,
2
],
"transformers_version": "4.9.0.dev0"
"torch_dtype": "float32",
"transformers_version": "4.12.0.dev0"
}

preprocessor_config.json

@@ -1,22 +1,18 @@
{
"do_normalize": true,
"do_resize": true,
"feature_extractor_type": "SegFormerFeatureExtractor",
"feature_extractor_type": "SegformerFeatureExtractor",
"image_mean": [
0.485,
0.456,
0.406
],
"image_scale": [
2048,
512
],
"image_std": [
0.229,
0.224,
0.225
],
"keep_ratio": true,
"reduce_labels": true,
"resample": 2,
"size_divisor": 32
"size": 512
}
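The updated preprocessor config boils down to resizing to 512 pixels and applying standard ImageNet normalization. A rough sketch of the equivalent manual normalization on a dummy image tensor (the feature extractor itself also handles PIL conversion and the `reduce_labels` handling for segmentation maps):

```python
import torch

# ImageNet mean/std, as listed in preprocessor_config.json above.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

image = torch.rand(3, 512, 512)      # dummy RGB image, values in [0, 1]
pixel_values = (image - mean) / std  # roughly what the feature extractor emits
```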

pytorch_model.bin (Stored with Git LFS): binary file not shown.

tf_model.h5 (Stored with Git LFS) Normal file: binary file not shown.