Remove RGB -> BGR image conversion in Object Detection tutorial (#6228)
mariosasko authored Sep 8, 2023
1 parent d058d6e commit 99641ce
Showing 1 changed file with 6 additions and 9 deletions.
docs/source/object_detection.mdx: 15 changes (6 additions & 9 deletions)
@@ -13,7 +13,7 @@ In this example, you'll use the [`cppe-5`](https://huggingface.co/datasets/cppe-
 Load the dataset and take a look at an example:
 
 ```py
-from datasets import load_dataset
+>>> from datasets import load_dataset
 
 >>> ds = load_dataset("cppe-5")
 >>> example = ds['train'][0]
@@ -74,8 +74,6 @@ You can visualize the `bboxes` on the image using some internal torch utilities.
 
 With `albumentations`, you can apply transforms that will affect the image while also updating the `bboxes` accordingly. In this case, the image is resized to (480, 480), flipped horizontally, and brightened.
 
-`albumentations` expects the image to be in BGR format, not RGB, so you'll have to convert the image before applying the transform.
-
 ```py
 >>> import albumentations
 >>> import numpy as np
@@ -86,8 +84,7 @@ With `albumentations`, you can apply transforms that will affect the image while
 ...     albumentations.RandomBrightnessContrast(p=1.0),
 ... ], bbox_params=albumentations.BboxParams(format='coco', label_fields=['category']))
 
->>> # RGB PIL Image -> BGR Numpy array
->>> image = np.flip(np.array(example['image']), -1)
+>>> image = np.array(example['image'])
 >>> out = transform(
 ...     image=image,
 ...     bboxes=example['objects']['bbox'],
@@ -98,7 +95,7 @@ With `albumentations`, you can apply transforms that will affect the image while
 Now when you visualize the result, the image should be flipped, but the `bboxes` should still be in the right places.
 
 ```py
->>> image = torch.tensor(out['image']).flip(-1).permute(2, 0, 1)
+>>> image = torch.tensor(out['image']).permute(2, 0, 1)
 >>> boxes_xywh = torch.stack([torch.tensor(x) for x in out['bboxes']])
 >>> boxes_xyxy = box_convert(boxes_xywh, 'xywh', 'xyxy')
 >>> labels = [categories.int2str(x) for x in out['category']]
@@ -122,13 +119,13 @@ Create a function to apply the transform to a batch of examples:
 >>> def transforms(examples):
 ...     images, bboxes, categories = [], [], []
 ...     for image, objects in zip(examples['image'], examples['objects']):
-...         image = np.array(image.convert("RGB"))[:, :, ::-1]
+...         image = np.array(image.convert("RGB"))
 ...         out = transform(
 ...             image=image,
 ...             bboxes=objects['bbox'],
 ...             category=objects['category']
 ...         )
-...         images.append(torch.tensor(out['image']).flip(-1).permute(2, 0, 1))
+...         images.append(torch.tensor(out['image']).permute(2, 0, 1))
 ...         bboxes.append(torch.tensor(out['bboxes']))
 ...         categories.append(out['category'])
 ...     return {'image': images, 'bbox': bboxes, 'category': categories}
@@ -164,4 +161,4 @@ Now that you know how to process a dataset for object detection, learn
 [how to train an object detection model](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/YOLOS/Fine_tuning_YOLOS_for_object_detection_on_custom_dataset_(balloon).ipynb)
 and use it for inference.
 
-</Tip>
\ No newline at end of file
+</Tip>
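For reference, here is a minimal sketch of the single-example pipeline as it reads after this change: `albumentations` receives the RGB NumPy array directly, and no channel flip is applied on input or output. It is assembled from the "+" and context lines above; it assumes `example` and the `categories` `ClassLabel` feature are loaded as earlier in the tutorial, that `box_convert` is imported from `torchvision.ops` (the tutorial's import lines sit outside this diff), and that the `category=` argument mirrors the batched `transforms` function in the diff.

```py
>>> import albumentations
>>> import numpy as np
>>> import torch
>>> from torchvision.ops import box_convert  # assumed import; not shown in this diff

>>> # Same pipeline as the updated tutorial: resize, flip, brighten,
>>> # with COCO-format boxes kept in sync via BboxParams.
>>> transform = albumentations.Compose([
...     albumentations.Resize(480, 480),
...     albumentations.HorizontalFlip(p=1.0),
...     albumentations.RandomBrightnessContrast(p=1.0),
... ], bbox_params=albumentations.BboxParams(format='coco', label_fields=['category']))

>>> # The PIL image converts to an RGB NumPy array; no BGR flip is needed,
>>> # since these albumentations transforms are colour-space agnostic.
>>> image = np.array(example['image'])
>>> out = transform(
...     image=image,
...     bboxes=example['objects']['bbox'],
...     category=example['objects']['category'],
... )

>>> # HWC uint8 array -> CHW tensor for torchvision's drawing utilities,
>>> # again without the former .flip(-1) channel reversal.
>>> image = torch.tensor(out['image']).permute(2, 0, 1)
>>> boxes_xywh = torch.stack([torch.tensor(x) for x in out['bboxes']])
>>> boxes_xyxy = box_convert(boxes_xywh, 'xywh', 'xyxy')
>>> labels = [categories.int2str(x) for x in out['category']]
```

The batched `transforms` function from the diff can then be attached on the fly, e.g. with `ds['train'].with_transform(transforms)`; that usage is assumed here, as the exact call sits outside the changed lines.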
