Remove RGB -> BGR image conversion in Object Detection tutorial (#6228)
mariosasko authored Sep 8, 2023
1 parent d058d6e commit 99641ce
Showing 1 changed file with 6 additions and 9 deletions.
docs/source/object_detection.mdx: 15 changes (6 additions & 9 deletions)
@@ -13,7 +13,7 @@ In this example, you'll use the [`cppe-5`](https://huggingface.co/datasets/cppe-
 Load the dataset and take a look at an example:
 
 ```py
-from datasets import load_dataset
+>>> from datasets import load_dataset
 
 >>> ds = load_dataset("cppe-5")
 >>> example = ds['train'][0]
@@ -74,8 +74,6 @@ You can visualize the `bboxes` on the image using some internal torch utilities.
 
 With `albumentations`, you can apply transforms that will affect the image while also updating the `bboxes` accordingly. In this case, the image is resized to (480, 480), flipped horizontally, and brightened.
 
-`albumentations` expects the image to be in BGR format, not RGB, so you'll have to convert the image before applying the transform.
-
 ```py
 >>> import albumentations
 >>> import numpy as np
@@ -86,8 +84,7 @@ With `albumentations`, you can apply transforms that will affect the image while
 ...     albumentations.RandomBrightnessContrast(p=1.0),
 ... ], bbox_params=albumentations.BboxParams(format='coco', label_fields=['category']))
 
->>> # RGB PIL Image -> BGR Numpy array
->>> image = np.flip(np.array(example['image']), -1)
+>>> image = np.array(example['image'])
 >>> out = transform(
 ...     image=image,
 ...     bboxes=example['objects']['bbox'],
@@ -98,7 +95,7 @@ With `albumentations`, you can apply transforms that will affect the image while
 Now when you visualize the result, the image should be flipped, but the `bboxes` should still be in the right places.
 
 ```py
->>> image = torch.tensor(out['image']).flip(-1).permute(2, 0, 1)
+>>> image = torch.tensor(out['image']).permute(2, 0, 1)
 >>> boxes_xywh = torch.stack([torch.tensor(x) for x in out['bboxes']])
 >>> boxes_xyxy = box_convert(boxes_xywh, 'xywh', 'xyxy')
 >>> labels = [categories.int2str(x) for x in out['category']]
@@ -122,13 +119,13 @@ Create a function to apply the transform to a batch of examples:
 >>> def transforms(examples):
 ...     images, bboxes, categories = [], [], []
 ...     for image, objects in zip(examples['image'], examples['objects']):
-...         image = np.array(image.convert("RGB"))[:, :, ::-1]
+...         image = np.array(image.convert("RGB"))
 ...         out = transform(
 ...             image=image,
 ...             bboxes=objects['bbox'],
 ...             category=objects['category']
 ...         )
-...         images.append(torch.tensor(out['image']).flip(-1).permute(2, 0, 1))
+...         images.append(torch.tensor(out['image']).permute(2, 0, 1))
 ...         bboxes.append(torch.tensor(out['bboxes']))
 ...         categories.append(out['category'])
 ...     return {'image': images, 'bbox': bboxes, 'category': categories}
@@ -164,4 +161,4 @@ Now that you know how to process a dataset for object detection, learn
 [how to train an object detection model](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/YOLOS/Fine_tuning_YOLOS_for_object_detection_on_custom_dataset_(balloon).ipynb)
 and use it for inference.
 
-</Tip>
\ No newline at end of file
+</Tip>
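For reference, here is a minimal sketch of the single-example pipeline as it reads after this change: `albumentations` receives the RGB NumPy array directly, and no channel flip is applied on input or output. It is assembled from the "+" and context lines above; it assumes `example` and the `categories` `ClassLabel` feature are loaded as earlier in the tutorial, that `box_convert` is imported from `torchvision.ops` (the tutorial's import lines sit outside this diff), and that the `category=` argument mirrors the batched `transforms` function in the diff.

```py
>>> import albumentations
>>> import numpy as np
>>> import torch
>>> from torchvision.ops import box_convert  # assumed import; not shown in this diff

>>> # Same pipeline as the updated tutorial: resize, flip, brighten,
>>> # with COCO-format boxes kept in sync via BboxParams.
>>> transform = albumentations.Compose([
...     albumentations.Resize(480, 480),
...     albumentations.HorizontalFlip(p=1.0),
...     albumentations.RandomBrightnessContrast(p=1.0),
... ], bbox_params=albumentations.BboxParams(format='coco', label_fields=['category']))

>>> # The PIL image converts to an RGB NumPy array; no BGR flip is needed,
>>> # since these albumentations transforms are colour-space agnostic.
>>> image = np.array(example['image'])
>>> out = transform(
...     image=image,
...     bboxes=example['objects']['bbox'],
...     category=example['objects']['category'],
... )

>>> # HWC uint8 array -> CHW tensor for torchvision's drawing utilities,
>>> # again without the former .flip(-1) channel reversal.
>>> image = torch.tensor(out['image']).permute(2, 0, 1)
>>> boxes_xywh = torch.stack([torch.tensor(x) for x in out['bboxes']])
>>> boxes_xyxy = box_convert(boxes_xywh, 'xywh', 'xyxy')
>>> labels = [categories.int2str(x) for x in out['category']]
```

The batched `transforms` function from the diff can then be attached on the fly, e.g. with `ds['train'].with_transform(transforms)`; that usage is assumed here, as the exact call sits outside the changed lines.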
