🚀 Add `PreProcessor` to `AnomalyModule` #2358

samet-akcay · 2024-10-09T13:30:26Z

📝 Description

The PreProcessor class serves as both a PyTorch module and a Lightning callback, handling transforms during different stages of training, validation, testing and prediction. This PR demonstrates how to create and use custom pre-processors.

Key Components

The pre-processor functionality is implemented in:

class PreProcessor(nn.Module, Callback):
    """Anomalib pre-processor.

    This class serves as both a PyTorch module and a Lightning callback, handling
    the application of transforms to data batches during different stages of
    training, validation, testing, and prediction.

    Args:
        train_transform (Transform | None): Transform to apply during training.
        val_transform (Transform | None): Transform to apply during validation.
        test_transform (Transform | None): Transform to apply during testing.
        transform (Transform | None): General transform to apply if stage-specific
            transforms are not provided.

    Raises:
        ValueError: If both `transform` and any of the stage-specific transforms
            are provided simultaneously.

    Notes:
        If only `transform` is provided, it will be used for all stages (train, val, test).

        Priority of transforms:
        1. Explicitly set PreProcessor transforms (highest priority)
        2. Datamodule transforms (if PreProcessor has no transforms)
        3. Dataloader transforms (if neither PreProcessor nor datamodule have transforms)
        4. Default transforms (lowest priority)

    Examples:
        >>> from torchvision.transforms.v2 import CenterCrop, Compose, Resize, ToTensor
        >>> from anomalib.pre_processing import PreProcessor

        >>> # Define transforms
        >>> train_transform = Compose([Resize((224, 224)), ToTensor()])
        >>> val_transform = Compose([Resize((256, 256)), CenterCrop((224, 224)), ToTensor()])

        >>> # Create PreProcessor with stage-specific transforms
        >>> pre_processor = PreProcessor(
        ...     train_transform=train_transform,
        ...     val_transform=val_transform
        ... )

        >>> # Create PreProcessor with a single transform for all stages
        >>> common_transform = Compose([Resize((224, 224)), ToTensor()])
        >>> pre_processor_common = PreProcessor(transform=common_transform)

        >>> # Use in a Lightning module
        >>> from lightning.pytorch import LightningModule
        >>> class MyModel(LightningModule):
        ...     def __init__(self):
        ...         super().__init__()
        ...         self.pre_processor = PreProcessor(...)
        ...
        ...     def configure_callbacks(self):
        ...         return [self.pre_processor]
        ...
        ...     def training_step(self, batch, batch_idx):
        ...         # The pre_processor will automatically apply the correct transform
        ...         processed_batch = self.pre_processor(batch)
        ...         # Rest of the training step
    """

And used by the base AnomalyModule in:

    def _resolve_pre_processor(self, pre_processor: PreProcessor | bool) -> PreProcessor | None:
        """Resolve and validate which pre-processor to use.

        Args:
            pre_processor: Pre-processor configuration
                - True -> use default pre-processor
                - False -> no pre-processor
                - PreProcessor -> use the provided pre-processor

        Returns:
            Configured pre-processor, or None if pre-processing is disabled
        """
        if isinstance(pre_processor, PreProcessor):
            return pre_processor
        if isinstance(pre_processor, bool):
            # True -> default pre-processor; False -> pre-processing disabled
            return self.configure_pre_processor() if pre_processor else None
        msg = f"Invalid pre-processor type: {type(pre_processor)}"
        raise TypeError(msg)
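
The three cases listed in the docstring (`True`, `False`, or a `PreProcessor` instance) can be exercised in isolation. The sketch below is a stand-alone illustration: the `PreProcessor` class here is an empty stand-in and `default_pre_processor` and `resolve_pre_processor` are hypothetical names, so nothing in it should be read as the actual anomalib implementation.

```python
class PreProcessor:
    """Empty stand-in for anomalib's PreProcessor (illustration only)."""


def default_pre_processor() -> PreProcessor:
    """Hypothetical factory playing the role of `configure_pre_processor`."""
    return PreProcessor()


def resolve_pre_processor(pre_processor):
    """Map the user-facing argument to a pre-processor instance or None."""
    if isinstance(pre_processor, PreProcessor):
        # An explicit instance is used as-is.
        return pre_processor
    if isinstance(pre_processor, bool):
        # True selects the default; False disables pre-processing.
        return default_pre_processor() if pre_processor else None
    msg = f"Invalid pre-processor type: {type(pre_processor)}"
    raise TypeError(msg)
```

Any other argument type, such as a string, falls through to the `TypeError`.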

Usage Examples

1. Using Default Pre-Processor

The simplest option is the default pre-processor, which resizes images to 256x256 and normalizes them using ImageNet statistics:

from anomalib.models import PatchCore

# Uses default pre-processor
model = PatchCore()

2. Custom Pre-Processor with Different Transforms

Create a pre-processor with custom transforms for different stages:

from torchvision.transforms.v2 import Compose, Resize, CenterCrop, RandomHorizontalFlip, Normalize
from anomalib.pre_processing import PreProcessor

# Define stage-specific transforms
train_transform = Compose([
    Resize((256, 256), antialias=True),
    RandomHorizontalFlip(p=0.5),
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

val_transform = Compose([
    Resize((256, 256), antialias=True),
    CenterCrop((224, 224)),
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Create pre-processor with different transforms per stage
pre_processor = PreProcessor(
    train_transform=train_transform,
    val_transform=val_transform,
    test_transform=val_transform  # Use same transform as validation for testing
)

# Use custom pre-processor in model
model = PatchCore(pre_processor=pre_processor)

3. Disable Pre-Processing

To disable pre-processing entirely:

model = PatchCore(pre_processor=False)

4. Override Default Pre-Processor in Custom Model

Custom models can override the default pre-processor configuration:

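from torchvision.transforms.v2 import Compose, Normalize, Resize

from anomalib.models.components.base import AnomalyModule
from anomalib.pre_processing import PreProcessor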

class CustomModel(AnomalyModule):
    @classmethod
    def configure_pre_processor(cls, image_size=(224, 224)) -> PreProcessor:
        transform = Compose([
            Resize(image_size, antialias=True),
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ])
        return PreProcessor(transform=transform)

Notes

Pre-processor transforms are applied in order of priority:
- Explicitly set PreProcessor transforms (highest)
- Datamodule transforms
- Dataloader transforms
- Default transforms (lowest)

The pre-processor automatically handles both image and mask transforms during training.

Custom transforms should maintain compatibility with both image and segmentation mask inputs.
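
The priority order above amounts to "take the first source that actually provides a transform". A minimal sketch of that rule, assuming a hypothetical `pick_transform` helper and treating `None` as "not provided":

```python
def pick_transform(
    pre_processor_transform=None,
    datamodule_transform=None,
    dataloader_transform=None,
    default_transform=None,
):
    """Return the highest-priority transform that is set (hypothetical helper)."""
    # Ordered from highest to lowest priority, as in the list above.
    candidates = (
        pre_processor_transform,
        datamodule_transform,
        dataloader_transform,
        default_transform,
    )
    for candidate in candidates:
        if candidate is not None:
            return candidate
    # No source provided a transform.
    return None
```

With this rule, a datamodule transform is only used when the pre-processor itself has none, and the default is only a last resort.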

Testing

Added unit tests to verify:
- Default pre-processor behavior
- Custom transform application
- Transform priority order
- Mask transformation handling

✨ Changes

Select what type of change your PR is:

🐞 Bug fix (non-breaking change which fixes an issue)
🔨 Refactor (non-breaking change which refactors the code base)
🚀 New feature (non-breaking change which adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
📚 Documentation update
🔒 Security update

✅ Checklist

Before you submit your pull request, please make sure you have completed the following steps:

📋 I have summarized my changes in the CHANGELOG and followed the guidelines for my type of change (skip for minor changes, documentation updates, and test enhancements).
📚 I have made the necessary updates to the documentation (if applicable).
🧪 I have written tests that support my changes and prove that my fix is effective or my feature works (if applicable).

For more information about code review checklists, see the Code Review Checklist.

Signed-off-by: Samet Akcay <samet.akcay@intel.com>

jpcbertoldo · 2024-10-09T14:34:46Z

A sub-feature request that would fit here: (optionally?) keep both the transformed and original image/mask in the batch.

So instead of

            image, gt_mask = self.XXX_transform(batch.image, batch.gt_mask)
            batch.update(image=image, gt_mask=gt_mask)

something like

            batch.update(image_original=batch.image, gt_mask_original=batch.gt_mask)
            image, gt_mask = self.XXX_transform(batch.image, batch.gt_mask)
            batch.update(image=image, gt_mask=gt_mask)

It's quite practical to have these when using the API (I've re-implemented this in my local copy 100 times haha).
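
A minimal sketch of the request above, assuming a simplified `Batch` dataclass with an `update` method and a transform that takes and returns an `(image, gt_mask)` pair. The field names `image_original` and `gt_mask_original` come from the comment; `Batch`, `apply_transform` and the `keep_original` flag are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Batch:
    """Simplified stand-in for the batch object (illustration only)."""

    image: Any
    gt_mask: Any
    image_original: Optional[Any] = None
    gt_mask_original: Optional[Any] = None

    def update(self, **kwargs: Any) -> None:
        # Overwrite the given fields in place.
        for key, value in kwargs.items():
            setattr(self, key, value)


def apply_transform(batch: Batch, transform, keep_original: bool = True) -> Batch:
    """Apply `transform` to image and mask, optionally keeping the originals."""
    if keep_original:
        # Store the untouched inputs before they are overwritten.
        batch.update(image_original=batch.image, gt_mask_original=batch.gt_mask)
    image, gt_mask = transform(batch.image, batch.gt_mask)
    batch.update(image=image, gt_mask=gt_mask)
    return batch
```

For example, with a transform that reverses both inputs, `batch.image` holds the reversed data afterwards while `batch.image_original` still holds the input as loaded.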

samet-akcay · 2024-10-09T14:51:11Z

> A sub-feature request that would fit here: (optionally?) keep both the transformed and original image/mask in the batch.
>
> So instead of
>
>     image, gt_mask = self.XXX_transform(batch.image, batch.gt_mask)
>     batch.update(image=image, gt_mask=gt_mask)
>
> something like
>
>     batch.update(image_original=batch.image, gt_mask_original=batch.gt_mask)
>     image, gt_mask = self.XXX_transform(batch.image, batch.gt_mask)
>     batch.update(image=image, gt_mask=gt_mask)
>
> It's quite practical to have these when using the API (I've re-implemented this in my local copy 100 times haha).

yeah, the idea is to keep batch.image and batch.gt_mask original outside the model. It is not working that way though :)

jpcbertoldo · 2024-10-09T15:26:28Z

> yeah, the idea is to keep batch.image and batch.gt_mask original outside the model

exactly, makes sense : )

but it's also useful to be able to access the transformed one (e.g. when using augmentations)

> it is not working that way though :)

didn't get this. cause it's not backward compatible?

samet-akcay · 2024-10-09T17:54:12Z

>> yeah, the idea is to keep batch.image and batch.gt_mask original outside the model
>
> exactly, makes sense : )
>
> but it's also useful to be able to access the transformed one (e.g. when using augmentations)
>
>> it is not working that way though :)
>
> didn't get this. cause it's not backward compatible?

oh I meant, it is currently not working, I need to fix it :)

review-notebook-app · 2024-10-17T19:22:34Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

ashwinvaidya17

Thanks. I have a few minor comments

ashwinvaidya17 · 2024-10-30T10:43:31Z

src/anomalib/models/components/base/anomaly_module.py

@@ -220,30 +250,12 @@ def input_size(self) -> tuple[int, int] | None:
        The effective input size is the size of the input tensor after the transform has been applied. If the transform
        is not set, or if the transform does not change the shape of the input tensor, this method will return None.
        """
-        transform = self.transform or self.configure_transforms()
+        transform = self.pre_processor.train_transform


Should we add a check to ascertain whether train_transform is present? Models like VlmAD might not have train_transforms passed to them. I feel it should pick up the val or pred transform if train is not available.
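
One way to read this suggestion is as a simple fallback chain. The sketch below is an assumption-laden illustration, not the merged fix: `first_available_transform` is a hypothetical helper, and it falls back through the `train_transform`, `val_transform` and `test_transform` attributes documented on `PreProcessor` (the PR description does not show a dedicated prediction transform).

```python
from types import SimpleNamespace


def first_available_transform(pre_processor, stages=("train", "val", "test")):
    """Return the first stage transform that is set, or None (hypothetical helper)."""
    for stage in stages:
        # Attribute names follow the `<stage>_transform` pattern from the docstring.
        transform = getattr(pre_processor, f"{stage}_transform", None)
        if transform is not None:
            return transform
    return None


# Stand-in object for a model that was given no training transform.
pre_processor = SimpleNamespace(
    train_transform=None,
    val_transform="val-transform",
    test_transform="test-transform",
)
selected = first_available_transform(pre_processor)
```

Here `selected` is the validation transform, so `input_size` could still be computed for models that never train.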

djdameln

Thanks! I think we could merge this

samet-akcay added 6 commits October 9, 2024 12:14

- 5338afa Created pre-processor
- 180c22f Rename transforms to transform in pre-processor
- 7738e38 Remove transforms from datamodules
- a133048 Remove transforms from datasets
- c748a0d Remove setup_transforms from Engine
- 03a2a2e Add preprocessor to AnomalyModule and models

samet-akcay requested review from ashwinvaidya17 and djdameln as code owners October 9, 2024 13:30

samet-akcay added 6 commits October 10, 2024 15:02

- 6f7399a Fix tests
- cc5f559 Remove self._transform from AnomalyModule
- 4d2e110 revert transforms in datasets
- 1e83e57 fix efficient_ad and engine.config tests
- 1e05349 Update the upgrade tests
- 785d64f Revert on_load_checkpoint hook to AnomalyModule

This was referenced Oct 14, 2024

[Bug]: Output segmentation mask mismatch #1660

Open

📋 [TASK] Integrate Pre-processing as AnomalibModule Attribute #2366

Open

samet-akcay linked an issue Oct 14, 2024 that may be closed by this pull request

📋 [TASK] Integrate Pre-processing as AnomalibModule Attribute #2366

Open

samet-akcay added 7 commits October 15, 2024 05:58

- b798243 Remove exportable transform from anomaly module and move to pre-processor
- 4bf6187 Merge branch 'feature/design-simplifications' of github.com:openvinotoolkit/anomalib into add-pre-processor
- c942604 Merge main
- ea28833 Add pre-processor to the model graph
- 78cf516 Add docstring to pre-processor class
- 46fe7e5 Fix win-clip tests
- f058fbb Update notebooks
- 84c39cd Split the forward logic and move the training to model hooks

samet-akcay added 9 commits October 29, 2024 13:29

- c71e41c Remove transforms from datamodules
- b9bb700 Remove transforms from datamodules
- 185fec8 Remove transforms from datamodules
- 06fd947 Remove transforms from datamodules
- 079168e Remove transforms from datamodules
- 1f6555c Remove transform related keys from data configs
- 03196fa update preprocessor tests
- d579312 Remove setup method from the model implementations
- 5e82c34 Remove image size from datamodules in jupyter notebooks

ashwinvaidya17 reviewed Oct 30, 2024

- a1a0548 Modify folder notebook to acccess the batch from dataset not dataloader

samet-akcay requested a review from djdameln October 31, 2024 05:26

- 0ab0a71 Create resolve preprocessor method

samet-akcay requested a review from ashwinvaidya17 November 4, 2024 18:13

djdameln requested changes Nov 5, 2024

samet-akcay added 4 commits November 5, 2024 14:21

- 401fbaa Return if is
- 9b45def Rename self.exportable_transform to self.export_transform
- f5fbb7c Remove set_datamodule_transforms
- 9cc11d0 remove hooks as they are not needed anymore

samet-akcay requested a review from djdameln November 5, 2024 14:46

- 05c86da Fix pre-processor tests

djdameln reviewed Nov 5, 2024

src/anomalib/pre_processing/pre_processing.py (outdated, resolved)

- 3eecd89 remove transform getter util function

djdameln approved these changes Nov 6, 2024

djdameln reviewed Nov 7, 2024

src/anomalib/pre_processing/pre_processing.py (outdated, resolved)

samet-akcay added 3 commits November 7, 2024 18:13

- ad43f40 Fix transform dict to setup datamodule transforms
- cb7682b Resolve merge conflicts
- b99717c Fix Fastflow notebook

samet-akcay merged commit dddf707 into openvinotoolkit:feature/v2 Nov 8, 2024
7 checks passed

samet-akcay deleted the add-pre-processor branch November 8, 2024 07:10
