Train sam2 on SA-1B #432

Open · xings19 opened this issue Nov 2, 2024 · 1 comment

xings19 commented Nov 2, 2024

I saw the config in another issue and gave it a try, but ran into a problem.

Hi @hpichlerbio, thanks for your interest. One way to finetune on your custom image-only dataset is to follow the last section in the training README but remove the video dataset from the mix. The config would look something like this:

data:
  train:
    _target_: training.dataset.sam2_datasets.TorchTrainMixedDataset 
    phases_per_epoch: ${phases_per_epoch} # Chunks a single epoch into smaller phases
    batch_sizes: # List of batch sizes corresponding to each dataset
    - ${bs1} # Batch size of dataset 1
    datasets:
    # Custom Image dataset
    - _target_: training.dataset.vos_dataset.VOSDataset
      training: true
      video_dataset:
        _target_: training.dataset.vos_raw_dataset.CustomImageDataset # Your custom Dataset class
        img_folder: ${path_to_img_folder}
        gt_folder: ${path_to_gt_folder}
        file_list_txt: ${path_to_train_filelist} # Optional
      sampler:
        _target_: training.dataset.vos_sampler.RandomUniformSampler
        num_frames: 1
        max_num_objects: ${max_num_objects_per_image}
      transforms: ${image_transforms}
    shuffle: True
    num_workers: ${num_train_workers}
    pin_memory: True
    drop_last: True
    collate_fn:
      _target_: training.utils.data_utils.collate_fn
      _partial_: true
      dict_key: all

Note that if you'd like to use your custom dataset, you should implement your own dataset class (similar to SA1BRawDataset). If your dataset is in SA1B format, you can directly use SA1BRawDataset. Please let me know if you have further questions.
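
For reference, a minimal sketch of such a custom image dataset class might look like the one below. This assumes the get_video(idx) -> (VOSVideo, segment_loader) interface and the VOSVideo/VOSFrame helpers from vos_raw_dataset.py, and it assumes ground-truth masks are stored as palettised PNGs in a per-image folder; the file layout and the PalettisedPNGSegmentLoader usage are assumptions to be checked against the actual repo code:

import os

from training.dataset.vos_raw_dataset import VOSRawDataset, VOSVideo, VOSFrame
from training.dataset.vos_segment_loader import PalettisedPNGSegmentLoader


class CustomImageDataset(VOSRawDataset):
    """Image-only dataset: every "video" is a single frame (sketch only)."""

    def __init__(self, img_folder, gt_folder, file_list_txt=None):
        self.img_folder = img_folder
        self.gt_folder = gt_folder
        if file_list_txt is not None:
            # Optional text file with one image name (without extension) per line
            with open(file_list_txt) as f:
                self.image_names = [line.strip() for line in f]
        else:
            self.image_names = sorted(
                os.path.splitext(name)[0] for name in os.listdir(img_folder)
            )

    def get_video(self, idx):
        image_name = self.image_names[idx]
        # A one-frame "video" whose single frame points at the image file
        frames = [VOSFrame(0, image_path=os.path.join(self.img_folder, image_name + ".jpg"))]
        video = VOSVideo(image_name, idx, frames)
        # Assumption: masks stored as palettised PNGs in a per-image folder
        segment_loader = PalettisedPNGSegmentLoader(os.path.join(self.gt_folder, image_name))
        return video, segment_loader

    def __len__(self):
        return len(self.image_names)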

When I wrote the config like this, the program raised an error: I think the multiplier field is missing from the dataset entry. After adding it (see the multiplier line in my dataset config below), I ran training on SA-1B, but it constantly prints a warning like this:

INFO 2024-11-02 19:49:22,049 train_utils.py: 271: Train Epoch: [0][ 100/2796] | Batch Time: 0.91 (1.31) | Data Time: 0.00 (0.44) | Mem (GB): 52.00 (52.77/54.00) | Time Elapsed: 00d 00h 02m | Losses/train_all_loss: 5.03e-01 (7.52e-01)
WARNING:root:Skip RandomAffine for zero-area mask in first frame after 1 tentatives
WARNING:root:Skip RandomAffine for zero-area mask in first frame after 1 tentatives
WARNING:root:Skip RandomAffine for zero-area mask in first frame after 1 tentatives
(the same warning repeats many more times)

What should I do to reduce this warning?
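
From the message, my guess is that the warning fires whenever the affine-augmented first-frame mask comes out empty, for example when a very small mask gets translated out of view. Would pre-filtering tiny masks before sampling be a reasonable workaround? A rough sketch of what I mean (the helper and threshold below are hypothetical, not from the repo):

import numpy as np


def filter_tiny_masks(segments, min_area=64):
    """Hypothetical pre-filter: drop masks with very few foreground pixels,
    since those seem most likely to end up with zero area after RandomAffine,
    which is what triggers the warning above.

    segments is assumed to map object id -> binary mask array, matching
    what the segment loaders return.
    """
    return {
        obj_id: mask
        for obj_id, mask in segments.items()
        if np.asarray(mask).sum() >= min_area
    }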
My hyperparameters:

scratch:
  resolution: 1024
  train_batch_size: 4
  num_train_workers: 10
  num_frames: 1
  max_num_objects: 5
  base_lr: 5.0e-6
  vision_lr: 3.0e-06
  phases_per_epoch: 1
  num_epochs: 40

dataset:
  # PATHS to Dataset
  img_folder: ./SA-1B
  gt_folder: ./SA-1B 
  multiplier: 2

My dataset config:

data:
    train:
      _target_: training.dataset.sam2_datasets.TorchTrainMixedDataset 
      phases_per_epoch: ${scratch.phases_per_epoch} # Chunks a single epoch into smaller phases
      batch_sizes: # List of batch sizes corresponding to each dataset
        - ${scratch.train_batch_size}
      datasets:
      # Custom Image dataset
      - _target_: training.dataset.vos_dataset.VOSDataset
        training: true
        video_dataset:
          _target_: training.dataset.vos_raw_dataset.SA1BRawDataset # Built-in SA-1B dataset class
          img_folder: ${dataset.img_folder}
          gt_folder: ${dataset.gt_folder}
        sampler:
          _target_: training.dataset.vos_sampler.RandomUniformSampler
          num_frames: ${scratch.num_frames}
          max_num_objects: 5
        transforms: ${vos.train_transforms}
        multiplier: ${dataset.multiplier}
      shuffle: True
      num_workers: ${scratch.num_train_workers}
      pin_memory: True
      drop_last: True
      collate_fn:
        _target_: training.utils.data_utils.collate_fn
        _partial_: true
        dict_key: all
FDD-github commented

I have the same problem. My image transform config is such that:

# Image transforms
image_transforms:
  - _target_: training.dataset.transforms.ComposeAPI
    transforms:
      - _target_: training.dataset.transforms.RandomHorizontalFlip
        consistent_transform: True
      - _target_: training.dataset.transforms.RandomResizeAPI
        sizes: ${scratch.resolution}
        square: true
        consistent_transform: True
      - _target_: training.dataset.transforms.ToTensorAPI
      - _target_: training.dataset.transforms.NormalizeAPI
        mean: [ 0.485, 0.456, 0.406 ]
        std: [ 0.229, 0.224, 0.225 ]
