Train sam2 on SA-1B #432

Open · xings19 opened this issue Nov 2, 2024 · 1 comment

xings19 commented Nov 2, 2024

I saw the config in another issue and gave it a try, but ran into a problem.

Hi @hpichlerbio, thanks for your interest. One way to finetune on your custom image-only dataset is to follow the last section in the training README but remove the video dataset from the mix. The config would look something like this:

data:
  train:
    _target_: training.dataset.sam2_datasets.TorchTrainMixedDataset 
    phases_per_epoch: ${phases_per_epoch} # Chunks a single epoch into smaller phases
    batch_sizes: # List of batch sizes corresponding to each dataset
    - ${bs1} # Batch size of dataset 1
    datasets:
    # Custom Image dataset
    - _target_: training.dataset.vos_dataset.VOSDataset
      training: true
      video_dataset:
        _target_: training.dataset.vos_raw_dataset.CustomImageDataset # Your custom Dataset class
        img_folder: ${path_to_img_folder}
        gt_folder: ${path_to_gt_folder}
        file_list_txt: ${path_to_train_filelist} # Optional
      sampler:
        _target_: training.dataset.vos_sampler.RandomUniformSampler
        num_frames: 1
        max_num_objects: ${max_num_objects_per_image}
      transforms: ${image_transforms}
    shuffle: True
    num_workers: ${num_train_workers}
    pin_memory: True
    drop_last: True
    collate_fn:
      _target_: training.utils.data_utils.collate_fn
      _partial_: true
      dict_key: all

Note that if you'd like to use your custom dataset, you should implement your own dataset class (similar to SA1BRawDataset). If your dataset is in SA1B format, you can directly use SA1BRawDataset. Please let me know if you have further questions.
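
For reference, a minimal sketch of such a custom image dataset class might look like the one below. This assumes the get_video(idx) -> (VOSVideo, segment_loader) interface and the VOSVideo/VOSFrame helpers from vos_raw_dataset.py, and it assumes ground-truth masks are stored as palettised PNGs in a per-image folder; the file layout and the PalettisedPNGSegmentLoader usage are assumptions to be checked against the actual repo code:

import os

from training.dataset.vos_raw_dataset import VOSRawDataset, VOSVideo, VOSFrame
from training.dataset.vos_segment_loader import PalettisedPNGSegmentLoader


class CustomImageDataset(VOSRawDataset):
    """Image-only dataset: every "video" is a single frame (sketch only)."""

    def __init__(self, img_folder, gt_folder, file_list_txt=None):
        self.img_folder = img_folder
        self.gt_folder = gt_folder
        if file_list_txt is not None:
            # Optional text file with one image name (without extension) per line
            with open(file_list_txt) as f:
                self.image_names = [line.strip() for line in f]
        else:
            self.image_names = sorted(
                os.path.splitext(name)[0] for name in os.listdir(img_folder)
            )

    def get_video(self, idx):
        image_name = self.image_names[idx]
        # A one-frame "video" whose single frame points at the image file
        frames = [VOSFrame(0, image_path=os.path.join(self.img_folder, image_name + ".jpg"))]
        video = VOSVideo(image_name, idx, frames)
        # Assumption: masks stored as palettised PNGs in a per-image folder
        segment_loader = PalettisedPNGSegmentLoader(os.path.join(self.gt_folder, image_name))
        return video, segment_loader

    def __len__(self):
        return len(self.image_names)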

When I wrote the config like this, the program raised an error: I think the multiplier field is missing from the dataset entry. After adding it (see the multiplier line in my dataset config below), I ran training on SA-1B, but it constantly prints a warning like this:

INFO 2024-11-02 19:49:22,049 train_utils.py: 271: Train Epoch: [0][ 100/2796] | Batch Time: 0.91 (1.31) | Data Time: 0.00 (0.44) | Mem (GB): 52.00 (52.77/54.00) | Time Elapsed: 00d 00h 02m | Losses/train_all_loss: 5.03e-01 (7.52e-01)
WARNING:root:Skip RandomAffine for zero-area mask in first frame after 1 tentatives
WARNING:root:Skip RandomAffine for zero-area mask in first frame after 1 tentatives
WARNING:root:Skip RandomAffine for zero-area mask in first frame after 1 tentatives
(the same warning repeats many more times)

What should I do to reduce this warning?
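
From the message, my guess is that the warning fires whenever the affine-augmented first-frame mask comes out empty, for example when a very small mask gets translated out of view. Would pre-filtering tiny masks before sampling be a reasonable workaround? A rough sketch of what I mean (the helper and threshold below are hypothetical, not from the repo):

import numpy as np


def filter_tiny_masks(segments, min_area=64):
    """Hypothetical pre-filter: drop masks with very few foreground pixels,
    since those seem most likely to end up with zero area after RandomAffine,
    which is what triggers the warning above.

    segments is assumed to map object id -> binary mask array, matching
    what the segment loaders return.
    """
    return {
        obj_id: mask
        for obj_id, mask in segments.items()
        if np.asarray(mask).sum() >= min_area
    }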
My hyperparameters:

scratch:
  resolution: 1024
  train_batch_size: 4
  num_train_workers: 10
  num_frames: 1
  max_num_objects: 5
  base_lr: 5.0e-6
  vision_lr: 3.0e-06
  phases_per_epoch: 1
  num_epochs: 40

dataset:
  # PATHS to Dataset
  img_folder: ./SA-1B
  gt_folder: ./SA-1B 
  multiplier: 2

My dataset config:

data:
    train:
      _target_: training.dataset.sam2_datasets.TorchTrainMixedDataset 
      phases_per_epoch: ${scratch.phases_per_epoch} # Chunks a single epoch into smaller phases
      batch_sizes: # List of batch sizes corresponding to each dataset
        - ${scratch.train_batch_size}
      datasets:
      # Custom Image dataset
      - _target_: training.dataset.vos_dataset.VOSDataset
        training: true
        video_dataset:
          _target_: training.dataset.vos_raw_dataset.SA1BRawDataset # Built-in SA-1B dataset class
          img_folder: ${dataset.img_folder}
          gt_folder: ${dataset.gt_folder}
        sampler:
          _target_: training.dataset.vos_sampler.RandomUniformSampler
          num_frames: ${scratch.num_frames}
          max_num_objects: 5
        transforms: ${vos.train_transforms}
        multiplier: ${dataset.multiplier}
      shuffle: True
      num_workers: ${scratch.num_train_workers}
      pin_memory: True
      drop_last: True
      collate_fn:
        _target_: training.utils.data_utils.collate_fn
        _partial_: true
        dict_key: all
FDD-github commented

I have the same problem. My image transform config is such that:

# Image transforms
image_transforms:
  - _target_: training.dataset.transforms.ComposeAPI
    transforms:
      - _target_: training.dataset.transforms.RandomHorizontalFlip
        consistent_transform: True
      - _target_: training.dataset.transforms.RandomResizeAPI
        sizes: ${scratch.resolution}
        square: true
        consistent_transform: True
      - _target_: training.dataset.transforms.ToTensorAPI
      - _target_: training.dataset.transforms.NormalizeAPI
        mean: [ 0.485, 0.456, 0.406 ]
        std: [ 0.229, 0.224, 0.225 ]
