Semi-Supervised Piano Transcription Using Pseudo-Labeling Techniques

This is a PyTorch code repository accompanying the following paper:

@inproceedings{StrahlM24_SemiSupPianoTranscription_ISMIR,
  author    = {Sebastian Strahl and Meinard M{\"u}ller},
  title     = {Semi-Supervised Piano Transcription Using Pseudo-Labeling Techniques},
  booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
  address   = {San Francisco, USA},
  year      = {2024}
}

This repository contains code for all of the paper's experiments. The codebase builds upon the PyTorch Implementation of Onsets and Frames by Jong Wook Kim.

All datasets used in the paper are publicly available. For details and references, please see the paper.

Instructions

Installation

cd onsets_frames_semisup
conda env create -f environment.yml
conda activate onsets_frames_semisup

Data Preparation

The following steps download and prepare the required datasets. All audio data is resampled to 16 kHz, and data that is not needed for the experiments is deleted. Note that data preparation requires about 200 GB of disk space for intermediate storage and that ffmpeg must be installed.
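Before starting the downloads, it can help to confirm that both prerequisites are met. The following is a minimal sketch and not part of this repository: the function name `check_prerequisites` and the default threshold are illustrative assumptions based on the figures stated above.

```python
import shutil

# Approximate intermediate storage stated in this README (assumed threshold).
REQUIRED_GB = 200


def check_prerequisites(path=".", required_gb=REQUIRED_GB):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    # The preparation scripts rely on ffmpeg being available on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")

    # Compare free space on the file system that holds `path`.
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < required_gb:
        problems.append(
            f"only {free_gb:.0f} GB free at {path!r}, about {required_gb} GB needed"
        )
    return problems


for problem in check_prerequisites("."):
    print("WARNING:", problem)
```

Run it from the directory where the data will be stored, so that the free-space check refers to the correct file system.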

Download the pre-processed MAPS dataset by running
```
cd data
./download_maps.sh
```
Determine the training set pieces of the MAPS dataset that do not overlap with the test set by running
```
python get_MAPS_train_test_overlap.py
```
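The idea behind this step can be sketched as follows. This is an illustration only and not the logic of `get_MAPS_train_test_overlap.py`: it assumes the MAPS file naming pattern `MAPS_MUS-<piece>_<instrument>.wav`, and the helper names `piece_id` and `non_overlapping_train` are hypothetical.

```python
import os
import re

# Assumed MAPS naming: MAPS_MUS-<piece>_<instrument>, where the instrument
# code is the part after the last underscore.
_MAPS_NAME = re.compile(r"^MAPS_MUS-(?P<piece>.+)_(?P<instrument>[^_]+)$")


def piece_id(filename):
    """Extract the piece identifier from a MAPS file name."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    match = _MAPS_NAME.match(stem)
    if match is None:
        raise ValueError(f"unexpected MAPS file name: {filename!r}")
    return match.group("piece")


def non_overlapping_train(train_files, test_files):
    """Keep only training files whose piece does not occur in the test set."""
    test_pieces = {piece_id(f) for f in test_files}
    return [f for f in train_files if piece_id(f) not in test_pieces]


train = ["MAPS_MUS-alb_esp2_AkPnCKF.wav", "MAPS_MUS-chpn-p19_SptkBGAm.wav"]
test = ["MAPS_MUS-chpn-p19_ENSTDkCl.wav"]
print(non_overlapping_train(train, test))
```

In this example, the second training file is dropped because the same piece, played on a different instrument, appears in the test set.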
Download and prepare the MAESTRO V3.0.0 dataset by running
```
./prepare_maestro.sh
```
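For reference, resampling a single file to 16 kHz with ffmpeg could look like the sketch below. It does not reproduce what `prepare_maestro.sh` does internally: the helper names are hypothetical, and the downmix to mono and the choice of output format are assumptions.

```python
import subprocess


def resample_command(src, dst, sample_rate=16000, mono=True):
    """Build an ffmpeg command line that resamples `src` into `dst`."""
    cmd = ["ffmpeg", "-y", "-loglevel", "error", "-i", str(src), "-ar", str(sample_rate)]
    if mono:
        # Downmix to one channel (assumption; not stated in this README).
        cmd += ["-ac", "1"]
    cmd.append(str(dst))
    return cmd


def resample(src, dst, **kwargs):
    """Run ffmpeg and raise CalledProcessError if it fails."""
    subprocess.run(resample_command(src, dst, **kwargs), check=True)


print(" ".join(resample_command("input.wav", "output.flac")))
```

Building the command separately from running it makes it easy to inspect the exact call before processing a whole dataset.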
Download the SMD dataset, convert the annotations into TSV files, resample the audio to 16 kHz, and delete the data that is not used by running
```
./download_smd.sh
python prepare_smd.py
cd SMD 
rm -rf csv midi midi_wav_22050_mono wav_22050_mono wav_44100_stereo
cd ../..
```
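The annotation conversion can be pictured as writing one row per note to a tab-separated file. The sketch below is not the code of `prepare_smd.py`: the column layout (onset, offset, note, velocity) is an assumption modeled on the Onsets and Frames implementation this codebase builds upon, and the function name `notes_to_tsv` is hypothetical.

```python
import csv
import io


def notes_to_tsv(notes, stream):
    """Write (onset_sec, offset_sec, midi_pitch, velocity) tuples as TSV rows."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    # Assumed column layout; check the repository's scripts for the actual one.
    writer.writerow(["onset", "offset", "note", "velocity"])
    for onset, offset, pitch, velocity in sorted(notes):
        if offset <= onset:
            raise ValueError(f"note ends before it starts: {(onset, offset, pitch)}")
        writer.writerow([f"{onset:.6f}", f"{offset:.6f}", int(pitch), int(velocity)])


buffer = io.StringIO()
notes_to_tsv([(0.5, 1.0, 60, 80), (0.0, 0.25, 64, 70)], buffer)
print(buffer.getvalue())
```

Sorting the notes by onset keeps the output deterministic regardless of the order in which the source annotations list them.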

Experiments

Reproducing all results from the paper requires at least 20 GB of GPU memory. To run the experiments, use the following command:

python run_experiments.py

Trained models and test results are stored in the runs directory.

Acknowledgements

This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Grant No. 350953655 (MU 2686/11-2) and Grant No. 500643750 (MU 2686/15-1). The authors are with the International Audio Laboratories Erlangen, a joint institution of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer Institute for Integrated Circuits IIS.
