Commit

add doc for downloading and preprocessing cmip6
tung-nd committed Sep 29, 2023
1 parent bfe02bf commit 3805481
Showing 1 changed file with 13 additions and 1 deletion.
14 changes: 13 additions & 1 deletion docs/usage.md
@@ -4,7 +4,19 @@

### Data Preparation

The code for downloading and preprocessing CMIP6 data is coming soon
To download and regrid a CMIP6 dataset to a common resolution (e.g., 1.40625 degrees), go to the corresponding directory inside `snakemake_configs` and run:
```bash
snakemake all --configfile config_2m_temperature.yml --cores 8
```
This command downloads and regrids the `2m_temperature` data in parallel using 8 CPU cores. To process other variables, pass a different file to `--configfile`. After downloading and regridding, run the following script to preprocess the `.nc` files into `.npz` format for pretraining ClimaX:
```bash
python src/data_preprocessing/nc2np_equally_cmip6.py \
    --dataset mpi \
    --path /data/CMIP6/MPI-ESM/1.40625deg/ \
    --num_shards 10 \
    --save_dir /data/CMIP6/MPI-ESM/1.40625deg_np_10shards
```
Here, `num_shards` is the number of chunks into which each `.nc` file is split.
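
To make the effect of `num_shards` concrete, the sketch below shows one way a file's time steps could be divided into near-equal contiguous chunks. It is only an illustration: the helper `shard_bounds` and the figure of 1460 time steps are hypothetical and are not taken from `nc2np_equally_cmip6.py`, whose actual splitting logic may differ.

```python
from typing import List, Tuple


def shard_bounds(n_steps: int, num_shards: int) -> List[Tuple[int, int]]:
    """Split n_steps into num_shards contiguous (start, stop) index ranges.

    Hypothetical helper for illustration; not part of the ClimaX code base.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    base, extra = divmod(n_steps, num_shards)
    bounds = []
    start = 0
    for i in range(num_shards):
        # The first `extra` shards take one additional step,
        # so shard sizes differ by at most one.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


# Assumed example: a file with 1460 time steps and --num_shards 10.
for shard_id, (start, stop) in enumerate(shard_bounds(1460, 10)):
    print(f"shard {shard_id}: steps [{start}, {stop})")
```

With these assumed numbers each shard covers 146 time steps. A larger `num_shards` gives smaller output files per shard, at the cost of more files in `save_dir`.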

### Training
