Code and notebooks for life science course #139

Draft · wants to merge 21 commits into main
195 changes: 195 additions & 0 deletions life-science/00_monai_decathlon.ipynb
@@ -0,0 +1,195 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "33e2ac00-22be-4f7c-9445-5c3220d0f1bf",
"metadata": {},
"source": [
"# Fetching Brain Tumor Segemntation Dataset\n",
"\n",
"In this notebook, we will learn:\n",
"- how we can use [MONAI Core APIs](https://github.com/Project-MONAI/MONAI) to download the brain tumor segmentation data from the [Medical Segmentation Decathlon](http://medicaldecathlon.com) challenge.\n",
"- how we can upload the dataset to Weights & Biases and use it as a dataset artifact."
]
},
{
"cell_type": "markdown",
"id": "813a28eb-8d05-412c-b3d4-9e64eb2962dc",
"metadata": {},
"source": [
"## 🌴 Setup and Installation\n",
"\n",
"First, let us install the latest version of both MONAI and Weights and Biases."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6d8a4eaa-6c15-44f0-81f8-b0c2800b1017",
"metadata": {},
"outputs": [],
"source": [
"!pip install -q -U monai wandb"
]
},
{
"cell_type": "markdown",
"id": "752e1f77-a825-4eb7-afb7-5c2807b29ada",
"metadata": {},
"source": [
"## 🌳 Initialize a W&B Run\n",
"\n",
"We will start a new W&B run to start tracking our experiment."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a2315b79-8c0a-4cfd-aa6d-4fca55d78137",
"metadata": {},
"outputs": [],
"source": [
"import wandb\n",
"\n",
"wandb.init(\n",
" project=\"brain-tumor-segmentation\",\n",
" entity=\"lifesciences\",\n",
" job_type=\"fetch_dataset\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "308bd1ff-0999-4b85-b9a7-2a9d5753e69e",
"metadata": {},
"source": [
"## 🍁 Fetching the Dataset using MONAI\n",
"\n",
"The [`monai.apps.DecathlonDataset`](https://docs.monai.io/en/stable/apps.html#monai.apps.DecathlonDataset) lets us automatically download the data of [Medical Segmentation Decathlon challenge](http://medicaldecathlon.com/) and generate items for training, validation, or testing. We will use this API in the later notebooks to load and transform our datasets automatically."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "42189439-2c3d-403b-915a-98f897d049e4",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# Make the dataset directory\n",
"os.makedirs(\"./dataset/\", exist_ok=True)\n",
"\n",
"\n",
"from monai.apps import DecathlonDataset\n",
"\n",
"# Fetch the training split of the brain tumor segmentation dataset\n",
"train_dataset = DecathlonDataset(\n",
" root_dir=\"./dataset/\",\n",
" task=\"Task01_BrainTumour\",\n",
" section=\"training\",\n",
" download=True,\n",
" cache_rate=0.0,\n",
" num_workers=4,\n",
")\n",
"\n",
"# Fetch the validation split of the brain tumor segmentation dataset\n",
"val_dataset = DecathlonDataset(\n",
" root_dir=\"./dataset/\",\n",
" task=\"Task01_BrainTumour\",\n",
" section=\"validation\",\n",
" download=False,\n",
" cache_rate=0.0,\n",
" num_workers=4,\n",
")\n",
"\n",
"# Fetch the test split of the brain tumor segmentation dataset\n",
"test_dataset = DecathlonDataset(\n",
" root_dir=\"./dataset/\",\n",
" task=\"Task01_BrainTumour\",\n",
" section=\"test\",\n",
" download=False,\n",
" cache_rate=0.0,\n",
" num_workers=4,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "07461dbc-3056-4f06-bb1a-462246a35791",
"metadata": {},
"outputs": [],
"source": [
"print(\"Train Set Size:\", len(train_dataset))\n",
"print(\"Validation Set Size:\", len(val_dataset))\n",
"print(\"Test Set Size:\", len(test_dataset))"
]
},
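{
"cell_type": "markdown",
"id": "f0e1d2c3-0000-4000-8000-000000000001",
"metadata": {},
"source": [
"As a quick sanity check, let us inspect a single training sample. This is a minimal sketch: it assumes the default `DecathlonDataset` transform, which loads the NIfTI files, so each item should be a dictionary with `image` and `label` entries (the exact array shapes depend on the task data)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0e1d2c3-0000-4000-8000-000000000002",
"metadata": {},
"outputs": [],
"source": [
"# Peek at the first training sample; with the default transform,\n",
"# each item is a dictionary of loaded image and label arrays.\n",
"sample = train_dataset[0]\n",
"print(\"Keys:\", list(sample.keys()))\n",
"print(\"Image shape:\", sample[\"image\"].shape)\n",
"print(\"Label shape:\", sample[\"label\"].shape)"
]
},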
{
"cell_type": "markdown",
"id": "93e0609f-3009-4bd0-baf9-e8e10084801c",
"metadata": {},
"source": [
"## 💿 Upload the Dataset to W&B as an Artifact\n",
"\n",
"[W&B Artifacts](https://docs.wandb.ai/guides/artifacts) can be used to track and version any serialized data as the inputs and outputs of your W&B Runs. For example, a model training run might take in a dataset as input and a trained model as output.\n",
"\n",
"![](https://docs.wandb.ai/assets/images/artifacts_landing_page2-b6bd49ea5db62eff00f582a95845fed9.png)\n",
"\n",
"Let us now see how we can upload this dataset as a W&B artifact."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9f1f35e5-927e-4baf-a351-652e7e99fe76",
"metadata": {},
"outputs": [],
"source": [
"artifact = wandb.Artifact(name=\"decathlon_brain_tumor\", type=\"dataset\")\n",
"artifact.add_dir(local_path=\"./dataset/\")\n",
"wandb.log_artifact(artifact)"
]
},
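{
"cell_type": "markdown",
"id": "f0e1d2c3-0000-4000-8000-000000000003",
"metadata": {},
"source": [
"In a later run (for example, a training run), the logged artifact can be pulled back down instead of re-downloading the data from the Decathlon servers. The sketch below is commented out so it does not interfere with the current run; it assumes the same entity and project as above and the automatically assigned `latest` alias."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0e1d2c3-0000-4000-8000-000000000004",
"metadata": {},
"outputs": [],
"source": [
"# Sketch: consume the dataset artifact from a separate run.\n",
"# Uncomment and run in a new session once the artifact has been logged.\n",
"# run = wandb.init(\n",
"#     project=\"brain-tumor-segmentation\",\n",
"#     entity=\"lifesciences\",\n",
"#     job_type=\"train\",\n",
"# )\n",
"# artifact = run.use_artifact(\"decathlon_brain_tumor:latest\", type=\"dataset\")\n",
"# dataset_dir = artifact.download()  # local path to the downloaded files"
]
},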
{
"cell_type": "markdown",
"id": "e1cbbe47-f83f-4db3-9c81-879121041881",
"metadata": {},
"source": [
"Now we end the experiment by calling `wandb.finish()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "25ea852b-04d7-4e94-97c3-45d972b21886",
"metadata": {},
"outputs": [],
"source": [
"wandb.finish()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}