Code and notebooks for life science course #139

Draft · wants to merge 21 commits into main
195 changes: 195 additions & 0 deletions life-science/00_monai_decathlon.ipynb
@@ -0,0 +1,195 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "33e2ac00-22be-4f7c-9445-5c3220d0f1bf",
"metadata": {},
"source": [
"# Fetching Brain Tumor Segemntation Dataset\n",
"\n",
"In this notebook, we will learn:\n",
"- how we can use [MONAI Core APIs](https://github.com/Project-MONAI/MONAI) to download the brain tumor segmentation data from the [Medical Segmentation Decathlon](http://medicaldecathlon.com) challenge.\n",
"- how we can upload the dataset to Weights & Biases and use it as a dataset artifact."
]
},
{
"cell_type": "markdown",
"id": "813a28eb-8d05-412c-b3d4-9e64eb2962dc",
"metadata": {},
"source": [
"## 🌴 Setup and Installation\n",
"\n",
"First, let us install the latest version of both MONAI and Weights and Biases."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6d8a4eaa-6c15-44f0-81f8-b0c2800b1017",
"metadata": {},
"outputs": [],
"source": [
"!pip install -q -U monai wandb"
]
},
{
"cell_type": "markdown",
"id": "752e1f77-a825-4eb7-afb7-5c2807b29ada",
"metadata": {},
"source": [
"## 🌳 Initialize a W&B Run\n",
"\n",
"We will start a new W&B run to start tracking our experiment."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a2315b79-8c0a-4cfd-aa6d-4fca55d78137",
"metadata": {},
"outputs": [],
"source": [
"import wandb\n",
"\n",
"wandb.init(\n",
" project=\"brain-tumor-segmentation\",\n",
" entity=\"lifesciences\",\n",
" job_type=\"fetch_dataset\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "308bd1ff-0999-4b85-b9a7-2a9d5753e69e",
"metadata": {},
"source": [
"## 🍁 Fetching the Dataset using MONAI\n",
"\n",
"The [`monai.apps.DecathlonDataset`](https://docs.monai.io/en/stable/apps.html#monai.apps.DecathlonDataset) lets us automatically download the data of [Medical Segmentation Decathlon challenge](http://medicaldecathlon.com/) and generate items for training, validation, or testing. We will use this API in the later notebooks to load and transform our datasets automatically."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "42189439-2c3d-403b-915a-98f897d049e4",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# Make the dataset directory\n",
"os.makedirs(\"./dataset/\", exist_ok=True)\n",
"\n",
"\n",
"from monai.apps import DecathlonDataset\n",
"\n",
"# Fetch the training split of the brain tumor segmentation dataset\n",
"train_dataset = DecathlonDataset(\n",
" root_dir=\"./dataset/\",\n",
" task=\"Task01_BrainTumour\",\n",
" section=\"training\",\n",
" download=True,\n",
" cache_rate=0.0,\n",
" num_workers=4,\n",
")\n",
"\n",
"# Fetch the validation split of the brain tumor segmentation dataset\n",
"val_dataset = DecathlonDataset(\n",
" root_dir=\"./dataset/\",\n",
" task=\"Task01_BrainTumour\",\n",
" section=\"validation\",\n",
" download=False,\n",
" cache_rate=0.0,\n",
" num_workers=4,\n",
")\n",
"\n",
"# Fetch the test split of the brain tumor segmentation dataset\n",
"test_dataset = DecathlonDataset(\n",
" root_dir=\"./dataset/\",\n",
" task=\"Task01_BrainTumour\",\n",
" section=\"test\",\n",
" download=False,\n",
" cache_rate=0.0,\n",
" num_workers=4,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "07461dbc-3056-4f06-bb1a-462246a35791",
"metadata": {},
"outputs": [],
"source": [
"print(\"Train Set Size:\", len(train_dataset))\n",
"print(\"Validation Set Size:\", len(val_dataset))\n",
"print(\"Test Set Size:\", len(test_dataset))"
]
},
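{
"cell_type": "markdown",
"id": "f0e1d2c3-0000-4000-8000-000000000001",
"metadata": {},
"source": [
"As a quick sanity check, let us inspect a single training sample. This is a minimal sketch: it assumes the default `DecathlonDataset` transform, which loads the NIfTI files, so each item should be a dictionary with `image` and `label` entries (the exact array shapes depend on the task data)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0e1d2c3-0000-4000-8000-000000000002",
"metadata": {},
"outputs": [],
"source": [
"# Peek at the first training sample; with the default transform,\n",
"# each item is a dictionary of loaded image and label arrays.\n",
"sample = train_dataset[0]\n",
"print(\"Keys:\", list(sample.keys()))\n",
"print(\"Image shape:\", sample[\"image\"].shape)\n",
"print(\"Label shape:\", sample[\"label\"].shape)"
]
},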
{
"cell_type": "markdown",
"id": "93e0609f-3009-4bd0-baf9-e8e10084801c",
"metadata": {},
"source": [
"## 💿 Upload the Dataset to W&B as an Artifact\n",
"\n",
"[W&B Artifacts](https://docs.wandb.ai/guides/artifacts) can be used to track and version any serialized data as the inputs and outputs of your W&B Runs. For example, a model training run might take in a dataset as input and a trained model as output.\n",
"\n",
"![](https://docs.wandb.ai/assets/images/artifacts_landing_page2-b6bd49ea5db62eff00f582a95845fed9.png)\n",
"\n",
"Let us now see how we can upload this dataset as a W&B artifact."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9f1f35e5-927e-4baf-a351-652e7e99fe76",
"metadata": {},
"outputs": [],
"source": [
"artifact = wandb.Artifact(name=\"decathlon_brain_tumor\", type=\"dataset\")\n",
"artifact.add_dir(local_path=\"./dataset/\")\n",
"wandb.log_artifact(artifact)"
]
},
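{
"cell_type": "markdown",
"id": "f0e1d2c3-0000-4000-8000-000000000003",
"metadata": {},
"source": [
"In a later run (for example, a training run), the logged artifact can be pulled back down instead of re-downloading the data from the Decathlon servers. The sketch below is commented out so it does not interfere with the current run; it assumes the same entity and project as above and the automatically assigned `latest` alias."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0e1d2c3-0000-4000-8000-000000000004",
"metadata": {},
"outputs": [],
"source": [
"# Sketch: consume the dataset artifact from a separate run.\n",
"# Uncomment and run in a new session once the artifact has been logged.\n",
"# run = wandb.init(\n",
"#     project=\"brain-tumor-segmentation\",\n",
"#     entity=\"lifesciences\",\n",
"#     job_type=\"train\",\n",
"# )\n",
"# artifact = run.use_artifact(\"decathlon_brain_tumor:latest\", type=\"dataset\")\n",
"# dataset_dir = artifact.download()  # local path to the downloaded files"
]
},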
{
"cell_type": "markdown",
"id": "e1cbbe47-f83f-4db3-9c81-879121041881",
"metadata": {},
"source": [
"Now we end the experiment by calling `wandb.finish()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "25ea852b-04d7-4e94-97c3-45d972b21886",
"metadata": {},
"outputs": [],
"source": [
"wandb.finish()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}