{ "cells": [ { "metadata": {}, "cell_type": "markdown", "source": [ "# nnUNet Tutorial Notebook\n", "\n", "This notebook guides you through training an nnUNet model on the BraTS dataset to segment adult gliomas. It covers the basic steps of a complete nnUNet experiment, from downloading the data to training the model and making predictions." ], "id": "6aac42df71fc5c69" }, { "metadata": {}, "cell_type": "markdown", "source": [ "## Data Downloading\n", "\n", "First, we download the BraTS dataset from the Decathlon Challenge Website. The dataset is available at https://drive.google.com/uc?id=1A2IU8Sgea1h3fYLpYtFb2v7NYdMjvEhU" ], "id": "5a45cfcf" }, { "cell_type": "code", "id": "a14f956a", "metadata": { "collapsed": true, "is_executing": true, "jupyter": { "outputs_hidden": true } }, "source": [ "!pip install gdown" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "id": "ff374688", "metadata": {}, "source": [ "import gdown\n", "\n", "# gdown.download returns the path of the downloaded archive\n", "output_tar = gdown.download(\"https://drive.google.com/uc?id=1A2IU8Sgea1h3fYLpYtFb2v7NYdMjvEhU\")" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "id": "41e0684e", "metadata": {}, "source": [ "import tarfile\n", "\n", "# Extract the archive into the current working directory\n", "with tarfile.open(output_tar) as tar:\n", "    tar.extractall()" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "id": "7966c5ca", "metadata": {}, "source": [ "## Multi-Modal to Single Modality Conversion\n", "\n", "nnUNet requires the data to be in a specific format, where each modality is stored in a separate file. In contrast, the Decathlon BraTS dataset stores all four image modalities in a single multi-channel file. We will therefore split each multi-modal file into one file per modality."
] }, { "cell_type": "code", "id": "a9571d02", "metadata": { "collapsed": true, "jupyter": { "outputs_hidden": true, "is_executing": true }, "scrolled": true }, "source": [ "import SimpleITK as sitk\n", "import os\n", "from pathlib import Path\n", "import numpy as np\n", "from tqdm.notebook import tqdm\n", "\n", "data_dir = \"Task01_BrainTumour\"\n", "data_list = [f.name for f in os.scandir(Path(data_dir).joinpath(\"imagesTr\")) if f.is_file()]\n", "file_extension = \".nii.gz\"\n", "\n", "output_dir = str(Path(data_dir).joinpath(\"imagesTr_Single\"))\n", "\n", "\n", "Path(output_dir).mkdir(parents=True,exist_ok=True)\n", "# Maps each output filename suffix to its modality name, in channel order\n", "modality_dict = {\n", "    \"_001.nii.gz\": \"FLAIR\",\n", "    \"_002.nii.gz\": \"T1w\",\n", "    \"_003.nii.gz\": \"t1gd\",\n", "    \"_004.nii.gz\": \"T2w\"\n", "}\n", "\n", "for data in tqdm(data_list):\n", "    # Skip hidden files\n", "    if data.startswith(\".\"):\n", "        continue\n", "    image = sitk.ReadImage(str(Path(data_dir).joinpath(\"imagesTr\",data)))\n", "    data_array = sitk.GetArrayFromImage(image)\n", "    for idx,modality in enumerate(modality_dict):\n", "        # data_array[idx] is the 3D volume of the idx-th modality\n", "        single_image = sitk.GetImageFromArray(data_array[idx])\n", "        single_image.SetSpacing(image.GetSpacing())\n", "        single_image.SetOrigin(image.GetOrigin())\n", "        # Keep only the 3x3 spatial block of the 4x4 direction matrix\n", "        single_image.SetDirection(image.GetDirection()[:3]+image.GetDirection()[4:7]+image.GetDirection()[8:11])\n", "        filename = str(Path(output_dir).joinpath(str(data)[:-len(file_extension)]+modality))\n", "        #print(f\"Writing {filename}\")\n", "        sitk.WriteImage(single_image, filename)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "id": "49e4a5dc", "metadata": {}, "source": [ "## Configuration File\n", "\n", "Next, we will create a PyMAIA configuration file for the experiment. 
The configuration file will contain the following information:" ] }, { "cell_type": "code", "id": "8d90eba8", "metadata": {}, "source": [ "import json\n", "brats_config = {\n", "    \"Experiment Name\": \"BraTS\",\n", "    \"Seed\": 12345,\n", "    \"label_suffix\": \".nii.gz\",\n", "    \"Modalities\": modality_dict,\n", "    \"label_dict\": {\n", "        \"background\": 0,\n", "        \"whole_tumor\": [1, 2, 3],\n", "        \"tumor_core\": [2, 3],\n", "        \"enhancing_tumor\": 3\n", "    },\n", "    \"n_folds\": 5,\n", "    \"FileExtension\": \".nii.gz\",\n", "    \"RegionClassOrder\" : [1,2,3]\n", "}\n", "\n", "with open(\"BraTS_config.json\",\"w\") as f:\n", "    json.dump(brats_config,f,indent=4)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "id": "c914552f", "metadata": {}, "source": [ "## Decathlon Dataset File\n", "\n", "Finally, we will create a dataset.json file that will contain the paths to the training and testing data. The dataset.json file will have the following structure:\n", "```json\n", "{\n", "    \"train\": [\n", "        {\n", "            \"FLAIR\": \"Path to FLAIR Image\",\n", "            \"T1w\": \"Path to T1w Image\",\n", "            \"t1gd\": \"Path to t1gd Image\",\n", "            \"T2w\": \"Path to T2w Image\",\n", "            \"label\": \"Path to Label Image\"\n", "        }\n", "    ],\n", "    \"test\": [\n", "        {\n", "            \"FLAIR\": \"Path to FLAIR Image\",\n", "            \"T1w\": \"Path to T1w Image\",\n", "            \"t1gd\": \"Path to t1gd Image\",\n", "            \"T2w\": \"Path to T2w Image\",\n", "            \"label\": \"Path to Label Image\"\n", "        }\n", "    ]\n", "}\n", "```" ] }, { "cell_type": "code", "id": "0906b00e", "metadata": {}, "source": [ "cases = [f.name[:-len(\"_000.nii.gz\")]\n", "         for f in os.scandir(Path(data_dir).joinpath(\"imagesTr_Single\"))\n", "         if f.is_file()\n", "         if f.name.endswith(file_extension)]\n", "\n", "cases = np.unique(cases)\n", "\n", "data_list = {\n", "    \"train\":\n", "        [\n", "            {\n", "                modality_dict[modality_id] : str(Path(data_dir).joinpath(\"imagesTr_Single\",case + modality_id))\n", "                for modality_id in 
modality_dict\n", "            }\n", "            for case in cases\n", "        ],\n", "    \"test\": []\n", "}\n", "\n", "# Attach the matching label path to every case\n", "for section in data_list:\n", "    for idx, case in enumerate(data_list[section]):\n", "        f = Path(data_list[section][idx][list(modality_dict.values())[0]]).name\n", "        data_list[section][idx][\"label\"] = str(Path(data_dir).joinpath(\"labelsTr\", f[:-len(\"_000.nii.gz\")]+brats_config[\"label_suffix\"]))\n", "\n", "\n", "with open(\"dataset.json\", \"w\") as f:\n", "    json.dump(data_list, f, indent=4)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "id": "263d537a", "metadata": {}, "source": [ "## Create Pipeline" ] }, { "cell_type": "code", "id": "8bc906fd", "metadata": {}, "source": [ "%%bash\n", "\n", "export ROOT_FOLDER=./\n", "\n", "nnunet_create_pipeline.py --input-data-folder dataset.json --config-file BraTS_config.json --task-ID 100 --test-split 0" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "id": "39c9c178", "metadata": {}, "source": [ "## Prepare Data" ] }, { "cell_type": "code", "id": "c82e6c77", "metadata": {}, "source": [ "%%bash\n", "\n", "export ROOT_FOLDER=/opt/code/PyMAIA/Tutorials\n", "nnunet_prepare_data_folder --input-data-folder dataset.json --task-ID 100 --task-name BraTS --config-file BraTS_config.json --test-split 0" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "id": "1da5de19", "metadata": {}, "source": [ "## Pre-Processing" ] }, { "cell_type": "code", "id": "7453040e", "metadata": {}, "source": [ "%%bash\n", "\n", "export ROOT_FOLDER=/opt/code/PyMAIA/Tutorials\n", "\n", "nnunet_run_plan_and_preprocessing --config-file /opt/code/PyMAIA/Tutorials/BraTS/BraTS_results/Dataset100_BraTS.json -np 4" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "id": "a6cb3b7d-e43d-41c2-936b-d27588108296", "metadata": {}, "source": [ "## Skip Plan and Preprocess Only\n", "\n", "# Preprocess\n", "\n", "# Normalization (Use Non-Zero Mask, Normalization Scheme)\n", "# Resampling 
(spacing-> Transpose Forward)\n" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "id": "587745c6-6250-4a32-9c34-bc97e182f5b0", "metadata": {}, "source": [ "## Train Config\n", "\n", "# Custom nnUNetTrainer\n", "#     self.initial_lr = 1e-2\n", "#     self.weight_decay = 3e-5\n", "#     self.oversample_foreground_percent = 0.33\n", "#     self.num_iterations_per_epoch = 250\n", "#     self.num_val_iterations_per_epoch = 50\n", "#     self.num_epochs = 1000\n", "#     self.current_epoch = 0\n", "#     self.enable_deep_supervision = True\n", "\n", "# self.configure_optimizers\n", "# self._build_loss\n", "\n", "# Batch Size, Patch Size -> nnUNetPlans" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "id": "a7d60975-c19c-4443-a66f-adba32f9d944", "metadata": {}, "source": [ "%%writefile ../../../../../../nnUNet/nnunetv2/training/nnUNetTrainer/nnUNetTrainerDemo.py\n", "\n", "from nnunetv2.training.nnUNetTrainer.nnUNetTrainer import nnUNetTrainer\n", "import torch\n", "\n", "\n", "class nnUNetTrainerDemo(nnUNetTrainer):\n", "    def __init__(\n", "        self,\n", "        plans: dict,\n", "        configuration: str,\n", "        fold: int,\n", "        dataset_json: dict,\n", "        unpack_dataset: bool = True,\n", "        device: torch.device = torch.device(\"cuda\"),\n", "    ):\n", "        super().__init__(plans, configuration, fold, dataset_json, unpack_dataset, device)\n", "        # Shortened schedule for a quick demo run\n", "        self.num_iterations_per_epoch = 10\n", "        self.num_val_iterations_per_epoch = 10\n", "        self.num_epochs = 5\n" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "id": "9b33cb01", "metadata": {}, "source": [ "%%bash\n", "\n", "export ROOT_FOLDER=/opt/code/PyMAIA/Tutorials\n", "export N_THREADS=4\n", "nnunet_run_training --config-file /opt/code/PyMAIA/Tutorials/BraTS/BraTS_results/Dataset100_BraTS.json --run-fold 0 -tr nnUNetTrainerDemo" ], "execution_count": null, "outputs": [] }, { "metadata": {}, "cell_type": "code", "source": "", "id": "8c4522bb6221e35a", "execution_count": null, "outputs": [] }, { "cell_type": "code", 
"id": "41973c70", "metadata": {}, "source": [ "Convert nnUNet/nnDet to MONAI Bundle\n", "Export Trained Model and Upload to MLFlow ( both MONAI and nnUNet Original)\n", "\n", "Run Training MLFlow\n", "Run Prediction MLFlow" ], "execution_count": null, "outputs": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.18" } }, "nbformat": 4, "nbformat_minor": 5 }