{
 "cells": [
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "# nnUNet Tutorial Notebook\n",
    "\n",
    "This notebook will guide you through the process of training a nnUNet model on the BraTS dataset, to segment Adult Gliomas. The notebook will cover the basic steps on how to perform a complete nnUNet experiment, from downloading the data to training the model and making predictions."
   ],
   "id": "6aac42df71fc5c69"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "## Data Downloading\n",
    "\n",
    "First we will download the BraTS dataset from the Decathlon Challenge Website. The dataset is available at https://drive.google.com/uc?id=1A2IU8Sgea1h3fYLpYtFb2v7NYdMjvEhU"
   ],
   "id": "5a45cfcf"
  },
  {
   "cell_type": "code",
   "id": "a14f956a",
   "metadata": {
    "collapsed": true,
    "is_executing": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "source": [
    "!pip install gdown"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "code",
   "id": "ff374688",
   "metadata": {},
   "source": [
    "import gdown\n",
    "\n",
    "output_tar = gdown.download(\"https://drive.google.com/uc?id=1A2IU8Sgea1h3fYLpYtFb2v7NYdMjvEhU\")"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "code",
   "id": "41e0684e",
   "metadata": {},
   "source": [
    "import tarfile\n",
    "tar = tarfile.open(output_tar)\n",
    "tar.extractall()\n",
    "tar.close()"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "id": "7966c5ca",
   "metadata": {},
   "source": [
    "## Multi-Modal to Single Modality Conversion\n",
    "\n",
    "nnUNet requires the data to be in a specific format, where each modality is stored in a separate file. Conversely, the Decathlon BraTS Dataset stores all the 4 Image Modalities in a single multi-channel file. We will convert the multi-modal data to single modal data."
   ]
  },
  {
   "cell_type": "code",
   "id": "a9571d02",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true,
     "is_executing": true
    },
    "scrolled": true
   },
   "source": [
    "import SimpleITK as sitk\n",
    "import os\n",
    "from pathlib import Path\n",
    "import numpy as np\n",
    "from tqdm.notebook import tqdm\n",
    "\n",
    "data_dir = \"Task01_BrainTumour\"\n",
    "data_list = [f.name for f in os.scandir(Path(data_dir).joinpath(\"imagesTr\")) if f.is_file()]\n",
    "file_extension = \".nii.gz\"\n",
    "\n",
    "output_dir = str(Path(data_dir).joinpath(\"imagesTr_Single\"))\n",
    "\n",
    "\n",
    "Path(output_dir).mkdir(parents=True,exist_ok=True)\n",
    "modality_dict = {\n",
    "         \"_001.nii.gz\": \"FLAIR\",\n",
    "         \"_002.nii.gz\": \"T1w\", \n",
    "         \"_003.nii.gz\": \"t1gd\",\n",
    "         \"_004.nii.gz\": \"T2w\"\n",
    "    }\n",
    "\n",
    "for data in tqdm(data_list):\n",
    "    if data.startswith(\".\"):\n",
    "        continue\n",
    "    image = sitk.ReadImage(str(Path(data_dir).joinpath(\"imagesTr\",data)))\n",
    "    data_array = sitk.GetArrayFromImage(image)\n",
    "    for idx,modality in enumerate(modality_dict):\n",
    "        single_image = sitk.GetImageFromArray(data_array[idx])\n",
    "        single_image.SetSpacing(image.GetSpacing())\n",
    "        single_image.SetOrigin(image.GetOrigin())\n",
    "        single_image.SetDirection(image.GetDirection()[:3]+image.GetDirection()[4:7]+image.GetDirection()[8:11])\n",
    "        filename = str(Path(output_dir).joinpath(str(data)[:-len(file_extension)]+modality))\n",
    "        #print(f\"Writing {filename}\")\n",
    "        sitk.WriteImage(single_image, filename)"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "id": "49e4a5dc",
   "metadata": {},
   "source": [
    "## Configuration File\n",
    "\n",
    "Next, we will create a PyMAIA configuration file for the experiment. The configuration file will contain the following information:"
   ]
  },
  {
   "cell_type": "code",
   "id": "8d90eba8",
   "metadata": {},
   "source": [
    "import json\n",
    "brats_config = {\n",
    "    \"Experiment Name\": \"BraTS\",\n",
    "    \"Seed\": 12345,\n",
    "    \"label_suffix\": \".nii.gz\",\n",
    "    \"Modalities\": modality_dict,\n",
    "    \"label_dict\": {\n",
    "        \"background\": 0,\n",
    "        \"whole_tumor\": [1, 2, 3],\n",
    "        \"tumor_core\": [2, 3],\n",
    "        \"enhancing_tumor\": 3\n",
    "    },\n",
    "    \"n_folds\": 5,\n",
    "    \"FileExtension\": \".nii.gz\",\n",
    "    \"RegionClassOrder\" : [1,2,3]\n",
    "    \n",
    "}\n",
    "\n",
    "with open(\"BraTS_config.json\",\"w\") as f:\n",
    "    json.dump(brats_config,f,indent=4)"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "id": "c914552f",
   "metadata": {},
   "source": [
    "## Decathlon Dataset File\n",
    "\n",
    "Finally, we will create a dataset.json file that will contain the paths to the training and testing data. The dataset.json file will have the following structure: \n",
    "```json\n",
    "{\n",
    "    \"train\": [\n",
    "        {\n",
    "            \"FLAIR\": \"Path to FLAIR Image\",\n",
    "            \"T1w\": \"Path to T1w Image\",\n",
    "            \"t1gd\": \"Path to t1gd Image\",\n",
    "            \"T2w\": \"Path to T2w Image\",\n",
    "            \"label\": \"Path to Label Image\"\n",
    "        }\n",
    "    ],\n",
    "    \"test\": [\n",
    "        {\n",
    "            \"FLAIR\": \"Path to FLAIR Image\",\n",
    "            \"T1w\": \"Path to T1w Image\",\n",
    "            \"t1gd\": \"Path to t1gd Image\",\n",
    "            \"T2w\": \"Path to T2w Image\",\n",
    "            \"label\": \"Path to Label Image\"\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "id": "0906b00e",
   "metadata": {},
   "source": [
    "cases = [f.name[:-len(\"_000.nii.gz\")] \n",
    "         for f in os.scandir(Path(data_dir).joinpath(\"imagesTr_Single\")) \n",
    "         if f.is_file() \n",
    "         if f.name.endswith(file_extension)]\n",
    "\n",
    "cases = np.unique(cases)\n",
    "\n",
    "data_list = {\n",
    "    \"train\":\n",
    "        [\n",
    "            {\n",
    "                modality_dict[modality_id] : str(Path(data_dir).joinpath(\"imagesTr_Single\",case + modality_id))\n",
    "                for modality_id in modality_dict\n",
    "            }\n",
    "            for case in cases\n",
    "        ],\n",
    "    \"test\": []\n",
    "}\n",
    "\n",
    "for section in data_list:\n",
    "    for idx, case in enumerate(data_list[section]):\n",
    "        f = Path(data_list[section][idx][list(modality_dict.values())[0]]).name\n",
    "        data_list[section][idx][\"label\"] = str(Path(data_dir).joinpath(\"labelsTr\", f[:-len(\"_000.nii.gz\")]+brats_config[\"label_suffix\"]))\n",
    "\n",
    "\n",
    "with open(\"dataset.json\", \"w\") as f:\n",
    "    json.dump(data_list, f, indent=4)"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "id": "263d537a",
   "metadata": {},
   "source": "## Create Pipeline"
  },
  {
   "cell_type": "code",
   "id": "8bc906fd",
   "metadata": {},
   "source": [
    "%%bash\n",
    "\n",
    "export ROOT_FOLDER=./\n",
    "\n",
    "nnunet_create_pipeline.py --input-data-folder dataset.json --config-file BraTS_config.json --task-ID 100 --test-split 0"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "id": "39c9c178",
   "metadata": {},
   "source": [
    "## Prepare Data"
   ]
  },
  {
   "cell_type": "code",
   "id": "c82e6c77",
   "metadata": {},
   "source": [
    "%%bash\n",
    "\n",
    "export ROOT_FOLDER=/opt/code/PyMAIA/Tutorials\n",
    "nnunet_prepare_data_folder --input-data-folder dataset.json --task-ID 100 --task-name BraTS --config-file BraTS_config.json --test-split 0"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "id": "1da5de19",
   "metadata": {},
   "source": "## Pre-Processing"
  },
  {
   "cell_type": "code",
   "id": "7453040e",
   "metadata": {},
   "source": [
    "%%bash\n",
    "\n",
    "export ROOT_FOLDER=/opt/code/PyMAIA/Tutorials\n",
    "\n",
    "nnunet_run_plan_and_preprocessing --config-file /opt/code/PyMAIA/Tutorials/BraTS/BraTS_results/Dataset100_BraTS.json -np 4"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "During the preprocessing step, the nnUnet framework will automatically handle the following steps:\n",
    "\n",
    "- Resampling (Target spacing, followed by Transpose)\n",
    "- Normalization ( Optional use of Non-Zero Mask, Custom Normalization Scheme for different modalities)\n",
    "\n"
   ],
   "id": "62c352e1711406a8"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "## Model Training\n",
    "\n",
    "To customize the nnUNet model training, we can create a custom nnUNetTrainer class that inherits from the nnUNetTrainer class. The custom class can be used to override the default training configuration, such as the learning rate, weight decay, and number of epochs:\n",
    "\n",
    "```python\n",
    "\n",
    "from nnunetv2.training.nnUNetTrainer.nnUNetTrainer import nnUNetTrainer\n",
    "import torch\n",
    "\n",
    "\n",
    "class nnUNetTrainerDemo(nnUNetTrainer):\n",
    "    def __init__(\n",
    "            self,\n",
    "            plans: dict,\n",
    "            configuration: str,\n",
    "            fold: int,\n",
    "            dataset_json: dict,\n",
    "            unpack_dataset: bool = True,\n",
    "            device: torch.device = torch.device(\"cuda\"),\n",
    "    ):\n",
    "        super().__init__(plans, configuration, fold, dataset_json, unpack_dataset, device)\n",
    "        self.num_iterations_per_epoch = 10\n",
    "        self.num_val_iterations_per_epoch = 10\n",
    "        self.num_epochs = 5\n",
    "        self.initial_lr = 1e-2\n",
    "        self.weight_decay = 3e-5\n",
    "        self.oversample_foreground_percent = 0.33\n",
    "        self.num_iterations_per_epoch = 250\n",
    "        self.num_val_iterations_per_epoch = 50\n",
    "        self.num_epochs = 1000\n",
    "        self.current_epoch = 0\n",
    "        self.enable_deep_supervision = False\n",
    "    \n",
    "    def configure_optimizers(self):\n",
    "        return torch.optim.Adam(self.network.parameters(), self.initial_lr, weight_decay=self.weight_decay)\n",
    "\n",
    "\n",
    "    def _build_loss(self):\n",
    "        self.loss = torch.nn.CrossEntropyLoss()\n",
    "        return self.loss\n",
    "\n",
    "```\n",
    "\n",
    "To customize the Batch and the Patch size, we can modify the corresponding entries in the nnUNetPlans file.\n",
    "\n"
   ],
   "id": "466bd808d5030e5c"
  },
  {
   "cell_type": "code",
   "id": "9b33cb01",
   "metadata": {},
   "source": [
    "%%bash\n",
    "\n",
    "export ROOT_FOLDER=/opt/code/PyMAIA/Tutorials\n",
    "export N_THREADS=4\n",
    "nnunet_run_training --config-file /opt/code/PyMAIA/Tutorials/BraTS/BraTS_results/Dataset100_BraTS.json --run-fold 0 -tr nnUNetTrainerDemo"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "## Export nnUNet Model\n",
    "\n",
    "After the 5-fold cross-validation training is complete, we can export the nnUNet model to a zip file. The zip file contains the model weights, the configuration file, and the training logs."
   ],
   "id": "550c78a3781a5a2d"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": [
    "%%bash\n",
    "\n",
    "export ROOT_FOLDER=/opt/code/PyMAIA/Tutorials\n",
    "export N_THREADS=4\n",
    "nnunet_run_training --config-file /opt/code/PyMAIA/Tutorials/BraTS/BraTS_results/Dataset100_BraTS.json --run-fold -1 --output-model-file BraTS_nnuNet.zip -tr nnUNetTrainerDemo"
   ],
   "id": "f233617ba50d54e9"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "## Convert nnUNet to MONAI Bundle [Coming Soon]",
   "id": "c6384e2ddcd530d6"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "## Export Trained Model and Upload to MLFlow [Coming Soon]",
   "id": "d4f79c9f7473c65a"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "## Run Inference [Coming Soon]",
   "id": "c468087a2e1d984a"
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}