{
 "cells": [
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "# nnUNet Tutorial Notebook\n",
    "\n",
    "This notebook walks through training an nnUNet model on the BraTS dataset to segment adult gliomas. It covers the basic steps of a complete nnUNet experiment, from downloading the data to training the model and making predictions."
   ],
   "id": "6aac42df71fc5c69"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "## Data Downloading\n",
    "\n",
    "First, we download the BraTS dataset from the Medical Segmentation Decathlon challenge. The archive is available at https://drive.google.com/uc?id=1A2IU8Sgea1h3fYLpYtFb2v7NYdMjvEhU"
   ],
   "id": "5a45cfcf"
  },
  {
   "cell_type": "code",
   "id": "a14f956a",
   "metadata": {
    "collapsed": true,
    "is_executing": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "source": [
    "!pip install gdown"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "code",
   "id": "ff374688",
   "metadata": {},
   "source": [
    "import gdown\n",
    "\n",
    "output_tar = gdown.download(\"https://drive.google.com/uc?id=1A2IU8Sgea1h3fYLpYtFb2v7NYdMjvEhU\")"
   ],
   "execution_count": null,
   "outputs": []
  },
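  {
   "cell_type": "code",
   "id": "41e0684e",
   "metadata": {},
   "source": [
    "import tarfile\n",
    "\n",
    "# Extract the downloaded archive into the current working directory.\n",
    "# The context manager closes the archive even if extraction fails.\n",
    "with tarfile.open(output_tar) as tar:\n",
    "    tar.extractall()"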
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "id": "7966c5ca",
   "metadata": {},
   "source": [
    "## Multi-Modal to Single Modality Conversion\n",
    "\n",
    "nnUNet requires each modality to be stored in a separate file. The Decathlon BraTS dataset, by contrast, stores all four image modalities in a single multi-channel file, so we split each multi-modal volume into four single-modality volumes."
   ]
  },
  {
   "cell_type": "code",
   "id": "a9571d02",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true,
     "is_executing": true
    },
    "scrolled": true
   },
   "source": [
    "import SimpleITK as sitk\n",
    "import os\n",
    "from pathlib import Path\n",
    "import numpy as np\n",
    "from tqdm.notebook import tqdm\n",
    "\n",
    "data_dir = \"Task01_BrainTumour\"\n",
    "data_list = [f.name for f in os.scandir(Path(data_dir).joinpath(\"imagesTr\")) if f.is_file()]\n",
    "file_extension = \".nii.gz\"\n",
    "\n",
    "output_dir = str(Path(data_dir).joinpath(\"imagesTr_Single\"))\n",
    "Path(output_dir).mkdir(parents=True, exist_ok=True)\n",
    "\n",
    "# Filename suffix for each channel of the multi-channel volume, in channel order.\n",
    "modality_dict = {\n",
    "    \"_001.nii.gz\": \"FLAIR\",\n",
    "    \"_002.nii.gz\": \"T1w\",\n",
    "    \"_003.nii.gz\": \"t1gd\",\n",
    "    \"_004.nii.gz\": \"T2w\",\n",
    "}\n",
    "\n",
    "for data in tqdm(data_list):\n",
    "    # Skip hidden files.\n",
    "    if data.startswith(\".\"):\n",
    "        continue\n",
    "    image = sitk.ReadImage(str(Path(data_dir).joinpath(\"imagesTr\", data)))\n",
    "    # The first array axis indexes the modality.\n",
    "    data_array = sitk.GetArrayFromImage(image)\n",
    "    for idx, suffix in enumerate(modality_dict):\n",
    "        single_image = sitk.GetImageFromArray(data_array[idx])\n",
    "        # Copy the spatial metadata, dropping the 4th (modality) dimension.\n",
    "        single_image.SetSpacing(image.GetSpacing()[:3])\n",
    "        single_image.SetOrigin(image.GetOrigin()[:3])\n",
    "        direction = image.GetDirection()\n",
    "        # Keep the upper-left 3x3 block of the flattened 4x4 direction matrix.\n",
    "        single_image.SetDirection(direction[:3] + direction[4:7] + direction[8:11])\n",
    "        filename = str(Path(output_dir).joinpath(data[:-len(file_extension)] + suffix))\n",
    "        sitk.WriteImage(single_image, filename)"
   ],
   "execution_count": null,
   "outputs": []
  },
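  {
   "cell_type": "markdown",
   "id": "sketch-direction-3x3",
   "metadata": {},
   "source": [
    "The direction handling in the loop above can be checked in isolation. The following is a minimal sketch, assuming that the direction of a 4D image is returned as a flattened, row-major 4x4 matrix (16 values). The helper name `direction_4d_to_3d` is hypothetical and is not part of SimpleITK or PyMAIA. It keeps the upper-left 3x3 block, which is what the three slices `[:3]`, `[4:7]` and `[8:11]` select.\n",
    "\n",
    "```python\n",
    "def direction_4d_to_3d(direction_4d):\n",
    "    # Hypothetical helper: keep the upper-left 3x3 block of a flattened,\n",
    "    # row-major 4x4 direction matrix.\n",
    "    values = tuple(direction_4d)\n",
    "    if len(values) != 16:\n",
    "        raise ValueError(f'expected 16 values, got {len(values)}')\n",
    "    return tuple(values[row * 4 + col] for row in range(3) for col in range(3))\n",
    "\n",
    "\n",
    "# Identity direction of a 4D image, flattened row by row.\n",
    "identity_4d = tuple(1.0 if i % 5 == 0 else 0.0 for i in range(16))\n",
    "\n",
    "# The slicing used in the conversion loop.\n",
    "sliced = identity_4d[:3] + identity_4d[4:7] + identity_4d[8:11]\n",
    "\n",
    "# Both routes should give the same nine values.\n",
    "print(direction_4d_to_3d(identity_4d) == sliced)\n",
    "```\n",
    "\n",
    "Writing the index arithmetic out explicitly makes it easier to see that the fourth row and the fourth column, which belong to the modality axis, are the parts being discarded."
   ]
  },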
  {
   "cell_type": "markdown",
   "id": "49e4a5dc",
   "metadata": {},
   "source": [
    "## Configuration File\n",
    "\n",
    "Next, we create a PyMAIA configuration file for the experiment. It defines the experiment name, the random seed, the image modalities, the label regions, the number of folds, and the file extensions."
   ]
  },
  {
   "cell_type": "code",
   "id": "8d90eba8",
   "metadata": {},
   "source": [
    "import json\n",
    "brats_config = {\n",
    "    \"Experiment Name\": \"BraTS\",\n",
    "    \"Seed\": 12345,\n",
    "    \"label_suffix\": \".nii.gz\",\n",
    "    \"Modalities\": modality_dict,\n",
    "    \"label_dict\": {\n",
    "        \"background\": 0,\n",
    "        \"whole_tumor\": [1, 2, 3],\n",
    "        \"tumor_core\": [2, 3],\n",
    "        \"enhancing_tumor\": 3\n",
    "    },\n",
    "    \"n_folds\": 5,\n",
    "    \"FileExtension\": \".nii.gz\",\n",
    "    \"RegionClassOrder\" : [1,2,3]\n",
    "    \n",
    "}\n",
    "\n",
    "with open(\"BraTS_config.json\",\"w\") as f:\n",
    "    json.dump(brats_config,f,indent=4)"
   ],
   "execution_count": null,
   "outputs": []
  },
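  {
   "cell_type": "markdown",
   "id": "sketch-label-regions",
   "metadata": {},
   "source": [
    "The `label_dict` above mixes a single value (`enhancing_tumor`) with lists of values (`whole_tumor`, `tumor_core`). The following is a minimal sketch of one way to read it, assuming that each region is the union of the listed label values, as in nnUNet's region-based training. The helper `regions_for_label` is hypothetical and is not a PyMAIA or nnUNet function.\n",
    "\n",
    "```python\n",
    "label_dict = {\n",
    "    'background': 0,\n",
    "    'whole_tumor': [1, 2, 3],\n",
    "    'tumor_core': [2, 3],\n",
    "    'enhancing_tumor': 3,\n",
    "}\n",
    "\n",
    "\n",
    "def regions_for_label(label_value, label_dict):\n",
    "    # Hypothetical helper: names of all regions that contain a voxel label.\n",
    "    regions = []\n",
    "    for name, values in label_dict.items():\n",
    "        if name == 'background':\n",
    "            continue\n",
    "        # A region may be given as a single value or as a list of values.\n",
    "        members = values if isinstance(values, (list, tuple)) else [values]\n",
    "        if label_value in members:\n",
    "            regions.append(name)\n",
    "    return regions\n",
    "\n",
    "\n",
    "for value in range(4):\n",
    "    print(value, regions_for_label(value, label_dict))\n",
    "```\n",
    "\n",
    "Under this reading the regions are nested: a voxel with label 3 belongs to all three regions, while a voxel with label 1 belongs only to `whole_tumor`."
   ]
  },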
  {
   "cell_type": "markdown",
   "id": "c914552f",
   "metadata": {},
   "source": [
    "## Decathlon Dataset File\n",
    "\n",
    "Next, we create a `dataset.json` file that lists the paths to the training and testing data. Each entry maps every modality name, plus `label`, to a file path:\n",
    "```json\n",
    "{\n",
    "    \"train\": [\n",
    "        {\n",
    "            \"FLAIR\": \"Path to FLAIR Image\",\n",
    "            \"T1w\": \"Path to T1w Image\",\n",
    "            \"t1gd\": \"Path to t1gd Image\",\n",
    "            \"T2w\": \"Path to T2w Image\",\n",
    "            \"label\": \"Path to Label Image\"\n",
    "        }\n",
    "    ],\n",
    "    \"test\": [\n",
    "        {\n",
    "            \"FLAIR\": \"Path to FLAIR Image\",\n",
    "            \"T1w\": \"Path to T1w Image\",\n",
    "            \"t1gd\": \"Path to t1gd Image\",\n",
    "            \"T2w\": \"Path to T2w Image\",\n",
    "            \"label\": \"Path to Label Image\"\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "id": "0906b00e",
   "metadata": {},
   "source": [
    "# Case identifiers: single-modality filenames without the modality suffix.\n",
    "# Every suffix has the same length as \"_000.nii.gz\".\n",
    "suffix_length = len(\"_000.nii.gz\")\n",
    "cases = [f.name[:-suffix_length]\n",
    "         for f in os.scandir(Path(data_dir).joinpath(\"imagesTr_Single\"))\n",
    "         if f.is_file() and f.name.endswith(file_extension)]\n",
    "\n",
    "# Each case appears once per modality, so keep unique identifiers only.\n",
    "cases = np.unique(cases)\n",
    "\n",
    "data_list = {\n",
    "    \"train\":\n",
    "        [\n",
    "            {\n",
    "                modality_dict[modality_id]: str(Path(data_dir).joinpath(\"imagesTr_Single\", case + modality_id))\n",
    "                for modality_id in modality_dict\n",
    "            }\n",
    "            for case in cases\n",
    "        ],\n",
    "    \"test\": []\n",
    "}\n",
    "\n",
    "# Derive each label path from the filename of the first modality.\n",
    "first_modality = list(modality_dict.values())[0]\n",
    "for section in data_list:\n",
    "    for entry in data_list[section]:\n",
    "        f = Path(entry[first_modality]).name\n",
    "        entry[\"label\"] = str(Path(data_dir).joinpath(\"labelsTr\", f[:-suffix_length] + brats_config[\"label_suffix\"]))\n",
    "\n",
    "\n",
    "with open(\"dataset.json\", \"w\") as f:\n",
    "    json.dump(data_list, f, indent=4)"
   ],
   "execution_count": null,
   "outputs": []
  },
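  {
   "cell_type": "markdown",
   "id": "sketch-dataset-check",
   "metadata": {},
   "source": [
    "Before passing `dataset.json` to the pipeline, it can help to confirm that every entry lists all modalities and a label. The following is a minimal sketch, assuming the structure described above. `find_incomplete_entries` is a hypothetical helper, not part of PyMAIA, and the sample entries are made up for illustration.\n",
    "\n",
    "```python\n",
    "def find_incomplete_entries(entries, required_keys):\n",
    "    # Hypothetical helper: return (index, missing keys) for each entry\n",
    "    # that lacks at least one required key.\n",
    "    problems = []\n",
    "    for index, entry in enumerate(entries):\n",
    "        missing = [key for key in required_keys if key not in entry]\n",
    "        if missing:\n",
    "            problems.append((index, missing))\n",
    "    return problems\n",
    "\n",
    "\n",
    "required = ['FLAIR', 'T1w', 't1gd', 'T2w', 'label']\n",
    "\n",
    "# Made-up entries for illustration; the second one has no label.\n",
    "example_entries = [\n",
    "    {key: f'path/to/case_a_{key}.nii.gz' for key in required},\n",
    "    {key: f'path/to/case_b_{key}.nii.gz' for key in required[:-1]},\n",
    "]\n",
    "\n",
    "print(find_incomplete_entries(example_entries, required))\n",
    "```\n",
    "\n",
    "In this notebook, the same check could be applied to `data_list['train']`, with `list(modality_dict.values()) + ['label']` as the required keys."
   ]
  },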
  {
   "cell_type": "markdown",
   "id": "263d537a",
   "metadata": {},
   "source": [
    "## Create Pipeline"
   ]
  },
  {
   "cell_type": "code",
   "id": "8bc906fd",
   "metadata": {},
   "source": [
    "%%bash\n",
    "\n",
    "export ROOT_FOLDER=./\n",
    "\n",
    "nnunet_create_pipeline.py --input-data-folder dataset.json --config-file BraTS_config.json --task-ID 100 --test-split 0"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "id": "39c9c178",
   "metadata": {},
   "source": [
    "## Prepare Data"
   ]
  },
  {
   "cell_type": "code",
   "id": "c82e6c77",
   "metadata": {},
   "source": [
    "%%bash\n",
    "\n",
    "export ROOT_FOLDER=/opt/code/PyMAIA/Tutorials\n",
    "nnunet_prepare_data_folder --input-data-folder dataset.json --task-ID 100 --task-name BraTS --config-file BraTS_config.json --test-split 0"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "id": "1da5de19",
   "metadata": {},
   "source": [
    "## Pre-Processing"
   ]
  },
  {
   "cell_type": "code",
   "id": "7453040e",
   "metadata": {},
   "source": [
    "%%bash\n",
    "\n",
    "export ROOT_FOLDER=/opt/code/PyMAIA/Tutorials\n",
    "\n",
    "nnunet_run_plan_and_preprocessing --config-file /opt/code/PyMAIA/Tutorials/BraTS/BraTS_results/Dataset100_BraTS.json -np 4"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "id": "a6cb3b7d-e43d-41c2-936b-d27588108296",
   "metadata": {},
   "source": [
    "## Skip Plan and Preprocess Only\n",
    "\n",
    "Planning can be skipped so that only preprocessing is run. Preprocessing consists of:\n",
    "\n",
    "- Normalization (use of the non-zero mask, normalization scheme)\n",
    "- Resampling (spacing, forward transpose)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "587745c6-6250-4a32-9c34-bc97e182f5b0",
   "metadata": {},
   "source": [
    "## Train Config\n",
    "\n",
    "Training is customized by subclassing `nnUNetTrainer`. The base class sets the following attributes, which a custom trainer can override:\n",
    "\n",
    "- `self.initial_lr = 1e-2`\n",
    "- `self.weight_decay = 3e-5`\n",
    "- `self.oversample_foreground_percent = 0.33`\n",
    "- `self.num_iterations_per_epoch = 250`\n",
    "- `self.num_val_iterations_per_epoch = 50`\n",
    "- `self.num_epochs = 1000`\n",
    "- `self.current_epoch = 0`\n",
    "- `self.enable_deep_supervision = True`\n",
    "\n",
    "The optimizer and the loss can be changed by overriding `self.configure_optimizers` and `self._build_loss`.\n",
    "\n",
    "The batch size and the patch size are not set in the trainer; they come from the nnUNet plans (`nnUNetPlans`)."
   ]
  },
  {
   "cell_type": "code",
   "id": "a7d60975-c19c-4443-a66f-adba32f9d944",
   "metadata": {},
   "source": [
    "%%writefile ../../../../../../nnUNet/nnunetv2/training/nnUNetTrainer/nnUNetTrainerDemo.py\n",
    "\n",
    "from nnunetv2.training.nnUNetTrainer.nnUNetTrainer import nnUNetTrainer\n",
    "import torch\n",
    "\n",
    "\n",
    "class nnUNetTrainerDemo(nnUNetTrainer):\n",
    "    def __init__(\n",
    "            self,\n",
    "            plans: dict,\n",
    "            configuration: str,\n",
    "            fold: int,\n",
    "            dataset_json: dict,\n",
    "            unpack_dataset: bool = True,\n",
    "            device: torch.device = torch.device(\"cuda\"),\n",
    "    ):\n",
    "        super().__init__(plans, configuration, fold, dataset_json, unpack_dataset, device)\n",
    "        self.num_iterations_per_epoch = 10\n",
    "        self.num_val_iterations_per_epoch = 10\n",
    "        self.num_epochs = 5\n"
   ],
   "execution_count": null,
   "outputs": []
  },
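  {
   "cell_type": "markdown",
   "id": "sketch-iteration-count",
   "metadata": {},
   "source": [
    "To see how much the demo trainer shortens a run, the iteration counts can be compared directly. The following is a minimal sketch that uses only the values quoted in this notebook: 250 iterations per epoch over 1000 epochs for the default trainer settings, and 10 iterations per epoch over 5 epochs for `nnUNetTrainerDemo`. The helper `total_training_iterations` is hypothetical.\n",
    "\n",
    "```python\n",
    "def total_training_iterations(iterations_per_epoch, num_epochs):\n",
    "    # Hypothetical helper: number of training iterations in a full run.\n",
    "    return iterations_per_epoch * num_epochs\n",
    "\n",
    "\n",
    "default_total = total_training_iterations(250, 1000)\n",
    "demo_total = total_training_iterations(10, 5)\n",
    "\n",
    "# Total iterations for each setting, and the ratio between them.\n",
    "print(default_total, demo_total, default_total // demo_total)\n",
    "```\n",
    "\n",
    "The demo settings are meant for a quick end-to-end check of the pipeline, not for producing a usable model."
   ]
  },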
  {
   "cell_type": "code",
   "id": "9b33cb01",
   "metadata": {},
   "source": [
    "%%bash\n",
    "\n",
    "export ROOT_FOLDER=/opt/code/PyMAIA/Tutorials\n",
    "export N_THREADS=4\n",
    "nnunet_run_training --config-file /opt/code/PyMAIA/Tutorials/BraTS/BraTS_results/Dataset100_BraTS.json --run-fold 0 -tr nnUNetTrainerDemo"
   ],
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "id": "41973c70",
   "metadata": {},
   "source": [
    "## Next Steps\n",
    "\n",
    "- Convert nnUNet/nnDet to a MONAI Bundle\n",
    "- Export the trained model and upload it to MLflow (both the MONAI and the original nnUNet versions)\n",
    "- Run training with MLflow\n",
    "- Run prediction with MLflow"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}