Quickstart for MacTrack2
========================

Through this file, you can find all the information you need to make this project your own. Don't forget to create your Python environment with the packages available in the `requirements.txt` file.

How to create your own model
----------------------------

Dataset
++++++++

First, you need a dataset to create your model and a **models** folder to store it.

.. code-block:: python

   import os

   dirs = [
       "model",
       "model/dataset",
       "model/dataset/train",
       "model/dataset/train/train_x",
       "model/dataset/train/train_y",
       "model/dataset/test",
       "model/dataset/test/test_x",
       "model/dataset/test/test_y",
       "model/models",
   ]
   for d in dirs:
       os.makedirs(d, exist_ok=True)

This dataset is essential to create your model, you need to add the following:

* ``train_x``: a training set of images that you hand-cut yourself using `Fiji <https://imagej.net/software/fiji/downloads>`_ and more specifically `ImageJ <https://imagej.net/software/imagej>`_.
* ``train_y``: the masks for each segmented frame in a zip file containing all the ROI files.
* ``test_x`` and ``test_y``: you should do the same, it will serve to test the model that has been trained on the training set.

Here is an example of an image that will be placed in the **x** sets (whether it is the training or the test set) on the left. And on the right, each yellow lining represents a ROI (region of interest). Those are saved in a zip file and in the **y** sets.

.. raw:: html

   <p align="center">
     <img src="./_static/images/example_frame.jpg" alt="frame_ex" width="45%"/>
     <img src="./_static/images/example_frame_ROI.jpg" alt="ROI_ex" width="45%"/>
   </p>

.. note::
   Documentation on Fiji can be found `here <https://imagej.net/ij/docs/index.html>`_.

Next, the `kartezio` package that will help us build the model needs two other files: `META.json` and `dataset.csv`.

The META file is here to help the package recognize the formats of the different objects in the dataset folder.

.. code-block:: python

   import json

   # Path to the JSON file
   meta_path = "model/dataset/META.json"

   # Content of the JSON file
   meta_data = {
       "name": "Macrophage",
       "scale": 1.0,
       "label_name": "macrophage",
       "mode": "dataframe",
       "input": {"type": "image", "format": "hsv"},
       "label": {"type": "roi", "format": "polygon"},
   }

   # Create the JSON file with the content in it
   with open(meta_path, "w", encoding="utf-8") as f:
       json.dump(meta_data, f, indent=4)

   print(f"File {meta_path} created successfully!")

The CSV file is here to help kartezio read your dataset.

.. code-block:: python

   from Set_up.dataset_csv import create_dataset_csv

   create_dataset_csv(input_folder="model/dataset", output_csv="model/dataset/dataset.csv")

Your two files are now created and located in the `model/dataset` folder. We can now train our model.

Training
++++++++

Now that we have the structure kartezio needs to function correctly, we can create and train a model using the `create_segmentation_model` function of the kartezio package. You can find the following code in the `Set_up/train_model.py` file.

.. code-block:: python

   from kartezio.apps.segmentation import create_segmentation_model
   from kartezio.endpoint import EndpointThreshold
   from kartezio.dataset import read_dataset
   from kartezio.training import train_model

   import time
   from datetime import timedelta

   t0_train = time.time()

   DATASET = "model/dataset"
   OUTPUT = "model/models"

   generations = 1000
   _lambda = 5
   frequency = 5
   rate = 0.1
   print(rate)
   model = create_segmentation_model(
       generations,
       _lambda,
       inputs=3,
       nodes=30,
       node_mutation_rate=rate,
       output_mutation_rate=rate,
       outputs=1,
       fitness="IOU",
       endpoint=EndpointThreshold(threshold=4),
   )

   dataset = read_dataset(DATASET)
   elite, a = train_model(model, dataset, OUTPUT, callback_frequency=frequency)

   t1_train = time.time()
   elapsed_train = int(t1_train - t0_train)

Testing
+++++++

Now that you trained your model, you can test it to see if it gives good predictions.

.. code-block:: python

   import numpy as np
   import pandas as pd

   from kartezio.easy import print_stats
   from kartezio.dataset import read_dataset
   from kartezio.fitness import FitnessIOU
   from kartezio.inference import ModelPool

   t0_test = time.time()

   scores_all = {}
   pool = ModelPool(f"model/models", FitnessIOU(), regex="*/elite.json").to_ensemble()
   dataset = read_dataset(f"model/dataset", counting=True)
   annotations_test = 0
   annotations_training = 0
   roi_pixel_areas = []
   for y_true in dataset.train_y:
       n_annotations = y_true[1]
       annotations_training += n_annotations
   for y_true in dataset.test_y:
       annotations = y_true[0]
       n_annotations = y_true[1]
       annotations_test += n_annotations
       for i in range(1, n_annotations + 1):
           roi_pixel_areas.append(np.count_nonzero(annotations[annotations == i]))
   print(f"Total annotations for training set: {annotations_training}")
   print(f"Total annotations for test set: {annotations_test}")
   print(f"Mean pixel area for test set: {np.mean(roi_pixel_areas)}")

   scores_test = []
   scores_training = []
   for i, model in enumerate(pool.models):
       # Test set
       _, fitness, _ = model.eval(dataset, subset="test")
       scores_test.append(1.0 - fitness)

       # Training set
       _, fitness, _ = model.eval(dataset, subset="train")
       scores_training.append(1.0 - fitness)

   t1_test = time.time()
   elapsed_test = int(t1_test - t0_test)
   elapsed_total = elapsed_train + elapsed_test

   print(scores_training)
   print(scores_test)

   scores_all[f"training"] = scores_training
   scores_all[f"test"] = scores_test

   pd.DataFrame(scores_all).to_csv("model/scores.csv", index=False)

   print(f"Total runtime : {timedelta(seconds=elapsed_total)}")
   print(f"Training time : {timedelta(seconds=elapsed_train)}")
   print(f"Testing time : {timedelta(seconds=elapsed_test)}")

Model summary
+++++++++++++

You can get a summary of your model by running the following.

.. code-block:: python

   from Set_up.explain_model import summary_model

   summaries = summary_model("model")

You can create a ``csv`` file for the nodes that will help you understand your model.

.. code-block:: python

   for summary in summaries:
       print(summary.to_csv())

You can also access other parameters that are not displayed in the summary, for instance:

.. code-block:: python

   print(summaries[0].keys())

And you can extract any object from that list.

.. code-block:: python

   print(summaries[0].endpoint)

Model visualization
++++++++++++++++++++

You can visualize the model you've just created and tested by running the following code.

.. code-block:: python

   from mactrack.visualisation.viz_model import comp_model

   comp_model("./model", train=False)

You have two options to visualize your model:

* If you put `train=False`, it will compare the test set with the predictions on the test set.
* If you put `train=True`, it will compare the training set with the predictions on the training set.

You can then find the results in the `output` folder.

.. code-block:: python

   comp_model("./model", train=True)

Calculate IOU
+++++++++++++

You have to be careful with the second argument. If you've put `train=False` (default), you will have to use the test set. If you've put `train=True`, you will have to use the training set, which means that your second argument will be `./model/dataset/train/train_y/` and not `./model/dataset/test/test_y/`.

.. code-block:: python

   from mactrack.visualisation.iou import mean_global_iou

   mean_global_iou(
       "./model_output/test_def/ROIs_pred_def/",
       "./model/dataset/test/test_y/",
       "./model_output/comparison/",
   )