{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Tutorial 7: Advanced Customization with Custom Models\n",
    "\n",
    "Welcome to our final tutorial! You have now mastered the main workflows of `NeuralMI`, from simple estimates to rigorous, publication-ready analyses. But what happens when your research requires a model architecture that isn't built into the library? \n",
    "\n",
    "This tutorial is for the advanced user who wants maximum flexibility. We will show you how to define your own models using PyTorch and seamlessly integrate them into the `nmi.run` pipeline, ensuring that `NeuralMI` can grow with your research needs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. The Requirements for a Custom Model\n",
    "\n",
    "To be compatible with the `NeuralMI` trainer, any custom model must meet two simple requirements:\n",
    "\n",
    "1.  It must inherit from **`nmi.models.BaseCritic`**.\n",
    "2.  Its `forward` method must accept two arguments, `x` and `y`, and return a **tuple** containing:\n",
    "    - `scores`: A `(batch_size, batch_size)` tensor of similarity scores.\n",
    "    - `kl_loss`: A scalar tensor for any KL divergence loss. If not using a variational model, this should be `torch.tensor(0.0)`.\n",
    "\n",
    "Let's explore the two main ways to achieve this."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import torch.nn as nn\n",
    "import numpy as np\n",
    "import neural_mi as nmi\n",
    "import seaborn as sns\n",
    "\n",
    "sns.set_context(\"talk\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Method 1: Full Control with `custom_critic`\n",
    "\n",
    "This method gives you complete control. You define the entire critic architecture from scratch and pass a pre-initialized **instance** of your model to `nmi.run`.\n",
    "\n",
    "Let's build a simple custom critic that uses a linear embedding layer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Custom critic class defined successfully!\n"
     ]
    }
   ],
   "source": [
    "# Define a simple embedding model (can be anything that inherits from BaseEmbedding)\n",
    "class LinearEmbedding(nmi.models.BaseEmbedding):\n",
    "    def __init__(self, input_dim, embedding_dim):\n",
    "        super().__init__()\n",
    "        self.layer = nn.Linear(input_dim, embedding_dim)\n",
    "\n",
    "    def forward(self, x):\n",
    "        x_flat = x.view(x.shape[0], -1)\n",
    "        return self.layer(x_flat)\n",
    "\n",
    "# Define our custom critic that uses the embedding model\n",
    "class MyCustomSeparableCritic(nmi.models.BaseCritic):\n",
    "    def __init__(self, input_dim, embedding_dim):\n",
    "        super().__init__()\n",
    "        self.embedding_net = LinearEmbedding(input_dim, embedding_dim)\n",
    "\n",
    "    def forward(self, x, y):\n",
    "        x_embedded = self.embedding_net(x)\n",
    "        y_embedded = self.embedding_net(y)\n",
    "        scores = torch.matmul(x_embedded, y_embedded.t())\n",
    "        return scores, torch.tensor(0.0, device=scores.device)\n",
    "\n",
    "print(\"Custom critic class defined successfully!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using the Custom Critic in `nmi.run`\n",
    "\n",
    "Using our new model is simple: we instantiate our critic and pass the **instance** to the `custom_critic` argument. The library will then skip its internal model-building logic and use our model directly. Any model architecture parameters in `base_params` (like `embedding_dim`, `hidden_dim`, etc.) will be ignored."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2025-10-20 00:10:41 - neural_mi - INFO - Starting parameter sweep sequentially (n_workers=1)...\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b94c5d97570f448384ad8db9cc48c7b2",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Sequential Sweep Progress:   0%|          | 0/1 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "66971a3c32dd44d1aa3bf5921da232b6",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Run 776e4ccf-f321-429f-b1f3-0bb125f8bcea_c0:   0%|          | 0/50 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2025-10-20 00:10:45 - neural_mi - INFO - Parameter sweep finished.\n",
      "\n",
      "--- Results with custom_critic ---\n",
      "Ground Truth MI:  2.000 bits\n",
      "Estimated MI:     1.579 bits\n"
     ]
    }
   ],
   "source": [
    "# --- Generate some simple data ---\n",
    "x_raw, y_raw = nmi.datasets.generate_correlated_gaussians(n_samples=5000, dim=5, mi=2.0)\n",
    "\n",
    "# --- Instantiate our model ---\n",
    "my_critic_instance = MyCustomSeparableCritic(input_dim=5, embedding_dim=16)\n",
    "\n",
    "# --- Define trainer parameters (no model architecture params needed) ---\n",
    "base_params = {\n",
    "    'n_epochs': 50, 'learning_rate': 1e-3, 'batch_size': 128,\n",
    "    'patience': 10\n",
    "}\n",
    "\n",
    "# --- Run the estimation ---\n",
    "results = nmi.run(\n",
    "    x_data=x_raw.T, y_data=y_raw.T,\n",
    "    mode='estimate',\n",
    "    processor_type_x='continuous',\n",
    "    processor_params_x={'window_size': 1},\n",
    "    base_params=base_params,\n",
    "    split_mode='random',\n",
    "    custom_critic=my_critic_instance, # Here is the magic!\n",
    "    n_workers=1,\n",
    "    random_seed=42\n",
    ")\n",
    "\n",
    "print(f\"\\n--- Results with custom_critic ---\")\n",
    "print(f\"Ground Truth MI:  2.000 bits\")\n",
    "print(f\"Estimated MI:     {results.mi_estimate:.3f} bits\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Method 2: Modular Control with `custom_embedding_cls`\n",
    "\n",
    "Sometimes you don't need to reinvent the wheel. You might like the library's built-in `SeparableCritic`, but you just want to swap out the embedding model (e.g., use a Transformer instead of an MLP).\n",
    "\n",
    "The `custom_embedding_cls` parameter is perfect for this. Instead of a model *instance*, you provide the **class** of your custom embedding model. The library will then handle instantiating it for you, using the architecture parameters from `base_params`.\n",
    "\n",
    "**Important:** For this to work, your custom embedding's `__init__` method must be designed to accept the standard parameters that the library provides: `input_dim`, `hidden_dim`, `embed_dim`, and `n_layers`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CustomMLP class defined successfully!\n"
     ]
    }
   ],
   "source": [
    "# Define a more complex custom embedding that is compatible with the library's builder\n",
    "class CustomMLP(nmi.models.BaseEmbedding):\n",
    "    # This __init__ signature matches the arguments the library's internal builder will provide\n",
    "    def __init__(self, input_dim: int, hidden_dim: int, embed_dim: int, n_layers: int, activation: str = 'relu'):\n",
    "        super().__init__()\n",
    "        \n",
    "        # You can define any architecture you want inside\n",
    "        layers = [nn.Linear(input_dim, hidden_dim), nn.ReLU()]\n",
    "        for _ in range(n_layers - 1):\n",
    "            layers.extend([nn.Linear(hidden_dim, hidden_dim), nn.ReLU()])\n",
    "        self.network = nn.Sequential(*layers)\n",
    "        self.output_layer = nn.Linear(hidden_dim, embed_dim)\n",
    "\n",
    "    def forward(self, x: torch.Tensor) -> torch.Tensor:\n",
    "        return self.output_layer(self.network(x.view(x.shape[0], -1)))\n",
    "\n",
    "print(\"CustomMLP class defined successfully!\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2025-10-20 00:10:45 - neural_mi - INFO - Starting parameter sweep sequentially (n_workers=1)...\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b4a013c02bdc42fca57ff8504b1416e7",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Sequential Sweep Progress:   0%|          | 0/1 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "649d9c6add3e4a8488ea1de7b7304f8f",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Run a2e7333f-7d3c-4d85-a03a-e4565e03e208_c0:   0%|          | 0/50 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2025-10-20 00:10:48 - neural_mi - INFO - Parameter sweep finished.\n",
      "\n",
      "--- Results with custom_embedding_cls ---\n",
      "Ground Truth MI:  2.000 bits\n",
      "Estimated MI:     1.944 bits\n"
     ]
    }
   ],
   "source": [
    "# --- Define model and trainer parameters ---\n",
    "# This time, we DO need to provide the architecture params, as the library will use them\n",
    "# to instantiate our CustomMLP class.\n",
    "base_params_cls = {\n",
    "    'n_epochs': 50, 'learning_rate': 1e-3, 'batch_size': 128,\n",
    "    'patience': 10, 'embedding_dim': 16, 'hidden_dim': 64, 'n_layers': 2,\n",
    "    'critic_type': 'separable'\n",
    "}\n",
    "\n",
    "# --- Run the estimation ---\n",
    "results_cls = nmi.run(\n",
    "    x_data=x_raw.T, y_data=y_raw.T,\n",
    "    mode='estimate',\n",
    "    processor_type_x='continuous',\n",
    "    processor_params_x={'window_size': 1},\n",
    "    split_mode='random',\n",
    "    base_params=base_params_cls,\n",
    "    custom_embedding_cls=CustomMLP, # Pass the CLASS here\n",
    "    n_workers=1,\n",
    "    random_seed=42\n",
    ")\n",
    "\n",
    "print(f\"\\n--- Results with custom_embedding_cls ---\")\n",
    "print(f\"Ground Truth MI:  2.000 bits\")\n",
    "print(f\"Estimated MI:     {results_cls.mi_estimate:.3f} bits\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Success! The estimate is probably more accurate. This modular approach allows you to leverage the library's tested critic architectures while still having the freedom to design novel embedding models for your specific data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Conclusion\n",
    "\n",
    "Congratulations! You have completed the `NeuralMI` learning path. You now have the skills to handle complex neural data, choose the right model architecture, perform scientifically rigorous analyses, and even extend the library with your own custom models.\n",
    "\n",
    "The `custom_critic` and `custom_embedding_cls` features provide escape hatches for maximum flexibility, ensuring that `NeuralMI` can serve as the foundation for your analysis, no matter how specialized your research becomes."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:base] *",
   "language": "python",
   "name": "conda-base-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}