Joint Model#

This notebook contains the code for a simple implementation of the Leaspy Joint model on synthetic data.

The following imports are required libraries for numerical computation and data manipulation.

import os
import pandas as pd
import leaspy
from leaspy.io.data import Data

leaspy_root = os.path.dirname(leaspy.__file__)
data_path = os.path.join(leaspy_root, "datasets/data/simulated_data_for_joint.csv")

df = pd.read_csv(data_path, dtype={"ID": str}, sep=";")
print(df.head())
    ID    TIME  EVENT_TIME  EVENT_BOOL       Y0    Y1   Y2   Y3
0  116  78.461        85.5           1  0.44444  0.04  0.0  0.0
1  116  78.936        85.5           1  0.60000  0.00  0.0  0.2
2  116  79.482        85.5           1  0.39267  0.04  0.0  0.2
3  116  79.939        85.5           1  0.58511  0.00  0.0  0.0
4  116  80.491        85.5           1  0.57044  0.00  0.0  0.0

To use the Joint Model in Leaspy, your dataset must include the following columns:

  1. ID : Patient identifier

  2. TIME : Time of measurement

  3. EVENT_TIME : Time of the event

  4. EVENT_BOOL : Event indicator: - 1 if the event occurred - 0 if censored - 2 if a competing event occurred

For one patient, the event time and event bool are the same for each row.

We load the Joint Model from the leaspy.models and transform the dataset in a leaspy-compatible form with the built-in functions.

from leaspy.models import JointModel

data = Data.from_dataframe(df, "joint")
model = JointModel(name="test_model", nb_events=1, source_dimension=2)

Warning

For Joint models you MUST include “joint” as the second argument of the Data.from_dataframe method. This is necessary to ensure that the data is correctly processed and that the model can be fitted without errors.

The parameter nb_events should match the number of distinct event types present in the EVENT_BOOL column:

  • If EVENT_BOOL contains values {0, 1}, then nb_events=1.

  • If it contains values {0, 1, 2}, then nb_events=2.

Once the model is initialized, we can fit it to the data.

model.fit(data, "mcmc_saem", seed=1312, n_iter=100, progress_bar=False)
model.summary()
Fit with `AlgorithmName.FIT_MCMC_SAEM` took: 1.88s

================================================================================
                                 Model Summary
================================================================================
Model Name: joint
Model Type: JointModel
Features (4): Y0, Y1, Y2, Y3
Sources (2): Source 0 (s0), Source 1 (s1)
Observation Models: gaussian-diagonal, weibull-right-censored-with-sources
Neg. Log-Likelihood: -570.1557
Parameters: 24
BIC: -1212.91
AIC: -1232.91

Training Metadata
--------------------------------------------------------------------------------
Algorithm: mcmc_saem
Seed: 1312
Iterations: 100

Data Context
--------------------------------------------------------------------------------
Subjects: 17
Visits: 157
Total Observations: 628
Leaspy Version: 2.1.0
================================================================================

Population Parameters
--------------------------------------------------------------------------------
  log_rho_mean       [1.7669]
  n_log_nu_mean      [-2.0032]
  betas_mean:
                          s0        s1
            b0       -0.0650   -0.0294
            b1       -0.0159   -0.0192
            b2        0.0029   -0.0845
                          Y0        Y1        Y2        Y3
  log_g_mean          0.1157    2.9694    2.6022    1.1908
                          Y0        Y1        Y2        Y3
  log_v0_mean        -3.0769   -3.7779   -3.8124   -2.9112

Individual Parameters
--------------------------------------------------------------------------------
  tau_mean           [78.7148]
  tau_std            [5.6925]
  xi_std             [0.3405]
  zeta_mean:
            s0        0.0421
            s1        0.0695

Noise Model
--------------------------------------------------------------------------------
                          Y0        Y1        Y2        Y3
  noise_std           0.0766    0.0425    0.0745    0.1567

Derived Parameters (interpretable scale)
--------------------------------------------------------------------------------
                          Y0        Y1        Y2        Y3
  v0                  0.0461    0.0229    0.0221    0.0544
                          Y0        Y1        Y2        Y3
  p0                  0.4711    0.0488    0.0690    0.2331
================================================================================

We can also access the model information and parameters after fitting it to the data.

model.info()
================================================================================
                               Model Information
================================================================================
Statistical Model
Type: JointModel
Name: joint
Dimension: 4
Source Dimension: 2
Observation Models: gaussian-diagonal, weibull-right-censored-with-sources
Parameters: 24

Latent Variables
--------------------------------------------------------------------------------
  Population:
    betas                Normal(betas_mean, betas_std)
    log_g                Normal(log_g_mean, log_g_std)
    log_rho              Normal(log_rho_mean, log_rho_std)
    log_v0               Normal(log_v0_mean, log_v0_std)
    n_log_nu             Normal(n_log_nu_mean, n_log_nu_std)
    zeta                 Normal(zeta_mean, zeta_std)
  Individual:
    sources              Normal(sources_mean, sources_std)
    tau                  Normal(tau_mean, tau_std)
    xi                   Normal(xi_mean, xi_std)
--------------------------------------------------------------------------------

Training Dataset
--------------------------------------------------------------------------------
Subjects: 17
Visits: 157
Scores (Features): 4
Total Observations: 628
Visits per Subject: Median 9.0 [Min 7, Max 11, IQR 1.0]
Events Observed: 11

Training Details
--------------------------------------------------------------------------------
Algorithm: mcmc_saem
Seed: 1312
Iterations: 100
  Burn-in: 90/100 (90%)
  Burn-out: 10
Duration: 1.883s

Hyperparameters (fixed values from the source code)
--------------------------------------------------------------------------------
  betas_std: 0.01
  log_g_std: 0.01
  log_rho_std: 0.01
  log_v0_std: 0.01
  n_log_nu_std: 0.01
  sources_mean: [0.0, 0.0]
  sources_std: 1.0
  xi_mean: 0.0
  zeta_std: 0.01

Leaspy Version: 2.1.0
================================================================================

The Joint Model includes specific parameters such as log_rho_mean and zeta_mean.

print(model.parameters)
{'betas_mean': tensor([[-0.0650, -0.0294],
        [-0.0159, -0.0192],
        [ 0.0029, -0.0845]]), 'log_g_mean': tensor([0.1157, 2.9694, 2.6022, 1.1908]), 'log_rho_mean': tensor([1.7669]), 'log_v0_mean': tensor([-3.0769, -3.7779, -3.8124, -2.9112]), 'n_log_nu_mean': tensor([-2.0032]), 'noise_std': tensor([0.0766, 0.0425, 0.0745, 0.1567], dtype=torch.float64), 'tau_mean': tensor([78.7148], dtype=torch.float64), 'tau_std': tensor([5.6925], dtype=torch.float64), 'xi_std': tensor([0.3405], dtype=torch.float64), 'zeta_mean': tensor([[0.0421],
        [0.0695]])}

We have seen how to fit a Joint Model using Leaspy. It also provides other models such as the Mixture Model that can be explored in the next example.

Total running time of the script: (0 minutes 2.121 seconds)

Gallery generated by Sphinx-Gallery