leaspy.io.data.data.Data

class Data

Bases: Iterable

Main data container for a collection of individuals

It can be iterated over and sliced, both of these operations being applied to the underlying individuals attribute.

Attributes:
individuals : Dict[IDType, IndividualData]

Included individuals and their associated data

iter_to_idx : Dict[int, IDType]

Maps an integer index to the associated individual ID

headers : List[FeatureType]

Feature names

dimension : int

Number of features

n_individuals : int

Number of individuals

n_visits : int

Total number of visits

cofactors : List[FeatureType]

Feature names corresponding to cofactors

Methods

from_csv_file(path, **kws)

Create a Data object from a CSV file.

from_dataframe(df, **kws)

Create a Data object from a pandas.DataFrame.

from_individual_values(indices, timepoints, ...)

Construct Data from a collection of individual data points

from_individuals(individuals, headers)

Construct Data from a list of individuals

load_cofactors(df, *[, cofactors])

Load cofactors from a pandas.DataFrame to the Data object

to_dataframe(*[, cofactors, reset_index])

Convert the Data object to a pandas.DataFrame

property cofactors: List[str]

Feature names corresponding to cofactors

property dimension: int | None

Number of features

static from_csv_file(path: str, **kws) -> Data

Create a Data object from a CSV file.

Parameters:
path : str

Path to the CSV file to load (with extension)

**kws

Keyword arguments that are sent to CSVDataReader

Returns:
Data
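
As a rough illustration of the kind of file `from_csv_file` could be given, the sketch below writes a small long-format CSV (one row per visit) using only the standard library. The column layout (`ID`, `TIME`, then one column per feature) is an assumption carried over from the `from_dataframe` description; the feature names, the toy values and the file name are hypothetical. The call into leaspy is only attempted when the package is importable.

```python
import csv
import os
import tempfile

# Hypothetical toy data: one row per visit, with ID and TIME columns
# followed by one column per feature (layout assumed, see lead-in).
rows = [
    {"ID": "A", "TIME": 70.0, "feat_1": 0.1, "feat_2": 0.2},
    {"ID": "A", "TIME": 71.5, "feat_1": 0.3, "feat_2": 0.4},
    {"ID": "B", "TIME": 65.0, "feat_1": 0.5, "feat_2": 0.6},
]
fieldnames = ["ID", "TIME", "feat_1", "feat_2"]

# Write the CSV to a temporary directory (path includes the extension).
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "toy_visits.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

# Read the header back to confirm the column order that was written.
with open(csv_path, newline="") as f:
    header = next(csv.reader(f))

# Only call leaspy if it is installed; the import path is the one
# given in this page's title.
try:
    from leaspy.io.data.data import Data
except ImportError:
    Data = None

if Data is not None:
    data = Data.from_csv_file(csv_path)
    print(data.n_individuals, data.n_visits, data.headers)
```

Any extra keyword arguments would be forwarded to CSVDataReader, whose options are not listed on this page.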

static from_dataframe(df: DataFrame, **kws) -> Data

Create a Data object from a pandas.DataFrame.

Parameters:
df : pandas.DataFrame

Dataframe containing ID, TIME and features.

**kws

Keyword arguments that are sent to DataframeDataReader

Returns:
Data

static from_individual_values(indices: List[str], timepoints: List[List[float]], values: List[List[List[float]]], headers: List[str]) -> Data

Construct Data from a collection of individual data points

Parameters:
indices : List[IDType]

List of the individuals’ unique IDs

timepoints : List[List[float]]

For each individual i, the list of timepoints associated with the observations. The number of such timepoints is denoted n_timepoints_i.

values : List[array-like[float, 2D]]

For each individual i, a two-dimensional array-like object containing the observed data points. Its expected shape is (n_timepoints_i, n_features).

headers : List[FeatureType]

Feature names. The number of features is denoted n_features.

Returns:
Data
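
The shape constraints above are easy to get wrong with nested lists, so the following sketch checks them in plain Python before handing the inputs over. The helper `check_individual_values` is hypothetical (it is not part of leaspy), as are the toy IDs, timepoints and feature names; the leaspy call itself is only attempted when the package is importable.

```python
def check_individual_values(indices, timepoints, values, headers):
    """Hypothetical helper: validate the nested layout described above.

    Returns the total number of visits, or raises ValueError on a mismatch.
    """
    n_features = len(headers)
    if not (len(indices) == len(timepoints) == len(values)):
        raise ValueError("indices, timepoints and values must have the same length")
    for idx, tps, vals in zip(indices, timepoints, values):
        # One row of observations per timepoint: n_timepoints_i rows.
        if len(vals) != len(tps):
            raise ValueError(f"individual {idx!r}: {len(vals)} rows for {len(tps)} timepoints")
        # Each row holds one value per feature: n_features columns.
        for row in vals:
            if len(row) != n_features:
                raise ValueError(f"individual {idx!r}: expected {n_features} values per row")
    return sum(len(tps) for tps in timepoints)


# Hypothetical toy inputs: two individuals, two features.
indices = ["A", "B"]
timepoints = [[70.0, 71.5], [65.0]]
values = [
    [[0.1, 0.2], [0.3, 0.4]],  # individual "A": shape (2, 2)
    [[0.5, 0.6]],              # individual "B": shape (1, 2)
]
headers = ["feat_1", "feat_2"]

n_visits = check_individual_values(indices, timepoints, values, headers)

# Only call leaspy if it is installed; the import path is the one
# given in this page's title.
try:
    from leaspy.io.data.data import Data
except ImportError:
    Data = None

if Data is not None:
    data = Data.from_individual_values(indices, timepoints, values, headers)
```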

static from_individuals(individuals: List[IndividualData], headers: List[str]) -> Data

Construct Data from a list of individuals

Parameters:
individuals : List[IndividualData]

List of individuals

headers : List[FeatureType]

List of feature names

Returns:
Data

load_cofactors(df: DataFrame, *, cofactors: List[str] | None = None) -> None

Load cofactors from a pandas.DataFrame to the Data object

Parameters:
df : pandas.DataFrame

The dataframe in which the cofactors are stored. Its index should be ID, the subject identifier, and each ID should appear exactly once (i.e. one row per individual).

cofactors : List[FeatureType] or None (default)

Names of the column(s) of df to load as cofactors. If None, all the columns of the input dataframe are loaded as cofactors.

Raises:
LeaspyDataInputError
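
Since the cofactor table must hold exactly one row per individual, it can help to look for duplicated IDs before calling `load_cofactors`. The sketch below does that check in plain Python, then, only if pandas and leaspy are both importable, builds a small dataset, loads one cofactor and exports it again. The helper `duplicated_ids`, the cofactor names (`SEX`, `SITE`) and all toy values are hypothetical.

```python
from collections import Counter


def duplicated_ids(rows, id_key="ID"):
    """Hypothetical helper: return the sorted IDs that appear in more than one row."""
    counts = Counter(row[id_key] for row in rows)
    return sorted(i for i, c in counts.items() if c > 1)


# Hypothetical cofactor table: one row per individual.
cofactor_rows = [
    {"ID": "A", "SEX": "F", "SITE": "Paris"},
    {"ID": "B", "SEX": "M", "SITE": "Lyon"},
]
duplicates = duplicated_ids(cofactor_rows)
if duplicates:
    raise ValueError(f"IDs with more than one cofactor row: {duplicates}")

# Only use pandas and leaspy if both are installed; the import path
# is the one given in this page's title.
try:
    import pandas as pd
    from leaspy.io.data.data import Data
except ImportError:
    pd = None

if pd is not None:
    # Long-format visits: ID, TIME, then the features.
    visits = pd.DataFrame({
        "ID": ["A", "A", "B"],
        "TIME": [70.0, 71.5, 65.0],
        "feat_1": [0.1, 0.3, 0.5],
    })
    data = Data.from_dataframe(visits)

    # The cofactor dataframe is indexed by ID, one row per individual.
    cofactors_df = pd.DataFrame(cofactor_rows).set_index("ID")
    data.load_cofactors(cofactors_df, cofactors=["SEX"])

    # Export the visits together with every loaded cofactor.
    exported = data.to_dataframe(cofactors="all")
```

Passing `cofactors=None` instead would load every column of the cofactor dataframe, as described in the parameter list above.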

property n_individuals: int

Number of individuals

property n_visits: int

Total number of visits

to_dataframe(*, cofactors: List[str] | str | None = None, reset_index: bool = True) -> DataFrame

Convert the Data object to a pandas.DataFrame

Parameters:
cofactors : List[FeatureType], ‘all’, or None (default None)

Cofactors to include in the DataFrame. If None (default), no cofactors are included. If “all”, all the available cofactors are included.

reset_index : bool (default True)

Whether to reset index levels in output.

Returns:
pandas.DataFrame

A DataFrame containing the individuals’ IDs, timepoints and associated observations, and optionally the requested cofactors.

Raises:
LeaspyDataInputError
LeaspyTypeError