leaspy.io.data.data module

class Data

Bases: Iterable

Main data container for a collection of individuals

It can be iterated over and sliced, both of these operations being applied to the underlying individuals attribute.

Attributes:

individuals: Dict[IDType, IndividualData]: Included individuals and their associated data
iter_to_idx: Dict[int, IDType]: Maps an integer index to the associated individual ID
headers: List[FeatureType]: Feature names
dimension: int: Number of features
n_individuals: int: Number of individuals
n_visits: int: Total number of visits
cofactors: List[FeatureType]: Feature names corresponding to cofactors

Methods

`from_csv_file`(path, **kws)	Create a Data object from a CSV file.
`from_dataframe`(df, **kws)	Create a Data object from a `pandas.DataFrame`.
`from_individual_values`(indices, timepoints, ...)	Construct Data from a collection of individual data points
`from_individuals`(individuals, headers)	Construct Data from a list of individuals
`load_cofactors`(df, *[, cofactors])	Load cofactors from a pandas.DataFrame to the Data object
`to_dataframe`(*[, cofactors, reset_index])	Convert the Data object to a `pandas.DataFrame`

property cofactors: List[str]: Feature names corresponding to cofactors

property dimension: int | None: Number of features

static from_csv_file(path: str, **kws) → Data

Create a Data object from a CSV file.

Parameters:

path: str: Path to the CSV file to load (with extension)
**kws: Keyword arguments that are sent to CSVDataReader

Returns:

Data
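As a minimal sketch of the input file, the block below writes a small long-format CSV with the standard library and then passes its path to `Data.from_csv_file`. The column layout (ID, TIME, then one column per feature) is an assumption carried over from the `from_dataframe` description; the feature names, subject IDs, and file name are hypothetical. The leaspy call is guarded so the block still runs when leaspy is not installed.

```python
import csv
import os
import tempfile

# Hypothetical toy visits: one row per (individual, timepoint).
fieldnames = ["ID", "TIME", "feature_a", "feature_b"]
rows = [
    {"ID": "subject_1", "TIME": 70.0, "feature_a": 0.10, "feature_b": 0.20},
    {"ID": "subject_1", "TIME": 71.5, "feature_a": 0.15, "feature_b": 0.30},
    {"ID": "subject_2", "TIME": 65.2, "feature_a": 0.05, "feature_b": 0.10},
    {"ID": "subject_2", "TIME": 66.0, "feature_a": 0.08, "feature_b": 0.12},
]

# Write the CSV to a temporary directory (the path includes the extension).
csv_path = os.path.join(tempfile.mkdtemp(), "toy_visits.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

try:
    from leaspy.io.data.data import Data
except ImportError:
    Data = None  # leaspy not installed; the CSV above is still written

if Data is not None:
    # Extra keyword arguments, if any, would be forwarded to CSVDataReader.
    data = Data.from_csv_file(csv_path)
    print(data.n_individuals, data.n_visits, data.headers)
```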

static from_dataframe(df: DataFrame, **kws) → Data

Create a Data object from a pandas.DataFrame.

Parameters:

df: pandas.DataFrame: Dataframe containing ID, TIME and features.
**kws: Keyword arguments that are sent to DataframeDataReader

Returns:

Data

static from_individual_values(indices: List[str], timepoints: List[List[float]], values: List[List[List[float]]], headers: List[str]) → Data

Construct Data from a collection of individual data points

Parameters:

indices: List[IDType]: List of the individuals’ unique IDs
timepoints: List[List[float]]: For each individual i, the list of timepoints associated with the observations. The number of such timepoints is denoted n_timepoints_i
values: List[array-like[float, 2D]]: For each individual i, a two-dimensional array-like object containing the observed data points. Its expected shape is (n_timepoints_i, n_features)
headers: List[FeatureType]: Feature names. The number of features is denoted n_features

Returns:

Data
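The nested shapes described above can be illustrated with a small sketch. The data values, subject IDs, feature names, and the `check_shapes` helper are all hypothetical and not part of leaspy; the helper only mirrors the documented shape constraints. The final call to `Data.from_individual_values` is guarded so the block runs even without leaspy installed.

```python
# Hypothetical toy data: 2 individuals, 2 features.
headers = ["feature_a", "feature_b"]  # n_features = 2
indices = ["subject_1", "subject_2"]
timepoints = [
    [70.0, 71.5, 73.0],  # subject_1: n_timepoints_1 = 3
    [65.2, 66.0],        # subject_2: n_timepoints_2 = 2
]
values = [
    [[0.10, 0.20], [0.15, 0.30], [0.25, 0.45]],  # shape (3, 2)
    [[0.05, 0.10], [0.08, 0.12]],                # shape (2, 2)
]


def check_shapes(indices, timepoints, values, headers):
    """Hypothetical helper: True if the inputs match the documented shapes."""
    if not (len(indices) == len(timepoints) == len(values)):
        return False
    for times_i, values_i in zip(timepoints, values):
        # One row of observations per timepoint of individual i.
        if len(values_i) != len(times_i):
            return False
        # One value per feature in every row.
        if any(len(row) != len(headers) for row in values_i):
            return False
    return True


shapes_ok = check_shapes(indices, timepoints, values, headers)

try:
    from leaspy.io.data.data import Data
except ImportError:
    Data = None  # leaspy not installed; the shape check above still runs

if Data is not None and shapes_ok:
    data = Data.from_individual_values(indices, timepoints, values, headers)
    print(data.n_individuals, data.n_visits, data.dimension)
```

Running such a check before construction is optional; it simply makes a mismatch between `timepoints` and `values` easier to locate.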

static from_individuals(individuals: List[IndividualData], headers: List[str]) → Data

Construct Data from a list of individuals

Parameters:

individuals: List[IndividualData]: List of individuals
headers: List[FeatureType]: List of feature names

Returns:

Data

load_cofactors(df: DataFrame, *, cofactors: List[str] | None = None) → None

Load cofactors from a pandas.DataFrame to the Data object

Parameters:

df: pandas.DataFrame: The dataframe in which the cofactors are stored. Its index should be ID, the subject identifier, and this index must be unique (i.e. one row per individual).
cofactors: List[FeatureType] or None (default): Names of the column(s) of df to load as cofactors. If None, all columns of the input dataframe are loaded as cofactors.

Raises:

LeaspyDataInputError

property n_individuals: int: Number of individuals

property n_visits: int: Total number of visits

to_dataframe(*, cofactors: List[str] | str | None = None, reset_index: bool = True) → DataFrame

Convert the Data object to a pandas.DataFrame

Parameters:

cofactors: List[FeatureType], 'all', or None (default None): Cofactors to include in the DataFrame. If None (default), no cofactors are included. If 'all', all available cofactors are included.
reset_index: bool (default True): Whether to reset index levels in output.

Returns:

pandas.DataFrame: A DataFrame containing the individuals’ IDs, timepoints, and associated observations, plus the requested cofactors if any.

Raises:

LeaspyDataInputError
LeaspyTypeError
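The sketch below chains `from_dataframe`, `load_cofactors`, and `to_dataframe` on toy data. All records, subject IDs, feature names, and cofactor names are hypothetical. It assumes the visits dataframe may carry ID and TIME as regular columns, and it builds the cofactor dataframe with ID as its index, one row per individual, as the `load_cofactors` description requires. The pandas and leaspy imports are guarded so the block still runs when either is missing.

```python
# Hypothetical toy visits: one record per (individual, timepoint).
visit_records = [
    {"ID": "subject_1", "TIME": 70.0, "feature_a": 0.10, "feature_b": 0.20},
    {"ID": "subject_1", "TIME": 71.5, "feature_a": 0.15, "feature_b": 0.30},
    {"ID": "subject_2", "TIME": 65.2, "feature_a": 0.05, "feature_b": 0.10},
    {"ID": "subject_2", "TIME": 66.0, "feature_a": 0.08, "feature_b": 0.12},
]

# Hypothetical cofactors: exactly one record per individual.
cofactor_records = [
    {"ID": "subject_1", "site": "A", "age_at_baseline": 69.5},
    {"ID": "subject_2", "site": "B", "age_at_baseline": 64.8},
]

# The cofactor index must identify each individual uniquely.
cofactor_ids = [rec["ID"] for rec in cofactor_records]
one_row_per_individual = len(cofactor_ids) == len(set(cofactor_ids))

try:
    import pandas as pd
    from leaspy.io.data.data import Data
except ImportError:
    pd = Data = None  # pandas or leaspy missing; skip the library calls

if Data is not None and one_row_per_individual:
    data = Data.from_dataframe(pd.DataFrame(visit_records))

    # ID becomes the index of the cofactor dataframe.
    cofactors_df = pd.DataFrame(cofactor_records).set_index("ID")
    data.load_cofactors(cofactors_df, cofactors=["site"])  # load one column only

    df_plain = data.to_dataframe()                         # no cofactors
    df_with_cofactors = data.to_dataframe(cofactors="all") # all loaded cofactors
    print(df_with_cofactors.columns.tolist())
```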