leaspy.io.data.data module

class Data

Bases: Iterable

Main data container for a collection of individuals

A Data object can be iterated over and sliced; both operations are applied to the underlying individuals attribute.

Attributes
individuals : Dict[IDType, IndividualData]

Included individuals and their associated data

iter_to_idx : Dict[int, IDType]

Maps an integer index to the associated individual ID

headers : List[FeatureType]

Feature names

dimension : int

Number of features

n_individuals : int

Number of individuals

n_visits : int

Total number of visits

cofactors : List[FeatureType]

Feature names corresponding to cofactors
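To make the relationship between these attributes concrete, here is a minimal sketch of a container that behaves the way the description above suggests. MiniData is a hypothetical stand-in written for illustration, not leaspy's implementation; in particular, returning a new container from a slice and storing each individual as a plain dict are assumptions.

```python
from typing import Dict, Iterator, List, Union


class MiniData:
    """Hypothetical stand-in for Data (illustration only, not leaspy code)."""

    def __init__(self, individuals: Dict[str, dict], headers: List[str]):
        self.individuals = individuals
        self.headers = headers
        # Position in iteration order -> individual ID
        self.iter_to_idx = dict(enumerate(individuals))

    @property
    def dimension(self) -> int:
        return len(self.headers)

    @property
    def n_individuals(self) -> int:
        return len(self.individuals)

    @property
    def n_visits(self) -> int:
        # Total number of visits, summed over all individuals
        return sum(len(ind["timepoints"]) for ind in self.individuals.values())

    def __iter__(self) -> Iterator[dict]:
        # Iterating the container iterates the underlying individuals
        return iter(self.individuals.values())

    def __getitem__(self, key: Union[int, slice]):
        positions = range(self.n_individuals)[key]
        if isinstance(key, int):
            return self.individuals[self.iter_to_idx[positions]]
        # Assumption: a slice yields a new container restricted to those individuals
        ids = [self.iter_to_idx[i] for i in positions]
        return MiniData({i: self.individuals[i] for i in ids}, self.headers)


demo = MiniData(
    {
        "A": {"timepoints": [70.0, 71.5]},
        "B": {"timepoints": [65.2]},
        "C": {"timepoints": [80.0, 81.0, 82.0]},
    },
    headers=["feat_1"],
)
first_two = demo[0:2]
```

Here demo.iter_to_idx maps 0, 1 and 2 to "A", "B" and "C", and first_two keeps only the first two individuals.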

Methods

from_csv_file(path, **kws)

Create a Data object from a CSV file.

from_dataframe(df, **kws)

Create a Data object from a pandas.DataFrame.

from_individual_values(indices, timepoints, ...)

Construct Data from a collection of individual data points

from_individuals(individuals, headers)

Construct Data from a list of individuals

load_cofactors(df, *[, cofactors])

Load cofactors from a pandas.DataFrame to the Data object

to_dataframe(*[, cofactors])

Convert the Data object to a pandas.DataFrame

property cofactors: List[str]

Feature names corresponding to cofactors

property dimension: Optional[int]

Number of features

static from_csv_file(path: str, **kws) -> Data

Create a Data object from a CSV file.

Parameters
path : str

Path to the CSV file to load (with extension)

**kws

Keyword arguments that are sent to CSVDataReader

Returns
Data
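As an illustration, the following sketch writes a small CSV file that could then be passed to from_csv_file. The column layout (ID, TIME, then one column per feature) is an assumption borrowed from the from_dataframe description; the feature names and subject IDs are made up, and the leaspy call itself is left as a comment so the snippet runs without leaspy installed.

```python
import csv
import os
import tempfile

# Hypothetical long-format visits: one row per (individual, visit)
rows = [
    {"ID": "sub-01", "TIME": 70.0, "memory": 0.10, "attention": 0.20},
    {"ID": "sub-01", "TIME": 71.5, "memory": 0.15, "attention": 0.25},
    {"ID": "sub-02", "TIME": 65.2, "memory": 0.30, "attention": 0.40},
]

csv_path = os.path.join(tempfile.mkdtemp(), "visits.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["ID", "TIME", "memory", "attention"])
    writer.writeheader()
    writer.writerows(rows)

# Read the header back to confirm the layout that was written
with open(csv_path, newline="") as f:
    header = next(csv.reader(f))

# With leaspy installed, the file would be loaded as documented above:
# from leaspy.io.data.data import Data
# data = Data.from_csv_file(csv_path)
```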

static from_dataframe(df: DataFrame, **kws) -> Data

Create a Data object from a pandas.DataFrame.

Parameters
df : pandas.DataFrame

Dataframe containing ID, TIME and features.

**kws

Keyword arguments that are sent to DataframeDataReader

Returns
Data
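One way to picture a dataframe "containing ID, TIME and features" is as long-format records that get grouped per individual. The sketch below does that grouping in plain Python; group_by_individual and the sample records are hypothetical, and this is not a description of how DataframeDataReader works internally.

```python
from typing import Dict, List, Tuple


def group_by_individual(
    records: List[dict], headers: List[str]
) -> Tuple[List[str], List[List[float]], List[List[List[float]]]]:
    """Group long-format records (ID, TIME, features) per individual."""
    timepoints: Dict[str, List[float]] = {}
    values: Dict[str, List[List[float]]] = {}
    for rec in records:
        timepoints.setdefault(rec["ID"], []).append(rec["TIME"])
        values.setdefault(rec["ID"], []).append([rec[h] for h in headers])
    indices = list(timepoints)  # IDs in order of first appearance
    return indices, [timepoints[i] for i in indices], [values[i] for i in indices]


records = [
    {"ID": "sub-01", "TIME": 70.0, "memory": 0.10, "attention": 0.20},
    {"ID": "sub-01", "TIME": 71.5, "memory": 0.15, "attention": 0.25},
    {"ID": "sub-02", "TIME": 65.2, "memory": 0.30, "attention": 0.40},
]
indices, timepoints, values = group_by_individual(records, ["memory", "attention"])

# With pandas and leaspy installed, the same records could be loaded as:
# import pandas as pd
# from leaspy.io.data.data import Data
# data = Data.from_dataframe(pd.DataFrame(records))
```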

static from_individual_values(indices: List[str], timepoints: List[List[float]], values: List[List[List[float]]], headers: List[str]) -> Data

Construct Data from a collection of individual data points

Parameters
indices : List[IDType]

Unique IDs of the individuals

timepoints : List[List[float]]

For each individual i, the list of timepoints associated with its observations. The number of such timepoints is denoted n_timepoints_i

values : List[array-like[float, 2D]]

For each individual i, a two-dimensional array-like object containing the observed data points. Its expected shape is (n_timepoints_i, n_features)

headers : List[FeatureType]

Feature names. The number of features is denoted n_features

Returns
Data
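The shape requirements above can be checked before calling from_individual_values. The helper below is a hypothetical pre-flight check written for this page, not part of leaspy; it raises ValueError when the nested lists do not line up and otherwise returns the total number of visits.

```python
from typing import List


def check_individual_values(
    indices: List[str],
    timepoints: List[List[float]],
    values: List[List[List[float]]],
    headers: List[str],
) -> int:
    """Validate the nested shapes; return the total number of visits."""
    n_features = len(headers)
    if not (len(indices) == len(timepoints) == len(values)):
        raise ValueError("indices, timepoints and values must have the same length")
    n_visits = 0
    for idx, times, obs in zip(indices, timepoints, values):
        # One row of observations per timepoint: n_timepoints_i rows
        if len(obs) != len(times):
            raise ValueError(f"{idx}: expected {len(times)} rows, got {len(obs)}")
        # One value per feature in every row: n_features columns
        for row in obs:
            if len(row) != n_features:
                raise ValueError(f"{idx}: expected {n_features} values per row")
        n_visits += len(times)
    return n_visits


indices = ["sub-01", "sub-02"]
timepoints = [[70.0, 71.5], [65.2]]
values = [[[0.10, 0.20], [0.15, 0.25]], [[0.30, 0.40]]]
headers = ["memory", "attention"]
total_visits = check_individual_values(indices, timepoints, values, headers)

# With leaspy installed, the validated inputs would be passed as documented:
# from leaspy.io.data.data import Data
# data = Data.from_individual_values(indices, timepoints, values, headers)
```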

static from_individuals(individuals: List[IndividualData], headers: List[str]) -> Data

Construct Data from a list of individuals

Parameters
individuals : List[IndividualData]

List of individuals

headers : List[FeatureType]

List of feature names

Returns
Data

load_cofactors(df: DataFrame, *, cofactors: Optional[List[str]] = None) -> None

Load cofactors from a pandas.DataFrame to the Data object

Parameters
df : pandas.DataFrame

The dataframe in which the cofactors are stored. Its index should be ID, the subject identifier, and each ID should appear exactly once (i.e. one row per individual).

cofactors : List[FeatureType] or None (default)

Names of the columns of df to load as cofactors. If None, all columns of the input dataframe are loaded as cofactors.

Raises
LeaspyDataInputError
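The input expectations for load_cofactors (one row per individual, optional column selection) can be sketched in plain Python. select_cofactors is a hypothetical helper, the sample table is made up, and ValueError stands in for LeaspyDataInputError, whose exact triggering conditions are not spelled out here; rejecting unknown column names is likewise an assumption.

```python
from typing import Dict, List, Optional


def select_cofactors(
    rows: List[dict], *, cofactors: Optional[List[str]] = None
) -> Dict[str, dict]:
    """Return {ID: {cofactor: value}} from a one-row-per-individual table."""
    ids = [row["ID"] for row in rows]
    if len(set(ids)) != len(ids):
        raise ValueError("each individual must appear in exactly one row")
    available = [col for col in rows[0] if col != "ID"] if rows else []
    # None means: load every non-ID column as a cofactor
    wanted = available if cofactors is None else list(cofactors)
    unknown = [col for col in wanted if col not in available]
    if unknown:
        raise ValueError(f"unknown cofactor column(s): {unknown}")
    return {row["ID"]: {col: row[col] for col in wanted} for row in rows}


table = [
    {"ID": "sub-01", "sex": "F", "site": "A"},
    {"ID": "sub-02", "sex": "M", "site": "B"},
]
all_cofactors = select_cofactors(table)
only_sex = select_cofactors(table, cofactors=["sex"])

# With pandas and leaspy installed, the equivalent documented call would be:
# import pandas as pd
# data.load_cofactors(pd.DataFrame(table).set_index("ID"), cofactors=["sex"])
```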

property n_individuals: int

Number of individuals

property n_visits: int

Total number of visits

to_dataframe(*, cofactors: Optional[Union[List[str], str]] = None) -> DataFrame

Convert the Data object to a pandas.DataFrame

Parameters
cofactors : List[FeatureType], "all", or None (default)

Cofactors to include in the DataFrame. If None (default), no cofactors are included. If "all", all the available cofactors are included.

Returns
pandas.DataFrame

A DataFrame containing each individual's ID, timepoints, and associated observations, plus the requested cofactors, if any.

Raises
LeaspyDataInputError
LeaspyTypeError
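The three accepted forms of the cofactors argument can be summarized with a small resolver. resolve_cofactors is a hypothetical helper, not leaspy code; the built-in TypeError and ValueError stand in for LeaspyTypeError and LeaspyDataInputError, and the choice of which condition raises which error is an assumption.

```python
from typing import List, Optional, Union


def resolve_cofactors(
    cofactors: Optional[Union[List[str], str]], available: List[str]
) -> List[str]:
    """Turn the `cofactors` argument into an explicit list of column names."""
    if cofactors is None:
        return []  # default: no cofactors included
    if isinstance(cofactors, str):
        if cofactors == "all":
            return list(available)  # every available cofactor
        raise ValueError(f"invalid cofactors value: {cofactors!r}")
    if not isinstance(cofactors, list):
        raise TypeError("cofactors must be a list of names, 'all', or None")
    unknown = [name for name in cofactors if name not in available]
    if unknown:
        raise ValueError(f"unknown cofactor(s): {unknown}")
    return list(cofactors)


available = ["sex", "site"]
none_case = resolve_cofactors(None, available)
all_case = resolve_cofactors("all", available)
list_case = resolve_cofactors(["site"], available)
```

With leaspy installed, the corresponding documented calls would be data.to_dataframe(), data.to_dataframe(cofactors="all") and data.to_dataframe(cofactors=["site"]).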