leaspy.io.data.data
===================

.. py:module:: leaspy.io.data.data


Classes
-------

.. autoapisummary::

   leaspy.io.data.data.Data


Module Contents
---------------

.. py:class:: Data

   Bases: :py:obj:`collections.abc.Iterable`

   Main data container for a collection of individuals

   It can be iterated over and sliced; both operations are applied to the underlying `individuals` attribute.

   :Attributes:

       **individuals** : :class:`~leaspy.utils.typing.Dict` [:class:`~leaspy.utils.typing.IDType`, :class:`~leaspy.io.data.individual_data.IndividualData`]
           Included individuals and their associated data

       **iter_to_idx** : :class:`~leaspy.utils.typing.Dict` [:obj:`int`, :class:`~leaspy.utils.typing.IDType`]
           Maps an integer index to the associated individual ID

       **headers** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`]
           Feature names

       **dimension** : :obj:`int`
           Number of features

       **n_individuals** : :obj:`int`
           Number of individuals

       **n_visits** : :obj:`int`
           Total number of visits

       **cofactors** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`]
           Feature names corresponding to cofactors

       **event_time_name** : :obj:`str`
           Name of the header that stores the time at event in the original dataframe

       **event_bool_name** : :obj:`str`
           Name of the header that stores the event indicator (censored or observed) in the original dataframe

   ..
       !! processed by numpydoc !!

   .. py:attribute:: individuals
      :type:  dict[leaspy.utils.typing.IDType, leaspy.io.data.individual_data.IndividualData]


   .. py:attribute:: iter_to_idx
      :type:  dict[int, leaspy.utils.typing.IDType]


   .. py:attribute:: headers
      :type:  Optional[list[leaspy.utils.typing.FeatureType]]
      :value: None


   .. py:attribute:: event_time_name
      :type:  Optional[str]
      :value: None


   .. py:attribute:: event_bool_name
      :type:  Optional[str]
      :value: None


   .. py:attribute:: covariate_names
      :type:  Optional[list[str]]
      :value: None


   .. py:property:: dimension
      :type: Optional[int]

      Number of features

      :Returns:

          :obj:`int` or None:
              Number of features in the dataset. If no features are present, returns None.

      ..
          !! processed by numpydoc !!


   .. py:property:: n_individuals
      :type: int

      Number of individuals

      :Returns:

          :obj:`int`:
              Number of individuals in the dataset.

      ..
          !! processed by numpydoc !!


   .. py:property:: n_visits
      :type: int

      Total number of visits

      :Returns:

          :obj:`int`:
              Total number of visits in the dataset.

      ..
          !! processed by numpydoc !!


   .. py:property:: cofactors
      :type: list[leaspy.utils.typing.FeatureType]

      Feature names corresponding to cofactors

      :Returns:

          :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`]:
              List of feature names corresponding to cofactors.

      ..
          !! processed by numpydoc !!


   .. py:method:: load_cofactors(df, *, cofactors = None)

      Load cofactors from a `pandas.DataFrame` to the `Data` object

      :Parameters:

          **df** : :obj:`pandas.DataFrame`
              The dataframe where the cofactors are stored.
              Its index should be ID, the subject identifier, and it should
              uniquely index the dataframe (i.e. one row per individual).

          **cofactors** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`], optional
              Names of the column(s) of the dataframe to load as cofactors.
              If None, all the columns of the input dataframe are loaded as cofactors.
              Default: None

      ..
          !! processed by numpydoc !!


   .. py:method:: from_csv_file(path, data_type = 'visit', *, pd_read_csv_kws = {}, facto_kws = {}, **df_reader_kws)
      :staticmethod:

      Create a `Data` object from a CSV file.

      :Parameters:

          **path** : :obj:`str`
              Path to the CSV file to load (with extension)

          **data_type** : :obj:`str`
              Type of data to read. Can be 'visit' or 'event'.

          **pd_read_csv_kws** : :obj:`dict`
              Keyword arguments that are sent to :func:`pandas.read_csv`

          **facto_kws** : :obj:`dict`
              Keyword arguments that are sent to :func:`dataframe_data_reader_factory`

          **\*\*df_reader_kws**
              Keyword arguments that are sent to the :class:`~AbstractDataframeDataReader`
              created by :func:`dataframe_data_reader_factory`

      :Returns:

          :class:`~leaspy.io.data.data.Data`:
              A Data object containing the data from the CSV file.

      ..
          !! processed by numpydoc !!


   .. py:method:: to_dataframe(*, cofactors = None, reset_index = True)

      Convert the Data object to a :obj:`pandas.DataFrame`

      :Parameters:

          **cofactors** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`] or :obj:`str`, optional
              Cofactors to include in the DataFrame.
              If None (default), no cofactors are included.
              If "all", all the available cofactors are included.
              Default: None

          **reset_index** : :obj:`bool`, optional
              Whether to reset index levels in output.
              Default: True

      :Returns:

          :obj:`pandas.DataFrame`:
              A DataFrame containing the individuals' IDs, timepoints and associated
              observations (and, optionally, cofactors).

      :Raises:

          :exc:`.LeaspyDataInputError`
              If the Data object does not contain any cofactors.

          :exc:`.LeaspyTypeError`
              If the cofactors argument is not of a valid type.

      ..
          !! processed by numpydoc !!


   .. py:method:: from_dataframe(df, data_type = 'visit', factory_kws = {}, **kws)
      :staticmethod:

      Create a `Data` object from a :class:`~pandas.DataFrame`.

      :Parameters:

          **df** : :obj:`pandas.DataFrame`
              Dataframe containing ID, TIME and features.

          **data_type** : :obj:`str`
              Type of data to read. Can be 'visit', 'event' or 'joint'.

          **factory_kws** : :class:`~leaspy.utils.typing.Dict`
              Keyword arguments that are sent to :func:`.dataframe_data_reader_factory`

          **\*\*kws**
              Keyword arguments that are sent to :class:`~leaspy.utils.typing.DataframeDataReader`

      :Returns:

          :class:`~leaspy.io.data.data.Data`
              ..

      ..
          !! processed by numpydoc !!


   .. py:method:: from_individual_values(indices, timepoints = None, values = None, headers = None, event_time_name = None, event_bool_name = None, event_time = None, event_bool = None, covariate_names = None, covariates = None)
      :staticmethod:

      Construct `Data` from a collection of individual data points

      :Parameters:

          **indices** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.IDType`]
              List of the individuals' unique IDs

          **timepoints** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.List` [:obj:`float`]]
              For each individual ``i``, list of timepoints associated with the observations.
              The number of such timepoints is denoted ``n_timepoints_i``

          **values** : :class:`~leaspy.utils.typing.List` [:obj:`array-like` [:obj:`float`, :obj:`2D`]]
              For each individual ``i``, two-dimensional array-like object containing observed data points.
              Its expected shape is ``(n_timepoints_i, n_features)``

          **headers** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`]
              Feature names.
              The number of features is denoted ``n_features``

      :Returns:

          :class:`~leaspy.io.data.data.Data`:
              A Data object containing the individuals and their data.

      ..
          !! processed by numpydoc !!


   .. py:method:: from_individuals(individuals, headers = None, event_time_name = None, event_bool_name = None, covariate_names = None)
      :staticmethod:

      Construct `Data` from a list of individuals

      :Parameters:

          **individuals** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.io.data.individual_data.IndividualData`]
              List of individuals

          **headers** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`]
              List of feature names

      :Returns:

          :class:`~leaspy.io.data.data.Data`:
              A Data object containing the individuals and their data.

      ..
          !! processed by numpydoc !!


   .. py:method:: extract_longitudinal_only()

      Extract longitudinal data from the Data object

      :Returns:

          :class:`~leaspy.io.data.data.Data`:
              A Data object containing only longitudinal data.

      :Raises:

          :exc:`.LeaspyDataInputError`
              If the Data object does not contain any longitudinal data.

      ..
          !! processed by numpydoc !!
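
The input layout documented for ``Data.from_individual_values`` can be illustrated with a minimal sketch. All subject IDs, feature names and numbers below are invented for illustration, and ``check_shapes`` is a hypothetical helper that is not part of leaspy; it only verifies the documented invariant that ``values[i]`` has shape ``(n_timepoints_i, n_features)``. The final call assumes leaspy is installed and importable from ``leaspy.io.data.data``, so it is guarded and skipped otherwise.

```python
# Toy inputs: all IDs, feature names and numbers are invented for illustration.
indices = ["subject-1", "subject-2"]
timepoints = [[70.0, 71.5, 73.0], [65.2, 66.0]]
values = [
    [[0.10, 0.30], [0.15, 0.35], [0.20, 0.45]],  # shape (3, 2) for subject-1
    [[0.05, 0.20], [0.08, 0.25]],                # shape (2, 2) for subject-2
]
headers = ["feature_a", "feature_b"]


def check_shapes(indices, timepoints, values, headers):
    """Hypothetical helper (not part of leaspy).

    Checks that values[i] has shape (n_timepoints_i, n_features) for every
    individual and returns the total number of visits.
    """
    if not (len(indices) == len(timepoints) == len(values)):
        raise ValueError("indices, timepoints and values must have the same length")
    n_features = len(headers)
    for subject_id, times_i, values_i in zip(indices, timepoints, values):
        if len(values_i) != len(times_i):
            raise ValueError(f"{subject_id}: one row of values is expected per timepoint")
        if any(len(row) != n_features for row in values_i):
            raise ValueError(f"{subject_id}: each row must hold {n_features} features")
    return sum(len(times_i) for times_i in timepoints)


total_visits = check_shapes(indices, timepoints, values, headers)
print(total_visits)  # 5

# Assumption: leaspy is installed; otherwise this part is skipped.
try:
    from leaspy.io.data.data import Data
except ImportError:
    Data = None

if Data is not None:
    data = Data.from_individual_values(indices, timepoints, values, headers)
    print(data.n_individuals, data.n_visits, data.dimension)
```

Validating the shapes before building the container is a design choice of this sketch only; it is meant to surface mismatched inputs early and does not describe how leaspy itself checks them.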