leaspy.io.data.data
===================

.. py:module:: leaspy.io.data.data


Classes
-------

.. autoapisummary::

   leaspy.io.data.data.Data


Module Contents
---------------

.. py:class:: Data

   Bases: :py:obj:`collections.abc.Iterable`


   Main data container for a collection of individuals

   It can be iterated over and sliced, both of these operations being
   applied to the underlying `individuals` attribute.
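
   The sketch below illustrates how iteration and slicing can be delegated to a
   mapping of individuals through an integer-to-ID lookup. ``MiniData`` is a
   hypothetical stand-in written for this example; it is not leaspy's
   implementation, and only the attribute names ``individuals`` and
   ``iter_to_idx`` are taken from this page.

   .. code-block:: python

      class MiniData:
          """Hypothetical stand-in mimicking the documented container behaviour."""

          def __init__(self, individuals):
              # individuals: dict mapping an ID to that individual's data
              self.individuals = dict(individuals)
              # iter_to_idx: integer position -> individual ID
              self.iter_to_idx = dict(enumerate(self.individuals))

          def __iter__(self):
              # Iterate over the individuals in positional order
              return (self.individuals[self.iter_to_idx[i]]
                      for i in range(len(self.iter_to_idx)))

          def __getitem__(self, key):
              if isinstance(key, slice):
                  # Slicing returns a new container restricted to the selected IDs
                  positions = range(len(self.iter_to_idx))[key]
                  return MiniData(
                      {self.iter_to_idx[i]: self.individuals[self.iter_to_idx[i]]
                       for i in positions}
                  )
              if isinstance(key, int):
                  # Integer access goes through the position-to-ID lookup
                  return self.individuals[self.iter_to_idx[key]]
              # Any other key is treated as an individual ID
              return self.individuals[key]


      data = MiniData({"id_a": [0.1, 0.2], "id_b": [0.3], "id_c": [0.4, 0.5]})
      first_two = data[:2]
      ids_in_slice = list(first_two.individuals)
      all_values = list(data)

   In this sketch the slice keeps the first two individuals in insertion order,
   and the positional lookup is rebuilt for the new container.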


   :Attributes:

       **individuals** : :class:`~leaspy.utils.typing.Dict` [:class:`~leaspy.utils.typing.IDType`, :class:`~leaspy.io.data.individual_data.IndividualData`]
           Included individuals and their associated data

       **iter_to_idx** : :class:`~leaspy.utils.typing.Dict` [:obj:`int`, :class:`~leaspy.utils.typing.IDType`]
           Maps an integer index to the associated individual ID

       **headers** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`]
           Feature names

       **dimension** : :obj:`int`
           Number of features

       **n_individuals** : :obj:`int`
           Number of individuals

       **n_visits** : :obj:`int`
           Total number of visits

       **cofactors** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`]
           Feature names corresponding to cofactors

       **event_time_name** : :obj:`str`
           Name of the header that stores the event time in the original dataframe

       **event_bool_name** : :obj:`str`
           Name of the header that stores the event indicator (censored or observed) in the original dataframe


   ..
       !! processed by numpydoc !!

   .. py:attribute:: individuals
      :type:  dict[leaspy.utils.typing.IDType, leaspy.io.data.individual_data.IndividualData]


   .. py:attribute:: iter_to_idx
      :type:  dict[int, leaspy.utils.typing.IDType]


   .. py:attribute:: headers
      :type:  Optional[list[leaspy.utils.typing.FeatureType]]
      :value: None


   .. py:attribute:: event_time_name
      :type:  Optional[str]
      :value: None


   .. py:attribute:: event_bool_name
      :type:  Optional[str]
      :value: None


   .. py:attribute:: covariate_names
      :type:  Optional[list[str]]
      :value: None


   .. py:property:: dimension
      :type: Optional[int]


      Number of features


      :Returns:

          :obj:`int` or None:
              Number of features in the dataset. If no features are present, returns None.


      ..
          !! processed by numpydoc !!


   .. py:property:: n_individuals
      :type: int


      Number of individuals


      :Returns:

          :obj:`int`:
              Number of individuals in the dataset.


      ..
          !! processed by numpydoc !!


   .. py:property:: n_visits
      :type: int


      Total number of visits


      :Returns:

          :obj:`int`:
              Total number of visits in the dataset.


      ..
          !! processed by numpydoc !!


   .. py:property:: cofactors
      :type: list[leaspy.utils.typing.FeatureType]


      Feature names corresponding to cofactors


      :Returns:

          :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`]:
              List of feature names corresponding to cofactors.


      ..
          !! processed by numpydoc !!


   .. py:method:: load_cofactors(df, *, cofactors = None)

      
      Load cofactors from a `pandas.DataFrame` into the `Data` object


      :Parameters:

          **df** : :obj:`pandas.DataFrame`
              The dataframe where the cofactors are stored.
              Its index should be the subject identifier (ID), and it must
              index the dataframe uniquely (i.e. one row per individual).

          **cofactors** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`], optional
              Names of the dataframe column(s) to load as cofactors.
              If None, all the columns from the input dataframe will be loaded as cofactors.
              Default: None
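
      The "one row per individual" requirement can be illustrated without
      pandas. The helper ``one_row_per_individual`` below is hypothetical, and
      the records and the ``sex`` cofactor are made up; it collapses per-visit
      records into a single cofactor row per ID and rejects conflicting values.

      .. code-block:: python

         def one_row_per_individual(records, cofactor_names):
             """Collapse per-visit records into one cofactor row per ID (hypothetical)."""
             table = {}
             for record in records:
                 row = {name: record[name] for name in cofactor_names}
                 # Keep the first row seen for this ID, then compare later ones to it
                 previous = table.setdefault(record["ID"], row)
                 if previous != row:
                     raise ValueError(f"conflicting cofactor values for {record['ID']}")
             return table


         records = [
             {"ID": "id_a", "TIME": 70.0, "sex": "F"},
             {"ID": "id_a", "TIME": 71.5, "sex": "F"},
             {"ID": "id_b", "TIME": 65.2, "sex": "M"},
         ]
         cofactor_table = one_row_per_individual(records, ["sex"])

      The pandas counterpart of such a table would be a dataframe indexed by
      ``ID`` whose index contains no duplicates.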


      ..
          !! processed by numpydoc !!


   .. py:method:: from_csv_file(path, data_type = 'visit', *, pd_read_csv_kws = {}, facto_kws = {}, **df_reader_kws)
      :staticmethod:


      Create a `Data` object from a CSV file.


      :Parameters:

          **path** : :obj:`str`
              Path to the CSV file to load (with extension)

          **data_type** : :obj:`str`
              Type of data to read. Can be 'visit' or 'event'.

          **pd_read_csv_kws** : :obj:`dict`
              Keyword arguments that are sent to :func:`pandas.read_csv`

          **facto_kws** : :obj:`dict`
              Keyword arguments that are sent to :func:`dataframe_data_reader_factory`

          **\*\*df_reader_kws**
              Keyword arguments that are sent to the :class:`~AbstractDataframeDataReader` returned by :func:`dataframe_data_reader_factory`


      :Returns:

          :class:`~leaspy.io.data.data.Data`:
              A Data object containing the data from the CSV file.
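
      As an illustration of the kind of file this method reads, the sketch
      below writes a small long-format CSV using only the standard library. The
      ``ID`` and ``TIME`` columns are assumed to follow the layout documented
      for ``from_dataframe``; the feature names ``feature_1`` and ``feature_2``
      and all values are made up for the example.

      .. code-block:: python

         import csv
         import os
         import tempfile

         fieldnames = ["ID", "TIME", "feature_1", "feature_2"]
         rows = [
             {"ID": "id_a", "TIME": 70.0, "feature_1": 0.10, "feature_2": 0.20},
             {"ID": "id_a", "TIME": 71.5, "feature_1": 0.15, "feature_2": 0.30},
             {"ID": "id_b", "TIME": 65.2, "feature_1": 0.40, "feature_2": 0.35},
         ]

         # Write one row per visit: individual ID, visit time, then feature values
         path = os.path.join(tempfile.mkdtemp(), "visits.csv")
         with open(path, "w", newline="") as f:
             writer = csv.DictWriter(f, fieldnames=fieldnames)
             writer.writeheader()
             writer.writerows(rows)

         # Read the file back to check the header and the number of visits
         with open(path, newline="") as f:
             reader = csv.DictReader(f)
             header = reader.fieldnames
             n_rows = sum(1 for _ in reader)

      Assuming leaspy is installed, such a file could then be loaded with
      ``Data.from_csv_file(path)``.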


      ..
          !! processed by numpydoc !!


   .. py:method:: to_dataframe(*, cofactors = None, reset_index = True)

      
      Convert the Data object to a :obj:`pandas.DataFrame`


      :Parameters:

          **cofactors** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`] or :obj:`str`, optional
              Cofactors to include in the DataFrame.
              If None (default), no cofactors are included.
              If "all", all the available cofactors are included.
              Default: None

          **reset_index** : :obj:`bool`, optional
              Whether to reset index levels in output.
              Default: True


      :Returns:

          :obj:`pandas.DataFrame`:
              A DataFrame containing the individuals' IDs, timepoints,
              associated observations and, optionally, cofactors.


      :Raises:

          :exc:`.LeaspyDataInputError`
              If the Data object does not contain any cofactors.

          :exc:`.LeaspyTypeError`
              If the cofactors argument is not of a valid type.
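
      The three accepted forms of ``cofactors`` can be sketched as a small
      resolution step. ``resolve_cofactors`` is a hypothetical helper, not
      leaspy's code: it uses the built-in ``ValueError`` and ``TypeError`` in
      place of the leaspy exceptions listed above, and the cofactor names
      ``sex`` and ``site`` are made up.

      .. code-block:: python

         def resolve_cofactors(requested, available):
             """Hypothetical helper returning the cofactor names to export."""
             if requested is None:
                 # Default: no cofactor column is added
                 return []
             if requested == "all":
                 # Keyword selecting every available cofactor
                 return list(available)
             if isinstance(requested, list) and all(isinstance(c, str) for c in requested):
                 missing = [c for c in requested if c not in available]
                 if missing:
                     raise ValueError(f"unknown cofactors: {missing}")
                 return list(requested)
             raise TypeError("cofactors must be None, 'all' or a list of names")


         available = ["sex", "site"]
         none_selected = resolve_cofactors(None, available)
         all_selected = resolve_cofactors("all", available)
         some_selected = resolve_cofactors(["site"], available)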


      ..
          !! processed by numpydoc !!


   .. py:method:: from_dataframe(df, data_type = 'visit', factory_kws = {}, **kws)
      :staticmethod:


      Create a `Data` object from a :class:`~pandas.DataFrame`.


      :Parameters:

          **df** : :obj:`pandas.DataFrame`
              Dataframe containing ID, TIME and features.

          **data_type** : :obj:`str`
              Type of data to read. Can be 'visit', 'event' or 'joint'.

          **factory_kws** : :class:`~leaspy.utils.typing.Dict`
              Keyword arguments that are sent to :func:`.dataframe_data_reader_factory`

          **\*\*kws**
              Keyword arguments that are sent to :class:`~leaspy.utils.typing.DataframeDataReader`


      :Returns:

          :class:`~leaspy.io.data.data.Data`:
              A Data object containing the data from the dataframe.


      ..
          !! processed by numpydoc !!


   .. py:method:: from_individual_values(indices, timepoints = None, values = None, headers = None, event_time_name = None, event_bool_name = None, event_time = None, event_bool = None, covariate_names = None, covariates = None)
      :staticmethod:


      Construct `Data` from a collection of individual data points


      :Parameters:

          **indices** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.IDType`]
              List of the individuals' unique ID

          **timepoints** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.List` [:obj:`float`]]
              For each individual ``i``, list of timepoints associated
              with the observations.
              The number of such timepoints is noted ``n_timepoints_i``

          **values** : :class:`~leaspy.utils.typing.List` [:obj:`array-like` [:obj:`float`, :obj:`2D`]]
              For each individual ``i``, two-dimensional array-like object
              containing observed data points.
              Its expected shape is ``(n_timepoints_i, n_features)``

          **headers** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`]
              Feature names.
              The number of features is noted ``n_features``


      :Returns:

          :class:`~leaspy.io.data.data.Data`:
              A Data object containing the individuals and their data.
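
      The expected layout of the inputs can be sketched with plain Python
      lists. The helper ``check_individual_values`` below is hypothetical and
      only mirrors the shape constraints stated above; the IDs, feature names
      and values are made up.

      .. code-block:: python

         def check_individual_values(indices, timepoints, values, headers):
             """Return the total number of visits, or raise if shapes disagree."""
             if not (len(indices) == len(timepoints) == len(values)):
                 raise ValueError("indices, timepoints and values must have the same length")
             if len(set(indices)) != len(indices):
                 raise ValueError("individual IDs must be unique")
             n_features = len(headers)
             for id_, times, obs in zip(indices, timepoints, values):
                 # values for individual i must have shape (n_timepoints_i, n_features)
                 if len(obs) != len(times):
                     raise ValueError(f"{id_}: one row of values is expected per timepoint")
                 if any(len(row) != n_features for row in obs):
                     raise ValueError(f"{id_}: each row must hold {n_features} values")
             return sum(len(times) for times in timepoints)


         indices = ["id_a", "id_b"]
         timepoints = [[70.0, 71.5], [65.2]]
         values = [
             [[0.10, 0.20], [0.15, 0.30]],  # id_a: 2 timepoints x 2 features
             [[0.40, 0.35]],                # id_b: 1 timepoint x 2 features
         ]
         headers = ["feature_1", "feature_2"]

         n_visits = check_individual_values(indices, timepoints, values, headers)

      Inputs laid out this way correspond to the call
      ``Data.from_individual_values(indices, timepoints, values, headers)``.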


      ..
          !! processed by numpydoc !!


   .. py:method:: from_individuals(individuals, headers = None, event_time_name = None, event_bool_name = None, covariate_names = None)
      :staticmethod:


      Construct `Data` from a list of individuals


      :Parameters:

          **individuals** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.io.data.individual_data.IndividualData`]
              List of individuals

          **headers** : :class:`~leaspy.utils.typing.List` [:class:`~leaspy.utils.typing.FeatureType`]
              List of feature names


      :Returns:

          :class:`~leaspy.io.data.data.Data`:
              A Data object containing the individuals and their data.


      ..
          !! processed by numpydoc !!


   .. py:method:: extract_longitudinal_only()

      
      Extract longitudinal data from the Data object


      :Returns:

          :class:`~leaspy.io.data.data.Data`:
              A Data object containing only longitudinal data.


      :Raises:

          :exc:`.LeaspyDataInputError`
              If the Data object does not contain any longitudinal data.


      ..
          !! processed by numpydoc !!