niimpy.reading.mhealth module

Read data from various formats, user entry point.

This module contains various read_* functions, which load data from different formats into pandas.DataFrame objects. As a side effect, it provides the authoritative information on how incoming data is converted to dataframes.

niimpy.reading.mhealth.duration_to_timedelta(df, duration_col)[source]

Format a duration entry in the mHealth format. Duration is a dictionary that contains a value and a unit. The dataframe should contain two columns, DURATION_COL_NAME.value and DURATION_COL_NAME.unit.

Returns a dataframe with a DURATION_COL_NAME column containing a timedelta object. The original columns will be dropped.
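
The conversion described above can be sketched in plain pandas. This is an illustrative sketch, not niimpy's implementation: the helper name duration_sketch and the UNIT_MAP table are hypothetical, and the unit strings it accepts ("ms", "sec", "min", "h", "d") are an assumption about the input data.

```python
import pandas as pd

# Hypothetical mapping from unit strings in the data to pandas unit codes.
# The set of units handled here is an assumption, not taken from niimpy.
UNIT_MAP = {"ms": "ms", "sec": "s", "min": "min", "h": "h", "d": "D"}


def duration_sketch(df, duration_col):
    """Combine `<duration_col>.value` and `<duration_col>.unit` into one
    timedelta column named `<duration_col>`, then drop the two originals."""
    value_col = f"{duration_col}.value"
    unit_col = f"{duration_col}.unit"
    out = df.copy()
    # Convert row by row, because each row may use a different unit.
    out[duration_col] = [
        pd.Timedelta(float(value), unit=UNIT_MAP[unit])
        for value, unit in zip(out[value_col], out[unit_col])
    ]
    return out.drop(columns=[value_col, unit_col])


example = pd.DataFrame({"sleep.value": [8, 30], "sleep.unit": ["h", "min"]})
converted = duration_sketch(example, "sleep")
```

After the call, converted holds a single "sleep" column of timedelta values, and the ".value" and ".unit" columns are gone.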

niimpy.reading.mhealth.format_part_of_day(df, prefix)[source]

Format columns with mHealth formatted part of day. Returns a dataframe with the date stored in the column “date” and the part of day stored in the column “part_of_day”. Options for part of day are “morning”, “afternoon”, “evening” and “night”.

niimpy.reading.mhealth.format_time_interval(df, prefix)[source]

Format a dataframe containing columns in the mHealth time interval format.

A time interval in the mHealth format has either
  • a date and a time of day (morning, afternoon, evening or night), or

  • any two of start time, end time and duration.

In the first case, the formatted dataframe will contain two columns: measure_date and time_of_day.

In the second case, the formatted dataframe will contain two columns: start and end.

Also sets the timestamp to “start” or “date” if available.
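
The "any two of start time, end time and duration" rule amounts to simple interval arithmetic. The sketch below shows that arithmetic on single values using only the standard library. The function name resolve_interval is hypothetical and this is not niimpy's code, which operates on dataframe columns.

```python
from datetime import datetime, timedelta


def resolve_interval(start=None, end=None, duration=None):
    """Return (start, end) given any two of start, end and duration."""
    if start is not None and end is not None:
        return start, end
    if start is not None and duration is not None:
        return start, start + duration
    if end is not None and duration is not None:
        return end - duration, end
    raise ValueError("need at least two of start, end and duration")


# Example: an interval given as an end time and a duration.
wake_up = datetime(2023, 1, 2, 7, 0)
interval = resolve_interval(end=wake_up, duration=timedelta(hours=8))
```

Here interval is the pair (2023-01-01 23:00, 2023-01-02 07:00), which corresponds to the start and end columns described above.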

niimpy.reading.mhealth.geolocation(data_list)[source]

Format the geolocation json data into a niimpy dataframe.

Parameters:
data_list: list of dictionaries

MHealth formatted geolocation data loaded using json.load().

Returns:
data: A pandas.DataFrame containing geolocation data
niimpy.reading.mhealth.geolocation_from_file(filename)[source]

Read mHealth formatted geolocation data from a file and convert it to a Niimpy compatible dataframe.

Parameters:
filename: string

Path to the file containing mhealth formatted geolocation data.

Returns:
data: A pandas.DataFrame containing geolocation data
niimpy.reading.mhealth.heart_rate(data_list)[source]

Format the heart rate json data into a niimpy dataframe.

The dataframe contains the columns
  • heart_rate : Heart rate measurement in beats per minute

  • (optional) time interval columns

  • (optional) descriptive statistics column, a string

  • (optional) temporal relationship to sleep column, a string

  • (optional) temporal relationship to physical activity column, a string

The time columns depend on whether an exact time or a time interval is given. If an exact time is given, only the index is set. If a time interval is given, we set two additional columns

  • start : start time of the measurement interval

  • end : end time of the measurement interval

and set the index to the start time.

The descriptive statistics column describes how the value is calculated over the given time interval. For example, “average” would denote a mean over the time period.

The temporal relationship to sleep is one of “before sleeping”, “during sleep” or “on waking”.

The temporal relationship to physical activity is one of “at rest”, “active”, “before exercise”, “after exercise” or “during exercise”.
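
As a rough illustration of the exact-time case, the sketch below flattens two heart rate records with pandas.json_normalize and indexes them by measurement time. The record layout (the "heart_rate", "effective_time_frame" and "date_time" keys) is an assumption about the input, and the steps shown are not taken from niimpy's implementation.

```python
import pandas as pd

# Assumed shape of the input records; real data may differ.
data_list = [
    {
        "heart_rate": {"value": 70, "unit": "beats/min"},
        "effective_time_frame": {"date_time": "2023-01-01T07:25:00Z"},
        "temporal_relationship_to_physical_activity": "at rest",
    },
    {
        "heart_rate": {"value": 65, "unit": "beats/min"},
        "effective_time_frame": {"date_time": "2023-01-01T07:30:00Z"},
        "temporal_relationship_to_physical_activity": "at rest",
    },
]

# Nested dictionaries become dotted column names such as "heart_rate.value".
df = pd.json_normalize(data_list)
df = df.rename(columns={"heart_rate.value": "heart_rate"})

# With an exact measurement time, only the index is set.
time_col = "effective_time_frame.date_time"
df = df.set_index(pd.to_datetime(df[time_col]))
```

The resulting dataframe has a "heart_rate" column in beats per minute and a timezone-aware datetime index.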

Parameters:
data_list: list of dictionaries

MHealth formatted heart rate data loaded using json.load().

Returns:
data: A pandas.DataFrame containing heart rate data
niimpy.reading.mhealth.heart_rate_from_file(filename)[source]

Read mHealth formatted heart rate data from a file and convert it to a Niimpy compatible dataframe.

The dataframe contains the columns
  • heart_rate : Heart rate measurement in beats per minute

  • (optional) time interval columns

  • (optional) descriptive statistics column, a string

  • (optional) temporal relationship to sleep column, a string

  • (optional) temporal relationship to physical activity column, a string

The time columns depend on whether an exact time or a time interval is given. If an exact time is given, only the index is set. If a time interval is given, we set two additional columns

  • start : start time of the measurement interval

  • end : end time of the measurement interval

and set the index to the start time.

The descriptive statistics column describes how the value is calculated over the given time interval. For example, “average” would denote a mean over the time period.

The temporal relationship to sleep is one of “before sleeping”, “during sleep” or “on waking”.

The temporal relationship to physical activity is one of “at rest”, “active”, “before exercise”, “after exercise” or “during exercise”.

Parameters:
filename: string

Path to the file containing mhealth formatted heart rate data.

Returns:
data: A pandas.DataFrame containing heart rate data
niimpy.reading.mhealth.total_sleep_time(data_list)[source]

Format mHealth total sleep data from json formatted data to a Niimpy compatible DataFrame. The DataFrame contains the columns

  • total_sleep_time : The total sleep time measurement

  • total_sleep_time_unit : The unit the measurement is expressed in

  • measurement interval columns

  • possible descriptive statistics columns

The measurement interval columns are either
  • start : start time of the measurement interval

  • end : end time of the measurement interval

or
  • date : the date of the measurement

  • part_of_day : the time of day the measurement was made

The descriptive statistics columns would be
  • descriptive_statistics : Describes how the measurement is calculated

  • descriptive_statistics_denominator : Time interval the above description refers to.

The dataframe is indexed by “timestamp”, which is either the “start” or the “date”.

Parameters:
data_list: list of dictionaries

MHealth formatted sleep duration data loaded with json.load()

Returns:
data: A pandas.DataFrame containing sleep duration data
niimpy.reading.mhealth.total_sleep_time_from_file(filename)[source]

Read mHealth total sleep time from a file and convert into a Niimpy compatible DataFrame using total_sleep_time. The dataframe contains the columns

  • total_sleep_time : The total sleep time measurement

  • total_sleep_time_unit : The unit the measurement is expressed in

  • measurement interval columns

  • possible descriptive statistics columns

The measurement interval columns are either
  • start : start time of the measurement interval

  • end : end time of the measurement interval

or
  • date : the date of the measurement

  • part_of_day : the time of day the measurement was made

The descriptive statistics columns would be
  • descriptive_statistics : Describes how the measurement is calculated

  • descriptive_statistics_denominator : Time interval the above description refers to.

The dataframe is indexed by “timestamp”, which is either the “start” or the “date”.

Parameters:
filename: string

Path to the file containing mhealth formatted sleep duration data.

Returns:
data: A pandas.DataFrame containing sleep duration data