niimpy.reading.mhealth module

Read data from various formats, user entry point.

This module contains various read_* functions that load data from different formats into pandas.DataFrame objects. As a side effect, it provides the authoritative information on how incoming data is converted to dataframes.

niimpy.reading.mhealth.duration_to_timedelta(df, duration_col)[source]

Format a duration entry in the mHealth format. Duration is a dictionary that contains a value and a unit. The dataframe should contain two columns, DURATION_COL_NAME.value and DURATION_COL_NAME.unit.

Returns a dataframe with a DURATION_COL_NAME column containing a timedelta object. The original columns will be dropped.
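The conversion described above can be illustrated row by row with the standard library. This is a minimal sketch, not niimpy's implementation: the column prefix sleep_duration, the helper duration_to_timedelta_value and the unit table are hypothetical, and the unit strings are assumed to follow the Open mHealth duration units.

```python
from datetime import timedelta

# Assumed subset of mHealth duration units, mapped to timedelta keyword names.
UNIT_TO_KWARG = {
    "ms": "milliseconds",
    "sec": "seconds",
    "min": "minutes",
    "h": "hours",
    "d": "days",
    "wk": "weeks",
}


def duration_to_timedelta_value(value, unit):
    """Convert one (value, unit) pair into a datetime.timedelta."""
    if unit not in UNIT_TO_KWARG:
        raise ValueError(f"unsupported duration unit: {unit!r}")
    return timedelta(**{UNIT_TO_KWARG[unit]: value})


# Hypothetical flattened rows using the PREFIX.value / PREFIX.unit pattern.
rows = [
    {"sleep_duration.value": 7.5, "sleep_duration.unit": "h"},
    {"sleep_duration.value": 45, "sleep_duration.unit": "min"},
]

converted = []
for row in rows:
    row = dict(row)  # work on a copy so the input rows stay untouched
    value = row.pop("sleep_duration.value")  # the original columns are dropped
    unit = row.pop("sleep_duration.unit")
    row["sleep_duration"] = duration_to_timedelta_value(value, unit)
    converted.append(row)
```

Each converted row keeps a single sleep_duration entry holding a timedelta, mirroring the "value and unit in, one timedelta column out" behaviour described above.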

niimpy.reading.mhealth.format_part_of_day(df, prefix)[source]

Format columns with mHealth formatted part of day. Returns a dataframe with the date stored in the column “date” and the part of day stored in the column “part_of_day”. Options for part of day are “morning”, “afternoon”, “evening” and “night”.

niimpy.reading.mhealth.format_time_interval(df, prefix)[source]

Format a dataframe containing columns in the mHealth time interval format.

A time interval in the mHealth format has either

a date and a time of day (morning, afternoon, evening or night), or
any two of start time, end time and duration.

In the first case, the formatted dataframe will contain two columns: measure_date and time_of_day.

In the second case, the formatted dataframe will contain two columns: start and end.

Also sets the timestamp to “start” or “date” if available.
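The second case relies on the fact that any two of start time, end time and duration determine the third. The following is a minimal sketch of that arithmetic using only the standard library. The function resolve_interval is hypothetical and is not part of niimpy.

```python
from datetime import datetime, timedelta


def resolve_interval(start=None, end=None, duration=None):
    """Return (start, end) given any two of start, end and duration."""
    given = sum(x is not None for x in (start, end, duration))
    if given < 2:
        raise ValueError("need at least two of start, end and duration")
    if start is None:
        start = end - duration  # end and duration are both known here
    elif end is None:
        end = start + duration  # start and duration are both known here
    return start, end


start = datetime.fromisoformat("2023-05-01T22:30:00+00:00")
end = datetime.fromisoformat("2023-05-02T06:00:00+00:00")
length = timedelta(hours=7, minutes=30)

from_start = resolve_interval(start=start, duration=length)
from_end = resolve_interval(end=end, duration=length)
from_both = resolve_interval(start=start, end=end)
```

All three calls describe the same interval, so each returns the same (start, end) pair.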

niimpy.reading.mhealth.geolocation(data_list)[source]

Format the geolocation json data into a niimpy dataframe.

Parameters:

data_list: list of dictionaries: mHealth formatted geolocation data loaded using json.load().

Returns:

data: A pandas.DataFrame containing geolocation data

niimpy.reading.mhealth.geolocation_from_file(filename)[source]

Read mHealth formatted geolocation data from a file and convert it to a Niimpy compatible dataframe.

Parameters:

filename: string: Path to the file containing mHealth formatted geolocation data.

Returns:

data: A pandas.DataFrame containing geolocation data
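Since the list-based functions expect data loaded with json.load(), a file-based reader can be thought of as "open, json.load, then format". The sketch below shows that pattern with a hypothetical helper, read_mhealth_file, and an identity formatter standing in for niimpy.reading.mhealth.geolocation. Whether niimpy implements the file variant exactly this way is an assumption, and the record layout is illustrative only.

```python
import json
import os
import tempfile


def read_mhealth_file(filename, formatter):
    """Load a JSON file and pass the resulting list of dicts to formatter."""
    with open(filename, encoding="utf-8") as f:
        data_list = json.load(f)
    return formatter(data_list)


# Hypothetical geolocation records; the field names are illustrative only.
records = [
    {
        "latitude": {"value": 60.1867, "unit": "deg"},
        "longitude": {"value": 24.8277, "unit": "deg"},
    },
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "geolocation.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f)
    # The identity function stands in for the real formatting function.
    loaded = read_mhealth_file(path, formatter=lambda data_list: data_list)
```

Passing the formatter as an argument keeps the file handling separate from the format-specific conversion, which is why one reader can serve several data types.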

niimpy.reading.mhealth.heart_rate(data_list)[source]

Format the heart rate json data into a niimpy dataframe.

The dataframe contains the columns

heart_rate : Heart rate measurement in beats per minute
(optional) time interval columns
(optional) descriptive statistics column, a string
(optional) temporal relationship to sleep column, a string
(optional) temporal relationship to physical activity column, a string

The measurement time is stored either in the index alone or in additional interval columns. If an exact time is given, only the index is set. If a time interval is given, we set two additional columns

start : start time of the measurement interval

end : end time of the measurement interval

and set the index to the start time.

The descriptive statistics column describes how the value is calculated over the given time interval. For example, “average” would denote a mean over the time period.

The temporal relationship to sleep is one of “before sleeping”, “during sleep” or “on waking”.

The temporal relationship to physical activity is one of “at rest”, “active”, “before exercise”, “after exercise” or “during exercise”.
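Because both temporal relationship columns take values from a fixed list, it can be useful to check incoming data against those lists before analysis. The following is a minimal sketch using the values listed above. The helper invalid_values and the sample columns are hypothetical and are not part of niimpy.

```python
# Allowed values as listed in the documentation above.
SLEEP_RELATIONS = {"before sleeping", "during sleep", "on waking"}
ACTIVITY_RELATIONS = {
    "at rest",
    "active",
    "before exercise",
    "after exercise",
    "during exercise",
}


def invalid_values(values, allowed):
    """Return the sorted distinct values that are neither missing nor allowed."""
    return sorted({v for v in values if v is not None and v not in allowed})


# Hypothetical column contents; None marks a missing optional value.
sleep_column = ["during sleep", None, "on waking", "asleep"]
activity_column = ["at rest", "during exercise", None]

bad_sleep = invalid_values(sleep_column, SLEEP_RELATIONS)
bad_activity = invalid_values(activity_column, ACTIVITY_RELATIONS)
```

Missing values are skipped because both columns are optional. Only unexpected strings are reported.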

Parameters:

data_list: list of dictionaries: mHealth formatted heart rate data loaded using json.load().

Returns:

data: A pandas.DataFrame containing heart rate data

niimpy.reading.mhealth.heart_rate_from_file(filename)[source]

Read mHealth formatted heart rate data from a file and convert it to a Niimpy compatible dataframe.

The dataframe contains the columns

heart_rate : Heart rate measurement in beats per minute
(optional) time interval columns
(optional) descriptive statistics column, a string
(optional) temporal relationship to sleep column, a string
(optional) temporal relationship to physical activity column, a string

The measurement time is stored either in the index alone or in additional interval columns. If an exact time is given, only the index is set. If a time interval is given, we set two additional columns

start : start time of the measurement interval

end : end time of the measurement interval

and set the index to the start time.

The descriptive statistics column describes how the value is calculated over the given time interval. For example, “average” would denote a mean over the time period.

The temporal relationship to sleep is one of “before sleeping”, “during sleep” or “on waking”.

The temporal relationship to physical activity is one of “at rest”, “active”, “before exercise”, “after exercise” or “during exercise”.

Parameters:

filename: string: Path to the file containing mHealth formatted heart rate data.

Returns:

data: A pandas.DataFrame containing heart rate data

niimpy.reading.mhealth.total_sleep_time(data_list)[source]

Format mHealth total sleep data from json formatted data to a Niimpy compatible DataFrame. The DataFrame contains the columns

total_sleep_time : The total sleep time measurement

total_sleep_time_unit : The unit the measurement is expressed in

measurement interval columns

possible descriptive statistics columns

The measurement interval columns are either

start : start time of the measurement interval
end : end time of the measurement interval

or

date : the date of the measurement
part_of_day : the time of day the measurement was made

The descriptive statistics columns would be

descriptive_statistics : Describes how the measurement is calculated
descriptive_statistics_denominator : Time interval the above description refers to.

The dataframe is indexed by “timestamp”, which is either the “start” or the “date”.
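The choice of index value can be sketched per row: use “start” when the row describes an interval, otherwise fall back to “date”. This is a minimal illustration with plain dictionaries. The helper pick_timestamp, the sample rows and the precedence of “start” over “date” are assumptions, not niimpy's implementation.

```python
from datetime import date, datetime


def pick_timestamp(row):
    """Return row['start'] if present, otherwise row['date']."""
    for key in ("start", "date"):
        if row.get(key) is not None:
            return row[key]
    raise KeyError("row has neither 'start' nor 'date'")


# Hypothetical rows: one with a start/end interval, one with date/part_of_day.
rows = [
    {
        "total_sleep_time": 7.5,
        "total_sleep_time_unit": "h",
        "start": datetime(2023, 5, 1, 22, 30),
        "end": datetime(2023, 5, 2, 6, 0),
    },
    {
        "total_sleep_time": 6.0,
        "total_sleep_time_unit": "h",
        "date": date(2023, 5, 3),
        "part_of_day": "night",
    },
]

timestamps = [pick_timestamp(row) for row in rows]
```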

Parameters:

data_list: list of dictionaries: mHealth formatted sleep duration data loaded using json.load().

Returns:

data: A pandas.DataFrame containing sleep duration data

niimpy.reading.mhealth.total_sleep_time_from_file(filename)[source]

Read mHealth total sleep time from a file and convert into a Niimpy compatible DataFrame using total_sleep_time. The dataframe contains the columns

total_sleep_time : The total sleep time measurement

total_sleep_time_unit : The unit the measurement is expressed in

measurement interval columns

possible descriptive statistics columns

The measurement interval columns are either

start : start time of the measurement interval
end : end time of the measurement interval

or

date : the date of the measurement
part_of_day : the time of day the measurement was made

The descriptive statistics columns would be

descriptive_statistics : Describes how the measurement is calculated
descriptive_statistics_denominator : Time interval the above description refers to.

The dataframe is indexed by “timestamp”, which is either the “start” or the “date”.

Parameters:

filename: string: Path to the file containing mHealth formatted sleep duration data.

Returns:

data: A pandas.DataFrame containing sleep duration data