niimpy.preprocessing.tracker module

niimpy.preprocessing.tracker.extract_features_tracker(df, features=None)

This function computes and organizes the selected features for tracker data recorded using Polar Ignite.

The complete list of features that can be calculated is: tracker_daily_step_distribution

Parameters:
df: pandas.DataFrame

Input data frame

features: dict, optional

Dictionary whose keys are the names of the features to compute. The value for each key is the list of parameters passed to the corresponding function. If None is given, all features are computed.

Returns:
result: pandas.DataFrame

Dataframe containing the computed features

niimpy.preprocessing.tracker.group_data(df, columns=['user', 'device'])

Group the dataframe by the columns given in columns. The default is the standard set of grouping columns, user and device.

niimpy.preprocessing.tracker.reset_groups(df, columns=['user', 'device'])

Reset the grouping: move the grouping columns given in columns (by default user and device) from the index back into regular columns.
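As a rough illustration of what grouping by user and device involves, the following plain-pandas sketch groups a small made-up frame, aggregates within each group, and then turns the group keys back into ordinary columns. The data and the steps column are assumptions for the example, and the sketch does not call niimpy itself.

```python
import pandas as pd

# Hypothetical tracker rows; only "user" and "device" come from the
# documented defaults, the "steps" column is an assumption.
df = pd.DataFrame(
    {
        "user": ["u1", "u1", "u2", "u2"],
        "device": ["d1", "d1", "d2", "d2"],
        "steps": [100, 200, 50, 150],
    }
)

# Group by the standard columns and aggregate within each group.
grouped = df.groupby(["user", "device"])
totals = grouped["steps"].sum()  # indexed by (user, device)

# Move the group keys from the index back into regular columns.
flat = totals.reset_index()
print(flat)
```

After the aggregation the group keys live in the index, which is why a reset step is useful before further column-based processing.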

niimpy.preprocessing.tracker.step_summary(df, config={})

Return a summary of the step count in a time range. The summary includes the following statistics of the step count per day: mean, standard deviation, minimum, and maximum.

Parameters:
df: pandas DataFrame

Dataframe containing the hourly step count of an individual. The dataframe must have a datetime index.

config: dict

Dictionary of optional arguments. The keys can be:

value_col: str, optional

Name of the column containing the step values. Default value is "values".

user_id: list, optional

List of user IDs. If not given, the summary is returned for all users.

start_date: string, optional

Start date of the time segment used for computing the summary. If not given, the summary is computed over the whole time range.

end_date: string, optional

End date of the time segment used for computing the summary. If not given, the summary is computed over the whole time range.

Returns:
summary_df: pandas DataFrame

A dataframe containing user id and associated step summary.
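The kind of summary described above can be sketched in plain pandas. This is a minimal illustration under stated assumptions, not niimpy's implementation: it assumes hourly counts in a "values" column, a "user" column, a datetime index, and that the per-day figure is the daily total of the hourly counts. The data are made up.

```python
import pandas as pd

# Three days of hypothetical hourly step counts for one user.
index = pd.date_range("2021-07-01", periods=72, freq="h")
values = [10] * 24 + [20] * 24 + [30] * 24
df = pd.DataFrame({"user": "u1", "values": values}, index=index)

# Optional time segment, analogous to start_date / end_date.
segment = df.loc["2021-07-01":"2021-07-03"]

# Daily step totals per user (assumption: per-day count = sum of hourly counts).
daily = segment.groupby(["user", pd.Grouper(freq="D")])["values"].sum()

# Mean, standard deviation, min and max of the daily totals for each user.
summary_df = daily.groupby(level="user").agg(["mean", "std", "min", "max"])
print(summary_df)
```

Slicing a sorted datetime index with date strings includes the whole end day, so the segment here covers all three days.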

niimpy.preprocessing.tracker.tracker_step_distribution(steps_df, config={})

Return the distribution of steps within a time range. The step counts are resampled according to the frequency rule in resample_args. Each resampled count is then divided by the total number of steps in a larger time frame, given by the timeframe argument.

Using default parameters produces a daily step distribution.

Parameters:
steps_df: pandas DataFrame

Dataframe containing the step data of each individual.

config: dict

Dictionary of optional arguments. The keys can be:

steps_column: str, optional

Name of the column containing the step values. Default value is "steps".

resample_args: dict, optional

Dictionary containing the resample arguments. Default value is {'rule': 'h'}.

timeframe: string, optional

Time frame used for computing the distribution. Default value is 'D'.

Returns:
df: pandas DataFrame

A dataframe containing the distribution of step count.
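The arithmetic behind a daily step distribution can be sketched in plain pandas as follows. This is an illustrative sketch, not the library's code: it assumes a single user, made-up readings, an hourly resampling rule ('h'), and a daily time frame ('D'), matching the documented defaults.

```python
import pandas as pd

# Hypothetical step readings for one user over two days.
index = pd.to_datetime(
    [
        "2021-07-01 08:10",
        "2021-07-01 08:40",
        "2021-07-01 12:05",
        "2021-07-02 09:15",
        "2021-07-02 18:30",
    ]
)
steps = pd.Series([100, 300, 600, 250, 750], index=index, name="steps")

# Resample to hourly totals (the role of resample_args, here rule 'h').
hourly = steps.resample("h").sum()

# Total steps in the larger time frame (here each calendar day, 'D'),
# broadcast back onto every hourly row.
daily_total = hourly.groupby(hourly.index.floor("D")).transform("sum")

# Share of each day's steps that falls in each hour.
distribution = hourly / daily_total
print(distribution[distribution > 0])
```

Within each day the hourly shares sum to 1, which makes days with very different activity levels comparable.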