{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "d4d5d7e4",
   "metadata": {},
   "source": [
    "# Tracker Data"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "5d4a7163",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "\n",
    "Fitness tracker is a rich source of longitudinal data captured at high frequency. Those can include step counts, heart rate, calories expenditure, or sleep time. This notebook explains how we can use `niimpy` to extract some basic statistic and features from step count data.\n",
    "\n",
    "A dataframe with fittness data should contain the following columns (column names can be different, but in that case they must be provided as parameters):\n",
    "- `user`: Subject ID\n",
    "- `device`: Device ID\n",
    "- `steps`: Number of steps measured on the time interval\n",
    "\n",
    "As usual, the index should be the time of the measurements. Step count is calculated between that time and the previous timestamp."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "d802fffa",
   "metadata": {},
   "source": [
    "## Read data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "b170d597",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import niimpy.preprocessing.tracker as tracker\n",
    "from niimpy import config\n",
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "7a13bbd4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(73, 4)"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data = pd.read_csv(config.STEP_SUMMARY_PATH, index_col=0)\n",
    "# Converting the index as date\n",
    "data.index = pd.to_datetime(data.index)\n",
    "data.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "245e4af7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user</th>\n",
       "      <th>date</th>\n",
       "      <th>time</th>\n",
       "      <th>steps</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2021-07-01 00:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>2021-07-01</td>\n",
       "      <td>00:00:00.000</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-01 01:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>2021-07-01</td>\n",
       "      <td>01:00:00.000</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-01 02:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>2021-07-01</td>\n",
       "      <td>02:00:00.000</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-01 03:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>2021-07-01</td>\n",
       "      <td>03:00:00.000</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-01 04:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>2021-07-01</td>\n",
       "      <td>04:00:00.000</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                         user        date          time  steps\n",
       "2021-07-01 00:00:00  wiam9xme  2021-07-01  00:00:00.000      0\n",
       "2021-07-01 01:00:00  wiam9xme  2021-07-01  01:00:00.000      0\n",
       "2021-07-01 02:00:00  wiam9xme  2021-07-01  02:00:00.000      0\n",
       "2021-07-01 03:00:00  wiam9xme  2021-07-01  03:00:00.000      0\n",
       "2021-07-01 04:00:00  wiam9xme  2021-07-01  04:00:00.000      0"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.head()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "d3141ebc",
   "metadata": {},
   "source": [
    "## Getting basic statistics\n",
    "\n",
    "Using `niimpy` we can extract a user's step count statistic within a time window. The statistics include:\n",
    "\n",
    "- `mean`: average number of steps taken within the time range\n",
    "- `standard deviation`: standard deviation of steps \n",
    "- `max`: max steps taken within a day during the time range\n",
    "- `min`: min steps taken within a day during the time range\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "c86865e6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>min_sum_step</th>\n",
       "      <th>max_sum_step</th>\n",
       "      <th>std_sum_step</th>\n",
       "      <th>avg_sum_step</th>\n",
       "      <th>median_sum_step</th>\n",
       "      <th>user</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>5616</td>\n",
       "      <td>13025</td>\n",
       "      <td>3352.347745</td>\n",
       "      <td>8437.383562</td>\n",
       "      <td>6480.0</td>\n",
       "      <td>wiam9xme</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   min_sum_step  max_sum_step  std_sum_step  avg_sum_step  median_sum_step  \\\n",
       "0          5616         13025   3352.347745   8437.383562           6480.0   \n",
       "\n",
       "       user  \n",
       "0  wiam9xme  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tracker.step_summary(data, value_col = 'steps')"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "7c6805a0",
   "metadata": {},
   "source": [
    "## Feature extraction\n",
    "\n",
    "Assuming that the step count comes in at hourly resolution, we can compute the distribution of daily step count at each hour. The daily distribution is helpful to look at if for example, we want to see at what hours a user is most active at."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "6cf2639c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{<function tracker_step_distribution at 0x7a2e3a9e6fc0>: {}} {}\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user</th>\n",
       "      <th>step_distribution</th>\n",
       "      <th>step_sum</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2021-07-01 00:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>5616.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-01 01:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>5616.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-01 02:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>5616.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-01 03:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>5616.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-01 04:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>5616.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-03 19:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>0.025162</td>\n",
       "      <td>12002.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-03 20:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>0.001000</td>\n",
       "      <td>12002.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-03 21:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>0.029495</td>\n",
       "      <td>12002.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-03 22:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>12002.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021-07-03 23:00:00</th>\n",
       "      <td>wiam9xme</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>12002.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>72 rows × 3 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                         user  step_distribution  step_sum\n",
       "2021-07-01 00:00:00  wiam9xme           0.000000    5616.0\n",
       "2021-07-01 01:00:00  wiam9xme           0.000000    5616.0\n",
       "2021-07-01 02:00:00  wiam9xme           0.000000    5616.0\n",
       "2021-07-01 03:00:00  wiam9xme           0.000000    5616.0\n",
       "2021-07-01 04:00:00  wiam9xme           0.000000    5616.0\n",
       "...                       ...                ...       ...\n",
       "2021-07-03 19:00:00  wiam9xme           0.025162   12002.0\n",
       "2021-07-03 20:00:00  wiam9xme           0.001000   12002.0\n",
       "2021-07-03 21:00:00  wiam9xme           0.029495   12002.0\n",
       "2021-07-03 22:00:00  wiam9xme           0.000000   12002.0\n",
       "2021-07-03 23:00:00  wiam9xme           0.000000   12002.0\n",
       "\n",
       "[72 rows x 3 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f = tracker.tracker_step_distribution\n",
    "step_distribution = tracker.extract_features_tracker(data, features={f: {}})\n",
    "step_distribution"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "niimpy",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}