niimpy.reading.database module
Read data from sqlite3 databases.
Direct use of this module is mostly deprecated.
Read data from sqlite3 databases, either directly into pandas.DataFrame objects (Database.raw(), among other functions) or through Database objects. The Database object does not load data immediately; instead it provides methods that load data on demand, optionally filtering and preprocessing it at the loading stage. This can save memory and processing time, but is much more complex.
This module is now mostly unused: read.read_sqlite is used instead, which wraps the .raw() method and reads all data into memory.
Database format
When reading data, a table name must be specified (which allows multiple datasets to be stored in one file). Table column names map to dataframe column names, with various standard processing (for example, the ‘time’ column is converted to the index).
Quick usage
db = database.open(FILE_NAME, tz=TZ)
df = db.raw(TABLE_NAME, user=database.ALL)
Recommended usage:
df = niimpy.read_sqlite(FILE_NAME, TABLE_NAME, tz=TZ)
See also
niimpy.reading.read_*: currently recommended functions to access all types of data, including databases.
- class niimpy.reading.database.Data1(db, tz=None)[source]
Bases: object
Database wrapper for niimpy data.
This opens a database and provides methods to do common operations.
Methods
count
(*args, **kwargs)Return the number of rows
execute
(*args, **kwargs)Execute raw SQL code.
exists
(*args, **kwargs)Returns True if any data exists
first
(table, user[, start, end, offset, ...])Return earliest data point.
get_survey_score
(table, user, survey[, ...])Get the survey results, summing scores.
last
(*args, **kwargs)Return the latest timestamp.
raw
(table, user[, limit, offset, start, end])Read all data in a table and return it as a DataFrame.
tables
()List all tables that are inside of this database.
user_table_counts
()Return table of number of data points per user, per table.
users
([table])Return set of all users in all tables
validate_username
(user)Validate a username, for single/multiuser database and so on.
hourly
occurrence
timestamps
- execute(*args, **kwargs)[source]
Execute raw SQL code.
Execute raw SQL. This simply proxies all arguments to self.conn.execute() and exists as a convenience shortcut.
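The proxying described above can be sketched with the standard library alone. This is a minimal illustration, not niimpy's implementation: MiniDatabase and the demo table are hypothetical names, and only the forwarding to self.conn.execute() comes from the description.

```python
import sqlite3


class MiniDatabase:
    """Hypothetical stand-in for the wrapper class (not niimpy's own)."""

    def __init__(self, path):
        # Keep the sqlite3 connection on self.conn, as the description implies.
        self.conn = sqlite3.connect(path)

    def execute(self, *args, **kwargs):
        # Forward every argument unchanged to the underlying connection.
        return self.conn.execute(*args, **kwargs)


db = MiniDatabase(":memory:")
db.execute("CREATE TABLE demo (time REAL, user TEXT)")  # illustrative schema
db.execute("INSERT INTO demo VALUES (?, ?)", (1.0, "u1"))
rows = db.execute("SELECT time, user FROM demo").fetchall()
```

Because the return value is the cursor from sqlite3, the usual fetchone() and fetchall() calls work on it directly.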
- exists(*args, **kwargs)[source]
Returns True if any data exists
Follows the same syntax as .first(), .last(), and .count(), but the limit argument is not used.
- first(table, user, start=None, end=None, offset=None, _aggregate='min', _limit=None)[source]
Return earliest data point.
Return None if there is no data.
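One way to picture this behaviour is a single aggregate query over the time column. The sketch below is an assumption-laden illustration rather than niimpy's code: the helper name first_time, the ‘time’ and ‘user’ column names, and the half-open [start, end) range are all hypothetical choices.

```python
import sqlite3


def first_time(conn, table, user, start=None, end=None, aggregate="min"):
    """Hypothetical helper: aggregate the 'time' column for one user."""
    # Table and aggregate names are interpolated into the SQL text, so
    # restrict them to safe values before building the query.
    if aggregate not in ("min", "max", "count"):
        raise ValueError("unsupported aggregate")
    if not table.isidentifier():
        raise ValueError("unsafe table name")
    where = ["user = ?"]
    params = [user]
    if start is not None:
        where.append("time >= ?")
        params.append(start)
    if end is not None:
        where.append("time < ?")  # half-open range: an assumption
        params.append(end)
    sql = f'SELECT {aggregate}(time) FROM "{table}" WHERE ' + " AND ".join(where)
    (value,) = conn.execute(sql, params).fetchone()
    return value  # min()/max() over zero rows yield NULL, i.e. None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (time REAL, user TEXT)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    [(10.0, "u1"), (5.0, "u1"), (7.0, "u2")],
)
earliest = first_time(conn, "demo", "u1")
latest = first_time(conn, "demo", "u1", aggregate="max")
missing = first_time(conn, "demo", "nobody")
```

Swapping the aggregate is what lets one query shape serve the earliest, latest, and count variants in this sketch.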
- get_survey_score(table, user, survey, limit=None, start=None, end=None)[source]
Get the survey results, summing scores.
survey: The survey prefix in the ‘id’ column, e.g. ‘PHQ9’. An ‘_’ is appended.
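The role of the appended ‘_’ can be shown with a small standard-library sketch. It is a simplification under stated assumptions: survey_score is a hypothetical helper, the ‘answer’ column holding numeric scores is assumed, and all matching rows for the user are summed into one total, which may differ from how niimpy groups results.

```python
import sqlite3


def survey_score(conn, table, user, survey):
    """Hypothetical helper: sum answers whose id starts with '<survey>_'."""
    if not table.isidentifier():
        raise ValueError("unsafe table name")
    prefix = survey + "_"
    # substr() does an exact prefix comparison; LIKE would treat '_' as a
    # single-character wildcard and match too much.
    sql = (
        f'SELECT sum(answer) FROM "{table}" '
        "WHERE user = ? AND substr(id, 1, ?) = ?"
    )
    (total,) = conn.execute(sql, (user, len(prefix), prefix)).fetchone()
    return total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (user TEXT, id TEXT, answer INTEGER)")
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?)",
    [
        ("u1", "PHQ9_1", 2),
        ("u1", "PHQ9_2", 1),
        ("u1", "PHQ90_1", 5),  # different prefix once '_' is appended
        ("u1", "GAD7_1", 3),
    ],
)
total = survey_score(conn, "surveys", "u1", "PHQ9")
```

Appending the underscore before matching keeps a survey such as ‘PHQ90’ from being counted under ‘PHQ9’.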
- raw(table, user, limit=None, offset=None, start=None, end=None)[source]
Read all data in a table and return it as a DataFrame.
This reads all data (subject to several possible filters) and returns it as a DataFrame.
- user_table_counts()[source]
Return table of number of data points per user, per table.
Return a dataframe of row=table, column=user, value=number of counts of that user in that table.
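The row=table, column=user layout can be approximated without pandas as a nested dictionary. This is only a sketch: table_user_counts is a hypothetical helper, and it assumes every table has a ‘user’ column.

```python
import sqlite3


def table_user_counts(conn):
    """Hypothetical helper: {table: {user: row count}} for every table."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    counts = {}
    for table in tables:
        # Table names come from sqlite_master; quote them as identifiers.
        cursor = conn.execute(
            f'SELECT user, count(*) FROM "{table}" GROUP BY user'
        )
        counts[table] = dict(cursor.fetchall())
    return counts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (time REAL, user TEXT)")
conn.execute("CREATE TABLE surveys (time REAL, user TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)", [(1.0, "u1"), (2.0, "u1"), (3.0, "u2")]
)
conn.execute("INSERT INTO surveys VALUES (?, ?)", (1.0, "u1"))
counts = table_user_counts(conn)
```

If pandas is available, pandas.DataFrame.from_dict(counts, orient="index") should turn this mapping into a frame with one row per table and one column per user.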
- validate_username(user)[source]
Validate a username, for single/multiuser database and so on.
This function considers if the database is single or multi-user, and ensures a valid username or ALL.
It returns a valid username, so can be used as a wrapper, to handle future special cases, e.g.:
user = db.validate_username(user)
- class niimpy.reading.database.sqlite3_stdev[source]
Bases: object
Sqlite sample standard deviation function in pure Python.
With conn.create_aggregate("stdev", 1, sqlite3_stdev), this adds a stdev function to sqlite.
Edge cases:
Empty list = nan (different than C function, which is zero)
Ignores nan input values (does not count them). (different than numpy: returns nan)
Ignores non-numeric types (no conversion)
Methods
finalize
step
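A pure-Python aggregate of this kind needs only a step method and a finalize method. The sketch below is not niimpy's class: SampleStdev is a hypothetical re-implementation of the documented edge cases, and returning nan for a single value is an extra assumption the description does not cover.

```python
import math
import sqlite3


class SampleStdev:
    """Hypothetical sample standard deviation aggregate for sqlite3."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Skip non-numeric inputs (text, NULL/None) without converting them.
        if not isinstance(value, (int, float)):
            return
        # Skip nan inputs instead of letting them poison the result.
        if math.isnan(value):
            return
        self.values.append(value)

    def finalize(self):
        n = len(self.values)
        if n < 2:
            # Empty input gives nan; n == 1 giving nan is an assumption.
            return float("nan")
        mean = sum(self.values) / n
        variance = sum((v - mean) ** 2 for v in self.values) / (n - 1)
        return math.sqrt(variance)


conn = sqlite3.connect(":memory:")
conn.create_aggregate("stdev", 1, SampleStdev)
conn.execute("CREATE TABLE samples (value)")
conn.executemany(
    "INSERT INTO samples VALUES (?)",
    [(2,), (4,), (4,), (4,), (5,), (5,), (7,), (9,), ("x",), (None,)],
)
(result,) = conn.execute("SELECT stdev(value) FROM samples").fetchone()
```

sqlite3 creates a fresh instance of the class for each aggregation, calls step once per row, and calls finalize once at the end, so the list of values never leaks between queries.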