Contribution Guide

General principles

Niimpy is an open source project and general open source contribution guidelines apply.

Contributions are welcome and encouraged.

  • You don’t need to be perfect. Suggest what you can and we will help it improve.

Communication

We use GitHub issues and pull requests for communication. - Before you start, create an issue and describe your contribution. - This is the primary discussion channel about your contribution and allows us to plan effectively. - When accepted, you agree to publish your contribution under the MIT license. The full license text can be found in the “LICENSE” file in the project root

Bugs

  • If you find a bug, problem or potential enhancement, let us know in an Issue on the Niimpy GitHub page.

  • Before sinking time into fixing the issue or improving Niimpy, discuss with us on the Issue. This ensures that no-one else is working on it and that we can help you with the process.

  • To suggest a change, preferably fork the repository and create a pull request.

New Functionality

Function inputs

  • Each function should accept two parameters: a dataframe and an optional dictionary containing other arguments.

  • Column names should be passed in the optional arguments dictionary. They can have default values, but these should be adjustable.

  • Group by ‘user’ and ‘device’ columns if they are present in the input

  • You should always use the DataFrame index to retrieve data/time values, not the datetime column (which is a convenience thing but not guaranteed to be there).

  • Don’t fail if there are extra columns passed (or missing some non-essential columns). Look at what columns/data is passed and and use that, but don’t do anything unexpected if someone makes a mistake with input data

Function outputs

  • Have any times returned in the index (unless each row needs multiple times, then do what you need)

  • Resample by the time index, using given resample arguments (or a default value).

other

  • Please add documentation to each new function using docstrings. This should include enough description so that someone else can understand and reproduce all relevant features - enough to describe the method for a scientific article.

  • Please add unit tests which test each relevant feature (and each claimed method feature) with a minimal example. Each function can have multiple tests. For examples of unit tests, see below or niimpy/test_screen.py. You can create some sample data within each test module which can be used both during development and for tests.

  • The Zen of Python is always good advice

Improving old functions

  • Add tests for existing functionality

    • For every functionality Niimpy claims, there should be a minimal test for it.

  • Don’t fail if unnecessary columns are not there (don’t drop unneeded columns, select only the needed ones).

  • Use the index, not the datetime column.

  • Improve the docstring of the function: we use the numpydoc format

  • Add a documentation page for each sensor, document each function and include an example.

  • Document what parameters the function groups by when analyzing

    • For example an ideal case is that any ‘user’ and ‘device’ columns are grouped by in the final output.

Example unit test

You can read about testing in general in the CodeRefinery testing lesson.

First you would define some sample data. You could reuse existing data (or data from niimpy.sampledata), but if data is reused too much it can become harder to edit existing tests. Do share data when possible but split it when it’s relevant.

@pytest.fixture
def screen1():
    return niimpy.read_csv(io.StringIO("""\
time,screen_status
0,1
60,0
"""))

Then you can make a test function:

def test_screen_off(screen1):
    off = niimpy.preprocess.screen_off(screen1)
    assert pd.Timestamp(60,  unit='s', tz=TZ) in off.index

The assert is the actual test: if the statement after assert is false, the test fails. You can have multiple asserts in a function to test multiple things. When something fails, pytest will provide much more useful error messages than you might expect.

You run tests with pytest or pytest tests/preprocessing/test_screen.py. You can limit to certain tests with -k and engage a debugger on errors with --pdb.

Documentation notes

  • You can use Jupyter or ReST. ReST is better for narrative documentation.