Contribution Guide
General principles
Niimpy is an open source project and general open source contribution guidelines apply.
Contributions are welcome and encouraged.
You don’t need to be perfect. Suggest what you can and we will help improve it.
Communication
We use GitHub issues and pull requests for communication.
- Before you start, create an issue and describe your contribution.
- The issue is the primary channel for discussing your contribution and allows us to plan effectively.
- When your contribution is accepted, you agree to publish it under the MIT license. The full license text can be found in the “LICENSE” file in the project root.
Bugs
If you find a bug, problem or potential enhancement, let us know in an Issue on the Niimpy GitHub page.
Before sinking time into fixing the issue or improving Niimpy, discuss it with us in the Issue. This ensures that no one else is working on it and that we can help you with the process.
To suggest a change, preferably fork the repository and create a pull request.
New Functionality
Function inputs
Each function should accept two parameters: a dataframe and an optional dictionary containing other arguments.
Column names should be passed in the optional arguments dictionary. They can have default values, but these should be adjustable.
Group by ‘user’ and ‘device’ columns if they are present in the input.
Always use the DataFrame index to retrieve date/time values, not the datetime column (which is a convenience but not guaranteed to be there).
Don’t fail if extra columns are passed (or some non-essential columns are missing). Look at what columns/data are passed and use that, but don’t do anything unexpected if someone makes a mistake with the input data.
Function outputs
Return any times in the index (unless each row needs multiple times; then do what you need).
Resample by the time index, using the given resample arguments (or a default value).
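For example, a sketch of the output conventions (the function name and arguments are hypothetical, assuming the input has a pandas DatetimeIndex):

```python
import pandas as pd

def example_resampled(df, arg_dict=None):
    """Sketch of the output conventions: the result is resampled by the
    time index, and the resulting times stay in the index."""
    if arg_dict is None:
        arg_dict = {}
    # Resample arguments can be passed in, with a default value.
    resample_args = arg_dict.get("resample_args", {"rule": "1h"})
    # Resampling by the time index keeps the times in the result's index.
    return df["screen_status"].resample(**resample_args).sum()
```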
Other
Please add documentation to each new function using docstrings. This should include enough description so that someone else can understand and reproduce all relevant features - enough to describe the method for a scientific article.
Please add unit tests which test each relevant feature (and each claimed method feature) with a minimal example. Each function can have multiple tests. For examples of unit tests, see below or niimpy/test_screen.py. You can create some sample data within each test module which can be used both during development and for tests.
The Zen of Python is always good advice.
Improving old functions
Add tests for existing functionality: for every functionality Niimpy claims, there should be a minimal test for it.
Don’t fail if unnecessary columns are not there (don’t drop unneeded columns; select only the needed ones).
Use the index, not the datetime column.
Improve the docstring of the function: we use the numpydoc format.
Add a documentation page for each sensor; document each function and include an example.
Document what parameters the function groups by when analyzing. For example, an ideal case is that any ‘user’ and ‘device’ columns are grouped by in the final output.
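A minimal numpydoc-style docstring for a hypothetical function might look like:

```python
def screen_duration(df, arg_dict=None):
    """Compute total screen-on duration.

    Parameters
    ----------
    df : pandas.DataFrame
        Input data with a DatetimeIndex and a screen status column.
    arg_dict : dict, optional
        Other arguments, such as column names and resample arguments.

    Returns
    -------
    pandas.DataFrame
        Durations with times in the index, grouped by any 'user' and
        'device' columns present in the input.
    """
```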
Example unit test
You can read about testing in general in the CodeRefinery testing lesson.
First you would define some sample data. You could reuse existing data (or data from niimpy.sampledata), but if data is reused too much it can become harder to edit existing tests. Do share data when possible, but split it when relevant.
import io

import niimpy
import pytest

@pytest.fixture
def screen1():
    return niimpy.read_csv(io.StringIO("""\
time,screen_status
0,1
60,0
"""))
Then you can make a test function:
import pandas as pd

def test_screen_off(screen1):
    off = niimpy.preprocess.screen_off(screen1)
    # TZ is the timezone constant defined in the test module.
    assert pd.Timestamp(60, unit='s', tz=TZ) in off.index
The assert is the actual test: if the expression after assert is false, the test fails. You can have multiple asserts in a function to test multiple things. When something fails, pytest will provide much more useful error messages than you might expect.
You run tests with pytest or pytest tests/preprocessing/test_screen.py. You can limit the run to certain tests with -k and start a debugger on errors with --pdb.
Documentation notes
You can write documentation in Jupyter notebooks or ReST. ReST is better for narrative documentation.