Merge branch 'release/0.3.0' into main
DevMattM committed Dec 7, 2017
2 parents 158940c + 1d6cbb4 commit dc118df
Showing 17 changed files with 653 additions and 165 deletions.
109 changes: 109 additions & 0 deletions .gitignore
@@ -0,0 +1,109 @@
# Created by https://www.gitignore.io/api/python

### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

# End of https://www.gitignore.io/api/python

filters_config.cfg
current-db-subjects.csv
28 changes: 27 additions & 1 deletion CHANGELOG
@@ -1,3 +1,29 @@
## [0.3.0] - 2017-12-07
### Added
* add a reqs file for using a venv (Matthew McConnell)
* Create new filters.py script to pull data from REDCap and run filters on it (AjanthaRamineni)
* Add a filter to change headers based on config (Matthew McConnell)
* Add filter to remove unnecessary RedCap Events (AjanthaRamineni)
* Add a script to run all the filters with a config file (Matthew McConnell)
### Changed
* Add a config across all filters with a decorator for validation (Matthew McConnell)
* Change how filter_ptid works - it now checks ptid, visit type, and visit num (AjanthaRamineni)
* Update the aod range for kids and siblings. Changed field from num to char (Matthew McConnell)
* Update changelog and setup (Tarun Gupta Akirala)
* Update README in preparation for release (Matthew McConnell)
* Update notes (Tarun Gupta Akirala)
### Fixed
* Fix logic for determining existence of C1S and C2 forms (AjanthaRamineni)
* Fix column numbering on FVP B8 form (Naomi Braun)
* Fix bug where 2 questions should be able to hold values regardless of other values (Matthew McConnell)
* fixes #25 and adds more debugging logs (Tarun Gupta Akirala)


## [0.2.4] - 2017-03-27
### Added
* Added few log statements as changes.
* Project handover from takirala to ctsit

## [0.2.3] - 2017-02-14
### Changed
* Refactored c1s form - redcap C1 form to C1S in alz website
@@ -49,4 +75,4 @@
* Created error messages where data does not meet form definitions (Tarun Gupta Akirala)
* Added the C1S temporary Spanish form definitions and rules (Tarun Gupta Akirala)
* Added flag to have NACCulator output only the Neuropathology form (Tarun Gupta Akirala)
* Added ability to check for data in C2 or C1S form and make determination on which form to use based on data present or not present (Tarun Gupta Akirala)
* Added ability to check for data in C2 or C1S form and make determination on which form to use based on data present or not present (Tarun Gupta Akirala)j
77 changes: 59 additions & 18 deletions README.md
@@ -28,6 +28,9 @@ This is not exhaustive, but here is an explanation of some important files.
* `tools/generator.py`:
generates Python objects based on NACC Data Element Dictionaries in CSV.

* `run_filters.py` and `run_filters.sh`:
pulls data from REDCap based on the settings found in nacculator_cfg.ini (for .py)
and filters_config.cfg (for .sh).

HOW TO Convert from REDCap to NACC
---------------------------------
@@ -46,7 +49,7 @@ The program accepts two arguments -file and -(ivp|fvp|np). Both the arguments ar
$ PYTHONPATH=. ./nacc/redcap2nacc.py -h
usage: redcap2nacc.py [-h]
[-fvp | -ivp | -np | -f {cleanPtid,updateField,replaceDrugId,fillDefault,fixC1S}]
[-file FILE] [-meta FILTER_META]
[-file FILE] [-meta FILTER_CONFIG]

Process redcap form output to nacculator.

@@ -58,17 +61,15 @@ The program accepts two arguments -file and -(ivp|fvp|np). Both the arguments ar
-f or --filter Accepts one of {cleanPtid,updateField,replaceDrugId,fillDefault,fixC1S}
Set this flag to process the filter
-file FILE Path of the csv file to be processed.
-meta FILTER_META Input file for the filter metadata (in case cleanPtid is used)
-meta FILTER_META Filter config file (nacculator_cfg.ini) when running filters

Example Usage

PYTHONPATH=. ./nacc/redcap2nacc.py -np -file data.csv > data.txt

To use a filter,

PYTHONPATH=. ./nacc/redcap2nacc.py -f cleanPtid -meta someFileName.csv < data.csv > data.txt

Only cleanPtid filter requires a meta file to be passed to it. Other filters do not need a meta tag.
PYTHONPATH=. ./nacc/redcap2nacc.py -f cleanPtid -meta nacculator_cfg.ini < data.csv > data.txt

_Note: output is written to `STDOUT`; errors are written to `STDERR`; input can
be `STDIN` or the first argument passed to `redcap2nacc`._
@@ -81,18 +82,29 @@ HOW TO Use nacculator to filter data
If your data is not clean enough to be processed by nacculator, there are some
built in functions to clean (read transform) the data.

To use the filters, first check that the nacculator_cfg.ini file has the proper
settings for the filter you want to run. The config file contains one section
per filter, named after the filter's in-code function name, and each section
holds the keys that filter needs to run.
The filter descriptions below state what is required, if anything.
If a filter requires the config, it must be passed with the -meta flag, as in
the example above.
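As a rough illustration of the section-per-filter layout described above, the following Python 3 sketch parses a config with `configparser`. The section and key names come from the filter descriptions in this README; the sample values, the `load_filter_sections` helper, and the use of `configparser` itself are assumptions for illustration, since the diff does not show how the project reads nacculator_cfg.ini.

```python
import configparser

# Hypothetical config contents. Section and key names follow the README;
# the values (including the csv path) are placeholders.
SAMPLE_CFG = r"""
[filter_clean_ptid]
filepath: current-db-subjects.csv

[filter_remove_ptid]
ptid_format: 11\d.*

[filter_fix_headers]
c1s_2a_npsylan: c1s_2_npsycloc
fu_otherneur: fu_othneur
"""


def load_filter_sections(text):
    """Return {section name: {key: value}} for every section in the config text."""
    # Interpolation is disabled so regex values are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


sections = load_filter_sections(SAMPLE_CFG)
for name, keys in sections.items():
    print(name, keys)
```

A filter that needs no configuration would simply have no section in the file.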

* **cleanPtid**

This filter requires the meta option to be set using the -meta flag. The meta
file can be a csv file of ptids to be removed. All the records whose ptid is
found in the passed meta file will be discarded in the output file.
**Filter config required**
This filter requires a section in the config called 'filter_clean_ptid'. This
section contains a single key, 'filepath', which points to a csv file
of ptids to be removed. Every record whose ptid, packet type, and visit
num match a row in that file is discarded from the output file.

Example meta file:

$ cat sampleRemovePtidFile.csv
ptids
110001
110003
Patient ID,Packet type,Visit Num,Status
110001,I,1,Current
110001,M,M1,Current
110003,I,001,Current
110003,F,002,Current

* **replaceDrugId**

@@ -101,16 +113,24 @@ built in functions to clean (read transform) the data.

This filter does not require any meta data file as of now.

* **fixC1S**
* **fixHeaders**

This filter fixes the column names of some of the fields in C1S form. This
filter does not check for any data. It always replaces the column names if found.
**Filter config required**
This filter requires a section in the config called 'filter_fix_headers' with
as many keys as needed to replace the necessary columns. See example below.
This filter fixes the column names of any column found in the filter mapping.
This filter does not check for any data. It always replaces the column names
if found.

Currently, below replacements are used:

c1s_2a_npsylan -> c1s_2_npsycloc
c1s_2a_npsylanx -> c1s_2a_npsylan
b6s_2a1_npsylanx -> c1s_2a1_npsylanx
config:
c1s_2a_npsylan: c1s_2_npsycloc
c1s_2a_npsylanx: c1s_2a_npsylan
b6s_2a1_npsylanx: c1s_2a1_npsylanx
fu_otherneur: fu_othneur
fu_otherneurx: fu_othneurxs
fu_strokedec: fu_strokdec


* **fillDefault**
@@ -131,6 +151,27 @@ built in functions to clean (read transform) the data.
This filter is used to update non blank fields. Currently, only adcid is updated
to 41.

* **removePtid**

**Filter config required**
This filter requires a section in the config called 'filter_remove_ptid' with
a single key called 'ptid_format'. The value for that key is a regex string
to match ptids that are to be kept.

This filter removes records whose ptids follow a different study's ID scheme,
or otherwise limits which ids show up in the final result.

config:
ptid_format: 11\d.*

* **removeDateRecord**

This filter removes records that are missing visit dates. It searches for
rows missing the visit day, month, or year. If any of those fields is
missing, it removes the row.
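The behaviour described for fixHeaders, removePtid, and removeDateRecord can be sketched in a few lines of Python 3. This is a simplified illustration, not the code in `nacc/redcap/filters.py`: the function names, the `ptid` column, the date column names (`visitmo`, `visitday`, `visityr`), and the choice of `re.match` (anchored at the start of the ptid only) are all assumptions.

```python
import csv
import io
import re


def fix_headers(header, mapping):
    """Rename any column found in the mapping; leave the others untouched."""
    return [mapping.get(name, name) for name in header]


def keep_matching_ptids(rows, ptid_format):
    """Keep only rows whose ptid matches the regex (assumed: matched from the start)."""
    pattern = re.compile(ptid_format)
    return [row for row in rows if pattern.match(row["ptid"])]


def drop_rows_missing_visit_date(rows, date_fields=("visitmo", "visitday", "visityr")):
    """Drop any row where one of the (assumed) visit date columns is blank."""
    return [row for row in rows
            if all((row.get(field) or "").strip() for field in date_fields)]


# Made-up sample data for illustration only.
SAMPLE = (
    "ptid,visitmo,visitday,visityr\n"
    "110001,12,7,2017\n"
    "220002,12,7,2017\n"
    "110003,,7,2017\n"
)
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = drop_rows_missing_visit_date(keep_matching_ptids(rows, r"11\d.*"))
new_header = fix_headers(
    ["ptid", "c1s_2a_npsylanx", "fu_otherneur"],
    {"c1s_2a_npsylan": "c1s_2_npsycloc",
     "c1s_2a_npsylanx": "c1s_2a_npsylan",
     "fu_otherneur": "fu_othneur"},
)
```

In this sketch each header name is looked up exactly once, so `c1s_2a_npsylanx` becomes `c1s_2a_npsylan` and is not renamed a second time to `c1s_2_npsycloc`.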



HOW TO Generate New Forms
------------------------

2 changes: 2 additions & 0 deletions dev-requirements.txt
@@ -0,0 +1,2 @@
-e git+git@github.com:ctsit/cappy.git@1.2.0#egg=cappy
PyYAML
12 changes: 12 additions & 0 deletions filter_config_example.yaml
@@ -0,0 +1,12 @@
token: Your RedCap Token
redcap_server: Your Redcap Server
filter_config:
ptid_format: Your Ptid Format
current_sub: Path/to/Current_db.csv
header_mapping:
c1s_2a_npsylan : c1s_2_npsycloc
c1s_2a_npsylanx : c1s_2a_npsylan
b6s_2a1_npsylanx : c1s_2a1_npsylanx
fu_otherneur : fu_othneur
fu_otherneurx : fu_othneurxs
fu_strokedec : fu_strokdec
5 changes: 5 additions & 0 deletions filters_config.cfg.example
@@ -0,0 +1,5 @@
token="1234567890"
content="record"
format="csv"
type="flat"
redcap_server="https://my.redcap.server/redcap/api"
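The keys in this example file line up with the form fields of a REDCap API record export (`token`, `content`, `format`, `type`). How `run_filters.sh` actually consumes the file is not shown in this diff, so the Python 3 sketch below is only an assumption-laden illustration: it parses the `key="value"` lines and separates the server URL from the fields that would be sent in a POST. The helper names are hypothetical, and no request is made.

```python
def parse_shell_style_cfg(text):
    """Parse KEY="value" lines into a dict, skipping blank and comment lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def build_export_request(settings):
    """Split settings into the server URL and the form fields for an export call."""
    fields = {key: value for key, value in settings.items() if key != "redcap_server"}
    return settings["redcap_server"], fields


EXAMPLE = '''token="1234567890"
content="record"
format="csv"
type="flat"
redcap_server="https://my.redcap.server/redcap/api"
'''

url, fields = build_export_request(parse_shell_style_cfg(EXAMPLE))
print(url)
print(sorted(fields))
```

Keeping the token in a file that is listed in .gitignore, as filters_config.cfg is in this commit, avoids committing credentials by accident.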
16 changes: 16 additions & 0 deletions getPacketStatus.js
@@ -0,0 +1,16 @@
var statusList = document.getElementsByName('XID')[0].options;
packets = [['Patient ID','Packet type','Visit Num','Status']];
for(i=6; i < statusList.length; i++){
label = statusList[i].label.split(" ");
csv = [label[3], label[6], label[10], label[19]];
packets[i] = csv;}

var csvContent = "data:text/csv;charset=utf-8,";
packets.forEach(function(infoArray, index){

dataString = infoArray.join(",");
csvContent += index < packets.length ? dataString+ "\n" : dataString;
});

var encodedUri = encodeURI(csvContent);
window.open(encodedUri);
26 changes: 18 additions & 8 deletions nacc/redcap2nacc.py
@@ -144,11 +144,19 @@ def main():
"""
parser = argparse.ArgumentParser(description='Process redcap form output to nacculator.')

filters_names = { 'cleanPtid' : 'clean_ptid',
filters_names = {
'cleanPtid' : 'clean_ptid',
'replaceDrugId' : 'replace_drug_id',
'fixC1S' : 'fix_c1s',
'fixHeaders' : 'fix_headers',
'fillDefault' : 'fill_default',
'updateField' : 'update_field'}
'updateField' : 'update_field',
'removePtid' : 'remove_ptid',
'removeDateRecord' : 'eliminate_empty_date'}

filter_exclusive_names = {
'cleanPtid' : 'clean_ptid',
'removeRedCapEvent':'eliminate_redcapeventname'
}

option_group = parser.add_mutually_exclusive_group()
option_group.add_argument('-fvp', action='store_true', dest='fvp', help='Set this flag to process as fvp data')
@@ -172,14 +180,16 @@ def main():
output = sys.stdout

if options.filter:
filter_method = getattr(filters, 'filter_' + filters_names[options.filter])
filter_method(fp, options.filter_meta, output)
filter_method = 'filter_' + filters_names[options.filter]
filter_func = getattr(filters, filter_method)
filter_func(fp, options.filter_meta, output)
else:
reader = csv.DictReader(fp)
for record in reader:
for record in reader:
print >> sys.stderr, "[START] ptid : " + str(record['ptid'])
try:
if options.ivp:
packet = ivp_builder.build_uds3_ivp_form(record)
packet = ivp_builder.build_uds3_ivp_form(record)
elif options.np:
packet = np_builder.build_uds3_np_form(record)
elif options.fvp:
@@ -206,4 +216,4 @@ def main():
print form

if __name__ == '__main__':
main()
main()
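The filter dispatch in this hunk maps a command-line name to a `filter_*` function and looks it up with `getattr`. A standalone Python 3 sketch of that pattern follows; the repository code is Python 2, and `toy_filters`, `run_filter`, and the pass-through filter body are hypothetical stand-ins for the real `filters` module. The added error handling for unknown names is an illustration, not something the commit does.

```python
import io
import types

# Subset of the name mapping shown in the diff above.
FILTER_NAMES = {
    "cleanPtid": "clean_ptid",
    "fixHeaders": "fix_headers",
    "removePtid": "remove_ptid",
    "removeDateRecord": "eliminate_empty_date",
}


def _filter_fix_headers(fp, config, output):
    """Toy stand-in: copies the input through unchanged."""
    output.write(fp.read())


# Hypothetical stand-in for the project's `filters` module.
toy_filters = types.SimpleNamespace(filter_fix_headers=_filter_fix_headers)


def run_filter(module, cli_name, fp, config, output):
    """Resolve a CLI filter name to module.filter_<suffix> and call it."""
    if cli_name not in FILTER_NAMES:
        raise ValueError("unknown filter: %s" % cli_name)
    method_name = "filter_" + FILTER_NAMES[cli_name]
    func = getattr(module, method_name, None)
    if func is None:
        raise ValueError("filter not implemented: %s" % method_name)
    func(fp, config, output)


out = io.StringIO()
run_filter(toy_filters, "fixHeaders", io.StringIO("a,b\n1,2\n"), None, out)
print(out.getvalue(), end="")
```

Splitting the name lookup from the `getattr` call, as the new code in this commit does, makes it easier to report which of the two steps failed.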