AP R Cookiecutter

This is a project template powered by Cookiecutter for use with datakit-project.

Structure

.
├── .Rprofile
├── .gitignore
├── README.md
├── analysis
│   ├── archive
│   └── markdown
├── data
│   ├── documentation
│   ├── handmade
│   ├── html_reports
│   ├── processed
│   ├── public
│   └── source
├── etl
├── publish
├── scratch
├── viz
└── {{cookiecutter.project_slug}}.Rproj

.Rprofile
- Stores environment variables for local R projects.
.gitignore
- Ignores packrat and R user profile temporary files.
README.md
- Project-specific readme with boilerplate for data projects.
- Includes sourcing details and places to explain how to replicate/remake the project.
analysis
- R code that involves analysis on already-cleaned data. Code for cleaning data should go in etl.
  - Multiple analysis files are numbered sequentially.
  - If we are sharing the data, the last analysis script is called make_dw_files.R and uses write_csv to write the files to the data/public folder.
- analysis/archive
  - Any analyses for story threads that are no longer being investigated are placed here for reference.
- analysis/markdown
  - Any R Markdown files go here.
  - The AP has an R Markdown template here: https://github.com/associatedpress/apstyle
data
- This is the directory used with our datakit-data plugin.
- data/documentation
  - Documentation on data files should go here - data dictionaries, manuals, interview notes.
- data/handmade
  - Data sets created manually by reporters go here.
- data/html_reports
  - Any HTML reports or pages generated by code should go here. These are usually R Markdown reports for sharing with reporters.
- data/processed
  - Data that has been processed by scripts in this project and is clean and ready for analysis goes here.
- data/public
  - Public-facing data files go here: the final, 'live' datasets we share with reporters or otherwise make accessible.
- data/source
  - Original data from sources goes here.
etl
- ETL (extract, transform, load) scripts go here. They read in source data, then clean and standardize it to prepare for analysis.
  - Multiple ETL files are numbered.
  - Joins are included in the ETL process.
  - The last step of the ETL process is to output an RDS file to data/processed.
    - Naming convention: etl_WHATEVERNAME.rds
publish
- This directory holds all documents in the project that will be public facing (e.g., data.world R Markdown files).
scratch
- This directory contains scratch materials that will not be used in the final project.
- Common cases are filtered tables or quick visualizations for reporters.
- This directory is not tracked in git.
viz
- Graphics and visualization development work, such as code for web interactives, should go here.
{{cookiecutter.project_slug}}.Rproj
- This is the .Rproj file that can be used with RStudio to work within the project.
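
To confirm that a generated project still matches the layout described above, a small shell check can help. This is only a sketch: check_project_layout is a hypothetical helper name, not something provided by datakit or Cookiecutter, and the directory list is copied from the Structure tree.

```shell
# Sketch: report any directory from the Structure tree that is missing.
# check_project_layout is a hypothetical helper, not part of datakit or Cookiecutter.
check_project_layout() {
  project_dir="${1:-.}"   # project root; defaults to the current directory
  missing=0
  for dir in \
    analysis/archive analysis/markdown \
    data/documentation data/handmade data/html_reports \
    data/processed data/public data/source \
    etl publish scratch viz
  do
    if [ ! -d "$project_dir/$dir" ]; then
      echo "missing: $dir"
      missing=1
    fi
  done
  # Exit status is 0 when every directory exists, 1 otherwise.
  return "$missing"
}
```

Run it as `check_project_layout path/to/project && echo "layout OK"`. Because scratch is not tracked in git, a fresh clone of an existing project may report it as missing.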

Usage

You will need to clone this repository to ~/.cookiecutters/ (make the directory if it doesn't exist):

mkdir -p ~/.cookiecutters
cd ~/.cookiecutters
git clone git@github.com:associatedpress/cookiecutter-r-project

Then, use datakit project:

datakit project create --template cookiecutter-r-project

If you'd like to avoid specifying the template each time, you can edit ~/.datakit/plugins/datakit-project/config.json to use this template by default (replace the path with the location of your own clone):

{"default_template": "/Users/lfenn/.cookiecutters/cookiecutter-r-project"}
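
Rather than typing the path by hand, the same file can be generated from your own home directory. The following is a minimal sketch, assuming the template was cloned to ~/.cookiecutters/cookiecutter-r-project and that config.json holds no other settings you need to keep; the variable names are illustrative only.

```shell
# Sketch: make this template the datakit-project default for the current user.
# Assumes the template was cloned to ~/.cookiecutters/cookiecutter-r-project.
# Caution: this overwrites any existing config.json; merge by hand if yours
# already contains other settings.
config_dir="$HOME/.datakit/plugins/datakit-project"
template_dir="$HOME/.cookiecutters/cookiecutter-r-project"

mkdir -p "$config_dir"
printf '{"default_template": "%s"}\n' "$template_dir" > "$config_dir/config.json"

# Show the result so the path can be checked by eye.
cat "$config_dir/config.json"
```

This simple printf approach does not escape special characters, so it is only suitable when the home directory path contains no double quotes or backslashes.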

Configuration

You can set the default name, email, etc. for a project in the cookiecutter.json file.
