Continuous integration testing and documentation #3

Merged, 25 commits, Sep 28, 2024
118 changes: 118 additions & 0 deletions .github/workflows/ci.yml
name: Implementation Testing

on:
  push:
    branches:
      - '*'
  pull_request:
    branches:
      - '*'

env:
  CACHE_NUMBER: 0  # increase to reset cache manually

jobs:
  foundation:

    strategy:
      matrix:
        python-version: ["3.10"]
    defaults:
      run:
        shell: bash -l {0}
    name: linux-64-py${{ matrix.python-version }}
    runs-on: ubuntu-latest
    steps:
      # checkout the code in this repository
      - uses: actions/checkout@v4
        with:
          path: 'arc-activitysim'

      # checkout the main branch of ActivitySim itself
      - uses: actions/checkout@v4
        with:
          repository: 'ActivitySim/activitysim'
          ref: main
          path: 'activitysim'
          fetch-depth: 0  # get all tags, lets setuptools_scm do its thing

      - name: Setup Miniforge
        uses: conda-incubator/setup-miniconda@v3
        with:
          miniforge-version: latest
          activate-environment: asim-test
          python-version: ${{ matrix.python-version }}

      - name: Set cache date for year and month
        run: echo "DATE=$(date +'%Y%m')" >> $GITHUB_ENV

      - uses: actions/cache@v4
        id: cache
        with:
          path: ~/conda_pkgs_dir
          key: linux-64-conda-${{ hashFiles('activitysim/conda-environments/github-actions-tests.yml') }}-${{ env.DATE }}-${{ env.CACHE_NUMBER }}

      - name: Update environment
        run: |
          conda env update -n asim-test -f activitysim/conda-environments/github-actions-tests.yml

      - name: Install activitysim
        # installing without dependencies is faster; we trust that all needed
        # dependencies are in the conda environment defined above. This also
        # avoids pip getting confused and reinstalling tables (pytables).
        run: |
          python -m pip install ./activitysim --no-deps

      - name: Conda checkup
        run: |
          conda info -a
          conda list

      - name: Get the Fulton data
        run: |
          cd arc-activitysim
          python scripts/fetch-fulton.py

      - name: Run progressive tests
        run: |
          cd arc-activitysim
          python -m pytest tests/test_activitysim.py

      - name: Run without Sharrow
        run: |
          cd arc-activitysim
          python scripts/run-fulton.py

      - name: Upload legacy artifacts
        uses: actions/upload-artifact@v4
        with:
          name: legacy-outputs
          path: |
            ${{ github.workspace }}/arc-activitysim/output-fulton-legacy/final_*.csv
            ${{ github.workspace }}/arc-activitysim/output-fulton-legacy/*.log
            ${{ github.workspace }}/arc-activitysim/output-fulton-legacy/timing_log.csv

      - name: Check legacy outputs
        run: |
          cd arc-activitysim
          python scripts/check-fulton.py --check-dir ${{ github.workspace }}/arc-activitysim/output-fulton-legacy

      - name: Run with Sharrow
        run: |
          cd arc-activitysim
          python scripts/run-fulton.py --sharrow

      - name: Upload Sharrow artifacts
        uses: actions/upload-artifact@v4
        with:
          name: sharrow-outputs
          path: |
            ${{ github.workspace }}/arc-activitysim/output-fulton-sharrow/final_*.csv
            ${{ github.workspace }}/arc-activitysim/output-fulton-sharrow/*.log
            ${{ github.workspace }}/arc-activitysim/output-fulton-sharrow/timing_log.csv

      - name: Check sharrow outputs
        run: |
          cd arc-activitysim
          python scripts/check-fulton.py --check-dir ${{ github.workspace }}/arc-activitysim/output-fulton-sharrow

116 changes: 116 additions & 0 deletions README.md
# arc-activitysim
The standalone ActivitySim implementation for the ARC travel demand model.

## Installation

To install the ARC ActivitySim model, simply clone this repository:

```bash
git clone https://github.com/atlregional/arc-activitysim.git
cd arc-activitysim
```

Using this model requires ActivitySim 1.3 or later. This is most easily
accomplished using the `activitysim` conda package, which first requires the
installation of the `conda` package manager. For most systems, the Miniforge
distribution is recommended, which can be downloaded from
[conda-forge](https://github.com/conda-forge/miniforge?tab=readme-ov-file#miniforge3).

Once `conda` is installed, the `activitysim` package can be installed from the
Miniforge Prompt (or the terminal on Linux/Mac):

```bash
conda create -n ARC-ASIM activitysim -c conda-forge --override-channels
```

This will create a new conda environment named `ARC-ASIM` with the `activitysim`
package installed. To activate the environment, use:

```bash
conda activate ARC-ASIM
```

## Running the Model

The ARC ActivitySim model can be run using the `activitysim` command line tool.
The model is configured using the `configs` directory, which contains the
configuration files for the model. From the directory where this repository
has been cloned, the model can be run using the following command:

```bash
activitysim run -c configs -d data_dir -o output_dir
```

Where `data_dir` is the directory containing the input data for the model, and
`output_dir` is the directory where the model output will be written. The data
directory should contain the necessary input files (households, persons, land use,
and skims), which can be the full scale ARC data or a smaller test data set (see
instructions to access the Fulton County test data below). The output directory
will be created if it does not exist, and the model output will be written to
subdirectories of this directory.
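As a quick sanity check before launching a long run, a sketch like the following can confirm the expected inputs are present. The specific file names here are illustrative assumptions, not names mandated by this repository; match them to your actual data set.

```python
# Pre-flight check: list which expected input files are missing from a data
# directory. The default file names are assumptions for illustration only.
from pathlib import Path


def check_data_dir(
    data_dir,
    required=("households.csv", "persons.csv", "land_use.csv", "skims.omx"),
):
    """Return the required input files missing from data_dir, in order."""
    root = Path(data_dir)
    return [name for name in required if not (root / name).is_file()]


missing = check_data_dir("data_dir")
if missing:
    print("missing inputs:", ", ".join(missing))
```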

## Running the Model with Sharrow

The ARC ActivitySim model can also be run with sharrow enabled. This is
done by adding the relevant sharrow configs directory to the command. For
example, to run the model with sharrow in compile-test mode, use the
following command:

```bash
activitysim run -c configs -c configs_sh_compile -d data_dir -o output_dir
```

This will run the model with sharrow enabled, compiling the numba code and
running tests to ensure the results match between sharrow and legacy modes.
Once the sharrow compiling is complete, the model can subsequently be run in
sharrow's "production" mode, which is much faster:

```bash
activitysim run -c configs -c configs_sh -d data_dir -o output_dir
```

## Testing Dataset (Fulton County)

This model is built to run with the data that simulates the full-scale
model of the ARC region, but this scale of data can be overwhelming
for testing the operation of the model, especially on more limited
platforms.

To facilitate testing, data for a smaller slice of the region is available.
This test data includes just Fulton County, which has 1,296 zones; this is
a small enough area to run the model on a laptop or within the CI testing
infrastructure, as it requires only about 6 GB of RAM to store the
skims in memory, and another 1 or 2 GB for the rest of the model. But this
area is still large enough to provide a meaningful test of the model, with
enough zones to exercise the model's capabilities and complexity. The Fulton
County data can be downloaded with this Python script (also available
as [fetch-fulton.py](./scripts/fetch-fulton.py)):

```python
from pathlib import Path
from activitysim.examples.external import download_external_example

example_dir = download_external_example(
name=".",
working_dir=Path.cwd(),
assets={
"arc-fulton-data.tar.zst": {
"url": "https://github.com/atlregional/arc-activitysim/releases/download/v1.3.0/arc-fulton-data.tar.zst",
"sha256": "402c3cf1fdd96ae0342f17146453b713602ca8454b78f1e8ff8cbc403e03441e",
"unpack": "arc-fulton-data",
},
}
)
```
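The `sha256` field above lets the downloader verify the archive's integrity. If you fetch the archive by other means, the same check can be done by hand; this is a minimal sketch using only the standard library (the archive path is an assumption about where you saved the file, so the final comparison is left commented out):

```python
# Compute the SHA-256 digest of a downloaded file in chunks, so even large
# archives can be verified without loading them fully into memory.
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest copied from the asset entry above; uncomment to check a local copy.
expected = "402c3cf1fdd96ae0342f17146453b713602ca8454b78f1e8ff8cbc403e03441e"
# archive = Path("arc-fulton-data.tar.zst")
# assert sha256_of(archive) == expected, "checksum mismatch, re-download"
```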

## Continuous Integration Testing

This repository is configured to run continuous integration testing
using GitHub Actions. The tests are run on a small subset of the data
for Fulton County, and the results are uploaded to the `Actions` tab
of the repository. The tests are configured in the `.github/workflows`
directory, and use the scripts in the `scripts` directory.
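The CI steps can also be reproduced locally from a clone of this repository, with the conda environment described above activated. The script names and output directories below come straight from the workflow file; this is a sketch of the sequence, not a supported entry point:

```shell
# fetch the Fulton County test data
python scripts/fetch-fulton.py
# run and check the model in legacy mode
python scripts/run-fulton.py
python scripts/check-fulton.py --check-dir output-fulton-legacy
# run and check the model with sharrow enabled
python scripts/run-fulton.py --sharrow
python scripts/check-fulton.py --check-dir output-fulton-sharrow
```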

Note that the tests are run in a clean environment every time, so the
first sharrow test includes the overhead of compiling all the numba code.
This will make it appear that the sharrow test is *much* slower than the
comparable legacy test; this is normal and not an indication that sharrow is
slower than the legacy code for production runs.