Don't Install Tableau API on arm64
The tableauhyperapi module is unavailable on Macs with arm64 processors.

* Update poetry to take advantage of dependency groups.
* Add a tableau dependency group that can be excluded when installing
  dependencies.
* Update python_src to fall back with error messages when the tableau
  subdir that creates hyper files fails to import everything.
* Use new `BUILDPLATFORM` arg in the Dockerfile to build without tableau
  deps on an arm64 platform.
* Add an export for the BUILDPLATFORM variable to .envrc and pass it
  through via docker-compose when building the lamp py images.
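
The fallback described in the third bullet can be sketched roughly as follows. This is a minimal illustration of the optional-import pattern, not the code in this commit: `load_or_stub` and the module name in the usage note are hypothetical, and the sketch assumes the replacement only needs to log and return `None`.

```python
import importlib
import logging
from typing import Callable


def load_or_stub(module_name: str, attr: str) -> Callable[..., object]:
    """Return `attr` from `module_name`, or a logging stub if the import fails.

    Hypothetical helper for illustration only. An AttributeError from a
    module that imports fine but lacks `attr` is deliberately not caught.
    """
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except ModuleNotFoundError as error:
        # Copy the name out now: the `error` binding is cleared when the
        # except block ends, so the stub must not refer to it directly.
        missing = error.name

        def stub(*_args: object, **_kwargs: object) -> None:
            # Same call shape as the real function, but it only reports
            # that the optional dependency is not installed.
            logging.error(
                "Skipping %s.%s: module %r is not installed",
                module_name,
                attr,
                missing,
            )

        return stub
```

Calling `load_or_stub("some_optional_module", "run")` on a machine without that module returns a stub that accepts any arguments, so callers can keep a single code path whether or not the dependency group was installed.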
mzappitello committed Jan 9, 2024
1 parent 7dd25a3 commit 39a5843
Showing 9 changed files with 683 additions and 757 deletions.
2 changes: 1 addition & 1 deletion .env
@@ -20,4 +20,4 @@ PUBLIC_ARCHIVE_BUCKET=mbta-ctd-dataplatform-dev-archive
# Tableau
TABLEAU_USER=DOUPDATE
TABLEAU_PASSWORD=DOUPDATE
TABLEAU_SERVER=http://awtabDEV02.mbta.com
TABLEAU_SERVER=http://awtabDEV02.mbta.com
6 changes: 5 additions & 1 deletion .envrc
@@ -1,3 +1,7 @@
use asdf

dotenv
dotenv

# used in docker-compose and the Dockerfile
export BUILDPLATFORM=$(uname -m)
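
One thing worth checking with this export: `uname -m` reports `arm64` on Apple silicon macOS, while Linux on the same architecture typically reports `aarch64`, and the Dockerfile check in this commit compares against `arm64` only. A small sketch of normalizing the two spellings is shown below. The helper name `is_arm64` and the alias set are assumptions made for illustration and are not part of this commit.

```python
import platform
from typing import Optional

# Assumed set of spellings for the 64-bit ARM architecture.
ARM64_ALIASES = {"arm64", "aarch64"}


def is_arm64(machine: Optional[str] = None) -> bool:
    """Return True when `machine` names a 64-bit ARM platform.

    With no argument, the local machine type from platform.machine() is
    used, which reports the same kind of value as `uname -m`.
    """
    name = machine if machine is not None else platform.machine()
    return name.lower() in ARM64_ALIASES
```

Passing the value explicitly, for example `is_arm64("aarch64")`, makes the check easy to exercise without depending on the host it runs on.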

2 changes: 1 addition & 1 deletion .tool-versions
@@ -1,3 +1,3 @@
poetry 1.4.2
poetry 1.7.1
python 3.10.13
direnv 2.32.2
4 changes: 4 additions & 0 deletions docker-compose.yml
@@ -19,6 +19,8 @@ services:
env_file: .env
build:
context: ./python_src
args:
BUILDPLATFORM: $BUILDPLATFORM
depends_on:
- local_rds
working_dir: /lamp
@@ -31,6 +33,8 @@ services:
env_file: .env
build:
context: ./python_src
args:
BUILDPLATFORM: $BUILDPLATFORM
depends_on:
- local_rds
working_dir: /lamp
12 changes: 10 additions & 2 deletions python_src/Dockerfile
@@ -16,13 +16,21 @@ RUN chmod a=r /usr/local/share/amazon-certs.pem

# Install poetry
RUN pip install -U pip
RUN pip install "poetry==1.4.2"
RUN pip install "poetry==1.7.1"

# copy poetry and pyproject files and install dependencies
WORKDIR /lamp/
COPY poetry.lock poetry.lock
COPY pyproject.toml pyproject.toml
RUN poetry install --no-dev --no-interaction --no-ansi -v

# Tableau dependencies for arm64 cannot be resolved (since Salesforce doesn't
# support them yet). For that build platform, build without those dependencies.
ARG BUILDPLATFORM
RUN echo "Installing python dependencies for ${BUILDPLATFORM}."
RUN if [ "$BUILDPLATFORM" = "arm64" ]; then \
poetry install --without tableau --no-interaction --no-ansi -v ;\
else poetry install --no-interaction --no-ansi -v ;\
fi

# Copy src directory to run against and build lamp py
COPY src src
1,373 changes: 625 additions & 748 deletions python_src/poetry.lock

Large diffs are not rendered by default.

9 changes: 7 additions & 2 deletions python_src/pyproject.toml
@@ -27,10 +27,15 @@ psutil = "^5.9.1"
schedule = "^1.1.0"
alembic = "^1.10.2"
types-pytz = "^2023.3.0.1"

[tool.poetry.group.tableau]
optional = false

[tool.poetry.group.tableau.dependencies]
tableauhyperapi = "^0.0.17971"
tableauserverclient = "0.25"

[tool.poetry.dev-dependencies]
[tool.poetry.group.dev.dependencies]
black = "^23.1.0"
mypy = "^1.1.1"
pylint = "^2.17.0"
@@ -80,6 +85,6 @@ max-line-length = 80
min-similarity-lines = 10
# ignore session maker as it gives pylint fits
# https://github.com/PyCQA/pylint/issues/7090
ignored-classes = ['sqlalchemy.orm.session.sessionmaker','pyarrow.compute']
ignored-classes = ['sqlalchemy.orm.session.sessionmaker', 'pyarrow.compute']
# ignore the migrations directory. it's going to have duplication and _that is ok_.
ignore-paths = ["^src/lamp_py/migrations/.*$"]
2 changes: 1 addition & 1 deletion python_src/src/lamp_py/performance_manager/pipeline.py
@@ -15,7 +15,7 @@
from lamp_py.runtime_utils.env_validation import validate_environment
from lamp_py.runtime_utils.process_logger import ProcessLogger

from lamp_py.tableau.pipeline import start_parquet_updates
from lamp_py.tableau import start_parquet_updates

from .flat_file import write_flat_files
from .l0_gtfs_rt_events import process_gtfs_rt_files
30 changes: 29 additions & 1 deletion python_src/src/lamp_py/tableau/__init__.py
@@ -1 +1,29 @@
"""Utilites for Interacting with Tableau and Hyper files"""
"""Utilities for Interacting with Tableau and Hyper files"""

try:
    # pylint: disable=C0414
    #
    # Import alias does not rename original package. The intent is to grab it
    # here and pass it through other portions of the codebase.
    from .pipeline import start_parquet_updates as start_parquet_updates

    # pylint: enable=C0414

except ModuleNotFoundError as mfl_exception:
    import logging
    from lamp_py.postgres.postgres_utils import DatabaseManager

    # pylint: disable=W0613
    #
    # db_manager is unused because this method has to match the function
    # signature of the method it's replacing.
    def start_parquet_updates(db_manager: DatabaseManager) -> None:
        """
        Reimplementation of start_parquet_updates in the event that the
        tableauhyperapi module cannot be found.
        """
        logging.exception(
            "Unable to run parquet files on this machine due to Module Not Found error"
        )

    # pylint: enable=W0613
