yaronha/nyc-taxi-demo

MLOps Tutorial - NYC Taxi fare

Open In Studio Lab

This project demonstrates a complete ML project and the development flow from initial exploration to continuous deployment at scale. The example is based on a Kaggle competition. Its goal is to predict the correct trip fare, using the public NYC Taxi dataset.

This example is intended to explain and demonstrate the overall MLOps flow by using the MLRun MLOps orchestration framework. It is not designed to dive into the individual components or models.

It is recommended to fork this repo into your GitHub account and clone it into your development environment.

Overview

The ML application development and productization flow consists of the following steps (demonstrated through notebooks):

[Figure: project-dev-flow]

You can find the Python source code under /src and the tests under /tests.

Installation

This project can run in different development environments: a local environment, GitHub Codespaces, or SageMaker Studio and Studio Lab. Each is covered in its own section below.

The project works with the MLRun service. You can deploy the MLRun service (API, DB, UI, and execution environment) over Docker or, preferably, over Kubernetes. The make mlrun-docker command launches a local MLRun service using Docker Compose (the MLRun UI can be viewed at http://localhost:8060). Alternatively, edit the mlrun.env file to configure a remote MLRun service (over Kubernetes).

For resource-constrained environments without Docker, you can start the MLRun service as a process (no UI) with the make mlrun-api command.
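After starting the Docker-based service, a quick way to confirm that the UI is answering is to request the address mentioned above. The following is a minimal sketch using only the Python standard library; the helper name `ui_is_up` and the 3-second timeout are illustrative choices, not part of this repo or of MLRun. It does not apply to the make mlrun-api option, which has no UI.

```python
import urllib.error
import urllib.request


def ui_is_up(url="http://localhost:8060", timeout=3.0):
    """Return True if an HTTP request to `url` succeeds, False otherwise.

    `ui_is_up` is a hypothetical helper for illustration only.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # Treat any 2xx or 3xx response as "the UI is reachable".
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    if ui_is_up():
        print("MLRun UI is reachable at http://localhost:8060")
    else:
        print("MLRun UI is not reachable; is `make mlrun-docker` running?")
```

A `False` result only means that nothing answered on that address; it does not say why the service is down.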

Install in a local environment

First, install the package dependencies and the environment.

Using pip (install the requirements):

make install-requirements

Your environment should include the variable MLRUN_ENV_FILE=<absolute path to the ./mlrun.env file>, pointing to the mlrun.env file in this repo. See the MLRun client setup instructions for details.
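If you prefer to set the variable from inside a notebook or script rather than in your shell profile, the absolute path can be computed from the repo directory. This is a minimal sketch; the helper name `point_to_env_file` is hypothetical, and it assumes the code runs before the MLRun client reads its configuration. Setting `os.environ` affects only the current Python process and the processes it starts.

```python
import os
from pathlib import Path


def point_to_env_file(repo_dir="."):
    """Set MLRUN_ENV_FILE to the absolute path of <repo_dir>/mlrun.env.

    `point_to_env_file` is a hypothetical helper for illustration only.
    """
    env_path = (Path(repo_dir) / "mlrun.env").resolve()
    os.environ["MLRUN_ENV_FILE"] = str(env_path)
    return env_path


if __name__ == "__main__":
    # Assumes the current working directory is the root of this repo.
    path = point_to_env_file(".")
    print(f"MLRUN_ENV_FILE={path} (file exists: {path.exists()})")
```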

Using conda (create the mlrun conda env and install packages and env vars in it):

make conda-env
conda activate mlrun

Make sure all your tasks and notebooks use the mlrun Python environment!
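One way to check which environment a notebook kernel or task is actually using is to inspect the interpreter path and the `CONDA_DEFAULT_ENV` variable that conda sets on activation. The sketch below assumes the conda route described above (environment named mlrun); with the pip route the variable is usually absent, so only the interpreter path is informative. The helper names are illustrative.

```python
import os
import sys


def active_env_name():
    """Return the active conda environment name, or None outside conda."""
    return os.environ.get("CONDA_DEFAULT_ENV")


def uses_env(expected="mlrun"):
    """Return True if the active conda environment is `expected`."""
    return active_env_name() == expected


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("conda env:  ", active_env_name())
    if not uses_env():
        print("warning: this process is not running in the mlrun conda env")
```

Running this in a notebook cell shows whether the kernel picked up the intended environment.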

Next, start or connect to the MLRun service:

Start a local Docker MLRun service by running make mlrun-docker, or edit the DBPATH and credentials in the mlrun.env file to use a remote MLRun service.
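Before connecting to a remote service, it can help to confirm that the env file actually carries a service address. The sketch below assumes the file uses plain KEY=VALUE lines and that the address is stored under a key ending in DBPATH (for example `MLRUN_DBPATH`, which is an assumption here). The sample contents and the example.com URL are placeholders, not the repo's real settings, and both helper names are hypothetical.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def find_dbpath(settings):
    """Return the first non-empty value whose key ends in DBPATH, else None."""
    for key, value in settings.items():
        if key.endswith("DBPATH") and value:
            return value
    return None


if __name__ == "__main__":
    # Placeholder contents for illustration; read ./mlrun.env in practice.
    sample = """
    # remote MLRun service (hypothetical address)
    MLRUN_DBPATH=https://mlrun-api.example.com
    """
    print("service address:", find_dbpath(parse_env_text(sample)))
```

To check the real file, pass `open("mlrun.env").read()` to `parse_env_text` instead of the sample string.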

Install and run inside GitHub Codespaces

This project is configured to run "as is" inside GitHub Codespaces (see the config files under /.devcontainer). After the Codespaces environment starts, you need to start a local MLRun service or connect to a remote one.

  • For a minimal, local MLRun (no UI), run: make mlrun-api
  • For a local Docker installation (requires a configuration with 8 CPUs or more), run: make mlrun-docker. To view the MLRun UI, open the ports tab and browse to the MLRun UI port.
  • For a remote MLRun service, edit the DBPATH and credentials in the mlrun.env file.

The local MLRun service must be started every time the Codespaces environment is restarted.

Install and run in SageMaker Studio and Studio Lab

First, load this project into SageMaker by clicking Open In Studio Lab or through the SageMaker UI.

After the project is loaded, open a terminal, cd into the project directory, and run:

make conda-env

For a minimal setup, run the MLRun service as a local process (no UI):

conda activate mlrun && make mlrun-api

To use a remote MLRun service, edit the DBPATH and credentials in the mlrun.env file.

Make sure all your tasks and notebooks use the mlrun Python environment!
