
Release 0.21.0rc0

Pre-release
@dhruvesh09 dhruvesh09 released this 30 Jan 20:40
· 58 commits to r0.21 since this release
220dff9

Version 0.21.0rc0

Major Features and Improvements

  • TFX version 0.21.0 will be the last version of TFX supporting Python 2.
  • Added support for RuntimeParameters to allow users to specify templated
    values at runtime. This is currently only supported in Kubeflow Pipelines.
    Currently, only attributes in ComponentSpec.PARAMETERS and the URI of
    external artifacts can be parameterized (component inputs / outputs can
    not yet be parameterized). See
    tfx/examples/chicago_taxi_pipeline/taxi_pipeline_runtime_parameter.py
    for example usage.
  • Users can access the parameterized pipeline root when defining the
    pipeline by using the pipeline.ROOT_PARAMETER placeholder in
    KubeflowDagRunner.
  • Users can pass appropriately encoded Python dict objects to specify
    protobuf parameters in ComponentSpec.PARAMETERS; these will be decoded
    into the proper protobuf type. This lets users avoid manually constructing
    complex nested protobuf messages in the component interface.
  • Added support in Trainer for using other model artifacts. This enables
    scenarios such as warm-starting.
  • Updated trainer executor to pass through custom config to the user module.
  • Artifact type-specific properties can be defined through overriding the
    PROPERTIES dictionary of a types.artifact.Artifact subclass.
  • Added a new example of chicago_taxi_pipeline on Google Cloud BigQuery ML.
  • Added support for multi-core processing in the Flink and Spark Chicago Taxi
    PortableRunner example.
  • Added a metadata adapter in Kubeflow to support logging the Argo pod ID as
    an execution property.
  • Added a prototype Tuner component and an end-to-end iris example.
  • Created a new generic trainer executor for non-Estimator-based models,
    e.g., native Keras.
  • Updated to support passing tfma.EvalConfig in evaluator when calling TFMA.
  • Users can create a pipeline using a new experimental CLI command,
    template.
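
The RuntimeParameter and pipeline.ROOT_PARAMETER features above can be sketched as follows. This is an illustrative fragment rather than a runnable program: it assumes a TFX 0.21 installation with Kubeflow Pipelines support, and the parameter name, default URI, and component list are placeholders.

```python
# Illustrative sketch, assuming TFX 0.21 with Kubeflow Pipelines support.
from typing import Text

from tfx.orchestration import data_types
from tfx.orchestration import pipeline
from tfx.orchestration.kubeflow import kubeflow_dag_runner

# A templated value resolved at runtime. In 0.21, only attributes in
# ComponentSpec.PARAMETERS and external artifact URIs can be parameterized.
data_root = data_types.RuntimeParameter(
    name='data-root',               # placeholder name
    default='gs://my-bucket/data',  # placeholder default URI
    ptype=Text,
)

# pipeline.ROOT_PARAMETER is the placeholder for the pipeline root that
# KubeflowDagRunner fills in at runtime.
p = pipeline.Pipeline(
    pipeline_name='parameterized_pipeline',
    pipeline_root=pipeline.ROOT_PARAMETER,
    components=[],  # components elided; e.g. an ExampleGen fed by data_root
)

kubeflow_dag_runner.KubeflowDagRunner().run(p)
```

See tfx/examples/chicago_taxi_pipeline/taxi_pipeline_runtime_parameter.py for a complete version.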

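The dict-to-protobuf decoding described above can be sketched with a Pusher configuration. This is a hypothetical fragment assuming TFX 0.21 APIs; the upstream component handles (trainer, model_validator) and the serving directory path are placeholders from a typical pipeline.

```python
# Illustrative sketch, assuming TFX 0.21. Instead of constructing a
# pusher_pb2.PushDestination message by hand, an appropriately encoded
# dict can be passed and is decoded into the proper protobuf type.
from tfx.components import Pusher

pusher = Pusher(
    model=trainer.outputs['model'],                    # defined elsewhere
    model_blessing=model_validator.outputs['blessing'],
    push_destination={
        'filesystem': {
            'base_directory': '/serving/model/dir'     # placeholder path
        }
    },
)
```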
Bug fixes and other changes

  • Added support for an hparams artifact as an input to Trainer in
    preparation for tuner support.
  • Refactored common dependencies in the TFX Dockerfile into a base image to
    improve the reliability of the image building process.
  • Fixed a missing TensorBoard link in KubeflowDagRunner.
  • Depends on apache-beam[gcp]>=2.17,<3.
  • Depends on ml-metadata>=0.21,<0.22.
  • Depends on tensorflow-data-validation>=0.21,<0.22.
  • Depends on tensorflow-model-analysis>=0.21,<0.22.
  • Depends on tensorflow-transform>=0.21,<0.22.
  • Depends on tfx-bsl>=0.21,<0.22.
  • Depends on pyarrow>=0.14,<0.15.
  • Removed tf.compat.v1 usage for iris and cifar10 examples.
  • CSVExampleGen: started using the CSV decoding utilities in tfx-bsl
    (tfx-bsl>=0.15.2).
  • Fixed problems with Airflow tutorial notebooks.
  • Added performance improvements for the Transform Component (for statistics
    generation).
  • Raised exceptions when container building fails.
  • Enhanced the custom Slack component by adding a Kubeflow example.
  • Allowed Windows-style paths in the Transform component cache.
  • Fixed a bug in the CLI (--engine=kubeflow) which used a hard-coded,
    obsolete image (TFX 0.14.0) as the base image.
  • Fixed a bug in the CLI (--engine=kubeflow) which could not handle the
    skaffold response when an already built image was reused.
  • Allowed users to specify the region to use when serving with AI Platform.
  • Allowed users to give a deterministic job ID to the AI Platform Training
    job.
  • System-managed artifact properties ("name", "state", "pipeline_name" and
    "producer_component") are now stored as ML Metadata artifact custom
    properties.
  • Fixed loading trainer and transformation functions from python module files
    without the .py extension.
  • Fixed some ill-formed visualization when running on KFP.
  • Removed system info from artifact properties; channels are now used to
    hold the info for generating MLMD queries.
  • Now rely on MLMD context for inter-component artifact resolution and
    execution publishing.
  • Added pipeline level context and component run level context.
  • Included test data for examples/chicago_taxi_pipeline in package.
  • Changed BaseComponentLauncher to require the user to pass in an ML
    Metadata connection object instead of an ML Metadata connection config.
  • Capped the version of the TensorFlow runtime used in Google Cloud
    integration to 1.15.
  • Updated Chicago Taxi example dependencies to Beam 2.17.0, Flink 1.9.1, Spark
    2.4.4.
  • Fixed an issue where build_ephemeral_package() used an incorrect path to
    locate the tfx directory.
  • The ImporterNode now allows specification of general artifact properties.
  • Added 'tfx_executor', 'tfx_version' and 'tfx_py_version' labels for CAIP,
    BQML and Dataflow jobs submitted from TFX components.

Deprecations

Breaking changes

For pipeline authors

  • Standard artifact TYPE_NAME strings were reconciled to match their class
    names in types.standard_artifacts.
  • The "split" property on multiple artifacts has been replaced with the
    JSON-encoded "split_names" property on a single grouped artifact.
  • The execution caching mechanism was changed to rely on ML Metadata
    pipeline context. Existing cached executions will not be reused when running
    on this version of TFX for the first time.

For component authors

  • Passing artifact type name strings to the types.artifact.Artifact and
    types.channel.Channel constructors is no longer supported; such usage
    should be replaced with references to the artifact subclasses defined in
    types.standard_artifacts.* or to custom subclasses of
    types.artifact.Artifact.
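
Under this change, a custom artifact type is declared by subclassing types.artifact.Artifact, which also ties into the PROPERTIES mechanism noted under the major features. A minimal sketch, assuming the TFX 0.21 Property/PropertyType helpers; MyDataset and its properties are hypothetical:

```python
# Illustrative sketch, assuming TFX 0.21. 'MyDataset' and its
# properties are hypothetical examples.
from tfx.types.artifact import Artifact, Property, PropertyType
from tfx.types.channel import Channel


class MyDataset(Artifact):
  # Type name string, matching the class name as in types.standard_artifacts.
  TYPE_NAME = 'MyDataset'
  # Artifact type-specific properties, defined by overriding PROPERTIES.
  PROPERTIES = {
      'row_count': Property(type=PropertyType.INT),
      'format': Property(type=PropertyType.STRING),
  }


# Channels now take an artifact subclass rather than a type name string.
examples_channel = Channel(type=MyDataset)
```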

Documentation updates