Skip to content

TFX 1.0.0-rc2

Pre-release
Pre-release
Compare
Choose a tag to compare
@dhruvesh09 dhruvesh09 released this 16 Jul 20:03
· 2 commits to r1.0.0 since this release
ca4ee1b

Major Features and Improvements

  • Added tfx.v1 Public APIs, please refer to
    API doc for details.
  • Transform component now computes pre-transform and post-transform statistics
    and stores them in new, indvidual outputs ('pre_transform_schema',
    'pre_transform_stats', 'post_transform_schema', 'post_transform_stats',
    'post_transform_anomalies'). This can be disabled by setting
    disable_statistics=True in the Transform component.
  • BERT cola and mrpc examples now demonstrate how to calculate statistics for
    NLP features.
  • TFX CLI now supports
    Vertex Pipelines.
    use it with --engine=vertex flag.
  • Telemetry: Only first-party tfx component's executor telemetry will be
    collected. All other executors will be recorded as third_party_executor.
    For labels longer than 63, keep first 63 characters (instead of last 63
    characters before).
  • Supports text type (use proto json string format) RuntimeParam for protos.
  • Combined/moved taxi's runtime_parameter, kubeflow_local and kubleflow_gcp
    example pipelines into one penguin_pipeline_kubeflow example
  • Transform component now supports passing stats_options_updater_fn directly
    as well as through the module file.
  • Placeholders support accessing artifact property and custom property.
  • Removed the extra node information in IR for KubeflowDagRunner, to reduce
    size of generated IR.

Breaking Changes

  • Removed unneccessary default values for required component input Channels.
  • The _PropertyDictWrapper internal wrapper for component.inputs and
    component.outputs was removed: component.inputs and component.outputs
    are now unwrapped dictionaries, and the attribute accessor syntax (e.g.
    components.outputs.output_name) is no longer supported. Please use the
    dictionary indexing syntax (e.g. components.outputs['output_name'])
    instead.

For Pipeline Authors

  • N/A

For Component Authors

  • Apache Beam support is migrated from TFX Base Components and Executors to
    dedicated Beam Components and Executors. BaseExecutor will no longer embed
    beam_pipeline_args. Custom executors for Beam powered components should
    now extend BaseBeamExecutor instead of BaseExecutor.

Deprecations

  • Deprecated nested RuntimeParam for Proto, Please use text type (proto json
    string) RuntimeParam instead of Proto dict with nested RuntimeParam in it.

Bug Fixes and Other Changes

  • Forces keyword arguments for AirflowComponent to make it compatible with
    Apache Airflow 2.1.0 and later.
  • Fixed issue where passing analyzer_cache to tfx.components.Transform
    before there are any Transform cache artifacts published would fail.
  • Included type information according to PEP-561. However, protobuf generated
    files don't have type information, and you might need to ignore errors from
    them. For example, if you are using mypy, see
    the related doc.
  • Removed six dependency.
  • Depends on apache-beam[gcp]>=2.29,<3.
  • Depends on google-cloud-bigquery>=1.28.0,<2.21
  • Depends on ml-metadata>=1.0.0,<1.1.0.
  • Depends on protobuf>=3.13,<4.
  • Depends on struct2tensor>=0.31.0,<0.32.0.
  • Depends on tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3.
  • Depends on tensorflow-data-validation>=1.0.0,<1.1.0.
  • Depends on tensorflow-hub>=0.9.0,<0.13.
  • Depends on tensorflowjs>=3.6.0,<4.
  • Depends on tensorflow-model-analysis>=0.31.0,<0.32.0.
  • Depends on tensorflow-serving-api>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3.
  • Depends on tensorflow-transform>=1.0.0,<1.1.0.
  • Depends on tfx-bsl>=1.0.0,<1.1.0.

Documentation Updates

  • Update the Guide of TFX to adopt 1.0 API.
  • TFT and TFDV component documentation now describes how to
    configure pre-transform and post-transform statistics, which can be used for
    validating text features.