You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This commit was created on GitHub.com and signed with GitHub’s verified signature.
The key has expired.
Major Features and Improvements
Introduced experimental Python function component decorator (@component
decorator under tfx.dsl.component.experimental.decorators) allowing
Python function-based component definition.
Added the experimental TemplatedExecutorContainerSpec executor class that
supports structural placeholders (not Jinja placeholders).
Added the experimental function "create_container_component" that
simplifies creating container-based components.
Implemented a TFJS rewriter.
Added the scripts/run_component.py script which makes it easy to run the
component code and executor code. (Similar to scripts/run_executor.py)
Added support for container component execution to BeamDagRunner.
Introduced experimental generic Artifact types for ML workflows.
Added support for float execution properties.
Bug fixes and other changes
Migrated BigQueryExampleGen to the new (experimental) ReadFromBigQuery
PTramsform when not using Dataflow runner.
Enhanced add_downstream_node / add_upstream_node to apply symmetric changes
when being called. This method enables task-based dependencies by enforcing
execution order for synchronous pipelines on supported platforms. Currently,
the supported platforms are Airflow, Beam, and Kubeflow Pipelines. Note that
this API call should be considered experimental, and may not work with
asynchronous pipelines, sub-pipelines and pipelines with conditional nodes.
Added the container-based sample pipeline (download, filter, print)
Removed the incomplete cifar10 example.
Removed python-snappy from [all] extra dependency list.
Tests depends on apache-airflow>=1.10.10,<2;
Removed test dependency to tzlocal.
Fixes unintentional overriding of user-specified setup.py file for Dataflow
jobs when running on KFP container.
Made ComponentSpec().inputs and .outputs behave more like real dictionaries.
Depends on kerastuner>=1,<2.
Depends on pyyaml>=3.12,<6.
Depends on apache-beam[gcp]>=2.21,<3.
Depends on grpcio>=2.18.1,<3.
Depends on kubernetes>=10.0.1,<12.
Depends on tensorflow>=1.15,!=2.0.*,<3.
Depends on tensorflow-data-validation>=0.22.0,<0.23.0.
Depends on tensorflow-model-analysis>=0.22.1,<0.23.0.
Depends on tensorflow-transform>=0.22.0,<0.23.0.
Depends on tfx-bsl>=0.22.0,<0.23.0.
Depends on ml-metadata>=0.22.0,<0.23.0.
Fixed a bug in io_utils.copy_dir which prevent it to work correctly for
nested sub-directories.
Breaking changes
For pipeline authors
Changed custom config for the Do function of Trainer and Pusher to accept
a JSON-serialized dict instead of a dict object. This also impacts all the
Do functions under tfx.extensions.google_cloud_ai_platform and tfx.extensions.google_cloud_big_query_ml. Note that this breaking
change occurs at the signature of the executor's Do function. Therefore, if
the user did not customize the Do function, and the compile time SDK version
is aligned with the run time SDK version, previous pipelines should still
work as intended. If the user is using a custom component with customized
Do function, custom_config should be assumed to be a JSON-serialized
string from next release.
For users of BigQueryExampleGen, --temp_location is now a required Beam
argument, even for DirectRunner. Previously this argument was only required
for DataflowRunner. Note that the specified value of --temp_location
should point to a Google Cloud Storage bucket.
Revert current per-component cache API (with enable_cache, which was only
available in tfx>=0.21.3,<0.22), in preparing for a future redesign.
For component authors
Converted the BaseNode class attributes to the constructor parameters. This
won't affect any components derived from BaseComponent.
Changed the encoding of the Integer and Float artifacts to be more portable.
Documentation updates
Added concept guides for understanding TFX pipelines and components.
Added guides to building Python function-based components and
container-based components.
Added BulkInferrer component and TFX CLI documentation to the table of
contents.