DataHub v0.8.25

@shirshanka shirshanka released this 07 Feb 22:32
ec062b6

Known Issues

  • Adding Glossary Terms to schema fields does not work with this version due to a bug. Upgrade to v0.8.26 for the fix.

Release Highlights

Buckle up, folks! v0.8.25 brings some very exciting (and highly-requested!) updates.

Notable UI-Based Features

  • UI-based Ingestion - as demoed in the December Town Hall, we now support creating, configuring, scheduling, & executing batch metadata ingestion from the DataHub user interface. This makes getting metadata into DataHub easier by minimizing the overhead required to operate custom integration pipelines.
  • Data Domains - DataHub now supports grouping data assets into logical collections called Domains. Domains are curated, top-level folders or categories where related assets can be explicitly grouped. Read the guide here!
  • Data Containers are now supported! A Container is a physical grouping of entities, e.g. a Schema is a container of one or more Datasets, and a Dashboard is a container of one or more Charts.

Notable Metadata Model & Ingestion-Based Features

  • Data Quality test results are now supported in the DataHub metadata model. This is the first milestone toward surfacing Dataset & Column-level Data Quality results in the UI (read full scope of work here). Future releases will include a Great Expectations integration & UI support - we’re on track to complete this in Q1 as planned.
  • Avro files are now supported in the Data Lake File ingestion source
  • Ingest metadata from multiple instances of the same platform type. This has been a very common use case within the Community, and you can now differentiate between multiple instances of the same platform type! If you have pre-existing entries, use the datahub migrate command to migrate them to platform instances.
  • Ignore users from Top Users calculation
    • feat(ingestion): Adding ability to ignore users from top users calculation by @treff7es in #3735
  • BigQuery - Data Profiling on only the latest partition/shard
    • feat(ingestion) bigquery: Profiling only the latest partition/shard on bigquery by @treff7es in #3930
  • (feat)(Business Glossary) add tabular schema and new UI for business glossary by @saxo-lalrishav in #3813
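
To make the platform-instance item above concrete, here is a minimal Python sketch of how two deployments of the same platform can yield distinct dataset identifiers. It assumes the instance name is prepended to the dataset name inside the URN. The helper dataset_urn and the instance names emea_cluster and apac_cluster are hypothetical and exist only for illustration; this is not the DataHub client API.

```python
from typing import Optional


def dataset_urn(platform: str, name: str, env: str = "PROD",
                platform_instance: Optional[str] = None) -> str:
    """Build a dataset URN string, optionally qualified by a platform instance."""
    # Assumption: the instance is prepended to the dataset name with a dot.
    qualified = f"{platform_instance}.{name}" if platform_instance else name
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{qualified},{env})"


# Two hypothetical MySQL deployments that both contain a table sales.orders.
legacy = dataset_urn("mysql", "sales.orders")
emea = dataset_urn("mysql", "sales.orders", platform_instance="emea_cluster")
apac = dataset_urn("mysql", "sales.orders", platform_instance="apac_cluster")

for urn in (legacy, emea, apac):
    print(urn)
```

Without an instance, both deployments would collapse onto the same identifier (the legacy value here). With an instance, each deployment gets its own entry, which is why pre-existing entries need the migration step mentioned above.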

Notable Fixes

  • Fix to support View in Looker
    • feat(looker): Adding optional Looker external url base url config by @jjoyce0510 in #3985
  • fix(graphql): support group display name in ownership by @thomasplarsson in #3979
  • fix(profiling): Enabling profiling for low cardinality number columns by @treff7es in #3990
  • fix(ingestion): match default username for Azure OIDC and Azure ingestion source by @iasoon in #3926

DataHub Usage Guides

What's Changed

New Contributors

Full Changelog: v0.8.24...v0.8.25