DataHub v0.8.27
Release Highlights
Notable UI-Based Features
- The User Page has a new look! You can now quickly filter & search for entities owned by a User, update/edit the user profile, and see details of which Groups the User belongs to. See it in action here.
- Search for Entities by Owner - Easily filter search results by User/Group Owner
- Edit existing Glossary Terms - you can now edit/update Glossary Term descriptions via the UI. Future work will allow creating Terms from the UI as well - stay tuned!
- Improved Metadata Analytics - keep tabs on your DataHub entities across Domains, Platforms, Glossary Terms, Environments, & more. Check out the new & improved Analytics tab!
Notable Metadata Model & Ingestion-Based Features
- ClickHouse integration is now incubating! This is a 100% Community-led integration - huge shoutout to @ne1r0n & @havramar for pushing initial code & moving this work through!
- Kafka Stateful Ingestion - shoutout to @claudio-benfatto for building this out!
- Extract Airflow Task Description - big thanks to @guidoturtu for the contrib!
- BigQuery: profile latest Partition/Shard - We know that Data Profiling can be computationally expensive for partitioned/sharded BQ instances. We now support profiling only the latest partition/shard to minimize processing load.
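To illustrate the idea behind latest-shard profiling, here is a minimal sketch of picking only the newest shard of each date-sharded table. It is not DataHub's implementation: the `latest_shards` helper and the `_YYYYMMDD` suffix convention are assumptions made for this example.

```python
import re

# Assumption for this sketch: date-sharded tables end in _YYYYMMDD.
SHARD_SUFFIX = re.compile(r"^(?P<base>.+)_(?P<shard>\d{8})$")


def latest_shards(table_names):
    """Return one table per logical dataset: its newest shard, or the table itself if unsharded."""
    newest = {}  # base name -> (shard suffix, full table name)
    for name in table_names:
        match = SHARD_SUFFIX.match(name)
        if match:
            base, shard = match.group("base"), match.group("shard")
        else:
            # Unsharded tables are kept under their own name.
            base, shard = name, ""
        # YYYYMMDD strings sort lexicographically in date order.
        if base not in newest or shard > newest[base][0]:
            newest[base] = (shard, name)
    return sorted(full_name for _, full_name in newest.values())


tables = [
    "events_20220101",
    "events_20220102",
    "users",
    "clicks_20211231",
    "clicks_20220101",
]
to_profile = latest_shards(tables)
print(to_profile)
```

Profiling only the tables in `to_profile` touches three tables instead of five here; the savings grow with the number of historical shards.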
Notable Docs Updates
- NEW! Tips for Searching within DataHub - Ever wondered how to make the most of Searching within DataHub? Check out this doc put together by @xiphl
- Improvements to Metadata Model Docs - This is a huge win for the Community - we’re taking a big step toward providing auto-generated & curated docs related to the Metadata Model - take a look here.
What's Changed
- feat(deprecation): Entity Deprecation Backend by @jjoyce0510 in #4073
- Fixed auto complete pr comments by @Ankit-Keshari-Vituity in #4072
- fix(ingestion): enforce correct behaviour for commit policy by @claudio-benfatto in #4092
- fix(aggregate): Fix NPE in aggregate api by @dexter-mh-lee in #4095
- add Haibo corp by @wangqinghuan in #4082
- fix(ingestion): Add psutil dependency required for stateful ingestion reporting. by @rslanka in #4099
- docs(kafka): add example for using domains, change for clarity by @anshbansal in #4100
- feat(ui): Add display name & title to editable corp user properties. by @jjoyce0510 in #4097
- fix(ingestion): Enhance BigQuery source logging. by @rslanka in #4101
- fix(glossary terms): fix add glossary term flow by @gabe-lyons in #4106
- (docs) Add Zynga & Tableau logos by @maggiehays in #4109
- fix(ingestion): Add sql lineage to redshift-usage plugin by @dexter-mh-lee in #4103
- feat(ui): Add svg datahub satellite loading logo by @eburairu in #4067
- fix(ingestion): resolve oracle issue with large view definitions by @hsheth2 in #4027
- fix(ingest): ignore Postgres information_schema tables by default by @kevinhu in #4069
- fix(ingest) - close event loops in Okta source and add additional debug logging by @aditya-radhakrishnan in #4077
- chore(ingest): remove unused groupby_unsorted utility by @hsheth2 in #4011
- fix(docs): fixing metadata model doc generation script and updating png by @swaroopjagadish in #4120
- fix(ci): fix formatting in doc generation action yaml by @swaroopjagadish in #4121
- fix(ci): fix formatting for action yaml by @swaroopjagadish in #4122
- feat(Tags/Terms): Backend support for tag & term mutations by @jjoyce0510 in #4096
- docs(backup): add doc for taking backup by @anshbansal in #3917
- fix(docs): make intro to metadata ingestion easier for beginners by @anshbansal in #4039
- fix(ingest) Athena: db filter was not applied by @treff7es in #4127
- fix(ui) - move book logo to right of glossary term by @aditya-radhakrishnan in #4125
- fix(docs) Fix doc on modelDocUpload by @daha in #4112
- fix(cypress): force clicks on tag mutation test by @gabe-lyons in #4102
- feat(ingest) Athena: Getting table properties for Athena datasets by @treff7es in #4123
- fix(logging): Fix Restli Logging Filter to print full stack trace on error by @dexter-mh-lee in #4136
- docs : markdown fixes for db retention table by @satyamkrishna in #4133
- docs : markdown fixes for db retention table by @satyamkrishna in #4148
- feat(ingestion): Kafka stateful ingestion by @claudio-benfatto in #4028
- fix(docs): update graphql docs to reference new graphql file by @gabe-lyons in #4139
- Feature/oss/update to v2 endpoints by @RyanHolstien in #4128
- fix(cli): add timeout for telemetry calls by @anshbansal in #4135
- chore(cli): update default cli version pinned in the UI based ingestion by @anshbansal in #4150
- fix(docs): fix example of delta lake by @anshbansal in #4149
- fix(ui): Fix cutoff profiling axis labels by @jjoyce0510 in #4154
- feat(ingest): Glue - Support for domains and containers by @treff7es in #4110
- feat(ui): Host platform images on datahub-web-react by @ngamanda in #4118
- bug(seedData): adds a key to the root user seed data and fixes corner case check for missing key aspects by @RyanHolstien in #4162
- UI Fix: Modal close on Enter press, autofocus on modal, added split panel, alignment of button by @Ankit-Keshari-Vituity in #4155
- feat(ui): Edit glossary term descriptions via UI by @jjoyce0510 in #4156
- Update querying-entities.md -> Documentation Error by @buggythepirate in #4157
- refactor(metadata-io/test): common ElasticsearchContainer and ability to override from environment. by @stephenp-gr in #4152
- feat(ingestion): Add support for snowflake view lineage. by @rslanka in #4163
- Update the doc to including options to include Views by @cuong-pham in #4164
- fix(ingest): Use lower-case dataset names in the dataset urns for all SQL-styled datasets. by @rslanka in #4140
- chore(ingestion): upgrade mypy by @hsheth2 in #4141
- ci(ingestion): fix airflow 1 deps for tox by @hsheth2 in #4083
- fix(ingest) Glue: Removing sqlalchemy dependency from glue by @treff7es in #4168
- fix(ingest) Athena: Generating propert containers for Athena by @treff7es in #4167
- Feature/users and groups UI updated as per new design by @ShubhamThakre in #4134
- chore(docs): various cleanup for docs-website by @hsheth2 in #4143
- bugfix(logging): reduce log noise from authentication chain by @RyanHolstien in #4173
- bug(glossaryTermLabels): fix glossary term labels missing and add cypress test by @RyanHolstien in #4171
- fixes(ui): Misc UI fixes + Adding Owners to Search Filters by @jjoyce0510 in #4175
- BugFixes/user-and-groups-minor-ui-fixes by @ShubhamThakre in #4181
- feat(groups): Adding editable group properties in the backend by @jjoyce0510 in #4166
- fix(python build): Pinning markupsafe by @treff7es in #4188
- feat(analytics): Improve analytics page by adding more charts regarding metadata ingested by @dexter-mh-lee in #4176
- docs(model): auto-generated docs and hand-written docs for the metada… by @swaroopjagadish in #4189
- minor fixes(ui): Small UI display fixes by @jjoyce0510 in #4190
- fix(ui): Return empty search response on invalid characters in search by @jjoyce0510 in #4193
- refactor(spark-lineage): enhance logging and documentation by @MugdhaHardikar-GSLab in #4113
- fix(ui): Correctly display user photo on "list users" screen. by @jjoyce0510 in #4195
- fix(ingest) Snowflake: Handle external S3 bucket lineage for "External Tables". by @rslanka in #4192
- fix(Ingestion): Elastic http/https host support by @abiwill in #4191
- Pinning down elasticsearch to less than 8.0.0 by @pppsunil in #4182
- test(airflow): fix airflow version parsing by @hsheth2 in #4142
- fix(delete): Fixing NPE on delete urns path by @jjoyce0510 in #4197
- docs(website): Move company logos into tabs categorized by industry by @jeffmerrick in #4174
- fix(profile) Bigquery: Setting bigquery temp schema if it is set to fix limit and offset in profiling by @treff7es in #4161
- feat(ingest): record bucketed profiling runtimes by @kevinhu in #4068
- (docs) How to search better with search bar by @xiphl in #4200
- feat(ingest) Bigquery: Ignore temporary tables from lineage and connect edges directly by @treff7es in #4160
- feat(ingest): Not failing on table/view ingestion error by @treff7es in #4185
- fix(ingest): map additional Postgres types by @kevinhu in #4179
- feat(lineage): Add feature for lineage to capture airflow task description. by @guidoturtu in #4147
- add initial clickhouse support by @ne1r0n in #4057
- ci: update rename-namespace.sh to specify /bin/bash by @stephenp-gr in #4153
- fix(ingest): superset - adding missing greenlet dep by @swaroopjagadish in #4203
- fix(docs): fix config typo for stateful ingestion by @jieqiu0630 in #4202
- fix(dbt): don't produce key aspects if the entity has no other aspects by @gabe-lyons in #4217
- feat(ingest): Add support for non-default schema registry subject name strategies to the Kafka source by @rslanka in #4215
- fix(ingest): Revert use lower-case dataset names in the dataset urns for all SQL-styled datasets. by @rslanka in #4218
- Add pagination to group ownerships by @mmmeeedddsss in #4199
- feat(Spark-smoke-test): add spark smoke test by @MugdhaHardikar-GSLab in #4158
- feat(ingest): add Python libs for Urns by @tc350981 in #4172
- feat(GraphQL API): Adding group ownership by @jjoyce0510 in #4219
- fix(ui): Wrap homepage cards on long text by @jjoyce0510 in #4220
- fix(docs): add mention of list-runs command in CLI page by @anshbansal in #4210
- fix(ingestion): enable compat with avro 1.11 by @hsheth2 in #4205
New Contributors
- @Ankit-Keshari-Vituity made their first contribution in #4072
- @wangqinghuan made their first contribution in #4082
- @daha made their first contribution in #4112
- @satyamkrishna made their first contribution in #4133
- @ngamanda made their first contribution in #4118
- @buggythepirate made their first contribution in #4157
- @stephenp-gr made their first contribution in #4152
- @cuong-pham made their first contribution in #4164
- @pppsunil made their first contribution in #4182
- @guidoturtu made their first contribution in #4147
- @ne1r0n made their first contribution in #4057
- @jieqiu0630 made their first contribution in #4202
- @mmmeeedddsss made their first contribution in #4199
- @tc350981 made their first contribution in #4172
Full Changelog: v0.8.26...v0.8.27