Releases: datahub-project/datahub
DataHub v0.4.2
Added
- #1711 feature(ingest): add bigquery ETL script @mars-lan
- #1712 feat(ingest): add PostgreSQL ETL script @mars-lan
- #1713 feat(ingest): replace custom hive-etl with sql-based ETL @mars-lan
- #1714 feat(ingest): add snowflake ETL script @mars-lan
- #1706 Implemented data process search feature @liangjun-jiang
- #1742 feat(gms): add postgres & mariadb support to GMS @mars-lan
- #1752 build: build GitHub Pages from /docs directory @mars-lan
- #1745 feat(kafka-config): Add ability to configure other Kafka props @jsotelo
- #1754 Add documentation around the DataHub RFC process @jplaisted
Changed
- #1710 Refactor all ETL scripts to use Python 3 exclusively @mars-lan
- #1733 refactor(models): remove internal cluster model @hshahoss
- #1756 metadata-models 72.0.8 -> 80.0.0 @jywadhwani
- #1757 docs: add a sequence diagram and a description @liangjun-jiang
Removed
Fixed
- #1716 fix(py3): Bump ingestion Docker py dependency to 3.6 @keremsahin1
- #1726 fix: modify the etl script dependency @cobolbaby
- #1727 fix: correct the way to catch the exception @cobolbaby
- #1758 fix(ingestions): align the default kafka topics with PR @RealChrisL
DataHub v0.4.1
Added
- #1680 Data process entity @liangjun-jiang
- #1695 Implement data process graph feature
- #1708 feature(etl): add SQLAlchemy-based ingestion script @mars-lan
- #1707 Support for volta in web client @cptran777
- bbf7545 build: parallelize docker image builds @mars-lan
Changed
- #1700 Add missing updates from recent internal push @keremsahin1
- #1693 metadata-models 62.0.3 -> 72.0.8 @jywadhwani
- #1687 build(docker): refactor docker build scripts @mars-lan
- #1690 build(docker): refactor ingestion docker build script @mars-lan
- #1691 upgrade the version of neo4j @jywadhwani
- #1685 move the gradle plugin version to top level build.gradle @jywadhwani
- 63943a1 build: update workflows to build version-tagged docker images upon new release @mars-lan
Fixed
- #1697 fix: remove helm container command @jsotelo
- #1698 fix: add missing neo4j.host helm var @jsotelo
- #1709 [fix] load default picture link if not present @jywadhwani
- #1704 fix-DatasetSearchConfig class ref @geosmart
- f79b2c9 fix(ingestion): Fix sample MCE for data process @keremsahin1
- 867dbd0 fix: use tuple notations for union types @mars-lan
DataHub v0.4.0
Added
- #1568 Allow storing Quickstart Docker data in a folder for persistence @afranzi
- #1602 feat: support for Kubernetes-based deployment @bharatak
- #1608 add lineage hive @clojurians-org
- #1609 add support for kubernetes helm packaging @bharatak
- #1611 init jdbc generator @clojurians-org
- #1613 add oracle driver @clojurians-org
- #1629 feat: Converting MCE to a Spring boot Application @arunvasudevan
- #1635 feat: convert MAE application to springboot @arunvasudevan
- #1637 add postgresql support and force utf8 encode on non-utf8 locale @clojurians-org
- #1647 Add openldap-etl script and instruction @loftyet
- #1673 add DataProcess Urn @loftyet
- #1678 refactor(pdl): convert all pdsc to pdl @mars-lan
- #1677 feat(urn): add AzkabanFlow and AzkabanJob urn @hshahoss
Changed
- #1601 build: bypass testing datahub-web when running idea gradle task @mars-lan
- 6ab2ab6 build(mysql): Change mysql dependency from latest to 5.7 @keremsahin1
- #1610 metadata-models 54.0.1 -> 58.0.1 @jywadhwani
- #1616 metadata-models 58.0.1 -> 62.0.3 @jywadhwani
- #1619 refactor(gms): move gms restli resources @jywadhwani
- #1624 build(gms): rename JettyRunWar task to run @mars-lan
- #1626 refactor(frontend): fails loudly to help debug gms issue @mars-lan
- #1633 add field for ui and parser reference @clojurians-org
- #1641 migrate hive generator @clojurians-org
- #1662 style: add checkstyle and IDEA code style config @mars-lan
- #1664 build: update pegasus to v28 to add PDL support @mars-lan
- #1667 refactor: change the default log location @mars-lan
- #1669 refactor: use named volume instead of bind mount in quickstart @mars-lan
Deprecated
Removed
Fixed
- #1605 specify explicit avro lib for compatibility issue @jhsenjaliya
- d1cf628 Fix: Docker Quickstart - Sample Data Loading Error @RealChrisL
- ba33c7a Specify python version in mce-cli requirement.txt @RealChrisL
- #1621 fix: elasticsearch not starting on Mac @mars-lan
- #1622 build: pegasus plugin doesn't work well with gradle caching @mars-lan
- #1625 fix(gms): unable to find registered resources @mars-lan
- #1630 fix: Reduce gms & frontend docker image sizes @keremsahin1
- #1631 fix(Docker): Fixing 'dockerize not found' issue while starting @keremsahin1
- #1632 fix: Reduce mae-consumer & mce-consumer docker image sizes @bharatak
- #1646 fix(metadata-ingestion): pass schema_record to mce-cli consumer @RealChrisL
- #1657 fix(quickstart): set utf8mb4 for mysql @e11it
- #1661 fix(urn): Move UrnCoercer into corresponding Urn class @mars-lan
- #1665 fix: use semantic instead of literal comparison in DefaultEqualityTester @mars-lan
- #1670 build: start enforcing checkstyle and fix all violations @mars-lan
- #1672 fix(frontend): Extract lastModified field from downstream/upstream aspect @keremsahin1
DataHub v0.3.1
Added
- 3765c1d Enable parallel Gradle build @keremsahin1
- #1575 Enable Failed Metadata Change Event for MCE Processor @arunvasudevan
- #1570 Use pictureLink property to show person picture @afranzi
- #1569 Show Dataset description in Dataset view @afranzi
- #1597 Ingestion tool to load JSON data to DataHub (in /contrib) @clojurians-org
- #1585 Nix sandbox (in /contrib) @clojurians-org
- 71f2d14 Added EventUtilsTest @keremsahin1
Changed
- 36a5d23 Migrate to getSnapshot API & remove dataset snapshot @keremsahin1
- b17b91f Bump gradle to 5.6.4 and pegasus to 27.7.18 @keremsahin1
- Documentation
Removed
- #1581 Drop LinkedIn internal fabrics @mars-lan
- 1fff6c9 Cleanup unused snapshot resources for corp users & groups @keremsahin1
Fixed
- #1590 Gradle Build Fails When Run in Parallel @RyanHolstien
- #1574 Fix typo and watchman error @clojurians-org
- #1564 Allow dashes in user urn @ben5448
- 3d64c45 Fix browse result pagination @keremsahin1
- fba5cd8 Handle optional aspects/fields for CorpUser gracefully @keremsahin1
DataHub v0.3.0
- Onboarded people as a top level entity
- Enabled people search
- Created Docker image for running ingestion pipeline
- Misc bug fixes
- Documentation updates
- Code cleanup
DataHub v0.2.0-alpha
- Added Neo4j graph indexing/querying pipeline
- Dataset downstream lineage is now powered by graph
- Added MySQL ETL example
- Updated docker-compose settings for low resource environments
- Misc bug fixes
DataHub v0.1.1-alpha
- Added Kafka crawler sample
- Added support for surfacing downstream dataset lineage using search. This is a stop-gap solution until Neo4j support is added.
Data Hub v0.1.0-alpha
First official release of Data Hub:
- Leveraging GMA architecture
- Backend: GMS implementation - support for dataset & user entities
- Frontend: Data Hub Web Application
- Pub-sub: Kafka
- Stream processing: MXE consumer jobs using Kafka Streams
- Generic modeling layer with CRUD on MySQL
- Search support using Elasticsearch
- Supported metadata sources: LDAP and Hive