forked from datahub-project/datahub
Commit
Merge branch 'datahub-project:master' into master
Showing 21 changed files with 408 additions and 172 deletions.
@@ -0,0 +1,36 @@

import FeatureAvailability from '@site/src/components/FeatureAvailability';

# AI Documentation

<FeatureAvailability saasOnly />

:::info

This feature is currently in closed beta. Reach out to your Acryl representative to get access.

:::

With AI-powered documentation, you can automatically generate documentation for tables and columns.

<p align="center">
<iframe width="560" height="315" src="https://www.youtube.com/embed/_7DieZeZspY?si=Q5FkCA0gZPEFMj0Y" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</p>

## Configuring

No configuration is required - just hit "Generate" on any table or column in the UI.

## How it works

Generating good documentation requires a holistic understanding of the data. Information we take into account includes, but is not limited to:

- Dataset name and any existing documentation
- Column name, type, description, and sample values
- Lineage relationships to upstream and downstream assets
- Metadata about other related assets

Data privacy: Your metadata is not sent to any third-party LLMs. We use AWS Bedrock internally, which means all metadata remains within the Acryl AWS account. We do not fine-tune on customer data.

## Limitations

- This feature is powered by an LLM, which can produce inaccurate results. While we've taken steps to reduce the likelihood of hallucinations, they can still occur.
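
To make the inputs listed under "How it works" more concrete, the following is a minimal Python sketch of how that metadata could be gathered into one text context before it is handed to a model. It is an illustration only: `Column`, `DatasetContext`, and `render_context` are hypothetical names invented for this example, and nothing here describes how the actual service assembles its input.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Column:
    """Hypothetical container for the per-column inputs named in the docs."""
    name: str
    type: str
    description: str = ""
    sample_values: list[str] = field(default_factory=list)


@dataclass
class DatasetContext:
    """Hypothetical container for the dataset-level inputs named in the docs."""
    name: str
    existing_docs: str = ""
    columns: list[Column] = field(default_factory=list)
    upstreams: list[str] = field(default_factory=list)
    downstreams: list[str] = field(default_factory=list)


def render_context(ctx: DatasetContext) -> str:
    """Flatten the collected metadata into plain text, one fact per line."""
    lines = [f"Dataset: {ctx.name}"]
    if ctx.existing_docs:
        lines.append(f"Existing documentation: {ctx.existing_docs}")
    for col in ctx.columns:
        samples = ", ".join(col.sample_values) or "n/a"
        description = col.description or "undocumented"
        lines.append(f"Column {col.name} ({col.type}): {description}; samples: {samples}")
    if ctx.upstreams:
        lines.append("Upstream: " + ", ".join(ctx.upstreams))
    if ctx.downstreams:
        lines.append("Downstream: " + ", ".join(ctx.downstreams))
    return "\n".join(lines)


# Example values are made up for illustration.
example = DatasetContext(
    name="orders",
    existing_docs="Customer orders.",
    columns=[Column("amount", "DECIMAL", sample_values=["19.99", "5.00"])],
    upstreams=["raw.orders"],
)
print(render_context(example))
```

The point of the sketch is that column-level, dataset-level, and lineage metadata all end up in the same context, which is why richer existing metadata tends to help the generated text.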
@@ -0,0 +1,72 @@

import FeatureAvailability from '@site/src/components/FeatureAvailability';

# AI Glossary Term Suggestions

<FeatureAvailability saasOnly />

:::info

This feature is currently in closed beta. Reach out to your Acryl representative to get access.

:::

The AI Glossary Term Suggestion automation uses LLMs to suggest [Glossary Terms](../glossary/business-glossary.md) for tables and columns in your data.

This is useful for improving coverage of glossary terms across your organization, which is important for compliance and governance efforts.

This automation can:

- Automatically suggest glossary terms for tables and columns.
- Go beyond a predefined set of terms and work with your business glossary.
- Generate [proposals](../managed-datahub/approval-workflows.md) for owners to review, or automatically add terms to tables/columns.
- Automatically adjust to human-provided feedback and curation (coming soon).

## Prerequisites

- A business glossary with terms defined. Additional metadata, like documentation and existing term assignments, will improve the accuracy of our suggestions.

## Configuring

1. **Navigate to Automations**: Click on 'Govern' > 'Automations' in the navigation bar.

<p align="center">
<img width="30%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/automation/saas/automations-nav-link.png"/>
</p>

2. **Create the Automation**: Click on 'Create' and select 'AI Glossary Term Suggestions'.

<p align="center">
<img width="40%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/automation/saas/ai-term-suggestion/automation-type.png"/>
</p>

3. **Configure the Automation**: Fill in the required fields to configure the automation.
The main fields to configure are (1) what terms to use for suggestions and (2) what entities to generate suggestions for.

<p align="center">
<img width="50%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/automation/saas/ai-term-suggestion/automation-config.png"/>
</p>

4. Once it's enabled, that's it! You'll start to see terms show up in the UI, either on assets or in the proposals page.

<p align="center">
<img width="70%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/automation/saas/ai-term-suggestion/term-proposals.png"/>
</p>

## How it works

The automation scans all the datasets matched by the configured filters and generates suggestions for each one.
If new entities are added that match the configured filters, those will also be classified within 24 hours.

We take into account the following metadata when generating suggestions:

- Dataset name and description
- Column name, type, description, and sample values
- Glossary term name, documentation, and hierarchy
- Feedback loop: existing assignments and accepted/rejected proposals (coming soon)

Data privacy: Your metadata is not sent to any third-party LLMs. We use AWS Bedrock internally, which means all metadata remains within the Acryl AWS account. We do not fine-tune on customer data.

## Limitations

- A single configured automation can classify at most 10,000 entities.
- We cannot do partial reclassification. If you add a new column to an existing table, we won't regenerate suggestions for that table.
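
The two limitations above can be pictured with a small Python sketch. It is a hedged illustration, not the automation's real logic: `Dataset`, `select_for_classification`, and `MAX_ENTITIES_PER_AUTOMATION` are hypothetical names, and the sketch assumes the 10k cap counts every entity the automation has classified so far rather than resetting on each scan.

```python
from __future__ import annotations

from dataclasses import dataclass

# Cap taken from the Limitations list; treating it as a running total is an assumption.
MAX_ENTITIES_PER_AUTOMATION = 10_000


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in for a dataset matched by the automation's filters."""
    urn: str
    columns: tuple[str, ...]


def select_for_classification(
    matched: list[Dataset],
    already_classified: set[str],
    limit: int = MAX_ENTITIES_PER_AUTOMATION,
) -> list[Dataset]:
    """Pick the datasets that would receive suggestions in this scan.

    Eligibility is keyed on the URN alone, so a dataset that was classified
    earlier stays excluded even if its columns changed afterwards. Datasets
    beyond the remaining budget are left out.
    """
    remaining = max(limit - len(already_classified), 0)
    new = [d for d in matched if d.urn not in already_classified]
    return new[:remaining]


# Illustrative URNs; the table and platform names are made up.
ORDERS = "urn:li:dataset:(urn:li:dataPlatform:snowflake,shop.orders,PROD)"
USERS = "urn:li:dataset:(urn:li:dataPlatform:snowflake,shop.users,PROD)"

first_scan = select_for_classification([Dataset(ORDERS, ("id", "amount"))], set())
done = {d.urn for d in first_scan}

# Later, `orders` gains a column and `users` appears as a new match.
second_scan = select_for_classification(
    [Dataset(ORDERS, ("id", "amount", "currency")), Dataset(USERS, ("id", "email"))],
    done,
)
print([d.urn for d in second_scan])  # only the `users` URN is selected
```

In this model the new `currency` column on `orders` never triggers a second pass, which mirrors the "no partial reclassification" limitation, while the newly matched `users` dataset is picked up on the next scan.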