feat(ingest/dremio): Dremio Source Ingestion (#11598)
Co-authored-by: Jonny Dixon <jonny.dixon@acryl.io>
Co-authored-by: Jonny Dixon <45681293+acrylJonny@users.noreply.github.com>
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
1 parent 5c58128, commit 5d17ecb
Showing 29 changed files with 11,812 additions and 0 deletions.

@@ -0,0 +1,11 @@
### Concept Mapping

The table below shows how Dremio entities and concepts map to the corresponding entities in DataHub:

| Source Concept             | DataHub Concept | Notes                                            |
| -------------------------- | --------------- | ------------------------------------------------ |
| **Physical Dataset/Table** | `Dataset`       | Subtype: `Table`                                 |
| **Virtual Dataset/Views**  | `Dataset`       | Subtype: `View`                                  |
| **Spaces**                 | `Container`     | Mapped to a DataHub `Container`. Subtype: `Space`   |
| **Folders**                | `Container`     | Mapped to a DataHub `Container`. Subtype: `Folder`  |
| **Sources**                | `Container`     | Mapped to a DataHub `Container`. Subtype: `Source`  |
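
The mapping in the table can be sketched as a small lookup structure. This is only an illustration of the table, not the connector's actual code: the `DataHubConcept` class, the `CONCEPT_MAP` dictionary, the `map_concept` helper, and the lowercase key labels are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataHubConcept:
    entity: str   # DataHub concept from the table ("Dataset" or "Container")
    subtype: str  # Subtype listed in the Notes column


# Hypothetical lookup table. The keys are illustrative labels chosen for this
# sketch; they are not values returned by the Dremio API.
CONCEPT_MAP = {
    "physical_dataset": DataHubConcept("Dataset", "Table"),
    "virtual_dataset": DataHubConcept("Dataset", "View"),
    "space": DataHubConcept("Container", "Space"),
    "folder": DataHubConcept("Container", "Folder"),
    "source": DataHubConcept("Container", "Source"),
}


def map_concept(dremio_concept: str) -> Optional[DataHubConcept]:
    """Return the DataHub concept for a Dremio concept label, or None if unmapped."""
    return CONCEPT_MAP.get(dremio_concept.strip().lower())
```

For example, `map_concept("Space")` yields a `Container` with subtype `Space`, while a label missing from the table yields `None`.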

@@ -0,0 +1,29 @@
### Starter Recipe for Dremio Cloud Instance

```
source:
  type: dremio
  config:
    # Authentication details
    authentication_method: PAT # Use Personal Access Token for authentication
    password: <your_api_token> # Replace <your_api_token> with your Dremio Cloud API token
    is_dremio_cloud: True # Set to True for Dremio Cloud instances
    dremio_cloud_project_id: <project_id> # Provide the Project ID for Dremio Cloud
    # Enable query lineage tracking
    include_query_lineage: True
    # Optional
    source_mappings:
      - platform: s3
        source_name: samples
    # Optional
    schema_pattern:
      allow:
        - "<source_name>.<table_name>"
sink:
  # Define your sink configuration here
```
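
If you prefer to assemble the recipe in code, the same structure can be expressed as a plain dictionary. The sketch below is a minimal illustration under stated assumptions: `build_cloud_recipe` is a hypothetical helper, the check that both the token and the project ID are non-empty is an assumption about what a Dremio Cloud setup needs, and the `console` sink is only a placeholder.

```python
from typing import Any, Dict


def build_cloud_recipe(api_token: str, project_id: str) -> Dict[str, Any]:
    """Build a dict with the same shape as the Dremio Cloud starter recipe."""
    # Assumption: a Dremio Cloud recipe needs both a token and a project ID.
    if not api_token or not project_id:
        raise ValueError("api_token and project_id must be non-empty")
    return {
        "source": {
            "type": "dremio",
            "config": {
                "authentication_method": "PAT",
                "password": api_token,
                "is_dremio_cloud": True,
                "dremio_cloud_project_id": project_id,
                "include_query_lineage": True,
            },
        },
        # Placeholder sink; replace with your real sink configuration.
        "sink": {"type": "console"},
    }
```

The optional `source_mappings` and `schema_pattern` keys are left out here; they could be added under `config` in the same way.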

@@ -0,0 +1,25 @@
### Setup

This integration pulls metadata directly from the Dremio APIs.

You need a running Dremio instance with access to the datasets you want to ingest, and API access must be enabled with a valid token.

The API token should have the necessary permissions to **read metadata** and **retrieve lineage**.

#### Steps to Get the Required Information

1. **Generate an API Token**:

   - Log in to your Dremio instance.
   - Navigate to your user profile in the top-right corner.
   - Select **Generate API Token** to create an API token for programmatic access.

2. **Permissions**:

   - The token should have **read-only** or **admin** permissions that allow it to:
     - View all datasets (physical and virtual).
     - Access all spaces, folders, and sources.
     - Retrieve dataset and column-level lineage information.

3. **Verify External Data Source Permissions**:
   - If Dremio is connected to external data sources (e.g., AWS S3, relational databases), ensure that Dremio has access to the credentials required for querying those sources.
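
Once a token exists, a quick way to check that it can read metadata is to call a catalog endpoint with it. The sketch below is not part of the connector. It assumes a self-hosted Dremio instance whose REST API exposes `/api/v3/catalog` and accepts the token as an `Authorization: Bearer` header; Dremio Cloud uses a different base URL. The function names are hypothetical.

```python
import urllib.request


def build_catalog_request(
    hostname: str, port: int, tls: bool, token: str
) -> urllib.request.Request:
    """Build a GET request for the catalog listing (assumed endpoint path)."""
    scheme = "https" if tls else "http"
    # Assumption: the catalog is served at /api/v3/catalog on this instance.
    url = f"{scheme}://{hostname}:{port}/api/v3/catalog"
    headers = {
        # Assumption: the API token is sent as a bearer token.
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


def token_can_read_catalog(
    request: urllib.request.Request, timeout: float = 10.0
) -> bool:
    """Send the request and report whether the server answered with HTTP 200."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```

A `False` result only says the call did not succeed; check the hostname, port, TLS setting, and token permissions before concluding the token is invalid.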

@@ -0,0 +1,34 @@
source:
  type: dremio
  config:
    # Coordinates
    hostname: localhost
    port: 9047
    tls: true

    # Credentials with personal access token (recommended)
    authentication_method: PAT
    password: pass
    # OR Credentials with basic auth
    # authentication_method: password
    # username: user
    # password: pass

    # For cloud instance
    # is_dremio_cloud: True
    # dremio_cloud_project_id: <project_id>

    include_query_lineage: True

    # Optional
    source_mappings:
      - platform: s3
        source_name: samples

    # Optional
    schema_pattern:
      allow:
        - "<source_name>.<table_name>"

sink:
  # sink configs
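
The `schema_pattern` block filters which names are ingested. As a rough mental model, the sketch below treats each `allow` and `deny` entry as a regular expression matched from the start of the name, case-insensitively, with deny taking precedence. This is an assumption about the matching behavior, not the connector's implementation, and `is_allowed` and `filter_names` are hypothetical helpers.

```python
import re
from typing import Iterable, List


def is_allowed(
    name: str, allow: Iterable[str] = (".*",), deny: Iterable[str] = ()
) -> bool:
    """Return True if name matches an allow pattern and no deny pattern."""
    # Assumption: deny patterns win over allow patterns.
    if any(re.match(pattern, name, re.IGNORECASE) for pattern in deny):
        return False
    return any(re.match(pattern, name, re.IGNORECASE) for pattern in allow)


def filter_names(
    names: Iterable[str], allow: Iterable[str], deny: Iterable[str] = ()
) -> List[str]:
    """Keep only the names that pass the allow/deny check, preserving order."""
    allow_list, deny_list = list(allow), list(deny)
    return [name for name in names if is_allowed(name, allow_list, deny_list)]
```

With `allow=[r"samples\..*"]`, the names `samples.orders` and `samples.customers` pass while `other.orders` is dropped.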