PoC: evaluate alternatives for identifying, in Airflow, that a DAG was created with DAG Factory #210

Open
2 tasks
tatiana opened this issue Aug 1, 2024 · 2 comments

Comments

@tatiana
Collaborator

tatiana commented Aug 1, 2024

Context

At the moment, users have no visibility into which DAGs were created using Python directly and which were created using DAG Factory. The goal of this ticket is to evaluate ways of identifying, in Airflow, that a DAG was created with DAG Factory.

Some options we brainstormed:

  • Emit events during DAG Factory's conversion step, using the Scarf-based telemetry from apache/airflow#39510 ("Add Scarf based telemetry"). This would only work on newer versions of Airflow; otherwise DAG Factory itself would need to use Scarf to emit the events.
  • We could have an Astro-specific solution. If Astro exposed some parameter to access the next layer, that would allow us to push data to the SF metadata DB. DAG Factory code could then push some data (e.g. deployment data and the DAG source, such as "DAG Factory").
  • Associate DAG Factory metadata with either DAG runs or task runs (check whether we could leverage apache/airflow#38650, "Implement Metadata to emit runtime extra", in any way). In this case, we'd need to confirm with the Data team whether this data is already stored in Snowflake, or whether it could be.

This starts as a PoC. The idea is to identify strategies and settle on one that requires the least change and configuration from end users, so that we can collect (at least in Astro) information about which DAGs were created with DAG Factory.
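
As a starting point for the third option, here is a minimal sketch of how DAG Factory might mark the DAGs it builds and derive a small metadata payload from that mark. Everything here is an assumption for illustration: `DagRecord` is a stand-in for the few DAG attributes needed (it is not Airflow's `DAG` class), and `DAG_FACTORY_MARKER`, `mark_as_dag_factory` and `source_metadata` are hypothetical names, not existing DAG Factory or Airflow APIs. How the payload would reach DAG runs, task runs, or Snowflake is left open.

```python
from dataclasses import dataclass, field

# Hypothetical marker tag; not an existing DAG Factory or Airflow constant.
DAG_FACTORY_MARKER = "dagfactory"


@dataclass
class DagRecord:
    """Stand-in for the DAG attributes this sketch needs (not Airflow's DAG class)."""

    dag_id: str
    tags: list[str] = field(default_factory=list)


def mark_as_dag_factory(dag: DagRecord) -> DagRecord:
    """Add the marker tag once, leaving user-defined tags untouched."""
    if DAG_FACTORY_MARKER not in dag.tags:
        dag.tags.append(DAG_FACTORY_MARKER)
    return dag


def source_metadata(dag: DagRecord) -> dict[str, str]:
    """Build the payload that could be attached to a DAG run or task run."""
    source = "dag-factory" if DAG_FACTORY_MARKER in dag.tags else "python"
    return {"dag_id": dag.dag_id, "dag_source": source}
```

Marking the DAG inside DAG Factory itself would need no configuration from end users, which fits the "least change" constraint; whether a tag, a DAG param, or some other attribute is the right carrier is something the PoC would have to compare.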

Acceptance criteria

  • Summary of approaches attempted
  • Working PoC sharing this data in a way we can consume from Astro SF
@tatiana
Collaborator Author

tatiana commented Aug 5, 2024

@cmarteepants to write a detailed issue for us to use Scarf

@cmarteepants
Collaborator

Created #214 for Scarf integration.
