OOM (out-of-memory) errors in DataHub 0.13+ when ingesting BigQuery metadata #11147
Comments
Same behaviour in 0.14.x (tested in 0.14.0.1, 0.14.0.2).
@vejeta could you generate a memory profile? See https://datahubproject.io/docs/metadata-ingestion/docs/dev_guides/profiling_ingestions/
Thanks, @hsheth2, for the suggestion! I will do that with the latest version (0.14.1).
Just out of curiosity, and to see if this is easier: is there a published Docker image with the memray dependency included?
If that doesn't exist, no worries; I will test it with our own Docker image with memray included.
memray is installed when you install the datahub debug sub-package: datahub/metadata-ingestion/setup.py, line 895 in cf1d296.
Check these docs on how to use it: https://datahubproject.io/docs/metadata-ingestion/docs/dev_guides/profiling_ingestions/#how-to-use (opt for the …)
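For anyone following along, here is a minimal sketch of that setup, assuming the `flags.generate_memory_profiles` recipe option described in the linked guide; the output folder, recipe file name, and profile file name below are placeholders, so verify the details against the docs for your CLI version:

```bash
# Install the debug extras, which pull in memray (per the comment above).
pip install 'acryl-datahub[debug]'

# Append the memory-profiling flag to the ingestion recipe.
# The flag name is taken from the profiling guide linked above.
cat >> bigquery_recipe.yml <<'EOF'
flags:
  generate_memory_profiles: "/tmp/datahub-memray"
EOF

# Run the ingestion; memray writes .bin profiles into the folder above.
datahub ingest -c bigquery_recipe.yml

# Render a captured profile as an HTML flamegraph
# (the profile file name is a placeholder).
memray flamegraph /tmp/datahub-memray/ingestion-profile.bin
```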
For starters, if you could run … Also, we have separately been working on a BigQuery lineage/usage v2 implementation (enabled with use_queries_v2).
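For reference, a rough sketch of what opting into the v2 code path might look like in a recipe; the use_queries_v2 flag name is the one mentioned in this thread, while the project, the other source options, and the sink settings are placeholders rather than a tested configuration:

```bash
# Write a minimal BigQuery recipe that opts into the lineage/usage v2 path.
cat > bigquery_recipe.yml <<'EOF'
source:
  type: bigquery
  config:
    project_ids: ["my-gcp-project"]   # placeholder project
    include_table_lineage: true
    include_usage_statistics: true
    use_queries_v2: true              # opt into the v2 lineage/usage extraction
sink:
  type: datahub-rest
  config:
    server: "http://localhost:8080"   # placeholder GMS endpoint
EOF

datahub ingest -c bigquery_recipe.yml
```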
@hsheth2 I see, thanks a lot. This has been generated with 0.13.3. I tried with 0.14.1, but our GMS is not on 0.14.x yet, so there are some incompatibilities when sending the metadata. Hopefully it is enough to locate the bottleneck. The flamegraph is attached:
@vejeta looking at the flamegraph, it looks like the OOM is being caused by our SQL parser. If that's the case, use_queries_v2 probably won't help. We have made a few tweaks to the SQL parser since 0.13.3 which might help, but I'm not too confident about that. If you could run with …
Thanks a lot @hsheth2, here are the last 4000 lines with … I had to "anonymize" the field names; hopefully it can still give a clue about what introduced the memory leak from 0.12.x to 0.13+.
Thanks a lot @hsheth2 for the quick feedback on this issue. I am going to work on migrating from 0.13.x to the latest 0.14.x so that I can provide feedback. Will the tag 0.14.1.3 be published soon?
That's the CLI version, which is published here: https://pypi.org/project/acryl-datahub/0.14.1.3/
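A quick way to pick up that build from PyPI; the virtual environment shown is just an example:

```bash
# Pin the CLI to the published 0.14.1.3 build and confirm the version.
python -m venv venv && source venv/bin/activate
pip install 'acryl-datahub==0.14.1.3'
datahub version
```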
Describe the bug
We have an ingestion job running periodically in Kubernetes; it runs fine with DataHub 0.12.x versions.
You can see that memory usage stays stable under 1 GiB during the execution of the job.
However, with DataHub 0.13.x versions it always fails with error 137 (out of memory).
We have tried increasing the memory limit to 20 GiB, but there must be a memory leak, because it always runs out of memory.
To Reproduce
Steps to reproduce the behavior:
Note: we have also tried a build from the latest commit on master, and the issue is still present.
Expected behavior
No out-of-memory error in DataHub 0.13.3.
Screenshots
Additional context
Logs with a summary after a successful execution with DataHub 0.12.x: