Reading dense array doesn't free memory #150
PyArray_NewFromDescr ignores the NPY_ARRAY_OWNDATA flag, so we need to set it manually so that NumPy frees the memory. Fixes #150
Thanks for the report and the great repro, fix inbound (#151).
Fixes #150 PyArray_NewFromDescr ignores the NPY_ARRAY_OWNDATA flag, so we need to set it manually so that NumPy frees the memory. Also switch memory allocation to use PyDataMem_NEW/FREE. This should generally be the same as PyMem_Malloc, but it could end up different in a situation where the C ext is not linked against the same C stdlib. In that case, NumPy would not call the correct de-allocator when freeing the memory it gains ownership over.
Fixes #150 PyArray_NewFromDescr ignores the NPY_ARRAY_OWNDATA flag, so we need to set it manually so that NumPy frees the memory. Because we now don't (can't) set the flag in this call, simplify construction by using PyArray_SimpleNewFromData. Now that memory is being freed correctly, we must use the NumPy allocator (PyDataMem_NEW/FREE) so that de-allocation is matched.
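The ownership flag the commits above are concerned with is also visible from Python, which makes the failure mode easy to see. A minimal sketch in plain NumPy (not the TileDB-Py extension itself): an array that does not own its buffer will not free that buffer when it is garbage-collected, which is exactly the leak pattern reported in this issue.

```python
import numpy as np

# An array NumPy allocated itself owns its buffer and frees it on deletion.
owned = np.zeros(10)
print(owned.flags.owndata)  # True

# A view (or an array wrapping an externally allocated buffer, as the C
# extension does) does not own its buffer; deleting it does not release
# the underlying memory.
view = owned[2:8]
print(view.flags.owndata)   # False
```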
Please re-open @ihnorton, I experience the same issue with current tiledb and sparse arrays in the range of many GBs.

> conda list | grep -E -i "tiledb|numpy"
numpy      1.19.4   py37h7e9df27_1   conda-forge
tiledb     2.1.3    h17508cd_0       conda-forge
tiledb-py  0.7.3    py37h11a8686_0   conda-forge

The reproducer from @keenangraham again leaks memory, so it can be re-used for testing.
I have several basic tests for releasing array and context memory, and have done additional checking on sparse arrays specifically, but could not reproduce a situation like this where the memory trivially leaks every iteration. So, definitely not ruling it out, but considering it in light of the discussion in #440 right now.
Also here: https://github.com/TileDB-Inc/TileDB-Py/blob/dev/tiledb/tests/test_libtiledb.py#L3178-L3214
@ihnorton In your tests you always use the same context. |
This line creates a new Ctx every iteration. However, the test is only checking that we keep the memory usage under 2x the initial usage (because RSS is not very reliable). |
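For reference, a self-contained sketch of the kind of check described here (not the actual test in the repo, which is linked above): create a fresh Ctx every iteration, read the array, and assert that RSS stays under roughly 2x the initial value. The psutil dependency and the array URI are assumptions for illustration.

```python
import psutil
import tiledb

URI = "my_dense_array"  # hypothetical array URI

proc = psutil.Process()
initial_rss = proc.memory_info().rss

for _ in range(20):
    ctx = tiledb.Ctx()                      # new context every iteration
    with tiledb.DenseArray(URI, mode="r", ctx=ctx) as A:
        _ = A[:]                            # read the full array, discard the result

# RSS is noisy, so only require that usage stays under ~2x the starting point.
assert proc.memory_info().rss < 2 * initial_rss
```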
I think you're right and my test is somehow flawed. |
Do you mind if we close this one and consolidate in #440? I will respond there. |
Hi,
I'm wondering if this is expected behavior or if you have any tips to fix it. On Ubuntu 16, Python 3.7, and tiledb 0.4.1:
Create toy array:
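The original snippet isn't preserved in this copy of the issue; below is a minimal reconstruction of the kind of toy dense array described, written against the current tiledb-py API (the report used 0.4.1, where a Ctx was usually passed explicitly). The URI, sizes, and dtypes are placeholders.

```python
import numpy as np
import tiledb

uri = "toy_dense_array"  # placeholder path

# One-dimensional dense array with a single float64 attribute.
dim = tiledb.Dim(name="d", domain=(0, 9_999_999), tile=100_000, dtype=np.uint64)
schema = tiledb.ArraySchema(
    domain=tiledb.Domain(dim),
    attrs=[tiledb.Attr(name="a", dtype=np.float64)],
    sparse=False,
)
tiledb.DenseArray.create(uri, schema)

# Fill the array with random data (~80 MB of float64).
with tiledb.DenseArray(uri, mode="w") as A:
    A[:] = np.random.rand(10_000_000)
```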
Read from array:
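The read snippet is likewise missing; a sketch of reading the whole array back, which is the step reported to hold onto memory, looks like this (same placeholder names as above):

```python
import tiledb

with tiledb.DenseArray("toy_dense_array", mode="r") as A:
    data = A[:]          # dict-like result keyed by attribute name

arr = data["a"]
print(arr.shape, arr.dtype)
```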
Basically the memory never seems to get released, even when I don't assign `data[:]` to any variable. I've tried playing around with garbage collection (`import gc; gc.collect()`), but it seems Python is not aware of the memory. I have also tried explicitly closing the DenseArray. Eventually I have to restart the Jupyter notebook to get the memory to free.

In my real use case I am iterating over several TileDB arrays, pulling the full array data out of each, doing some transforms, and writing new TileDB arrays with the transformed data. This works okay, except that every read call adds around 2 GB to the used memory and never releases it, so the machine eventually runs out of memory. My current workaround is to spin up a new process for every iteration.
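The per-iteration-process workaround mentioned above can be expressed with `multiprocessing`, since whatever a read holds onto is returned to the OS when the child process exits. A sketch, with the transform step and URIs as placeholders:

```python
import multiprocessing as mp
import tiledb

def process_one(uri):
    # Runs in a child process; any memory the read fails to release is
    # reclaimed when the process exits.
    with tiledb.DenseArray(uri, mode="r") as A:
        data = A[:]
    # ... transform `data` and write the output array here ...

if __name__ == "__main__":
    uris = ["array_1", "array_2", "array_3"]  # placeholder URIs
    for uri in uris:
        p = mp.Process(target=process_one, args=(uri,))
        p.start()
        p.join()
```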
Thanks!