Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallel extract #967

Merged
merged 47 commits into from
Jun 10, 2024
Merged

Parallel extract #967

merged 47 commits into from
Jun 10, 2024

Conversation

jgostick
Copy link
Member

@jgostick jgostick commented Jun 5, 2024

This is a redo #958 from @Arenhart. That PR had quite a few conflicts with the dev branch, which changed quite a lot while they were working, so I pulled that branch from their repo, fixed it locally, and am now pushing it to our repo. This feels like the right way to make modifications to a PR, but I feel a bit bad that the PR will look like it originated from me. I guess the important thing is that all @Arenhart 's work will still be acknowledged in the commit history.

In addition to fixing the conflicts, I also fixed a lot of pep8 sins, and cleaned up the imports throughout.

I also made a few major changes which might break @Arenhart 's code:

  • I changed all the imports to try importing pyedt an fall back on regular edt if not found. This is because pyedt is not yet "pip installable" so we can't really make it a dependency of PoreSpy since the average user would struggle to install a github package.
  • I renamed the main function to regions_to_network_parallel, and re-added regions_to_network as well. The reason I did this is because their function uses a special function in their pyedt which is not installed by default. So, if a person does not have pyedt install they will not be able to use the parallel version, but they can still use the legacy regions_to_network function. Perhaps in future we can work to get pyedt on PyPI then we no longer need to worry about this.

Rafael Arenhart and others added 27 commits August 5, 2022 21:38
# Conflicts:
#	porespy/generators/_cylinder.py
#	porespy/generators/_pseudo_packings.py
#	porespy/networks/__init__.py
#	requirements/conda_requirements.txt
#	requirements/pip_requirements.txt
#	setup.py
#	src/porespy/generators/_imgen.py
#	src/porespy/io/_funcs.py
#	src/porespy/metrics/_meshtools.py
#	src/porespy/metrics/_regionprops.py
#	src/porespy/networks/_getnet.py
#	src/porespy/simulations/_drainage.py
#	src/porespy/simulations/_ibip_gpu.py
#	src/porespy/tools/__init__.py
#	src/porespy/tools/_funcs.py
#	src/porespy/tools/_marching_cubes.py
#	src/porespy/tools/_marching_squares.py
#	src/porespy/tools/marching_cubes_templates.dat
#	test/unit/test_simulations_ibip.py
@Arenhart
Copy link
Contributor

Arenhart commented Jun 5, 2024

Hi, thank you for your work on this PR.
I will test it in the next days.
I also added Pyedt to pipy recently (https://pypi.org/project/pyedt/) I'll add it to the requirements, and check if it is working, if so, I can make a new PR to this PR to implement it.
I will also pull this branch into ours, so we won't be working with different codes.

@jgostick
Copy link
Member Author

jgostick commented Jun 5, 2024

@Arenhart, don't do anything yet. If pyedt is now on PyPI, I will add that to this PR. If it works, then I can also change the regions_to_network_parallel back to just regions_to_network, and rename our original one to regions_to_network_legacy or orig. (First I will get all the tests passing)

@jgostick
Copy link
Member Author

jgostick commented Jun 5, 2024

@Arenhart, I am in the process of adding pyedt as a dependency, and there is one problem: in your regions_to_network function you use pyedt.jit_edt_cpu, but this function does not seem to be part of the package you deployed to pypi. Thoughts?

@jgostick
Copy link
Member Author

jgostick commented Jun 6, 2024

@Arenhart, I have answered my own question...I found the jit_edt_cpu function on the parallel_develop branch. As it stands now the regions_to_network that you wrote cannot run without that. Hopefully you plan to merge this branch sooner than later? Do you have a timeline?

@jgostick
Copy link
Member Author

jgostick commented Jun 6, 2024

Hi @Arenhart, sorry for all the noise, but I have one more question/request. I see that your pyedt function returns the squared distance. This caught me by surprize, and it also means that pyedt is not a drop-in replacement for edt. Would you be at all amenable to changing this behavior? I feel like the euclidean distance has a specific meaning, while your function is actually returning the city block distance, or I think scipy calls this the chamfer distance.

@Arenhart
Copy link
Contributor

Arenhart commented Jun 6, 2024

Hello @jgostick , regarding the squared distance, pyedt is indeed the euclidian distance, and not manhattan nor chamfer distance, it just allows the return of either the strict euclidian distance, or (eclidian distance)^2. This allows faster computation when we only need to compare edt to a certain threshold t, as pyedt.edt(img, sqrt_result=False) <= t**2 is faster than pyedt.edt(img, sqrt_result=True) <= t . However, this is an improper implementation on my part, pyedt should default to sqrt_result=True, as it is the expected behaviour.

This can be verified in the following way:

>>> import numpy as np
>>> import pyedt
>>> import edt
>>> from skimage import morphology
>>> img = np.zeros((30, 30, 30), dtype=np.unit8)
>>> img = np.zeros((30, 30, 30), dtype=np.uint8)
>>> img[:21, :21, :21]=morphology.ball(10)
>>> img[:, 3:12, 5:8] = 1
>>> (pyedt.edt(img, sqrt_result=True) == edt.edt(img)).all()
True
>>> np.isclose(pyedt.edt(img, sqrt_result=False), edt.edt(img)**2).all()
True

As for jit_edt_cpu, I did package the wrong branch.

I'll fix those two issues and repackage pyedt, should be ready this morning, I'll post here when it's done.

@jgostick
Copy link
Member Author

jgostick commented Jun 6, 2024

@Arenhart, these fixes would be appreciated! I actually fixed the squared distance thing like this:

from pyedt import edt as sqedt
def edt(im):
    return np.sqrt(sqedt(im))

So this is not as urgent, however, without the jit_edt_cpu function, this PR will be on hold.

@Arenhart
Copy link
Contributor

Arenhart commented Jun 6, 2024

@jgostick
I just uploaded pyed 0.1.2 to pypi, it now cointains the jit_edt_cpu function, and defaults edt to the non-squared result. Unfortunatelly, this means dt = np.sqrt(edt(im)) must be changed back to dt = edt(im) now

@jgostick
Copy link
Member Author

jgostick commented Jun 6, 2024

No worries @Arenhart, I can fix that pretty fast. Thanks so much for the quick response here! Now I just need to get the tests passing, but it seems that something has changed somewhere in the code.

Copy link

codecov bot commented Jun 7, 2024

Codecov Report

Attention: Patch coverage is 26.25850% with 542 lines in your changes missing coverage. Please review.

Project coverage is 74.6%. Comparing base (4f114d5) to head (f4bd3ce).
Report is 7 commits behind head on dev.

Additional details and impacted files
@@           Coverage Diff           @@
##             dev    #967     +/-   ##
=======================================
- Coverage   82.6%   74.6%   -8.1%     
=======================================
  Files         36      39      +3     
  Lines       4652    5314    +662     
=======================================
+ Hits        3847    3968    +121     
- Misses       805    1346    +541     

@jgostick jgostick merged commit cb741b2 into dev Jun 10, 2024
11 of 15 checks passed
@ma-sadeghi ma-sadeghi deleted the parallel-extract branch August 9, 2024 15:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants