
Adding the Better Tracing Suite into Main #932

Open

wants to merge 743 commits into main from maxgamill-sheffield/800-better-tracing

Conversation

MaxGamill-Sheffield
Collaborator

Closes #800

This aims to split the DNATracing pipeline into smaller, more modular parts (disordered tracing, ordered tracing and splining), each with its own analyses. It also adds topological functions and analyses to ordered tracing, facilitating the processing of catenated molecules as separate objects, and includes a new module to handle and analyse crossings of DNA segments.

MaxGamill-Sheffield and others added 30 commits August 15, 2024 15:59
@SylviaWhittle
Collaborator

Okay, I give up.

If I use a different backend to appease Windows, it fails on the other OSs.

@SylviaWhittle
Collaborator

Reverted commit.

The tests pass on everything except Windows Python 3.9, so can we just ditch that support please?

@@ -888,6 +889,8 @@ def ordered_tracing_image(
Results containing the ordered_trace_data (coordinates), any grain-level metrics to be added to the grains
dataframe, a dataframe of molecule statistics and a dictionary of diagnostic images.
"""
+ topoly_version = importlib.metadata.version("topoly")
+ print(f"Topoly version: {topoly_version}")
Collaborator

Will this be permanent?

If so LOGGER.debug() might be appropriate and we could set up the tests to use a debug level in logging.

I will also address #896 now as it's a small change and means you can SSH into GitHub runners when these fail and find out such information directly.
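If it were kept, a minimal sketch of the LOGGER.debug() variant (the module-level LOGGER below is a stand-in for whatever logger the module already uses) might be:

import importlib.metadata
import logging

LOGGER = logging.getLogger(__name__)  # stand-in for the module's existing LOGGER

# Log at debug level rather than printing, so the version string only
# appears when the tests are run with debug-level logging enabled.
LOGGER.debug("Topoly version: %s", importlib.metadata.version("topoly"))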

Collaborator

Nah, not permanent, just desperate debugging to see what version was being used.

@SylviaWhittle
Collaborator

@ns-rse Do you have any thoughts on this? Windows 3.9 fails due to tk issues; I tried using a different matplotlib backend but that caused Ubuntu to fail, and I'm scared of adding platform-specific code (backends for different platforms).

3.10 fails due to data-type issues, but I need to look at that again as I've forgotten what I checked on Friday.
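For reference, one common way to dodge tk/tcl errors on headless CI runners without platform-specific branches is to force matplotlib's non-interactive Agg backend for the whole test run; a sketch of that general technique (not necessarily what this PR ends up doing):

# conftest.py (sketch): select the non-interactive Agg backend before
# pyplot is first imported, so no tk/tcl display machinery is touched
# on any OS; setting the MPLBACKEND=Agg environment variable in the CI
# workflow achieves the same thing.
import matplotlib

matplotlib.use("Agg")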

@ns-rse
Collaborator

ns-rse commented Oct 14, 2024

> The tests pass on everything except Windows Python 3.9, so can we just ditch that support please?

They also failed on Windows 3.10 with the following...

FAILED tests/tracing/test_disordered_tracing.py::test_trace_image_disordered[catenane] - AssertionError: Attributes of DataFrame.iloc[:, 3] (column name="branch_type") are different

Attribute "dtype" are different
[left]:  int32
[right]: int64
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!
====== 1 failed, 560 passed, 4 skipped, 2 warnings in 215.91s (0:03:35) =======
Error: Process completed with exit code 1.

Could add check_dtype=False to pd.testing.assert_frame_equal(), but this could be a slippery slope. Perhaps a better solution is to downcast the dtype of the columns (I can't see we'll ever need int64 for the branch_type variable).

Thus, after the data frames are concatenated on line 344 of disordered_tracing.py, we could...

disordered_tracing_stats = pd.concat((disordered_tracing_stats, skan_df))
disordered_tracing_stats = disordered_tracing_stats.astype({"branch_type": "int32"})

However, this will also require that the same step is done when reading the CSV using pd.read_csv(), and so the test itself will need a dictionary of the column names and the dtypes they should be read as.
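For instance, a minimal sketch of that read step (the CSV path below is a made-up placeholder):

import pandas as pd

# Read the expected results with explicit dtypes so Windows (int32
# default) and Linux/macOS (int64 default) compare like-for-like.
expected = pd.read_csv(
    "tests/resources/disordered_tracing_stats.csv",  # hypothetical path
    dtype={"branch_type": "int32"},
)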

This creates a bit more work in my view and seems like the sort of situation that pytest-regtest was designed to handle. ⚖️

@SylviaWhittle
Collaborator

> They also failed on Windows 3.10 with the following... [...] Could add check_dtype=False to pd.testing.assert_frame_equal(), but this could be a slippery slope. Perhaps a better solution is to downcast the dtype of the columns (I can't see we'll ever need int64 for the branch_type variable). [...] This creates a bit more work in my view and seems like the sort of situation that pytest-regtest was designed to handle. ⚖️

Yeah, I will fix the type issue by forcing the dtype as the data goes into the dictionary.
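Presumably something along these lines (the dictionary below is a hypothetical illustration, not the actual disordered_tracing.py code):

import numpy as np

branch_type, branch_distance = 2, 15.4  # hypothetical values

# Cast platform-dependent integers to a fixed width as they enter the
# results dictionary, so the resulting dataframe column is int32 on
# every OS rather than int32 on Windows and int64 elsewhere.
stats = {"branch_type": np.int32(branch_type), "branch_distance": float(branch_distance)}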

Do you have any thoughts / ideas about the matplotlib tk / tcl errors that seem to be failing the rest of the tests the majority of the time? I've tried a different backend but this breaks Ubuntu :(

@ns-rse
Collaborator

ns-rse commented Oct 14, 2024

> Do you have any thoughts / ideas about the matplotlib tk / tcl errors that seem to be failing the rest of the tests the majority of the time? I've tried a different backend but this breaks Ubuntu :(

I think I've seen these quite a few times and, from memory, it's when setting up/installing packages that things fail. I typically just have to manually re-run the tests.

Python 3.9 is the minimum version of Python, and for the sake of not having to explain to people how to install and upgrade Python, I think we should support the minimum version.

@MaxGamill-Sheffield
Collaborator Author

> I think I've seen these quite a few times and, from memory, it's when setting up/installing packages that things fail. I typically just have to manually re-run the tests.

@ns-rse, @SylviaWhittle says this happens on every re-run, hence she can't just push through. Is there a way to go into the test machines to manually ensure the package installs work? If not, and as this is the highest priority, is dropping 3.9 support alright?

@ns-rse
Collaborator

ns-rse commented Oct 14, 2024

@MaxGamill-Sheffield The tests are failing occasionally because of tcl/tk issues (something I've observed many times over the last year or two). Of the two currently showing as failed in this Pull Request, both failed on Windows (Python 3.9 and 3.10) because of a dtype issue on reading the CSV file into a Pandas DataFrame and testing equality.

Dropping Python 3.9 support won't address this particular issue as it is a Windows issue.

A quick search led me to this thread, which suggests that, unsurprisingly, the underlying problem encountered here is with NumPy (Pandas DataFrames are essentially NumPy arrays).
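The platform difference is easy to demonstrate: on NumPy < 2.0 the default integer type is the C long, which is 32-bit on Windows and 64-bit on Linux/macOS, so the same code yields different dtypes per platform:

import numpy as np

# No dtype given, so NumPy picks its platform default integer:
arr = np.array([0, 1, 2])
print(arr.dtype)  # int32 on Windows, int64 on Linux/macOS (NumPy < 2.0)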

@ns-rse
Collaborator

ns-rse commented Oct 14, 2024

@MaxGamill-Sheffield P.S. - The way to go into the GitHub runners is via the tmate action I created a Pull Request for and have just merged. But you can always look at the log file and see what has been installed under the Python setup section (needs expanding).

@ns-rse
Collaborator

ns-rse commented Oct 14, 2024

@MaxGamill-Sheffield Commit e59d1bd35059298c8f0794ab4f36d63fc3d9bc0c may have had some unintended consequences as other tests are now failing.

Somehow I see the same errors on PR #949 even though e59d1bd35 was made after I branched; it doesn't show in the log for that branch, as I branched from 0ad6ed3b8 (origin/maxgamill-sheffield/800-better-tracing, Revert "Try to appease the worse operating system... (pyqt5 matplotlib backend)"), where the test suite passed on both GNU/Linux and OSX.

The failures on Windows do seem to be due to dtype errors, at least those that I've had eyes on here in GitHub's Continuous Integration. If there are others that have cropped up, I'd suggest making the changes on a branch and merging them into maxgamill-sheffield/800-better-tracing: this branch is itself closing in on being ready to merge to main, and committing directly to it is probably a bad idea.

SylviaWhittle and others added 2 commits October 14, 2024 21:33
@SylviaWhittle
Collaborator

Weirdly, the macOS runner seems to time out on tmate initialisation...

@ns-rse
Collaborator

ns-rse commented Oct 14, 2024

But tmate only runs if there has been an earlier failure, which there has, as we now see the tests fail with...

Error: test_interpolate_height_profile_images[Skeleton loop 1]

AssertionError: 
Arrays are not almost equal to 12 decimals

Mismatched elements: 104 / 104 (100%)
Max absolute difference: 10.53686996
Max relative difference: 9.22337208e+18
 x: array([ 4.307918269844,  4.863716838544,  5.285009481818,  5.534734846213,
        5.558195073897,  5.392339586946,  5.071938213424,  4.621086001739,
        4.09936482494 ,  3.548273814461,  2.995079890958,  2.504882276258,...
 y: array([4.67065435e-19, 5.27325236e-19, 5.73001876e-19, 6.00077154e-19,
       6.02620717e-19, 5.84638629e-19, 5.49900643e-19, 5.01019148e-19,
       4.44454025e-19, 3.84704618e-19, 3.24727212e-19, 2.71579881e-19,...
tests/measure/test_height_profiles.py::test_interpolate_height_profile_images[Skeleton loop 1] FAILED [ 17%]

...which looks like the order of magnitude change in e59d1bd.

I'll employ git bisect tomorrow to work out what has caused this change and then sort out my #949 in a similar manner.
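For the record, the bisect run might look something like the following, using the known-good commit mentioned above and the test file that now fails (a sketch, not the exact invocation):

git bisect start
git bisect bad HEAD
git bisect good 0ad6ed3b8
git bisect run python -m pytest tests/measure/test_height_profiles.py -x
git bisect reset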

@@ -885,7 +885,7 @@ def _generate_random_skeleton(**extra_kwargs):
"allow_overlap": True,
}
# kwargs.update
heights = {"scale": 100, "sigma": 5.0, "cval": 20.0}
heights = {"scale": 1e2, "sigma": 5.0, "cval": 20.0}
Collaborator

Are these not the same value?

Collaborator Author

Sorry for the confusion I caused, I should have posted here after pushing. But aye, what Sylvia said; it was super weird, as although the values 100 and 1e2 are the same, they have a very different effect when used in scale:
[screenshot comparing the two outputs omitted]

And 1e2 is far from the very small values which seem to give floating-point errors.
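One concrete difference between the two literals that may be relevant: they compare equal but have different Python types, and anything that seeds an array dtype from the value will notice:

import numpy as np

# Numerically equal, but int vs float:
assert 100 == 1e2
assert isinstance(100, int) and isinstance(1e2, float)

print(np.array([100]).dtype)  # platform-dependent integer (int32/int64)
print(np.array([1e2]).dtype)  # float64 everywhere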

@@ -1242,7 +1241,7 @@ def test_prune_by_length(
pytest.param(
"pruning_skeleton",
None,
- 8.0e-19,
+ 7.3,
Collaborator

Why has the value as well as the magnitude changed for this set of parameters?

Collaborator Author

As seen above, some of the values changed slightly, so I modified the test values in order to keep the expected output the same. When doing this I was logging the height values produced by method_values to find one that fits, but for the value of ~8e-19 the largest branch height was ~7.9, so it couldn't be used.

# marks=pytest.mark.skip(),
),
pytest.param(
"pruning_skeleton",
None,
- 7.7e-19,
+ 7.1,
Collaborator

Ditto, value and magnitude have changed?

Collaborator Author

Similar story for this other one that changed.

@@ -1374,7 +1373,7 @@ def test_prune_by_length(
pytest.param(
"pruning_skeleton",
None,
- 1.0e-19,
+ 10000, # can assume any not-None value as the threshold is found via the IQR
Collaborator

Useful comments like this could go in the id="" where they will be clearly seen if the test fails.

Collaborator Author

I thought we'd want to keep the ids short and readable, but if you think it's useful I'll make this change tomorrow.
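For example, the comment on the 10000 parameter above could be folded into the id roughly like this (the id text is made up; the other arguments are copied from the diff):

import pytest

pytest.param(
    "pruning_skeleton",
    None,
    10000,
    id="pruning-skeleton-any-non-None-threshold-as-it-is-found-via-IQR",
)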

base_dir: ./ # Directory in which to search for data files
output_dir: ./output # Directory to output results to
log_level: info # Verbosity of output. Options: warning, error, info, debug
- cores: 2 # Number of CPU cores to utilise for processing multiple files simultaneously.
+ cores: 1 # Number of CPU cores to utilise for processing multiple files simultaneously.
Collaborator

Is there a reason for decreasing the number of cores? It roughly doubles processing time, and most modern systems have at least 4 cores (although 2 is the default so that a) people's computers don't get clobbered and b) GitHub Runners have only 2 cores).

Collaborator Author

Might have been an accident, or intentional so users can see which file throws an error. Who's on the blame? If it's me, this was probs an accident; I think I'd prefer the default to be speedier.

Collaborator

@ns-rse ns-rse left a comment

Why was the value of scale considered to be the cause of the Windows int32 v int64 error with dtype?

The affected variable was branch_type in the data frames that were being compared, so it has nothing to do with height profiles.

These changes have affected the tests/measure/test_height_profiles.py::test_interpolate_height_profile_images because the same skeletons and height profiles are used in those tests.

It is worth running pytest before making pull requests, and the introduction of pytest --testmon should have helped with that, but it may not be set up locally. Also, please branch and PR rather than committing directly to what is a development branch.

@SylviaWhittle
Collaborator

> Why was the value of scale considered to be the cause of the Windows int32 v int64 error with dtype? [...] These changes have affected the tests/measure/test_height_profiles.py::test_interpolate_height_profile_images because the same skeletons and height profiles are used in those tests.

That commit was made to fix a pruning test, flagged on Friday, that fails exclusively on Windows. It appeared to be floating-point arithmetic on values of order ~1e-19 (posting from Windows so I don't have the screenshot to hand). Max found that it could likely be fixed by just increasing the values used in the pruning heights so they are less prone to falling in the floating-point error range, i.e. by setting them to be of order ~1e-1.
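As a small illustration of why magnitudes of ~1e-19 are treacherous in tests (one of several ways they bite): an "almost equal to 12 decimals" check compares absolute differences of ~1e-12, which is enormous relative to such values, so it cannot distinguish them at all:

import numpy as np

# Both arrays are "equal to 12 decimals" despite differing by ~4%,
# because at ~1e-19 an absolute tolerance of ~1e-12 is meaningless.
a = np.array([8.0e-19])
b = np.array([7.7e-19])
np.testing.assert_array_almost_equal(a, b, decimal=12)  # passes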
