Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bounding_box_query does not work on cellpose result stored in spatialdata #166

Open
mamyAndrianteranagna opened this issue Jun 27, 2024 · 10 comments

Comments

@mamyAndrianteranagna
Copy link

Dear scverse Team,

First of all, thank you for your effort on developping integrative frameworks for spatial transcriptomics data.

Now, I'm analyzing MERFISH data using spatialdata. When I use cellpose to segment cells, I am not able to crop my data using the bounding_box_query function. However, this function works fine for data from generated by other segmentation tools such as baysor.

We suspect an incompatibility in the coordinate system but do not know how to check.
Could you please help us to solve the issue?

The code to crop the data:

from spatialdata import bounding_box_query
crop_sdata = bounding_box_query(sdata, min_coordinate=[8000,8000], max_coordinate=[8250,8250], axes=("x","y"), target_coordinate_system="microns",filter_table=True)

This is the error message:


LinAlgError Traceback (most recent call last)
Input In [7], in <cell line: 2>()
1 from spatialdata import bounding_box_query
----> 2 crop_sdata = bounding_box_query(sdata, min_coordinate=[8000,8000], max_coordinate=[8250,8250], axes=("x","y"), target_coordinate_system="microns",filter_table=True)

File /data/kdi_prod/.kdi/project_workspace_0/1928/acl/01.00/scripts/conda_sopa/lib/python3.10/functools.py:889, in singledispatch..wrapper(*args, **kw)
885 if not args:
886 raise TypeError(f'{funcname} requires at least '
887 '1 positional argument')
--> 889 return dispatch(args[0].class)(*args, **kw)

File /data/kdi_prod/.kdi/project_workspace_0/1928/acl/01.00/scripts/conda_sopa/lib/python3.10/site-packages/spatialdata/_core/query/spatial_query.py:454, in _(sdata, axes, min_coordinate, max_coordinate, target_coordinate_system, filter_table)
452 for element_type in ["points", "images", "labels", "shapes"]:
453 elements = getattr(sdata, element_type)
--> 454 queried_elements = _dict_query_dispatcher(
455 elements,
456 bounding_box_query,
457 axes=axes,
458 min_coordinate=min_coordinate,
459 max_coordinate=max_coordinate,
460 target_coordinate_system=target_coordinate_system,
461 )
462 new_elements[element_type] = queried_elements
464 tables = _get_filtered_or_unfiltered_tables(filter_table, new_elements, sdata)

File /data/kdi_prod/.kdi/project_workspace_0/1928/acl/01.00/scripts/conda_sopa/lib/python3.10/site-packages/spatialdata/_core/query/spatial_query.py:398, in _dict_query_dispatcher(elements, query_function, **kwargs)
396 assert isinstance(d, dict)
397 if target_coordinate_system in d:
--> 398 result = query_function(element, **kwargs)
399 if result is not None:
400 # query returns None if it is empty
401 queried_elements[key] = result

File /data/kdi_prod/.kdi/project_workspace_0/1928/acl/01.00/scripts/conda_sopa/lib/python3.10/functools.py:889, in singledispatch..wrapper(*args, **kw)
885 if not args:
886 raise TypeError(f'{funcname} requires at least '
887 '1 positional argument')
--> 889 return dispatch(args[0].class)(*args, **kw)

File /data/kdi_prod/.kdi/project_workspace_0/1928/acl/01.00/scripts/conda_sopa/lib/python3.10/site-packages/spatialdata/_core/query/spatial_query.py:728, in _(polygons, axes, min_coordinate, max_coordinate, target_coordinate_system)
720 _ = BoundingBoxRequest(
721 target_coordinate_system=target_coordinate_system,
722 axes=axes,
723 min_coordinate=min_coordinate,
724 max_coordinate=max_coordinate,
725 )
727 # get the four corners of the bounding box
--> 728 (intrinsic_bounding_box_corners, intrinsic_axes) = _get_bounding_box_corners_in_intrinsic_coordinates(
729 element=polygons,
730 axes=axes,
731 min_coordinate=min_coordinate,
732 max_coordinate=max_coordinate,
733 target_coordinate_system=target_coordinate_system,
734 )
736 bounding_box_non_axes_aligned = Polygon(intrinsic_bounding_box_corners)
737 indices = polygons.geometry.intersects(bounding_box_non_axes_aligned)

File /data/kdi_prod/.kdi/project_workspace_0/1928/acl/01.00/scripts/conda_sopa/lib/python3.10/site-packages/spatialdata/_core/query/spatial_query.py:102, in _get_bounding_box_corners_in_intrinsic_coordinates(element, axes, min_coordinate, max_coordinate, target_coordinate_system)
100 # transform the coordinates to the intrinsic coordinate system
101 intrinsic_axes = get_axes_names(element)
--> 102 transform_to_intrinsic = transform_to_query_space.inverse().to_affine_matrix( # type: ignore[union-attr]
103 input_axes=axes, output_axes=intrinsic_axes
104 )
105 rotation_matrix = transform_to_intrinsic[0:-1, 0:-1]
106 translation = transform_to_intrinsic[0:-1, -1]

File /data/kdi_prod/.kdi/project_workspace_0/1928/acl/01.00/scripts/conda_sopa/lib/python3.10/site-packages/spatialdata/transformations/transformations.py:527, in Affine.inverse(self)
526 def inverse(self) -> BaseTransformation:
--> 527 inv = np.linalg.inv(self.matrix)
528 return Affine(inv, self.output_axes, self.input_axes)

File /data/kdi_prod/.kdi/project_workspace_0/1928/acl/01.00/scripts/conda_sopa/lib/python3.10/site-packages/numpy/linalg/linalg.py:556, in inv(a)
554 a, wrap = _makearray(a)
555 _assert_stacked_2d(a)
--> 556 _assert_stacked_square(a)
557 t, result_t = _commonType(a)
559 signature = 'D->D' if isComplexType(t) else 'd->d'

File /data/kdi_prod/.kdi/project_workspace_0/1928/acl/01.00/scripts/conda_sopa/lib/python3.10/site-packages/numpy/linalg/linalg.py:213, in _assert_stacked_square(*arrays)
211 m, n = a.shape[-2:]
212 if m != n:
--> 213 raise LinAlgError('Last 2 dimensions of the array must be square')

LinAlgError: Last 2 dimensions of the array must be square

The structure of our spatialdata:

SpatialData object with:
├── Images
│ └── 'output_region_0_z3': MultiscaleSpatialImage[cyx] (11, 96363, 78210), (11, 48181, 39105), (11, 24090, 19552), (11, 12045, 9776), (11, 6022, 4888)
├── Points
│ └── 'output_region_0_transcripts': DataFrame with shape: (19797064, 9) (2D points)
├── Shapes
│ ├── 'cellpose_boundaries': GeoDataFrame shape: (62498, 1) (2D shapes)
│ ├── 'output_region_0_polygons': GeoDataFrame shape: (196467, 1) (2D shapes)
│ └── 'sopa_patches': GeoDataFrame shape: (9, 1) (2D shapes)
└── Tables
└── 'table': AnnData (62498, 315)
with coordinate systems:
▸ 'microns', with elements:
output_region_0_z3 (Images), output_region_0_transcripts (Points), cellpose_boundaries (Shapes), output_region_0_polygons (Shapes), sopa_patches (Shapes)

Version of tools:
spatialdata 0.1.2,
cellpose 3.0.10,
cellpose was run on image patches created by the sopa framework version 1.1.0 after creating multiples patches of the image)

Please let me know if you need more informations.

Looking forward for your reply.

Regards,
Mamy

@LucaMarconato
Copy link
Member

LucaMarconato commented Jun 27, 2024

Hi thanks for the reporting the bug and happy that you find our package useful 😊 I haven't seen this bug before.

Could you paste here the content of .attrs in your shapes object? It could may be a problem with a transformation badly specified. Thanks!

If this is not enough to understand the root of the problem, I think here the easiest would be if you could subset the data by dropping all but the shapes that lead to the bug, and also try to subset the rows of the GeoDataFrame to reduce it to a very small dataset. After this if you could write it to Zarr, zip it and send it to me via Zulip: https://scverse.zulipchat.com/#user/480560.

But let's see if the first approach is enough 😊

@mamyAndrianteranagna
Copy link
Author

Hello Luca,

Thanks for you rapid reply.

Here is the .attrs of the shape:

sdata.shapes['cellpose_boundaries'].attrs

{'transform': {'microns': Affine (x, y -> c, x, y)
[0. 0. 0.]
[1.07998399e-01 0.00000000e+00 4.20769397e+03]
[0.00000000e+00 1.07999088e-01 1.62009116e+03]
[0. 0. 1.]}}

Also, I 've tried to delete all other shapes except the cellpose one but I still had the same error:

del sdata.shapes['output_region_0_polygons']
del sdata.shapes['sopa_patches']
sdata

SpatialData object with:
├── Images
│ └── 'output_region_0_z3': MultiscaleSpatialImage[cyx] (11, 96363, 78210), (11, 48181, 39105), (11, 24090, 19552), (11, 12045, 9776), (11, 6022, 4888)
├── Points
│ └── 'output_region_0_transcripts': DataFrame with shape: (19797064, 9) (2D points)
├── Shapes
│ └── 'cellpose_boundaries': GeoDataFrame shape: (62498, 1) (2D shapes)
└── Tables
└── 'table': AnnData (62498, 315)
with coordinate systems:
▸ 'microns', with elements:
output_region_0_z3 (Images), output_region_0_transcripts (Points), cellpose_boundaries (Shapes)

Cropping this object leads exactly to the same error.

Regards,
Mamy

@LucaMarconato
Copy link
Member

Thanks for sharing. I think that if you remove the c channel from the output of the transformation it will work. Was the transformation given with the c channel from cellpose?

@mamyAndrianteranagna
Copy link
Author

Hello Luca,

Can you let us know how we can easily remove the c channel?

We tried to work directly on the transformation matrix of the 'cellpose_boundaries' but we didn't succeed to remove the error.

For example, we changed this

sdata.shapes['cellpose_boundaries'].attrs["transform"]

{'microns': Affine (x, y -> c, x, y)
[0. 0. 0.]
[1.07998399e-01 0.00000000e+00 4.20769397e+03]
[0.00000000e+00 1.07999088e-01 1.62009116e+03]
[0. 0. 1.]}

to this

sdata.shapes['cellpose_boundaries'].attrs["transform"]

{'microns': Affine (x, y -> c, x, y)
[0. 0.]
[ 0. 4207.69396883]
[1.07999088e-01 1.62009116e+03]
[0. 1.]}

but still have the same error.

Thanks for your help!

Best,

@LucaMarconato
Copy link
Member

Solution

Here is how you can remove the c channel and set the transformation inside the object again:

affine_without_c = affine_from_data.to_affine(input_axes=("x", "y"), output_axes=("x", "y"))
set_transformation(sdata[element_name], affine_without_c)

Alternative solution

You could have also constructed a new Affine transformation from a new affine matrix (you can see the code for doing it below).

Saving to disk

Please note that calling set_transformation only changes the transformation metadata in-memory. If you want to also save this to disk you need to call set_transformation with write_to_sdata set to the SpatialData object you are considering.
Alternatively you can call the method write_transformations() of the SpatialData object you are considering (please note that this method is not currently available in the version you can install from pip; we are making a new release this week).

Full example

Full example showing how to reproduce your bug and how to make it disappear (the example doesn't show how to save to disk).

##
import numpy.linalg
import pytest
from spatialdata.transformations import Affine, set_transformation, get_transformation
from spatialdata.datasets import blobs
from spatialdata import bounding_box_query

##
# data preparation
sdata = blobs()
element_name = "blobs_circles"

matrix = [
    [0.00000000e00, 0.00000000e00, 0.00000000e00],
    [1.07998399e-01, 0.00000000e00, 4.20769397e03],
    [0.00000000e00, 1.07999088e-01, 1.62009116e03],
    [0.00000000e00, 0.00000000e00, 1.00000000e00],
]

affine = Affine(matrix, input_axes=("x", "y"), output_axes=("c", "x", "y"))
set_transformation(sdata[element_name], affine)

##
# querying the data fails
with pytest.raises(numpy.linalg.LinAlgError):
    bounding_box_query(
        sdata[element_name],
        axes=("x", "y"),
        min_coordinate=[4000, 1000],
        max_coordinate=[5000, 2000],
        target_coordinate_system="global",
    )

##
# replacing the transformation with a matrix without the 'c' axis
# this is the same as the transformation above, here I am showing how to get it from the data
affine_from_data = get_transformation(sdata[element_name])

affine_without_c = affine_from_data.to_affine(input_axes=("x", "y"), output_axes=("x", "y"))
set_transformation(sdata[element_name], affine_without_c)

##
# querying the data succeeds
bounding_box_query(
    sdata[element_name],
    axes=("x", "y"),
    min_coordinate=[4000, 1000],
    max_coordinate=[5000, 2000],
    target_coordinate_system="global",
)

@mamyAndrianteranagna
Copy link
Author

Thanks Luca for your reply.
I will try it and let you know.

Regards,
Mamy

@mamyAndrianteranagna
Copy link
Author

mamyAndrianteranagna commented Jul 8, 2024

Hello Luca,

Sorry to bring up this issue again.

Following your solution, indeed, the bounding_box_query works fine but the cellpose shapes seem not to be aligned at all to the original image even after removing the c in the transformation.

This is what I've done:

my sdata:

SpatialData object with:
├── Images
│     └── 'output_region_0_z3': MultiscaleSpatialImage[cyx] (11, 96363, 78210), (11, 48181, 39105), (11, 24090, 19552), (11, 12045, 9776), (11, 6022, 4888)
├── Points
│     └── 'output_region_0_transcripts': DataFrame with shape: (19797064, 9) (2D points)
├── Shapes
│     ├── 'baysor_boundaries': GeoDataFrame shape: (35330, 1) (2D shapes)
│     ├── 'cellpose_nuclei': GeoDataFrame shape: (21894, 1) (2D shapes)
│     ├── 'output_region_0_polygons': GeoDataFrame shape: (196467, 1) (2D shapes)
│     └── 'sopa_patches': GeoDataFrame shape: (9, 1) (2D shapes)
└── Tables
      └── 'table': AnnData (21894, 315)
with coordinate systems:
▸ 'microns', with elements:
        output_region_0_z3 (Images), output_region_0_transcripts (Points), baysor_boundaries (Shapes), cellpose_nuclei (Shapes), output_region_0_polygons (Shapes), sopa_patches (Shapes)

The coordinate before transformations:

print(sdata.images['output_region_0_z3'].attrs)
print(sdata.points['output_region_0_transcripts'].attrs)
print(sdata.shapes['cellpose_nuclei_6diam_0minArea'].attrs)
print(sdata.shapes['output_region_0_polygons'].attrs)
print(sdata.shapes['sopa_patches'].attrs)
{}
{'transform': {'microns': Identity }}
{'transform': {'microns': Affine (c, y, x -> c, x, y)
    [1. 0. 0. 0.]
    [0.00000000e+00 0.00000000e+00 1.07998399e-01 4.20769397e+03]
    [0.00000000e+00 1.07999088e-01 0.00000000e+00 1.62009116e+03]
    [0. 0. 0. 1.]}}
{'transform': {'microns': Identity }}
{'transform': {'microns': Identity }}
from spatialdata.transformations import Affine, set_transformation, get_transformation

# Get transformation
affine_from_data = get_transformation(sdata.shapes['cellpose_nuclei_6diam_0minArea'], to_coordinate_system='microns')
affine_from_data
Affine (c, y, x -> c, x, y)
    [1. 0. 0. 0.]
    [0.00000000e+00 0.00000000e+00 1.07998399e-01 4.20769397e+03]
    [0.00000000e+00 1.07999088e-01 0.00000000e+00 1.62009116e+03]
    [0. 0. 0. 1.]
# Create new transformation without c
affine_from_data_without_c = affine_from_data.to_affine(input_axes=("x","y"), output_axes=("x","y"))
affine_from_data_without_c

Affine (x, y -> x, y)
    [1.07998399e-01 0.00000000e+00 4.20769397e+03]
    [0.00000000e+00 1.07999088e-01 1.62009116e+03]
    [0. 0. 1.]

# Set transformation
set_transformation(sdata.shapes['cellpose_nuclei_6diam_0minArea'], affine_from_data_without_c)
sdata.shapes['cellpose_nuclei_6diam_0minArea'].attrs
{'transform': {'microns': Affine (c, y, x -> c, x, y)
      [1. 0. 0. 0.]
      [0.00000000e+00 0.00000000e+00 1.07998399e-01 4.20769397e+03]
      [0.00000000e+00 1.07999088e-01 0.00000000e+00 1.62009116e+03]
      [0. 0. 0. 1.],
  'global': Affine (x, y -> x, y)
      [1.07998399e-01 0.00000000e+00 4.20769397e+03]
      [0.00000000e+00 1.07999088e-01 1.62009116e+03]
      [0. 0. 1.]}}

Here, it creates a new transformation with 'global' name but keep the old one (microns). Then I remove the 'microns' transformation and rename the 'global' to 'microns':

sd.transformations.remove_transformation(sdata.shapes['cellpose_nuclei_6diam_0minArea'], to_coordinate_system = 'microns')

sdata.shapes['cellpose_nuclei_6diam_0minArea'].attrs
{'transform': {'global': Affine (x, y -> x, y)
    [1.07998399e-01 0.00000000e+00 4.20769397e+03]
    [0.00000000e+00 1.07999088e-01 1.62009116e+03]
    [0. 0. 1.]}}
sdata.shapes['cellpose_nuclei_6diam_0minArea'].attrs["transform"]["microns"] = sdata.shapes['cellpose_nuclei_6diam_0minArea'].attrs["transform"]["global"]

sd.transformations.remove_transformation(sdata.shapes['cellpose_nuclei_6diam_0minArea'], to_coordinate_system = 'global')

At the end, I have the same except the c channel removed in the cellpose boundaries shape:

print(sdata.images['output_region_0_z3'].attrs)
print(sdata.points['output_region_0_transcripts'].attrs)
print(sdata.shapes['cellpose_nuclei_6diam_0minArea'].attrs)
print(sdata.shapes['output_region_0_polygons'].attrs)
print(sdata.shapes['sopa_patches'].attrs)
{}
{'transform': {'microns': Identity }}
{'transform': {'microns': Affine (x, y -> x, y)
    [1.07998399e-01 0.00000000e+00 4.20769397e+03]
    [0.00000000e+00 1.07999088e-01 1.62009116e+03]
    [0. 0. 1.]}}
{'transform': {'microns': Identity }}
{'transform': {'microns': Identity }}

Using this object, the cropping works without error but the cellpose shapes plot seems to be not aligned to the other images in the object and looks weird.

sdata.pl.render_shapes('cellpose_nuclei_6diam_0minArea', outline=True, fill_alpha=0, outline_color='r').pl.render_shapes('output_region_0_polygons', outline=True, fill_alpha=0, outline_color='g').pl.show()

polygons_cellpose_and_default

sdata.pl.render_shapes('cellpose_nuclei_6diam_0minArea', outline=True, fill_alpha=0, outline_color='r').pl.show()

polygons_cellpose

Thanks for you help.

Regards,
Mamy

@LucaMarconato
Copy link
Member

Thanks for sharing the details.

First, minor tip, I have formatted your message by enclosing the code in blocks
```python
your code here
```
and the same for the outputs (I have used the triple backticks, without the "python" string in the first line).

Now, back to your problem. I am afraid I can't pinpoint the cause of the problem here not having the data, but I can give some advice.

  1. Where does the transformation matrix come from? I see that you have an element from SOPA in the data. Is that matrix constructed by SOPA? In that case you could be seeing a bug from SOPA and I would report it in the SOPA GitHub repository.
  2. More of an "hack" than a solution, but may unlock you for this particular case. If there is only one matrix available (as it seems from your case), a good guess is that one the following will make the data aligned: either no matrix either the inverse matrix. I would try these two cases (by setting Identity for the nuclei, and by setting my_affine_transformation.inverse()). You can also try to construct an affine matrix with only translation. Quickly looking at your plot, the following matrix may work:
Affine (x, y -> x, y)
      [1 0.00000000e+00 4.20769397e+03]
      [0.00000000e+00 1 1.62009116e+03]
      [0. 0. 1.]}}
  1. I don't get what is happening in the last plot. I would try zooming a bit more, or visualizing the data with napari_spatialdata, which offers an interactive exploration experience.
from napari_spatialdata import Interactive

Interactive(sdata)

I hope the above can help.

@mamyAndrianteranagna
Copy link
Author

Hello Luca,

Thanks for the tips.

  1. I suspect also a problem with sopa because the plot of the shape is really weird, I will post the issue in the SOPA GitHub too. However, element generated by other segmentation tool (baysor) run using SOPA does not have this transformation and this problem.
  2. I will test these solutions and let you know
  3. I tried to use napari-spatialdata but I have also an issue that I will post on the napari-spatialdata GitHub.

Regards,
Mamy

@LucaMarconato
Copy link
Member

Ok perfect thanks for the information, we can follow up in the napari-spatialdata repository, and please when you post the issue in SOPA reference this issue so I can track the conversation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants