Merscope - not loading vpt output #208

Open
MathieuBo opened this issue Sep 18, 2024 · 1 comment

@MathieuBo

Hello!

I just hit two problems while trying to load MERSCOPE data that was processed with the vpt tool and cellpose2. I am reporting them both here, as they are most likely connected.

Issue 1

If I only specify the vpt output folder, line 314 of spatialdata_io/readers/merscope.py raises an IndexError.

data = spatialdata_io.merscope(path=PATH_TO_OUTPUT, vpt_outputs=PATH_TO_VPTOUTPUT)
INFO     The column "global_x" has now been renamed to "x"; the column "x" was  
         already present in the dataframe, and will be dropped.                 
INFO     The column "global_y" has now been renamed to "y"; the column "y" was  
         already present in the dataframe, and will be dropped.                 

/home/mathieubo/miniforge3/envs/bento/lib/python3.11/functools.py:946: UserWarning: The index of the dataframe is not monotonic increasing. It is recommended to sort the data to adjust the order of the index before calling .parse() to avoid possible problems due to unknown divisions
  return method.__get__(obj, cls)(*args, **kwargs)

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[42], line 1
----> 1 ag02_reg1 = merscope(
      2     path = '/mnt/ISS/MERFISH/output/202407081557_Ageing02_VMSC19502/region_1/',
      3     vpt_outputs='/media/mathieubo/filesystem2/ageing02_reg1/')

File ~/miniforge3/envs/bento/lib/python3.11/site-packages/spatialdata_io/readers/merscope.py:214, in merscope(path, vpt_outputs, z_layers, region_name, slide_name, backend, transcripts, cells_boundaries, cells_table, mosaic_images, imread_kwargs, image_models_kwargs)
    212 if cells_boundaries:
    213     if boundaries_path.exists():
--> 214         shapes[f"{dataset_id}_polygons"] = _get_polygons(boundaries_path, transformations)
    215     else:
    216         logger.warning(f"Boundary file {boundaries_path} does not exist. Cell boundaries are not loaded.")

File ~/miniforge3/envs/bento/lib/python3.11/site-packages/spatialdata_io/readers/merscope.py:314, in _get_polygons(boundaries_path, transformations)
    312 geo_df = geo_df.rename_geometry("geometry")
    313 geo_df = geo_df[geo_df[MerscopeKeys.Z_INDEX] == 0]  # Avoid duplicate boundaries on all z-levels
--> 314 geo_df.geometry = geo_df.geometry.map(lambda x: x.geoms[0])  # The MultiPolygons contain only one polygon
    315 geo_df.index = geo_df[MerscopeKeys.METADATA_CELL_KEY].astype(str)
    317 return ShapesModel.parse(geo_df, transformations=transformations)

File ~/miniforge3/envs/bento/lib/python3.11/site-packages/pandas/core/series.py:4700, in Series.map(self, arg, na_action)
   4620 def map(
   4621     self,
   4622     arg: Callable | Mapping | Series,
   4623     na_action: Literal["ignore"] | None = None,
   4624 ) -> Series:
   4625     """
   4626     Map values of Series according to an input mapping or function.
   4627 
   (...)
   4698     dtype: object
   4699     """
-> 4700     new_values = self._map_values(arg, na_action=na_action)
   4701     return self._constructor(new_values, index=self.index, copy=False).__finalize__(
   4702         self, method="map"
   4703     )

File ~/miniforge3/envs/bento/lib/python3.11/site-packages/pandas/core/base.py:919, in IndexOpsMixin._map_values(self, mapper, na_action, convert)
    916 arr = self._values
    918 if isinstance(arr, ExtensionArray):
--> 919     return arr.map(mapper, na_action=na_action)
    921 return algorithms.map_array(arr, mapper, na_action=na_action, convert=convert)

File ~/miniforge3/envs/bento/lib/python3.11/site-packages/pandas/core/arrays/base.py:2322, in ExtensionArray.map(self, mapper, na_action)
   2302 def map(self, mapper, na_action=None):
   2303     """
   2304     Map values using an input mapping or function.
   2305 
   (...)
   2320         a MultiIndex will be returned.
   2321     """
-> 2322     return map_array(self, mapper, na_action=na_action)

File ~/miniforge3/envs/bento/lib/python3.11/site-packages/pandas/core/algorithms.py:1743, in map_array(arr, mapper, na_action, convert)
   1741 values = arr.astype(object, copy=False)
   1742 if na_action is None:
-> 1743     return lib.map_infer(values, mapper, convert=convert)
   1744 else:
   1745     return lib.map_infer_mask(
   1746         values, mapper, mask=isna(values).view(np.uint8), convert=convert
   1747     )

File lib.pyx:2972, in pandas._libs.lib.map_infer()

File ~/miniforge3/envs/bento/lib/python3.11/site-packages/spatialdata_io/readers/merscope.py:314, in _get_polygons.<locals>.<lambda>(x)
    312 geo_df = geo_df.rename_geometry("geometry")
    313 geo_df = geo_df[geo_df[MerscopeKeys.Z_INDEX] == 0]  # Avoid duplicate boundaries on all z-levels
--> 314 geo_df.geometry = geo_df.geometry.map(lambda x: x.geoms[0])  # The MultiPolygons contain only one polygon
    315 geo_df.index = geo_df[MerscopeKeys.METADATA_CELL_KEY].astype(str)
    317 return ShapesModel.parse(geo_df, transformations=transformations)

File ~/miniforge3/envs/bento/lib/python3.11/site-packages/shapely/geometry/base.py:997, in GeometrySequence.__getitem__(self, key)
    995 if isinstance(key, (int, np.integer)):
    996     if key + m < 0 or key >= m:
--> 997         raise IndexError("index out of range")
    998     if key < 0:
    999         i = m + key

IndexError: index out of range
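
One reading of this traceback: the shapely check `key >= m` fails for `key = 0`, which can only happen when `m` (the number of parts) is 0, so at least one MultiPolygon at z-index 0 appears to be empty. The sketch below reproduces that failure with shapely alone and shows one possible guard. The helper name `first_polygon_or_none` is hypothetical and is not part of spatialdata-io; whether empty boundaries should be dropped or kept as missing values is a decision for the reader's maintainers.

```python
from shapely.geometry import MultiPolygon, Polygon


def first_polygon_or_none(geom):
    """Return the first polygon of a MultiPolygon, or None if it has no parts.

    Hypothetical helper for illustration only; not part of spatialdata-io.
    """
    # An empty MultiPolygon has zero entries in .geoms, so .geoms[0] raises
    # IndexError("index out of range"), as in the traceback above.
    if geom is None or geom.is_empty:
        return None
    return geom.geoms[0]


square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
filled = MultiPolygon([square])  # one polygon, the case the reader expects
empty = MultiPolygon()           # no polygons at all

# Reproduce the failure on the empty geometry.
try:
    empty.geoms[0]
    raised = False
except IndexError:
    raised = True

print("empty.geoms[0] raised IndexError:", raised)
print("first polygon recovered:", first_polygon_or_none(filled).equals(square))
print("guarded result for empty geometry:", first_polygon_or_none(empty))
```

Applied to the boundaries table, the same check (`geo_df.geometry.is_empty`) could be used to count how many cells have an empty boundary before deciding how to handle them.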

Issue 2

In the example above, I had to rename my cellpose output file (cellpose2_micron_space.parquet) to remove the "2" from its name, since the data was processed with cellpose2.
When I instead pass a dictionary with all the correct paths, the reader fails before opening the file:

vpt_output_paths = dict()
vpt_output_paths['cell_by_gene'] = '/media/mathieubo/filesystem2/ageing02_reg1/cell_by_gene.csv'
vpt_output_paths['cell_metadata'] = '/media/mathieubo/filesystem2/ageing02_reg1/cell_metadata.csv'
vpt_output_paths['cell_boundaries'] = '/media/mathieubo/filesystem2/ageing02_reg1/cellpose2_micron_space.parquet'

ag02_reg1 = merscope(
    path = '/mnt/ISS/MERFISH/output/202407081557_Ageing02_VMSC19502/region_1/',
    vpt_outputs=vpt_output_paths)
INFO     The column "global_x" has now been renamed to "x"; the column "x" was  
         already present in the dataframe, and will be dropped.                 
INFO     The column "global_y" has now been renamed to "y"; the column "y" was  
         already present in the dataframe, and will be dropped.                 

/home/mathieubo/miniforge3/envs/bento/lib/python3.11/functools.py:946: UserWarning: The index of the dataframe is not monotonic increasing. It is recommended to sort the data to adjust the order of the index before calling .parse() to avoid possible problems due to unknown divisions
  return method.__get__(obj, cls)(*args, **kwargs)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[41], line 7
      4 vpt_output_paths['cell_metadata'] = '/media/mathieubo/filesystem2/ageing02_reg1/cell_metadata.csv'
      5 vpt_output_paths['cell_boundaries'] = '/media/mathieubo/filesystem2/ageing02_reg1/cellpose2_micron_space.parquet'
----> 7 ag02_reg1 = merscope(
      8     path = '/mnt/ISS/MERFISH/output/202407081557_Ageing02_VMSC19502/region_1/',
      9     vpt_outputs=vpt_output_paths)

File ~/miniforge3/envs/bento/lib/python3.11/site-packages/spatialdata_io/readers/merscope.py:213, in merscope(path, vpt_outputs, z_layers, region_name, slide_name, backend, transcripts, cells_boundaries, cells_table, mosaic_images, imread_kwargs, image_models_kwargs)
    210 shapes = {}
    212 if cells_boundaries:
--> 213     if boundaries_path.exists():
    214         shapes[f"{dataset_id}_polygons"] = _get_polygons(boundaries_path, transformations)
    215     else:

AttributeError: 'str' object has no attribute 'exists'
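
The traceback shows that `boundaries_path` is still a plain `str` when `.exists()` is called, which suggests the dictionary values are used as given. A possible workaround, assuming the reader does not convert them itself, is to turn every value into a `pathlib.Path` before passing the dictionary. The helper name `as_paths` is hypothetical and only for illustration.

```python
from pathlib import Path


def as_paths(outputs):
    """Return a copy of the mapping with every value converted to pathlib.Path.

    Hypothetical helper for illustration only; not part of spatialdata-io.
    """
    return {key: Path(value) for key, value in outputs.items()}


vpt_output_paths = {
    "cell_by_gene": "/media/mathieubo/filesystem2/ageing02_reg1/cell_by_gene.csv",
    "cell_metadata": "/media/mathieubo/filesystem2/ageing02_reg1/cell_metadata.csv",
    "cell_boundaries": "/media/mathieubo/filesystem2/ageing02_reg1/cellpose2_micron_space.parquet",
}

# Path objects provide .exists(), which a plain str does not.
vpt_output_paths = as_paths(vpt_output_paths)
print(vpt_output_paths["cell_boundaries"].name)
```

The converted dictionary would then be passed as `vpt_outputs=vpt_output_paths` in the `merscope(...)` call above. This only sidesteps the AttributeError; it does not address the IndexError from Issue 1.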

Any help will be much appreciated! Thank you!!

@LucaMarconato
Member

Hi @MathieuBo, thanks for reporting. Does #207 address both of the issues you reported?
