Merge branch 'main' into 3d_features
friskluft authored Nov 1, 2022
2 parents bab7aa0 + 2ee8f92 commit b4e6628
Showing 9 changed files with 439 additions and 58 deletions.
106 changes: 94 additions & 12 deletions README.md
@@ -87,7 +87,7 @@ Nyxus provides a set of pixel intensity, morphology, texture, intensity distribu
| ROBUST_MEAN_ABSOLUTE_DEVIATION | Robust mean absolute deviation |
| MASS_DISPLACEMENT | ROI mass displacement |
| AREA_PIXELS_COUNT | ROI area in the number of pixels |
| COMPACTNESS | Mean squared distance of the object’s pixels from the centroid divided by the area |
| COMPACTNESS | Mean squared distance of the ROI's pixels from the centroid divided by the area |
| BBOX_YMIN | Y-position and size of the smallest axis-aligned box containing the ROI |
| BBOX_XMIN | X-position and size of the smallest axis-aligned box containing the ROI |
| BBOX_HEIGHT | Height of the smallest axis-aligned box containing the ROI |
@@ -126,8 +126,7 @@ Nyxus provides a set of pixel intensity, morphology, texture, intensity distribu
| 3D_CENTRAL_MOMENT_*pqr* | a set of 3-dimensional central moments of orders p,q,r |
| 3D_NORM_CENTRAL_MOMENT_*pqr* | a set of 3-dimensional normalized central moments of orders p,q,r |


For the complete list of features see [Nyxus provided features](docs/source/featurelist.rst)
For the complete list of features see [Nyxus provided features](docs/featurelist.md)

## Feature groups

@@ -356,14 +355,97 @@ __Example__: we need to process collection of mask images located in directory "
nyxushie ~/data/image-collection1/seg train_.*\\.tif _ch 1 0 ~/results/result1
```

Using the hierarchical ROI Python API is illustrated in the following example:
### Nested features Python API

The nested features functionality is also available in Python through the `Nested` class in `nyxus`. The `Nested` class
provides two methods, `find_relations` and `featurize`.

The `find_relations` method takes a path to the label files, along with a filepattern that matches the files in the parent
channel and a filepattern that matches the files in the child channel. It returns a Pandas DataFrame that maps each
parent ROI to its child ROIs.
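To make the idea of a parent-child mapping concrete, here is a minimal plain-Python sketch. It assumes that a child ROI belongs to a parent when every one of its pixels falls inside that single parent ROI. The helper `find_contained_children` and the nested-list masks are hypothetical and only illustrate the concept; this is not how Nyxus computes the relations internally.

```python
def find_contained_children(parent_mask, child_mask):
    """Pair each child label with the one parent label covering all of its pixels.

    Both masks are equally sized 2D lists of integer labels; 0 is background.
    A child that touches background or more than one parent is skipped.
    """
    parents_seen = {}  # child label -> set of parent labels found under its pixels
    for parent_row, child_row in zip(parent_mask, child_mask):
        for parent_label, child_label in zip(parent_row, child_row):
            if child_label != 0:
                parents_seen.setdefault(child_label, set()).add(parent_label)

    relations = []
    for child_label in sorted(parents_seen):
        covering = parents_seen[child_label]
        # Keep the child only if exactly one non-background parent covers it
        if len(covering) == 1 and 0 not in covering:
            relations.append((next(iter(covering)), child_label))
    return relations


parent_mask = [
    [1, 1, 1, 0, 2, 2],
    [1, 1, 1, 0, 2, 2],
    [1, 1, 1, 0, 0, 0],
]
child_mask = [
    [0, 5, 0, 0, 6, 0],
    [0, 5, 0, 7, 7, 0],
    [0, 0, 0, 0, 0, 8],
]

# Each entry is (parent label, child label)
relations = find_contained_children(parent_mask, child_mask)
print(relations)
```

In this toy input, child 7 straddles the background and parent 2, and child 8 lies entirely on background, so neither is reported.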

The `featurize` method takes in the parent-child mapping along with the features of the ROIs in the child channel. If a list of aggregate functions
is provided to the constructor, this method will return a pivoted DataFrame where the rows are the ROI labels and the columns are grouped by the features.
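The aggregation step can be pictured with a small plain-Python sketch: child feature values are grouped by parent label, and every aggregate is applied to each group. Aggregates are given either by name or as a `(name, callable)` 2-tuple, mirroring the constructor argument described above. The helpers `normalize_aggregates` and `aggregate_children` and the toy data are hypothetical; the sketch does not reproduce the actual `Nested.featurize` implementation.

```python
import math
import statistics


def nanmean(values):
    """Mean of the values that are not NaN; NaN if none remain."""
    kept = [v for v in values if not math.isnan(v)]
    return statistics.fmean(kept) if kept else math.nan


# Aggregates that can be requested by name in this sketch (hypothetical subset)
NAMED_AGGREGATES = {"sum": sum, "mean": statistics.fmean, "min": min, "max": max, "count": len}


def normalize_aggregates(aggregates):
    """Turn name strings and (name, callable) tuples into (name, callable) pairs."""
    pairs = []
    for spec in aggregates:
        if isinstance(spec, tuple):
            name, func = spec
        else:
            name, func = spec, NAMED_AGGREGATES[spec]
        pairs.append((name, func))
    return pairs


def aggregate_children(relations, child_features, aggregates):
    """Group child feature values by parent label and apply every aggregate."""
    grouped = {}  # parent label -> feature name -> list of child values
    for parent, child in relations:
        for feature, value in child_features[child].items():
            grouped.setdefault(parent, {}).setdefault(feature, []).append(value)

    pairs = normalize_aggregates(aggregates)
    return {
        parent: {
            (feature, name): func(values)
            for feature, values in features.items()
            for name, func in pairs
        }
        for parent, features in grouped.items()
    }


relations = [(1, 5), (1, 6), (2, 7)]  # (parent label, child label)
child_features = {5: {"GABOR_0": 0.25}, 6: {"GABOR_0": 0.75}, 7: {"GABOR_0": 0.5}}

table = aggregate_children(relations, child_features, ["sum", "mean", "min", ("nanmean", nanmean)])
print(table[1])
```

The result is keyed by `(feature, aggregate name)`, which corresponds to the two-level column header of the aggregated DataFrame.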


__Example__: Using aggregate functions

``` python

from nyxus import Nyxus, Nested
import numpy as np

int_path = 'path/to/intensity'
seg_path = 'path/to/segmentation'

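# Compute the Gabor features for the ROIs in the child channel (c0)
nyx = Nyxus(['GABOR'])

child_features = nyx.featurize(int_path, seg_path, file_pattern=r'p[0-9]_y[0-9]_r[0-9]_c0\.ome\.tif')

# Aggregate functions are given by name or as (name, callable) tuples
nest = Nested(['sum', 'mean', 'min', ('nanmean', lambda x: np.nanmean(x))])

# Map the parent ROIs (c1) to the child ROIs (c0)
df = nest.find_relations(seg_path, 'p{r}_y{c}_r{z}_c1.ome.tif', 'p{r}_y{c}_r{z}_c0.ome.tif')

# Apply the aggregate functions to the child features
df2 = nest.featurize(df, child_features)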
```

The parent-child map is

``` bash
            Image  Parent_Label  Child_Label
0  /path/to/image            72           65
1  /path/to/image            71           66
2  /path/to/image            70           64
3  /path/to/image            68           61
4  /path/to/image            67           65

```
from nyxus import Nested
nyx = Nested()
segPath = "d:\\data\\mini\\seg"
fPat = '.*'
cnlSig = '_c'
parCnl = '1'
chiCnl = '0'
rels = nyx.findrelations (segPath, fPat, cnlSig, parCnl, chiCnl)

and the aggregated DataFrame is

``` bash
GABOR_0 GABOR_1 GABOR_2 ...
sum mean min nanmean sum mean min nanmean sum mean ...
label ...
1 24.010227 0.666951 0.000000 0.666951 19.096262 0.530452 0.001645 0.530452 17.037345 0.473260 ...
2 13.374170 0.445806 0.087339 0.445806 7.279187 0.242640 0.075000 0.242640 6.390529 0.213018 ...
3 5.941783 0.198059 0.000000 0.198059 3.364149 0.112138 0.000000 0.112138 2.426409 0.080880 ...
4 13.428773 0.559532 0.000000 0.559532 12.021938 0.500914 0.008772 0.500914 9.938915 0.414121 ...
5 6.535722 0.181548 0.000000 0.181548 1.833463 0.050930 0.000000 0.050930 2.083023 0.057862 ...

```

__Example__: Without aggregate functions

``` python

from nyxus import Nyxus, Nested
import numpy as np

int_path = 'path/to/intensity'
seg_path = 'path/to/segmentation'

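# Compute the Gabor features for the ROIs in the child channel (c0)
nyx = Nyxus(['GABOR'])

child_features = nyx.featurize(int_path, seg_path, file_pattern=r'p[0-9]_y[0-9]_r[0-9]_c0\.ome\.tif')

# No aggregate functions are passed to the constructor
nest = Nested()

# Map the parent ROIs (c1) to the child ROIs (c0)
df = nest.find_relations(seg_path, 'p{r}_y{c}_r{z}_c1.ome.tif', 'p{r}_y{c}_r{z}_c0.ome.tif')

df2 = nest.featurize(df, child_features)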
```

The parent-child map remains the same, but the `featurize` result becomes

``` bash
GABOR_0 ...
Child_Label 1 2 3 4 5 6 7 8 9 10 ...
label ...
1 0.666951 NaN NaN NaN NaN NaN NaN NaN NaN NaN ...
2 NaN 0.445806 NaN NaN NaN NaN NaN NaN NaN NaN ...
3 NaN NaN 0.198059 NaN NaN NaN NaN NaN NaN NaN ...
4 NaN NaN NaN 0.559532 NaN NaN NaN NaN NaN NaN ...
5 NaN NaN NaN NaN 0.181548 NaN NaN NaN NaN NaN ...

```
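The shape of the non-aggregated result can be sketched in plain Python: one row per label, one column per `(feature, child label)` pair, and `NaN` wherever a child does not belong to that row. The helper `pivot_children` and the toy data are hypothetical and only illustrate the layout; they are not the Nyxus implementation.

```python
import math


def pivot_children(relations, child_features):
    """Build a row per parent label with one column per (feature, child label)."""
    child_labels = sorted({child for _, child in relations})
    feature_names = sorted({name for child in child_labels for name in child_features[child]})
    columns = [(feature, child) for feature in feature_names for child in child_labels]

    table = {}  # parent label -> {(feature, child label): value or NaN}
    for parent, child in relations:
        # Start every row filled with NaN, then write in the values of its own children
        row = table.setdefault(parent, {column: math.nan for column in columns})
        for feature, value in child_features[child].items():
            row[(feature, child)] = value
    return columns, table


relations = [(1, 5), (1, 6), (2, 7)]  # (parent label, child label)
child_features = {5: {"GABOR_0": 0.25}, 6: {"GABOR_0": 0.75}, 7: {"GABOR_0": 0.5}}

columns, table = pivot_children(relations, child_features)
for parent in sorted(table):
    print(parent, [table[parent][column] for column in columns])
```

Most cells are `NaN` because each child appears under only one parent, which matches the sparse look of the output above.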
150 changes: 150 additions & 0 deletions docs/source/examples.rst
@@ -127,4 +127,154 @@ features = nyx.featurize(
])
```

8. Nested Features Examples
-----------------------------------------------------------------------------

The Nested class in the Nyxus Python API identifies child-parent relations between ROIs in images that have a child channel and a parent channel.
For example, consider the following intensity and segmentation images of the parent channel,

.. list-table::

    * - .. figure:: img/parent_int.png

            Fig 1. Parent channel intensity

      - .. figure:: img/parent_seg.png

            Fig 2. Parent channel segmentation

With the child channel

.. list-table::

    * - .. figure:: img/child_int.png

            Fig 3. Child channel intensity

      - .. figure:: img/child_seg.png

            Fig 4. Child channel segmentation


As the figures show, some ROIs in the child segmentation are completely contained in ROIs of the parent channel.
The purpose of the Nested class is to identify these child ROIs for each ROI of the parent channel. The Nested class can also
apply aggregate functions to the child features, as shown in the example below.

The Nested constructor takes the optional argument `aggregate`. If `aggregate` is not passed, the behavior of the
`featurize` method changes (described later). Any aggregate function supported by Pandas is available,
such as `min`, `max`, `count`, and `mean`. Lambda functions can also be used and are named with a 2-tuple, where the first
element is the name and the second is the lambda function. This allows functions that are not supported by Pandas to be used,
such as NumPy's `np.nanmean`.

Before using the Nested class, call Nyxus to compute the features of all ROIs in the child channel. If the channel is encoded
as a channel number in the filename, a filepattern can be used to select only the child channel. Consider a directory with the images

.. code-block:: bash

    p0_y1_r1_c0.ome.tif
    p0_y1_r1_c1.ome.tif
    p0_y1_r2_c0.ome.tif
    p0_y1_r2_c1.ome.tif
    p0_y1_r3_c0.ome.tif
    p0_y1_r3_c1.ome.tif
    ...

where the child channel is designated by `c0` and the parent channel by `c1`. We can filter down to only the child channel using the
`filepattern <https://filepattern.readthedocs.io/en/latest/>`_ `p{r}_y{c}_r{z}_c0.ome.tif` or the equivalent regex `p[0-9]_y[0-9]_r[0-9]_c0\.ome\.tif`.
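As a quick check of what the regex selects, the following sketch applies it to a few filenames with Python's standard `re` module. It assumes that the whole filename must match the pattern; how Nyxus itself applies `file_pattern` is not shown here, and the filename list is made up for illustration.

```python
import re

# Regex from the text above; a raw string keeps the backslashes intact
CHILD_PATTERN = re.compile(r"p[0-9]_y[0-9]_r[0-9]_c0\.ome\.tif")

filenames = [
    "p0_y1_r1_c0.ome.tif",
    "p0_y1_r1_c1.ome.tif",
    "p0_y1_r2_c0.ome.tif",
    "p0_y1_r2_c1.ome.tif",
]

# Keep only the files whose full name matches the child-channel pattern
child_files = [name for name in filenames if CHILD_PATTERN.fullmatch(name)]
print(child_files)
```

Note that `[0-9]` matches exactly one digit, so under this assumption a name such as `p10_y1_r1_c0.ome.tif` would not be selected.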


Next, we calculate the features for the child channel. For simplicity, we only use the Gabor features, but any or all features can be used.

.. code-block:: python

    from nyxus import Nyxus, Nested
    import numpy as np

    int_path = 'path/to/intensity'
    seg_path = 'path/to/segmentation'

    nyx = Nyxus(['GABOR'])
    child_features = nyx.featurize(int_path, seg_path, file_pattern=r'p[0-9]_y[0-9]_r[0-9]_c0\.ome\.tif')
    print(child_features.head())

The result of this code is

.. code-block:: bash

                mask_image      intensity_image  label   GABOR_0   GABOR_1   GABOR_2   GABOR_3   GABOR_4   GABOR_5   GABOR_6
    0  p0_y1_r1_c0.ome.tif  p0_y1_r1_c0.ome.tif      1  0.224206  0.172619  0.166667  0.730159  0.773810  0.767857  0.753968
    1  p0_y1_r1_c0.ome.tif  p0_y1_r1_c0.ome.tif      2  1.000000  0.610000  0.540000  0.980000  0.990000  0.990000  0.970000
    2  p0_y1_r1_c0.ome.tif  p0_y1_r1_c0.ome.tif      3  0.429864  0.217195  0.122172  0.877828  0.941176  0.936652  0.909502
    3  p0_y1_r1_c0.ome.tif  p0_y1_r1_c0.ome.tif      4  0.846154  0.948718  0.717949  1.000000  1.000000  1.000000  1.000000
    4  p0_y1_r1_c0.ome.tif  p0_y1_r1_c0.ome.tif      5  0.277778  0.021368  0.029915  0.794872  0.841880  0.841880  0.824786

Next, the `find_relations` method is used to find the child-parent relations. This method takes the segmentation path along with
filepatterns that distinguish the child channel from the parent channel.

.. code-block:: python

    nest = Nested(['sum', 'mean', 'min', ('nanmean', lambda x: np.nanmean(x))])
    df = nest.find_relations(seg_path, 'p{r}_y{c}_r{z}_c1.ome.tif', 'p{r}_y{c}_r{z}_c0.ome.tif')
    print(df.head())

The result is

.. code-block:: bash

                Image  Parent_Label  Child_Label
    0  /path/to/image          72.0         65.0
    1  /path/to/image          71.0         66.0
    2  /path/to/image          70.0         64.0
    3  /path/to/image          68.0         61.0
    4  /path/to/image          67.0         65.0

The `featurize` method can then be used along with the child features to apply the aggregate functions. The `featurize` method
takes the `child_features` DataFrame generated by Nyxus, which contains the feature calculations for each ROI, along with the DataFrame
containing the parent-child relations from the `find_relations` method. The output of this method is a DataFrame containing
the aggregated features.

.. code-block:: python

    df = nest.featurize(df, child_features)
    print(df.head())

The result is

.. code-block:: bash

    GABOR_0 GABOR_1 GABOR_2 ... GABOR_4 GABOR_5 GABOR_6
    sum mean min nanmean sum mean min nanmean sum mean ... min nanmean sum mean min nanmean sum mean min nanmean
    label ...
    1 24.010227 0.666951 0.000000 0.666951 19.096262 0.530452 0.001645 0.530452 17.037345 0.473260 ... 0.773810 0.897924 32.060053 0.890557 0.767857 0.890557 31.643434 0.878984 0.753968 0.878984
    2 13.374170 0.445806 0.087339 0.445806 7.279187 0.242640 0.075000 0.242640 6.390529 0.213018 ... 0.735000 0.885494 26.414860 0.880495 0.727500 0.880495 25.886468 0.862882 0.700000 0.862882
    3 5.941783 0.198059 0.000000 0.198059 3.364149 0.112138 0.000000 0.112138 2.426409 0.080880 ... 0.858462 0.900500 26.836040 0.894535 0.858462 0.894535 26.172914 0.872430 0.829231 0.872430
    4 13.428773 0.559532 0.000000 0.559532 12.021938 0.500914 0.008772 0.500914 9.938915 0.414121 ... 0.820175 0.945459 22.572913 0.940538 0.802632 0.940538 22.270382 0.927933 0.787281 0.927933
    5 6.535722 0.181548 0.000000 0.181548 1.833463 0.050930 0.000000 0.050930 2.083023 0.057862 ... 0.697917 0.819318 29.094328 0.808176 0.693452 0.808176 28.427727 0.789659 0.675595 0.789659

The other way to use the Nested class is to not pass any aggregate functions to the constructor. In this case, the `featurize` method will create a
pivot table where the rows are the ROI labels and the columns are grouped by the features.

.. code-block:: python

    nest = Nested()
    df = nest.find_relations(seg_path, 'p{r}_y{c}_r{z}_c1.ome.tif', 'p{r}_y{c}_r{z}_c0.ome.tif')
    df = nest.featurize(df, child_features)
    print(df.head())

The result is

.. code-block:: bash

    GABOR_0 ... GABOR_6
    Child_Label 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 ... 55.0 56.0 58.0 59.0 60.0 61.0 62.0 64.0 65.0 66.0
    label ...
    1 0.666951 NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    2 NaN 0.445806 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    3 NaN NaN 0.198059 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    4 NaN NaN NaN 0.559532 NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    5 NaN NaN NaN NaN 0.181548 NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Binary file added docs/source/img/child_int.png
Binary file added docs/source/img/child_seg.png
Binary file added docs/source/img/parent_int.png
Binary file added docs/source/img/parent_seg.png
98 changes: 98 additions & 0 deletions src/nyx/python/nested_roi_py.cpp
@@ -659,5 +659,103 @@ bool mine_segment_relations (

}

return true; // success
}


/// @brief Finds related (nested) segments and sets global variables 'pyHeader', 'pyStrData', and 'pyNumData' consumed by Python binding function findrelations_imp()
bool mine_segment_relations (
    bool output2python,
    const std::string& label_dir,
    const std::string& parent_file_pattern,
    const std::string& child_file_pattern,
    const std::string& outdir,
    const ChildFeatureAggregation& aggr,
    int verbosity_level)
{

    std::vector<std::string> parentFiles;
    readDirectoryFiles(label_dir, parent_file_pattern, parentFiles);

    std::vector<std::string> childFiles;
    readDirectoryFiles(label_dir, child_file_pattern, childFiles);

    // Check if the dataset is meaningful
    if (parentFiles.size() == 0)
    {
        throw std::runtime_error("No parent files to process");
    }

    if (childFiles.size() == 0)
    {
        throw std::runtime_error("No child files to process");
    }

    if (childFiles.size() != parentFiles.size())
    {
        throw std::runtime_error("Parent and child channels must have the same number of files");
    }

    // Prepare the buffers.
    // 'totalNumLabels', 'stringColBuf', and 'calcResultBuf' will be updated with every call of output_roi_relational_table()
    theResultsCache.clear();

    // Prepare the header
    theResultsCache.add_to_header({ "Image", "Parent_Label", "Child_Label" });

    // Mine parent-child relations; the i-th parent file is paired with the i-th child file
    for (size_t i = 0; i < parentFiles.size(); ++i)
    {
        auto parFname = parentFiles[i];
        auto chiFname = childFiles[i];

        // Diagnostic
        //if (verbosity_level >= 1)
        //    std::cout << stem << "\t" << parent_channel << ":" << child_channel << "\n";

        // Clear reference tables
        uniqueLabels1.clear();
        uniqueLabels2.clear();
        roiData1.clear();
        roiData2.clear();

        // Analyze geometric relationships and recognize the hierarchy
        std::vector<int> P; // parents
        bool ok = find_hierarchy(P, parFname, chiFname, verbosity_level);
        if (!ok)
        {
            std::stringstream ss;
            ss << "Error finding hierarchy based on files " << parFname << " as parent and " << chiFname << " as children";
            throw std::runtime_error(ss.str());
        }

        // Output the relational table to object 'theResultsCache'
        if (output2python)
        {
            ok = output_roi_relational_table_2_rescache (P, theResultsCache);
            if (!ok)
                throw std::runtime_error("Error creating relational table of segments");
        }
        else
        {
            ok = output_roi_relational_table_2_csv (P, outdir);
            if (!ok)
                throw std::runtime_error("Error creating relational table of segments");
        }

        // Aggregate features
        if (output2python)
        {
            // Aggregation is implemented externally to this function
        }
        else
        {
            ok = aggregate_features (P, outdir, aggr);
            if (!ok)
                throw std::runtime_error ("Error aggregating features");
        }

    }

    return true; // success
}