Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Add tol parameter to events_from_annotations #12324

Merged
merged 10 commits into from
Feb 8, 2024

Conversation

rcmdnk
Copy link
Contributor

@rcmdnk rcmdnk commented Dec 25, 2023

Reference issue

Fixes #12321.

What does this implement/fix?

This change adds a tol (tolerance) parameter to annotations.events_from_annotations.

In the function, the following equation is affected by rounding errors:

good_events = annot_offset - _onsets >= chunk_duration

Even if annot_offset - _onsets[x] equals chunk_duration through manual calculation, annot_offset - _onsets[x] could be slightly smaller than chunk_duration.

To address this issue, the tol parameter is introduced:

good_events = annot_offset - _onsets >= chunk_duration - tol

Additional information

It may be advisable to use a non-zero default value for tol. In this context, using a parameter fraction like epsilon may be better.
The tol value should be calculated as:

tol = (1. / raw.info['sfreq']) * epsiron

And a default value of epsilon=1e-5 seems appropriate.

However, this default value alters the behavior, so it should be properly communicated to users.

@agramfort
Copy link
Member

the issue you report in #12321 suggests it's a silent bug that can happen quiet easily. I fear that with just the docstring you have here, users will unlikely do the right thing. Do you see a way to have a better default? Do you see a case where using tol=1e-15 can lead to some unexpected behavior?

do you think there is bug in this tutorial https://mne.tools/stable/auto_tutorials/clinical/60_sleep.html ?

@rcmdnk
Copy link
Contributor Author

rcmdnk commented Dec 28, 2023

@agramfort
Thanks for the comment.

The example case needs at least tol of 1e-11:

>>> onset=32760.12
>>> duration=30.0
>>> chunk_duration=30.0
>>> onset + duration - onset - chunk_duration
-3.637978807091713e-12

I think something like 1e-8 is enough large for the tolerance
and also enough small comparing to the interval of data (for frequency up to O(100Hz) at most).

I updated the default value of tol = 1e-8.

For the tutorial, Both meas_date of Raw (then it is the orig_time of the annotation) and all onset of Annotations are not decimals and it is not affected by this issue.

@rcmdnk rcmdnk changed the title [ENH] Add tol parameter to events_from_annotations [BUG] Add tol parameter to events_from_annotations Dec 28, 2023
@rcmdnk rcmdnk changed the title [BUG] Add tol parameter to events_from_annotations [FIX] Add tol parameter to events_from_annotations Dec 28, 2023
@agramfort
Copy link
Member

Ok I fully get the issue now. Note that this reminds me of:

>>> np.arange(7.8, 8.4, 0.05)
array([7.8 , 7.85, 7.9 , 7.95, 8.  , 8.05, 8.1 , 8.15, 8.2 , 8.25, 8.3 ,
       8.35, 8.4 ])

basically when you have a float step size you have rounding errors. Here it's created cause there is a loss of accuracy with the meas_date but it feels it can happen also without.

what makes me less supportive of this change is that if you use the public API ie raw.set_annotations(annot) then the issue does not seem to happen.

with raw.info._unlock(check_after=True):
raw.info["meas_date"] = meas_date
annot = Annotations([32730.12, 32760.12, 32790.12], 30.0, ["0", "1", "2"], 0)
raw._annotations = annot
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raw.set_annotations(annot)

is the recommended way. If you do this then the issue does not occur on my end.

@agramfort
Copy link
Member

@rcmdnk sorry my last comment about using raw.set_annotations had not been sent.

@rcmdnk rcmdnk force-pushed the fix_events_from_annotations branch 3 times, most recently from b9cd0c1 to 1780981 Compare January 15, 2024 11:20
@rcmdnk
Copy link
Contributor Author

rcmdnk commented Jan 15, 2024

@agramfort,

Apologies for the delayed response.

To clarify, the previous test would not have encountered any loss if set_annotations were used, because the orig_time of the annotation is modified within this function. The reason I opted not to use set_annotations is due to my choice of 32730.12 as the onset value. Starting from 0 with this onset value would require a significantly large dataset (on the order of 10,000,000).

However, if the meas_date of the raw data and the orig_time of the annotations are identical, using set_annotations won't alter the onsets, and the original issue persists. Furthermore, if the duration includes a decimal value, the same problem can arise.

I've simplified the test to demonstrate that using set_annotations with a tol value can effectively address the issue as anticipated.

Copy link
Member

@agramfort agramfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok it seems legit !

a couple of remaining nitpicks.

sorry for the slow reaction time ...

@@ -0,0 +1 @@
Add ``tol`` parameter to :meth:`mne.events_from_annotations`, by `Michiru Kaneda`_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you be more explicit for this sentence? it's written as an enhancement and not really a bug fix. Something like

Add tol parameter to :meth:mne.events_from_annotations so that ... when using chunk_duration != None, by Michiru Kaneda_

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is an update:
7dd7375

The tolerance which is used to check if the calculated onsets have the
enough duration to the end of the annotation. If the duration from the
onset to the end of the annotation is smaller than ``chunk_duration``
minus ``tol``, the onset will be discarded.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you clarify that this tol parameter is only useful when chunk_duration != None?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed:
8b146a2

@larsoner
Copy link
Member

larsoner commented Feb 6, 2024

@rcmdnk sorry for the hassle but it looks like there is a conflict, could you git fetch upstream && git merge upstream main and fix the conflicts (or rebase if you prefer)?

@rcmdnk rcmdnk force-pushed the fix_events_from_annotations branch from 8b146a2 to 727fe92 Compare February 7, 2024 01:16
@rcmdnk
Copy link
Contributor Author

rcmdnk commented Feb 7, 2024

@larsoner
ok, the branch is rebased on main now.

Copy link
Member

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just 2 suggestions. +1 for merge after those are addressed

Comment on lines 1577 to 1581
The tolerance which is used to check if the calculated onsets have the
enough duration to the end of the annotation when``chunk_duration`` is
not ``None``. If the duration from the onset to the end of the
annotation is smaller than ``chunk_duration`` minus ``tol``, the onset
will be discarded.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor wording change; I think this makes the purpose clearer (but please confirm that I've understood the purpose correctly!)

Suggested change
The tolerance which is used to check if the calculated onsets have the
enough duration to the end of the annotation when``chunk_duration`` is
not ``None``. If the duration from the onset to the end of the
annotation is smaller than ``chunk_duration`` minus ``tol``, the onset
will be discarded.
The tolerance used to check if a chunk fits within an annotation
when ``chunk_duration`` is not ``None``. If the duration from a
computed chunk onset to the end of the annotation is smaller than
``chunk_duration`` minus ``tol``, the onset will be discarded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this change is fine with me.

@@ -819,6 +819,57 @@ def test_events_from_annot_onset_alingment():
assert raw.first_samp == event_latencies[0, 0]


def test_events_from_annot_with_tolerance():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test is ripe for a @pytest.mark.parametrize decorator. Do you know how to do that? If not I can push the change... something like

@pytest.mark.parametrize(
    "use_rounding,tol,shape,idx",
    (
        pytest.param(True, 0, (2, 3), [202, 402], id="rounding-notol"),
        pytest.param(True, 1e-8, (3, 3), [202, 302, 402], id="rounding-tol"),
        pytest.param(False, 0, (3, 3), [202, 302, 402], id="norounding-notol"),
        pytest.param(False, 1e-8, (3, 3), [202, 302, 401], id="norounding-tol"),
    )
)
def test_events_from_annot_with_tolerance(use_rounding, tol, shape, idx):
    """."""
    # ...setup code here
    events, _ = events_from_annotations(
        raw,
        event_id={"0": 0, "1": 1, "2": 2},
        chunk_duration=1,
        use_rounding=use_rounding,
        tol=tol,
    )
    assert events.shape == shape
    assert (events[:, 0] == idx).all()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you for the suggestion; it's a smarter approach. The test has been updated to use parametrize.

update tests events_from_annotations with parameterize
@drammock drammock merged commit 6857f10 into mne-tools:main Feb 8, 2024
30 checks passed
@drammock
Copy link
Member

drammock commented Feb 8, 2024

thanks @rcmdnk!

@rcmdnk rcmdnk deleted the fix_events_from_annotations branch February 14, 2024 11:46
snwnde pushed a commit to snwnde/mne-python that referenced this pull request Mar 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add tolerance parameter to events_from_annotations to
4 participants