How to deal with forecast rates ≤ 0 in any LL-based test? #226
Comments
Shouldn’t subzero rates throw an error to indicate unphysical numbers? A zero-rate cell with an earthquake should, in my view, generate a negative infinity in the rate-based tests, at least so the evaluators are aware. They may then decide on a strategy if the experiment design didn’t already.
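For context, a minimal numpy illustration (not pyCSEP code) of why a single target event in a zero-rate bin produces exactly that negative infinity, announced only by a low-level RuntimeWarning rather than an error:

```python
import numpy as np

# The joint Poisson log-likelihood contains a term n_i * log(lambda_i) for
# every bin i (the log(n_i!) term is omitted here for brevity). One observed
# event in a zero-rate bin drives the whole sum to -inf, and numpy merely
# emits "RuntimeWarning: divide by zero encountered in log".
rates = np.array([1e-3, 2e-2, 0.0])   # last bin forecasts a rate of zero...
counts = np.array([0, 1, 1])          # ...but contains a target event

log_like = np.sum(-rates + counts * np.log(rates))
print(log_like)  # -inf
# Note: a zero-rate bin with zero observed events is even worse behaved,
# because 0 * log(0) evaluates to NaN rather than 0.
```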
Currently no error is thrown when reading or processing unreasonable rates. Case 2 in #225 illustrates what happens in the T-test.
Good idea, Max, to use …
Granted, it's a rare case, but it does seem to happen with grid-based forecasts. Catalog-based forecasts with a limited number of simulations and no dedicated background simulation may be even more prone. But since target events should rarely occur in ≤ 0 rate bins, I believe that an …
Nice catch. I think the best approach is to provide a forecast format checker (e.g., prohibiting negative mean rate values) and, additionally, to raise a warning in case the forecast has zero-rate bins. However, the runtime warnings that currently appear are low-level numpy warnings, and although they are ugly, I don't think they should be silenced or bypassed. I also agree that -inf should be passed through as the given result. Personally, I wouldn't provide options (much less defaults) to "fix" the forecasts. Although it can be obvious that a forecast with negative rates needs fixing, it is not obvious for zero-rate forecasts. The choice of method would have an impact on the expected results and should perhaps be addressed by the modelers in their workflow as a modeling decision, not a testing one.
I agree with Pablo here. Grid-based forecasts should not contain zero-rate cells, so we should have a mechanism for checking this condition. If we want to use the forecasts as-is, we can implement some type of masking process in the forecasts to simply ignore these cells and carry on with the tests as normal.
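A minimal sketch of what such a masking approach could look like with numpy masked arrays (illustration only, not the existing pyCSEP implementation; it also shows why masking lets a zero-rate bin with a target event drop out of the score):

```python
import numpy as np
import numpy.ma as ma

rates = np.array([1e-3, 2e-2, 0.0, -1e-5])  # two bins with rates <= 0
counts = np.array([0, 1, 1, 0])             # ...one of them holds a target event

# Mask out the offending bins, then accumulate log-likelihood terms only
# over the valid cells. The target event in the zero-rate bin no longer
# contributes at all, which is exactly the "cheating" loophole noted above.
valid = ma.masked_where(rates <= 0, rates)
masked_ll = ma.sum(-valid + counts * ma.log(valid))
print(masked_ll)  # finite, as if the zero-rate bin had never been observed
```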
So let me summarize what we probably agree on to implement:
Things are a bit different for the Brier score (see #232 for feedback from Francesco on this issue), which is not affected by rates = 0 (not sure about negative rates). So, additionally, we could do the following inside the format checker (still to be agreed on):
(I can keep this item list and its check boxes updated based on further discussions and/or progress 👀)
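As a side note on why the Brier score tolerates zero rates while any LL-based score does not: assuming, for illustration, a per-cell probability p = 1 - exp(-rate) and a binary outcome o (the exact formulation discussed in #232 may differ), the quadratic penalty stays finite at p = 0, whereas the log term diverges:

```python
import numpy as np

rate, outcome = 0.0, 1              # zero-rate cell that nevertheless hosts an event
p = 1.0 - np.exp(-rate)             # p = 0 under this illustrative formulation
brier_term = (p - outcome) ** 2     # 1.0 -> finite (just the worst possible penalty)
log_term = outcome * np.log(p)      # -inf, with a RuntimeWarning
print(brier_term, log_term)
```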
Thank you, Marcus, for the summary! I agree with 1, 2 and 4. Some of these things I have already done in floatcsep, and they should be moved to pycsep. Will update soon about this.
Just wanted to mention, given our discussion last meeting, that having a checker for negative rates or zero rates is not trivial because of performance. We could …
Any ideas are welcome before we start testing those two ideas.
Since we only want to be informative, without modifying the forecast data at all, I believe option (1) would be sufficient: simply implement a format checker that gets called after loading the forecasts (a check on every T-test execution would indeed be overkill). I'd place the format checker in a new method ….
Likewise, we should add a format checker in …. About … in my previous comment: this is not a performance issue, as it checks only one value, not an array.
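For illustration, a standalone checker along these lines could be called once after a gridded forecast is loaded (the function name and exact messages are assumptions, not existing pyCSEP API):

```python
import warnings
import numpy as np

def check_forecast_rates(rates: np.ndarray) -> None:
    """Hypothetical one-off format check for a gridded forecast's rate array.

    Runs once after loading, so it adds no cost to the individual tests:
    negative rates are treated as an error, zero rates only as a warning.
    """
    rates = np.asarray(rates)
    if np.any(rates < 0):
        raise ValueError("Forecast contains negative rates.")
    n_zero = int(np.count_nonzero(rates == 0))
    if n_zero:
        warnings.warn(f"Forecast contains {n_zero} zero-rate bins; "
                      "LL-based scores can evaluate to -inf.")
```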
I agree with @mherrmann3 here. I think we should have a checker that gets called after loading the forecasts, along with a flag that can disable this option in case someone doesn't want to use it. An associated warning could let them know in the output that the results are unchecked. As far as the catalog forecasts go, a checker will be a lot more heavyweight, because you will have to create a gridded forecast from the catalogs in order to do the checking. This should probably be called in …
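Sketched below is how the catalog-forecast check might piggyback on that gridded aggregation; the method name get_expected_rates() and the .data attribute are assumptions about a CatalogForecast-like object, not confirmed pyCSEP calls:

```python
import numpy as np

def check_catalog_forecast(catalog_forecast) -> None:
    """Sketch only: object and method names are assumptions, not confirmed pyCSEP API."""
    # Aggregating the synthetic catalogs onto the space-magnitude grid is the
    # expensive step; that is why this check is heavier than for gridded
    # forecasts and should run only once, right after loading.
    gridded = catalog_forecast.get_expected_rates()   # assumed aggregation step
    rates = np.asarray(gridded.data)
    if np.any(rates < 0):
        raise ValueError("Aggregated catalog forecast contains negative rates.")
```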
It's not a surprise that zero (or even negative) rates cause issues in LL scores, but I believe pyCSEP does not generally check for, deal with, or warn about them in LL-based tests if forecasts contain them (an example: Akinci's HAZGRIDX in the 5-yr Italy experiment).
This affects every grid-based and catalog-based test except the N-test.
I noticed one exception: rates ≤ 0 are silently omitted in binomial_evaluations using masked arrays (numpy.ma.masked_where). But this is not an optimal treatment, because it gives a model the opportunity to cheat and game the system: in areas where it is unsure or expects low seismicity, it simply forecasts 0; if a target event should occur in such bins, it won't count in the testing. Apart from that, excluding rates ≤ 0 could trigger a corner case in the T-test when all target event rates are ≤ 0 (see case 3 in #225).
So the better approach is to replace forecast rates ≤ 0 (in a reproducible way, not randomly) with something small, preferably the minimum among all forecast rates or similar (e.g., an order of magnitude below the minimum).
It would be convenient to write a separate function for this treatment (a sketch follows below) and use it at the top of
* .core.poisson_evaluations._poisson_likelihood_test(),
* .core.poisson_evaluations._t_test_ndarray(),
* .utils.calc._compute_likelihood(), and
* .core.catalog_evaluations.magnitude_test().
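A minimal sketch of such a helper, assuming the substitute value is derived from the smallest positive rate in the forecast (the function name, the default factor, and the call sites are illustrative, not pyCSEP API):

```python
import numpy as np

def clip_nonpositive_rates(rates: np.ndarray, factor: float = 0.1) -> np.ndarray:
    """Replace rates <= 0 with a small, reproducible substitute value.

    The substitute is `factor` times the smallest positive rate (factor=0.1
    puts it one order of magnitude below the minimum, factor=1.0 uses the
    minimum itself). Raises if the forecast has no positive rates at all.
    """
    rates = np.asarray(rates, dtype=float)
    positive = rates[rates > 0]
    if positive.size == 0:
        raise ValueError("Forecast contains no positive rates.")
    return np.where(rates <= 0, factor * positive.min(), rates)

# Example: after the treatment, a target event in a formerly zero-rate bin
# yields a finite (if heavily penalized) joint Poisson log-likelihood.
rates = clip_nonpositive_rates(np.array([1e-3, 2e-2, 0.0]))
counts = np.array([0, 1, 1])
print(np.sum(-rates + counts * np.log(rates)))  # finite
```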