Testing approach is needed to better support core engine development #180

Open
michaeltryby opened this issue May 14, 2018 · 2 comments

@michaeltryby commented May 14, 2018

The current regression testing framework is useful for development that doesn't alter simulation results. From time to time, however, it is necessary to change the core SWMM engine to add a new feature, improve computational stability, or make the model more physically realistic. These changes currently cause our regression testing framework to fail. Even a change to the output report format intended to make it more useful and informative causes test failures.

We need to provide testing tools that allow an engine developer to evaluate fundamental changes to the engine that result in a "benchmark shift." This could involve using different criteria for comparing models. Our current testing criterion basically checks whether the results are the same within a specified tolerance. @LRossman has suggested some additional criteria we could consider (a rough sketch of how such checks might be automated follows the list):

  1. Compare overall flow and mass continuity errors.
  2. Compare selected variables in the Summary Results tables (after SWMM sorts them from high to low).
  3. Compare time series plots of total system inflow, outflow, flooding and storage.
  4. Compare time series plots of inflow to the system’s outfalls.
  5. Compare the links with the Highest Flow Instability Indexes.
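
Purely as an illustration, here is a minimal sketch of how criteria 1 and 2 might be automated, assuming the continuity errors and summary-table values have already been parsed out of the benchmark and new report files. The function names, tolerances, and example numbers are placeholders, not part of the current testing framework.

```python
# Sketch of a benchmark-shift check; parsing of the report files is assumed
# to have happened elsewhere, and all tolerances are placeholder values.
import numpy as np

def continuity_within_tolerance(benchmark_err, new_err, abs_tol=0.5):
    """Pass if the new run's continuity error (in percent) has not grown
    by more than abs_tol percentage points relative to the benchmark."""
    return new_err <= benchmark_err + abs_tol

def summary_table_close(benchmark_vals, new_vals, rel_tol=0.05):
    """Compare selected Summary Results variables after sorting each set
    from high to low, so the comparison is rank-based rather than
    element-by-element (criterion 2 above)."""
    b = np.sort(np.asarray(benchmark_vals, dtype=float))[::-1]
    n = np.sort(np.asarray(new_vals, dtype=float))[::-1]
    if b.shape != n.shape:
        return False
    denom = np.maximum(np.abs(b), 1e-6)   # guard against divide-by-zero
    return bool(np.all(np.abs(n - b) / denom <= rel_tol))

# Example usage with made-up numbers:
assert continuity_within_tolerance(benchmark_err=0.12, new_err=0.34)
assert summary_table_close([10.2, 7.5, 3.1], [10.0, 7.6, 3.0])
```

The rank-based sort mirrors the suggestion to compare the summary tables after SWMM sorts them from high to low, so small re-orderings of similar elements don't trigger spurious failures.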

Further thoughts contributed by Lew, paraphrased here: when a change is made to a core SWMM engine procedure, the expectation should not be that the new results exactly equal the old ones. Comparison against a benchmark is at best a useful surrogate for the quality of model results, since for these kinds of problems there is no "theoretically correct" solution to compare against. Instead, we should make sure that any differences in results introduced in the course of development are "reasonably small." A new method should be evaluated to ensure that it produces a solution that is clearly more physically meaningful than the old one. The method's implementation should also be evaluated to ensure that continuity errors and model stability are improved.

The objectives when evaluating a benchmark shift are related to, but different from, those of day-to-day regression testing. These new objectives need to be reflected in testing tools that better support core engine development.

@dickinsonre

Great post @michaeltryby. The system graphs in particular show so much about the functioning of the network. A graphical and/or statistical comparison of the inflow, outflow, flooding, runoff, DWF, GW, RDII and other major components would easily reveal important changes. I am not sure about the instability indices revealing much.
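
As a sketch of the kind of statistical time-series comparison suggested here, assuming the system series (total inflow, outflow, flooding, storage, runoff, DWF, GW, RDII, ...) have already been extracted from each run into equal-length arrays; the Nash–Sutcliffe metric and acceptance threshold are placeholder choices, not anything agreed in this thread.

```python
# Statistical comparison of a system time series between two runs.
# Extraction of the series from the binary output is assumed elsewhere.
import numpy as np

def nash_sutcliffe(benchmark, new):
    """Nash–Sutcliffe efficiency of the new series relative to the
    benchmark: 1.0 means identical, values well below 1.0 mean the
    new run has drifted substantially from the old one."""
    b = np.asarray(benchmark, dtype=float)
    n = np.asarray(new, dtype=float)
    return 1.0 - np.sum((n - b) ** 2) / np.sum((b - b.mean()) ** 2)

def series_acceptable(benchmark, new, min_nse=0.99):
    return nash_sutcliffe(benchmark, new) >= min_nse

# Example with synthetic data: a small, smooth perturbation still passes.
t = np.linspace(0, 24, 97)                 # 15-minute steps over one day
inflow_old = 10 + 5 * np.sin(t / 4)
inflow_new = inflow_old * 1.002            # 0.2% shift
print(series_acceptable(inflow_old, inflow_new))   # True
```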

@samhatchett

I think the discussion over on OpenWaterAnalytics/EPANET#169 is completely relevant here. We're in the realm where theoretical correctness and statistical validity become conditions for passing tests.
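
Purely to illustrate that idea: one way a statistical criterion could gate a test is a two-sample Kolmogorov–Smirnov comparison of an output distribution. The choice of variable (node depths), sample size, and significance level below are assumptions, not something specified here or in EPANET#169.

```python
# Illustrative only: a distribution-level pass/fail check.
import numpy as np
from scipy.stats import ks_2samp

def distributions_indistinguishable(benchmark_sample, new_sample, alpha=0.01):
    """Fail only if the two samples are statistically distinguishable at
    the chosen significance level."""
    stat, p_value = ks_2samp(benchmark_sample, new_sample)
    return p_value > alpha

rng = np.random.default_rng(0)
old_depths = rng.gamma(shape=2.0, scale=0.5, size=500)
new_depths = old_depths + rng.normal(0.0, 0.01, size=500)   # tiny perturbation
print(distributions_indistinguishable(old_depths, new_depths))  # likely True
```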
