Define some metric by which we can state a probability that two datasets come from the same underlying model #73
Comments
@arm61 I'm not sure how you would use the Z-test in this case. Maybe check that the mean of the normalized residuals is zero? This would be less sensitive than χ² to large deviations, and symmetric deviations (such as you would get by shifting Kiessig fringes) may even leave the mean unchanged. Given that a shift in the fringes will lead to a bimodal distribution in the residuals when it is large enough, I tried to improve the sensitivity of the χ² test by also testing that the residuals follow a standard normal. Using a shifted sine wave with the Anderson-Darling test it does indeed reduce false positives, but it also increases false negatives when there is no shift. In this simple example the χ² test alone is a better choice. A normality test may also be affected by any non-normality in our measurement uncertainty, so I would not recommend using it.
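The χ²-plus-normality check described above can be sketched roughly as follows. This is a minimal illustration on synthetic residuals, not the actual implementation; note that SciPy's `anderson` estimates the mean and standard deviation from the sample rather than fixing them at 0 and 1, which is one reason its behaviour may differ from a strict standard-normal test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical normalized residuals, (data - model) / uncertainty.
# Under the null hypothesis these are standard normal draws.
residuals = rng.standard_normal(100)

# Chi-squared test: sum of squared standard-normal residuals follows
# a chi-squared distribution with n degrees of freedom.
chi2_stat = np.sum(residuals**2)
p_chi2 = stats.chi2.sf(chi2_stat, df=residuals.size)

# Anderson-Darling normality test on the same residuals (SciPy fits
# the location and scale, so this checks normality, not N(0, 1) exactly).
ad_result = stats.anderson(residuals, dist='norm')

print(f"chi2 = {chi2_stat:.1f}, p = {p_chi2:.3f}")
print(f"A-D statistic = {ad_result.statistic:.3f}")
```

A fringe shift large enough to make the residuals bimodal would inflate the Anderson-Darling statistic even when the χ² value stays moderate, which is the extra sensitivity being tested for above.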
I was referring to http://homework.uoregon.edu/pub/class/es202/ztest.html, which has the same equation as you provide above.
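For concreteness, the Z-test variant mentioned earlier (testing whether the mean of the normalized residuals is zero) can be sketched like this; the residuals here are synthetic and the variable names are illustrative, not from any existing code:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical normalized residuals; standard normal under the null.
residuals = rng.standard_normal(200)

# If each residual is N(0, 1), the sample mean has standard error
# 1/sqrt(n), so the Z statistic is mean * sqrt(n).
z = residuals.mean() * np.sqrt(residuals.size)
p_value = 2 * stats.norm.sf(abs(z))  # two-sided

print(f"z = {z:.3f}, p = {p_value:.3f}")
```

As noted above, a symmetric perturbation (e.g. shifted fringes) can leave this mean near zero, so the test would pass even when the residuals are visibly non-normal.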
This may not strictly be part of the Analysis remit, but I know it would certainly be useful for it, so I add the issue here. If we have two datasets, both with error bars on every point (let's simplify our lives and, at least initially, assume the data points have the same x-values and that the error bars are solely in y), then we wish to know the probability that the datasets come from the same underlying distribution. That is to say, if both were measured such that the error bars on all data points became infinitesimal, the data points would all lie in the same place.
As far as I understand, this is a solved/trivial problem for a single dataset and a known distribution, but I am unaware of an approach that can deal with both datasets having error bars and the underlying distribution being unknown.
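One way the two-dataset case might be reduced to the single-dataset one, under the stated simplifications (shared x-values, errors only in y, and Gaussian, independent uncertainties), is to normalize the pointwise differences by the combined uncertainty and apply a χ² test to those. This is only a sketch of that idea, not a proposed final answer; all names and the synthetic data are mine:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 50
x = np.linspace(0, 4 * np.pi, n)
truth = np.sin(x)

# Two synthetic measurements of the same underlying curve,
# with known (here, constant) Gaussian uncertainties.
err1, err2 = 0.10, 0.15
y1 = truth + rng.normal(0.0, err1, n)
y2 = truth + rng.normal(0.0, err2, n)

# Pointwise differences normalized by the combined uncertainty;
# if both datasets share the same underlying model, these are
# standard normal, so their sum of squares is chi-squared with n dof.
r = (y1 - y2) / np.sqrt(err1**2 + err2**2)
chi2_stat = np.sum(r**2)
p_same = stats.chi2.sf(chi2_stat, df=n)

print(f"chi2 = {chi2_stat:.1f}, p = {p_same:.3f}")
```

The resulting p-value is the probability of seeing differences at least this large if the two datasets do come from the same model, which seems close to the quantity the issue asks for; it does, however, lean on the Gaussian-error assumption discussed above.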
I can provide almost infinite datasets to test any theory through HOGBEN.
Thoughts on this problem and any potential solutions are most welcome.