Restrict MCMC to sets with at least one observed element #3
base: master
It contains two commits of Pull request #3.
I committed the two fixes from the pull request. The remaining change is a great addition (if I remember correctly, Ontologizer also does something similar). But I'm wondering whether it is possible to implement the removal of the sets in an upper layer (e.g., in the R code) and make this available as an option. It is a change in the (documented) model and thus it should be switchable, which also helps documentation. Having this in the C code is also okay, but it should be switchable by the caller.
The "disabled" sets still contribute to the posterior probability via The flag to enable/disable this optimization would be easy to implement.
Yes, the results would be different, hence I think it would be useful to be able to switch this on and off. I missed that you still account for them in the score. Still, I think that if you don't allow these sets to become active, you impose an additional constraint on the model (you impose another prior over the active terms, hence the marginalized posteriors are different), in particular for fixed alpha/beta/p. The previous model allowed unrelated terms to be switched on, which usually made the marginalized probability lower (but more spread across all terms, not only terms that contain observations) to account for the ambiguity. Excluding them from being chosen (this is my understanding of your change) will have an impact on this. I haven't really looked into this for a while, so maybe I'm missing something and I could of course be wrong. I agree, however, that users probably don't expect terms without annotations to come up. However, if this happens, it could also be an indication that the experiment was not specific enough. So I would opt for a switchable option, together with an explanation for users.

Regarding weighted sets in general, it really would be great to have such a feature. It has been on my TODO list for a long time.
Hyperparameters were not updated anymore after half the burn-in stage
a11172f to da13705
Consists of 3 commits. Commit information:

Commit id: 032e785
Merge branch 'alyst-fixes'
It contains two commits of Pull request #3.
Committed by: Sebastian Bauer
Author Name: Sebastian Bauer
Commit date: 2014-10-25 13:58:37 +0200
Author date: 2014-10-25 13:58:37 +0200

Commit id: 4287f56
use log1p() where possible
log1p(-x) should be more accurate and faster than log(1-x)
Committed by: Alexey Stukalov
Author Name: Alexey Stukalov
Commit date: 2014-10-24 19:06:07 +0200
Author date: 2014-10-23 15:28:46 +0200

Commit id: ba756a9
fix parameter sampling
Committed by: Alexey Stukalov
Author Name: Alexey Stukalov
Commit date: 2014-10-23 02:09:01 +0200
Author date: 2014-10-23 02:05:37 +0200

git-svn-id: https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/mgsa@96027 bc3139a8-67e5-0310-9ffc-ced21a209358
The PR restricts MCMC sampling of the active/inactive set flag to those sets that have at least one observed element; sets without observed elements are fixed in the inactive state.
There are 2 advantages:
p had to be constrained to small values to avoid degenerate solutions (p > 0.5 and many sets without observed elements are active); by disabling a priori unrelated sets we exclude the degenerate solutions, and the constraints on p could be relaxed
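As a rough sketch of the restriction described above, combined with the caller-controlled switch requested in the conversation, the sampler could build a list of candidate sets once and then propose flag toggles only from that list. The function name, its parameters, and the `restrict_to_observed` flag are assumptions made for illustration; they are not the package's actual API.

```c
#include <stddef.h>

/* Hypothetical helper (not from the package sources).
 *
 * observed_counts[i]   number of observed elements in set i
 * n_sets               total number of sets
 * restrict_to_observed nonzero: only sets with at least one observed
 *                      element may be toggled; zero: every set may be
 *                      toggled (the original, documented model)
 * candidates           output buffer with room for n_sets indices
 *
 * Returns the number of indices written to candidates. */
size_t collect_candidate_sets(const size_t *observed_counts, size_t n_sets,
                              int restrict_to_observed, size_t *candidates)
{
    size_t n_candidates = 0;

    for (size_t i = 0; i < n_sets; i++) {
        /* A set is eligible if the restriction is off
         * or the set contains at least one observed element. */
        if (!restrict_to_observed || observed_counts[i] > 0)
            candidates[n_candidates++] = i;
    }
    return n_candidates;
}
```

Sets left out of the candidate list would simply stay inactive for the whole run. Keeping the switch as an argument lets the caller (for example the R layer) choose between the restricted and the original behaviour.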