need to design a tree store for large studies/trees that are integrated into OpenTree via scripts #111
Comments
Just to offer one vague suggestion: We could have a tiny NexSON stub for the study that lives in phylesystem and has the reference info, other associated data, and (in some cases) perhaps the OTU mapping. The stub would then just link to newicks that are stored elsewhere. We'd need to work on a syntax for augmenting the raw newick with extra info (e.g. the ingroup and perhaps the interpretation of branch lengths), but that shouldn't be too hard.
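To make the stub idea concrete, here is a minimal sketch of what such a record might hold. Everything in it is an assumption for illustration: the key names (`study_id`, `newick_url`, `ingroup`, `branch_length_mode`, `otu_mapping`), the helper `make_study_stub`, the example URL, and the example IDs are hypothetical and are not part of NexSON or phylesystem.

```python
import json


def make_study_stub(study_id, reference, tree_links, otu_mapping=None):
    """Build a small stub for a study whose trees are stored elsewhere.

    All key names are hypothetical placeholders, not real NexSON properties.
    """
    stub = {
        "study_id": study_id,
        "reference": reference,
        # Each entry points at an externally stored newick and carries the
        # extra info that a raw newick string cannot express on its own.
        "trees": [
            {
                "newick_url": link["url"],
                "ingroup": link.get("ingroup"),
                "branch_length_mode": link.get("branch_length_mode"),
            }
            for link in tree_links
        ],
    }
    # The OTU mapping is optional: only some stubs would carry it.
    if otu_mapping is not None:
        stub["otu_mapping"] = otu_mapping
    return stub


stub = make_study_stub(
    study_id="example_1",
    reference="Placeholder citation",
    tree_links=[
        {
            "url": "https://example.org/trees/example_1.tre",
            "ingroup": "node42",
            "branch_length_mode": "time",
        }
    ],
    otu_mapping={"otu1": 12345},
)
print(json.dumps(stub, indent=2, sort_keys=True))
```

Keeping the per-tree annotations in the stub, rather than inside the newick string, is only one option; it would leave the stored newicks untouched while the small stub stays under version control.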
+1 to the NexSON stub idea, especially for the published large trees. And maybe a fully separate repo for the automatically generated trees that haven't been peer reviewed. I'll think on this!
S3 storage was $0.03 / GB / month last I checked. The command line tool for
Perhaps of interest: https://help.github.com/articles/about-git-large-file-storage/
A use case from @bomeara: OpenTreeOfLife/opentree#788
@josephwb has a supertree study with >1000 trees.
@snacktavish will soon have automatically generated gene trees.
Brian O'Meara has some very large trees for which he has/wants to have the OTU mapping done with scripts.
These don't really fit nicely into the design of phylesystem as data store for manually curated trees (most of which are presumably intended for synthesis).
Huge studies will make us hit our git repo limit sooner (we can use shards then, but it is still a bit of an annoyance). Plus they won't load in the curator app (they'll require too much RAM).
Automatically curated trees are a generated product that probably should not be versioned.
So basically we need a data store that: