-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
we need a few special case collection edit methods #175
Comments
Oh and the "few" in the title refers to the fact that we might want a "remove this study+tree" from the default list. and perhaps a "move this study+tree pair to a different collection". I think that the add operation is the only one that is likely to be needed to assure smooth demos/workshops. removing or moving are probably rare enough, that I doubt we'll see lots of simultaneous usage. |
[Paraphrased from the original issue:] There might be simpler solutions using git's own configuration options. I'm looking into clean/smudge filters, and whether we can configure them on GitHub. (We might be able to minimize these bugs by setting this up on the api server.) I'm also taking a quick look at |
OK, after some exploration with git configuration, I'm almost ready to give up. @mtholder, I know you're familiar with git internals, so maybe you'll know whether this idea is a dead end. I've worked out how to configure a custom diff-tool and merge-tool for a local repo, which in principle allows us to work in a JSON-aware way (parsing the JSON documents to resolve differences, instead of using git's dumb line-wise algorithm). But these only work "downstream" of the initial commit logic, once a merge conflict has been detected. I can't seem to change the linewise logic used during a commit, which is important in our most common use case: two different curators each add a tree, which git interprets as a trivial series of changes, instead of two distinct trees. Thoughts? |
Here's another thought on how to handle this requirement (easily adding a tree to synthesis): We keep exactly what we have, including the When it's time to run synthesis, we simply create a temporary collection (by scanning the phylesystem data for preferred trees) and append it to the other synthesis inputs. To me, this makes sense for a few reasons:
|
I like this last idea from @jimallman : it gets us what we want (an easy way for people to add trees to synthesis) with very little overhead. As part of synthesis, we can scan phylesystem and check curation status for proposed trees and add those to a collection in one operation. |
I don't have a problem with this (@jimallman's suggestion to use We may face a bit of a social hurdle because (up until now) using the But that is a short term issue. Presumably we can fix any scaling issues, and untoggle the |
Longer term, I think we may want to add a validation step before someone can propose a tree for synthesis. Short term, we will almost certainly need a validation step before propinquity puts the proposed trees into synthesis. |
Don't we have the same problem if the UI adds each tree to a collection for synthesis? The current behavior is easy to maintain, just by treating these trees as candidates and expecting a curator to assign them to a proper synth collection. |
We do have the same problem if the UI just adds trees to a grab bag. The only difference is that the grab-bag (opentreeoflife/default collection) has just 4 trees rather than hundreds of trees. As I say, I don't object to using I agree that the synth should just filter trees that appear multiple times (retaining only the first). Currently it would not matter (for the output tree) if the same tree shows up more than one time. |
Ah, because of the legacy data. Gotcha. |
OpenTreeOfLife/germinator#82 discusses some errors that cropped up in the context of multiple curators trying to add a study to the default synth collection. I originally thought that these were indicative a race condition, but that was an incorrect diagnosis.
The phylesystem-api for studies assumed that it would be pretty rare for multiple curators to be working on the same study at the same time, so we could tolerate the second curator's work being stranded on a work-in-progress branch until some manual fixes were performed on the server.
But the default synth collection (https://github.com/OpenTreeOfLife/collections-1/blob/master/collections-by-owner/opentreeoflife/default.json) is a single resource that multipe curators may be interested in. Particularly at workshops, when we are demonstrating to a group how to add a tree (and several people are playing along on their own laptops).
We also observed that users frequently want to just add a tree to the queue, and don't want to worry about the ranking.
That argues for an "add this tree" button. And the cleanest way to support that (and minimize chance of a change being stuck on a work in progress branch) seems like it would be a service in the API that:
There would be no chance for the addition to end up on a work-in-progress branch in this workflow.
The text was updated successfully, but these errors were encountered: