Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protect against accidental creation of duplicates #126

Open
rybesh opened this issue Mar 16, 2017 · 8 comments
Open

Protect against accidental creation of duplicates #126

rybesh opened this issue Mar 16, 2017 · 8 comments

Comments

@rybesh
Copy link
Member

rybesh commented Mar 16, 2017

Currently the following not-unreasonable workflow can result in accidentally creating duplicates in PeriodO:

  1. I create a new local database, and sync a period collection from the canonical dataset.
  2. I add some new periods to this collection in my local database and submit a patch.
  3. The patch gets merged. Great!
  4. Now I add some more new periods to the collection in my local database, thinking that it is "in sync" when actually it still has skolem IDs for the merged new periods.
  5. I submit a new patch, not realizing that it is a superset of my earlier, already merged patch.

Possible solution: give backends a GUID, store it with submitted patches, and before allowing patch submission from a backend, query for merged patches that resulted in new URIs, and update the skolem IDs (this assumes we are keeping that mapping around when we deskolemize).

@atomrab
Copy link

atomrab commented Mar 27, 2017

An additional problem here is that a local IDB does not display in the client the URIs of definitions that have been merged into the Canonical dataset -- so if you want to delete your skolem-ID definitions in order to update your local database with the newly merged definitions via sync (which will otherwise create duplicates in the local db), or if you find yourself with duplicates, you can't tell which ones have IDs and which don't from the edit interface. This means once you've introduced duplicates into your local IDB by syncing without purging skolem versions, you can only get the new, unsynced definitions out by editing the json file (there's no way to do it in the client without risking the deletion of merged definitions).

@rybesh
Copy link
Member Author

rybesh commented Sep 26, 2019

Need to test whether this is still an issue; I'm guessing it is.

@ptgolden
Copy link
Member

There's a function that does the ID replacing: https://github.com/periodo/periodo-client/blob/master/modules/periodo-app/src/linked-data/utils/skolem_ids.js

The question is, as you point out, Ryan, how can the client know that a new period has been accepted and now has a stable URI? I think at one point, I just periodically ran something in the background that automatically updated things, but that was a weird solution.

One thing we could do that might solve of the problems above (not submitting duplicate patches, and pulling down permanent URIs for accepted periods) is to add a little bit to the "Import changes" page. Maybe above the authority section, there could be a small section that says something like "24 of your local periods have been accepted into this dataset. Would you like to update their URLs?"

I can't imagine someone would just want to update a few of their periods' skolem IDs and leave the rest intact, right? This way, they wouldn't have to go through and click every period they want to update.

Another thing we could do is add a new change type, in addition to {add,edit,delete}{period,authority}. There could be a specific change type called "assign permanent identifier", and we could allow selecting all of certain kind of change. ...Although that would be hard, because we don't really have a way of only selecting certain changes to periods/authorities.

I vote for the "Accept permanent identifiers" alert on the import changes page.

@ptgolden
Copy link
Member

The GUID suggestion would work to fix the duplicate submission issue, but it wouldn't deal with what Adam is talking about without having to get tricky about periodically checking for patches' merge statuses. I would prefer to make it explicit instead.

@ptgolden
Copy link
Member

One hangup with the above that might require some stuff to be added to the server- there's not currently a way to see all the skolem ID replacements in the dataset without iterating over all the merged patches. To get around that, we could

  1. Have the client keep track of every patch it has submitted, and request them when syncing. (Merged patches could be cached, of course).
  2. Add a resource to the server that gives a skolem map for every item. Perhaps just for a certain client, if each client were assigned a GUID.

@rybesh
Copy link
Member Author

rybesh commented Jun 24, 2020

Option 2 would not be so difficult… this is just a dict with an entry for each period and authority, right? So it hardly seems worth it to even split it up by client. And we could cache it server-side in between merges.

@ptgolden
Copy link
Member

Yep. Let's do that then- no need to change the database schema to add a GUID.

@ptgolden
Copy link
Member

Implemented in #273

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants