Dataset syncing ES-Index issue #536

Open
EddieLF opened this issue Aug 28, 2023 · 0 comments

EddieLF commented Aug 28, 2023

The sync process reports success when syncing the ES-Index for a dataset, even though the sync is failing silently somewhere. Manually syncing with the same data succeeds, and there is no obvious difference between the two paths.

To reproduce:

  • Create a new ES-Index for a project that adds at least one sequencing group
  • Synchronize the project and observe the successful sync messages, including a message that the new sequencing group(s) have been added to the index
  • Check seqr and see that the new sequencing groups are still "waiting for data" despite a successful sync
  • Sync the ES-Index by manually POSTing the exact same data
  • Observe that the new sequencing groups are now available in seqr.

What is working in the code?

Everything in this block seems to be working fine, including the "Samples added to index" and "Sequencing groups missing from index" messages. This means that returning the last es-index and second last es-index (based on timestamp) is working.

This line also works fine, which means that the preceding resp_1.raise_for_status() call succeeded. Note that raise_for_status() only raises an exception for 4xx and 5xx responses, so this rules out an HTTP error status but not a redirect or a non-200 success code.
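Since a passing raise_for_status() does not guarantee an exact 200, one way to narrow this down could be to log more about the response. The following is only a sketch: `describe_response` and `require_exact_200` are hypothetical helpers, not part of the existing sync code, and the stand-in object and its URL are made up for illustration. The attributes used (`status_code`, `history`, `url`, `text`) are the standard ones on a `requests` response.

```python
from types import SimpleNamespace


def describe_response(resp):
    """Summarise a requests-style response so it can be logged after the POST."""
    return {
        "status_code": resp.status_code,
        # requests records any redirects it followed in resp.history
        "redirected": bool(getattr(resp, "history", None)),
        "final_url": getattr(resp, "url", None),
        "body_preview": (getattr(resp, "text", "") or "")[:500],
    }


def require_exact_200(resp):
    """Stricter than raise_for_status(): reject anything that is not exactly 200."""
    if resp.status_code != 200:
        raise RuntimeError(f"Unexpected response: {describe_response(resp)}")
    return resp


# Stand-in object for illustration only; in the sync code this would be resp_1.
fake = SimpleNamespace(
    status_code=202,
    history=[],
    url="https://seqr.example/hypothetical-update-endpoint",
    text="queued",
)
summary = describe_response(fake)
```

Logging this summary for both the automated and the manual POST would show whether the two calls really receive the same status, final URL, and body.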

So where is it going wrong? I really don't know.

  • The mapping file pulls the sequencing group IDs and participant external IDs from the participant layer and uses the sequencing_group_id_format(id) function to convert the integer sequencing group ID into the CPG string ID. I'm pretty sure this part is working as intended.
  • The es-index filtering section is also working OK, as demonstrated by the fact that the messages are appended and returned just fine. The code finds the correct ES-Index and reports the new / missing sequencing groups as expected.
  • The POST statement uses the same SEQR_URL as everything else, and the _url_update_es_index variable is defined correctly.

This has not been a big blocker since the seqr sync can be run manually, which I have been doing. As far as I can tell, a manual run does nothing different except for using the API to find the ES-Indexes instead of the db layers. I don't think this should make any difference.
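To test the assumption that both paths send the exact same data, the two request bodies could be captured and compared key by key. This is a minimal sketch: `diff_payloads` is a hypothetical helper, and the key names in the example are placeholders rather than the real seqr payload fields.

```python
MISSING = "<missing>"


def diff_payloads(automated, manual):
    """Return {key: (automated_value, manual_value)} for every key that differs.

    Values are compared strictly, so 1 and "1" count as different.
    """
    diffs = {}
    for key in sorted(set(automated) | set(manual)):
        a = automated.get(key, MISSING)
        m = manual.get(key, MISSING)
        if type(a) is not type(m) or a != m:
            diffs[key] = (a, m)
    return diffs


# Placeholder payloads; the real field names would come from the sync code.
automated_body = {"index_name": "example-index ", "dataset_type": "SNV_INDEL"}
manual_body = {"index_name": "example-index", "dataset_type": "SNV_INDEL"}
differences = diff_payloads(automated_body, manual_body)
```

An empty result would support the claim that the payloads are identical, which would point the investigation at the request itself (URL, headers, authentication) rather than the data.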

I can provide a live demo of this issue on production data since we currently have some ready to sync.
