[Remote Store] Run integ tests with segrep and remote store enabled #12458

Open
Bukhtawar opened this issue Feb 26, 2024 · 3 comments
Labels
enhancement Enhancement or improvement to existing feature or request Storage Issues and PRs relating to data and metadata storage

Comments

@Bukhtawar
Collaborator

Is your feature request related to a problem? Please describe

We currently run integ tests without segrep or remote store enabled, defaulting to docrep. This can let regressions slip through and creates tech debt during release activities, where we have to depend on manual sign-offs.

Describe the solution you'd like

Given the rate of changes we make, we need to enable randomised testing across docrep/segrep/remote store modes to improve code fidelity.

Related component

Storage

Describe alternatives you've considered

No response

Additional context

No response

@Bukhtawar Bukhtawar added enhancement Enhancement or improvement to existing feature or request untriaged labels Feb 26, 2024
@github-actions github-actions bot added the Storage Issues and PRs relating to data and metadata storage label Feb 26, 2024
@andrross
Member

We run integ tests without segrep or remote store enabled defaulting to docrep

We have added parameterized tests for replication-focused tests to run both docrep and segrep. We've also added randomization to randomly select replication strategy for other tests not focused on replication. @Rishikesh1159 can you provide more detail about where we're at now and the testing you've done with enabling segrep-with-remote-store?

@Rishikesh1159
Member

Rishikesh1159 commented Feb 28, 2024

Running Integ Tests with Segment Replication and Remote Store.

Goal :

Run integ tests with segrep and remote store enabled.

Problem :

Two problems stand in the way of this goal:

1. Waiting after refresh : After a refresh, replicas need some time to catch up with the primary shards. A test that asserts on a replica shard before it has caught up with the primary will fail.

2. Running all integ tests : Every integ test is different, so there is no single framework we can follow to enable segrep for all integ tests at once. Some tests exercise behaviour that segrep does not support, and those tests will fail if we run them with segment replication enabled.

Plan to Overcome Problems :

1. Waiting after refresh :

  • To overcome this problem we have introduced a new method called refreshAndWaitForReplication .
  • refreshAndWaitForReplication is used after an indexing operation: it refreshes and then waits for the replicas to catch up. Any assertion made later in the test therefore runs only after the replicas have caught up.
  • In practice, we need to replace the refresh() calls in integ tests with refreshAndWaitForReplication .
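The refresh-then-wait idea can be sketched in plain Java as follows. This is not the OpenSearch test-framework implementation: ReplicationWaitSketch, ToyShard, the checkpoint counter, and the timeout values are hypothetical stand-ins that only illustrate refreshing the primary and then polling until every replica reports the same checkpoint.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative sketch only: toy shards, not the OpenSearch test framework. */
final class ReplicationWaitSketch {

    /** Hypothetical shard copy that tracks the latest checkpoint it can serve. */
    static final class ToyShard {
        final AtomicLong checkpoint = new AtomicLong();
    }

    /** Simulates a refresh: the primary publishes a new checkpoint and returns it. */
    static long refresh(ToyShard primary) {
        return primary.checkpoint.incrementAndGet();
    }

    /** Polls until every replica has reached the target checkpoint, or fails after the timeout. */
    static void waitForReplication(long target, List<ToyShard> replicas, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        for (ToyShard replica : replicas) {
            while (replica.checkpoint.get() < target) {
                if (System.nanoTime() - deadline > 0) {
                    throw new AssertionError("replica did not reach checkpoint " + target);
                }
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
        }
    }

    /** Refreshes, then blocks until the replicas have caught up with the primary. */
    static void refreshAndWaitForReplication(ToyShard primary, List<ToyShard> replicas) {
        long target = refresh(primary);
        waitForReplication(target, replicas, 5_000);
    }

    public static void main(String[] args) {
        ToyShard primary = new ToyShard();
        ToyShard replica = new ToyShard();

        // Simulated asynchronous segment copy: the replica lags, then follows the primary.
        Thread copier = new Thread(() -> {
            while (true) {
                replica.checkpoint.set(primary.checkpoint.get());
                try {
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
        copier.setDaemon(true);
        copier.start();

        refreshAndWaitForReplication(primary, List.of(replica));
        // Assertions on the replica are only safe from this point on.
        System.out.println("replica checkpoint = " + replica.checkpoint.get());
    }
}
```

The timeout matters in a sketch like this: without it, a replica that never catches up would hang the test instead of failing it.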

2. Running all integ Tests :

Every Integ Test that we want to run with segrep/remote-store can be divided into below 4 categories:

A. Critical Tests related to segrep/remote-store :

  • Tests that are critical to segrep/remote-store and directly test its behaviour belong in this category.
  • These tests already exist, and we want to run them with both segrep and docrep, or with and without remote store.
  • That means we need to run each of these critical tests twice, once with segrep and once with docrep.
  • We use a parameterized testing approach for these critical tests, so that a single test runs multiple times with different parameters.
  • Example PR: [Segment Replication] Add Segment Replication Specific Integration Tests #11773.
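As a rough illustration of the parameterized approach, the sketch below runs one test body once per replication type. It is plain Java rather than the test framework's own parameterization support; ParameterizedReplicationSketch and runForEachReplicationType are hypothetical names, and the "index.replication.type" settings key is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Illustrative sketch only: runs one test body once per replication type. */
final class ParameterizedReplicationSketch {

    enum ReplicationType { DOCUMENT, SEGMENT }

    /** Builds the settings for one run; the key name is an assumption for this sketch. */
    static Map<String, String> indexSettings(ReplicationType type) {
        return Map.of("index.replication.type", type.name());
    }

    /** Runs the same test body for every replication type and returns the types covered. */
    static List<ReplicationType> runForEachReplicationType(Consumer<Map<String, String>> testBody) {
        List<ReplicationType> covered = new ArrayList<>();
        for (ReplicationType type : ReplicationType.values()) {
            testBody.accept(indexSettings(type));
            covered.add(type);
        }
        return covered;
    }

    public static void main(String[] args) {
        List<ReplicationType> covered = runForEachReplicationType(
            settings -> System.out.println("running critical test with " + settings));
        System.out.println("covered: " + covered);
    }
}
```

The cost of this approach is that every critical test runs once per parameter set, which is why it is reserved for tests that directly exercise replication.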

B. Non-Critical Tests :

  • Tests that don’t directly test segrep/remote-store behaviour belong in this category.
  • Although these tests don’t test segrep behaviour directly, they may cover a different feature/operation that fails because of segrep/remote-store.
  • To identify these edge cases and confirm that segrep/remote-store works properly, we need to run these non-critical tests with segment replication as well.
  • As these are not critical tests we can use randomization here, randomly picking a replication strategy for each test. This way we cover both segrep and docrep without running each test multiple times as we do for the critical tests, which increases the test coverage for segrep/remote-store.
  • Example PR: [Segment Replication] Add random replication strategy #12297
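The randomized approach can be sketched as below, assuming a seeded random source so that a failing run can be reproduced. RandomReplicationSketch and its methods are hypothetical names for illustration and are not the code from the PR above.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Random;

/** Illustrative sketch only: picks a replication strategy at random for each test run. */
final class RandomReplicationSketch {

    enum ReplicationType { DOCUMENT, SEGMENT }

    /** Picks one strategy; the same Random state always yields the same choice. */
    static ReplicationType randomReplicationType(Random random) {
        return random.nextBoolean() ? ReplicationType.SEGMENT : ReplicationType.DOCUMENT;
    }

    /** Simulates many runs from one seed and counts how often each strategy was chosen. */
    static Map<ReplicationType, Integer> countChoices(long seed, int runs) {
        Random random = new Random(seed);
        Map<ReplicationType, Integer> counts = new EnumMap<>(ReplicationType.class);
        for (ReplicationType type : ReplicationType.values()) {
            counts.put(type, 0);
        }
        for (int i = 0; i < runs; i++) {
            counts.merge(randomReplicationType(random), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Each test executes once per run, yet both strategies get exercised across many runs.
        System.out.println(countChoices(42L, 1_000));
    }
}
```

Any single run covers only one strategy, so the coverage comes from repeated CI runs rather than from one execution.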

C. Segrep/Remote-Store Specific Integ Tests :

  • These are tests that did not previously exist for docrep. They are newly created specifically to test certain behaviors of segrep/remote-store.
  • As they are specific to segrep/remote-store, they run only once, with segrep/remote-store.
  • Example Integ Test : SegmentReplicationIT.java

D. Integ Tests not compatible with segrep/remote-store :

  • Some integ tests cover a specific feature/operation that is currently not directly supported by segrep/remote-store. Such tests belong in this category.
  • As these integ tests are not compatible with segrep/remote-store, we do not run them at all with segrep/remote-store enabled.
  • Example : ShrinkIndexIT

How to run an existing integ test with segrep/remote-store :

Step 1 :

Step 2 :

  • In the integ test, replace the usage of refresh() with refreshAndWaitForReplication() .
  • If the test uses indexRandom(forceRefresh:true), then indexRandom() internally uses refreshAndWaitForReplication(), so we don’t need to worry about refresh in this case.
  • For all other kinds of refresh invocation, such as client().prepareIndex("test").setId("1").setSource("field","value1").setRefreshPolicy(RefreshPolicy.IMMEDIATE).get(); we need to call waitForReplication() manually after the write operation. This way we wait for the replica shards to catch up with the primary after every write operation.

Step 3 :

  • Sometimes a few tests in an integ test suite make assertions on features/operations that are not compatible with segrep/remote-store. In those cases, we need to skip the assertion conditionally.
  • Example: in IndexStatsIT, the test testSimpleStats() asserted on the count of write operations on replica shards. With segment replication these write operations never happen on replica shards, so the assertion fails. To cover such cases we add a conditional check before making those assertions.
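A conditional check of this kind might look like the sketch below. It does not reproduce IndexStatsIT: ConditionalAssertionSketch, IndexingStats, and the simulated counters are hypothetical and only model the point that replicas perform no indexing operations under segment replication.

```java
/** Illustrative sketch only: guards a docrep-specific assertion behind a condition. */
final class ConditionalAssertionSketch {

    enum ReplicationType { DOCUMENT, SEGMENT }

    /** Hypothetical stats snapshot holding indexing-operation counts. */
    static final class IndexingStats {
        final long primaryIndexOps;
        final long replicaIndexOps;

        IndexingStats(long primaryIndexOps, long replicaIndexOps) {
            this.primaryIndexOps = primaryIndexOps;
            this.replicaIndexOps = replicaIndexOps;
        }
    }

    /** Toy model: docrep re-indexes every document on each replica, segrep copies segments instead. */
    static IndexingStats simulateIndexing(ReplicationType type, int docs, int replicas) {
        long replicaOps = type == ReplicationType.DOCUMENT ? (long) docs * replicas : 0L;
        return new IndexingStats(docs, replicaOps);
    }

    /** Asserts on replica write counts only when the replication type supports them. */
    static void assertIndexingStats(ReplicationType type, IndexingStats stats, int docs, int replicas) {
        check(stats.primaryIndexOps == docs, "primary indexing count");
        if (type == ReplicationType.DOCUMENT) {
            check(stats.replicaIndexOps == (long) docs * replicas, "replica indexing count");
        }
    }

    private static void check(boolean condition, String what) {
        if (!condition) {
            throw new AssertionError("unexpected " + what);
        }
    }

    public static void main(String[] args) {
        for (ReplicationType type : ReplicationType.values()) {
            assertIndexingStats(type, simulateIndexing(type, 10, 2), 10, 2);
            System.out.println(type + " assertions passed");
        }
    }
}
```

Keeping the shared assertions unconditional and guarding only the incompatible ones preserves as much coverage as possible under both strategies.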

Step 4 :

  • After following all the steps above, we have to run the integ test suite locally for a few hundred iterations.
  • This step is needed to make sure that we are not adding any new flakiness to existing tests.

Current Status :

To reach the goal of running integ tests with segrep and remote store enabled, we need to complete the following steps:

Step 1 :

Step 2 :

Step 3 :

Step 4 :

Step 5 :

Step 6 :

  • Make a list of the existing integ tests, classifying each as critical, non-critical, or not compatible with segrep/remote-store. Once the list is made, we need to modify each integ test so that it can run with segrep/remote-store enabled.
  • Implemented in PR: N/A
  • Status: NOT STARTED.

@Rishikesh1159
Member

Rishikesh1159 commented Feb 28, 2024

We have added parameterized tests for replication-focused tests to run both docrep and segrep. We've also added randomization to randomly select replication strategy for other tests not focused on replication. @Rishikesh1159 can you provide more detail about where we're at now and the testing you've done with enabling segrep-with-remote-store?

Sure @andrross . @Bukhtawar here is the current status:

Current Status :
To reach the goal of running integ tests with segrep and remote store enabled, we need to complete the following steps:

Step 1 :

Step 2 :

Step 3 :

Step 4 :

Step 5 :

Step 6 :

  • Make a list of the existing integ tests, classifying each as critical, non-critical, or not compatible with segrep/remote-store. Once the list is made, we need to modify each integ test so that it can run with segrep/remote-store enabled.
  • Implemented in PR: N/A
  • Status: NOT STARTED.

As mentioned above, we are currently working on steps 4 and 5. Once those are completed, the process and framework will be in place and only step 6 will remain.
