
SOLR-16503: Replace default USH apache client with Http2SolrClient #2741

Open
wants to merge 36 commits into main

Conversation

iamsanjay (Contributor)

https://issues.apache.org/jira/browse/SOLR-16503

Replaced the default Apache HTTP client provided by UpdateShardHandler with the Http2SolrClient provided by CoreContainer#getDefaultHttpSolrClient, which will be introduced in #2689. We are still deciding on the best way to set the URL, so for now a new Http2SolrClient is re-created almost everywhere:

try (var solrClient =
    new Http2SolrClient.Builder(baseUrl)
        .withHttpClient(coreContainer.getDefaultHttpSolrClient())
        .build()) {
  // code goes here
}

Of course, if we decide to go with #2714, then setting the URL and closing the instance would be replaced with a new abstraction:

var updateRsp = client.requestWithBaseUrl(url, (c) -> req.process(c));
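As a self-contained sketch of the shape such an abstraction could take (all names here are hypothetical; the real API proposed in #2714 may differ), the idea is to scope a client to one base URL for the duration of a lambda and guarantee it is closed afterwards, so call sites stop managing the client lifecycle themselves:

```java
import java.util.function.Function;

public class RequestWithBaseUrlSketch {

    // Hypothetical stand-in for a SolrClient pinned to one base URL.
    interface ScopedClient extends AutoCloseable {
        String baseUrl();

        @Override
        void close(); // narrowed to throw no checked exception, for brevity
    }

    // The abstraction: build a client for the given URL, run the caller's
    // operation against it, and close the client when the lambda returns.
    static <R> R requestWithBaseUrl(String url, Function<ScopedClient, R> op) {
        try (ScopedClient client = new ScopedClient() {
            @Override public String baseUrl() { return url; }
            @Override public void close() { /* release per-request resources */ }
        }) {
            return op.apply(client);
        }
    }

    public static void main(String[] args) {
        // Call sites no longer open/close anything explicitly.
        String rsp = requestWithBaseUrl("http://node1:8983/solr",
            c -> "GET " + c.baseUrl() + "/admin/ping");
        System.out.println(rsp);
    }
}
```

The try-with-resources lives inside the helper, which is exactly what makes the call sites in this PR shorter.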

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide.

@iamsanjay iamsanjay changed the title SOLR-16503: Replace default USH apache client with Jetty Http2SolrClient SOLR-16503: Replace default USH apache client with Http2SolrClient Oct 4, 2024

iamsanjay commented Oct 5, 2024

SplitShardWithNodeRoleTest.testSolrClusterWithNodeRoleWithPull failed! It has failed in the past as well. The test itself looks simple, but the split operation it exercises involves a lot of complexity. This is how it runs:

  1. Create a collection with one shard containing one NRT and one PULL replica.
  2. waitForState to achieve above configuration
  3. Index 10 documents
  4. Commit collection
  5. Perform split operation
  6. waitForState for 2 active sub-shards (about 45 seconds; this is where it failed)

I am not sure if increasing the timeout will help, but we can try it! In the logs, one can see that the IndexFetcher was running just before the assertion failed. Maybe the PULL replicas were still downloading the index, time ran out, and the sub-shard never recovered.

Question: Do both NRT and PULL replicas need to be active for sub-shards to be active?

Note: I found a JIRA ticket related to this one: https://issues.apache.org/jira/browse/SOLR-16753.
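The waitForState step above is essentially poll-until-predicate-or-timeout. A simplified, self-contained sketch of that pattern (my own stand-in, not Solr's actual implementation) shows why a slow IndexFetcher can push the state change past the deadline: the predicate simply never turns true before the last poll.

```java
import java.util.function.BooleanSupplier;

public class WaitForStateSketch {

    // Poll the predicate until it returns true or the deadline passes.
    // Returns true only if the desired state was reached in time.
    static boolean waitForState(BooleanSupplier stateReached,
                                long timeoutMs, long pollIntervalMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (System.nanoTime() < deadline) {
            if (stateReached.getAsBoolean()) {
                return true; // e.g. both sub-shards report ACTIVE
            }
            Thread.sleep(pollIntervalMs);
        }
        return stateReached.getAsBoolean(); // one last check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Simulate a sub-shard that becomes active after ~200 ms,
        // comfortably inside a 45-second-style timeout.
        boolean ok = waitForState(
            () -> System.currentTimeMillis() - start > 200, 45_000, 50);
        System.out.println(ok);
    }
}
```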

Comment on lines 862 to 865
.withSocketTimeout(30000, TimeUnit.MILLISECONDS)
.withConnectionTimeout(15000, TimeUnit.MILLISECONDS)
.withHttpClient(updateShardHandler.getDefaultHttpClient())
.withZkClientTimeout(30000, TimeUnit.MILLISECONDS)
.withZkConnectTimeout(15000, TimeUnit.MILLISECONDS)
.withHttpClient(getCoreContainer().getDefaultHttpSolrClient())

withSocketTimeout was on the HTTP connection at the LBHttpSolrClient layer, not the ZK one. The Http2/Jetty side doesn't quite work the same way: it creates an LBHttp2SolrClient without such customizations. AFAICT, there's no direct substitute on our Http2 builder; instead you'd have to create the Jetty HttpClient with those settings and then pass it in.

Aaaaanyway.... I don't think it's worth retaining such particulars here. I don't know why these specific timeouts were put here; they were added by @sigram in relation to ReindexCollection. I recommend we simply use the new getCoreContainer().getDefaultHttpSolrClient().

.withZkConnectTimeout(15000, TimeUnit.MILLISECONDS)
.withHttpClient(getCoreContainer().getDefaultHttpSolrClient())
.build()) {
try (var solrClient =

Since this client is embedded into the Cloud one, I think you should name this var something like internalClient, and don't refer to it in the try-finally block except to pass it into the CloudSolrClient.

BTW, it really is painful to customize these particular timeouts, as we're forced to do it at an inner layer. Ugh. @epugh, not sure if you worked on some of the SolrClient building methods and saw this issue before. Maybe we should add the same methods to CloudHttp2SolrClient.Builder.
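The suggestion amounts to mirroring the inner client's timeout setters on the outer builder and delegating, so callers configure one layer only. Schematically (hypothetical names, not the actual CloudHttp2SolrClient API):

```java
import java.util.concurrent.TimeUnit;

public class OuterBuilderSketch {

    // Stand-in for the inner (Http2SolrClient-style) builder that
    // already exposes the timeout knobs.
    static class InnerBuilder {
        long socketTimeoutMs = 600_000;
        long connectionTimeoutMs = 60_000;

        InnerBuilder withSocketTimeout(long t, TimeUnit unit) {
            this.socketTimeoutMs = unit.toMillis(t);
            return this;
        }

        InnerBuilder withConnectionTimeout(long t, TimeUnit unit) {
            this.connectionTimeoutMs = unit.toMillis(t);
            return this;
        }
    }

    // Outer (CloudHttp2SolrClient-style) builder: re-expose the same
    // methods and delegate, hiding the inner layer from callers.
    static class OuterBuilder {
        final InnerBuilder inner = new InnerBuilder();

        OuterBuilder withSocketTimeout(long t, TimeUnit unit) {
            inner.withSocketTimeout(t, unit);
            return this;
        }

        OuterBuilder withConnectionTimeout(long t, TimeUnit unit) {
            inner.withConnectionTimeout(t, unit);
            return this;
        }
    }

    public static void main(String[] args) {
        OuterBuilder b = new OuterBuilder()
            .withSocketTimeout(30_000, TimeUnit.MILLISECONDS)
            .withConnectionTimeout(15, TimeUnit.SECONDS);
        System.out.println(b.inner.socketTimeoutMs);
        System.out.println(b.inner.connectionTimeoutMs);
    }
}
```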

var response = solrClient.requestWithBaseUrl(baseUrl, client -> client.request(request));
var response = solrClient.requestWithBaseUrl(baseUrl, request::process).getResponse();

This is rather nice. @gerlowskija, I think we can do away with the lambda-less requestWithBaseUrl I was suggesting. WDYT?

Comment on lines +187 to 198

InputStream is = null;
var solrClient = coreContainer.getDefaultHttpSolrClient();

try {
GenericSolrRequest request = new GenericSolrRequest(GET, "/node/files" + getMetaPath());
request.setResponseParser(new InputStreamResponseParser(null));
var response = solrClient.requestWithBaseUrl(baseUrl, request::process).getResponse();
is = (InputStream) response.get("stream");
metadata =
Utils.executeGET(
coreContainer.getUpdateShardHandler().getDefaultHttpClient(),
baseUrl + "/node/files" + getMetaPath(),
Utils.newBytesConsumer((int) MAX_PKG_SIZE));
Utils.newBytesConsumer((int) MAX_PKG_SIZE).accept((InputStream) response.get("stream"));
m = (Map<?, ?>) Utils.fromJSON(metadata.array(), metadata.arrayOffset(), metadata.limit());

Please replace the needless use of the special InputStreamResponseParser with a standard JsonMapResponseParser. Or perhaps don't even do that; we don't have to use JSON, I think. Solr will by default negotiate the format and get you a NamedList (a Map-like thing).

var solrClient = coreContainer.getDefaultHttpSolrClient();
var resp = solrClient.requestWithBaseUrl(baseUrl, request::process).getResponse();

if (Utils.getObjectByPath(resp, false, Arrays.asList("files", path)) != null) {

There are convenience methods on NamedList that avoid the need for this awkward utility method. Try findRecursive.
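Conceptually, findRecursive walks a nested named structure one key at a time and returns null as soon as any hop is missing. A simplified stand-in over plain Maps (my own sketch, not Solr's actual NamedList implementation) illustrates the lookup the suggestion replaces:

```java
import java.util.Map;

public class FindRecursiveSketch {

    // Follow the keys through nested maps; return null if a key is
    // missing or an intermediate value is not itself a map.
    static Object findRecursive(Map<?, ?> root, String... keys) {
        Object current = root;
        for (String key : keys) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // Shape loosely modeled on the "/node/files" response in the diff.
        Map<String, Object> resp =
            Map.of("files", Map.of("/mypkg/1.0/pkg.jar", Map.of("size", 1234)));
        System.out.println(findRecursive(resp, "files", "/mypkg/1.0/pkg.jar") != null);
        System.out.println(findRecursive(resp, "files", "/missing") == null);
    }
}
```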
