SOLR-17477: Support custom aggregates in plugin code #2742

Open
wants to merge 4 commits into
base: main

Conversation

timatbw
Contributor

@timatbw timatbw commented Oct 4, 2024

https://issues.apache.org/jira/browse/SOLR-17477

Description

Better support for writing custom aggregates in 3rd party plugin code, outside of Solr.

Solution

Opened up the visibility of certain classes that are necessary to write custom aggregates, and allowed custom aggregates to register by type so that they are supported just the same as field/query/range/heatmap etc.

Tests

Adjusted the existing test that confirms a custom aggregate can be written outside the Solr facet package.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

@mkhludnev
Member

I like it. Thanks @timatbw. I can merge once there's an approval from a second pair of 👀.

REGISTERED_TYPES.put("func", (p, k, a) -> p.parseStat(k, a));
}

public static void registerParseHandler(String type, ParseHandler parseHandler) {
Contributor

Thanks for the https://issues.apache.org/jira/browse/SOLR-17477 comment w.r.t. how this is called!

Member

@mkhludnev mkhludnev Oct 20, 2024

Not sure where to discuss it, but registering anything in a static initializer is asking for trouble, imho. I'd rather hook up a component via solrconfig.xml (or a plugin??), which just calls registerParseHandler() for custom handlers.
Anyway, it's a matter of taste and shouldn't block this PR, unless someone proposes making FP.registerParseHandler protected. I tried that; the result was so sad that I won't even show it to anyone.

Contributor Author

Yeah it's perhaps not the best way of hooking things up, especially as the whole process is already kicked off from solrconfig.xml where you declare a valueSourceParser. I was relying on the point at which that custom parser object is instantiated to run a static initialiser to also register it for json parsing too. But maybe it's safer to invert that and have the Solr code register the custom parser if it implements json parsing. I'll have another look and think about the alternatives.

My main reason for suggesting that custom code use a static block to register is the case where you have many hundreds of SolrCores in your CoreContainer: each one would be instantiated and call the register method, when really you might only need it done once. But thinking about it now, if you run a mixed workload where the cores have different solrconfig files and are not homogeneous, you probably don't want static registration (each SolrCore is distinct).

Contributor Author

Although the 'standard set' of registered types in the static block above probably is appropriate as those are built-in and common to all SolrCores.

Member

REGISTERED_TYPES is shared across cores. That means:

  • one core might inject a custom parser, and it then appears across all cores. If it holds on to some large state, that state remains on the heap after the core is unloaded (which could be called a leak).
  • registerParseHandler might be called with one of the standard names, bringing some surprises to other cores.
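
A minimal sketch of the per-core alternative to a shared static map, assuming simplified stand-in types rather than Solr's real classes. `RegistrySketch`, `CoreRegistry`, and the two-argument `ParseHandler` here are hypothetical names for illustration only; the actual interface in this PR also takes the parent `FacetParser`. Each registry instance belongs to one core, so a custom handler is not visible to other cores, becomes unreachable together with its core, and cannot replace a built-in type name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: stand-in types, not Solr's actual FacetParser API. */
final class RegistrySketch {

  /** Simplified, hypothetical stand-in for the ParseHandler interface in this PR. */
  @FunctionalInterface
  interface ParseHandler {
    Object parse(String key, Object args);
  }

  /** Hypothetical per-core registry: one instance per core, discarded with the core. */
  static final class CoreRegistry {
    private final Map<String, ParseHandler> builtIns;
    private final Map<String, ParseHandler> custom = new ConcurrentHashMap<>();

    CoreRegistry(Map<String, ParseHandler> builtIns) {
      // Immutable copy: the built-in types are common to all cores and never change.
      this.builtIns = Map.copyOf(builtIns);
    }

    /** Registers a custom handler for this core only; built-in names are protected. */
    void register(String type, ParseHandler handler) {
      if (builtIns.containsKey(type)) {
        throw new IllegalArgumentException("Cannot override built-in type: " + type);
      }
      custom.put(type, handler);
    }

    /** Returns the handler for the type, or null if the type is unknown to this core. */
    ParseHandler lookup(String type) {
      ParseHandler handler = custom.get(type);
      return handler != null ? handler : builtIns.get(type);
    }
  }

  public static void main(String[] args) {
    Map<String, ParseHandler> builtIns = Map.of("func", (k, a) -> "stat:" + k);
    CoreRegistry coreA = new CoreRegistry(builtIns);
    CoreRegistry coreB = new CoreRegistry(builtIns);

    // Only core A registers the custom type.
    coreA.register("myagg", (k, a) -> "custom:" + k);

    System.out.println(coreA.lookup("myagg") != null); // prints "true"
    System.out.println(coreB.lookup("myagg") != null); // prints "false"
  }
}
```

With a static map, the second lookup would also succeed, which is the cross-core visibility described in the first bullet.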

Contributor Author

Yes all good points, I'm going to change this to remove the statics and make it per-core which probably fits better with how it's hooked in too

Member

@mkhludnev mkhludnev left a comment

please proceed with access modifiers

@@ -138,21 +139,30 @@ public Object parseFacetOrStat(String key, Object o) throws SyntaxError {
return parseFacetOrStat(key, type, args);
}

public interface ParseHandler {
Object doParse(FacetParser<?> parent, String key, Object args) throws SyntaxError;
Member

btw, is there another example of a naming convention like doSomething? Why not just parse?

Contributor Author

No particular strong feelings on this, I tend to use doSomething when it's an inner helper method called from an outer/wrapper method that is the something(..) but in this case it's not really directly called. In practice I think people would use lambda anyway and never actually write this method name. But happy to change it to parse!
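
To illustrate the point about lambdas, here is a small sketch assuming a simplified two-argument `ParseHandler` and a stand-in `SyntaxError` (both hypothetical here; the interface in this PR also takes the parent `FacetParser`). Because the interface has a single abstract method, implementers can supply a lambda or method reference and never spell out the method name; only the calling code does.

```java
/** Illustrative only: simplified stand-ins, not Solr's real signatures. */
final class LambdaNamingSketch {

  /** Stand-in for Solr's checked SyntaxError exception. */
  static class SyntaxError extends Exception {
    SyntaxError(String message) {
      super(message);
    }
  }

  /** Single abstract method, so lambdas and method references can implement it. */
  @FunctionalInterface
  interface ParseHandler {
    Object parse(String key, Object args) throws SyntaxError;
  }

  /** A hypothetical named parsing routine with a compatible signature. */
  static Object parseCount(String key, Object args) throws SyntaxError {
    if (args == null) {
      throw new SyntaxError("Missing args for " + key);
    }
    return key + "=" + args;
  }

  public static void main(String[] argv) throws SyntaxError {
    // Neither implementation mentions the interface method's name.
    ParseHandler viaLambda = (key, args) -> key + "=" + args;
    ParseHandler viaMethodRef = LambdaNamingSketch::parseCount;

    // Only the caller writes "parse" (or "doParse").
    System.out.println(viaLambda.parse("x", 1));    // prints "x=1"
    System.out.println(viaMethodRef.parse("y", 2)); // prints "y=2"
  }
}
```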

@@ -138,21 +139,30 @@ public Object parseFacetOrStat(String key, Object o) throws SyntaxError {
return parseFacetOrStat(key, type, args);
}

public interface ParseHandler {
Object doParse(FacetParser<?> parent, String key, Object args) throws SyntaxError;
Member

Here the return type is a little bit tricky: FacetRequest|AggValueSource. I suppose it deserves to be documented via Javadoc, unless we can declare it explicitly.

Contributor Author

This is already how it is: the calling code allows the parser to return either a FacetRequest or an AggValueSource, hence the Object type was needed here too. It's not ideal, and perhaps in practice most people will want to produce an AggValueSource only. Creating new subclasses of FacetRequest outside the package is harder, and I've not yet looked at how feasible it is to do that from a plugin (it's not the focus of this issue). But it would be interesting if eventually you could define custom types of faceting too.
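
As a rough sketch of what an `Object` return type implies for a caller, assuming empty stand-in classes named after `FacetRequest` and `AggValueSource` (this is not Solr's actual calling code, and `describe` is a hypothetical helper): the caller has to check the runtime type, which is why documenting the two permitted types in Javadoc would help.

```java
/** Illustrative only: empty stand-ins for Solr's FacetRequest and AggValueSource. */
final class ReturnTypeSketch {

  static class FacetRequest {}

  static class AggValueSource {}

  /**
   * Classifies a parse result.
   *
   * @param parsed expected to be either a FacetRequest or an AggValueSource
   * @return "facet" or "stat"
   * @throws IllegalStateException if the handler returned any other type
   */
  static String describe(Object parsed) {
    if (parsed instanceof FacetRequest) {
      return "facet";
    }
    if (parsed instanceof AggValueSource) {
      return "stat";
    }
    throw new IllegalStateException("Unexpected parse result: " + parsed);
  }

  public static void main(String[] args) {
    System.out.println(describe(new FacetRequest()));   // prints "facet"
    System.out.println(describe(new AggValueSource())); // prints "stat"
  }
}
```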

@mkhludnev
Member

Oh gosh, I figured out how to update the PR as a maintainer. The problem was that I did it via HTTPS auth from IDEA, but it seems it requested a token only for my own account and can't push to someone else's, even if "Maintainers are allowed to edit this pull request" is turned on. Good old SSH did the trick.
