
Is it possible to declare multiple elasticsearch appenders with different jackson mixins? #41

romanmar1 opened this issue Mar 22, 2020 · 14 comments

@romanmar1

If yes, is that efficient resource-wise?

@rfoltyns (Owner)

Yes, it's possible.

Resource-wise - it depends. Usually it's not efficient, as every Elasticsearch appender instance spawns several threads (batch emitter, metric printers, IO threads) and if you use pooled buffers, it also allocates at least as many bytes as you configure.

It depends what you need. Could you describe what exactly you want to achieve?

E.g. if you'd like to have mixins-per-index, I'd recommend programmatic config with:

  • one AsyncBatchDelivery, with one ItemSourceFactory and ObjectWriter per set of mixins, used like this:
asyncBatchDelivery.add(indexName, itemSourceFactory.create(logEvent, objectWriter));

or

  • one AsyncBatchDelivery, with one JacksonJsonLayout per set of mixins (sharing a single ItemSourceFactory!), used like this:
asyncBatchDelivery.add(indexName, jacksonJsonLayout.serialize(logEvent));
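As a pure-Java sketch of the first option, with the library types stubbed out (Delivery and MixinSetRouter below are hypothetical stand-ins, not log4j2-elasticsearch classes): a single shared delivery, plus one serializer per set of mixins.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the shared AsyncBatchDelivery.
interface Delivery {
    void add(String indexName, String payload);
}

// Routes every event through one shared delivery, picking a
// per-mixin-set serializer (standing in for ObjectWriter).
final class MixinSetRouter {
    private final Delivery sharedDelivery;
    private final Map<String, Function<Object, String>> writers;

    MixinSetRouter(Delivery sharedDelivery, Map<String, Function<Object, String>> writers) {
        this.sharedDelivery = sharedDelivery;
        this.writers = writers;
    }

    void append(String indexName, String mixinSet, Object logEvent) {
        // analogous to:
        // asyncBatchDelivery.add(indexName, itemSourceFactory.create(logEvent, objectWriter))
        sharedDelivery.add(indexName, writers.get(mixinSet).apply(logEvent));
    }
}
```

The point of the sketch is only the shape: one delivery instance shared by all routes, many serializers.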

@romanmar1 (Author)

I see.

I'll change the focus of my question a bit (or a lot):
Suppose I need to write logs for different components to different indices with different payloads,
and I want to be as resource-efficient as possible (for example: instantiate only one JestClient, or one Apache async HTTP client, with everything shared).
Mixins are no longer relevant to the question (I think I'll use PatternLayout instead).

How would I do that?

@rfoltyns (Owner)

If you want to use the same appender instance for multiple indices, for now, you'll have to write your own IndexNameFormatter tailored to your needs and derive the index name from message content or context data.

I think that a context-data-based formatter with StrLookup should do the trick. I'll add it in a future release.
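The custom-formatter idea can be sketched in plain Java. The names below are hypothetical stand-ins (the real IndexNameFormatter interface and LogEvent type live in log4j2-elasticsearch and Log4j2); the sketch only shows deriving an index name from per-event context data, with a fallback:

```java
import java.util.Map;

// Hypothetical stand-in for a custom IndexNameFormatter: the index name
// is read from the event's context data, falling back to a fixed name.
final class ContextIndexNameFormatter {
    private final String contextKey;
    private final String fallback;

    ContextIndexNameFormatter(String contextKey, String fallback) {
        this.contextKey = contextKey;
        this.fallback = fallback;
    }

    // contextData stands in for LogEvent.getContextData()
    String format(Map<String, String> contextData) {
        String name = contextData.get(contextKey);
        return (name != null && !name.isEmpty()) ? name : fallback;
    }
}
```

A real implementation would implement the library's IndexNameFormatter interface and take the map from the LogEvent instead of a raw Map.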

If you need to be resource efficient, use the hc module. It allocates several times less memory than jest, has lower CPU overhead, and further improvements are on the way.

@romanmar1 (Author)

That's not exactly what I want.
Using the same Elasticsearch appender while deriving the index name from context would be analogous to using the same RollingFileAppender and deriving the file name from context, which to me seems like the wrong kind of solution.

I want to use different appenders, possibly with different filters and everything else Log4j2 has to offer, and be resource efficient (specifically, to be able to share the same client, or delivery mechanism).

PS: Can I use hc with PatternLayout? I think I tried that and it didn't work (something about supporting the ItemSource API only).

@rfoltyns (Owner)

You can achieve that only if you configure multiple appenders programmatically. Filters are applied before LogEvent gets to the appender, so you should be fine there.

Any layouts that produce raw Strings - including PatternLayout - are not compatible with hc; only ItemSourceLayouts are, and I'm not planning to change that. JacksonJsonLayout is the only implementation at the moment.

@romanmar1 (Author)

romanmar1 commented Mar 23, 2020

Will you consider adding support for that via configuration files?

@rfoltyns (Owner)

Short answer - no, never.

With Log4j2, each appender has its own lifecycle - when requested to stop, all underlying components must stop reliably and release all of their resources, especially running threads, connections and thread locals(!) if used. They can't leak.

With config files, even if you manage to share resources between multiple appenders, the main challenge is detecting that the last appender has stopped and the shared resource can be released. Log4j2 controls the appender lifecycle, so a static registry with a usage count is one of the ideas. Possible, but horrible - I'll never add support for that.

However, programmatic config will allow you to do it - this library is highly extensible by design. You can extend the ElasticsearchAppender.stop() and ElasticsearchAppender.createItemAppenderFactory() methods and control the lifecycle of AsyncBatchDelivery on your own. Have a look at the architecture. Maybe you'll find a better way.

How many appenders would you like to have?

@romanmar1 (Author)

I'll probably start with 3 and go up from there, up to 10, give or take.

I understand your concerns.
I might add that, at least initially, the solution to this problem doesn't have to be complete, IMHO.
The demands can be more relaxed.

In the scenario where a couple of Elasticsearch appenders share a common delivery mechanism, it seems unlikely that one would want to stop some of them and not the others. Assuming that, you can forgo the reference-count mechanism and close the first appender completely. That means the other appenders will stop working; when they are subsequently closed, they dispose of resources only if those aren't already disposed.

Making this a partial capability will not jeopardize the library, I think. On the contrary, you'd be adding what I think is a common and important feature. Elasticsearch is a very common logging platform nowadays, and using Elasticsearch appenders should be as easy as using file appenders, without taking a heavy toll on resources.

@rfoltyns (Owner)

It seems unlikely that one would want to stop some of them, and not the others

You're right. Still, a Lifecycle.stop() call on the first appender must not cause instability in the others.

close the first appender completely, which will mean for the other appenders that they will stop working

That's exactly why I'd rather sacrifice a few megabytes of heap and close them properly, in order, allowing all in-flight logs to be delivered safely. I don't want to jeopardize your data(!) If a graceful shutdown can be executed, it should be.

The demands can be more relaxed

They can, as long as it doesn't lead to data loss or instability of the client application.

without taking a heavy toll on resources

I agree - that's why I started to work on pooled buffers and hc, delaying other key features. The difference between the hc and default jest module overhead is huge.


The partial solution is there already - programmatic config with user-provided handling of AsyncBatchDelivery lifecycle.

@romanmar1 (Author)

That's exactly why I'd rather sacrifice a few megabytes of heap, close them properly, in order, allowing all logs in flight to be delivered safely. I don't want to jeopardize your data(!) If graceful shutdown can be executed, it should be.

It's logging data, and the fact that it is being channeled directly to ES already puts it, theoretically, in some jeopardy. It's a calculated risk.

They can as long as it doesn't lead to data loss and instability of client application.

Data loss, perhaps; I don't see how it would lead to instability.

There are other options as well. For example, two types of Elasticsearch appenders: the type you have now, and a decorating Elasticsearch appender that holds a reference to an inner appender of the first type. The decorating appender would use the inner appender (by actually decorating ItemAppender) and only needs to pass the index name through. That would make it quite easy, I think, to implement reference counting on the inner appender: whenever a decorating appender takes a reference to an inner appender, the counter goes up; on dispose, it goes down; at zero, the inner appender is disposed. Even simpler, you could enforce the order of disposal: decorating appenders first, inner appenders last.
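A minimal sketch of the proposed reference counting, in plain Java with hypothetical names (nothing here comes from the library): the shared resource is disposed exactly once, when the last holder releases it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the proposed scheme: each decorating appender
// retains the shared inner resource on start and releases it on stop;
// only the last release actually disposes it.
final class RefCountedResource implements AutoCloseable {
    private final AtomicInteger refs = new AtomicInteger(0);
    private final Runnable onLastRelease; // e.g. stop the shared AsyncBatchDelivery
    private volatile boolean disposed = false;

    RefCountedResource(Runnable onLastRelease) {
        this.onLastRelease = onLastRelease;
    }

    void retain() {
        if (disposed) throw new IllegalStateException("already disposed");
        refs.incrementAndGet();
    }

    @Override
    public void close() {
        // dispose only when the last user is done, and only once
        if (refs.decrementAndGet() == 0 && !disposed) {
            disposed = true;
            onLastRelease.run();
        }
    }
}
```

The alternative mentioned above - a fixed disposal order (decorators first, inner appender last) - avoids the counter entirely at the cost of coupling shutdown order to configuration.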

The partial solution is there already - programmatic config with user-provided handling of AsyncBatchDelivery lifecycle.

Well, yes and no. I need to write logs not only to ES but to other destinations as well. Also, the project I'm currently working on already has a number of Log4j2 XML configurations, and I don't want the logging configuration to live in several places (code and XML). I'm a big fan of programmatic configuration myself, but that's not an acceptable reason to move everything to code configuration.

I agree, that's why I started to work on pooled buffers and hc delaying other key features - the difference between hc and default jest module overhead is huge.

Can you provide some absolute metrics for this? Perhaps from tests that you conducted?

@rfoltyns (Owner)

It's a calculated risk

I'm not willing to increase it.

i don't see how it will lead to instability

If resources are not closed properly, or are closed prematurely, some classes may not be unloaded in hot-deploy environments (this applies to most dangling resources).

Decorating appender get an inner appender by ref

Options are already available with Log4j2: AsyncAppender, RoutingAppender, RewriteAppender. All of them can ref a shared appender.

Here are measurements of GC times and memory allocation rate, taken from SmokeTest.programmaticConfigTest (jest) and SmokeTest.programmaticBufferedConfigTest (hc) at revision 946ac3e: ~10,000 logs per second (each log 200 characters long), 2 GB heap, 10,000,000 logs sent. The log is allocated once, to measure the library's overhead.

Shenandoah (default, no tuning):

Metric                    jest           hc
No. of minor collections  132            12
Pause interval            ~8 sec         ~120 sec
Total Pause Time          477 ms         104 ms
Avg. Pause Time           1.72 ms        2.60 ms
Allocation Rate           185.81 MB/sec  11.45 MB/sec
Allocated Total           187.71 GB      11.2 GB
GC Throughput             99.954%        99.99%

CMS (default, no tuning):

Metric                    jest           hc
No. of minor collections  341            21
Pause interval            ~3 sec         ~50 sec
Total Pause Time          3 sec 910 ms   430 ms
Avg. Pause Time           11.4 ms        18.7 ms
Allocation Rate           172.97 MB/sec  11.17 MB/sec
Allocated Total           174.17 GB      11.06 GB
GC Throughput             99.621%        99.958%

@romanmar1 (Author)

romanmar1 commented Mar 25, 2020

Options are already available with Log4j2: AsyncAppender, RoutingAppender, RewriteAppender. All of them can ref shared appender.

I don't see how they solve the problem we discussed. Through them it's possible to implement context-based routing, but they don't solve the problem of sharing the delivery mechanism across different Elasticsearch indices.

Thanks for the performance metrics. They will help me decide which client to use.

@rfoltyns (Owner)

You're welcome! :) Always a pleasure to do these kinds of comparisons.

As mentioned earlier, you can use your own custom IndexNameFormatter based on context data, and either:

  • RewriteAppender with a custom RewritePolicy parametrized with indexName
  • or a custom MessageFactory that creates Messages with an indexName context parameter; not perfect, because one may prefer to use SLF4J; however, since it's at the class level, it allows parametrization at any level of granularity (class, package, service, group, etc.) with a properties supplier of your choice
  • or ThreadContext (formerly MDC), which is the simplest, but requires careful cleanup (thread locals); all parametrization options apply here as well
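For the ThreadContext option, the "careful cleanup" caveat is the important part. A self-contained analogue of the set-then-always-restore pattern (a plain ThreadLocal standing in for Log4j2's ThreadContext; all names here are hypothetical):

```java
// ThreadContext is effectively a thread-local map; if a value is set and
// never removed, it leaks into unrelated log events on pooled threads.
// This scope restores the previous value (or clears) on close.
final class ScopedIndexName implements AutoCloseable {
    static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private final String previous;

    ScopedIndexName(String indexName) {
        previous = CURRENT.get();
        CURRENT.set(indexName);
    }

    @Override
    public void close() {
        if (previous == null) {
            CURRENT.remove(); // avoid a thread-local leak
        } else {
            CURRENT.set(previous);
        }
    }
}
```

With the real API the same shape is ThreadContext.put("indexName", ...) in a try block and ThreadContext.remove("indexName") in finally.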

IMHO, these are reasonably safe solutions to the problem you'd like to solve. All of these interfaces are battle-proven, well tested, and allow you to use everything Log4j2 has to offer.

I can add an optional IndexNameResolver with StrLookup to the current formatters in 1.5. That would allow you to achieve a lot with minimal changes. Do you think this could be of any use to you?

@romanmar1 (Author)

RewriteAppender with custom RewritePolicy parametrized with indexName

This one will do, together with the IndexNameResolver and StrLookup.
Thanks.
