Replayer instrumentation #475

Merged: 100 commits, Feb 9, 2024

Commits on Nov 16, 2023

  1. Move addMetricsIfPresent into the metrics builder as a first class method for others to leverage.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Nov 16, 2023
    a4caca7

Commits on Nov 27, 2023

  1. WIP to play with OpenTelemetry metric instruments and tracer spans.

    Most of this is just playing, but making the StreamManager implement AutoCloseable gives a place to end spans to show how long a serializer/connection factory was relevant for.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Nov 27, 2023
    c026588
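
The AutoCloseable idea in the commit above can be illustrated with a minimal sketch. It does not use the OpenTelemetry API; `TimedScope`, its injected clock, and the `sink` list are hypothetical stand-ins for a span and its exporter, chosen only to show how `close()` gives a natural place to end a span.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical stand-in for a tracer span: it records its name and how long it was open.
final class TimedScope implements AutoCloseable {
    private final String name;
    private final LongSupplier clock;   // injected so the duration is testable
    private final List<String> sink;    // stands in for a span exporter
    private final long startNanos;

    TimedScope(String name, LongSupplier clock, List<String> sink) {
        this.name = name;
        this.clock = clock;
        this.sink = sink;
        this.startNanos = clock.getAsLong();
    }

    @Override
    public void close() {
        // Ending the "span" here ties its duration to the lifetime of the resource.
        sink.add(name + ":" + (clock.getAsLong() - startNanos));
    }
}

final class TimedScopeDemo {
    public static void main(String[] args) {
        List<String> finished = new ArrayList<>();
        try (TimedScope scope = new TimedScope("streamManager", System::nanoTime, finished)) {
            // ... work that uses the serializer/connection factory ...
        }
        System.out.println(finished);
    }
}
```

With try-with-resources, the span is ended even when the body throws, which is the main reason to hang the span's end on `close()`.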

Commits on Nov 30, 2023

  1. Get gradle files and docker-compose in order to support otlp exports to the collector to prometheus, zipkin, etc
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Nov 30, 2023
    f3c0077
  2. WIP

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Nov 30, 2023
    7fb8e2e
  3. a8ae3d1
  4. Add labels to each metric instrument so that multiple values can be plotted within the same graph in prometheus.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Nov 30, 2023
    da9d36b
  5. Move the MetricsClosure into its own class and stop stuffing the metrics into an optional.
    
    Dropping the optionals makes the code simpler and if we don't want to do logging, we can just not fill in the configuration for the SDK.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Nov 30, 2023
    06618ca
  6. WIP - Cleanup + get Jaeger to work by switching the endpoint. Also introduce some more typesafe wrappers for contexts.
    
    Lots more to come.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Nov 30, 2023
    aba1aab
  7. Start moving away from ThreadLocal and 'current contexts' and toward explicitly passing strongly typed context objects.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Nov 30, 2023
    900bc6d
  8. Get span parenting to work.

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Nov 30, 2023
    3746a8e
  9. e0e7bf1
  10. Attempt to fix a failing unit test.

    Make sure that the context is using the right requestKey, which also will have the appropriate indices as per the test context.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Nov 30, 2023
    4b43262
  11. Refactor. Couple name changes, class package changes, and moved IReplayerRequestContext to the replayer
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Nov 30, 2023
    322e12f

Commits on Dec 1, 2023

  1. Bundle all of the offloader spans with the netty handler spans.

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 1, 2023
    723bf77

Commits on Dec 2, 2023

  1. Improve the tracing story for the capture proxy.

    Don't bother showing the Kafka offloader just buffering (was called recordStream).  Now the offloader span is a child span of the connection span from the handler, so we can see the handler gathering the request/response (or waiting for the response).
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 2, 2023
    15a1705
  2. Tracing change: Flatten the flush span and just record it as 'blocked'.

    That makes it a separate state for the logging handler superclass.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 2, 2023
    8a6f52a

Commits on Dec 4, 2023

  1. Minor cleanup - stop setting the namespace or trying to change in a processor.
    
    Prometheus metrics already have an export_name that is unique, the processors weren't doing anything useful, & the namespace was appending EVERYTHING from one of the two services.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 4, 2023
    c50e01d
  2. Start instrumenting the replayer with more contexts so that traces and (less so for now) metrics can be exported across more of the lifetime of a request/connection.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 4, 2023
    17c517d

Commits on Dec 11, 2023

  1. Double down on using Context objects in lieu of String labels and fix a test bug.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 11, 2023
    6288844
  2. Merge branch 'FixKafkaResume' into OtelMetricsAndTraces

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    
    # Conflicts:
    #	TrafficCapture/nettyWireLogging/src/main/java/org/opensearch/migrations/trafficcapture/netty/ConditionallyReliableLoggingHttpRequestHandler.java
    #	TrafficCapture/nettyWireLogging/src/main/java/org/opensearch/migrations/trafficcapture/netty/LoggingHttpRequestHandler.java
    #	TrafficCapture/nettyWireLogging/src/test/java/org/opensearch/migrations/trafficcapture/netty/ConditionallyReliableLoggingHttpRequestHandlerTest.java
    #	TrafficCapture/trafficCaptureProxyServer/src/main/java/org/opensearch/migrations/trafficcapture/proxyserver/netty/ProxyChannelInitializer.java
    #	TrafficCapture/trafficReplayer/src/main/java/org/opensearch/migrations/replay/Accumulation.java
    #	TrafficCapture/trafficReplayer/src/main/java/org/opensearch/migrations/replay/CapturedTrafficToHttpTransactionAccumulator.java
    #	TrafficCapture/trafficReplayer/src/main/java/org/opensearch/migrations/replay/RequestResponsePacketPair.java
    #	TrafficCapture/trafficReplayer/src/main/java/org/opensearch/migrations/replay/RequestSenderOrchestrator.java
    #	TrafficCapture/trafficReplayer/src/test/java/org/opensearch/migrations/replay/SimpleCapturedTrafficToHttpTransactionAccumulatorTest.java
    gregschohn committed Dec 11, 2023
    09e849c

Commits on Dec 12, 2023

  1. Merge branch 'main' into OtelMetricsAndTraces

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 12, 2023
    9cf2540
  2. Update the Http Logging Handler to suppress response packet captures when the request was ignored and remove the now-unused file for responses.
    
    I'd like to revisit this eventually to make sure that it's as efficient as possible and to organize it better.  However, it does get the job done for now and tests were updated to confirm.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 12, 2023
    c14da6a
  3. File rename since the LoggingHttpRequest handler now handles both requests and responses.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 12, 2023
    7de4009

Commits on Dec 15, 2023

  1. Shuffling lots of details of Contexts and the relationships between different levels of them.
    
    This has the minimal amount of work to get those relationships to simply compile.  Nearly every unit test fails and the code is more clunky than it needs to be, but getting to this point was alone a major lift.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 15, 2023
    a5bfc7d
  2. Lots of refactoring to get a couple more test cases to pass.

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 15, 2023
    3d60106
  3. Test fixes

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 15, 2023
    ad6bd13

Commits on Dec 16, 2023

  1. Begin to cleanup endspans for some of the contexts. Lots of bugs remain, but the replayer isn't crashing.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 16, 2023
    07ae016

Commits on Dec 17, 2023

  1. More work to get context chains to work better together.

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 17, 2023
    ccb517c
  2. Two critical bugfixes around handling close observations that were discovered by trace inspection.
    
    1) close was called AFTER the RRPair was rotated, so there were no traffic streams being committed, resulting in a perpetual hole in the commit log.
    2) close was being scheduled immediately, before all requests (in most cases) because the channelInteractionNumber was being miscalculated.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 17, 2023
    8bda2e3

Commits on Dec 18, 2023

  1. Fix some test code where the nodeId and connectionId got reversed, causing the same ConnectionReplaySession to be returned for every connection, which resulted in serious corruption of ordering and tests never completing.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 18, 2023
    d9df3fa

Commits on Dec 19, 2023

  1. Extra guards to try to make tests more reliable, but one of the FullTrafficTest runs (-1, false) is still missing a message.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 19, 2023
    ab3dfb4
  2. More test fixes, including fixing a regression that I had caused in an earlier edit around flushing the streams held list on close.
    
    That should only happen when the close was on an accumulated pair that had a broken request that wasn't rotated.  Otherwise, the traffic streams would be committed twice.
    I also tweaked makeTrafficStreams to keep stable ids on the individual streams to make debugging runs a LOT simpler.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 19, 2023
    273c5aa
  3. Fix a race condition with commitKafkaKey.

    It could be called from one thread while the nextCommit maps were being modified by other (Kafka Consumer) threads.  Now there's a lock and a quick copy to protect against that race.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 19, 2023
    e0167f5
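
The "lock and a quick copy" approach from the commitKafkaKey fix above can be sketched as follows. This is a minimal illustration under assumptions, not the replayer's code: `PendingCommits`, `markReady`, and `drain` are hypothetical names, and the map of partition to offset is an assumed shape for the nextCommit data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical tracker; the names are illustrative, not the replayer's classes.
final class PendingCommits {
    private final Object lock = new Object();
    private final Map<Integer, Long> nextCommits = new HashMap<>();

    // Called by consumer threads as offsets become ready to commit.
    void markReady(int partition, long offset) {
        synchronized (lock) {
            // Keep the highest offset seen for each partition.
            nextCommits.merge(partition, offset, Math::max);
        }
    }

    // Called by the committing thread: copy and clear under the lock, then
    // let the caller do the slow commit work on its private snapshot.
    Map<Integer, Long> drain() {
        synchronized (lock) {
            Map<Integer, Long> snapshot = new HashMap<>(nextCommits);
            nextCommits.clear();
            return snapshot;
        }
    }
}
```

Holding the lock only for the copy keeps the critical section short, so threads calling `markReady` are not blocked while the snapshot is being committed.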

Commits on Dec 20, 2023

  1. Two changes to kafka interactions. Add trace spans for traffic source/kafka interactions + preempt blocking in the BlockingTrafficSource when there are keys (offsets) to commit.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 20, 2023
    3d81ad8
  2. Extract an IInstrumentationAttributes interface from IScopedInstrumentationAttributes.
    
    This allows for passing root contexts around that have attributes but don't have an associated span.  This helps the code make fewer assumptions about how it is situated.
    The change also opens the door to removing all of the static factories for spans and metrics.  Those factories can be chained from a top-level context that is passed throughout the callstack, rooted from these new IInstrumentationAttributes classes.  That change isn't here, but will probably be completed in the near future & this makes it easier.
    I'm also in the process of adding contexts and spans to more places (like the high-level traffic source work), which is what precipitated the greater change.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 20, 2023
    ae45a6a

Commits on Dec 21, 2023

  1. Checkpoint/WIP - More spans across the board, specifically through the target transaction.
    
    Unfortunately, the TrafficReplayer fails to load because of a race condition around statically initializing the Otel SDK.
    However, all unit tests are working, so this is a checkpoint release before I remove the reliance on static otel initialization and move toward it being done via contexts.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 21, 2023
    195d0ba

Commits on Dec 22, 2023

  1. Test bugfix. toString() wasn't threadsafe. Now it is.

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 22, 2023
    bef9944
  2. Refactoring and code consolidation around context management.

    The static initializer race conditions have been resolved (no more static otel for tracing and metering) and the E2E solution is functional again.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 22, 2023
    8b50c89
  3. Refactoring. Which classes emit metrics, all scopes have names, and grouping more contexts together in files.
    
    1) RootOtelContext creates all SimpleMeteringClosures and IInstrumentationAttributes has default methods to wire together all metric emissions (making it a lot more convenient).
    2) All IInstrumentationAttributes have a scopeName now so that it's consistent and can be used to make metrics and spans.
    3) All replayer contexts were split into interfaces and implementations (bridge or PIMPL pattern).  I've moved them around so that they now all reside in the tracing directory in the same type of structure.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 22, 2023
    0e5fe09

Commits on Dec 23, 2023

  1. Refactor TestContext so that it doesn't use statics and allows callers to bring it up with or w/out InMemory exporters.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 23, 2023
    613a504
  2. Remove a hardcoded path to my local directory

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 23, 2023
    7df0dcc
  3. Fixed bugs in trace management and forced a lot more test code to take a context into it so that span verifications can happen within unit tests.
    
    The FullTrafficReplayerTest now has a test that is verifying the number of spans that were reported.  That was useful to work through to fix the bugs in double counting some spans.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 23, 2023
    de0d482
  4. Add another scheduled span before the request is sent.

    I'm not adding one BETWEEN writes yet because that code is too complicated and should be simplified.  I'm also not sure that that isn't going to flood the traces for limited value.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 23, 2023
    72b1ca8

Commits on Dec 26, 2023

  1. Move all span names into virtual interface functions so that they can be used for some metrics too.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Dec 26, 2023
    6043eed

Commits on Jan 2, 2024

  1. Pass more contexts, make contexts able to express more metrics, and emit more.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 2, 2024
    37b99eb

Commits on Jan 3, 2024

  1. Minor bugfixes that make a huge difference. Fix a broken unit test as a metric name had changed and meter delta events as upDownCounters.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 3, 2024
    6684486
  2. Some refactoring to increase the typesafety and to support greater control over which attributes are included within metrics and spans.
    
    I've also added "activeConnection" for capture connections.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 3, 2024
    7bf4388

Commits on Jan 4, 2024

  1. Fix mend security issue for json-path CVE by updating opensearch-security to 2.11.1.0
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 4, 2024
    8ef0376
  2. Remove zipkin as a tracing sink.

    I've personally been using jaeger more and zipkin has been crashing with an OOM on startup (probably due to my docker setup) and I suspect that it has been causing greater problems with the otel collector.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 4, 2024
    37ae548

Commits on Jan 5, 2024

  1. Make attribute name filtering more generic and fix a bug in negation so that the connectionId is now emitted from the activeConnections metric.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 5, 2024
    1172912

Commits on Jan 8, 2024

  1. Minor tweaks to the otel collector (including renaming from 'demo') and adding some TODOs for future research.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 8, 2024
    22296b7
  2. Set the aggregation temporality to delta rather than cumulative.

    I still need to continue to knock the dimensionality of the data (unique attributes) down considerably, but this at least mitigates the grpc overflow errors that I was seeing.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 8, 2024
    d1b237a
  3. Wrap all metric emissions within the context's span so that the metric can be emitted with the span data as its exemplar.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 8, 2024
    fdd8141
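
The delta versus cumulative distinction from the Jan 8 temporality commit can be illustrated with a toy counter store. This is a sketch under assumptions and does not use the OpenTelemetry SDK: `CounterExportSketch` and its methods are hypothetical. It shows one plausible reason an export batch shrinks under delta temporality when many attribute combinations exist: series that did not change since the last export are not re-sent.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical counter store keyed by series (metric name plus attribute values).
final class CounterExportSketch {
    private final Map<String, Long> totals = new HashMap<>();
    private final Map<String, Long> sinceLastExport = new HashMap<>();

    void add(String series, long value) {
        totals.merge(series, value, Long::sum);
        sinceLastExport.merge(series, value, Long::sum);
    }

    // Cumulative: every series seen so far is reported on every export.
    Map<String, Long> exportCumulative() {
        return new HashMap<>(totals);
    }

    // Delta: only series that changed since the previous export are reported.
    Map<String, Long> exportDelta() {
        Map<String, Long> batch = new HashMap<>(sinceLastExport);
        sinceLastExport.clear();
        return batch;
    }
}
```

As the commit message notes, reducing the number of unique attribute combinations is still the larger lever; temporality only changes how much of that state is sent per export.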

Commits on Jan 9, 2024

  1. In progress changes. I'm trying to track down a regression and want to preserve new work first.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 9, 2024
    490521d

Commits on Jan 10, 2024

  1. In-progress checkpoint (code won't compile). Setting up separate metric instruments for each of the contexts' needs.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 10, 2024
    f199e98
  2. Another in-progress checkpoint (still won't compile) where I'm moving metric instruments into the context classes.
    
    I'm not happy with the scope type propagation through so much of the interface hierarchy, so I'm going to revisit that next.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 10, 2024
    e110540

Commits on Jan 11, 2024

  1. Another checkpoint that still doesn't compile, but fewer files (I think) are problematic.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 11, 2024
    156ae72
  2. Stop passing the Root telemetry scope as a generic parameter to all of the instrumentation interfaces.
    
    That was only needed for one call, which is easily inlined into the implementation classes by putting createChildContext() calls into each parent.  That allows the implementations to go back to their own root scopes and do whatever is necessary.  Lots of code is a lot easier to read and maintain now.
    There are still other changes that I'm in the process of making, including supporting linked spans and squashing compilation errors (still).
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 11, 2024
    5dc32d9

Commits on Jan 12, 2024

  1. Working on updating proxy code to get everything to compile.

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 12, 2024
    320e9d8
  2. More refactoring, still doesn't all compile, but most of it does (top-level Capture proxy is remaining).
    
    I need to stitch together a couple root context classes to provide disparate top-level contexts.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 12, 2024
    9601a68

Commits on Jan 13, 2024

  1. Fix the last of the compilation errors though tests are failing still.

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 13, 2024
    5f6bb3f

Commits on Jan 14, 2024

  1. Bugfixes and test fixes to get all of the unit tests to pass.

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 14, 2024
    ccc0e2a

Commits on Jan 15, 2024

  1. Bugfixes. Stop metering double events in a couple spots and fix a connection id naming bug that was causing FullReplayerTests to fail.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 15, 2024
    0744424
  2. 82f8ebb

Commits on Jan 16, 2024

  1. Upgrade otel libraries to 1.34.1 from 1.32 and add the enable_open_metrics flag to the prometheus exporter for the otel collector to support exemplars.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 16, 2024
    35a9185
  2. Fix a bug where the current scope's attributes weren't being added into its own span. I've also updated linked spans to allow for more than one (will test shortly)
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 16, 2024
    0e8379d

Commits on Jan 18, 2024

  1. Add a TestContext for every replayer test via inheritance on the Test class so that I can initialize and teardown the context (and do some sanity checks in the process).
    
    I've already found and fixed a couple issues with scoped contexts being doubly closed, which has resulted in UpDownCounters being corrupted.
    The final check on the TestContext to make sure that all scopes are closed isn't active yet since the results are "862 tests completed, 808 failed".
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 18, 2024
    a076bc3
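
The double-close problem described in the commit above (a scope closed twice decrements an up/down counter twice) can be reproduced and guarded against in a few lines. The sketch below is one possible defense, not necessarily the fix made in this PR: `GuardedScope` is a hypothetical class and an `AtomicLong` stands in for an OpenTelemetry UpDownCounter.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical scoped context that tracks "active" scopes in a shared counter.
final class GuardedScope implements AutoCloseable {
    private final AtomicLong activeCount;   // stands in for an UpDownCounter
    private final AtomicBoolean closed = new AtomicBoolean(false);

    GuardedScope(AtomicLong activeCount) {
        this.activeCount = activeCount;
        activeCount.incrementAndGet();
    }

    @Override
    public void close() {
        // compareAndSet lets only the first close() decrement the counter;
        // without this guard a second close() would drive the count negative.
        if (closed.compareAndSet(false, true)) {
            activeCount.decrementAndGet();
        }
    }
}
```

A test harness could additionally assert that the counter is back to zero at teardown, which is the kind of "all scopes are closed" check the commit message mentions as not yet enabled.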

Commits on Jan 19, 2024

  1. Change how MetricInstruments classes are instantiated.

    I had been pulling activity names from the enclosing class, but in cases of inheritance, that was unmaintainable.  That had caused a number of automatically-generated metrics to be missing because subclasses would use the superclass ACTIVITY_NAME definition.  Now, all the MetricInstruments constructors are private and the objects are created via factory functions that are alongside the inner class definitions.  That moves the naming responsibilities to the actual part of the code that cares and makes the system more foolproof.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 19, 2024
    d3ee4f1
  2. Build fix - When refactoring to use TestContexts more globally, a test setup method was corrupted.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 19, 2024
    54c5e27
  3. Spin up a grafana container in the docker solution with simple credentials.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 19, 2024
    0ce6694
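
The MetricInstruments change in the first Jan 19 commit (private constructors plus factory functions next to the inner class) can be sketched as below. Only the names MetricInstruments and ACTIVITY_NAME come from the commit message; `BaseInstrumentsSketch`, the two context classes, `makeMetrics`, and the `counterName` field are hypothetical and simplified so that no OpenTelemetry types are needed.

```java
// Hypothetical shared base: derives instrument names from an explicit activity name.
class BaseInstrumentsSketch {
    final String counterName;

    protected BaseInstrumentsSketch(String activityName) {
        this.counterName = activityName + "Count";
    }
}

final class RequestContextSketch {
    static final String ACTIVITY_NAME = "request";

    static final class MetricInstruments extends BaseInstrumentsSketch {
        // Private: callers must go through the factory, which supplies the name.
        private MetricInstruments(String activityName) {
            super(activityName);
        }
    }

    // The factory sits beside the inner class, so the naming decision is made here.
    static MetricInstruments makeMetrics() {
        return new MetricInstruments(ACTIVITY_NAME);
    }
}

final class ConnectionContextSketch {
    static final String ACTIVITY_NAME = "connection";

    static final class MetricInstruments extends BaseInstrumentsSketch {
        private MetricInstruments(String activityName) {
            super(activityName);
        }
    }

    static MetricInstruments makeMetrics() {
        return new MetricInstruments(ACTIVITY_NAME);
    }
}
```

Because static fields are not resolved polymorphically in Java, passing the name in explicitly avoids the failure mode the commit describes, where a subclass silently picks up its superclass's ACTIVITY_NAME.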

Commits on Jan 21, 2024

  1. Start to get Source/Target comparison metrics in place and more refactoring to rely upon values within contexts rather than passing redundant copies for things like request/channel keys.
    
    Logging comparison metrics as logs is being phased out, replaced by using metric and span attributes to directly pick up values (and letting downstream dashboards add additional logic to determine whether values matched, though statusMatch does still exist as a metric, only because it's quick and easy on both ends).
    All tests pass, but I haven't done any testing with the dockerSolution.
    I need to re-enable the kafka container test because I need to be validating constantly that that test is working, but as of now, it throws an OOM error, which needs to be investigated.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 21, 2024
    c8674eb

Commits on Jan 22, 2024

  1. Minor cleanup on exception tracking

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 22, 2024
    ffcc09b
  2. Bugfix - a class was inheriting from the Connection context's MetricInstruments class when it should not have been
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 22, 2024
    dd89336
  3. Clean up build.gradle files' open-telemetry dependencies. Embrace otel as an api dependency for coreUtilities, since that's a tight coupling.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 22, 2024
    a15d08a

Commits on Jan 23, 2024

  1. Partial checkin to delete dead code and clean up imports and style issues.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 23, 2024
    5c454ca
  2. Addressing PR Feedback with some localized cleanups

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 23, 2024
    bf8ea86

Commits on Jan 24, 2024

  1. aws cli wasn't functional within my arm64 container because the dockerfile was hardcoded for x86_64.
    
    Install awscli via pip now, which should work on both environments.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 24, 2024
    b641c5a

Commits on Jan 26, 2024

  1. Rework otel-collector container packaging.

    Support 3 different otel-collector container configurations for docker-compose.
    * Prometheus + Jaeger + OpenSearch (for metrics, traces, and 'analytics/logs') - all of which are local
    * AWS CloudWatch + AWS X-Ray (using a mounted credentials file if present, otherwise relying on however well the base image can resolve credentials) + OpenSearch (local)
    * All of the above
    
    `./gradlew :dockerSolution:composeUp` will launch the first configuration by default.  Passing the flag `-Potel-collector=otel-aws.yml` will use the AWS configuration and `-Potel-collector=otel-everything.yml` will use the 'everything' configuration.
    
    The otel-collector container itself still uses the custom-built collector from mikaylathompson as the base image.  However, the extended image normalizes the user and entrypoint so that further extensions and applications of the container can behave as if the base image was the AWS Distro for OpenTelemetry collector base image (amazon/aws-otel-collector, see https://aws-otel.github.io/docs/setup/docker-images).  That also includes being able to convert the credentials file into environment variables since the otel-collector was otherwise struggling to use the credentials file directly.
    
    The otel-collector configuration is contained within one yaml file and the code within otelConfigs now allows one to fully materialize an otel-collector config file.  The otel-config*.yaml files within the dockerSolution have been created via the makeConfigFiles.sh script.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 26, 2024
    cd34b32

Commits on Jan 27, 2024

  1. PR feedback including:

    Use System.nanoTime() instead of Instant.now() for duration calculations.
    Call the fillAttributes() method from the super (class or interface) before doing any more puts.
    Correct the order that attributes are pulled from the scope hierarchy (and add a test).
    Use the ActivityNames pattern for IWireCaptureContexts.
    StreamLifecycleManager no longer implements AutoCloseable.  I didn't need it to close down contexts and no classes had implemented a non-empty version except to log that it was called.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 27, 2024
    a4173c0
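    A small sketch of the first feedback item above: measuring elapsed time with the monotonic `System.nanoTime()` clock rather than by subtracting two `Instant.now()` readings, which follow the wall clock and can jump if the system time is adjusted. `ElapsedTimer` is a hypothetical name used only for illustration.

```java
import java.time.Duration;

// Hypothetical helper: measures elapsed time with the monotonic clock.
class ElapsedTimer {
    // System.nanoTime() is only meaningful as a difference between two readings.
    private final long startNanos = System.nanoTime();

    Duration elapsed() {
        return Duration.ofNanos(System.nanoTime() - startNanos);
    }
}
```

    The value returned by `System.nanoTime()` has no relation to the time of day, so it suits durations but not timestamps.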
  2. Change path from otelcol to otelCollector and enable the collector and OS analytics engine by default.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 27, 2024
    2072f69

Commits on Jan 28, 2024

  1. Split the implementations of fillAttributes into two.

    One for attributes that should be present in all sub-spans and another where the attributes should only be present within the current one (and not children)
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 28, 2024
    be689b5
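    One way the split described above could look, as a sketch under assumptions: the method names `fillAttributesForSpansBelow` and `fillExtraAttributesForThisSpan`, the scope classes, and the attribute keys are all hypothetical. The sketch also assumes that ancestors are visited first so that a child scope can overwrite an inherited key.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical scope base class; each scope may have a parent scope.
abstract class SketchScope {
    private final SketchScope parent;

    SketchScope(SketchScope parent) {
        this.parent = parent;
    }

    // Attributes that this span and every span beneath it should carry.
    protected void fillAttributesForSpansBelow(Map<String, Object> attrs) {}

    // Attributes that belong to this span only and are not passed to children.
    protected void fillExtraAttributesForThisSpan(Map<String, Object> attrs) {}

    final Map<String, Object> spanAttributes() {
        Map<String, Object> attrs = new LinkedHashMap<>();
        collectInherited(attrs);
        fillExtraAttributesForThisSpan(attrs);
        return attrs;
    }

    // Walk from the root scope down so that children can override ancestors.
    private void collectInherited(Map<String, Object> attrs) {
        if (parent != null) {
            parent.collectInherited(attrs);
        }
        fillAttributesForSpansBelow(attrs);
    }
}

class SketchConnectionScope extends SketchScope {
    private final String connectionId;

    SketchConnectionScope(String connectionId) {
        super(null);
        this.connectionId = connectionId;
    }

    @Override
    protected void fillAttributesForSpansBelow(Map<String, Object> attrs) {
        super.fillAttributesForSpansBelow(attrs); // call super before any puts
        attrs.put("connectionId", connectionId);
    }

    @Override
    protected void fillExtraAttributesForThisSpan(Map<String, Object> attrs) {
        attrs.put("bytesRead", 128L); // illustrative value, only on this span
    }
}

class SketchRequestScope extends SketchScope {
    private final int requestIndex;

    SketchRequestScope(SketchConnectionScope parent, int requestIndex) {
        super(parent);
        this.requestIndex = requestIndex;
    }

    @Override
    protected void fillAttributesForSpansBelow(Map<String, Object> attrs) {
        super.fillAttributesForSpansBelow(attrs);
        attrs.put("requestIndex", requestIndex);
    }
}
```

    In this sketch a request span carries `connectionId` and `requestIndex`, while `bytesRead` stays on the connection span.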
  2. Fix the dependencies for logging leaves and add 'processors' and 'receivers' for metrics, traces and logs templates.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 28, 2024
    607ff05
  3. Setting the docker command for the otel-collector service to use the aws config file
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 28, 2024
    d480ed8
  4. Set the permissions for the otel container to write to cloudwatch and xray
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 28, 2024
    59db42f

Commits on Jan 31, 2024

  1. Minor cleanup

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Jan 31, 2024
    76c8c31

Commits on Feb 1, 2024

  1. Fix the runTestBenchmarks script to work when the endpoint uses http instead of https. I've also removed the 'no-ssl' option and now deduce it from the protocol.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 1, 2024
    2cc67fd
  2. Merge branch 'main' into ReplayerInstrumentation

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    
    # Conflicts:
    #	test/awsE2ESolutionSetup.sh
    gregschohn committed Feb 1, 2024
    6e70d1a

Commits on Feb 3, 2024

  1. Aesthetic formatting changes

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 3, 2024
    9425ab6

Commits on Feb 4, 2024

  1. IInstrumentationAttributes no longer has scope related functionality. That's been pushed down to its subclass IScopedInstrumentationAttributes.
    
    Attributes for spans are now filled when the span is closed rather than when it is created.  This leaves less leeway to override a value without changing it within the context class, but the context class should be the ground truth and record of the value, so this seems like the right behavior anyway.
    With those changes, I did some cleanup on the attribute values that were being tracked for tuple comparison.  Now status codes are tracked as metric AND span attributes.  That should make it MUCH easier to search metrics for specific patterns that popped up in metrics.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 4, 2024
    deb19a4
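    A sketch of what filling attributes at close time means in practice. `SketchRecordedSpan` and `SketchResponseContext` are hypothetical stand-ins (not OpenTelemetry types), and the `statusCode` key is illustrative. Values that only become known while the context is live, such as a status code, can still reach the span because the attributes are read when the context closes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a tracing span that just records what it was given.
class SketchRecordedSpan {
    final Map<String, Object> attributes = new LinkedHashMap<>();
    boolean ended;
}

// Hypothetical context: the fields on the context are the source of truth.
class SketchResponseContext implements AutoCloseable {
    private final SketchRecordedSpan span = new SketchRecordedSpan();
    private Integer statusCode; // unknown when the span is opened

    void setStatusCode(int statusCode) {
        this.statusCode = statusCode;
    }

    SketchRecordedSpan span() {
        return span;
    }

    @Override
    public void close() {
        // Attributes are copied from the context only now, just before the span ends.
        if (statusCode != null) {
            span.attributes.put("statusCode", statusCode);
        }
        span.ended = true;
    }
}
```

    Had the attributes been copied when the span was created, the status code in this sketch would have been missing from the span.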

Commits on Feb 6, 2024

  1. Merge branch 'main' into ReplayerInstrumentation

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    
    # Conflicts:
    #	TrafficCapture/captureOffloader/src/test/java/org/opensearch/migrations/trafficcapture/StreamChannelConnectionCaptureSerializerTest.java
    #	TrafficCapture/dockerSolution/src/main/docker/migrationConsole/runTestBenchmarks.sh
    #	deployment/cdk/opensearch-service-migration/lib/service-stacks/migration-analytics-stack.ts
    gregschohn committed Feb 6, 2024
    a5f947e
  2. README documentation for the Instrumentation + some cleanup.

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 6, 2024
    9a31210
  3. When the first bucket size is <=0 for the CommonScopedMetricInstrument constructor override, throw an IllegalArgumentException.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 6, 2024
    805e13b
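    A sketch of the kind of guard the commit above describes. The helper below is hypothetical: it assumes histogram boundaries are produced by doubling from the first bucket size up to a ceiling, which may not match how the real constructor derives its buckets. Under that assumption a non-positive first bucket would never reach the ceiling, so the argument is rejected up front.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the doubling scheme is an assumption of this sketch.
class SketchBuckets {
    static List<Double> doublingBoundaries(double firstBucketSize, double lastBucketCeiling) {
        if (firstBucketSize <= 0) {
            throw new IllegalArgumentException(
                "firstBucketSize must be > 0, but was " + firstBucketSize);
        }
        List<Double> boundaries = new ArrayList<>();
        // Double the boundary until it passes the ceiling.
        for (double b = firstBucketSize; b <= lastBucketCeiling; b *= 2) {
            boundaries.add(b);
        }
        return boundaries;
    }
}
```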

Commits on Feb 7, 2024

  1. Add tracing and metrics for replayer sockets as they're created and closed.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 7, 2024
    9aa3432
  2. Bugfix, test fix, lint fix.

    Bugfix is in the capture proxy's channel context's `sendMeterEventsForEnd()` override to call super so that we'll pick up duration, etc. metrics too.
    The lint fix is in a new python script to facilitate testing from the migration console.
    The test fix is to give each TrafficReplayer run a fresh TestContext.   That context includes channel contexts, which should not be reused across process boundaries and likewise shouldn't be getting reused if we're trying to simulate that for repeated runs.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 7, 2024
    48a45ab
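    The bugfix above follows a common shape, sketched here with hypothetical classes. Only the method name `sendMeterEventsForEnd()` comes from the commit message; the metric names and the list used to record them are illustrative. An override that skips the super call silently drops whatever the base class emits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base context that emits the common end-of-activity metrics.
class SketchBaseContext implements AutoCloseable {
    final List<String> emitted = new ArrayList<>();

    protected void sendMeterEventsForEnd() {
        emitted.add("duration");
    }

    @Override
    public void close() {
        sendMeterEventsForEnd();
    }
}

// Hypothetical channel context that adds its own metric on top of the base ones.
class SketchChannelContext extends SketchBaseContext {
    @Override
    protected void sendMeterEventsForEnd() {
        super.sendMeterEventsForEnd(); // without this call, "duration" is lost
        emitted.add("connectionClosed");
    }
}
```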
  3. Fix an edge case where a socketChannel might not have been created even though the channel context was.
    
    Also make minor fixes to other tests to make them more resilient.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 7, 2024
    16a9010

Commits on Feb 8, 2024

  1. Handle SocketContexts as first class contexts rather than trying to implicitly manage them within a ChannelContext.
    
    I've also made a couple more test and production data structures more threadsafe.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 8, 2024
    4b102f7
  2. Fix an issue with when to close the SocketContext and some memory leaks in test code.
    
    The SocketContext is now closed in the callback for the channel close rather than before we call close().  That should make a test failure due to duplicate close() calls much less likely and should also give us a better idea of when the socket was actually closed.
    There were some OutOfMemoryErrors coming back from the GitHub action after 10 minutes of testing.  I believe that this was due to having a number of InMemoryMetricExporters that periodic metric exporters were pumping to (in perpetuity, even after the test was complete).  We were also potentially tracking lots of backtraces in the ContextTrackers.  Both of these now have close() calls that clear all of that logged data.  Those are now called by TestContext.close(), which is wired for each InstrumentationTest.
    The next commit will tie off a lot more loose ends, but this commit was tested more extensively, hence the reason that I'm keeping them separate.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 8, 2024
    1cb8927
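    A sketch of closing the context from the close callback rather than before requesting the close. It uses `CompletableFuture` as a stand-in for the channel's close future, and every class name here is hypothetical. The context is closed exactly once, at the moment the close completes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical channel: close() requests the close, which completes later.
class SketchChannel {
    private final CompletableFuture<Void> closed = new CompletableFuture<>();

    CompletableFuture<Void> close() {
        return closed;
    }

    // Simulates the I/O layer reporting that the socket is really closed.
    void finishClosing() {
        closed.complete(null);
    }
}

// Hypothetical context that counts how many times it was closed.
class SketchSocketContext implements AutoCloseable {
    final AtomicInteger closeCount = new AtomicInteger();

    @Override
    public void close() {
        closeCount.incrementAndGet();
    }
}

class SketchSocketCloser {
    static CompletableFuture<Void> closeChannel(SketchChannel channel, SketchSocketContext ctx) {
        // The context is closed in the callback, once the channel close has completed.
        return channel.close().whenComplete((ignored, error) -> ctx.close());
    }
}
```

    Tying the context's end to the callback also means its recorded duration reflects when the socket was actually closed rather than when the close was requested.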
  3. Tie off more loose ends for memory leaks during test runs.

    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 8, 2024
    441ca40

Commits on Feb 9, 2024

  1. Test fixes + make scheduled contexts use System.nanoTime instead of Instants, even if an Instant is what's passed in to the constructor
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 9, 2024
    dc76239
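    One way a context could accept an Instant yet track time with the monotonic clock, as a sketch only. `SketchScheduledLag` is a hypothetical class and the real contexts may do the conversion differently. The Instant is translated into a `System.nanoTime()` reading once, in the constructor, and all later arithmetic uses nanoTime alone.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical context that reports how far past its scheduled time it is.
class SketchScheduledLag {
    private final long scheduledNanoTime;

    SketchScheduledLag(Instant scheduledFor) {
        // Convert the wall-clock target into a monotonic-clock reading, one time only.
        long nanosUntilScheduled = Duration.between(Instant.now(), scheduledFor).toNanos();
        this.scheduledNanoTime = System.nanoTime() + nanosUntilScheduled;
    }

    // Positive once the scheduled time has passed, negative before it.
    Duration lag() {
        return Duration.ofNanos(System.nanoTime() - scheduledNanoTime);
    }
}
```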
  2. Set the x-ray exporter attribute index_all_attributes=true so that attributes end up as annotations instead of metadata so that they can be searched.
    
    To search in x-ray, use `annotationId.ATTRIBUTE_NAME="..."`.
    
    Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
    gregschohn committed Feb 9, 2024
    8a875b3