Calling MetricsLogger.flush() causes RuntimeWarnings #52

Open
Dunedan opened this issue Aug 27, 2020 · 8 comments
Labels
documentation Improvements or additions to documentation

Comments

@Dunedan
Contributor

Dunedan commented Aug 27, 2020

Calling MetricsLogger().flush() as documented in the README causes RuntimeWarnings:

Here is a minimal example:

#!/usr/bin/env python3

import os
os.environ["AWS_LAMBDA_FUNCTION_NAME"] = "dummy-function-name"

from aws_embedded_metrics import metric_scope

@metric_scope
def foo(metrics):
    metrics.put_metric("foo", 1, "Count")
    metrics.flush()

foo()

Output when calling:

./poc.py:11: RuntimeWarning: coroutine 'MetricsLogger.flush' was never awaited
  metrics.flush()
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
{"LogGroup": "dummy-function-name", "ServiceName": "dummy-function-name", "ServiceType": "AWS::Lambda::Function", "executionEnvironment": "", "memorySize": "", "functionVersion": "", "logStreamId": "", "_aws": {"Timestamp": 1598540408195, "CloudWatchMetrics": [{"Dimensions": [["LogGroup", "ServiceName", "ServiceType"]], "Metrics": [{"Name": "foo", "Unit": "Count"}], "Namespace": "aws-embedded-metrics"}]}, "foo": 1}
@jaredcnance
Member

jaredcnance commented Aug 27, 2020

This is related to #21 where we will likely make flush() synchronous. Today, it is an async call and should be awaited. We can update the documentation in the meantime to make this more clear.
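
To illustrate "it is an async call and should be awaited", here is a minimal sketch that runs without the library installed. `StandInLogger` is a hypothetical stand-in, not the real `MetricsLogger`; the only property it shares with the real class is that `flush()` is a coroutine function. The real logger's behaviour may differ in other respects.

```python
import asyncio


class StandInLogger:
    """Hypothetical stand-in for MetricsLogger; only the async flush() matters here."""

    def __init__(self):
        self.pending = []
        self.flushed = []

    def put_metric(self, name, value, unit="None"):
        self.pending.append((name, value, unit))
        return self

    async def flush(self):
        # A real sink would do I/O here; sleep(0) simply yields to the event loop.
        await asyncio.sleep(0)
        self.flushed.extend(self.pending)
        self.pending.clear()


async def handler(logger):
    logger.put_metric("foo", 1, "Count")
    await logger.flush()  # awaited, so the coroutine body actually runs
    return len(logger.flushed)


logger = StandInLogger()
flushed_count = asyncio.run(handler(logger))
print(flushed_count)
```

Inside an `async def` function the fix is just the `await` keyword. Synchronous callers need an event loop to drive the coroutine, which is what the workarounds later in this thread do.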

jaredcnance added the documentation label on Aug 27, 2020
@ryandeivert

what's the status on this issue? this is super noisy in lambda CWL output

@SamStephens

@ryandeivert status is unchanged. The method is still asynchronous, and you need to be awaiting it. In my code, I'm doing this:

import asyncio

from aws_embedded_metrics import metric_scope

@metric_scope
def handler(event, context, metrics):
    try:
        pass  # Actual handler logic goes here
    finally:
        # flush() is a coroutine, so drive it to completion on the event loop
        loop = asyncio.get_event_loop()
        loop.run_until_complete(metrics.flush())

@ryandeivert

ryandeivert commented Nov 13, 2021

yes I understand - but for larger lambda codebases, where the bulk of the logic does not occur within the scope of a single function, this isn't a super ideal way to use this library (as a decorator). effectively, I'd have to wrap any of my functions that log metrics with this, instead of just creating a logger object to be used directly. unless I'm misunderstanding the API, in which case I'd love to hear about alternatives.

edit: can you also clarify why certain properties are injected into the output, with no ability to override them (see here)

I'm currently using the below to work around this:

from aws_embedded_metrics import MetricsLogger as _MetricsLogger
from aws_embedded_metrics.environment.lambda_environment import LambdaEnvironment


class MetricsLogger(_MetricsLogger):
    def __init__(self):
        super().__init__(None, None)
        self.environment = LambdaEnvironment()

    def flush(self) -> None:
        """Override the default async MetricsLogger.flush method, flushing to stdout immediately"""
        sink = self.environment.get_sink()
        sink.accept(self.context)
        self.context = self.context.create_copy_with_context()

    def with_dimensions(self, *dimensions):
        return self.set_dimensions(*dimensions)


def main():

    new_logger = MetricsLogger()
    new_logger.put_metric('metric_name', 10).with_dimensions({'dim01': 'value01', 'dim02': 'value02'})
    new_logger.flush()

@heldersepu

@ryandeivert status is unchanged. The method is still asynchronous, and you need to be awaiting it. In my code, I'm doing this:

import asyncio

from aws_embedded_metrics import metric_scope

@metric_scope
def handler(event, context, metrics):
    try:
        pass  # Actual handler logic goes here
    finally:
        # flush() is a coroutine, so drive it to completion on the event loop
        loop = asyncio.get_event_loop()
        loop.run_until_complete(metrics.flush())

@SamStephens Why would anyone need to do that?
That is already done here:
https://github.com/awslabs/aws-embedded-metrics-python/blob/v3.0.0/aws_embedded_metrics/metric_scope/__init__.py#L48-L50
Unless I'm missing something there is no need to call flush ourselves
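
The pattern being described (a decorator that flushes on the caller's behalf once the wrapped function returns) can be sketched as follows. This is not the library's code: `flushing_scope` and `StandInLogger` are hypothetical names made up for illustration, and the linked source lines remain the reference for what `metric_scope` actually does.

```python
import asyncio
import functools


class StandInLogger:
    """Hypothetical stand-in for MetricsLogger that only counts flushes."""

    def __init__(self):
        self.flush_count = 0

    async def flush(self):
        self.flush_count += 1


def flushing_scope(fn):
    """Hypothetical decorator: inject a logger as `metrics`, flush when fn finishes."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        logger = StandInLogger()
        kwargs["metrics"] = logger
        try:
            return fn(*args, **kwargs)
        finally:
            # Drive the async flush to completion from synchronous code.
            asyncio.run(logger.flush())

    return wrapper


@flushing_scope
def handler(event, metrics):
    return metrics  # note: no explicit flush() call in the body


used_logger = handler({"some": "event"})
print(used_logger.flush_count)
```

With a wrapper like this, an explicit `flush()` inside the handler is only needed when several separate records must be emitted during a single invocation.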

@SamStephens

@ryandeivert thanks for your workaround, it's saved my bacon with Flask, Gunicorn and Gevent where for reasons I don't fully understand I cannot use Flask's async support.

However, I also don't understand why you need your workaround with Lambda. If you're actually calling other functions, surely your main function really looks like

def main():

    new_logger = MetricsLogger()
    do_some_work(new_logger)
    do_something_else(new_logger)
    new_logger.flush()

If so, I don't actually see how this is different to

@metric_scope
def main(metrics):  # metric_scope injects the logger through a parameter named "metrics"
    do_some_work(metrics)
    do_something_else(metrics)

@SamStephens

what's the status on this issue? this is super noisy in lambda CWL output

@ryandeivert it's worth noting this isn't just noise. The warning means that it's possible for the Lambda function to be shut down before the flush actually completes, because you're not awaiting its completion. This means there's a chance of metrics being lost.
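
In plain asyncio terms, calling a coroutine function without awaiting it only creates a coroutine object, and the body does not execute at all. The stdlib-only sketch below shows this with a hypothetical module-level `flush()` that stands in for the logger method. Whether a given un-awaited call in real code loses data depends on what else flushes the logger (for example, the decorator).

```python
import asyncio
import gc
import warnings

calls = []


async def flush():
    # Stand-in for an async flush; records that the body actually ran.
    calls.append("flushed")


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flush()       # creates a coroutine object and discards it; the body never runs
    gc.collect()  # ensure the discarded coroutine is finalized on any interpreter

# Messages of any RuntimeWarning raised for the discarded coroutine.
not_awaited = [str(w.message) for w in caught if issubclass(w.category, RuntimeWarning)]
ran_without_await = list(calls)

asyncio.run(flush())  # driving the coroutine on an event loop runs the body
ran_with_await = list(calls)

print(ran_without_await, ran_with_await, not_awaited)
```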

@lukepafford

lukepafford commented Sep 11, 2024

Yeah this is pretty annoying. I basically want to emit a metric in our lambda that can possibly process multiple failing hostnames:

from aws_embedded_metrics.logger.metrics_logger import MetricsLogger

async def emit_hostname_failure(logger: MetricsLogger, hostname: str) -> None:
    logger.set_namespace("MyNamespace")
    logger.put_metric("HostnameFailure", 1, "Count")

    # Hostname is a high cardinality value. Do NOT use put_dimensions, but instead use set_property
    # where results will be queried through CloudWatch Insights.
    logger.set_property("Hostname", hostname)

    # A property can only be tied to a single metric, so call flush
    # for each device data point.
    await logger.flush() # ERROR! Result of async function call is not used; use "await" or assign result to variable

Unfortunately I now need to either figure out how to make the lambda work async, or override my own class like @ryandeivert did. Either way this is a headache.


Looks like wrapping the call in asyncio.run should work to run the function synchronously:

import asyncio
from aws_embedded_metrics.logger.metrics_logger_factory import create_metrics_logger

def handler(event, context):
    logger = create_metrics_logger()
    ...
    asyncio.run(emit_hostname_failure(logger, "hostname"))
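
For the multiple-hostname case, one possible shape is to keep the per-hostname flushes inside a single coroutine and call `asyncio.run` once per invocation. The sketch below uses a hypothetical `StandInLogger` in place of the real logger so that it runs without the library. The `failed_hostnames` event key and the `emit_hostname_failures` helper are also assumptions made for illustration.

```python
import asyncio


class StandInLogger:
    """Hypothetical stand-in for MetricsLogger: async flush, one record per flush."""

    def __init__(self):
        self.properties = {}
        self.metrics = {}
        self.records = []

    def put_metric(self, name, value, unit="None"):
        self.metrics[name] = value

    def set_property(self, key, value):
        self.properties[key] = value

    async def flush(self):
        # Emit the current state as one record, then start a fresh one.
        self.records.append({**self.properties, **self.metrics})
        self.properties, self.metrics = {}, {}


async def emit_hostname_failures(logger, hostnames):
    for hostname in hostnames:
        logger.put_metric("HostnameFailure", 1, "Count")
        logger.set_property("Hostname", hostname)
        await logger.flush()  # one record per hostname


def handler(event, context):
    logger = StandInLogger()
    # A single asyncio.run() drives every flush to completion before returning.
    asyncio.run(emit_hostname_failures(logger, event["failed_hostnames"]))
    return logger.records


records = handler({"failed_hostnames": ["a.example", "b.example"]}, None)
print(records)
```

Note that `asyncio.run` raises `RuntimeError` when called from a thread that already has a running event loop, so this shape only suits a fully synchronous handler.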

6 participants