Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StartedQueries stat to InternalResourceGroup for tracking query start rate #23984

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

xkrogen
Copy link
Member

@xkrogen xkrogen commented Oct 30, 2024

Description

Add a new StartedQueries stat to InternalResourceGroup to track the rate of queries being started/submitted to a resource group. After #22957 we have nearly all of the useful operational stats that you would want to know about a resource group (like CPU / memory, queued/running queries at any given time, etc.), but one thing that is missing is an understanding of the rate at which queries are being submitted into this resource group.

Additional context and related issues

Currently we have a metric TimeBetweenStartsSec but it doesn't meet the needs described here:

  1. It only tracks time between starts, not the count of queries
  2. It doesn't count the first start after newly becoming eligible (refer to this discussion), limiting its utility for tracking the query submission/start rate.
  3. It currently appears to be broken in two ways; first, it only updates for non-leaf groups (refer to this Slack discussion); second, it is a CounterStat, but is trying to measure timing information, so it really should be a TimeStat. Currently it is putting timing information (count of seconds) into a CounterStat which doesn't really produce meaningful data.

Point (3) is fixable and I have a WIP to address it (xkrogen@c2d474d), but after working on it I have become convinced that even if fixed, it is confusing and not useful. I am more inclined to just remove it and entirely replace it with this new metric. Will be happy to do that as a follow-on to this PR if others agree, or if there is consensus that TimeBetweenStartsSec indeed adds significant value and it's worth fixing the issues described in (3), I can finish up my WIP changes.

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(X) Release notes are required, with the following suggested text:

# General
* Expose query start count/rate metrics for resource groups. ({issue}`23984`)

@cla-bot cla-bot bot added the cla-signed label Oct 30, 2024
@xkrogen xkrogen force-pushed the xkrogen/resource-group-metrics branch from 2a0d6f5 to 65116b6 Compare October 30, 2024 23:10
@xkrogen
Copy link
Member Author

xkrogen commented Oct 31, 2024

cc @piotrrzysko @hashhar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

1 participant