Skip to content

Commit

Permalink
Merge branch 'master' into dependabot/go_modules/github.com/influxdat…
Browse files Browse the repository at this point in the history
…a/influxdb-client-go-1.4.0
  • Loading branch information
katyanna authored Jan 9, 2024
2 parents 85e5c1e + 905cf8a commit f92a08a
Show file tree
Hide file tree
Showing 10 changed files with 560 additions and 52 deletions.
87 changes: 87 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -702,6 +702,93 @@ you need to define a `key` or other `tag` with a "star" query syntax like
metric label definitions. If both annotations and corresponding label is
defined, then the annotation takes precedence.


## Nakadi collector

The Nakadi collector allows scaling based on [Nakadi](https://nakadi.io/)
Subscription API stats metrics `consumer_lag_seconds` or `unconsumed_events`.

### Supported metrics

| Metric Type | Description | Type | K8s Versions |
|------------------------|-----------------------------------------------------------------------------|----------|--------------|
| `unconsumed-events` | Scale based on number of unconsumed events for a Nakadi subscription | External | `>=1.24` |
| `consumer-lag-seconds` | Scale based on number of max consumer lag seconds for a Nakadi subscription | External | `>=1.24` |

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: myapp-hpa
annotations:
# metric-config.<metricType>.<metricName>.<collectorType>/<configKey>
metric-config.external.my-nakadi-consumer.nakadi/interval: "60s" # optional
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: custom-metrics-consumer
minReplicas: 0
maxReplicas: 8 # should match number of partitions for the event type
metrics:
- type: External
external:
metric:
name: my-nakadi-consumer
selector:
matchLabels:
type: nakadi
subscription-id: "708095f6-cece-4d02-840e-ee488d710b29"
metric-type: "consumer-lag-seconds|unconsumed-events"
target:
# value is compatible with the consumer-lag-seconds metric type.
# It describes the amount of consumer lag in seconds before scaling
# additionally up.
# if an event-type has multiple partitions the value of
# consumer-lag-seconds is the max of all the partitions.
value: "600" # 10m
type: Value
# averageValue is compatible with unconsumed-events metric type.
# This means for every 30 unconsumed events a pod is added.
# unconsumed-events is the sum of of unconsumed_events over all
# partitions.
averageValue: "30"
type: AverageValue
```

The `subscription-id` is the Subscription ID of the relevant consumer. The
`metric-type` indicates whether to scale on `consumer-lag-seconds` or
`unconsumed-events` as outlined below.

`unconsumed-events` - is the total number of unconsumed events over all
partitions. When using this `metric-type` you should also use the target
`averageValue` which indicates the number of events which can be handled per
pod. To best estimate the number of events per pods, you need to understand the
average time for processing an event as well as the rate of events.

*Example*: You have an event type producing 100 events per second between 00:00
and 08:00. Between 08:01 to 23:59 it produces 400 events per second.
Let's assume that on average a single pod can consume 100 events per second,
then we can define 100 as `averageValue` and the HPA would scale to 1 between
00:00 and 08:00, and scale to 4 between 08:01 and 23:59. If there for some
reason is a short spike of 800 events per second, then it would scale to 8 pods
to process those events until the rate goes down again.

`consumer-lag-seconds` - describes the age of the oldest unconsumed event for
a subscription. If the event type has multiple partitions the lag is defined as
the max age over all partitions. When using this `metric-type` you should use
the target `value` to indicate the max lag (in seconds) before the HPA should
scale.

*Example*: You have a subscription with a defined SLO of "99.99 of events are
consumed within 30 min.". In this case you can define a target `value` of e.g.
20 min. (1200s) (to include a safety buffer) such that the HPA only scales up
from 1 to 2 if the target of 20 min. is breached and it needs to work faster
with more consumers.
For this case you should also account for the average time for processing an
event when defining the target.


## HTTP Collector

The http collector allows collecting metrics from an external endpoint specified in the HPA.
Expand Down
2 changes: 1 addition & 1 deletion docs/deployment.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ spec:
serviceAccountName: custom-metrics-apiserver
containers:
- name: kube-metrics-adapter
image: registry.opensource.zalan.do/teapot/kube-metrics-adapter:latest
image: ghcr.io/zalando-incubator/kube-metrics-adapter:latest
args:
# - --v=9
- --prometheus-server=http://prometheus.kube-system.svc.cluster.local
Expand Down
2 changes: 1 addition & 1 deletion example/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
FROM registry.opensource.zalan.do/library/alpine-3.13:latest
FROM registry.opensource.zalan.do/library/alpine-3:latest
LABEL maintainer="Team Teapot @ Zalando SE <team-teapot@zalando.de>"

# add binary
Expand Down
2 changes: 1 addition & 1 deletion example/deploy/hpa.yaml
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
apiVersion: autoscaling/v2beta2
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: custom-metrics-consumer
Expand Down
28 changes: 14 additions & 14 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -3,17 +3,17 @@ module github.com/zalando-incubator/kube-metrics-adapter
require (
github.com/aws/aws-sdk-go v1.45.27
github.com/influxdata/influxdb-client-go v1.4.0
github.com/prometheus/client_golang v1.17.0
github.com/prometheus/client_golang v1.18.0
github.com/prometheus/common v0.45.0
github.com/sirupsen/logrus v1.9.3
github.com/spf13/cobra v1.7.0
github.com/spf13/cobra v1.8.0
github.com/spyzhov/ajson v0.9.0
github.com/stretchr/testify v1.8.4
github.com/szuecs/routegroup-client v0.21.1
github.com/zalando-incubator/cluster-lifecycle-manager v0.0.0-20230601114834-6ed1bba3c85d
golang.org/x/net v0.17.0
golang.org/x/oauth2 v0.13.0
golang.org/x/sync v0.4.0
golang.org/x/net v0.19.0
golang.org/x/oauth2 v0.15.0
golang.org/x/sync v0.5.0
k8s.io/api v0.24.17
k8s.io/apimachinery v0.24.17
k8s.io/apiserver v0.24.17
Expand Down Expand Up @@ -73,8 +73,8 @@ require (
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/prometheus/client_model v0.4.1-0.20230718164431-9a2bf3000d16 // indirect
github.com/prometheus/procfs v0.11.1 // indirect
github.com/prometheus/client_model v0.5.0 // indirect
github.com/prometheus/procfs v0.12.0 // indirect
github.com/rogpeppe/go-internal v1.11.0 // indirect
github.com/spf13/pflag v1.0.5 // indirect
github.com/valyala/bytebufferpool v1.0.0 // indirect
Expand All @@ -97,17 +97,17 @@ require (
go.uber.org/atomic v1.9.0 // indirect
go.uber.org/multierr v1.8.0 // indirect
go.uber.org/zap v1.21.0 // indirect
golang.org/x/crypto v0.14.0 // indirect
golang.org/x/crypto v0.17.0 // indirect
golang.org/x/mod v0.12.0 // indirect
golang.org/x/sys v0.13.0 // indirect
golang.org/x/term v0.13.0 // indirect
golang.org/x/text v0.13.0 // indirect
golang.org/x/sys v0.15.0 // indirect
golang.org/x/term v0.15.0 // indirect
golang.org/x/text v0.14.0 // indirect
golang.org/x/time v0.3.0 // indirect
golang.org/x/tools v0.11.0 // indirect
google.golang.org/appengine v1.6.7 // indirect
google.golang.org/appengine v1.6.8 // indirect
google.golang.org/genproto v0.0.0-20230410155749-daa745c078e1 // indirect
google.golang.org/grpc v1.55.0 // indirect
google.golang.org/protobuf v1.31.0 // indirect
google.golang.org/grpc v1.56.3 // indirect
google.golang.org/protobuf v1.32.0 // indirect
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/natefinch/lumberjack.v2 v2.0.0 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
Expand Down
Loading

0 comments on commit f92a08a

Please sign in to comment.