Conntrack latency measurement recipe #376

Closed
wants to merge 6 commits

Conversation

enhaut
Member

@enhaut enhaut commented Sep 2, 2024

Description

Still WIP, opening just to get some feedback.

This MR adds support for a "latency on conntrack cache miss" recipe, which is supposed to measure latency during conntrack cache misses. That is done by measuring the latency of a data transfer over a newly opened TCP connection (uncached) and comparing it to samples from a cached connection. The difference between the cached and uncached transfer is, most of the time, an order of magnitude.
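To illustrate the idea (this is not code from the MR): a minimal sketch that measures the round-trip latency of the first transfer over a freshly opened TCP connection and compares it to later transfers over the same, already-tracked connection. `HOST`, `PORT`, and the echo service are hypothetical placeholders.

```
# Minimal sketch of the idea behind the recipe, not the actual LNST code:
# the first data transfer over a brand new TCP connection is "uncached" from
# conntrack's point of view, later transfers over the same connection should
# hit the cached entry. HOST/PORT point to a hypothetical echo service on the DUT.
import socket
import time

HOST, PORT = "192.0.2.1", 5001  # hypothetical DUT echo service

def round_trip(sock: socket.socket) -> float:
    """Send one byte, wait for the echo, return the elapsed time in seconds."""
    start = time.perf_counter()
    sock.sendall(b"x")
    sock.recv(1)
    return time.perf_counter() - start

with socket.create_connection((HOST, PORT)) as s:
    uncached = round_trip(s)                      # first transfer on a new connection
    cached = [round_trip(s) for _ in range(100)]  # samples over the cached entry

print(f"uncached: {uncached:.6f}s, cached avg: {sum(cached)/len(cached):.6f}s")
```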

Tests

Results.merge_with() test script
from lnst.RecipeCommon.Perf.Results import SequentialPerfResult, ParallelPerfResult, PerfInterval

# no recursion branch test
multi_parallel = ParallelPerfResult()
sequential0 = SequentialPerfResult()
sequential1 = SequentialPerfResult()

for j in range(1):
   sequential0.append(PerfInterval(j, 0.1, "s", (j)*0.1+100))
   sequential1.append(PerfInterval(j+100, 0.1, "s", (j)*0.1+100))
multi_parallel.append(sequential0)
multi_parallel.append(sequential1)


multi_parallel2 = ParallelPerfResult()
sequentiall0 = SequentialPerfResult()
sequentiall1 = SequentialPerfResult()
for j in range(1):
   sequentiall0.append(PerfInterval(j, 0.1, "s", (j)*0.1+100))
   sequentiall1.append(PerfInterval(j+100, 0.1, "s", (j)*0.1+100))
multi_parallel2.append(sequentiall0)
multi_parallel2.append(sequentiall1)

r = multi_parallel.merge_with(multi_parallel2)

assert type(r) == type(multi_parallel)
assert type(r[0]) == type(multi_parallel[0])
assert sequential0[0] in r[0]
assert sequentiall0[0] in r[0]

assert sequential1[0] in r[1]
assert sequentiall1[0] in r[1]


# recursion branch test
seq_container0 = SequentialPerfResult()
seq_container0.append(sequential0)
seq_container1 = SequentialPerfResult()
seq_container1.append(sequential1)

seq_container2 = SequentialPerfResult()
seq_container2.append(sequentiall0)
seq_container3 = SequentialPerfResult()
seq_container3.append(sequentiall1)

multi_parallel = ParallelPerfResult()
multi_parallel.append(seq_container0)
multi_parallel.append(seq_container1)

multi_parallel2 = ParallelPerfResult()
multi_parallel2.append(seq_container2)
multi_parallel2.append(seq_container3)

r = multi_parallel.merge_with(multi_parallel2)
print(r)

assert type(r) == type(multi_parallel)
assert type(r[0]) == type(multi_parallel[0])
assert type(r[0][0]) == type(multi_parallel[0][0])

assert sequential0[0] in r[0][0]
assert sequentiall0[0] in r[0][0]

assert sequential1[0] in r[1][0]
assert sequentiall1[0] in r[1][0]

Just a split of methods into multiple smaller ones, so they can be
reused by other parts of LNST as well.
The method slices raw samples by index.
The method `PerfList.merge_with()` merges two `PerfList` objects with
the same structure.
E.g. the following results container:

```
ParallelPerfResult(
  SequentialPerfResult(
    PerfInterval(value=1)
  ),
  SequentialPerfResult(
    PerfInterval(value=2)
  )
)
```

merged with itself will result in:

```
ParallelPerfResult(
  SequentialPerfResult(
    PerfInterval(value=1),
    PerfInterval(value=1)
  ),
  SequentialPerfResult(
    PerfInterval(value=2),
    PerfInterval(value=2)
  )
)
```

It simply merges all the `PerfList` layers, all the way down to the `PerfInterval`s.
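Roughly, the merge behaves like the following sketch, written with plain Python lists instead of the `PerfList` subclasses (the real `merge_with()` preserves the concrete container types):

```
# Simplified illustration of the merge behavior described above, using plain
# lists; the real merge_with() keeps SequentialPerfResult/ParallelPerfResult types.
def merge_lists(a, b):
    if a and isinstance(a[0], list):
        # both structures still contain nested containers -> recurse pairwise
        return [merge_lists(x, y) for x, y in zip(a, b)]
    # innermost layer reached -> concatenate the samples
    return a + b

assert merge_lists([[1], [2]], [[1], [2]]) == [[1, 1], [2, 2]]
```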
Currently the only way to save scalar samples/results is
to use PerfInterval and {Sequential,Parallel}PerfResult.
However, these are meant to store vector/multidimensional
data (value and duration), and all the calculations
they do expect vector data.
Added support for measuring latency (in the background) during
a (flow) test.

The latency is measured over a single long-lived TCP connection
that gathers `latency_packets_count - 1` samples after the measurement
starts. It then runs the `cache_poison_tool` function, which is supposed
to somehow poison the cache, and then the last sample is gathered.

`LatencyMeasurementResults` then distinguishes between uncached
and cached latency, which refer to the first and last samples
and the middle samples, respectively.
The problem this is trying to solve is separating samples
that were gathered while the DUT was able to cache everything it needed
from samples where the DUT hit lots of cache misses during connection
handling.
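In pseudocode, the flow described above could look roughly like this; `collect_sample()` and `cache_poison_tool()` stand in for the real tools and are only illustrative:

```
# Rough sketch of the measurement flow described in the commit message;
# collect_sample() and cache_poison_tool() are placeholders, not LNST APIs.
def measure(latency_packets_count, cache_poison_tool, collect_sample):
    samples = []
    # gather latency_packets_count - 1 samples over the long-lived connection
    for _ in range(latency_packets_count - 1):
        samples.append(collect_sample())
    # poison the cache (e.g. flush the conntrack table on the DUT) ...
    cache_poison_tool()
    # ... and gather the last sample, which should hit the cache-miss path
    samples.append(collect_sample())
    # first and last samples are "uncached", the middle ones are "cached"
    uncached = [samples[0], samples[-1]]
    cached = samples[1:-1]
    return uncached, cached
```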

sq latency measurement results
This is supposed to test the latency of conntrack during a cache miss

sq ct latency on cache miss recipe
@@ -171,6 +171,25 @@ def time_slice(self, start, end):
        )
        return result

    def samples_slice(self, slicer: callable):
Collaborator

I still think this probably shouldn't be part of the generic Perf.Results.PerfList class; instead you probably just want this functionality as a helper in the tests that use the specific type of slicing that you have here.

E.g. with some setups you could have:

SequentialPerfResult
    SequentialPerfResult: [PerfInterval, ...]
    SequentialPerfResult: [PerfInterval, ...]

which would mean that you ran 2 repetitions of a stream test, for example... with the slice you would cut each repetition to a shorter one,

but

ParallelPerfResult
    SequentialPerfResult: [PerfInterval, ...]
    SequentialPerfResult: [PerfInterval, ...]

has a completely different meaning, since now the streams are parallel.

Basically I'm not sure whether having this method work recursively wouldn't lead to confusing situations depending on how you organize the recursive hierarchy of PerfList type objects... so it may be more relevant to instead have a "specific" helper function in a place which is informed about the hierarchy it is working with, and where we can write specific enough documentation that the use case is understood.
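For example (an illustrative sketch, not code from this MR), such a helper written against the two-level hierarchy described above could look like:

```
# Hypothetical test helper: slice each stream of a known two-level hierarchy
# (a ParallelPerfResult of SequentialPerfResults of PerfIntervals) by index.
from lnst.RecipeCommon.Perf.Results import SequentialPerfResult, ParallelPerfResult

def slice_stream_samples(parallel_result, slicer):
    """Apply `slicer` (e.g. lambda samples: samples[1:-1]) to each stream."""
    sliced = ParallelPerfResult()
    for stream in parallel_result:
        new_stream = SequentialPerfResult()
        for sample in slicer(list(stream)):
            new_stream.append(sample)
        sliced.append(new_stream)
    return sliced
```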

@@ -155,6 +155,26 @@ def __setitem__(self, i, item):

        super(PerfList, self).__setitem__(i, item)

    def merge_with(self, iterable):
Collaborator

And this, again, I would have as a separate helper method somewhere else.

Maybe at some point, when these methods are proven to be generic, we could collect them into some "common" module that acts on PerfList objects, but I don't think this should be included in the PerfList class itself.

class ParallelScalarResult(ParallelPerfResult):
    @property
    def average(self):
        samples_count = sum([len(i) for i in self])
Collaborator

This won't work if:

ParallelScalarResult([ScalarSample, ...])

as ScalarSample doesn't support `len()`.
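One way to address this (a sketch of the reviewer's point; `_count_samples` is a hypothetical helper, not part of the MR) would be to count nested containers by their length and bare samples as 1; the `average` property could then use `samples_count = _count_samples(self)` instead of the `len()`-only sum.

```
# Hypothetical helper: nested PerfList containers contribute their length,
# bare ScalarSample items (which have no len()) count as a single sample.
from lnst.RecipeCommon.Perf.Results import PerfList

def _count_samples(items):
    return sum(len(i) if isinstance(i, PerfList) else 1 for i in items)
```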

@@ -40,6 +40,43 @@ def end_timestamp(self):
    def time_slice(self, start, end):
        raise NotImplementedError()

class ScalarSample(PerfResult):
Collaborator

Not sure about this implementation considering it's still using duration and has time slicing... at the very least this should be refactored so that the common PerfInterval and ScalarSample property getters are in some common class...

from lnst.Controller.RecipeResults import ResultType


class LatencyMeasurement(BaseFlowMeasurement):
Collaborator

We discussed at the tech meeting that this could maybe work as a standard measurement which simply measures latency at a regular interval and is combined with an additional measurement that is "more primary" in the overall PerfRecipeConfiguration, which has a "10 second" quiet period and then executes the poisoning. That way you will get the following hierarchy:

1. cpu measurement: [....]
2. cpu measurement: [....]
3. latency measurement:   [10, 1, 1, 1, 1, 1, 100, 100, ...]
4. poisoning measurement: [0 , 0, 0, 0, 0, 100, 100, ...]

and afterwards you can postprocess these results to get:

1. cpu measurement: [....]
2. cpu measurement: [....]
3. latency measurement - start:    [10]
4. latency measurement - middle:   [1, 1, 1, 1, 1]
5. latency measurement - poisoned: [100, 100, ...]

and evaluate each of these individually.
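A sketch of that postprocessing step (assuming the latency and poisoning series are sampled at the same interval so they align index by index, which is an assumption, not something defined in this MR):

```
# Split the raw latency series into start / middle / poisoned parts based on
# when the parallel "poisoning" measurement first reports activity.
def split_latency(latency, poisoning):
    # first index at which the poisoning measurement reports non-zero activity
    poison_start = next(
        (i for i, p in enumerate(poisoning) if p != 0), len(latency)
    )
    start = latency[:1]               # first sample: connection setup
    middle = latency[1:poison_start]  # steady state, cache hits
    poisoned = latency[poison_start:] # after the cache was poisoned
    return start, middle, poisoned

# e.g. split_latency([10, 1, 1, 1, 1, 1, 100, 100],
#                    [ 0, 0, 0, 0, 0, 0, 100, 100])
# -> ([10], [1, 1, 1, 1, 1], [100, 100])
```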

@enhaut
Member Author

enhaut commented Oct 18, 2024

Closing this, we internally agreed we actually don't have a customer use case for this test.

@enhaut enhaut closed this Oct 18, 2024