Conntrack latency measurement recipe #376
Conversation
Just split some methods into multiple smaller ones, so they can be reused by other parts of LNST as well.
The new method slices raw samples by index.
The method `PerfList.merge_with()` merges two `PerfList` objects with the same structure. E.g. the following results container:

```
ParallelPerfResult(
    SequentialPerfResult(
        PerfInterval(value=1)
    ),
    SequentialPerfResult(
        PerfInterval(value=2)
    )
)
```

merged with itself will result in:

```
ParallelPerfResult(
    SequentialPerfResult(
        PerfInterval(value=1),
        PerfInterval(value=1)
    ),
    SequentialPerfResult(
        PerfInterval(value=2),
        PerfInterval(value=2)
    )
)
```

It simply merges all the `PerfList` layers, all the way down to `PerfInterval`.
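As a rough illustration of the merge semantics, here is a standalone sketch using plain nested lists in place of the `PerfList` classes (`merge_lists` is a hypothetical helper written for this example, not the actual LNST code):

```python
# Hypothetical sketch of a recursive PerfList-style merge,
# using plain nested lists instead of the LNST result classes.

def merge_lists(a, b):
    """Merge two result containers with the same structure.

    List layers are recursed into; leaf samples from `b` are placed
    next to their counterparts from `a`.
    """
    if not isinstance(a, list) or not isinstance(b, list):
        raise TypeError("both containers must be list-like")
    if len(a) != len(b):
        raise ValueError("containers must have the same structure")

    merged = []
    for left, right in zip(a, b):
        if isinstance(left, list):      # another PerfList-like layer
            merged.append(merge_lists(left, right))
        else:                           # leaf samples: put side by side
            merged.append(left)
            merged.append(right)
    return merged

# Mirrors the example above: each inner list gains its duplicate sample.
result = merge_lists([[1], [2]], [[1], [2]])
```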
Force-pushed from b354444 to 216f876.
The only way to save scalar samples/results is to use `PerfInterval` and `{Sequential,Parallel}PerfResult`. However, these are meant to store vector/multidimensional data (value and duration), and all the calculations they do expect vector data.
Added support for measuring latency (in the background) during a (flow) test. The latency is measured over a single long-lived TCP connection that gathers `latency_packets_count - 1` samples after the measurement starts. It then runs the `cache_poison_tool` function, which is supposed to somehow poison the cache, after which the last sample is gathered. `LatencyMeasurementResults` then distinguishes between uncached and cached latency, which refer to the first and last samples, and the middle samples, respectively. The problem this is trying to solve is to separate samples that were gathered while the DUT was able to cache everything needed from samples where the DUT hit lots of cache misses during connection handling.
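The sample classification described above could be sketched roughly like this (a hypothetical `split_latency_samples` helper written for illustration; the real `LatencyMeasurementResults` class is not shown here):

```python
# Hypothetical sketch of the uncached/cached split described above,
# not the actual LatencyMeasurementResults implementation.

def split_latency_samples(samples):
    """Split latency samples into uncached and cached groups.

    The first sample (freshly opened connection) and the last sample
    (taken right after cache_poison_tool ran) are "uncached"; the
    middle samples, gathered while the DUT's caches were warm, are
    "cached".
    """
    if len(samples) < 3:
        raise ValueError("need at least first, middle and last samples")
    uncached = [samples[0], samples[-1]]
    cached = samples[1:-1]
    return uncached, cached

uncached, cached = split_latency_samples([10.0, 1.0, 1.1, 0.9, 95.0])
```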
This is supposed to test the latency of conntrack during a cache miss.
Force-pushed from 216f876 to 94cb414.
```diff
@@ -171,6 +171,25 @@ def time_slice(self, start, end):
         )
         return result

+    def samples_slice(self, slicer: callable):
```
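A minimal sketch of what an index-based `samples_slice` with a `slicer` callable might look like (the `SampleList` class here is a stand-in assumption for illustration, not the actual `PerfList`):

```python
# Hypothetical stand-in for PerfList; illustrates the slicer-callable idea.

class SampleList(list):
    def samples_slice(self, slicer):
        """Return a new SampleList with the samples selected by `slicer`.

        `slicer` receives the raw sample sequence and returns the
        sub-sequence to keep, e.g. an index-based slice.
        """
        return SampleList(slicer(self))

samples = SampleList([10, 1, 1, 1, 95])
middle = samples.samples_slice(lambda s: s[1:-1])  # drop first and last
```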
I still think this probably shouldn't be part of the generic `Perf.Results.PerfList` class; instead you probably just want this functionality as a helper in the tests that use the specific type of slicing that you have here.
e.g. with some setups you could have:

```
PerfSequentialResult
    PerfSequentialResult: [PerfInterval, ...]
    PerfSequentialResult: [PerfInterval, ...]
```

which would mean that you ran 2 repetitions of a stream test, for example... with the slice you would cut each repetition to a shorter one,
but

```
PerfParallelResult
    PerfSequentialResult: [PerfInterval, ...]
    PerfSequentialResult: [PerfInterval, ...]
```

has a completely different meaning, since now the streams are parallel.
Basically, I'm not sure whether having this method work recursively wouldn't lead to confusing situations depending on how you organize the recursive hierarchy of `PerfList`-type objects. So it may be more relevant to instead have a "specific" helper function in a place that is informed about the hierarchy it is working with, and where we can write specific enough documentation that the use case is understood.
```diff
@@ -155,6 +155,26 @@ def __setitem__(self, i, item):

         super(PerfList, self).__setitem__(i, item)

+    def merge_with(self, iterable):
```
And this again I would have as a separate helper method somewhere else. Maybe at some point, when these methods are proven to be generic, we could collect them into some "common" module that acts on `PerfList` objects, but I don't think this should be included in the `PerfList` class itself.
```python
class ParallelScalarResult(ParallelPerfResult):
    @property
    def average(self):
        samples_count = sum([len(i) for i in self])
```
this won't work if:

```
ParallelScalarResult([ScalarSample, ...])
```

as `ScalarSample` doesn't support `len`
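One way around the `len` issue would be to count leaf samples recursively instead (a sketch with plain lists and scalars standing in for the result classes; `count_samples`, `flat_sum`, and `average` are hypothetical helpers, not LNST code):

```python
# Hypothetical sketch: average over a nested result structure whose
# leaves may be bare scalars (which don't support len()).

def count_samples(results):
    if isinstance(results, (list, tuple)):
        return sum(count_samples(item) for item in results)
    return 1  # a bare scalar sample counts as one

def flat_sum(results):
    if isinstance(results, (list, tuple)):
        return sum(flat_sum(item) for item in results)
    return results

def average(results):
    return flat_sum(results) / count_samples(results)

# Works whether the children are nested lists or bare scalars.
avg = average([[1, 2], 3])
```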
```diff
@@ -40,6 +40,43 @@ def end_timestamp(self):
     def time_slice(self, start, end):
         raise NotImplementedError()

+class ScalarSample(PerfResult):
```
Not sure about this implementation, considering it's still using duration and has time slicing... at least this should be refactored so that the common `PerfInterval` and `ScalarSample` property getters are in some common class...
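The refactor the comment suggests might look roughly like this (class names, fields, and getters are assumptions made for illustration, not the actual LNST classes):

```python
# Hypothetical refactor sketch: pull the getters shared by
# PerfInterval-like and ScalarSample-like classes into a common base.

class BaseSample:
    def __init__(self, value, timestamp):
        self._value = value
        self._timestamp = timestamp

    @property
    def value(self):
        return self._value

    @property
    def timestamp(self):
        return self._timestamp


class Interval(BaseSample):
    """Vector-style sample: a value accumulated over a duration."""

    def __init__(self, value, duration, timestamp):
        super().__init__(value, timestamp)
        self._duration = duration

    @property
    def duration(self):
        return self._duration

    @property
    def average(self):
        return self._value / self._duration


class Scalar(BaseSample):
    """Dimensionless sample: a single value at a point in time."""
```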
```python
from lnst.Controller.RecipeResults import ResultType


class LatencyMeasurement(BaseFlowMeasurement):
```
We discussed on the tech meeting that this could maybe work as a standard measurement which simply measures latency at a regular interval, combined with an additional measurement which is "more primary" in the overall `PerfRecipeConfiguration`, which has a "10 second" quiet period and then executes the poisoning. That way you will get the following hierarchy:
1. cpu measurement: [....]
2. cpu measurement: [....]
3. latency measurement: [10, 1, 1, 1, 1, 1, 100, 100, ...]
4. poisoning measurement: [0 , 0, 0, 0, 0, 100, 100, ...]
and afterwards you can postprocess these results to get:
1. cpu measurement: [....]
2. cpu measurement: [....]
3. latency measurement - start: [10]
4. latency measurement - middle: [1, 1, 1, 1, 1]
5. latency measurement - poisoned: [100, 100, ...]
and evaluate these each individually
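The postprocessing step sketched in the comment could look something like this (using the parallel poisoning measurement as a mask; `split_phases` is a hypothetical helper written for this example):

```python
# Hypothetical postprocessing sketch: split the latency samples into
# start/middle/poisoned phases, using the parallel poisoning
# measurement (nonzero = poisoning active) to find the cut point.

def split_phases(latency, poisoning):
    if len(latency) != len(poisoning):
        raise ValueError("measurements must cover the same intervals")
    # first interval in which the poisoning tool reported activity
    poison_start = next(
        (i for i, p in enumerate(poisoning) if p != 0), len(latency)
    )
    start = latency[:1]
    middle = latency[1:poison_start]
    poisoned = latency[poison_start:]
    return start, middle, poisoned

phases = split_phases(
    [10, 1, 1, 1, 1, 1, 100, 100],
    [0, 0, 0, 0, 0, 0, 100, 100],
)
```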
Closing this; we internally agreed that we actually don't have a customer use case for this test.
Description
still WIP, opening just to get some feedback
This MR adds support for a "latency on cache miss" conntrack recipe and is supposed to measure latency during conntrack cache misses. That's done by measuring the latency of a data transfer over a newly opened TCP connection (uncached) and comparing it to samples from a cached connection. The difference between cached and uncached transfers is, most of the time, an order of magnitude.
Tests
Results.merge_with() test script