
Use less state-full design for collecting metrics #102

Open · izgzhen opened this issue Dec 31, 2018 · 2 comments

Comments

@izgzhen (Collaborator) commented Dec 31, 2018

Discussion started in #93.

@anhnamtran (Collaborator) commented Dec 31, 2018

@Calvin-L @izgzhen I have several questions:

  1. What do you think the stats structure would look like? Could it be a class with fields for all the data it stores, or just a dictionary created by the improve_implementation method?
  2. Should the stats structure handle threading itself, or should improve_implementation handle it? More specifically, since we are updating it in the ImproveQueryJob for-loop, do the jobs share the same stats structure? Or do we somehow make one for each job and then aggregate them into one in improve_implementation? Or maybe there's a better way than either of those.
  3. How would we go about getting the difference in time between each improve loop iteration (the "forever" while-loop) if we want to preserve its responsibility and not have an out-parameter?
  4. As Zhen mentioned in his comment on #93 (Counter for number of improvements done), not messing with improve itself also creates some difficulties in implementing the improvement-number limit.

@Calvin-L (Collaborator) commented Jan 4, 2019

Ultimately the design is up to you, but:

What do you think the stats structure would look like?

It doesn't really matter, as long as it is documented. I would probably do something lightweight, like "a dictionary whose keys are method names and whose values are lists of the timestamps of when improvements were made".
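A minimal sketch of that shape (the names here are illustrative, not taken from the cozy codebase):

import time

# Hypothetical sketch of the proposed stats structure: a dictionary whose keys
# are method names and whose values are lists of improvement timestamps.
stats = {}

def record_improvement(stats, method_name):
    # Append the current time (seconds since the epoch) for this method.
    stats.setdefault(method_name, []).append(time.time())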

since we are updating it in the ImproveQueryJob for-loop, are they sharing the same stats structure?

That would be an example of shared mutable state, so I'm opposed to it. It would be much better for each job to collect its own local statistics and to aggregate them at the end. If you use the dictionary I proposed, each job would store a list and the aggregation would just arrange them into a dictionary by job name.
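As a sketch of that aggregation step (again with hypothetical names), each job would hand back its own list and the caller would only combine them once all jobs are done:

def aggregate_stats(per_job_results):
    # per_job_results: iterable of (job_name, timestamps) pairs, where each
    # timestamp list was collected locally by one job with no shared state.
    return {name: timestamps for name, timestamps in per_job_results}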

How would we go about getting the difference in time between each improve loop iteration

This can be done by the loop in ImproveQueryJob.

How comfortable are you with generators (sometimes called "coroutines") in Python?

def generate():
  i = 0
  while i < 100:
    print("yielding...")
    yield i        # control returns to the caller here until it asks for more
    i += 1

for value in generate():
  print("got {}".format(value))

In the example, the generator generate() and the loop interleave: instead of printing 100 "yielding..." lines followed by 100 "got X" lines, the program alternates between "yielding..." and "got X".

The improve loop is a generator, so the time difference between yields can easily be calculated by the caller. Instead of printing "got X", compute the current time and subtract the previous time from it.
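A rough sketch of what that caller-side timing could look like (the wrapper name is made up; it would wrap whatever generator the job iterates over):

import time

def time_iterations(generator):
    # Measure the wall-clock time between successive yields of a generator
    # and pass its values through unchanged.
    previous = time.time()
    for item in generator:
        now = time.time()
        print("iteration took {:.3f}s".format(now - previous))
        previous = now
        yield item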

You may want to alter improve so that it yields once on every loop iteration, even if an improvement was not found. (In the other thread I wrote up a more detailed description of this idea.)
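Schematically, that change would look something like the following (search_for_improvement is a placeholder, not the real search step in improve):

def improve_sketch():
    # Sketch only: yield on every pass of the "forever" loop, producing None
    # when no improvement was found, so the caller can still time each pass.
    while True:
        improvement = search_for_improvement()  # placeholder for the real step
        yield improvement                        # may be None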

not messing with improve itself also creates some difficulties in implementing improvement number limit

Actually it doesn't! The reason you were messing with improve was to make it check the improvement number and stop when it reached a threshold. However, improve already comes with a super-general stopping mechanism: the stop_callback parameter. All you should need to do is modify the stop callback passed by the ImproveQueryJob.
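For example, a per-job limit might be wrapped around the existing callback roughly like this (assuming for the sketch that the callback takes no arguments and returns a bool; check the real signature in cozy before copying this):

def with_improvement_limit(stop_callback, limit):
    # Wrap an existing stop callback so that it also fires once `limit`
    # improvements have been recorded for this job.
    count = [0]  # simple mutable counter local to this job

    def record_improvement():
        count[0] += 1

    def should_stop():
        return count[0] >= limit or stop_callback()

    return should_stop, record_improvement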

If you intend the limit to be global rather than per-job, then instead you should modify the top-level loop that collects results to call stop_jobs(...) and return when a threshold is reached.
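Sketched out, with collect_results and jobs standing in for whatever the real top-level loop iterates over, that global check would sit roughly here:

def collect_until_limit(jobs, limit):
    # Hypothetical outline of the top-level results loop; only the placement
    # of the threshold check and the stop_jobs(...) call is the point.
    total_improvements = 0
    for improvement in collect_results(jobs):   # placeholder for the real loop
        total_improvements += 1
        if total_improvements >= limit:
            stop_jobs(jobs)                     # stop every outstanding job
            return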
