Keep runfiles tree IDs in memory for tools and multiple test attempts #23703
@bazel-io fork 7.4.0
@tjgq Taking another look at this, I'm not entirely sure I understand the logic around runfiles mapping caching: Isn't the main case of reuse that of a tool used in multiple actions? It looks like that usage is never cached in |
I have converted this back to draft. I will add some tests and then take a deeper look at what's really happening in the case of a shared tool.
@justinhorvitz Could you perhaps share the bits of internal bug referenced by
in 48c9255? Is there anything special about multiple test attempts that makes weak references work in this situation? |
The comment references some work I did to share a single runfiles mapping among concurrent executions of the same test, as in builds with high The point of the comment is to explain that using a weak reference is sufficient because it guarantees that at most 1 such runfiles mapping is retained (the active test execution will reference it, making it not GC-eligible). It goes on to say:
The background is that with the soft reference, we observed elevated OOM rates. In general, we avoid soft references because they are not guaranteed to be collected prior to exceeding |
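The weak-reference argument above can be illustrated with a minimal sketch. The class and method names here are hypothetical and are not Bazel's actual implementation; the sketch only assumes that the mapping is expensive to compute and that concurrent users hold a strong reference to it while they run.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical holder for an expensive-to-compute mapping; not Bazel's actual class. */
final class WeaklyCachedMapping {
  private final Supplier<Map<String, String>> compute;
  // Weakly referenced: the mapping is collectable as soon as no caller holds it.
  private WeakReference<Map<String, String>> cached = new WeakReference<>(null);
  private int computations = 0;

  WeaklyCachedMapping(Supplier<Map<String, String>> compute) {
    this.compute = compute;
  }

  /** Returns the live mapping if some caller still holds it, otherwise recomputes it. */
  synchronized Map<String, String> get() {
    Map<String, String> mapping = cached.get();
    if (mapping == null) {
      mapping = compute.get();
      computations++;
      cached = new WeakReference<>(mapping);
    }
    return mapping;
  }

  /** Number of times the mapping had to be computed. */
  synchronized int computations() {
    return computations;
  }
}
```

As long as one execution holds the returned map, other callers share it; once nobody does, the garbage collector is free to reclaim it, which is the property that a soft reference does not provide as promptly.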
@justinhorvitz Thanks for the explanation. What do you think of extending the caching to any non-test targets? Since they are used as tools, they may end up being used concurrently and thus affect peak memory usage too.
Do you have evidence of a specific problem that needs optimizing? I think that's a prerequisite for considering such an optimization. Do non-test actions ever share the same |
I think I agree that the current state of this PR is not what we want; sorry for reviewing it hastily.
I'm personally not in a good position to provide benchmarks, but certain actions in OSS Bazel can have very large runfiles mappings (e.g. @jbedard @alexeagle Do you happen to have numbers on the size of @justinhorvitz Could you perhaps run a benchmark with
As far as I understand the situation, they would naturally as |
While writing the compact execution log, make use of the information in `RunfilesTree` on whether the tree is likely to be reused by multiple spawns.
fbdfdff to 6c813a2
@tjgq I updated the PR to get the desired behavior. Since tools are also logged as part of inputs, this requires checking the spawn's mnemonic.
src/main/java/com/google/devtools/build/lib/exec/CompactSpawnLogContext.java
src/test/java/com/google/devtools/build/lib/exec/CompactSpawnLogContextTest.java
@fmeum No, I don't think we have numbers for that. Do you have a specific command or anything to measure what you're looking for?
A simple metric that would already be very helpful is the largest number of files in a |
I added a counter for the number of times we call Build A
Build B
Since it's a simple change, it seems worth pursuing. I will run formal benchmarks to see if the savings show up in e2e metrics.
@justinhorvitz I guess it makes sense that the improvements are somewhat proportional to It might be worth exploring using |
Internally we use
Using |
Benchmarks show a definite reduction of 1-4% in eden space garbage. There was a small CPU reduction, but it was not statistically significant over 5 runs. Peak post-GC heap was not affected, probably because it was dominated by something else for these builds. I think we should proceed with the idea.
Cherry-picks the following changes:
* Optimize representation of runfiles in compact execution log (bazelbuild#23321)
* Keep runfiles tree IDs in memory for multiple test attempts (bazelbuild#23703)
* Fix naming inconsistency in `spawn.proto` (bazelbuild#23706)
* Mark tool runfiles as such in expanded execution log (bazelbuild#23702)
The cherry-picks required introducing a `Map<Artifact, RunfilesTree>` shim to `RunfilesSupplier` that matches the Bazel 8 way of obtaining a `RunfilesTree` from a runfiles middleman via `InputMetadataProvider`.
Closes bazelbuild#23683
Closes bazelbuild#23710
Closes bazelbuild#23711
Closes bazelbuild#23734
@meisterT Could you take over the merge now that tjgq is OOO?
@meisterT this is cl/680944521 internally
This is getting merged right now.
The changes in this PR have been included in Bazel 7.4.0 RC1. Please test out the release candidate and report any issues as soon as possible.
While writing the compact execution log, make use of the information in `RunfilesTree` on whether the tree is likely to be reused by multiple test spawns and always keep it in memory for non-test spawns.
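The retention policy described above might be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `CompactSpawnLogContext`: the `Tree` interface, its `likelyReused()` method, the `RunfilesIdCache` class, and the use of the `TestRunner` mnemonic to recognize test spawns are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the ID retention policy; not Bazel's actual class. */
final class RunfilesIdCache {
  /** Hypothetical stand-in for the reuse hint exposed by a runfiles tree. */
  interface Tree {
    boolean likelyReused();
  }

  // Assumption: test spawns are recognized by this mnemonic.
  private static final String TEST_MNEMONIC = "TestRunner";

  private final Map<Tree, Integer> ids = new HashMap<>();
  private int nextId = 1;

  /** Returns the log entry ID for the tree, writing a new entry if none is remembered. */
  int logTree(Tree tree, String mnemonic) {
    Integer known = ids.get(tree);
    if (known != null) {
      return known;
    }
    int id = nextId++; // stands in for emitting a new entry to the log
    // Non-test spawns: always remember the ID, since a tool may serve many actions.
    // Test spawns: remember it only if the tree hints at reuse (e.g. multiple attempts).
    boolean keep = !TEST_MNEMONIC.equals(mnemonic) || tree.likelyReused();
    if (keep) {
      ids.put(tree, id);
    }
    return id;
  }
}
```

With this shape, a tool's runfiles tree is logged once no matter how many actions use it, while a test that runs a single attempt does not leave an entry behind in the map.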