
Support for compression from buildbarn #158

Open

eagleonhill opened this issue Dec 23, 2022 · 5 comments

Comments

@eagleonhill

The Bazel remote execution protocol now has a compression option. It would be very useful for cases where Bazel and Buildbarn are in different clusters, and it also reduces storage usage.

@EdSchouten
Member

EdSchouten commented Dec 26, 2022

Hey @eagleonhill,

When compression support was added to the Remote Execution protocol, I was opposed to adding it in its given form, the reason being that the transmitting side does not send the size of the compressed object to the receiving side. With the way LocalBlobAccess is designed, it would be pretty hard to allocate space for objects as they are being written, as it tries to store objects consecutively. This means that compression would either be limited to the wire (not at rest), or bb-storage would need to do unnecessary copying of data to guarantee consecutive storage.
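To make the constraint concrete, here is a minimal sketch (a hypothetical interface, not bb-storage's actual API) of why a consecutive layout needs the size before the first byte arrives:

package sketch

import "io"

// consecutiveStore is a hypothetical stand-in for a store that, like
// LocalBlobAccess, lays objects out consecutively on disk: the full
// region must be reserved up front.
type consecutiveStore interface {
    // Allocate reserves sizeBytes of contiguous space and returns a
    // writer into that region. An upload whose compressed size is
    // never transmitted cannot call this without buffering or copying.
    Allocate(sizeBytes int64) (io.WriteCloser, error)
}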

Furthermore, one downside of compression is that it's a lot harder to provide fast random access to blobs. As Buildbarn relies heavily on virtual file systems (based on FUSE/NFSv4), this is something that we need. Even without compression the protocol's story around random access is a bit messy, as there's no way to guarantee data integrity by reading CAS objects partially.
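To illustrate the random-access problem, a sketch (function names are made up; it assumes the github.com/klauspost/compress/zstd package): a positioned read into an uncompressed blob is cheap, while a plain zstd frame has to be decompressed from the start to reach an offset:

package sketch

import (
    "io"

    "github.com/klauspost/compress/zstd"
)

// readAtPlain serves a random read directly; cost is proportional to len(p).
func readAtPlain(blob io.ReaderAt, off int64, p []byte) (int, error) {
    return blob.ReadAt(p, off)
}

// readAtCompressed must decompress and discard everything before off,
// because standard zstd frames are not seekable.
func readAtCompressed(blob io.Reader, off int64, p []byte) (int, error) {
    dec, err := zstd.NewReader(blob)
    if err != nil {
        return 0, err
    }
    defer dec.Close()
    if _, err := io.CopyN(io.Discard, dec, off); err != nil {
        return 0, err
    }
    return io.ReadFull(dec, p)
}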

Because of that, I am currently spending time with the remote API working group to get chunking/decomposition of CAS objects added. See these PRs:

Once we have proper support for chunking/decomposition, the story around compression will become a lot simpler. All (primitive) CAS objects will become bounded in size. This means that even if the receiving side doesn't know the size of a compressed CAS object, it can safely ingest it into memory to compute its size, before letting LocalBlobAccess write it to disk.
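A sketch of why bounded sizes help (maxChunkSize, alloc and ingestBounded are hypothetical; the real bound would come from the protocol): the receiver buffers the whole compressed chunk, learns its exact size, and only then reserves consecutive space:

package sketch

import (
    "errors"
    "io"
)

const maxChunkSize = 2 << 20 // assumed 2 MiB bound, for illustration only

// ingestBounded buffers a size-bounded compressed chunk in memory so its
// exact size is known before consecutive space is allocated for it.
func ingestBounded(src io.Reader, alloc func(int64) (io.WriteCloser, error)) error {
    buf, err := io.ReadAll(io.LimitReader(src, maxChunkSize+1))
    if err != nil {
        return err
    }
    if len(buf) > maxChunkSize {
        return errors.New("chunk exceeds protocol size bound")
    }
    w, err := alloc(int64(len(buf))) // the size is now known up front
    if err != nil {
        return err
    }
    if _, err := w.Write(buf); err != nil {
        w.Close()
        return err
    }
    return w.Close()
}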

So in summary, the reason Buildbarn doesn't support compression is not that we don't care. We want to do it right, and that still takes a bit more time to sort out. I will keep this issue open so that people can stay informed of updates.

@bytearth

bytearth commented Mar 2, 2023

Hey @EdSchouten,
Compression is also a very useful feature for us.
I have read the code of bazel-remote. In their design, when they receive a bytestream.Write request with a resource path like
{instance_name}/uploads/{uuid}/compressed-blobs/{compressor}/{uncompressed_hash}/{uncompressed_size}{/optional_metadata}, they decode the compressed data and write it to a file when using the diskCache backend.
zstd supports "streaming decompression", so they can decode and write in a loop, like this:

// Assuming src is an io.Reader over the received compressed bytes and
// dst is the destination file: the chunked decode-and-write loop
// happens inside io.Copy, via a streaming zstd decoder.
dec, err := zstd.NewReader(src) // github.com/klauspost/compress/zstd
if err != nil {
    return err
}
defer dec.Close()
_, err = io.Copy(dst, dec)
return err

Maybe Buildbarn could give this a try: when it receives a WriteRequest, allocate space for the blob identified by uncompressed_hash, sized by the uncompressed_size from the resource path, then receive, decode, and write into the target Block.
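A rough sketch of that flow, with hypothetical names (allocate stands in for whatever reserves space in the target Block; uncompressedSize comes from the resource path), again assuming the github.com/klauspost/compress/zstd decoder:

package sketch

import (
    "io"

    "github.com/klauspost/compress/zstd"
)

// ingestCompressed sizes the allocation from the resource path's
// uncompressed_size, then streams the decoded bytes into the target Block.
func ingestCompressed(src io.Reader, allocate func(int64) (io.WriteCloser, error), uncompressedSize int64) error {
    w, err := allocate(uncompressedSize) // space for the *uncompressed* blob
    if err != nil {
        return err
    }
    dec, err := zstd.NewReader(src)
    if err != nil {
        w.Close()
        return err
    }
    defer dec.Close()
    if _, err := io.CopyN(w, dec, uncompressedSize); err != nil {
        w.Close()
        return err
    }
    return w.Close()
}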

@EdSchouten
Member

That’s the point: if we add compression support, I don’t just want compression in transit. Also at rest. And that’s problematic, as we can’t allocate space for something we don’t know the size of.

@jmmv

jmmv commented May 26, 2023

From my reading of this bug, I still don't understand why supporting gRPC-level in-transit compression (https://github.com/grpc/grpc-go/blob/master/Documentation/compression.md) is not desired/allowed. This feature should be easy to enable and it is completely invisible to the application-level logic, correct?
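For what it's worth, a minimal grpc-go sketch of what that document describes (addr is a placeholder; how this would be wired into Buildbarn's client configuration is a separate question):

package sketch

import (
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
    "google.golang.org/grpc/encoding/gzip" // importing registers the gzip compressor
)

// dialWithGzip asks for gzip compression on every call made over the
// returned connection; a grpc-go server decompresses such requests
// automatically as long as it links in the same encoding package.
func dialWithGzip(addr string) (*grpc.ClientConn, error) {
    return grpc.Dial(addr,
        grpc.WithTransportCredentials(insecure.NewCredentials()),
        grpc.WithDefaultCallOptions(grpc.UseCompressor(gzip.Name)),
    )
}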

The reason I ask is that we face a situation where some of our artifacts are too large to download in a reasonable amount of time. However, these artifacts compress very well and almost never change (so they stay in the cache for a long time), so we don't really care about on-disk compression. Having the ability to compress in transit only would solve 90% of our problems with slow builds due to downloads, and I think this should be a fairly simple change?

More specifically, what we'd be looking at is having compression between the frontend and bb-clientd, and nothing else. Neither Bazel nor the storage nodes would even have to know that the compression is happening.

@kjteske

kjteske commented Sep 14, 2023

+1 to gRPC-level in-transit compression.

Our biggest artifacts (~1GB) are large binaries with debug information that we know compress really well: adding "-gz=zlib" to our gcc options shrinks these by 60% or more. Unfortunately "-gz=zlib" adds 40+ seconds to the link time, which is a dealbreaker. We're optimistic gRPC compression would perform better.

(gcc 13 adds a "-gz=zstd" option which might be better, but we're on gcc 12.3 for now).
