-
Notifications
You must be signed in to change notification settings - Fork 268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document a policy regarding memory allocation #638
Comments
Note that it's not just adding lots of checks: there are some API methods in CRoaring which allocate memory but do not return that memory (directly) and so how would we have to change the API to provide some kind of "error flag" that the user could check to determine whether allocation failed, in situations where we can't just return NULL to the user? My opinion as I wrote in #633 : I don't think we should try to detect memory allocation failures. We should document that we consider it illegal for memory allocation methods to fail / return NULL, and we should modify the built-in allocator implementation to call abort() or _exit() or similar if NULL is returned from an allocation method. And we shouldn't add checking code for all the allocations done throughout the code. In addition to considering how much work it would be for CRoaring to add all the memory checks and cleanups throughout the code, we have to realistically consider what the CRoaring user would do if they got back NULL: they would also have to commit to checking all the returned values... and if they get back NULL what would they do, other than die themselves? We could consider how users for whom this behavior is insufficient or undesirable could address it. For C++ users they could register memory allocators which throw an exception on failure, maybe: it wouldn't clean up (would be a memory leak and possibly even leave the roaring object in an invalid state, but it wouldn't crash). We could consider how difficult it would be to guarantee that a throw from an allocator would NOT leave the roaring object in an invalid state. This would require careful code review, to be sure. |
Rather than calling
Another consideration is whether we realistically expect to run on one of the platforms where dereferencing
I think we can provide a best-effort to clean a broken object if we want. Not repair, not validate, but free as many of the resources it holds after the error is detected as it can safely recognize. I don't think we can guarantee not to leave the roaring object in an invalid state after throwing, and I think it's OK to document it as such. The cleanup function would be a nice-to-have. Re: possibility of failure: I can think of one scenario outside of embedded which is 32-bits processes. Those can realistically exhaust the address space in the kind of application you'd use CRoaring for, and that is pretty much the only failure condition in modern Linux (no idea about other OSes). I will abstain of voting since my stakes aren't as high as they used to, I switched projects about two years ago and I haven't found an excuse to use bitmaps in the later projects I took on. I still feel OK giving opinions and debating tho. |
@Oppen For these sorts of things, there is no voting I think. What I hope is that we can arrive at a consensus. |
We already have the ability for the CRoaring user to provide alternate allocator methods. Isn't that sufficient to allow different behaviors on failure? Do we really need a separate callback, beyond that?
That's exactly what the (32bit) code does now: an incomplete "best effort attempt" to free memory. But speaking for myself that's what I don't want. I don't see the value in this partial implementation: for people who really want to use CRoaring in environments where memory is strictly limited and they have a way of dealing with an error due to out of memory, a partial implementation is not sufficient for them. They need guarantees, presumably with testing proving that it works. And for the rest of us it doesn't matter since we'll just die anyway. Either we should commit to this and finish the implementation so it's really correct, or abandon it as too much overhead/etc. and instead say that we expect that whatever allocator function we called it would either (a) return the requested memory, or (b) not return at all. What I was referring to by "not leaving the object in an invalid state" was not cleaning up after failure (allocators cannot fail!). Instead, I meant re-order the code so that we avoid making state changes that invalidate the CRoaring object, until after we've allocated all the memory we need to. This way if the allocator does something like throw an exception we have a reasonable expectation that the CRoaring object is still valid, in any context where the exception is caught. Maybe that is too hard or not worth it. Certainly it will only help C++ programs since there are no exceptions in C. |
I'm really of two minds about this: It's really hard to get right (error paths that are almost never hit are really hard to always be perfect at). But between windows (no overcommit), 32 bit systems, and custom allocators (setting up a limited custom allocator which ensures we never allocate over 1GiB or something seems like a possibly worthwhile thing to support), it would be nice to handle OOM gracefully. Also, C has |
Yes I thought of longjmp but it's difficult to see how a user would make use of it, other than to provide a clean exit path. That is, it's hard to imagine using longjmp around each call to CRoaring to detect if that particular call failed and handle it. The question is what do you mean by "handle OOM gracefully". What operations would be done in such a handler, that couldn't be accomplished by replacing the allocator functions? Basically I think we have three options:
In order to work on option 3 the very first thing we have to do is examine the public API and determine which methods could allocate memory. Then we have to decide how to communicate back to the user that memory allocation failed. This will definitely result in an API change. Some changes can be "quiet"; for example:
could be changed to return a bool and return false if memory allocation failed. But what do we do about things like:
which already uses its return value for something else? How does the caller determine whether this call failed due to an OOM issue? I suppose the only option is to create a separate function that can be called to determine whether there was an OOM error, that users who care can query after each CRoaring call to check for this problem. |
In C, the
malloc
function (and similar memory allocation functions) may 'fail' by returning a NULL pointer, or by some other means. Currently, CRoaring does not have a policy regarding failures in case of memory allocation.We have some checks in place (bailing out from function when the malloc failed), but there is no testing to see what happens when memory allocations can fail. So it is entirely possible that there are bugs. I would hope that static analysis could reveal such bugs, but I am not sure how to go about it.
We recently add 64-bit roaring bitmaps to C and the implementation did not check for NULLs. I have proposed to add checks at #633
How do we handle this?
That's not as clear cut as it seems. On many systems,
malloc
effectively always succeeds, and the failure comes when you try to access the memory. So checking for a NULL return is less useful than it seems. Furthermore, we specifically allow people to provide their own memory allocators.The proposal currently is to document our policy (the policy of the CRoaring project). Of course, we first need to agree on the policy. Thus I am inviting comments from anyone who might use CRoaring.
To be clear, I am not looking for code changes (though these would fine) but rather on a policy or agreed-upon convention. Code changes could then be allowed or rejected based on this policy.
Calling on @alexey-milovidov @BenPope @xiaoyong-z @madscientist @SLieve and others...
Credit to @Oppen for pointing out the issue.
The text was updated successfully, but these errors were encountered: