-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Relaxing Move Requirements - edits / comments #14
base: master
Are you sure you want to change the base?
Conversation
@@ -139,24 +152,29 @@ for (std::string line; std::getline(std::cin, line);) { | |||
} | |||
``` | |||
|
|||
For the call to `std::getline()` to be valid with a moved from string it requires that `std::string()` _guarantees_ that `std::erase()` is valid on a moved from string. Changing the requirements of a moved from object does not change the guarantees of the standard components. | |||
For the call to `std::getline()` to be valid with a moved-from string, `std::string()` must _guarantee_ that `std::erase()` is valid on a moved-from string. <!-- I don't understand the assertion. How is erase involved? --> Changing the requirements of a moved-from object does not change the guarantees of the standard components. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::getline()
is defined to call str.erase()
in this case - I should make the connection more clear. This section was added as a response to complaints that it wouldn't be valid with my proposed wording for getline()
to call erase()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, feel free to hit the “resolve” button when you no longer need a reminder.
|
||
## Impact on the Standard | ||
|
||
All components which are _Movable_ in the Standard Library currently satisfy the proposed requirements as stated by both options below. **Both options are non-breaking changes and relax the requirements.** With either option, it may be possible to adopt these options retroactively as part of addressing a defect since neither option is a breaking change. | ||
|
||
<!-- Seems likely that the assertion that this is a non-breaking change will be argued with. You simply can't ever loosen or tighten concept requirements without breaking somebody. If someone wrote an algorithm that depends on the current stronger requirements, that algorithm is now broken with respect to its documentation, and a use of that algorithm with some type that has been documented to satisfy those requirements may break when the implementation of that type changes. Maybe you want to say “non-code-breaking changes?” --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure - although I do address the documentation issue below.
Concern has been raised that changing the documentation for requirements, especially the named requirements and concepts, would break existing code documentation that referenced the standard. However, taking a strict view of this would mean that the standard documentation could not be changed. For example, one of the libraries I work on has a `task<>` template which is documented as being ["Similar to `std::function` except it is not copyable and supports move-only and mutable callable targets..."](https://stlab.cc/libraries/concurrency/task/task/). Of note, this goes on to specify, "`stlab::task` satisfies the requirements of [MoveConstructible](https://en.cppreference.com/w/cpp/named_req/MoveConstructible) and [MoveAssignable](https://en.cppreference.com/w/cpp/named_req/MoveAssignable)." Weakening the requirements would mean that statement is still true. | ||
|
||
However, the concern over changing the documentation is one that should be considered in light of weakening the requirements retroactively as a defect. As that would mean even citing a specific version of the standard would break. | ||
|
||
<!-- I don't really understand the last sentence, and my sense is that politically, it's better not to try to pull off this line of argumentation, and just make a best effort at assessing the risk. I think there are some requirements that we absolutely wouldn't consider weakening because the risk is too great. In fact I am starting to think the better line here is to argue that Geoffrey misinterpreted the standard wording and the non-normative note is wrong, so this doesn't represent an actual weakening of requirements… IIUC --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be an interesting argument but many, including Howard Hinnant, agree the intent of the current standard is that a moved-from object must satisfy all the requirements of a non-moved-from object in whatever context it is used. Basically, we'd be arguing Howard didn't intend what he wrote.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatever his intent, I am pretty sure I’m responsible for the original language, and Howard’s intent conflicts with the intent of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To elaborate, the fact that it went into a non-normative note is an indication that it was merely meant to clarify the intent of something that was already, strictly-speaking, specified by the standard. But the original wording was not intended to say that, and it seems unreasonable to draw the conclusion that it was, especially in light of the fact that the meaning Howard intended to “clarify” into existence is unworkable. The proper cure is to remove the incorrect non-normative wording, at least as a start.
@@ -167,7 +185,7 @@ Given the above general requirement, we can then specify what operations must ho | |||
|
|||
### Option 1 | |||
|
|||
Option 1 requires that a moved-from object can be used as an rhs argument to move-assignment only in the case that the object has been moved from and it is a self-move-assignment. It introduces a _moved-from-value_ to discuss the properties of the moved-from object without specifying a specific value and requires that self-move-assignment for the moved-from object is valid. The wording allows for `swap(a, a)` without allowing `a = move(a)` in general. | |||
Option 1 requires that a moved-from object can be used as an rhs argument to move-assignment only in the case of self-move-assignment. It introduces a _moved-from-value_ to discuss the properties of the moved-from object without specifying a specific value and requires that self-move-assignment for the moved-from object is valid. The wording allows for `swap(a, a)` without allowing `move(a), b = move(a)` in general. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did mean a = move(a)
although your phrasing is also true but this was to call out that only self-move-assignment for a moved-from object is required and not self-move-assignment in general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, wow… I see now. The original wording is really easy to misread as self-contradictory. Allow me to suggest:
Option 1 requires that a moved-from object can be used as an rhs argument to move-assignment only in the case of self-move-assignment. It introduces a _moved-from-value_ to discuss the properties of the moved-from object without specifying a specific value and requires that self-move-assignment for the moved-from object is valid. The wording allows for `swap(a, a)` without allowing `move(a), b = move(a)` in general. | |
Option 1 guarantees that the library will only perform self-move-assignment on values that have already been moved from. It introduces a _moved-from-value_ to discuss the properties of the moved-from object without specifying a specific value and requires that self-move-assignment for the moved-from object is valid. The wording allows for `swap(a, a)` without allowing `a = move(a)` in general. |
I'm not sure if I got the phrasing right; is this about restricting the library? Otherwise, maybe something like this:
Option 1 requires that a moved-from object can be used as an rhs argument to move-assignment only in the case of self-move-assignment. It introduces a _moved-from-value_ to discuss the properties of the moved-from object without specifying a specific value and requires that self-move-assignment for the moved-from object is valid. The wording allows for `swap(a, a)` without allowing `move(a), b = move(a)` in general. | |
Option 1 forbids self-move-assignment except on values that have already been moved from. It introduces a _moved-from-value_ to discuss the properties of the moved-from object without specifying a specific value and requires that self-move-assignment for the moved-from object is valid. The wording allows for `swap(a, a)` without allowing `a = move(a)` in general. |
@@ -382,7 +400,7 @@ Note that a new component could still provide stronger guarantees than is requir | |||
|
|||
### Class invariants | |||
|
|||
Although the proposal _Support for contract based programming in C++_, [P0542R5](#p0542), did not include class invariants, it is possible that a future version of the standard will. Class invariants are a way to state postconditions that apply to all (safe) operations on a type. For move, it would be desirable if class invariants where not validated for a moved from object by default, and more generally allow the declaration of _unsafe_ operations where invariants are not checked. One possible way this could be done would be with an attribute to declare if an invariant holds for a given argument: | |||
Although the proposal _Support for contract based programming in C++_, [P0542R5](#p0542), did not include class invariants, it is possible that a future version of the standard will. Class invariants are a way to state postconditions that apply to all (safe) operations on a type. For move, it would be desirable if class invariants where not validated for a moved-from object by default<!-- I very strongly disagree with this. The thing, if your type can be in a special moved-from state, is to include that state in the class invariants, and to document that the state is not in the domain of certain operations. The vast majority of types don't need such a special state, so this does not introduce much complexity, and for those that do, it simplifies reasoning for users. The ability to document an invariant that doesn't always hold is a convenience for type authors in these rare cases, but it doesn't serve their users. It also weakens arguments for the power of documenting invariants, if you can't always count on them. -->, and more generally allow the declaration of _unsafe_ operations where invariants are not checked. One possible way this could be done would be with an attribute to declare if an invariant holds for a given argument: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree with the comment - having a way to include the moved-from-state in the invariant requires the state to be testable and weakens the invariant unnecessarily. It would imply that any operation can leave the object in a moved-from state. This path just turns all objects into maybe monads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
having a way to include the moved-from-state in the invariant requires the state to be testable
The state needs to be one that can be handled by the destructor; what kind of state could be created that can be handled by the destructor but can't be otherwise validated? Making your argument about testability stick IMO means not just proving that such a type could exist, but that it's likely to exist and the efficiency cost of making it work in a simpler world is significant enough to justify the complexity created by a world with invariants-that-sometimes-don't-hold.
and weakens the invariant unnecessarily
Of course I'm against weakened invariants, but I'm against them because invariants are powerful for reasoning about code, and the reason they are powerful is that they're simple and universal. Making invariants less universal weakens all invariants, not just the few where the moved-from state needs to be special… and as far as I can tell, these cases are very rare.
It would imply that any operation can leave the object in a moved-from state.
What's your reasoning for that assertion?
AFAICS any operation can leave the object in a moved-from state as long as that state is consistent with way that operation is documented, and if it is not consistent with the way the operation is documented, it is disallowed in the result. Including or excluding the moved-from state in the invariant doesn't change that one bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One argument I do appreciate (but am not swayed by): the default move- ctor/assign operation may create a state outside an arbitrarily-chosen invariant, and it's inconvenient to patch that up. The two possible cures in my world are:
- Weaken the invariant by adding a moved-from state.
- Write non-default move/assign that patch the state up so it satisfies the invariant.
Until I see that there actually exist significant types for which these cures are untenable, I'm not convinced a general exception to invariants should be carved out for the moved-from state. Essentially, what that does, is declare that “move” is an unsafe operation for all types unless otherwise specified. That would be yet another case of “C++ gets the defaults wrong.”
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answering here to keep things in one place. Consider this small example:
class url [[invariant: this->is_valid_url()]] {
std::string _url;
bool is_valid_url(); //...
public:
// all the public methods
};
The default implementation of move-ctor and move-assign will leave us with an unspecified _url
value, which may or may not satisfy the invariant. I can see a few options:
- Clear the
_url
and weaken the invariant to includeis_valid_url() || is_empty()
.
1a) Patch up all the operations to do something with the empty case.
1b) Add preconditions/postconditions to everything except the required move operations to exclude the empty case. - Not support move, have copy, and pay for the inefficiency.
- Allow some operations (the moved from cases) not to satisfy the invariant.
3a) Implicit for the move from cases
3b) Always explicit - something like:url([[unsafe]]url&&);
3b could be more explicit with something like: url([[noinvariant, postcondition: is_empty()]]url&&);
depending on how verbose we want this to be.
I do not see other options, and I find option 3 as the most sensible. Option 1 very much sounds like getting the defaults wrong. I lean toward 3a because I believe it is the common case, and the defaults should match the named requirements for the operations, but I wouldn't object to 3b.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you say "valid" I suppose you mean "well-formed"
If this were Swift I'd implement 1a by having a global constant URL string for the moved-from state, but std::string doesn't cooperate like that, unfortunately, and there's no way to share storage between immutable strings. I find it very interesting that not having COW makes these cases difficult.
For C++, here's what I'd do instead (this is 1a):
class url [[invariant: this->is_valid_url()]] {
static std::string const moved = "http://moved-from-url";
std::string _storage;
std::string& _url() {
return !_storage.empty ? _storage : _moved;
}
std::string const& _url() {
return !_storage.empty ? _storage : _moved;
}
bool is_valid_url(); //...
public:
// all the public methods, implemented in terms of _url()
};
Yes, you pay for a branch (which can be predicted with high accuracy) on every access. If that cost was unacceptable, I would weaken the statement of the invariant to include the empty state, because them's the facts: it does. Whether that means 1b, or making the operations fail at runtime given an empty state, is an engineering call. My guess is that if url
has any interesting operations, they're accessing the network, which can fail anyway… but that also means the cost of the branch is in the noise.
We disagree on a very fundamental level about the sensibility of 3. “Some operations do not satisfy the invariant” is a contradiction; the thing you're calling an “invariant” is a condition that is not in fact invariant. Just look up the common definition of class invariant. There is no provision for “some public operations leave the invariant broken.”
Now that I've explained why I find 3 to be insensible (in a very literal sense), perhaps you'd explain what makes it more sensible than option 1.
I think you must mean something very different than I do about “getting the defaults wrong.” I'm not talking about anything having to do with this particular type, but rather what it means for users of the language overall: in general, they have to assume that every type has unsafe public API that will get used implicitly by the compiler in very ordinary-looking code they write. You may be able to convince yourself these cases are OK, but still, it's an overall cost you pay in reasoning about correctness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "valid" I mean has a correspondence to a concrete or abstract entity. i.e. has meaning. Your solution is an optimization of 1a. (we can call it 1c). You have an empty state but, like a NaN, is constructed to avoid overhead. It is a value to signify the object has no meaning. I'm not a fan of the approach - it allows values without meaning to escape into the system. Everything is effectively a 'maybe' type. I'd rather crash hard on first use (or better - have the compiler flag it!). Here that would require adding a bunch of preconditions "is_not_moved_from()" which puts you back into a 1b situation. Having a NaN or empty states does not improve software correctness. It simply allows meaningless values to escape into the system and makes it more difficult to find the underlying problem. NaN type values have their uses (for efficiency) but we shouldn't lose sight that they are a hack for efficiency and do not provide any level of safety.
We know private and protected operations can violate invariants. There is no reason why protected should be limited to inheritance - that is just an OOPs/language artifact. Safety and efficiency are often at odds - I still contend (conjecture) that you cannot always have both. At one point you mentioned that you disagreed with my array permutation example as to why. I think you mentioned a mechanism that you believe allows safe permutations that never leaves holes at any step - I'm skeptical.
The underlying issue here is that the moved-from object should not exist. Moving from an object should end the lifetime of the object - but since the object does exist under the current language rules, we need to deal with it. Move as defined in C++ is inherently unsafe for any type without a meaningful empty state. "Unspecified" is without meaning and not tagged as being without meaning so the "meaninglessness" of the value is not encoded. Move is not the only unsafe operation. The basic exception guarantee is unsafe, for many types (including ints) default construction is unsafe, NaNs are unsafe (but tagged), and so on. By not requiring that move is noexcept (for "safety") the standard introduced the need for things like valueless_by_exception
- another unsafe variant of NaN.
Also of note - for any implicit move operation the compiler ensures safety by destructing the rvalue after the move. The only unsafe move operations are those involving casts (like std::move()) - which is circumventing the type system. Be default you don' have any unsafe operations without the cast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: by the end, I'm basically convinced, by viewing move
as the unsafe operation
By "valid" I mean has a correspondence to a concrete or abstract entity. i.e. has meaning.
The invariant that instances of the type have such a correspondence cannot be efficiently supported in a language with non-destructive move. I don't believe inventing your own meaning for “invariant” is an honest solution; it's a hack, and it has various negative consequences that I've outlined above.
Your solution is an optimization of 1a. (we can call it 1c).
FWIW, I don't see any performance optimization here. I simply bottlenecked the branch that would otherwise always be required into accessor functions to avoid having a huge impact on the code. It's a coding/code-quality optimization from my POV.
You have an empty state but, like a NaN, is constructed to avoid overhead. It is a value to signify the object has no meaning. I'm not a fan of the approach - it allows values without meaning to escape into the system.
You don't need to sell me on that; I agree.
NaN type values have their uses (for efficiency) but we shouldn't lose sight that they are a hack for efficiency and do not provide any level of safety.
In general I agree with you, but apparently NaN specifically is also useful to prevent massive computations from being halted when some of the non-NaN results are going to be useful.
We know private and protected operations can violate invariants.
Yes, but the public API is supposed to encapsulate those.
There is no reason why protected should be limited to inheritance - that is just an OOPs/language artifact.
I don't know what this means or how it's relevant. I'm just talking about public APIs.
Safety and efficiency are often at odds - I still contend (conjecture) that you cannot always have both.
At one point you mentioned that you disagreed with my array permutation example as to why.
I think you mentioned a mechanism that you believe allows safe permutations that never leaves holes at any step - I'm skeptical.
I don't remember this problem right now, but swap doesn't leave holes at any step; isn't it obvious that you can always use swap to permute? There must be some context I've forgotten here.
The underlying issue here is that the moved-from object should not exist. Moving from an object should end the lifetime of the object - but since the object does exist under the current language rules, we need to deal with it.
Agreed on all counts! We need to deal with the reality that the object still exists.
Move as defined in C++ is inherently unsafe for any type without a meaningful empty state.
I mean something different by “unsafe” than you do, I'm pretty sure. A safe operation by my definition always leaves an object in a state such that thereafter, when all written contracts are followed, there is no UB. This is a meaningful measure of protections that can be enforced by the compiler, like API boundaries and the type system. It's also a meaningful generalization of what type safety offers. If we reduce safety to “what actually has meaning” I'm pretty sure we have to say everything is unsafe, because meaning can be imposed by the use-case for a type.
Move is not the only unsafe operation. The basic exception guarantee is unsafe,
Not by my definition of safe, and it's distinctly different, today, from what you're asking move be allowed to do. It
By not requiring that move is noexcept (for "safety") the standard introduced the need for things like valueless_by_exception - another unsafe variant of NaN.
FWIW, that wasn't for safety, that was for “move will sometimes copy, copy can throw, and we want to be able to generate default move operations for existing types without breaking code.” There was no other way to resolve those engineering constraints, AFAICT; it simply acknowledges reality.
Also of note - for any implicit move operation the compiler ensures safety by destructing the rvalue after the move. The only unsafe move operations are those involving casts (like std::move()) - which is circumventing the type system. Be default you don' have any unsafe operations without the cast.
That's very interesting, and I think it mostly deals with my objections. OK, we can say std::move
is the unsafe operation and that using it can result in an object that doesn't satisfy invariants, but will be destructible. It still bothers me that we will routinely be writing (and synthesizing!) public API that leaves invariants broken; this will take some explaining to make people understand it. It also leads to some programming guidelines, like “any operation that leaves a visible moved-from state behind is unsafe, and should be so labeled when public.”
@@ -398,7 +416,7 @@ Parent, Sean. _Move Annoyance_, Addison-Wesley, 31 Mar. 2021, <https://sean-pare | |||
|
|||
Sutter, Herb. “Move, Simply.” _Sutter's Mill_, 21 Feb. 2020, <https://herbsutter.com/2020/02/17/move-simply/>. | |||
|
|||
Romer, Geoffrey. “Moved-from Objects Need Not Be Valid.” _C++ Standards Committee Papers_, ISO/IEC WG21, 10 Jan. 2020, <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2027r0.pdf>. | |||
Romer, Geoffrey. “Moved-from Objects Need Not Be Valid.” _C++ Standards Committee Papers_, ISO/IEC WG21, 10 Jan. 2020, <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2027r0.pdf>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Standard site lost their certs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷
|
||
Supporting self-move-assignment for this narrow case imposes some additional complexity because `a = move(a)` is, in general, a contradiction and is not required by the implementation of any standard component. The implementation required postcondition of `a = move(b)` is that `a` holds the prior value of `b` and the value of `b` is unspecified, but may be guaranteed to be a specific value. For example, if `a` and `b` are both of type `my_unique_ptr<T>` with the guarantee that `a` will hold the prior value of `b`, and `b` will be equal to `nullptr`. Then for the expression `a = move(a)`, the only way both of those guarantees could be satisfied is if `a` is already equal to `nullptr`. The current standard avoids this contradiction by defining the postcondition of move assignment for `std::unique_ptr<T>` as equivalent to `reset(r.release())` which provides a stronger guarantee than any standard component implementation requires while satisfying the Standard requirements. | ||
<!-- Doesn't this simply indicate that the guarantee you wrote for my_unique_ptr is wrongly phrased unless we add a special case for &a == &b? IMO you're not (yet) clearly pointing at an actual problem here. --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look to rephrase this but I don't want to have to carve out an exception for self-move-assignment. It doesn't make sense to have a guarantee and then an exception for the aliased case. We have a guarantee for the lhs of assignment, and frequently we have a guaranteed value for a moved-from-object. Carving out a special case for self-move-assignment is just a hack to avoid the contradiction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷 “It's just a hack” seems to be in the eye of the beholder.
The standard as currently written is intended to allow move self-assignment; that seems pretty clear from lots of things, including the unique_ptr
semantics you note elsewhere. Whether that was a good idea or not is a separate question that I don't want to grapple with; it's probably too late to take that decision back anyhow. In a world where move self-assignment is allowed, if you are going to give postconditions for the rhs of move-assignment, you have to carve out a special case for self-assignment; that's just another rule of programming in C++, which is a consequence of the language design, like the rule of 5.
You do realize I'm not suggesting you propose carving out an exception in the standard, right?
No description provided.