Skip to content

Commit

Permalink
Auto merge of rust-lang#118796 - Nadrieril:fix-exponential-id-match-2…
Browse files Browse the repository at this point in the history
…, r=cjgillot

Exhaustiveness: Improve complexity on some wide matches

rust-lang#118437 revealed an exponential case in exhaustiveness checking. While [exponential cases are unavoidable](https://compilercrim.es/rust-np/), this one only showed up after my rust-lang#117611 rewrite of the algorithm. I remember anticipating a case like this and dismissing it as unrealistic, but here we are :').

The tricky match is as follows:
```rust
match command {
    BaseCommand { field01: true, .. } => {}
    BaseCommand { field02: true, .. } => {}
    BaseCommand { field03: true, .. } => {}
    BaseCommand { field04: true, .. } => {}
    BaseCommand { field05: true, .. } => {}
    BaseCommand { field06: true, .. } => {}
    BaseCommand { field07: true, .. } => {}
    BaseCommand { field08: true, .. } => {}
    BaseCommand { field09: true, .. } => {}
    BaseCommand { field10: true, .. } => {}
    // ...20 more of the same

    _ => {}
}
```

To fix this, this PR formalizes a concept of "relevancy" (naming is hard) that was already used to decide what patterns to report. Now we track it for every row, which in wide matches like the above can drastically cut on the number of cases we explore. After this fix, the above match is checked with linear-many cases instead of exponentially-many.

Fixes rust-lang#118437

r? `@cjgillot`
  • Loading branch information
bors committed Dec 24, 2023
2 parents ebb821f + efb04e6 commit 1a086e4
Show file tree
Hide file tree
Showing 2 changed files with 277 additions and 29 deletions.
234 changes: 205 additions & 29 deletions compiler/rustc_pattern_analysis/src/usefulness.rs
Original file line number Diff line number Diff line change
Expand Up @@ -300,6 +300,166 @@
//!
//!
//!
//! # `Missing` and relevancy
//!
//! ## Relevant values
//!
//! Take the following example:
//!
//! ```compile_fail,E0004
//! # let foo = (true, true);
//! match foo {
//! (true, _) => 1,
//! (_, true) => 2,
//! };
//! ```
//!
//! Consider the value `(true, true)`:
//! - Row 2 does not distinguish `(true, true)` and `(false, true)`;
//! - `false` does not show up in the first column of the match, so without knowing anything else we
//! can deduce that `(false, true)` matches the same or fewer rows than `(true, true)`.
//!
//! Using those two facts together, we deduce that `(true, true)` will not give us more usefulness
//! information about row 2 than `(false, true)` would. We say that "`(true, true)` is made
//! irrelevant for row 2 by `(false, true)`". We will use this idea to prune the search tree.
//!
//!
//! ## Computing relevancy
//!
//! We now generalize from the above example to approximate relevancy in a simple way. Note that we
//! will only compute an approximation: we can sometimes determine when a case is irrelevant, but
//! computing this precisely is at least as hard as computing usefulness.
//!
//! Our computation of relevancy relies on the `Missing` constructor. As explained in
//! [`crate::constructor`], `Missing` represents the constructors not present in a given column. For
//! example in the following:
//!
//! ```compile_fail,E0004
//! enum Direction { North, South, East, West }
//! # let wind = (Direction::North, 0u8);
//! match wind {
//! (Direction::North, _) => 1,
//! (_, 50..) => 2,
//! };
//! ```
//!
//! Here `South`, `East` and `West` are missing in the first column, and `0..50` is missing in the
//! second. Both of these sets are represented by `Constructor::Missing` in their corresponding
//! column.
//!
//! We then compute relevancy as follows: during the course of the algorithm, for a row `r`:
//! - if `r` has a wildcard in the first column;
//! - and some constructors are missing in that column;
//! - then any `c != Missing` is considered irrelevant for row `r`.
//!
//! By this we mean that continuing the algorithm by specializing with `c` is guaranteed not to
//! contribute more information about the usefulness of row `r` than what we would get by
//! specializing with `Missing`. The argument is the same as in the previous subsection.
//!
//! Once we've specialized by a constructor `c` that is irrelevant for row `r`, we're guaranteed to
//! only explore values irrelevant for `r`. If we then ever reach a point where we're only exploring
//! values that are irrelevant to all of the rows (including the virtual wildcard row used for
//! exhaustiveness), we skip that case entirely.
//!
//!
//! ## Example
//!
//! Let's go through a variation on the first example:
//!
//! ```compile_fail,E0004
//! # let foo = (true, true, true);
//! match foo {
//! (true, _, true) => 1,
//! (_, true, _) => 2,
//! };
//! ```
//!
//! ```text
//! ┐ Patterns:
//! │ 1. `[(true, _, true)]`
//! │ 2. `[(_, true, _)]`
//! │ 3. `[_]` // virtual extra wildcard row
//! │
//! │ Specialize with `(,,)`:
//! ├─┐ Patterns:
//! │ │ 1. `[true, _, true]`
//! │ │ 2. `[_, true, _]`
//! │ │ 3. `[_, _, _]`
//! │ │
//! │ │ There are missing constructors in the first column (namely `false`), hence
//! │ │ `true` is irrelevant for rows 2 and 3.
//! │ │
//! │ │ Specialize with `true`:
//! │ ├─┐ Patterns:
//! │ │ │ 1. `[_, true]`
//! │ │ │ 2. `[true, _]` // now exploring irrelevant cases
//! │ │ │ 3. `[_, _]` // now exploring irrelevant cases
//! │ │ │
//! │ │ │ There are missing constructors in the first column (namely `false`), hence
//! │ │ │ `true` is irrelevant for rows 1 and 3.
//! │ │ │
//! │ │ │ Specialize with `true`:
//! │ │ ├─┐ Patterns:
//! │ │ │ │ 1. `[true]` // now exploring irrelevant cases
//! │ │ │ │ 2. `[_]` // now exploring irrelevant cases
//! │ │ │ │ 3. `[_]` // now exploring irrelevant cases
//! │ │ │ │
//! │ │ │ │ The current case is irrelevant for all rows: we backtrack immediately.
//! │ │ ├─┘
//! │ │ │
//! │ │ │ Specialize with `false`:
//! │ │ ├─┐ Patterns:
//! │ │ │ │ 1. `[true]`
//! │ │ │ │ 3. `[_]` // now exploring irrelevant cases
//! │ │ │ │
//! │ │ │ │ Specialize with `true`:
//! │ │ │ ├─┐ Patterns:
//! │ │ │ │ │ 1. `[]`
//! │ │ │ │ │ 3. `[]` // now exploring irrelevant cases
//! │ │ │ │ │
//! │ │ │ │ │ Row 1 is therefore useful.
//! │ │ │ ├─┘
//! <etc...>
//! ```
//!
//! Relevancy allowed us to skip the case `(true, true, _)` entirely. In some cases this pruning can
//! give drastic speedups. The case this was built for is the following (#118437):
//!
//! ```ignore(illustrative)
//! match foo {
//! (true, _, _, _, ..) => 1,
//! (_, true, _, _, ..) => 2,
//! (_, _, true, _, ..) => 3,
//! (_, _, _, true, ..) => 4,
//! ...
//! }
//! ```
//!
//! Without considering relevancy, we would explore all 2^n combinations of the `true` and `Missing`
//! constructors. Relevancy tells us that e.g. `(true, true, false, false, false, ...)` is
//! irrelevant for all the rows. This allows us to skip all cases with more than one `true`
//! constructor, changing the runtime from exponential to linear.
//!
//!
//! ## Relevancy and exhaustiveness
//!
//! For exhaustiveness, we do something slightly different w.r.t relevancy: we do not report
//! witnesses of non-exhaustiveness that are irrelevant for the virtual wildcard row. For example,
//! in:
//!
//! ```ignore(illustrative)
//! match foo {
//! (true, true) => {}
//! }
//! ```
//!
//! we only report `(false, _)` as missing. This was a deliberate choice made early in the
//! development of rust, for diagnostic and performance purposes. As showed in the previous section,
//! ignoring irrelevant cases preserves usefulness, so this choice still correctly computes whether
//! a match is exhaustive.
//!
//!
//!
//! # Or-patterns
//!
//! What we have described so far works well if there are no or-patterns. To handle them, if the
Expand Down Expand Up @@ -669,11 +829,15 @@ impl fmt::Display for ValidityConstraint {
struct PatStack<'a, 'p, Cx: TypeCx> {
// Rows of len 1 are very common, which is why `SmallVec[_; 2]` works well.
pats: SmallVec<[&'a DeconstructedPat<'p, Cx>; 2]>,
/// Sometimes we know that as far as this row is concerned, the current case is already handled
/// by a different, more general, case. When the case is irrelevant for all rows this allows us
/// to skip a case entirely. This is purely an optimization. See at the top for details.
relevant: bool,
}

impl<'a, 'p, Cx: TypeCx> PatStack<'a, 'p, Cx> {
fn from_pattern(pat: &'a DeconstructedPat<'p, Cx>) -> Self {
PatStack { pats: smallvec![pat] }
PatStack { pats: smallvec![pat], relevant: true }
}

fn is_empty(&self) -> bool {
Expand Down Expand Up @@ -708,12 +872,17 @@ impl<'a, 'p, Cx: TypeCx> PatStack<'a, 'p, Cx> {
&self,
pcx: &PlaceCtxt<'a, 'p, Cx>,
ctor: &Constructor<Cx>,
ctor_is_relevant: bool,
) -> PatStack<'a, 'p, Cx> {
// We pop the head pattern and push the new fields extracted from the arguments of
// `self.head()`.
let mut new_pats = self.head().specialize(pcx, ctor);
new_pats.extend_from_slice(&self.pats[1..]);
PatStack { pats: new_pats }
// `ctor` is relevant for this row if it is the actual constructor of this row, or if the
// row has a wildcard and `ctor` is relevant for wildcards.
let ctor_is_relevant =
!matches!(self.head().ctor(), Constructor::Wildcard) || ctor_is_relevant;
PatStack { pats: new_pats, relevant: self.relevant && ctor_is_relevant }
}
}

Expand Down Expand Up @@ -779,10 +948,11 @@ impl<'a, 'p, Cx: TypeCx> MatrixRow<'a, 'p, Cx> {
&self,
pcx: &PlaceCtxt<'a, 'p, Cx>,
ctor: &Constructor<Cx>,
ctor_is_relevant: bool,
parent_row: usize,
) -> MatrixRow<'a, 'p, Cx> {
MatrixRow {
pats: self.pats.pop_head_constructor(pcx, ctor),
pats: self.pats.pop_head_constructor(pcx, ctor, ctor_is_relevant),
parent_row,
is_under_guard: self.is_under_guard,
useful: false,
Expand Down Expand Up @@ -897,8 +1067,9 @@ impl<'a, 'p, Cx: TypeCx> Matrix<'a, 'p, Cx> {
&self,
pcx: &PlaceCtxt<'a, 'p, Cx>,
ctor: &Constructor<Cx>,
ctor_is_relevant: bool,
) -> Matrix<'a, 'p, Cx> {
let wildcard_row = self.wildcard_row.pop_head_constructor(pcx, ctor);
let wildcard_row = self.wildcard_row.pop_head_constructor(pcx, ctor, ctor_is_relevant);
let new_validity = self.place_validity[0].specialize(ctor);
let new_place_validity = std::iter::repeat(new_validity)
.take(ctor.arity(pcx))
Expand All @@ -908,7 +1079,7 @@ impl<'a, 'p, Cx: TypeCx> Matrix<'a, 'p, Cx> {
Matrix { rows: Vec::new(), wildcard_row, place_validity: new_place_validity };
for (i, row) in self.rows().enumerate() {
if ctor.is_covered_by(pcx, row.head().ctor()) {
let new_row = row.pop_head_constructor(pcx, ctor, i);
let new_row = row.pop_head_constructor(pcx, ctor, ctor_is_relevant, i);
matrix.expand_and_push(new_row);
}
}
Expand Down Expand Up @@ -1108,7 +1279,10 @@ impl<Cx: TypeCx> WitnessMatrix<Cx> {
if matches!(ctor, Constructor::Missing) {
// We got the special `Missing` constructor that stands for the constructors not present
// in the match.
if !report_individual_missing_ctors {
if missing_ctors.is_empty() {
// Nothing to report.
*self = Self::empty();
} else if !report_individual_missing_ctors {
// Report `_` as missing.
let pat = WitnessPat::wild_from_ctor(pcx, Constructor::Wildcard);
self.push_pattern(pat);
Expand Down Expand Up @@ -1167,6 +1341,13 @@ fn compute_exhaustiveness_and_usefulness<'a, 'p, Cx: TypeCx>(
) -> WitnessMatrix<Cx> {
debug_assert!(matrix.rows().all(|r| r.len() == matrix.column_count()));

if !matrix.wildcard_row.relevant && matrix.rows().all(|r| !r.pats.relevant) {
// Here we know that nothing will contribute further to exhaustiveness or usefulness. This
// is purely an optimization: skipping this check doesn't affect correctness. See the top of
// the file for details.
return WitnessMatrix::empty();
}

let Some(ty) = matrix.head_ty(mcx) else {
// The base case: there are no columns in the matrix. We are morally pattern-matching on ().
// A row is useful iff it has no (unguarded) rows above it.
Expand All @@ -1179,8 +1360,14 @@ fn compute_exhaustiveness_and_usefulness<'a, 'p, Cx: TypeCx>(
return WitnessMatrix::empty();
}
}
// No (unguarded) rows, so the match is not exhaustive. We return a new witness.
return WitnessMatrix::unit_witness();
// No (unguarded) rows, so the match is not exhaustive. We return a new witness unless
// irrelevant.
return if matrix.wildcard_row.relevant {
WitnessMatrix::unit_witness()
} else {
// We choose to not report anything here; see at the top for details.
WitnessMatrix::empty()
};
};

debug!("ty: {ty:?}");
Expand Down Expand Up @@ -1223,32 +1410,21 @@ fn compute_exhaustiveness_and_usefulness<'a, 'p, Cx: TypeCx>(

let mut ret = WitnessMatrix::empty();
for ctor in split_ctors {
debug!("specialize({:?})", ctor);
// Dig into rows that match `ctor`.
let mut spec_matrix = matrix.specialize_constructor(pcx, &ctor);
debug!("specialize({:?})", ctor);
// `ctor` is *irrelevant* if there's another constructor in `split_ctors` that matches
// strictly fewer rows. In that case we can sometimes skip it. See the top of the file for
// details.
let ctor_is_relevant = matches!(ctor, Constructor::Missing) || missing_ctors.is_empty();
let mut spec_matrix = matrix.specialize_constructor(pcx, &ctor, ctor_is_relevant);
let mut witnesses = ensure_sufficient_stack(|| {
compute_exhaustiveness_and_usefulness(mcx, &mut spec_matrix, false)
});

let counts_for_exhaustiveness = match ctor {
Constructor::Missing => !missing_ctors.is_empty(),
// If there are missing constructors we'll report those instead. Since `Missing` matches
// only the wildcard rows, it matches fewer rows than this constructor, and is therefore
// guaranteed to result in the same or more witnesses. So skipping this does not
// jeopardize correctness.
_ => missing_ctors.is_empty(),
};
if counts_for_exhaustiveness {
// Transform witnesses for `spec_matrix` into witnesses for `matrix`.
witnesses.apply_constructor(
pcx,
&missing_ctors,
&ctor,
report_individual_missing_ctors,
);
// Accumulate the found witnesses.
ret.extend(witnesses);
}
// Transform witnesses for `spec_matrix` into witnesses for `matrix`.
witnesses.apply_constructor(pcx, &missing_ctors, &ctor, report_individual_missing_ctors);
// Accumulate the found witnesses.
ret.extend(witnesses);

// A parent row is useful if any of its children is.
for child_row in spec_matrix.rows() {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
// check-pass
struct BaseCommand {
field01: bool,
field02: bool,
field03: bool,
field04: bool,
field05: bool,
field06: bool,
field07: bool,
field08: bool,
field09: bool,
field10: bool,
field11: bool,
field12: bool,
field13: bool,
field14: bool,
field15: bool,
field16: bool,
field17: bool,
field18: bool,
field19: bool,
field20: bool,
field21: bool,
field22: bool,
field23: bool,
field24: bool,
field25: bool,
field26: bool,
field27: bool,
field28: bool,
field29: bool,
field30: bool,
}

fn request_key(command: BaseCommand) {
match command {
BaseCommand { field01: true, .. } => {}
BaseCommand { field02: true, .. } => {}
BaseCommand { field03: true, .. } => {}
BaseCommand { field04: true, .. } => {}
BaseCommand { field05: true, .. } => {}
BaseCommand { field06: true, .. } => {}
BaseCommand { field07: true, .. } => {}
BaseCommand { field08: true, .. } => {}
BaseCommand { field09: true, .. } => {}
BaseCommand { field10: true, .. } => {}
BaseCommand { field11: true, .. } => {}
BaseCommand { field12: true, .. } => {}
BaseCommand { field13: true, .. } => {}
BaseCommand { field14: true, .. } => {}
BaseCommand { field15: true, .. } => {}
BaseCommand { field16: true, .. } => {}
BaseCommand { field17: true, .. } => {}
BaseCommand { field18: true, .. } => {}
BaseCommand { field19: true, .. } => {}
BaseCommand { field20: true, .. } => {}
BaseCommand { field21: true, .. } => {}
BaseCommand { field22: true, .. } => {}
BaseCommand { field23: true, .. } => {}
BaseCommand { field24: true, .. } => {}
BaseCommand { field25: true, .. } => {}
BaseCommand { field26: true, .. } => {}
BaseCommand { field27: true, .. } => {}
BaseCommand { field28: true, .. } => {}
BaseCommand { field29: true, .. } => {}
BaseCommand { field30: true, .. } => {}

_ => {}
}
}

fn main() {}

0 comments on commit 1a086e4

Please sign in to comment.