
New application #959

Merged: 10 commits merged into main on Dec 20, 2023

Conversation

@gabriel-barrett (Member) commented Dec 13, 2023

This PR changes the way function application and environments work.

First, the most important part: the environments. The way Lurk used to work is that every time an operation changed an environment, you'd have to push a continuation that would recover the unchanged environment. This was an invariant that absolutely had to be followed, because we would not save the environment for the argument of the function, instead relying on the environment that was returned. That is, it was the responsibility of whoever modified the environment to recover it after reducing the expression, not the responsibility of whoever saves expressions to a continuation to also save the running environment. This was quite unusual, so I decided to go with the more common approach of saving environments in continuations. Here are the benefits, and some caveats:

1 - I've removed both lookup and tail continuations, which were in fact identical (these were the continuations responsible for retrieving the environment). This removed a hash8 slot in make_thunk.
2 - I was able to save all necessary environments on continuations without requiring extra slots. Saving the environment for the coprocessor was a bit tricky, since its continuation already used all 4 slots, so I had to use a hash2 to combine two elements; but that was a reused slot, so no extra slots were required
3 - It made lookup take fewer iterations, since there's no need to waste one step retrieving the environment. This saved 100+ iterations for (fib 20)
4 - This change fixes #687
5 - Since no one has the responsibility of cleaning up after modifying an environment, the environment that results after you finish reducing the expression is considered garbage, so it must be discarded. That's why apply_continuation won't use the input environment at all
6 - Also because of 5, eval would leak possibly sensitive information after reducing, since it would return the modified environment. So, primarily for privacy reasons, in the last step (when you go from outermost to terminal) the environment is discarded and nil is returned instead. You might protest at first, because now any (expr, env, outermost) will return (val, nil, terminal), and not (val, env, terminal). But this behaviour, I think, is more elegant. Here's the reason: if you start with (expr, env, outermost), you can think of it as a shortcut to the evaluation of the top-level ((let ((x a) ...) expr), nil, outermost), for a let expression that builds env (see the sketch right after this list). In fact, if you start from such a triple, you will inevitably pass through the state (expr, env, outermost), which was not the case before (you would pass through a similar triple, but the continuation would be a tail continuation).
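To illustrate point 6 with a toy example (mine, not from the PR): starting from (expr, env, outermost), where env binds a to 1 and b to 2 and expr is (+ a b), is essentially a shortcut for evaluating the top-level expression below from the nil environment; either way the reduction passes through that same triple and ends at (3, nil, terminal), with the environment discarded:

(let ((a 1)
      (b 2))
  (+ a b))
;; final IO: value 3, environment nil, continuation terminal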

If I've convinced everyone of this change, I will now explain the apply change, which saved another 200+ iterations on (fib 20). The way Lurk used to work is that an expression such as (f a b c) would first spend time being rewritten into (((f a) b) c), and only then start reducing. Now, we push the (a b c) list to a continuation (called Call) and evaluate f, which should return a function (or we fail). The second change is that functions can have multiple parameters. So, when we apply Call, we check 4 cases (illustrated right after the list):

1 - If the function parameters and the call arguments are both empty (nil), we enter the body of the function
2 - If the function parameters are not empty but the call arguments are, we return the function as is, exactly like current Lurk works
3 - If the function parameters are empty and the call arguments are not, we fail, like current Lurk
4 - If both are non-empty, we start evaluating the first argument, and push a Call2 continuation with the function and the rest of the arguments
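Some illustrative Lurk calls for these four cases (my own examples, assuming the semantics described above):

((lambda () 1))                ;; case 1: no parameters, no arguments => enters the body, returns 1
((lambda (x) x))               ;; case 2: parameters but no arguments => returns the function as is
((lambda () 1) 2)              ;; case 3: no parameters but an argument => error
((lambda (x y) (+ x y)) 1 2)   ;; case 4: both non-empty => evaluate 1, push Call2 with the function and (2); result is 3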

The way Call2 works (examples below):
1 - If the function has exactly one parameter, we enter the body of the function, with the call argument added to the environment. If the rest of the call arguments is non-nil, we also push a Call continuation (this is the oversaturated case)
2 - If the function has more than one parameter, we create a new function with one less parameter and an extended environment. If the rest of the call arguments is nil, we return this function; otherwise we push this function to a Call2 continuation with the tail of the rest and evaluate the argument
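Again, some illustrative calls (my own examples, assuming the semantics described above):

((lambda (x) (lambda (y) (+ x y))) 1 2)   ;; oversaturated: enter the body with x = 1, push Call to apply the result to 2; returns 3
((lambda (x y) (+ x y)) 1)                ;; undersaturated: returns <FUNCTION (y) (+ x y)> with x bound to 1 in its environment
((lambda (x y) (+ x y)) 1 2)              ;; saturated: one Call2 consumes 1, a second Call2 consumes 2; returns 3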

Minor change: because of multi-argument functions, one more hash6 was added, so we would go from hash6 = 3 and hash8 = 4 to hash6 = 4 and hash8 = 3 (the hash8 drop coming from the environment change). However, making functions hash8 reuses one slot, so in the end we've reached hash6 = 0 and hash8 = 6.

@gabriel-barrett gabriel-barrett requested review from a team as code owners December 13, 2023 18:03
@arthurpaulino (Member) left a comment

First of all and most importantly, this is awesome 🎉 🔥.

I'm requesting changes because we need to decide what to do about the expected iterations in tests, and also because the functions that compute the number of iterations for fibonacci need to be adjusted (itcalc.rs included).

About expected iterations, I think they're going to slow us down every time we make improvements on our model. My thinking is that we should test for correctness while aiming for iteration optimality. Pretty much like what we do with the number of constraints on our circuit.

Following that line of thought, we could have a minimally representative computation meant specifically to show us the number of iterations we need to finish it. And all other tests (evaluation and proving) would be set free from the need to match a certain number of iterations precisely.

Of course, this is just my take. I'm favoring dev speed if we're already (minimally) convinced that our model computes what we believe it should.

src/lem/tests/eval_tests.rs (outdated review thread)
src/proof/tests/mod.rs (outdated review thread)
@arthurpaulino (Member)

If we no longer need the dummy var, we should get rid of
https://github.com/lurk-lab/lurk-rs/blob/381bc6260e4ad4d1ee11c640bfd9a9aafc04187b/src/state.rs#L254
Then close #9.

And I also think that this PR closes #8.

Comment on lines +993 to +996
match var.tag {
    Expr::Sym => {
        let fun: Expr::Fun = cons4(vars, body, env, foo);
        return (fun, env, cont, apply)
Member

I think this check is not necessary and we could go straight to

let fun: Expr::Fun = cons4(vars, body, env, foo);
return (fun, env, cont, apply)

It will save a few constraints by deferring the error, which will happen at some point if some variable is not a symbol.
I'm following the rule of thumb "allow the execution of correct Lurk programs to be proved faster at the cost of delaying the failure of incorrect Lurk programs".

Member Author

It will save some constraints, but the saving is too small to matter. The way it is now more closely follows current Lurk semantics. For instance, in both current Lurk and this branch, (lambda (x 0) x) does not fail, but (lambda (0) 0) and ((lambda (x 0) x) 0) fail on both. If I remove this line (and the similar one in apply_cont), these last two expressions would no longer fail.

Member

I don't see the practical benefit of (lambda (0 x) nil) failing earlier if (lambda (x 0) nil) doesn't, especially since this asymmetry has extra costs

Member Author

You can think of (lambda (x 0) nil) as (lambda (x) (lambda (0) nil)), which will reduce to <FUNCTION (x) (lambda (0) nil)>. The reason it doesn't fail is that functions delay execution. I think it doesn't matter too much, but I was trying to follow Lurk's current semantics. The extra cost is just a few constraints.

What do you think @porcuquine ?

Collaborator

I agree with @gabriel-barrett for two reasons, based on my understanding from offline discussion:

  • We should studiously avoid gratuitous changes to Lurk Alpha reduction behavior. We can make changes, but those need to be strongly justified as carrying their weight in terms of optimization or other benefit. Every change we make to fundamental reduction behavior is inherently negative — and the more time we spend revisiting basic semantic decisions, the less we have available for other purposes.
  • Apart from the general principle, the actual details of this case support Lurk Alpha's original decision. The Lurk reduction step should never create 'bad data' from 'good data'. In this case, a lambda list containing an illegal value is perfectly good Lurk data. It happens to be good Lurk data specifying a 'bad program' because it contains invalid syntax. If the Lurk reduction then creates a malformed function object (Fun) that should never exist, then the Lurk reduction is violating a language invariant.

Collaborator

I see.

Member Author

We can only fix this if we do argument checking, which will increase the number of iterations. The check costs 50 constraints and 20 auxiliaries, which is not too much. Here's what I propose: keep this check so that it is behaviourally equal to Alpha and doesn't break any of the tests (if I change it, one of the tests breaks). It also makes Lurk less weird.

Once we get a smaller circuit, these constraints might become relevant, so we can do a review of the whole circuit and decide whether or not to reduce checks to get fewer constraints. Is that fine?

Member

Now that I've expressed my points and we're all in the same boat, aware of what's happening, I'm cool with any path you guys pick.

Btw, I don't think that's exactly broken. It depends on how you look at it.

If we want to help users with early errors of this kind, this could be raised at parsing time (or interning time). And we could check all variables without any extra cost on the circuit. This check could also be performed for let and letrec, making the following fail before evaluation even starts:

(let ((a 1) (0 0)) nil)

Member Author

Yeah, I think this is a good idea. We should check this at parsing time

Collaborator

I actually disagree. 'read time' needs to not impose any syntax because that's not how Lisp works. I think it's fine to have a provable compilation step, after which the result is assumed to have been checked, though.

@porcuquine (Collaborator)

Just a quick note: if this PR changes the iteration count, then it should also update the fibonacci example (and any benchmarks that also depend on the constants displayed there): https://github.com/lurk-lab/lurk-rs/blob/main/examples/fibonacci.rs#L18-L22

We should probably extract the core of that example into a unit test so we know we're actually (correctly) using a valid fibonacci function when we perform the 'limit' trick.

@arthurpaulino previously approved these changes Dec 20, 2023

@arthurpaulino (Member) left a comment

LGTM!

Comment on lines +1769 to +1770
expect_eq(cs.aux().len(), expect!["8823"]);
expect_eq(cs.num_constraints(), expect!["10628"]);
Member

Faster + slimmer = tasty

@@ -18,7 +18,7 @@ fn fib_expr<F: LurkField>(store: &Store<F>) -> Ptr {
 // The env output in the `fib_frame`th frame of the above, infinite Fibonacci computation contains a binding of the
 // nth Fibonacci number to `a`.
 fn fib_frame(n: usize) -> usize {
-    11 + 16 * n
+    11 + 10 * n
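For context (a sketch of mine, not necessarily the exact fib_expr in examples/fibonacci.rs), the comment above refers to an unbounded, tail-recursive Fibonacci loop along these lines, which runs forever and is stopped by evaluating only up to a chosen frame limit:

(letrec ((next (lambda (a b)
                 (next b (+ a b)))))
  (next 0 1))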
Collaborator

Beautiful.

Collaborator

I think the fibonacci benchmark needs to be updated to reflect this change also.

@gabriel-barrett gabriel-barrett added this pull request to the merge queue Dec 20, 2023
Merged via the queue into main with commit 535af59 Dec 20, 2023
12 checks passed
@gabriel-barrett gabriel-barrett deleted the gb/new-app branch December 20, 2023 23:52
@huitseeker huitseeker mentioned this pull request Dec 21, 2023
Development

Successfully merging this pull request may close these issues.

Env seems to be lost
Rethink dummy var
Optimize function calls