Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mal: improve MAL implementation. #384

Closed
wants to merge 6 commits into from

Conversation

asarhaddon
Copy link
Contributor

Generally: remove variables used only once, introduce a variable when
the same result is computed twice.

core_rs: replace strings with symbols because it is more consistent with
the public interface of env.mal
bind-env:

  • compare with '& instead of "&", avoiding a conversion per iteration
  • compute first once, store the result
    env-find:
  • move if env at start of recursion, avoiding to get in a row.
    stepA_mal:
  • QUASIQUOTE: merge the macro with following evaluation. In other words,
    (newQUASIQUOTE ast env) = (EVAL (oldQUASIQUOTE ast) env)
    The result is more efficient, and arguably more readable.
    o- eval-ast, eval: one do is sufficient to debug, remove a duplicate.
  • LET: move form into the signature. The recursion does not change much,
    but the initial call is shorter and more intuitive.
  • EVAL:
    • remove initial (not (list? ast)) test, which was redundant
      (MACROEXPAND would do nothing).
    • replace (nil? (first ast)) with (empty? ast), more explicit
      (also, (nil 1) is now reported as incorrect
    • try*: stop checking that first component of optional argument is
      catch*. No other user input is checked explicitly.
  • repl-loop: a slight modification avoids to create a new environment
    depth for each new line.

@asarhaddon
Copy link
Contributor Author

The commit contains many unrelated changes, but splitting them would be a lot of work and would probably not help much the review.
I would like your opinion about the following ideas.

  • provide inc as example in step4, alongside with not (useful later for gensym)
  • provide gensym as an example of atom in step6. It is general enough not to fit well in any specific lib/* file, and the explanation in step8 could stay, though only informative.
  • in order to illustrate macros in step8, in addition to cond, I suggest
(defmacro! land [fn* [& xs] (if (empty? xs) true  `(if ~(first xs) (land ~~(rest xs)) false))))

cond uses (list 'if ..), so it makes sense that land illustrates quasiquoting.

Both make the MAL implementation more readable in several places, and their intent is probably easyer to grasp for people with a background in non-functional languages. cond illustrates well how macros can manipulate source code, while or seems a trick to shorten scripts in weakly-typed languages. It requires to understand the gensym trick, and can easily be replaced in the current MAL implementation. It belongs to lib/test_cascade.mal in my opinion.
If you agree with this suggestion, land could even be named and and replace the one in lib/test_cascade.mal. I understand that the or name is already taken by Clojure, and the related macro is useful to get the first successful result of a sequence of attempts. But and defined in lib/test_cascade.mal does not seem more useful than a macro returning a boolean.

@kanaka
Copy link
Owner

kanaka commented May 28, 2019

The changes generally look good. However, to test this I'm working on getting self-host testing working in travis. I haven't been running it in a while and I'm trying to get it a bit more automated this time. But before I make changes to the mal implementation itself I want to make sure they don't introduce new issues with self-hosting (I'm sure there are a bunch of issues that have crept in accidentally in the meantime).

@kanaka
Copy link
Owner

kanaka commented May 30, 2019

Okay, I finally got some of the Travis infrastructure cleaned up for testing self-host changes and so I've had a chance to do a more thorough review of these changes. Here is my overall feedback. I will add some inline comments as well.

Generally: remove variables used only once, introduce a variable when the same result is computed twice.

This does make the code more concise and because it is an interpreter a bit more efficient (less to process each time through). I think I'm fine with the cases where you did this. However, I often use this pattern quite intentionally (for my day job too) because it serves as a form of implicit documentation. There are some people who feel strongly that named variables in functions are a anti-pattern. I understand the argument to some degree, i.e. functions should be general algorithms that are free of context or externally imposed meaning. But long experience has told me that the code where this actually makes sense is rare and so the external meaning should be explicitly captured. It's not a hard and fast rule, but my rule of thumb is that if explicitly naming variables will help me to avoid multiple read-throughs to understand what is going on in the future then I do it.

core_rs: replace strings with symbols because it is more consistent with the public interface of env.mal

I agree.

bind-env:

  • compare with '& instead of "&", avoiding a conversion per iteration

Yep.

  • compute first once, store the result

Yep.

env-find:

  • move if env at start of recursion, avoiding to get in a row.

True enough.

stepA_mal:

  • QUASIQUOTE: merge the macro with following evaluation. In other words,
    (newQUASIQUOTE ast env) = (EVAL (oldQUASIQUOTE ast) env)
    The result is more efficient, and arguably more readable.

I agree that it is a bit more readable and is almost certainly more efficient. The main problem with this is that it is quite different from how most (maybe all) other implementations and also different from the pseudo-code in the guide. If you want to go this path, then I would want to see the guide and other implementations changed to match. If you go that route just be cognizant of the impact this may have on implementations that do their own GC/ref counting since any functions that take an environment and/or call EVAL tend to have GC/ref count subtleties that need to be taken into account.

  • eval-ast, eval: one do is sufficient to debug, remove a duplicate.

Actually the do that wraps around eval-ast and eval is extraneous and just adding overhead. I think at some point the do got copied from the debug prints and then I added another copy (the reason I have that there in the first place is to remind me that I need to the do if I add the debug print).

  • LET: move form into the signature. The recursion does not change much, but the initial call is shorter and more intuitive.

I like the change to the LET definition, although args should be changed to binds. It should probably have always been binds but becomes more confusing with the additional form argument. For the call to LET it should be explicit about the arguments. LET is not multi-arity but using apply with rest implies that it is (also apply is pretty inefficient in some implementations compared to direct calls). So something like this:

(LET (new-env env) (nth ast 1) (nth ast 2))
  • EVAL:
  • remove initial (not (list? ast)) test, which was redundant (MACROEXPAND would do nothing).

Many implementations can't rely on macroexpand doing this implicitly. So this is a bit out of step with common implementation code but I can live with it.

  • replace (nil? (first ast)) with (empty? ast), more explicit (also, (nil 1) is now reported as incorrect

Yep. Good catch.

  • try*: stop checking that first component of optional argument is catch*. No other user input is checked explicitly.

Other implementations are kind of mixed on whether they do this or not. I'm good with this being omitted.

  • repl-loop: a slight modification avoids to create a new environment depth for each new line.

Yep.

(if m
(if (get m "ismacro")
true)))))))))
(macro? (env-get env a0))))))))
Copy link
Owner

@kanaka kanaka May 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

macro? is not a required built-in for self-hosting so it can't be assumed to exist here.

(= 'defmacro! a0)
(let* [f (EVAL (nth ast 2) env)
m (meta f)]
(env-set env (nth ast 1) ^(assoc (if m m {}) "ismacro" true) f))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with a1 and a2 not being pulled out in the let* but I think mac still needs to be explicit. This line is doing a lot of heavy lifting and it's hard to parse for me let alone somebody unfamiliar with Lisp. I think with-meta is also clearer here than ^.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last commits should take all your answers into account.

A conflict in ps/stepA has been the opportunity to revert the changes you have rejected and avoid contradictions in consecutive commits. It also helped me to realize how much work a simple change like adding inc in step A can cause, so I will not insist on adding inc, gensym, land into the step files anymore.

About quasiquote: I realize that this issue deserves a separate pull request.

About macro?: this function is (re)defined in core.mal and the optional built-in is never used. It may be wise to prefix the MAL version _macro? as done for _fn?.

About and

  • the name could be logical-and, and2 (as in lib/equality.mal) or && (suggesting untyped parameters, lazy evaluation and boolean result to anyone knowing it). Avoiding confusion with Clojure's and is easy.
  • The diff above shows where (and ab) would be more readable than (if a b) or (cond (not a) false :else b). The intent of _fn? and macro? would be more visible without the conversion to boolean. The two is-pair, is-macro-call functions seem to only exist in order to make a conjunction readable. Do you think the increase in readability is worth an auxiliary and function in core.mal?
--- a/mal/core.mal                                                             
+++ b/mal/core.mal                                                             
@@ -1,14 +1,15 @@
+(defmacro! and (fn* [& xs]                                                    
+  (if (empty? xs)                                                             
+    true                                                                      
+    (list 'if (first xs) (cons 'and (rest xs)) false))))                      
+                                                                              
 (def! _fn? (fn* [x]                                                           
-  (if (fn? x)                                                                 
-    (not (get (meta x) "ismacro"))                                            
-    false)))                                                                  
+  (and (fn? x)                                                                
+       (not (get (meta x) "ismacro")))))                                      
                                                                               
 (def! macro? (fn* [x]                                                         
-  (if (fn? x)                                                                 
-    (if (get (meta x) "ismacro")                                              
-      true                                                                    
-      false)                                                                  
-    false)))                                                                  
+  (and (fn? x)                                                                
+       (get (meta x) "ismacro"))))                                            
                                                                               
 (def! core_ns                                                                 
   [['= =]                                                                     
--- a/mal/stepA_mal.mal                                                        
+++ b/mal/stepA_mal.mal                                                        
@@ -7,8 +7,8 @@
                                                                               
 ;; eval                                                                       
 (def! is-pair (fn* [x]                                                        
-  (if (sequential? x)                                                         
-    (not (empty? x)))))                                                       
+  (and (sequential? x)                                                        
+       (not (empty? x)))))                                                    
                                                                               
 (def! QUASIQUOTE (fn* [ast]                                                   
   (if (not (is-pair ast))
@@ -18,18 +18,21 @@
        (= 'unquote a0)                                                        
        (nth ast 1)                                                            
                                                                               
-       (if (is-pair a0) (= 'splice-unquote (first a0))) ; `if` means `and`    
+       (and (is-pair a0)                                                      
+            (= 'splice-unquote (first a0)))                                   
        (list 'concat (nth a0 1) (QUASIQUOTE (rest ast)))                      
                                                                               
        "else"                                                                 
        (list 'cons (QUASIQUOTE a0) (QUASIQUOTE (rest ast))))))))              
                                                                               
+                                                                              
+                                                                              
 (def! is-macro-call (fn* [ast env]                                            
-  (if (list? ast)                                                             
-    (let* [a0 (first ast)]                                                    
-      (if (symbol? a0)                                                        
-        (if (env-find env a0)                                                 
-          (macro? (env-get env a0))))))))                                     
+  (and (list? ast)                                                            
+       (let* [a0 (first ast)]                                                 
+         (and (symbol? a0)                                                    
+              (env-find env a0)                                               
+              (macro? (env-get env a0)))))))                                  
                                                                               
 (def! MACROEXPAND (fn* [ast env]                                              
   (if (is-macro-call ast env)                                                 

@kanaka
Copy link
Owner

kanaka commented May 30, 2019

Answers to your second set of questions. BTW, my assumption is that you are suggesting to make these changes to all implementations (I wouldn't want to do it just for the mal self-hosted implementations because that would diverge the mal implementation from the others):

  • provide inc as example in step4, alongside with not (useful later for gensym)

I think that would be a good improvement because regardless of the use in gensym, this would add a simple arithmetic function definition in addition to the boolean function not. However, note that this should really be added to the diagrams for steps4-A. Technically it's missing from stepA diagram right now but since this was really just used for gensym in that same step it wasn't super urgent. The diagrams are not automatically generated, it's a hand-drawn diagram in gliffy with multiple layers that are turned on and off for each (as I recall, it's been a while since I've had to update them).

  • provide gensym as an example of atom in step6. It is general enough not to fit well in any specific lib/* file, and the explanation in step8 could stay, though only informative.

I could be convinced of this I suppose although my inclination is against doing this at the moment. step 6 is a little bit heavy for some implementations (swap! can be a pretty big pain sometimes). I think it would be a reasonable way to demo atoms. On the other hand there are some downsides. In addition to requiring diagram updates, symbol is used for gensym and that is currently a step9 deferrable (even the mal impl currently doesn't need it until stepA for gensym). symbol fits pretty well with the other step9 deferrables but is a pretty big mismatch with the other goals of step 6.

  • in order to illustrate macros in step8, in addition to cond, I suggest
(defmacro! land [fn* [& xs] (if (empty? xs) true  `(if ~(first xs) (land ~~(rest xs)) false))))

cond uses (list 'if ..), so it makes sense that land illustrates quasiquoting.

Both make the MAL implementation more readable in several places, and their intent is probably easier to grasp for people with a background in non-functional languages. cond illustrates well how macros can manipulate source code, while or seems a trick to shorten scripts in weakly-typed languages. It requires to understand the gensym trick, and can easily be replaced in the current MAL implementation. It belongs to lib/test_cascade.mal in my opinion.
If you agree with this suggestion, land could even be named and and replace the one in lib/test_cascade.mal. I understand that the or name is already taken by Clojure, and the related macro is useful to get the first successful result of a sequence of attempts. But and defined in lib/test_cascade.mal does not seem more useful than a macro returning a boolean.

I think there are a couple of typos in your macro: [fn* -> (fn* and ~~ -> ~@. I definitely would support moving both gensym and and out of stepA into lib since your updated mal implementation shows that we can remove the need for them. I would not want to redefine and or or to be different from the Clojure behavior at this point. You noted that land would improve the mal implementation. Can you give some examples of that where cond wouldn't suffice?

@kanaka
Copy link
Owner

kanaka commented May 30, 2019

Ah, yes, I forgot that mal defined it's own macro? function. Yes, please rename it to _macro?. Please rebase this onto master so that I can test it with Travis self-host test fixes I pushed yesterday. Other than the _macro? change and doing a full test in Travis in self-host mode, I think this is almost ready to merge.

Regarding the and/land question. I'm kind of on the fence about whether the cleanup is worth it. The code is clearer in some cases (although is-macro-call? feels about the same to me). But I also think it's a plus that the mal implementation hasn't needed to define it's own macros (being able to define new language features is an awesome Lisp super-power, but I think not needing to do that is a plus). Also, the slow implementations are especially slow around macroexpansion. Your changes reduce the need for macros in some places, but this re-introduces them in others. Let's separate this into a separate pull request so that we can get this pull merged and so that I can performance test it independently on slow implementations and characterize the difference it makes (or not).

Finally functions ending in a number imply a arity rather than version, IMO. So I wouldn't go with and2. I'd probably prefer bool-and. And I didn't notice this before but lib/equality.mal has the same issue (or2 and and2 are multi-arity so they should probably be named bool-or and bool-and too).

@kanaka
Copy link
Owner

kanaka commented May 30, 2019

I see that you already did the rebase. The travis test shows weird failures for make, guile and scheme. Although there might be something weird going on because that test run is for a commit that's not actually on that branch: https://travis-ci.org/kanaka/mal/builds/539247295

@kanaka
Copy link
Owner

kanaka commented May 30, 2019

I cancelled the build and I'm going to restart just make, guile and the schemes to see more quickly if those problems exists still: https://travis-ci.org/kanaka/mal/builds/539323954 If they pass I'll restart the whole test.

@kanaka
Copy link
Owner

kanaka commented May 30, 2019

Problem still exists. Apparently guile and scheme have had issues with perf1 and perf2 from an earlier commit (but still passing overall because perf3 passes). But something about the most recent changes causes perf3 to fail in make, guile and scheme.

@asarhaddon
Copy link
Contributor Author

About bool-and and similar: the concerns that you raise about avoiding macros when possible seem sufficient to demotivate further work on this, with so little a benefit. Thanks for your answers.

@kanaka
Copy link
Owner

kanaka commented May 30, 2019

Looks like the guile and scheme perf1/perf2 failures were introduced by the lib refactor (4924fac has the issue, 3e9b89d does not have the issue). I'll work up a fix so that any perf failure will cause a Travis error (and not just perf3). I'm also attempting to track down where the more critical make, guile, scheme failure was introduced.

Fair enough about the bool-and changes. BTW, thank you for all the work you've been doing to clean and improve things. A lot of the rationale for why things are one way or another aren't explicitly documented anywhere and many of the changes you have proposed make a lot of sense in other contexts (or if the goals of mal were different than they are). So I'm hoping all my feedback is coming across positively and not discouraging you at all. Again, thanks for what you are doing.

@kanaka
Copy link
Owner

kanaka commented May 30, 2019

Looks like ed51300 "Remove gensym, inc and or from step files" is the culprit for the new failure because the prior commit (607401d "mal: improve MAL implementation") works like before (make works and perf3 for scheme/guile works).

@asarhaddon
Copy link
Contributor Author

make^perf1 fails for ed51300 "Remove gensym, inc and or from step files" but runs fine with its parent "mal: improve MAL implementation". So it may be a different issue than guile&scheme.
Thanks for providing such an opportunity to learn things. I have no problem with some ideas being refused for good reasons (though I prefer being the one explaining the reasons, of course :-].

@kanaka
Copy link
Owner

kanaka commented May 30, 2019

Heh, x-post. It certainly might be a different issue. Although guile and scheme have a new failure in perf3 in ed51300 so it's at least a correlation if not causation :-).

@kanaka
Copy link
Owner

kanaka commented May 30, 2019

Okay, I figure out the issue with make and you're right that it likely is unrelated. It's a problem parsing the URL in the comment for gensym (the '#' gets treated as a make comment during parsing). I don't have a clear immediate fix, so I suggest changing the comment to be this in the mean time:

;; Generate a hopefully unique symbol. See "Plugging the Leaks" at
;; http://www.gigamonkeys.com/book/macros-defining-your-own.html

@asarhaddon
Copy link
Contributor Author

Soft tests are ready for a separate PR once this one is merged.

@kanaka
Copy link
Owner

kanaka commented May 31, 2019

@asarhaddon okay, please rebase this on master to pull in travis self-host supporting changes I made. You'll also need to merge with the env changes @bjh21 made (basically change "--outer--" to :outer). I'll run a final test in self-host mode and then merge.

Generally: remove variables used only once with generic names,
introduce a variable when the same result is computed twice.

core_rs: replace strings with symbols because it is more consistent with
 the public interface of `env.mal`
bind-env:
- compare with '& instead of "&", avoiding a conversion per iteration
- compute `first` once, store the result
env-find:
- move `if env` at start of recursion, avoiding to `get` in a row.
stepA_mal:
- eval-ast, eval: remove duplicate `do`
- LET: move `form` into the signature. The recursion does not change much,
  but the initial call is shorter and more intuitive.
- EVAL:
  - remove initial (not (list? ast)) test, which was redundant
    (MACROEXPAND would do nothing).
  - replace (nil? (first ast)) with (empty? ast), more explicit
    (also, `(nil 1)` is now reported as incorrect
  - `try*`: stop checking that first component of optional argument is
    `catch*`. No other user input is checked explicitly.
- repl-loop: a slight modification avoids to create a new environment
  depth for each new line.
* Move `gensym` and `inc` from step files to `lib/trivial.mal`.
* Move `or` from step files to `lib/test_cascade.mal`.
  Shorten it because `(first ())` returns `nil`
* Update process and tests accordingly (not the figures yet).
Make: avoid # character.
Guile: avoid `unquote` inside a vector inside a list inside `quasiquote`.
The bug in scheme/ is most probably the same.
@kanaka
Copy link
Owner

kanaka commented Jun 3, 2019

Hmm, I wonder if there is some interaction between the environment keyword change and these changes because there seem to be some new self-host failures: https://travis-ci.org/kanaka/mal/builds/540861424 (still running as I write this). Testing both of them independently showed all green (apart from make, guile and scheme that is).

Actually, looking more closely, a number of these are failures in step2 tests on (abc 1 2 3). I do notice that you dropped exception reporting from step2_eval.mal. I think that explains a number of the failures. I can't explain why previous testing didn't show this, but I'm pretty sure that's the issue. Maybe some interaction between that and outer as a keyword?

@kanaka
Copy link
Owner

kanaka commented Jun 3, 2019

Looks like some of the others are macro failures in step8 tests. Perhaps the hacks to get guile and scheme working caused those problems (I think I only checked those with make, guile and scheme and not with all the implementations).

Unlike the one in `env.mal`, the `get` built-in used during step2
returns `nil`, so the MAL implementation must throw an error.
Let `with-meta f m` create a function even if f is a macro, instead of
setting an "ismacro" metadata that is never used again (and breaks
self-hosting).

Also move the test for fn? from Optional to Deferrable, the function
is used for self-hosting.
@asarhaddon
Copy link
Contributor Author

Several issues remain with self-hosting:

  • wasm, step8: macros have symptoms similar to guile and scheme.
  • powershell, step4: some built-in functions like println fail to return nil.
  • rexx, perf: crashes.
  • vala, step4: (prn core_ns) has the expected result at the end of mal/core.mal, but the same command in ma/step4.mal right after (load-file core.mal) prints an empty array and
GLib-GObject-CRITICAL **: 14:53:33.784: g_object_ref: assertion 'G_IS_OBJECT (object)' failed

@asarhaddon
Copy link
Contributor Author

The fix for guile seems to break scheme.
Last commit introduces a test that (fn? ^{"ismacro" true} (fn* () 0)) returns true.
The test would have been useful by revealing that guile and scheme define metadata interfering with the MAL implementation of macros, breaking self-hosting.
It also seem to reveal a similar error in BASIC.
However, it should be improved or removed because it fails for the MAL implementation itself.

@asarhaddon
Copy link
Contributor Author

The attachment below contains a suggestion that may work-around most issues. It seems difficult to implement macros without interfering with user code, but if we do it, we might as well avoid metadata and move all metada stuff to Optional.
This is only a separate file because I am reluctant to run all tests before you approve the idea, but I have checked it on two implementations.
macro-without-meta.diff.txt (.txt because .diff extension is not supported)

@asarhaddon
Copy link
Contributor Author

In order to ease review:

@asarhaddon
Copy link
Contributor Author

#400 now replaces this PR.

@asarhaddon asarhaddon closed this Jul 9, 2019
@asarhaddon asarhaddon deleted the improve-mal-impl branch July 9, 2019 12:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants