Level 2: Things you would know if you knew C/C++

Cross Cutting Concepts

Scopes has the following properties, which are discussed throughout this level and assumed as background from here on.

Scopes:

  1. supports multi-stage programming, where the compiler can stay online during execution,
  2. supports many zero-cost abstractions where possible (such as compile-time computations),
  3. is statically typed,
  4. is low level and concerned directly with memory layout and management,
  5. supports the C ABI directly, and
  6. has a borrow checker.

We will discuss everything but the borrow checker in detail in this level. Borrow checking and lifetime management, made popular by the Rust programming language, go beyond what is available in C/C++. We cannot avoid confronting the borrow checker when using C/C++-equivalent functionality in this level, but we avoid discussing how to manipulate it directly or the constructs meant to leverage it.

Instead we will describe its general behavior and the minimum amount of annotation and code needed to “make it happy” and avoid errors. This is worthwhile because it ultimately helps us make fewer errors, and the overhead on the code is rather minimal compared to writing C/C++ code.

Compilation & Code Generation

What is a Multi-Stage Compiler?

Perhaps the most important thing for framing Scopes’ behavior, and what most sets it apart from other languages (besides Common Lisp), is that the Scopes compiler is a “multi-stage” compiler. To understand this, let’s consider the more common types of compilation you may have experience with.

In languages like C/C++ you invoke the compiler once and emit your finished program, which can then be executed. No new code can be generated for the program. This is called an “ahead of time” (AOT) compiler.

Languages like Java are both AOT compiled (from source to bytecode) and “just in time” (JIT) compiled (from bytecode to machine code). This is a big topic with many nuances, but suffice it to say that the runtime engine waits to compile your code until it sees fit (typically only just before it actually needs to be run, hence the name). In this model the compiler stays online during execution but is not explicitly controlled by the programmer. Instead the runtime has extensive heuristics that try to determine the best time to compile code.

Dynamic languages cover every flavor in between, including those that run on a virtual machine without any compilation to machine code happening at all.

Scopes follows a third model in which the compiler stays online during execution (like a JIT compiler) but is explicitly controlled by the programmer. On the latter point it is actually closer to an AOT compiler, if you consider that your build system is itself a program, just one written in a different language. This is called a multi-stage compiler. There are very few multi-stage compilers in the wild, but they include heavyweights like Common Lisp.

All the types of compilers have their benefits and drawbacks.

AOT compilation can produce extremely lean and predictable executables that are suitable for constrained environments like microcontrollers or operating system kernels. The drawback of being AOT only, however, is that you lose a lot of dynamism and flexibility. This is why there are so many higher-level languages implemented on top of C/C++ that allow for more flexible execution of code without having to invoke a monolithic compiler like GCC. Furthermore, it’s very common to have a program with a plethora of compilation flags that determine run-time characteristics. If you want to turn a feature on or off you may have to recompile, which can be time consuming and difficult to do and deploy.
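
As a minimal sketch of that last point, here is how a feature toggle typically looks in C. The macro ENABLE_VERBOSE_LOGGING and both function names are hypothetical, chosen only for illustration. The point is that the choice is fixed by the preprocessor before the program ever runs.

```c
#include <stdio.h>

/* Hypothetical feature flag; it defaults to off unless the build
   passes -DENABLE_VERBOSE_LOGGING=1 to the compiler. */
#ifndef ENABLE_VERBOSE_LOGGING
#define ENABLE_VERBOSE_LOGGING 0
#endif

/* Returns 1 if the feature was compiled in, 0 otherwise. */
static int verbose_logging_enabled(void)
{
#if ENABLE_VERBOSE_LOGGING
    return 1;
#else
    return 0;
#endif
}

/* Prints msg only when the feature was compiled in; returns the
   number of messages actually printed (0 or 1). */
static int log_verbose(const char *msg)
{
    if (!verbose_logging_enabled())
        return 0;
    printf("verbose: %s\n", msg);
    return 1;
}
```

Switching the feature on means rebuilding, for example with cc -DENABLE_VERBOSE_LOGGING=1, and then redeploying the new binary.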

Relatedly, an AOT model typically assumes the existence of a “build system”. These are the Make/Autotools and CMakes of the world that everyone loves to hate, for good reason. They have the difficult job of orchestrating not-so-programmable monolithic compilers (like GCC). While not all build systems suck, most do: they are surprisingly hard to get right, and their complexity is easy to underestimate. They are also typically an afterthought to the actual code itself. Every language needs some kind of “build system”, but in languages with less complicated compilation needs it is mostly there for things like packaging.

JIT compilers (and interpreted languages in general), on the other hand, make code compilation completely transparent to the user. There is no build system because there isn’t really a compiler to invoke; the compiler lives entirely in the runtime. JIT compilers are very advanced pieces of technology that feel closer to magic than an AOT compiler does. You don’t know how or when it’s compiling your code, but most of the time it’s doing a pretty damn good job of it. JIT compilers are really good at squeezing performance for free out of code written in languages that don’t typically excel at low-level optimizations. Notable examples are the HotSpot VM for Java, V8 for JavaScript, and PyPy for Python. For many use cases they can make code in these languages run at speeds approaching hand-optimized C/C++.

Sounds too good to be true, and it is if you’re doing anything that has real-time constraints or must do manual memory management. The latter follows from the incremental compilation, which typically requires an automatic garbage collector (GC) in order to keep everything straight. While there are a few reasons not to use a GC, one of the main ones is that garbage collectors need to suspend the operation of your program in order to do their work. Your program cannot control this process, so if you have real-time constraints you cannot guarantee that you will meet them, because the GC may pause at any moment.

This feeds into the other problem with real-time and JIT compilers. As the name indicates, a JIT compiler compiles your code only when it is needed, and to do this it needs to suspend your program. Again, you cannot control this, and it invalidates any real-time constraints you may have. Furthermore, a JIT compiler may compile your code multiple times in different scenarios. For instance, if it thinks some code will only be run once it may not take the time to optimize it heavily, but if it thinks the code will be run a lot it may pause longer to do more optimizations. So not only may you miss real-time deadlines, you can’t even be sure your code will run as fast as you need it to. Compare that to the AOT case, in which optimization levels are explicitly controlled and their cost is paid up front, before your code ever executes.

The multi-stage compiler is kind of the “have your cake and eat it too” position. You get the flexibility of a JIT compiler (or interpreted language) and the predictability of an AOT compiler all in one package.

How does it reconcile these two modes of compilation? By organizing your code into stages and allowing explicit control of the compiler from within the language itself. It’s really a nice middle ground between the two extremes. In AOT you really only have one stage: all of your code, compiled into machine code at once. In a JIT compiler you have a multitudinous fractal of stages, with small to large pieces of code being compiled all over the place at unpredictable times. In a multi-stage program you have something in between.

Let’s make this a little more concrete before moving on. In Scopes, compilation can be triggered in a few different ways:

  1. When a module is imported
  2. When the run-stage function is run
  3. When “including” C code via the Clang bridge
  4. By running the various compile* commands:
    • compile, which returns a function pointer to the compiled code
    • compile-object, which compiles machine code for a target platform/architecture into object files
    • compile-glsl, which compiles to GLSL
    • compile-spirv, which compiles to SPIR-V

The “stage” in multi-stage programming is controlled by the run-stage symbol. This signals to the Scopes compiler something akin to a translation unit (TU) in C/C++. The code before a run-stage will be run/compiled and injected into the code of the next stage.

Here is a minimal example:

fn help (msg)
    print "Help!" msg

run-stage;

help "Me"

In this example there isn’t actually any reason to separate the code into two stages. Exactly when a new stage is needed is too difficult to explain at this point, so it will be pointed out as it comes up.

What’s important to take away from this is that when and where your code is compiled is very straightforward compared to a JIT compiler. Each stage is compiled in the order it is declared, and only once the previous stage has finished executing.

A multi-stage compiler has drawbacks as well. First, it inherits the large run-time size of a JIT compiler, since it needs the compiler in order to run the stages. Currently that means bringing LLVM along for the ride, which is quite hefty.

It’s worth noting that Scopes can replicate the behavior of an AOT compiler and generate objects that require no runtime or compiler other than a normal C runtime. However, this is a tricky and subtle topic, so we leave it for later. While it’s possible, it isn’t the most important use case, at least not yet.

Another potential perceived drawback is the complexity or unfamiliarity of a multi-stage compiler. Programmers are used to AOT compilers & build systems or interpreted languages. Using a multi-stage compiler brings in a host of new concepts and commands that must be learned before the language can be used effectively. Hopefully, for the curious looking for maximum expressivity and performance this will be no issue.

Compilation Steps & Code Generation

Now that we have broadly contextualized how the Scopes compiler operates on source code, it’s time to dive a little deeper into the specifics of the compilation itself. This aspect is also considerably different from other programming languages and is worth understanding at least at a surface level.

The following table summarizes the steps of compilation and execution of Scopes code:

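| Order | Step        | From                          | To                      | Macro System |
|-------+-------------+-------------------------------+-------------------------+--------------|
| 1     | Parsing     | Data Interchange Format (SLN) | S-Expression Tree       |              |
| 2     | Expansion   | S-Expression Tree             | Untyped Scopes AST      | Sugar        |
| 3     | Checking    | Untyped Scopes AST            | Typed Scopes IL         | Spice        |
| 4     | Translation | Typed Scopes IL               | LLVM IR / SPIR-V / GLSL |              |
| 5     | Execution   | LLVM IR / SPIR-V              | Program Output          |              |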

This table will be very important for understanding later concepts in metaprogramming and the two macro systems (sugars and spices, which we briefly met in Level 1).

However, for now it’s enough to look at steps 3 through 5.

In a scripting language like Python you would probably only have steps 1 and 5, and perhaps 4 depending on the runtime.

In a language with metaprogramming (i.e. macros) you add step 2, which allows for generating new code. C and C++ both have this, at different levels of sophistication.

For a statically typed language you add step 3, which does the type checking.

Step 4 is common to both Scopes and C/C++. The difference is that in Scopes it is controlled directly in the code rather than through a compiler and/or build system, as discussed in the last section.

Constant vs Dynamic Values

Constant compile-time values are the dual to dynamic run-time values in that constant values can be known before compilation to machine code.

The compile-time vs run-time distinction in Scopes is considerably different than in most purely AOT compiled languages like C/C++ and is a major feature of the language. This is due to the multi-stage aspect of the compiler.

Every stage has both a compile-time and a run-time, so calling something a “compile-time” computation without saying which stage you mean is tautological and useless.

However, the compile-time of one stage is the run-time of the next stage, so in a relative sense the distinction still carries some meaning.

Before we consider the implications of “constant” values on our code, let’s distinguish between constant and dynamic values.

Scopes provides a way to test for “constant-ness”, constant?, which we can try on some basic things.

print "i32?" (constant? 1)

let a = 1
print "let?" (constant? a)

local b = 1
print "local?" (constant? b)

global c = 1
print "global?" (constant? c)

So we can see that primitives like 1 and those assigned to symbols by let are constant but local and global defined variables are not.

Let’s try some other ones:

print "list?" (constant? '(a b))
print "symbol?" (constant? 'Hello)

print "cast value?" (constant? (1 as f32))

using import UTF-8
print "char?" (constant? (char32 "a"))

fn hello ()
    print "hello"

print "function?" (constant? hello)

print "string?" (constant? "hello")

print "rawstring?" (constant? ("hello" as rawstring))

import Map
print "module?" (constant? Map)

print "tuple?" (constant? (tupleof 1 2:f32 "hello"))

using import String
print "String?" (constant? (String "hello"))

All these are constant except a String object, which as we will see later is heap allocated.

But these are all somewhat static objects. What if we look at the inputs and outputs of functions? In this example the function is just the identity function, which simply returns its argument:

fn nop (thing)
    thing

let b = 3

print "b:" (constant? b)
print "nop b:" (constant? (nop b))

And we can see that a constant value passed through a function is no longer constant.

But why? Because functions are like little black boxes, the compiler cannot reason about what the output of the function will be.

But in reality we know that this function is so simple that the compiler really could figure out what it is doing (and probably eliminate it).

After all, what is the difference between calling our function and just presenting the value itself? Nothing.

In Scopes we have a way to express functions not as black boxes but rather as a way to copy-paste code around, so that the compiler is able to figure these things out. You might call these “white boxes”. The simplest is the inline declaration, which we introduced as an alternative syntax for defining functions.

Here is the above example rewritten with inline:

inline nop (thing)
    thing

let b = 3

print "b:" (constant? b)
print "nop b:" (constant? (nop b))

We can now see that the result is constant after passing through this inline “function”.

In reality what’s happening is that the body of the inline function is more or less copied into the call site. I.e. it would look like this:

inline nop (thing)
    thing

let b = 3

print "b:" (constant? b)

# print "nop b:" (constant? (nop b))
# replaced with
print "nop b:" (constant? b)

This makes it clear why the result is also constant.

You might recognize this as a kind of macro, like those of the C/C++ preprocessor, which can copy-paste code around. However, this macro is different in that it is hygienic, meaning that it still obeys scoping rules like a normal function. Let’s test this out:

local a = 1
local b = 2

inline increment (a)
    a + 1

print (increment b)
print a

Notice that the variable a in the outer scope did not get incremented.
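
For contrast, here is a minimal sketch of what the lack of hygiene looks like in the C preprocessor. The macro and function names are hypothetical, made up for this illustration. The macro body names a, so it captures whichever a happens to be in scope where it is expanded, while an ordinary function has its own a.

```c
/* Unhygienic: the body names `a`, so it grabs whichever `a` is in
   scope wherever the macro is expanded. */
#define INCREMENT_A() (a = a + 1)

/* A function gets its own parameter `a`; the caller's `a` is untouched. */
static int increment(int a)
{
    return a + 1;
}

/* Expands the macro next to a local `a` and returns that local. */
static int a_after_macro(void)
{
    int a = 1;
    INCREMENT_A();          /* silently modifies the local above */
    return a;
}

/* Calls the function with b and returns the untouched local `a`. */
static int a_after_function(void)
{
    int a = 1;
    int b = 2;
    (void)increment(b);     /* result discarded on purpose */
    return a;
}
```

Here a_after_macro returns 2 because the expansion reached into the surrounding scope, while a_after_function returns 1. A Scopes inline behaves like the function in this respect.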

We will have much more to say on macros and metaprogramming in higher levels but we will cover the topics that a C/C++ programmer might achieve through the use of the preprocessor in this level.

We only bring it up here because of its relevance to constant values. Macros tend to produce constant values because they are run and generate code before any actual “run-time” code is run. It’s as if you had written the code yourself in an editor.

What else can we do that is constant? A comprehensive answer is not given here but we can show a few things which are:

let a = 1
let b = 2

print "addition?" (constant? (b + a))
print "exponentiation?" (constant? (a ^ b))


let s0 = "hello"
let s1 = "world"

print "string concatenation?" (constant? (.. s0 s1))
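
For readers coming from C, the closest notion is the integer constant expression. The sketch below (C11, with names invented for illustration) shows the same split: a sum of enumerators can be used where the compiler demands a constant, while the same sum over function parameters is an ordinary run-time value. This is only an analogy, since C has a single compile-time rather than stages.

```c
#include <stddef.h>

/* Enumerators are integer constant expressions. */
enum { A = 1, B = 2 };

/* Checked entirely by the compiler; no code is emitted for it. */
_Static_assert(A + B == 3, "A + B is folded at compile time");

/* Array sizes at file scope must be constant expressions, so this
   only compiles because A + B is one. */
static int table[A + B];

/* Number of elements in table, computed with sizeof (also a constant). */
static size_t table_length(void)
{
    return sizeof table / sizeof table[0];
}

/* The same sum over function parameters is an ordinary run-time value:
   it could not be used as a file-scope array size. */
static int runtime_sum(int a, int b)
{
    return a + b;
}
```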

TODO: explain the “static” variants of everything.

  • static-if

Static Types

That Scopes is statically typed (like C/C++) is one of its biggest differences from a “dynamically typed” scripting language like Python.

Here we will show how to get information on types, use types, and how types relate to the run time, but later in “Type Definitions” we will cover how to create your own types as this is not essential to being able to use the language – although very useful.

Type Symbols & Type Information

First, it helps to be able to find out the types of values and which symbols correspond to types. By convention, complex types outside of the builtins are commonly CamelCased, but this is by no means guaranteed.

We have already covered one of these, typeof, but let’s recap:

print (typeof 'print)
print (typeof true)
print (typeof 1)

Some other useful functions related to the type are qualifiersof & storageof which will give you more information about a value including things like references etc.

using import String

print "let:"
let let-str = (String "hello")
print "    typeof:" (typeof let-str)
print "    qualifiersof:" (qualifiersof let-str)
print "    storageof:" (storageof let-str)

print "local:"
local local-str = (String "hello")
print "    typeof:" (typeof local-str)
print "    qualifiersof:" (qualifiersof local-str)
print "    storageof:" (storageof local-str)

print "global:"
global global-str = (String "hello")
print "    typeof:" (typeof global-str)
print "    qualifiersof:" (qualifiersof global-str)
print "    storageof:" (storageof global-str)

You will notice that the qualifiersof information is different for all of them.

The meaning behind all of these will be discussed later, as it has to do with the type system to some degree (and specifically the borrow checker), but for now we will stick to the common meaning of the word “type”.

Type Annotations

In this section we cover how to explicitly annotate initialization statements with types, similar to how you would in C/C++.

In Level 1 we were mostly able to ignore declaring types at all. This is because Scopes is able to infer types. Type inference is not unique to Scopes; other languages, particularly functional languages like OCaml, have it too.

However, mainstream statically typed languages have historically required you to declare your types almost everywhere, even where the compiler could have worked them out for you. C is the clearest example; C++ (with auto) and Rust (for local variables) do infer types, but only in limited places.

While this seems annoying, there is a utility to it: everything is annotated explicitly, so you don’t get confused when something gets inferred to a type you didn’t intend.

In Scopes you can choose to let the compiler infer types for you (when it can) or explicitly declare them.

Because the syntax is often optimized for automatic type inference, explicit type declarations are typically available as optional extra syntax.

Here is an example of how to annotate the type for a variable declaration:

let int : i32 = 0:i32
print int

You can also declare the variable and type without initializing it:

local count : i32

print count
print (typeof count)
count = 4
print count
print (typeof count)

TODO there are other syntaxes as well which should be covered.

Function Types

Function types are a bit more complicated. We can annotate them easily enough with the returning syntax:

fn nop-num (num)
    returning (_: i32)

    num

print (nop-num 0)

print (typeof nop-num)

First let’s explain some of the syntax. returning is a builtin, _: is a special type called Arguments, and the rest of the values in the list are the types of the values you will be returning.

Then notice that for a function, instead of getting the full type signature, we only get the symbol Closure, which indicates that it is a compile-time closure.

To get the concrete type signature we need to use the tool static-typify:

fn nop-num (num)
    returning (_: i32)

    return num

let func = (static-typify nop-num i32)

print "Closure result" (nop-num 0)
print "static-typified result" (func 0)
print ""
print "function:" func
print "function type:" (typeof func)

# TODO: you can also cast functions directly
# nop-num as (@ (function))

The arguments to static-typify are the function template Closure and the types of the arguments you want to instantiate it as.

The return value of static-typify is itself a function that you can call, as shown in the example.

However the type signature of the function is now known and made explicit: (opaque@ (i32 <-: (i32))) which means “an opaque pointer to a function that accepts an i32 and returns an i32”.

(Pointers and opaque pointers will be discussed later.)
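
For comparison, a C programmer would write “pointer to a function that accepts an i32 and returns an i32” as int32_t (*)(int32_t). The sketch below uses hypothetical names (nop_num, i32_fn, call_i32_fn) to show that type in use; it is the C counterpart of the idea, not a description of what Scopes generates.

```c
#include <stdint.h>

/* Counterpart of nop-num instantiated for i32. */
static int32_t nop_num(int32_t num)
{
    return num;
}

/* "Pointer to a function that accepts an int32_t and returns an int32_t". */
typedef int32_t (*i32_fn)(int32_t);

/* Calls whatever function the pointer refers to. */
static int32_t call_i32_fn(i32_fn fn, int32_t value)
{
    return fn(value);
}
```

A call such as call_i32_fn(nop_num, 7) passes the function by address, just as the value returned by static-typify can be passed around and called.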

Here we have only introduced the basics of how functions are typed. There is a lot more to say about function templates and their uses, which will be discussed later.

Error Types

Because of Scopes’ special handling of errors by “patching” the return types, we can also statically define the types of errors which can be raised, using the raising syntax:

fn raise-error ()
    raising Error

    if true
        error "error"

    else
        "no error"

print (static-typify raise-error)

We can see now a much more complicated type signature (formatted for clarity):

(%1: fn raise-error () : (opaque@ (string <-: () raises Error)) 
(%2: branch #unnamed : string (...)) (return %2)):(opaque@ (string <-: () raises Error))

Low Level Memory Management & Layout

Pointers

In normal usage of Scopes you will be much less concerned with pointers than you would be writing C/C++ code.

However, they will still come up, either when you are explicitly managing memory while creating a data structure on the heap or when interfacing with C/C++ code that returns pointers as part of its API.

Normally you would receive pointers either from C code or memory allocation. We will discuss memory allocation later so to start this off we need to obtain a pointer to observe.

To do this we will obtain the pointer to a local value using the & operator.

local a = 3
print (& a)
print (typeof (& a))
print (qualifiersof (& a))
print (storageof (& a))

The representation of the value should look something like this: $rironedapoxiken:(mutable@ (storage = 'Function) i32)

Or if you execute on the REPL: $rironedapoxiken:(mutable@ (storage = Private) i32)

Looking at the typeof and storageof portion of this we can see that the type is mutable@ indicating it is a pointer, and specifically a “mutable” pointer.

The (storage = 'Function) indicates the “storage class” where 'Function means it is stack storage and, as we will see, 'Private means global memory.

This is intended to support code generation to targets like SPIR-V primarily.

And at the right-hand-side we see i32 indicating that the value being pointed to is an i32 integer.

Also notice that we needed a local value. Because let bindings are immutable, we cannot take a pointer to their memory, since that would imply that we could edit it.

That means we can also get pointers from global values:

global b = 4
print &b

Here we also see the alternative syntax for getting the pointer to a value: &b.

The representation should be something like this: Global$genaroked:(mutable@ (storage = 'Private) i32).

Very similar except that it has a “Global” indicator at the beginning and is the 'Private storage class.

We mentioned that the pointers we were dealing with above were “mutable” pointers. You can check if something is mutable using the mutable? keyword:

local a = 1
global b = 2

print "a mutable?:" (mutable? &a)
print "b mutable?:" (mutable? &b)

You can express the types of pointers in a couple ways for both immutable constant and mutable pointers:

# these are equivalent
let p = (pointer i32)
let ap = (@ i32)

let mp = (mutable pointer i32)

print "p:" (typeof p) p
print "  mutable?:" (mutable? p)

print "ap:" (typeof ap) ap
print "  mutable?:" (mutable? ap)

print "mp:" (typeof mp) mp
print "  mutable?:" (mutable? mp)

Alternatively there are other ways to write a mutable pointer:

print (mutable@ i32)
print (mutable (@ i32))
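
The nearest C analogue is the difference between a pointer to const and a plain pointer. The correspondence is loose, and the function names below are hypothetical, but the sketch shows the same idea: whether writes through the pointer are allowed is part of the pointer’s type.

```c
#include <stdint.h>

/* Read-only view, roughly what (@ i32) expresses: the pointee may be
   read but not written through this pointer. */
static int32_t read_through(const int32_t *p)
{
    return *p;
}

/* Writable view, roughly what (mutable@ i32) expresses. */
static void write_through(int32_t *p, int32_t value)
{
    *p = value;
}
```

Adding the line *p = 0; to read_through is rejected by the compiler, which is the kind of check the mutable qualifier gives you in Scopes.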

And finally there is a third kind of pointer which is the “opaque” pointer. This indicates that there is no storage for the type.

Here is an example:

print (pointer (type T))

let opaque-ptr = (pointer (type V))
print (opaque-ptr)

Opaque pointers are common when dealing with function pointers as we have seen previously with static-typify.

Getting Pointers

Null Pointer

The simplest pointer that we can conjure out of thin air is the so-called “null” pointer which points to nothing.

let nul-ptr = (nullof (@ i32))
print nul-ptr

let nul-ptr = (nullof (mutable@ i32))
print nul-ptr

Null pointers are equal to the special value null:

let nul-ptr = (nullof (@ i32))

if (nul-ptr == null)
    print "nul-ptr is null"
else
    print "nul-ptr is not null"

local a = 1

if (&a == null)
    print "&a is null"
else
    print "&a is not null"

You can cast a pointer to its address, which is a u64. We can see that the address of the null pointer is 0.

print ((nullof (@ i32)) as u64)

For other values we can see different numbers:

local a = 3

print (&a as u64)
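
The same two observations can be sketched in C, where the integer type meant for holding addresses is uintptr_t. The helper names are hypothetical. Note that the C standard leaves the pointer-to-integer mapping implementation-defined; the null pointer converting to 0 is what mainstream platforms do, not a guarantee of the language.

```c
#include <stddef.h>
#include <stdint.h>

/* Converts a pointer to an unsigned integer wide enough to hold it. */
static uintptr_t address_of(const void *p)
{
    return (uintptr_t)p;
}

/* Returns 1 when p is the null pointer, 0 otherwise. */
static int is_null(const void *p)
{
    return p == NULL;
}
```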

Stack & Global Pointers

Above we showed how to get pointers from local (stack created with alloca) and global (data segment) allocated values. Because these are not heap allocated they do have some extra rules surrounding their use.

Global pointers can be accessed anywhere since the values are in the data segment of the program:

fn test ()
    global b = 2
    print (deref (@ &b))
    &b

local b = (test)

print (@ b)

However, local variables live on the stack, so if you return a pointer to a local, the stack frame it was allocated in is gone by the time the caller receives the pointer, and the pointer is invalid outside that scope:

fn test ()
    local b = 2
    print (@ &b)
    &b

let b = (test)

print b
print (@ b)
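
C has exactly the same rule. In the sketch below (hypothetical function names), returning the address of a static local is fine because it lives in the data segment, while the stack version is left as a comment because using its result would be undefined behavior.

```c
/* Valid: a static local lives in the data segment for the whole
   program, like a Scopes global declared inside a function. */
static int *pointer_to_static(void)
{
    static int b = 2;
    return &b;
}

/* Invalid, shown only as a comment: b lives in this call's stack
   frame, which is gone by the time the caller sees the pointer.
   Dereferencing the result would be undefined behavior.

   int *pointer_to_local(void)
   {
       int b = 2;
       return &b;
   }
*/
```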

References

TODO

Then there are reference types which are different from pointer types:

print (& i32)
print (mutable (& i32))

Major Constructs & Routines

Tuples

In Level 1 we saw how to dynamically define tuples with tupleof. You can also declare the type in full first before instantiating.

print
    ((tuple i32 f32) 0:i32 1:f32)

let tup-type = (tuple (a = i32) (b = u64))

print (tup-type (a = 0) (b = 1:u64))

Strings

There are two common types of strings in Scopes, which is necessary for C compatibility. This might be simplified in the future, but nonetheless it’s useful to understand the difference between them.

Scopes Strings

The vanilla string in Scopes is the type string. This is what you get from the primitive form.

let digits = "0123456789"

print (typeof digits)

You can retrieve elements (characters) from this string. Each element is the integer value of the char (i8) it encodes.

let digits = "0123456789"

print (typeof (digits @ 1))
print (digits @ 1)

The other kind of string is similar to the Array type previously discussed. It is allocated on the heap and can grow in size.

It is provided in the standard library module String:

using import String

let str = (String "Hello")
let str = ("Hello" as String)

print (typeof str)

You will notice that the type of String is only <GrowingString i8>.

In the future there may be support for similar constructs like FixedString and parametric types.

It has similar methods as Array:

using import String

local str = (String "Hello")

print ('capacity str)
'append str " there sir"
print ('capacity str)

print str

C-like Strings

These are null-terminated strings that are compatible with C-strings.

They can be constructed using the rawstring type.

let cstring = ("hello" as rawstring)

print (typeof cstring)

In keeping with how strings are implemented in C, this is really just a pointer to an array of characters (i8) as we can see from the above type.
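
As a reminder of what that means on the C side, here is a minimal sketch (with hypothetical names) of how the length of such a string is found. Nothing stores the length; it is recovered by scanning for the terminating zero byte, which is what the standard strlen does.

```c
#include <stddef.h>

/* Counts characters up to, but not including, the terminating '\0'.
   This is what strlen does: the length is not stored anywhere. */
static size_t length_by_scanning(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* The literal "hello" occupies 6 bytes: 5 characters plus the '\0'. */
static const char greeting[] = "hello";
```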

Putting Them Together

This is perhaps the biggest “wart” in Scopes that most users will encounter and it is there for a good reason: compatibility with C.

Hating on C strings is a very common thing to do, but because maintaining a 1:1 correspondence with C is a very high priority it must be dealt with. Thankfully Scopes provides some great tools for working with this complication.

Also, you may not actually have to deal with rawstring very often in your code. Only in the places where you interface with C code will it be a problem.

In practice you can cast rawstring to the appropriate Scopes type and move on.

Here are some other notes on converting between the string types.

When declaring a string literal, because it is constant, a cast via as is zero-cost and the string type for the literal is never instantiated.

E.g.:

"hello" as rawstring

using import String
"hello" as String

You can also convert a String to a rawstring easily:

using import String

("hello" as String) as rawstring

However to convert a rawstring to a String you will need to construct it directly.

using import String

let rstr = ("hello" as rawstring)

let str = (String rstr 5)

# or get the length dynamically using the string C lib
import C.string
let str = (String rstr (C.string.strlen rstr))

The last conversion you might need fairly often (especially when interacting with the C standard library) is from an array of char to a string.

This can be done as follows:

using import String

# Must be local because we need a pointer to it
local arr = (arrayof i8 0 1 2 3)

# pass a pointer to the array and the length of the array
let str = (String (& arr) (countof arr))

Encodings & Conversion

In addition to converting between the string types you will also at some point need to deal with encodings and converting between arrays of bytes/ints and strings.

For this there is the UTF-8 module in the standard library, from which we have already used the char32 function.

The encoding (ints to string) and decoding (string to ints) functions are currently only implemented as generators, which, while very useful, are a little cumbersome if you aren’t familiar with generators and collectors yet.

So we suggest simply making a wrapper function that does the conversion for you without exposing generators:

using import itertools
let utf = (import UTF-8)

fn utf8-encode (arr)
    ->>
        arr
        utf.encoder
        string.collector ((countof arr) * (sizeof i32))

local decoded-string = (arrayof i32 63:i32 97:i32)
print (utf8-encode decoded-string)

# single character encode
fn char-encode (ch)
    local arr = (arrayof i32 ch)
    (utf8-encode arr)

print (char-encode 63:i32)

TODO make the decoder since there is no default collector for arrays.

Structs

Structs are a similar construction as in C/C++, however they are different in that they aren’t a concept built into the core language and instead are provided in the standard library.

Here is an example of defining a struct type:

using import struct

struct Example
    value : i32
    choice = false
    text : string = ""

First we import the symbols in the struct module (i.e. struct) and then we define the fields.

Fields can be declared in 3 ways:

  1. with only a type (the value must then be provided upon construction)
  2. with only a default value, from which the type will be inferred
  3. with both a type and a default value, which must match

In the syntax used above there will be a new symbol defined as Example.

using import struct

let Example =
    struct
        value : i32
        choice = false
        text : string = ""

# 1. C-ish looking declaration
local example : Example
    value = 100
    text = "test"

# 2. Assignment "scopes-style"
local example =
    Example
        value = 100
        text = "test"

print example.value
print example.text

Just to emphasize that we are still in Scopes and that you can still use all the parens you want to make them:

using import struct

struct thing
    what : string

let t = (thing (what = "test"))

print t.what

Fields can be passed positionally, in declaration order, or by name:

using import struct

struct thing
    what : string
    size : u32

let t =
    thing
        "Other thing"
        1:u32

print t.what
print t.size

let d =
    thing
        size = 1:u32
        what = "Other thing"

print d.what
print d.size

Arrays

C-style Arrays

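First we must talk about C-style arrays, whose size is part of their type. You can construct one by calling an explicit array type:

let arr = ((array f32 2) 0 1)
print arr

Or use arrayof, which takes the element type followed by the elements:

let arr = (arrayof f32 0 1 2 3)

print arr

Elements are accessed with the @ operator:

let arr = (arrayof f32 0 1 2 3)

print (arr @ 1)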

Array of structs

using import struct

struct Dog
    name : string
    bark : string = "woof"
    height : f32

let d0 =
    Dog
        "Fido"
        "Bow! Wow!"
        43

let d1 =
    Dog
        "Max"
        "Wong! Wong!"
        56

# array type can't be accessed with dynamically generated indices
# (like from the for loop below) because you could easily go beyond
# the bounds of the array
local dogs = (arrayof Dog d0 d1)

for idx in (range 2)
    # access the struct members of the array elements
    print ((dogs @ idx) . name) "says" ((dogs @ idx) . bark)

for idx in (range 2)
    # assign to the struct members of the array elements
    ((dogs @ idx) . name) = "George"

    print ((dogs @ idx) . name)


local dog-arr = (array Dog 2)
for i in (range 2)
    (dog-arr @ i) =
        Dog
            "Max"
            "Wong! Wong!"
            56

Arrays

using import Array

# Fixed size array
local arr = ((Array i32 10))
print (typeof arr)

# Growing array (e.g. C++ vector)
local arr = ((Array i32))
print (typeof arr)

# You can explicitly use GrowingArray or FixedArray types
local garr = ((GrowingArray i32))
local farr = ((FixedArray i32 10))

# add a value to the array
let element = ('append arr 0)

print element

print (countof arr)
print (arr @ 0)

# assign to a particular location
arr @ 0 = 2
print (arr @ 0)

# TODO
# insert values in between
# 'append arr 4
# 'insert arr 1 3

print "last:" ('last arr)
print "pop:" ('pop arr)

# WARNING: segfault, no last element
# print "last:" ('last arr)

# remove
'append arr 0
'append arr 1

'remove arr 0

print arr

# you can swap values
print "Before Swap"

'append arr 0
print (arr @ 0)
print (arr @ 1)

print "After Swap"
'swap arr 0 1

print (arr @ 0)
print (arr @ 1)

# # reverse
# print "reverse"

# arr = ('reverse arr)
# print (arr @ 0)
# print (arr @ 1)

# # sort
# print "sort"
# 'sort arr
# print (arr @ 0)
# print (arr @ 1)


# remove all values in the array
'clear arr
print (countof arr)

# WARNING: segfault if you try to access values that aren't there
#
# arr @ 0

# get the capacity of the array, when this is exceeded it will be
# expanded
print "capacity:" ('capacity arr)

# add capacity + 1 elements
for i in (range 5)
    'append arr i

# capacity is expanded
print "capacity:" ('capacity arr)

# again
for i in (range 6)
    'append arr (i + 5)
print "capacity:" ('capacity arr)

# etc.

# fixed arrays have the capacity you give them
local arr = ((Array i32 10))

print "capacity:" ('capacity arr)


# You can use 'resize' or 'reserve' to force a particular capacity

# resize will initialize the elements
print "resize"

local arr = ((Array i32))

print "capacity:" ('capacity arr)
print "countof:" (countof arr)
'resize arr 10
print "capacity:" ('capacity arr)
print "countof:" (countof arr)

print (arr @ 0)

# reserve will not initialize the elements

print "reserve"
local arr = ((Array i32))

print "capacity:" ('capacity arr)
print "countof:" (countof arr)
'reserve arr 10
print "capacity:" ('capacity arr)
print "countof:" (countof arr)

# WARNING: segfault, not initialized
# print (arr @ 0)


# casting to generators

You can also construct arrays with initial values:

using import Array

let things = ((Array string) "a" "b" "c")

let numbers =
    (Array f32)
        4.0
        3.0

print (numbers @ 0)

using import struct

struct Dog plain
    name : string
    bark : string = "woof"
    height : f32

let dogs =
    (Array Dog)
        Dog
            "Fido"
            "Bow! Wow!"
            43
        Dog
            "Max"
            "Wong! Wong!"
            56


print ((dogs @ 0) . name)

Some Examples

Looping Over Arrays

Arrays can be cast to generators implicitly so we can loop over them directly:

using import Array

let things = ((Array string) "a" "b" "c")

for thing in things
    print thing

A common pattern in programming languages is to loop over a range of values with an index.

In “C-style” you would use a for-loop with an increment counter and then access the data from the array you want to iterate over.

In Scopes you can do this if everything is known statically (i.e. is constant):

using import Array

let array_size = 3

let things = ((Array string array_size) "a" "b" "c")

# print (things @ 0)

loop (idx = 0)

    if (idx < array_size)

        print (.. (tostring idx) ": " (things @ idx))

        repeat (idx + 1)
    else
        break idx

;

If you don’t know the length of the array you can do something like this:

using import Array

local things = ((Array string) "a" "b" "c")

for idx in (range (countof things))
    print (.. (tostring idx) ": " (things @ idx))

However, here the loop can potentially go out of bounds because the range is not constant and is computed at run time. For example, if it was (range 4) you would get a segfault.

Notice also that to make this work we needed to make things a mutable variable with local.

So this isn’t really a recommended way to do things.
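If you do need to index with a value that is only known at run time, one option is to compare it against countof first. The following is a minimal sketch rather than an established idiom; the index value is a hypothetical one chosen to be one past the end, and it is written as a usize literal on the assumption that countof returns a usize:

```scopes
using import Array

local things = ((Array string) "a" "b" "c")

# hypothetical index, deliberately one past the last element
let idx = 3:usize

# only touch the array when the index is within the current element count
if (idx < (countof things))
    print (things @ idx)
else
    print "index" idx "is out of bounds"
```

This keeps the access itself unchanged and only adds the guard, which is the same check you would write by hand in C.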

Similar to how you would do this in Python you can use the zip generator from itertools:

using import itertools
using import Array

let things = ((Array string) "a" "b" "c")

for idx thing in (zip (range (countof things)) things)
    print (.. (tostring idx) ": " thing)

Scopes (not the language)

TODO

VarArgs

TODO

Function Templates

Option

Enum and Union Types

We have already introduced Enums but there are some more details that are relevant to this level.

First is that there is a “plain” version of enums, which is quite different in terms of API. This plain version primarily exists to interface with C code.

using import enum

enum Things1
    A
    B

enum Things2 plain
    A
    B

print Things2.A
print (typeof Things2.A)

print "superof Things1:" (superof Things1)
print "superof Things2:" (superof Things2)

# INVALID
# print (Things1.A == 0)
print (Things1.A.Literal == 0)

print (Things2.A == 0)

From this we see that the supertypes differ: the plain enum is a subtype of CEnum instead of Enum.

Second we see that we don’t need to explicitly get the Literal value for comparisons to work.

Enums for Polymorphic Exceptions

using import enum

enum Errors
    A
    B

try
    if true
        raise Errors.A
    else
        raise Errors.B
except (e)
    print e

Interacting with C Code

Scopes was specifically designed to be ABI compatible with C and has extensive support for interoperating with it.

You may actually find that it’s easier to run C code from inside Scopes, no joke!

Additionally it is possible to compile Scopes code as an object file which can be used from C code.

Including C Functions

If you have a newer version of Scopes the following modules are already included in the standard library since they are commonly used:

import C.stdio
import C.string
import C.stdlib
import C.socket

For the other libraries you will need to manually include them. This is a fairly common thing to do in Scopes for interfacing with external libraries. The C standard library functions are easy to work with in the tutorial since they are available on all systems, but the method generalizes and we will see how to do this later.

To load the header for a non-builtin C standard library you will need to use the include function which returns a Scope.

let stdio = (include "stdio.h")

Then you will need to access the actual symbols from this Scope.

The following sub-scopes are available for the different kinds of symbols in a C file:

  • enum
  • union
  • extern
  • typedef
  • const
  • define
  • struct

You can list the sub-scopes that a header provides by iterating over the returned Scope:

let stdio = (include "stdio.h")
for k v in stdio (print k)

A common pattern is to dump these all into the same scope for easier access:

TODO: for now it just prints all of them.

fn print_header_symbols (header-scope)
    # TODO: convert to a merge algorithm

    for symbol-key symbol-scope in header-scope

        # we must "unbox" the value since 'header-scope' is a Scope and its values
        # are "boxed", meaning they can have any type. To unbox is to tell
        # the compiler "hey, this is of this type" since we know this is true
        let symbol-scope = (symbol-scope as Scope)

        loop (sub-key sub-value idx = ('next symbol-scope -1))
            if (idx == -1)
                break;

            print symbol-key ":" sub-key ":" sub-value
            # 'bind merge-scope sub-key sub-value

            'next symbol-scope idx


let header = (print_header_symbols (include "stdio.h"))

Other Libraries

TODO: update to a simpler example and explain shared library and search paths etc.

This works for standard libraries. But what about vendored libraries?

Here is a minimal example for loading GLFW, a cross-platform windowing library:

let glfw =
    include
        "GLFW/glfw3.h"
        options
            # "-v"
            .. "-I" module-dir "/../_guix/dev/dev/include"

let glfw-lib-path = (.. module-dir "/../_guix/dev/dev/lib/libglfw.so")

load-library glfw-lib-path

run-stage;
glfw.extern.glfwInit;

Here we have to add some options to the include function for the path to search for include files. These options correspond to what the clang compiler would expect from the command line.

In this example we installed the packages using the Guix package manager into the _guix/dev/dev directory.

module-dir gives the current directory of the module that is executing and doesn’t include a trailing slash.

Calling Scopes Code from C

TODO

References, Ownership, and Storage

TODO

See: https://gist.github.com/radgeRayden/67b654b5bb8f3227749b5dd7a577ec4d

defer

defer is not really a feature of C++; it is better known from Go. It postpones an expression until the enclosing scope is exited:

defer print "end of module"

let name = "Bob"

defer print (.. "Goodbye " name "!")
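The same construct can be sketched inside a function body. This is a hedged illustration that assumes defer behaves in a function scope the way it does at module level above; greet is a hypothetical function name:

```scopes
fn greet (name)
    # registered first, but expected to run only when the function body exits
    defer print "leaving greet"
    print "Hello" name

greet "Bob"
```

If the assumption holds, the greeting is printed before the deferred message, which is what makes defer convenient for cleanup code that should sit next to the setup code it belongs to.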

Creating Your Own Types

TODO: write introduction

Creating & Extending Types

Types are typically created using the type (or the older but still usable typedef) syntax.

The simplest usage is just to define a type:

type NewType

print NewType
print (typeof NewType)

In addition to creating completely new types you can subtype others.

For instance you can make a subtype of integers using the integer supertype:

type NewInt < integer

print (typeof NewInt)

There is a use for standalone types like this, but they are limited and not likely the first thing you would be interested in. Typically you will want them to be instantiated and contain some data.

In order to actually use instantiated types to hold data we can specify a “storage” type.

The storage type of a type or value can be retrieved with the storageof function:

print "storageof i32" (storageof i32)
print "storageof f32" (storageof f32)

# INVALID for abstract types
# (storageof integer)
# (storageof real)

The reason for this is that the logical types really only exist at compile time and are just there to help us avoid mistakes. So to actually have some data we need a storage type that corresponds to actual bytes on the machine. We will discuss this in more detail later, but here is a simplistic example:

type MyInt : i32

print (storageof MyInt)

You can always cast a type to the storage type associated with it using storagecast:

type MyInt : i32

print (storagecast MyInt)

This definition will create a so-called “plain” type. We will discuss plain vs “unique” types later but briefly the unique variants of these type definitions are written with :: rather than :.

type MyUniqueInt :: i32

print (storageof MyUniqueInt)

We still can’t actually instantiate/construct a value for this type because we have no constructor. If you attempt to you will get an error indicating this.

We can write constructors (as we will see below) but a very common pattern will be to simply subtype from a type that already has a constructor. Subtyping is achieved by an additional < section in the type syntax:

type MyInt < integer : i32
print (storageof MyInt)

let i = (MyInt 2)

print i

print (typeof i)
print (storageof i)

print (storagecast i)

The integer type is the supertype of all the concrete integer types (e.g. i32, u8, etc.). You can find the supertypes of existing types with the superof function:

print "superof i32" (superof i32)
print "superof f32" (superof f32)

Using this knowledge we can write our type definition more simply as:

type MyInt <: i32

print (storageof MyInt)

let i = (MyInt 2)

print i

print (typeof i)
print (storageof i)

Here the new type is a subtype of (superof i32) and its storage is (storageof i32).

The same pattern holds for unique types:

type UniqueInt1 < integer :: i32

type UniqueInt2 <:: i32

Note that simple type aliases can be achieved with let alone:

let IntAlias = i32

print (IntAlias 3)

Subtyping Struct Types

Oftentimes we want to define a struct with methods etc. We can do this by subtyping from struct. However with the normal type syntax we cannot list the fields of the struct easily. So the struct syntax provides a subtyping syntax that allows us to define a new type that is backed by a struct of your chosen fields.

using import struct
using import String

type Animal < Struct

struct Dog < Animal
    color : String
    bark : String
    height : f32
    length : f32

let dog =
    Dog
        "yellow"
        "woof woof"
        34
        89
;

Methods

Similar to C++ classes and (what is called) Object-Oriented Programming, types can have methods defined on them.

type MyInt <: i32

    fn yell (self)
        print "hello"

let i = (MyInt 2)

'yell i

In this example we add a method to our new subtype that simply prints something.

We can see that the Python convention of explicitly writing out the self argument to the function is followed. The self variable can be used to access data from the instantiated object:

type MyInt <: i32

    fn yell (self)
        print "hello" self

let i = (MyInt 2)


'yell i

In this case, since MyInt is a direct subtype of an integer, we can use self directly, and its representation is close to the builtin one for an integer type.

We can also declare class methods by simply calling a method on a class without instantiating it:

type MyInt <: i32

    inline yell (cls)
        print "hello" cls

'yell MyInt

let i = (MyInt 2)

'yell i
;

Notice that this is an inline function so that the cls value can remain constant.

Notice that you can still call the method on the instance. However, by convention it seems like a good idea to distinguish class methods by using the Python convention of calling the first argument cls (for class) as opposed to self.

Struct methods

Structs can also have methods defined on them and can inherit them from super types.

using import struct
using import String

type Animal < Struct

    fn get-area (self)
        self.height * self.length

struct Dog < Animal
    color : String
    bark : String
    height : f32
    length : f32

    fn bark! (self)
        print self.bark

let dog =
    Dog
        "yellow"
        "woof woof"
        34
        89

'bark! dog
print ('get-area dog)

;

More on Method Calls

Up until this point we have used the syntax of 'method obj args...
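As a minimal sketch of what this syntax amounts to, the example below calls the same method twice: once with the quoted-symbol form used so far, and once by looking the function up on the type with dot access (the same kind of lookup as super-type.__typecall elsewhere in this level) and passing the instance explicitly. That the two forms are interchangeable is an assumption here, not something this document has established:

```scopes
type MyInt <: i32

    fn yell (self)
        print "hello" self

let i = (MyInt 2)

# method-call syntax used so far
'yell i

# assumed equivalent: fetch the function from the type and call it directly
MyInt.yell i
```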

Super Type Methods

Scopes is not an object-oriented language and does not provide the typical inheritance features found in such languages, such as multiple inheritance.

However the simple form of inheritance (which we call simply subtyping) is able to allow for the sharing of type methods in the subtypes.

Here is an example of

type MyInt <: i32
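The struct example from the section on struct methods already shows this kind of sharing. A shorter sketch along the same lines follows, where two struct types inherit one method from a common supertype. Cat and describe are hypothetical names introduced only for this illustration:

```scopes
using import struct

type Animal < Struct

    # shared by every subtype that has a height field
    fn describe (self)
        print "height:" self.height

struct Dog < Animal
    height : f32

struct Cat < Animal
    height : f32

let dog = (Dog 43)
let cat = (Cat 25)

# both calls resolve to the method defined on Animal
'describe dog
'describe cat
```

Neither Dog nor Cat defines describe itself, so the lookup falls through to the supertype.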


Metamethods & Operator Overload

Similar to Python, types support the idea of metamethods (called magic methods in Python), which are special methods that, when implemented, take part in a protocol for various kinds of tasks.

Metamethods are methods whose names start with a double underscore. For operators, the metamethod symbol must match the corresponding operator.

Otherwise you define them just like methods. Here we are redefining the repr of our type:

type MyInt <: i32

    fn __repr (self)
        "MyInt"

let i = (MyInt 2)

print (repr i)

Operator Metamethods

We can also redefine operators, however these are a bit more complicated since we need to dispatch on the types.

These dispatching functions typically return an inline function corresponding to the correct type signature.

In this example we re-implement the + operator by comparing the types of the inputs, ensuring they are the same, and then returning an inline that casts each argument to the type’s underlying storage type and uses the builtin operator for that storage type.

type MyInt <: i32

    inline __+ (lhs_type rhs_type)
        static-if (lhs_type == rhs_type)
            inline (a b)
                (storagecast a) + (storagecast b)

let i = (MyInt 2)
let j = (MyInt 3)

print (i + j)

Notice that we are using static-if in this dispatch because this should happen statically at all the call-sites.

It’s a little complicated at first but obviously very powerful and a lot simpler than other compile-time dispatch systems like template metaprogramming.

List of Metamethods

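| Metamethod | Operator | Generic? | Description                                  |
|------------+----------+----------+----------------------------------------------|
| __repr     | repr     | no       | Generate a string representation of a value. |
| __+        | +        | yes      | Add two values.                              |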

Constructors & Destructors

To construct a type the __typecall metamethod is used.

type MyInt <: i32
    inline __typecall (cls int)
        super-type.__typecall cls int

let i = (MyInt 1)

print i

Properties

Modifying Existing Types

Another interesting feature is the ability to modify types in any location.

This lets you add behavior to a type defined in a library or just split up your own definitions if you are doing something fancy.

To do this we use the type+ syntax:

type MyInt <: i32

# end of initial type definition


# somewhere else in your code...
type+ MyInt
    fn yell (self)
        print "hello" self


let i = (MyInt 2)
'yell i

This is a very useful alternative to constantly subtyping or using multiple inheritance to get complex types.

Debugging

fn add (a b)
    dump a b
    a + b

add 3 4

let a = 3

report a

print a

Other Cool & Useful Constructs & Routines

I/O

Currently low level I/O is handled using the C standard library (or whatever other library you want).

Here are some tips for interfacing with it.

using import String
import C.stdio

let input-prompt = ">"
let result-prompt = "==>"

# display a prompt
(C.stdio.fputs "> " C.stdio.stdout)

# allocate a C-array for collecting input
local input = ((array i8 2048))

# get input from stdin
(C.stdio.fgets input 2048 C.stdio.stdin)

# then convert to a string
let input-str = (String (& input) (countof input))

print result-prompt input-str

String Formatting

TODO

Dynamic Dispatch

using import enum
enum State
    a : StateA
    b : StateB

local curState : State = (State.a (StateA))
# now when you deal with states, you do this:
dispatch curState
case a ()
    'init a
case b ()
    'init b
default
    ;

# there's a shorthand for doing the same thing with all fields of an enum:
'apply curState
    inline (T self)
        'init self

Expression Chaining

TODO

Function Chaining

TODO

itertools

TODO

Run Time Closures: Captures

TODO

Box

Rc