Instead of passing a flag through the parser and tokenizer to tell the
input source whether we need further input, we steal a trick from Lua:
in the REPL, we just keep reading lines and appending them to the input
until the input loads without an "unexpected EOF" error. After all,
hitting an EOF we didn't expect is exactly the scenario in which we need
more input.
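Here is a minimal, standalone sketch of that loop. It fakes the loader
with a parenthesis-balance check, and every name in it (buffers,
prompts, helpers) is illustrative rather than the actual implementation:

/* Demo of the Lua-style REPL trick: keep appending lines until the
 * accumulated input no longer fails with an "unexpected EOF". The real
 * interpreter would call its load function and inspect the error kind;
 * this sketch fakes that with a parenthesis-balance check. */
#include <stdio.h>
#include <string.h>

/* Stand-in for "loading failed with an unexpected-EOF error". */
static int is_incomplete(const char *src)
{
    int depth = 0;
    for (; *src; src++) {
        if (*src == '(') depth++;
        else if (*src == ')' && depth > 0) depth--;
    }
    return depth > 0;
}

int main(void)
{
    char line[256];
    char input[4096];

    for (;;) {
        printf("> ");
        fflush(stdout);
        if (!fgets(line, sizeof line, stdin))
            break;
        strcpy(input, line); /* no bounds checking; only a sketch */

        /* Incomplete input? Read a continuation line and check again. */
        while (is_incomplete(input)) {
            printf(">> ");
            fflush(stdout);
            if (!fgets(line, sizeof line, stdin))
                break;
            strcat(input, line);
        }

        printf("would now evaluate:\n%s", input);
    }
    return 0;
}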
Doing things this way simplifies a bunch of places and lets us remove
the ugly source_reader and iterative_runner concepts.
Allowing the REPL to see the error that happened during loading required
some smaller refactorings, but those were honestly for the better
anyway.
I also decided to get rid of the token_source concept; the parser now
gets the tokenizer directly. This made things a bit simpler, too.
Besides, I want to implement string interpolation soon-ish, and for that
the parser needs to do more with the tokenizer than just read the next
token.
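Roughly, the shape changes like this (both declarations below are made
up to illustrate the difference; they are not the real headers):

/* Old shape: the parser pulled tokens through a generic token_source,
 * i.e. a function pointer plus opaque data. */
typedef struct {
    void *userdata;
    int (*next_token)(void *userdata /*, token *out */);
} token_source;

/* New shape: the parser holds the tokenizer directly, so it can later
 * do more than "give me the next token" (e.g. switch the tokenizer into
 * a different mode for string interpolation). */
typedef struct tokenizer tokenizer;

typedef struct {
    tokenizer *tok;  /* direct access, no indirection */
    /* ... other parser state ... */
} parser;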
One last thing: This also cleans up the web playground and makes the
playground and REPL share a bunch of code. Nice!
Pairs are 2-tuples of values that are constructed and matched with the `::`
operator. They can also be matched with the `:` operator; its LHS is then an
expression, and the pair only matches if the pair's left value matches the
result of that expression.
Pairs should be useful for doing something similar to what sum types /
tagged unions do in statically typed languages, e.g. you could write
something like:
some := (symbol)  # Something that creates a unique value

filter-map := {
  _ [] -> []
  f [x ~xs] ->
    {
      some:y -> [y ~(filter-map f xs)]
      nil -> filter-map f xs
    } (f x)
}

filter-map {
  x?even -> some :: (* x 10)
  _ -> nil
} some-list
The callback and the opaque data are now grouped together in a struct
instead of being passed individually into the tokenizer.
This also exposes the string source reader struct and therefore removes
the need to heap-allocate it. Neat!
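A sketch of what such a grouping can look like; the names and the exact
read-callback signature are assumptions for illustration, not the
project's actual API:

#include <stddef.h>

/* The read callback and its opaque data travel together. */
typedef struct source_reader {
    /* Returns a pointer to the next chunk of source text and stores its
     * length in *len; returns NULL once the source is exhausted. */
    const char *(*read)(void *userdata, size_t *len);
    void *userdata;
} source_reader;

/* Before: tokenizer_init(&tok, read_fn, userdata);
 * After:  tokenizer_init(&tok, reader);
 * Since the struct is exposed in the header, callers can keep the
 * reader on the stack instead of heap-allocating it. */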
We now no longer call malloc/free/... directly, but use an allocator object
that is passed around.
This was mainly done as preparation for a garbage collector: the
collector will need to know how much memory we're using, and introducing
the allocator abstraction lets the GC hook into memory allocation and
observe memory usage.
This has other potential applications:
- We could now be embedded into applications that can't use the libc
allocator.
- There could be an allocator that limits the total amount of used memory,
e.g. for sandboxing purposes.
- In our tests we could use this to simulate out-of-memory conditions
  (implement an allocator that fails at the n-th allocation, increase n by
  one and restart the test until there are no more faked OOM conditions).
The function signature of the allocator is essentially the same as the
one Lua uses.
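For reference, Lua's allocator takes an opaque pointer, the old size and
the new size, and one function covers malloc, realloc and free. The
sketch below follows that shape; the counting wrapper and all names are
assumptions for illustration, not the project's real code:

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

/* One function handles malloc, realloc and free, just like lua_Alloc:
 * alloc(ud, NULL, 0, n)  -> malloc n bytes
 * alloc(ud, p, old, new) -> realloc p from old to new bytes
 * alloc(ud, p, old, 0)   -> free p */
typedef void *(*alloc_fn)(void *ud, void *ptr, size_t osize, size_t nsize);

typedef struct {
    alloc_fn alloc;
    void *userdata;
} allocator;

/* Example: a counting allocator on top of libc, tracking bytes in use.
 * This is roughly what a GC hook needs. (Error handling omitted.) */
typedef struct { size_t in_use; } alloc_stats;

static void *counting_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
    alloc_stats *stats = ud;
    stats->in_use = stats->in_use - osize + nsize;
    if (nsize == 0) {
        free(ptr);
        return NULL;
    }
    return realloc(ptr, nsize); /* acts as malloc when ptr == NULL */
}

int main(void)
{
    alloc_stats stats = {0};
    allocator a = { counting_alloc, &stats };

    void *p = a.alloc(a.userdata, NULL, 0, 128); /* allocate 128 bytes */
    p = a.alloc(a.userdata, p, 128, 256);        /* grow to 256 bytes  */
    a.alloc(a.userdata, p, 256, 0);              /* free               */

    printf("bytes still in use: %zu\n", stats.in_use); /* prints 0 */
    return 0;
}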
This avoids creating refcounted strings during evaluation and makes it
easier to use the same parsed string in multiple places (should be
useful once we implement functions).
A useful source reader implementation for feeding a source held as an
in-memory string into the tokenizer.
This replaces the string_src_reader in the tokenizer_test and is even a
bit more flexible, since it allows any aplf_string_view as the source.
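A sketch of what that reader might look like, assuming aplf_string_view
is a pointer-plus-length pair and reusing the hypothetical read-callback
shape from above; none of these declarations are the actual headers:

#include <stddef.h>

typedef struct {
    const char *data;
    size_t len;
} aplf_string_view; /* assumed layout, for illustration only */

/* Serves the whole in-memory string on the first read, then reports EOF. */
typedef struct {
    aplf_string_view src;
    int exhausted;
} string_view_reader;

static const char *string_view_read(void *userdata, size_t *len)
{
    string_view_reader *r = userdata;
    if (r->exhausted) {
        *len = 0;
        return NULL;
    }
    r->exhausted = 1;
    *len = r->src.len;
    return r->src.data;
}

/* Usage (e.g. in tokenizer_test): keep the reader on the stack and hand
 * { string_view_read, &reader } to the tokenizer as its source reader. */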
While working on the parser I got a "malloc(): corrupted top size"
error in the tokenizer when parsing `a=a`. I wrote this test to see if
it was really a problem with the tokenizer. It wasn't; let's keep the
test nonetheless.