- peek functions will now return pointers directly
- cursor access methods can now return an error, if the cursor is already
at the end
Also rewrote some cursor loops to use the HASHMAP_EACH macro.
Instead of the previous refcount base garbage collection, we're now using
a basic tri-color mark&sweep collector. This is done to support cyclical
value relationships in the future (functions can form cycles, all values
implemented up to this point can not).
The collector maintains a set of roots and a set of objects (grouped into
blocks). The GC enabled objects are no longer allocated manually, but will
be allocated by the GC. The GC also wraps an allocator, this way the GC
knows, if we ran out of memory and will try to get out of this situation by
performing a full collection cycle.
The tri-color abstraction was chosen for two reasons:
- We don't have to maintain a list of objects that need to be marked, we
can simply grab the next grey one.
- It should allow us to later implement incremental collection (right now
we only do a stop-the-world collection).
This also switches to a bytecode based evaluation of the code: We no longer
directly evaluate the AST, but first compile it into a series of
instructions, that are evaluated in a separate step. This was done in
preparation for inplementing functions: We only need to turn a function
body into instructions instead of evaluating the node again with each call
of the function. Also, since an instruction list is implemented as a GC
object, this then removes manual memory management of the function body and
it's child nodes. Since the GC and the bytecode go hand in hand, this was
done in one (giant) commit.
As a downside, we've now lost the ability do do list matching on
assignments. I've already started to work on implementing this in the new
architecture, but left it out of this commit, as it's already quite a large
commit :)
We didn't properly copy the opaque value before (a heap-allocated allocator
pointer), resulting in a use-after-free of the opaque on the copied map if
the old map was destroyed.
We now no longer call malloc/free/... directly, but use an allocator object
that is passed around.
This was mainly done as a preparation for a garbage collector: The
collector will need to know, how much memory we're using, introducing the
collector abstraction will allow the GC to hook into the memory allocation
and observe the memory usage.
This has other potential applications:
- We could now be embedded into applications that can't use the libc
allocator.
- There could be an allocator that limits the total amount of used memory,
e.g. for sandboxing purposes.
- In our tests we could use this to simulate out of memory conditions
(implement an allocator that fails at the n-th allocation, increase n by
one and restart the test until there are no more faked OOM conditions).
The function signature of the allocator is basically exactly the same as
the one Lua uses.
Only for evaluating expressions for now and right now the only exposed
operation is to debug print a value on the stack, this obviously needs to
be expanded.
I've done this for two reasons:
1. A Lua-style stack API is much nicer to work with than to manually manage
refcounts.
2. We'll soon need a more sophisticated garbage collector (if you even want
to count the refcounting as garbage collection). For this, the GC will
need root objects to start tracing for live objects, the stack will be
one of these roots.
This avoids creating refcounted strings during evaluation and makes it
easier to use the same parsed string in multiple places (should be
useful once we implement functions).
You can now set keys in dictionaries to a value. If a key in a key path is
missing, we automatically create an empty dictionary. Otherwise setting
deeply nested keys becomes annoying.
We're still missing predicates (need to be able to call functions for that
one) and assignments into dictionaries. But we now can deconstruct a list
and check against constants.
So things like this work now:
[1 foo ~bar [a b]] = [1 "Hello" 2 3 4 [5 6]]
# foo is: "Hello"
# bar is: [2 3 4]
# a is: 5
# b is: 6
Pretty cool :)
If you assign into a member access (`foo.bar = baz` or `foo@bar = baz`), it
is no longer permitted that the LHS of the at/dot is an arbitrary
assignable. It now must be a variable, at or dot. This disallows some silly
constructs (e.g. `[foo]@bar = baz`), increases the similarity to function
parameters and should make writing the evaluation code for these more easy.
The previous representation didn't properly model the fact that an
assignable / parameter can only be expanded, if it's a list element. This
now better models this. Other than being more correct, this should also
make evaluating these a bit easier.
While I was at it, I also improved the error message for multiple
expansions on the same level and added tests for these.
Since we're using refcounts, we don't really copy anything and no error can
occur. So let's make these callbacks return void to simplify things. This
also makes the return value false of the value getters unambiguous: It now
always means that the key was not present.
This is analogous to dictionaries and ensures that no circular references
can be created when using the exported API in apfl.h.
This also changes apfl_value_copy into apfl_value_incref to better reflect
what it does and to reflect that it is no longer an operation that can
fail.
Increasing the refcount (confusingly called copy before, fixed that too)
didn't work properly, as the object was copied and the refcount was only
updated in one of the copies.
This avoids copying the string every time we pass it around. Not too
important right now, but will become important onve we're able to evaluate
more complex expressions.