Implement mark&sweep garbage collection and bytecode compilation
Instead of the previous refcount-based garbage collection, we're now using
a basic tri-color mark&sweep collector. This is done to support cyclical
value relationships in the future (functions can form cycles; all values
implemented up to this point cannot).
The collector maintains a set of roots and a set of objects (grouped into
blocks). GC-enabled objects are no longer allocated manually, but are
allocated by the GC. The GC also wraps an allocator. This way the GC knows
when we have run out of memory and tries to recover by performing a full
collection cycle.
The tri-color abstraction was chosen for two reasons:
- We don't have to maintain a list of objects that need to be marked; we
can simply grab the next grey one.
- It should allow us to later implement incremental collection (right now
we only do a stop-the-world collection).
This also switches to a bytecode-based evaluation of the code: We no longer
directly evaluate the AST, but first compile it into a series of
instructions that are evaluated in a separate step. This was done in
preparation for implementing functions: We only need to turn a function
body into instructions once instead of evaluating the node again with each
call of the function. Also, since an instruction list is implemented as a GC
object, this removes manual memory management of the function body and
its child nodes. Since the GC and the bytecode go hand in hand, this was
done in one (giant) commit.
As a downside, we've now lost the ability to do list matching on
assignments. I've already started to work on implementing this in the new
architecture, but left it out of this commit, as it's already quite a large
commit :)
2022-04-11 20:24:22 +00:00
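
The "grab the next grey one" idea from the commit message can be sketched
roughly as follows. This is a minimal illustration under stated assumptions,
not the collector from this file: toy_object, toy_mark, toy_sweep and the
fixed two-child layout are all hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical colors mirroring the tri-color abstraction.
enum toy_color { TOY_WHITE, TOY_GREY, TOY_BLACK };

// Hypothetical object with up to two outgoing references.
struct toy_object {
    enum toy_color color;
    struct toy_object *children[2];
};

// Mark everything reachable from the objects that are already grey.
// No explicit worklist is kept: the pool is rescanned for a grey object
// until none is left.
static void
toy_mark(struct toy_object *pool, size_t len)
{
    assert(pool != NULL || len == 0);

    for (;;) {
        struct toy_object *grey = NULL;
        for (size_t i = 0; i < len; i++) {
            if (pool[i].color == TOY_GREY) {
                grey = &pool[i];
                break;
            }
        }
        if (grey == NULL) {
            return; // No grey objects left, marking is done.
        }

        // Shade white children grey, then blacken the object itself.
        for (size_t c = 0; c < 2; c++) {
            struct toy_object *child = grey->children[c];
            if (child != NULL && child->color == TOY_WHITE) {
                child->color = TOY_GREY;
            }
        }
        grey->color = TOY_BLACK;
    }
}

// Sweep: objects that are still white were never reached and count as
// garbage. Survivors are reset to white for the next cycle.
static size_t
toy_sweep(struct toy_object *pool, size_t len)
{
    size_t reclaimed = 0;
    for (size_t i = 0; i < len; i++) {
        if (pool[i].color == TOY_WHITE) {
            reclaimed++;
        } else {
            pool[i].color = TOY_WHITE;
        }
    }
    return reclaimed;
}
```

A caller would color its roots grey, call toy_mark and then toy_sweep. Because
reachability is traced from the roots, a cycle that no root points to stays
white and is reclaimed, which is the case reference counting cannot handle.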

#include <assert.h>
#include <stdio.h>
#include <string.h>

#include "alloc.h"
#include "bytecode.h"
#include "context.h"
#include "gc.h"
#include "matcher.h"
#include "resizable.h"
#include "scope.h"
#include "value.h"

// #define GC_DEBUG_COLLECT_EVERY_ALLOCATION 1
// #define GC_DEBUG_STATS 1
// #define GC_DEBUG_WIPE_RECLAIMED_OBJECTS 1
// #define GC_DEBUG_DUMP_GRAPH_ON_COLLECT 1
// #define GC_DEBUG_LOG_NEW_AND_RECLAIM 1

struct gc_object {
    // Unlike most other tagged unions in apfl, the union is first here.
    // This allows us to have pointers to the wrapped object that can be cast
    // into gc_object pointers and vice versa.
    union {
        struct list_header list;
        struct dict_header dict;
        struct apfl_value var;
        struct apfl_string string;
        struct instruction_list instructions;
        struct scope scope;
        struct stack stack;
        struct function function;
        struct cfunction cfunction;
        struct matcher_instruction_list matcher_instructions;
        struct matcher matcher;
    };
    enum gc_type type;
    enum gc_status status;
};
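
The union-first layout described in the comment above can be illustrated with
a small stand-alone sketch. All toy_* names are hypothetical and only mirror
the shape of struct gc_object. The sketch relies on the C rule that a pointer
to a struct (or union) can be converted to a pointer to its first member (or
any union member) and back.

```c
#include <assert.h>

// Hypothetical type tags and payloads, standing in for the real ones.
enum toy_type { TOY_TYPE_LIST, TOY_TYPE_STRING };

struct toy_list { int len; };
struct toy_string { const char *bytes; };

// The union comes first, so the address of any payload is also the
// address of the enclosing object.
struct toy_gc_object {
    union {
        struct toy_list list;
        struct toy_string string;
    };
    enum toy_type type;
};

// Recover the enclosing object from a payload pointer and check the tag.
static struct toy_gc_object *
toy_object_from_ptr(void *ptr, enum toy_type type)
{
    struct toy_gc_object *object = ptr;
    assert(object->type == type);
    return object;
}
```

With this layout, code that only holds a struct toy_list pointer can get back
to the object header without any offset arithmetic. Had the tag been placed
first, the payload would sit at a nonzero offset and a plain cast would not
be enough.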

#define GC_OBJECTS_PER_BLOCK 128

struct gc_block {
    struct gc_object objects[GC_OBJECTS_PER_BLOCK];
    struct gc_block *next;
};

static void *
gc_allocator(void *opaque, void *oldptr, size_t oldsize, size_t newsize)
{
    struct gc *gc = opaque;

#ifdef GC_DEBUG_COLLECT_EVERY_ALLOCATION
    if (newsize != 0 && !gc->is_collecting) {
        apfl_gc_full(gc);
    }
#endif

    void *out = ALLOCATOR_CALL(gc->base_allocator, oldptr, oldsize, newsize);
    if (newsize != 0 && out == NULL && !gc->is_collecting) {
        // We're out of memory! Try to get out of this situation by doing a full
        // GC run.
        apfl_gc_full(gc);

        // Hopefully we now have memory again. Try the allocation again.
        out = ALLOCATOR_CALL(gc->base_allocator, oldptr, oldsize, newsize);
    }

    if (newsize != 0 && out == NULL) {
        return NULL;
    }

    // TODO: incremental GC step

    return out;
}
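
The retry-after-collection pattern used by gc_allocator can be tried out in
isolation with a toy allocator. This is a sketch only: toy_gc, the byte
budget and toy_collect are hypothetical stand-ins chosen so the out-of-memory
path can be triggered deterministically, and they do not reflect how the real
base allocator or apfl_gc_full behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical stand-in for the collector state.
struct toy_gc {
    size_t budget;        // Bytes the toy base allocator may still hand out.
    size_t garbage;       // Bytes a collection would give back.
    bool is_collecting;   // Guards against collecting recursively.
    unsigned collections; // Number of collections run so far.
};

// Toy base allocator: fails once the budget is exhausted.
static void *
toy_base_alloc(struct toy_gc *gc, size_t size)
{
    if (size > gc->budget) {
        return NULL;
    }
    gc->budget -= size;
    return malloc(size);
}

// Toy collection: returns all garbage to the budget.
static void
toy_collect(struct toy_gc *gc)
{
    gc->is_collecting = true;
    gc->budget += gc->garbage;
    gc->garbage = 0;
    gc->collections++;
    gc->is_collecting = false;
}

// Wrapping allocator: on failure, collect once and retry.
static void *
toy_gc_alloc(struct toy_gc *gc, size_t size)
{
    assert(gc != NULL);

    void *out = toy_base_alloc(gc, size);
    if (size != 0 && out == NULL && !gc->is_collecting) {
        toy_collect(gc);
        out = toy_base_alloc(gc, size);
    }
    return out;
}
```

The is_collecting check matters because a collection may itself allocate
through the wrapped allocator. Without the guard, a failing allocation inside
a collection would start another collection.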

struct gc_object *
apfl_gc_object_from_ptr(void *ptr, enum gc_type type)
{
    struct gc_object *object = ptr;
    assert(object->type == type);
    return object;
}

void
apfl_gc_init(struct gc *gc, struct apfl_allocator allocator, gc_roots_getter roots_getter, void *roots_getter_opaque)
{
    gc->base_allocator = allocator;
    gc->allocator = (struct apfl_allocator) {
        .opaque = gc,
        .alloc = gc_allocator,
    };
    gc->block = NULL;

    gc->roots_getter = roots_getter;
    gc->roots_getter_opaque = roots_getter_opaque;
    gc->tmproots = (struct gc_tmproots) {
        .roots = NULL,
        .len = 0,
        .cap = 0,
    };
    gc->tmproot_for_adding = NULL;
    gc->is_collecting = false;
}

static struct gc_block *
new_block(struct gc *gc)
{
    struct gc_block *block = ALLOC_OBJ(gc->allocator, struct gc_block);
    if (block == NULL) {
        return NULL;
    }

    for (size_t i = 0; i < GC_OBJECTS_PER_BLOCK; i++) {
        block->objects[i] = (struct gc_object) { .status = GC_STATUS_FREE };
    }
    block->next = NULL;

    return block;
}

static struct gc_object *
new_object_inner(struct gc *gc)
{
    struct gc_block **cur = &gc->block;

    while (*cur != NULL) {
        struct gc_block *block = *cur;
        for (size_t i = 0; i < GC_OBJECTS_PER_BLOCK; i++) {
            if (block->objects[i].status == GC_STATUS_FREE) {
                return &block->objects[i];
            }
        }

        cur = &block->next;
    }

    struct gc_block *nb = new_block(gc);
    if (nb == NULL) {
        return NULL;
    }

    *cur = nb;

    return &nb->objects[0];
}

static struct gc_object *
new_object(struct gc *gc, enum gc_type type)
{
    struct gc_object *object = new_object_inner(gc);
    if (object == NULL) {
        return NULL;
    }

    assert(object->status == GC_STATUS_FREE);
    object->status = GC_STATUS_WHITE;
    object->type = type;
    return object;
}

static const char *type_to_string(enum gc_type);

#ifdef GC_DEBUG_LOG_NEW_AND_RECLAIM
#   define LOG_NEW_AND_RECLAIM(fmt, ...) fprintf(stderr, fmt, __VA_ARGS__)
#else
#   define LOG_NEW_AND_RECLAIM(fmt, ...)
#endif

#define IMPL_NEW(t, name, type, field) \
    t * \
    name(struct gc *gc) \
    { \
        struct gc_object *object = new_object(gc, type); \
        LOG_NEW_AND_RECLAIM("New %s object at %p\n", type_to_string(type), (void *)object); \
        return object == NULL ? NULL : &object->field; \
    }

IMPL_NEW(struct list_header, apfl_gc_new_list, GC_TYPE_LIST, list)
IMPL_NEW(struct dict_header, apfl_gc_new_dict, GC_TYPE_DICT, dict)
IMPL_NEW(struct apfl_value, apfl_gc_new_var, GC_TYPE_VAR, var)
IMPL_NEW(struct apfl_string, apfl_gc_new_string, GC_TYPE_STRING, string)
IMPL_NEW(struct instruction_list, apfl_gc_new_instructions, GC_TYPE_INSTRUCTIONS, instructions)
IMPL_NEW(struct scope, apfl_gc_new_scope, GC_TYPE_SCOPE, scope)
IMPL_NEW(struct function, apfl_gc_new_func, GC_TYPE_FUNC, function)
IMPL_NEW(struct cfunction, apfl_gc_new_cfunc, GC_TYPE_CFUNC, cfunction)
IMPL_NEW(struct matcher_instruction_list, apfl_gc_new_matcher_instructions, GC_TYPE_MATCHER_INSTRUCTIONS, matcher_instructions)
IMPL_NEW(struct matcher, apfl_gc_new_matcher, GC_TYPE_MATCHER, matcher)

size_t
apfl_gc_tmproots_begin(struct gc *gc)
{
    return gc->tmproots.len;
}

void
apfl_gc_tmproots_restore(struct gc *gc, size_t newlen)
{
    assert(newlen <= gc->tmproots.len);
    gc->tmproots.len = newlen;
}

bool
apfl_gc_tmproot_add(struct gc *gc, struct gc_object *object)
{
    // Since appending the new tmproot can trigger a garbage collection, we need
    // to set the tmproot as the tmproot_for_adding, so we'll treat it as a root
    // and not free it.
    assert(gc->tmproot_for_adding == NULL);
    gc->tmproot_for_adding = object;

    bool ok = apfl_resizable_append(
        gc->allocator,
        sizeof(struct gc_object *),
        (void **)&gc->tmproots.roots,
        &gc->tmproots.len,
        &gc->tmproots.cap,
        &object,
        1
    );

    gc->tmproot_for_adding = NULL;

    return ok;
}
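
The begin/add/restore trio above suggests a stack discipline for temporary
roots. How callers in apfl actually use it is not shown in this file, so the
following is only a guess at the intended pattern, written against
hypothetical toy_* stand-ins with a fixed-capacity array instead of the
resizable one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_TMPROOTS_CAP 8

// Hypothetical fixed-size stack of temporary roots.
struct toy_tmproots {
    void *roots[TOY_TMPROOTS_CAP];
    size_t len;
};

// Remember the current stack height.
static size_t
toy_tmproots_begin(struct toy_tmproots *t)
{
    return t->len;
}

// Drop every temporary root added since the matching begin.
static void
toy_tmproots_restore(struct toy_tmproots *t, size_t newlen)
{
    assert(newlen <= t->len);
    t->len = newlen;
}

// Push one temporary root. Fails when the toy stack is full.
static bool
toy_tmproot_add(struct toy_tmproots *t, void *object)
{
    if (t->len == TOY_TMPROOTS_CAP) {
        return false;
    }
    t->roots[t->len++] = object;
    return true;
}

// Hypothetical caller: protect two fresh objects while doing further work,
// then release both at once, on the success and the failure path alike.
static bool
toy_build_pair(struct toy_tmproots *t, int *a, int *b)
{
    size_t mark = toy_tmproots_begin(t);
    bool ok = toy_tmproot_add(t, a) && toy_tmproot_add(t, b);
    // ... allocations that might trigger a collection would go here ...
    toy_tmproots_restore(t, mark);
    return ok;
}
```

Restoring to a saved length rather than popping entries one by one keeps the
cleanup to a single call, and roots added by outer callers are left untouched.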

static void
color_object_grey(struct gc_object *object)
{
    object->status = object->status == GC_STATUS_BLACK ? GC_STATUS_BLACK : GC_STATUS_GREY;
}

static void
visit_roots(struct gc *gc, gc_visitor visitor, void *opaque)
{
    gc->roots_getter(gc->roots_getter_opaque, visitor, opaque);

    for (size_t i = 0; i < gc->tmproots.len; i++) {
        visitor(opaque, gc->tmproots.roots[i]);
    }

    if (gc->tmproot_for_adding != NULL) {
        visitor(opaque, gc->tmproot_for_adding);
    }
}

static void
mark_roots_visitor(void *opaque, struct gc_object *root)
{
    (void)opaque;
    color_object_grey(root);
}

static void
mark_roots(struct gc *gc)
{
    visit_roots(gc, mark_roots_visitor, NULL);
}

static void
visit_children(struct gc_object *object, gc_visitor cb, void *opaque)
{
    switch (object->type) {
    case GC_TYPE_LIST:
        apfl_gc_list_traverse(&object->list, cb, opaque);
        return;
    case GC_TYPE_DICT:
        apfl_gc_dict_traverse(&object->dict, cb, opaque);
        return;
    case GC_TYPE_VAR:
        apfl_gc_var_traverse(&object->var, cb, opaque);
        return;
    case GC_TYPE_SCOPE:
        apfl_gc_scope_traverse(&object->scope, cb, opaque);
        return;
    case GC_TYPE_STRING:
        // Intentionally left blank. Object doesn't reference other objects.
        return;
    case GC_TYPE_INSTRUCTIONS:
        apfl_gc_instructions_traverse(&object->instructions, cb, opaque);
        return;
    case GC_TYPE_FUNC:
        apfl_gc_func_traverse(&object->function, cb, opaque);
        return;
    case GC_TYPE_CFUNC:
        apfl_gc_cfunc_traverse(&object->cfunction, cb, opaque);
        return;
    case GC_TYPE_MATCHER_INSTRUCTIONS:
        // Intentionally left blank. Object doesn't reference other objects.
        return;
    case GC_TYPE_MATCHER:
        apfl_gc_matcher_traverse(&object->matcher, cb, opaque);
        return;
    }

    assert(false);
}

static void
trace_callback(void *opaque, struct gc_object *object)
{
    (void)opaque;
    color_object_grey(object);
}

static void
trace(struct gc_object *object)
{
    object->status = GC_STATUS_BLACK;
    visit_children(object, trace_callback, NULL);
}

static void
trace_while_having_grey(struct gc *gc)
{
    bool found_grey;
    do {
        found_grey = false;
        for (
            struct gc_block *cur = gc->block;
            cur != NULL;
            cur = cur->next
        ) {
            for (size_t i = 0; i < GC_OBJECTS_PER_BLOCK; i++) {
                struct gc_object *object = &cur->objects[i];
                if (object->status == GC_STATUS_GREY) {
                    trace(object);
                    found_grey = true;
                }
            }
        }
    } while (found_grey);
}
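
The marking loop above needs no separate worklist: it rescans the object storage until a full pass finds no grey object. The sketch below illustrates that idea on a tiny graph with a cycle. It is only an illustration under stated assumptions: the `sketch_` types and functions are hypothetical, the heap is a flat array instead of linked blocks, and each node has at most two children.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_color { SKETCH_WHITE, SKETCH_GREY, SKETCH_BLACK };

#define SKETCH_MAX_CHILDREN 2

/* Hypothetical object: a color plus up to two outgoing references. */
struct sketch_node {
    enum sketch_color color;
    struct sketch_node *children[SKETCH_MAX_CHILDREN];
};

/* White objects become grey; grey and black objects are left alone. */
static void
sketch_grey(struct sketch_node *n)
{
    if (n != NULL && n->color == SKETCH_WHITE) {
        n->color = SKETCH_GREY;
    }
}

/* Greys the root, then rescans the heap until no grey object is left. */
static void
sketch_mark(struct sketch_node *heap, size_t heap_len, struct sketch_node *root)
{
    sketch_grey(root);

    bool found_grey;
    do {
        found_grey = false;
        for (size_t i = 0; i < heap_len; i++) {
            struct sketch_node *n = &heap[i];
            if (n->color != SKETCH_GREY) {
                continue;
            }
            n->color = SKETCH_BLACK; /* Fully visited */
            for (size_t c = 0; c < SKETCH_MAX_CHILDREN; c++) {
                sketch_grey(n->children[c]);
            }
            found_grey = true;
        }
    } while (found_grey);
}

/* Builds the cycle 0 -> 1 -> 2 -> 0 plus an unreachable node 3 that
 * points into the cycle, marks from node 0, and counts black nodes. */
static size_t
sketch_mark_demo(void)
{
    struct sketch_node heap[4] = {0}; /* All white, no children */
    heap[0].children[0] = &heap[1];
    heap[1].children[0] = &heap[2];
    heap[2].children[0] = &heap[0];
    heap[3].children[0] = &heap[0];

    sketch_mark(heap, 4, &heap[0]);

    size_t black = 0;
    for (size_t i = 0; i < 4; i++) {
        if (heap[i].color == SKETCH_BLACK) {
            black++;
        }
    }
    return black;
}
```

The cycle terminates because an already black node is never greyed again, and node 3 stays white because nothing reachable from the root refers to it.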

static void
deinit_object(struct gc *gc, struct gc_object *object)
{
    switch (object->type) {
    case GC_TYPE_LIST:
        apfl_list_deinit(gc->allocator, &object->list);
        return;
    case GC_TYPE_DICT:
        apfl_dict_deinit(&object->dict);
        return;
    case GC_TYPE_VAR:
        return;
    case GC_TYPE_STRING:
        apfl_string_deinit(gc->allocator, &object->string);
        return;
    case GC_TYPE_INSTRUCTIONS:
        apfl_instructions_deinit(gc->allocator, &object->instructions);
        return;
    case GC_TYPE_SCOPE:
        apfl_scope_deinit(gc->allocator, &object->scope);
        return;
    case GC_TYPE_FUNC:
        apfl_function_deinit(gc->allocator, &object->function);
        return;
    case GC_TYPE_CFUNC:
        apfl_cfunction_deinit(gc->allocator, &object->cfunction);
        return;
    case GC_TYPE_MATCHER_INSTRUCTIONS:
        apfl_matcher_instructions_deinit(gc->allocator, &object->matcher_instructions);
        return;
    case GC_TYPE_MATCHER:
        apfl_matcher_deinit(gc->allocator, &object->matcher);
        return;
    }

    assert(false);
}

static void
sweep(struct gc *gc)
{
#ifdef GC_DEBUG_STATS
    int reclaimed_objects = 0;
    int reclaimed_blocks = 0;
#endif

    struct gc_block **cur = &gc->block;
    while (*cur != NULL) {
        struct gc_block *block = *cur;

        bool completely_free = true;
        for (size_t i = 0; i < GC_OBJECTS_PER_BLOCK; i++) {
            struct gc_object *object = &block->objects[i];

            switch (object->status) {
            case GC_STATUS_FREE:
                break;
            case GC_STATUS_WHITE:
                LOG_NEW_AND_RECLAIM("reclaiming %p of type %s\n", (void *)object, type_to_string(object->type));
                deinit_object(gc, object);

#ifdef GC_DEBUG_WIPE_RECLAIMED_OBJECTS
                memset(object, 0, sizeof(struct gc_object));
                object->type = 0xFF; // Some intentionally undefined type
#endif

                object->status = GC_STATUS_FREE;

#ifdef GC_DEBUG_STATS
                reclaimed_objects++;
#endif

                break;
            case GC_STATUS_GREY:
                assert(false /*Encountered grey object while sweeping*/);
                break;
            case GC_STATUS_BLACK:
                object->status = GC_STATUS_WHITE; // Prepare for next run
                completely_free = false;
                break;
            }
        }

        if (completely_free) {
            *cur = block->next;
            FREE_OBJ(gc->allocator, block);

#ifdef GC_DEBUG_STATS
            reclaimed_blocks++;
#endif
        } else {
            cur = &block->next;
        }
    }

#ifdef GC_DEBUG_STATS
    fprintf(stderr, "gc: reclaimed %d objects, %d blocks\n", reclaimed_objects, reclaimed_blocks);
#endif
}
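
The sweep above walks the block list through a pointer to the link (`struct gc_block **cur`), which lets it unlink a block without tracking a previous node or special-casing the list head. Here is a minimal standalone sketch of that technique. The `sketch_` names, the `live_objects` counter and the use of `malloc`/`free` are assumptions for illustration; the real code decides emptiness by inspecting each object's status and frees through `FREE_OBJ`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical block: a link plus a count of surviving objects. */
struct sketch_block {
    struct sketch_block *next;
    int live_objects;
};

/* Prepends a new block to the list and returns the new head. */
static struct sketch_block *
sketch_block_push(struct sketch_block *head, int live_objects)
{
    struct sketch_block *b = malloc(sizeof(*b));
    if (b == NULL) {
        abort(); /* Keep the sketch simple: no recovery on OOM */
    }
    b->next = head;
    b->live_objects = live_objects;
    return b;
}

/* Frees every block without live objects; returns how many were freed. */
static int
sketch_free_empty_blocks(struct sketch_block **head)
{
    int freed = 0;
    struct sketch_block **cur = head;
    while (*cur != NULL) {
        struct sketch_block *block = *cur;
        if (block->live_objects == 0) {
            *cur = block->next; /* Unlink; cur now refers to the successor */
            free(block);
            freed++;
        } else {
            cur = &block->next; /* Keep the block, advance to its link */
        }
    }
    return freed;
}

/* Builds the list 0 -> 3 -> 0 (live counts) and sweeps it.
 * Returns 1 if exactly the two empty blocks were removed. */
static int
sketch_sweep_demo(void)
{
    struct sketch_block *head = NULL;
    head = sketch_block_push(head, 0);
    head = sketch_block_push(head, 3);
    head = sketch_block_push(head, 0);

    int freed = sketch_free_empty_blocks(&head);

    int remaining = 0;
    for (struct sketch_block *b = head; b != NULL; b = b->next) {
        remaining++;
    }
    int ok = freed == 2 && remaining == 1 && head->live_objects == 3;

    /* Clean up the surviving block as well. */
    head->live_objects = 0;
    sketch_free_empty_blocks(&head);

    return ok && head == NULL;
}
```

Note that after unlinking, `cur` is deliberately not advanced: it already refers to the link that now holds the successor, so the next iteration examines that block.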

#ifdef GC_DEBUG_DUMP_GRAPH_ON_COLLECT
# define DUMP_ON_COLLECT() apfl_gc_debug_dump_graph(gc, apfl_format_file_writer(stderr))
#else
# define DUMP_ON_COLLECT()
#endif

void
apfl_gc_full(struct gc *gc)
{
    assert(!gc->is_collecting);
    gc->is_collecting = true;

    mark_roots(gc);
    DUMP_ON_COLLECT();
    trace_while_having_grey(gc);
    DUMP_ON_COLLECT();
    sweep(gc);
    DUMP_ON_COLLECT();

    gc->is_collecting = false;
}

void
apfl_gc_add_child(struct gc_object *parent, struct gc_object* child)
{
    if (parent->status == GC_STATUS_BLACK) {
        color_object_grey(child);
    }
}
|
|
|
|
|
|
static const char *
dump_graph_bgcolor(enum gc_status status)
{
    switch (status) {
    case GC_STATUS_BLACK:
        return "black";
    case GC_STATUS_GREY:
        return "grey";
    default:
        return "white";
    }
}

static const char *
dump_graph_fgcolor(enum gc_status status)
{
    switch (status) {
    case GC_STATUS_BLACK:
        return "white";
    default:
        return "black";
    }
}

static const char *
type_to_string(enum gc_type type)
{
    switch (type) {
    case GC_TYPE_LIST:
        return "list";
    case GC_TYPE_DICT:
        return "dict";
    case GC_TYPE_VAR:
        return "var";
    case GC_TYPE_STRING:
        return "string";
    case GC_TYPE_INSTRUCTIONS:
        return "instructions";
    case GC_TYPE_SCOPE:
        return "scope";
    case GC_TYPE_FUNC:
        return "func";
    case GC_TYPE_CFUNC:
        return "cfunc";
    case GC_TYPE_MATCHER_INSTRUCTIONS:
        return "matcher instructions";
    case GC_TYPE_MATCHER:
        return "matcher";
    }

    assert(false);
    return "???";
}

struct dump_graph_roots_visitor_data {
    struct apfl_format_writer w;
    bool success;
};

static void
dump_graph_roots_visitor(void *opaque, struct gc_object *obj)
{
    struct dump_graph_roots_visitor_data *data = opaque;
    data->success = data->success
        && apfl_format_put_string(data->w, " ROOTS -> obj_")
        && apfl_format_put_poiner(data->w, (void *)obj)
        && apfl_format_put_string(data->w, "\n");
}

struct dump_graph_visitor_data {
    struct apfl_format_writer w;
    struct gc_object *parent;
    bool success;
};

static void
dump_graph_visitor(void *opaque, struct gc_object *obj)
{
    struct dump_graph_visitor_data *data = opaque;
    data->success = data->success
        && apfl_format_put_string(data->w, " obj_")
        && apfl_format_put_poiner(data->w, (void *)data->parent)
        && apfl_format_put_string(data->w, " -> obj_")
        && apfl_format_put_poiner(data->w, (void *)obj)
        && apfl_format_put_string(data->w, "\n");
}

bool
apfl_gc_debug_dump_graph(struct gc *gc, struct apfl_format_writer w)
{
    FMT_TRY(apfl_format_put_string(w, "digraph G {\n"));

    struct dump_graph_roots_visitor_data roots_visitor_data = {
        .w = w,
        .success = true,
    };
    visit_roots(gc, dump_graph_roots_visitor, &roots_visitor_data);
    FMT_TRY(roots_visitor_data.success);

    for (struct gc_block *block = gc->block; block != NULL; block = block->next) {
        int counts[4] = {0, 0, 0, 0};
        for (size_t i = 0; i < GC_OBJECTS_PER_BLOCK; i++) {
            struct gc_object *obj = &block->objects[i];

            counts[obj->status]++;
            if (obj->status == GC_STATUS_FREE) {
                continue;
            }

            FMT_TRY(apfl_format_put_string(w, " blk_"));
            FMT_TRY(apfl_format_put_poiner(w, (void *)block));
            FMT_TRY(apfl_format_put_string(w, " -> obj_"));
            FMT_TRY(apfl_format_put_poiner(w, (void *)obj));
            FMT_TRY(apfl_format_put_string(w, "\n"));

            FMT_TRY(apfl_format_put_string(w, " obj_"));
            FMT_TRY(apfl_format_put_poiner(w, (void *)obj));
            FMT_TRY(apfl_format_put_string(w, "[style=filled,fillcolor="));
            FMT_TRY(apfl_format_put_string(w, dump_graph_bgcolor(obj->status)));
            FMT_TRY(apfl_format_put_string(w, ",fontcolor="));
            FMT_TRY(apfl_format_put_string(w, dump_graph_fgcolor(obj->status)));
            FMT_TRY(apfl_format_put_string(w, ",label=\"Object "));
            FMT_TRY(apfl_format_put_poiner(w, (void *)obj));
            FMT_TRY(apfl_format_put_string(w, "\\ntype: "));
            FMT_TRY(apfl_format_put_string(w, type_to_string(obj->type)));
            FMT_TRY(apfl_format_put_string(w, "\"];\n"));

            struct dump_graph_visitor_data visitor_data = {
                .w = w,
                .parent = obj,
                .success = true,
            };

            visit_children(obj, dump_graph_visitor, &visitor_data);
            FMT_TRY(visitor_data.success);
        }

        FMT_TRY(apfl_format_put_string(w, " BLOCKS -> blk_"));
        FMT_TRY(apfl_format_put_poiner(w, (void *)block));
        FMT_TRY(apfl_format_put_string(w, ";\n"));

        FMT_TRY(apfl_format_put_string(w, " blk_"));
        FMT_TRY(apfl_format_put_poiner(w, (void *)block));
        FMT_TRY(apfl_format_put_string(w, " [label=\"Block "));
        FMT_TRY(apfl_format_put_poiner(w, (void *)block));
        FMT_TRY(apfl_format_put_string(w, "\\nfree "));
        FMT_TRY(apfl_format_put_int(w, counts[GC_STATUS_FREE]));
        FMT_TRY(apfl_format_put_string(w, ", black "));
        FMT_TRY(apfl_format_put_int(w, counts[GC_STATUS_BLACK]));
        FMT_TRY(apfl_format_put_string(w, ", grey "));
        FMT_TRY(apfl_format_put_int(w, counts[GC_STATUS_GREY]));
        FMT_TRY(apfl_format_put_string(w, ", white "));
        FMT_TRY(apfl_format_put_int(w, counts[GC_STATUS_WHITE]));
        FMT_TRY(apfl_format_put_string(w, "\"];\n"));
    }

    FMT_TRY(apfl_format_put_string(w, "}\n"));

    return true;
}

void
apfl_gc_deinit(struct gc *gc)
{
    for (struct gc_block *block = gc->block; block != NULL; ) {
        for (size_t i = 0; i < GC_OBJECTS_PER_BLOCK; i++) {
            struct gc_object *object = &block->objects[i];
            if (object->status != GC_STATUS_FREE) {
                deinit_object(gc, object);
            }
        }

        struct gc_block *next = block->next;
        FREE_OBJ(gc->allocator, block);
        block = next;
    }
    gc->block = NULL;

    FREE_LIST(gc->allocator, gc->tmproots.roots, gc->tmproots.cap);
}