apfl/src/token.c
Laria Carolin Chabowski 90a80152e1 Implement mark&sweep garbage collection and bytecode compilation
Instead of the previous refcount-based garbage collection, we're now using
a basic tri-color mark&sweep collector. This is done to support cyclical
value relationships in the future (functions can form cycles, all values
implemented up to this point cannot).

The collector maintains a set of roots and a set of objects (grouped into
blocks). GC-enabled objects are no longer allocated manually but by the GC
itself. The GC also wraps an allocator, so it knows when we run out of
memory and tries to recover by performing a full collection cycle.
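
The retry-on-failure behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions: every `sketch_*` name is hypothetical and not part of apfl's API, the collection step is a placeholder that merely counts invocations, and the real collector may structure this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not apfl's actual API. */
struct sketch_gc {
    /* Underlying allocator; returns NULL when out of memory. */
    void *(*raw_alloc)(void *ctx, size_t size);
    void *ctx;
    /* Number of full collections triggered so far. */
    unsigned collections;
};

/* Placeholder for a stop-the-world mark&sweep pass. */
static void
sketch_gc_full_collection(struct sketch_gc *gc)
{
    gc->collections++;
}

/* Allocate through the GC: on failure, collect once and retry once. */
static void *
sketch_gc_alloc(struct sketch_gc *gc, size_t size)
{
    assert(gc != NULL && gc->raw_alloc != NULL);

    void *mem = gc->raw_alloc(gc->ctx, size);
    if (mem == NULL) {
        sketch_gc_full_collection(gc);
        mem = gc->raw_alloc(gc->ctx, size);
    }
    return mem;
}

/* Demo allocator that fails a configurable number of times first. */
static void *
sketch_flaky_alloc(void *ctx, size_t size)
{
    int *failures_left = ctx;
    if (*failures_left > 0) {
        (*failures_left)--;
        return NULL;
    }
    return malloc(size);
}
```

With `sketch_flaky_alloc` set up to fail once, a call to `sketch_gc_alloc` triggers exactly one collection and then succeeds. If the retry fails as well, the caller still receives NULL and has to handle it.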

The tri-color abstraction was chosen for two reasons:

- We don't have to maintain a list of objects that need to be marked; we
  can simply grab the next grey one.
- It should allow us to later implement incremental collection (right now
  we only do a stop-the-world collection).
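
As a minimal sketch of the tri-color idea, assuming a fixed array of objects with at most two outgoing references each (the `sketch_*` names and the layout are hypothetical; the collector in this commit groups objects into blocks instead): white objects are unvisited, grey ones are reachable but not yet scanned, and black ones are fully scanned. Marking repeatedly grabs the next grey object, so no separate work list is needed, and cycles terminate because only white objects are ever turned grey.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_color { SKETCH_WHITE, SKETCH_GREY, SKETCH_BLACK };

#define SKETCH_MAX_REFS 2

/* Hypothetical object layout, for illustration only. */
struct sketch_object {
    enum sketch_color color;
    bool alive;                                  /* false once swept */
    struct sketch_object *refs[SKETCH_MAX_REFS]; /* NULL = unused slot */
};

/* Find any live grey object, or NULL if none is left. */
static struct sketch_object *
sketch_next_grey(struct sketch_object *objs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (objs[i].alive && objs[i].color == SKETCH_GREY) {
            return &objs[i];
        }
    }
    return NULL;
}

/* Stop-the-world collection; returns the number of objects swept. */
static size_t
sketch_collect(
    struct sketch_object *objs,
    size_t n,
    struct sketch_object **roots,
    size_t nroots
) {
    /* Everything starts out white (unvisited). */
    for (size_t i = 0; i < n; i++) {
        objs[i].color = SKETCH_WHITE;
    }

    /* Roots are reachable by definition. */
    for (size_t i = 0; i < nroots; i++) {
        roots[i]->color = SKETCH_GREY;
    }

    /* Mark: grey the white children of a grey object, then blacken it. */
    struct sketch_object *obj;
    while ((obj = sketch_next_grey(objs, n)) != NULL) {
        for (size_t i = 0; i < SKETCH_MAX_REFS; i++) {
            struct sketch_object *ref = obj->refs[i];
            if (ref != NULL && ref->color == SKETCH_WHITE) {
                ref->color = SKETCH_GREY;
            }
        }
        obj->color = SKETCH_BLACK;
    }

    /* Sweep: whatever is still white was never reached. */
    size_t swept = 0;
    for (size_t i = 0; i < n; i++) {
        if (objs[i].alive && objs[i].color == SKETCH_WHITE) {
            objs[i].alive = false;
            swept++;
        }
    }
    return swept;
}
```

An unreachable pair of objects that reference each other is swept by this sketch, which is exactly the case plain reference counting cannot handle.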

This also switches to a bytecode-based evaluation of the code: We no longer
directly evaluate the AST, but first compile it into a series of
instructions that are evaluated in a separate step. This was done in
preparation for implementing functions: We only need to turn a function
body into instructions once instead of evaluating the node again with each
call of the function. Also, since an instruction list is implemented as a
GC object, this removes manual memory management of the function body and
its child nodes. Since the GC and the bytecode go hand in hand, this was
done in one (giant) commit.
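
To illustrate the compile-then-evaluate split, here is a rough sketch for a toy expression language with only numbers and addition. The node types, opcodes and `sketch_*` functions are all made up for this example and do not correspond to apfl's real AST or instruction set. The point is that the tree is walked once by the compiler, and the resulting instruction list can then be run any number of times.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy AST: a number literal or an addition. */
enum sketch_node_type { SKETCH_NODE_NUM, SKETCH_NODE_ADD };

struct sketch_node {
    enum sketch_node_type type;
    double num;                          /* used by SKETCH_NODE_NUM */
    const struct sketch_node *lhs, *rhs; /* used by SKETCH_NODE_ADD */
};

/* Hypothetical instruction set for a small stack machine. */
enum sketch_op { SKETCH_OP_PUSH, SKETCH_OP_ADD };

struct sketch_insn {
    enum sketch_op op;
    double operand; /* used by SKETCH_OP_PUSH */
};

#define SKETCH_MAX_INSNS 64

struct sketch_code {
    struct sketch_insn insns[SKETCH_MAX_INSNS];
    size_t len;
};

static void
sketch_emit(struct sketch_code *code, enum sketch_op op, double operand)
{
    assert(code->len < SKETCH_MAX_INSNS);
    code->insns[code->len++] = (struct sketch_insn) { op, operand };
}

/* Walk the tree once, emitting instructions in postfix order. */
static void
sketch_compile(struct sketch_code *code, const struct sketch_node *node)
{
    switch (node->type) {
    case SKETCH_NODE_NUM:
        sketch_emit(code, SKETCH_OP_PUSH, node->num);
        break;
    case SKETCH_NODE_ADD:
        sketch_compile(code, node->lhs);
        sketch_compile(code, node->rhs);
        sketch_emit(code, SKETCH_OP_ADD, 0);
        break;
    }
}

/* Evaluate the instruction list; the AST is not needed any more. */
static double
sketch_run(const struct sketch_code *code)
{
    /* Each instruction pushes at most one value, so this is enough. */
    double stack[SKETCH_MAX_INSNS];
    size_t sp = 0;

    for (size_t i = 0; i < code->len; i++) {
        switch (code->insns[i].op) {
        case SKETCH_OP_PUSH:
            stack[sp++] = code->insns[i].operand;
            break;
        case SKETCH_OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        }
    }

    assert(sp == 1);
    return stack[0];
}
```

A function body compiled this way would only go through `sketch_compile` once; each call would then just execute `sketch_run` on the stored instructions.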

As a downside, we've now lost the ability to do list matching on
assignments. I've already started to work on implementing this in the new
architecture, but left it out of this commit, as it's already quite a large
commit :)
2022-04-11 22:24:22 +02:00

122 lines
2.6 KiB
C

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#include "apfl.h"

/* Token types that carry a string payload in the `text` member. */
static bool
has_text_data(enum apfl_token_type type)
{
    switch (type) {
    case APFL_TOK_NAME:
    case APFL_TOK_STRING:
    case APFL_TOK_COMMENT:
        return true;
    default:
        return false;
    }
}

/* Token types that carry a numeric payload in the `number` member. */
static bool
has_numeric_data(enum apfl_token_type type)
{
    switch (type) {
    case APFL_TOK_NUMBER:
        return true;
    default:
        return false;
    }
}

/* Release the string payload, if this token type has one. */
void
apfl_token_deinit(struct apfl_allocator allocator, struct apfl_token *token)
{
    if (has_text_data(token->type)) {
        apfl_string_deinit(allocator, &token->text);
    }
}

const char *
apfl_token_type_name(enum apfl_token_type type)
{
    switch (type) {
    case APFL_TOK_LPAREN:
        return "(";
    case APFL_TOK_RPAREN:
        return ")";
    case APFL_TOK_LBRACKET:
        return "[";
    case APFL_TOK_RBRACKET:
        return "]";
    case APFL_TOK_LBRACE:
        return "{";
    case APFL_TOK_RBRACE:
        return "}";
    case APFL_TOK_MAPSTO:
        return "->";
    case APFL_TOK_EXPAND:
        return "~";
    case APFL_TOK_DOT:
        return ".";
    case APFL_TOK_AT:
        return "@";
    case APFL_TOK_SEMICOLON:
        return ";";
    case APFL_TOK_LINEBREAK:
        return "LINEBREAK";
    case APFL_TOK_CONTINUE_LINE:
        return "\\";
    case APFL_TOK_COMMENT:
        return "COMMENT";
    case APFL_TOK_COMMA:
        return ",";
    case APFL_TOK_QUESTION_MARK:
        return "?";
    case APFL_TOK_STRINGIFY:
        return "'";
    case APFL_TOK_ASSIGN:
        return "=";
    case APFL_TOK_LOCAL_ASSIGN:
        return ":=";
    case APFL_TOK_NUMBER:
        return "NUMBER";
    case APFL_TOK_NAME:
        return "NAME";
    case APFL_TOK_STRING:
        return "STRING";
    }

    return "(unknown token)";
}

void
apfl_token_print(struct apfl_token token, FILE *file)
{
    if (has_text_data(token.type)) {
        fprintf(
            file,
            "%s (" APFL_STR_FMT ") @ (%d:%d)\n",
            apfl_token_type_name(token.type),
            APFL_STR_FMT_ARGS(apfl_string_view_from(token.text)),
            token.position.line,
            token.position.col
        );
    } else if (has_numeric_data(token.type)) {
        fprintf(
            file,
            "%s (%f) @ (%d:%d)\n",
            apfl_token_type_name(token.type),
            token.number,
            token.position.line,
            token.position.col
        );
    } else {
        fprintf(
            file,
            "%s @ (%d:%d)\n",
            apfl_token_type_name(token.type),
            token.position.line,
            token.position.col
        );
    }
}