Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

That was a fun read. Very interesting.

I have noticed that the AST nodes don't have a parent reference. I wonder how it knows for example when encountering a "continue LABEL" that the code is nested maybe deeply in a loop with that label. The only way I would think of is traversing the tree up but I think there is no way of doing this. How do they do it?



Based on [1] (where continue is handled) and [2] (the branch statement type), looks like the target is simply passed to the function. This is a recursive descent parser, so passing the target down the statement handling stack is trivial (and indeed, this is what happens for for[3]).

[1] https://github.com/golang/go/blob/master/src/cmd/compile/int... ("ctx" is a "targets" struct containing two entries, one for which statement 'break' targets and another for which statement 'continue' targets).

[2] https://github.com/golang/go/blob/master/src/cmd/compile/int...

[3] https://github.com/golang/go/blob/master/src/cmd/compile/int... (the targets{s, s} passed to innerBlock is then passed to blockBranches - s is the for statement node itself).

As a sidenote, it is interesting how easy Go reads even though i never wrote a single line of Go myself (though i do write a lot of C and Object Pascal and the code patterns look similar).


Now that you have pointed where it is, it really looks easy to understand how it works. Thank you!


Context from higher up the tree can easily be passed down when recursively processing, as others mention.

As to why do that, there are several reasons.

It's fiddly to construct trees when you need to patch the children with parent references. The most natural thing to do is create the children (usually via recursive parse) then construct the parent. If you then need to patch the children with a reference to their parent, it's more work. It also means your nodes can't be immutable (but they're often not wholly immutable anyway for other reasons, like annotating during passes that add semantic information).

It's easier to reason about tree manipulations when you only have downward pointers. For example, maybe you want to rewrite a common subexpression with a reference to a temporary, and reuse one of the common trees as the RHS on the assignment to the temporary. It's more effort if you need to patch both ways on the link, rather than just grab the tree and slot it into the assignment.

(It's possible that you have DAGs rather than trees, and have children with shared parents, but I think this isn't worth any extra representative or compression that it gives you because passes will want to mutate those nodes, and meeting the same nodes more than once makes invariants more complex.)

Finally, more interesting traversals, following control or data flow, can cut across and jump between tree branches, so parent links don't necessarily help you there either.

It helps that most languages don't have parse trees which would stress the runtime stack when processing recursively, outside of machine-generated code (and correspondingly, it's not that hard to get a stack overflow error or equivalent "too much nesting" error if you generate code targeting that failure mode).


I don’t know how the Go compiler works but one solution is for each scope level to have a symbol table with a reference to the parent level’s table.

So you can traverse up for symbols.


i recently hacked together my first simple compiler and the way i handled stuff like break/continue is that every loop pushed a "context" onto a stack, so when the compiler encounters a break/continue it just looks into the context to find the label it should jump to. could be similar




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: