Compiler
A compiler takes a program written in one language (the source) and produces an equivalent program in another language (the target).
Thinking of it as a function, what does it require?
A mapping! Each part of the source program should map to some part of the output.
To identify these parts we first use lexical analysis. It produces a lexed IR: a stream of tokens, e.g. the keywords if, else, let.
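A minimal sketch of lexical analysis, assuming a toy language with numbers, identifiers, a few operators and the keywords if, else, let. The token class names (NUMBER, KEYWORD, IDENT, OP) and the function name lex are made up for illustration.

```python
import re

# Hypothetical token classes for a toy language; order matters, since
# KEYWORD must be tried before IDENT so that "let" is not read as a name.
TOKEN_SPEC = [
    ("NUMBER",  r"\d+"),
    ("KEYWORD", r"\b(?:if|else|let)\b"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("OP",      r"[+\-*/=()]"),
    ("SKIP",    r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source):
    """Turn source text into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":      # whitespace is dropped, not tokenized
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

With this sketch, lex("let x = 1 + 2") yields a KEYWORD token for let, an IDENT for x, and then OP and NUMBER tokens for the rest.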
Then we use parsing to build an intermediate representation with an interface we can work with: the parse tree IR. Note that loops are not treated specially here, because a loop is still just parse tree structure.
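A minimal sketch of parsing, assuming the tokens are already available as a Python list of integers and operator strings. It is a hand-written recursive-descent parser for arithmetic only; the nested-tuple tree shape (op, left, right) and the function name parse are choices made for this example.

```python
def parse(tokens):
    """Parse a token list into a nested-tuple tree such as ('+', 1, ('*', 2, 3)).

    Assumed grammar:
        expr   := term (('+' | '-') term)*
        term   := factor (('*' | '/') factor)*
        factor := NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if isinstance(token, int):
            return token
        if token == "(":
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())   # left-associative
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())     # left-associative
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

For example, parse([1, "+", 2, "*", 3]) gives ("+", 1, ("*", 2, 3)): operator precedence ends up encoded in the shape of the tree, which is what later phases work with.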
We then use semantic analysis to understand the meaning of the source program,
and finally code generation to output a program in the target language.
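A minimal sketch of these last two steps, assuming the tree is a nested tuple (op, left, right) whose leaves are integers or variable names. The only semantic rule checked here is "every variable must be defined". The target is a hypothetical stack machine whose instruction names (PUSH, LOAD, ADD, SUB, MUL) are invented for this example.

```python
def check(node, defined):
    """Semantic analysis (toy version): reject uses of undefined variables."""
    if isinstance(node, int):
        return
    if isinstance(node, str):
        if node not in defined:
            raise NameError(f"undefined variable {node!r}")
        return
    _op, left, right = node
    check(left, defined)
    check(right, defined)


def generate(node):
    """Code generation: emit instructions for a hypothetical stack machine."""
    if isinstance(node, int):
        return [("PUSH", node)]
    if isinstance(node, str):
        return [("LOAD", node)]
    op, left, right = node
    opcode = {"+": "ADD", "-": "SUB", "*": "MUL"}[op]
    # Operands are evaluated first, then the operator pops both results.
    return generate(left) + generate(right) + [(opcode,)]
```

For the tree ("+", "x", ("*", 2, 3)) with x defined, check passes and generate emits LOAD x, PUSH 2, PUSH 3, MUL, ADD, so each node of the tree maps to a piece of the output.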
More IRs may be produced as the program passes through the various phases: typechecking, semantic analysis, optimization, code generation.
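As an illustration of one of these phases, here is a minimal sketch of typechecking over a nested-tuple tree. The two types (int, bool) and the operators (+, *, <, and) are assumptions chosen to keep the example small, not a fixed rule set.

```python
def type_of(node):
    """Return 'int' or 'bool' for a nested-tuple expression, or raise TypeError."""
    if isinstance(node, bool):      # test bool first: bool is a subclass of int in Python
        return "bool"
    if isinstance(node, int):
        return "int"
    op, left, right = node
    left_type, right_type = type_of(left), type_of(right)
    if op in ("+", "*"):
        if left_type == right_type == "int":
            return "int"
    elif op == "<":
        if left_type == right_type == "int":
            return "bool"
    elif op == "and":
        if left_type == right_type == "bool":
            return "bool"
    raise TypeError(f"cannot apply {op!r} to {left_type} and {right_type}")
```

Here type_of(("<", 1, ("+", 2, 3))) is "bool", while ("+", 1, True) is rejected with a TypeError before any code is generated.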
Frontend - Machine-independent parts. Backend - Machine-dependent parts.
Phases
- lexical analysis
- parsing
- semantic analysis
- typechecking
- code generation
- compiler optimization
  - control-flow
  - inlining: a call is replaced by a copy of the called function's body
    - the inlined code is fixed in place statically, at compile time,
    - a normal call is set up at run time
https://www.ibm.com/support/pages/what-does-it-mean-inline-function-and-how-does-it-affect-program
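A minimal sketch of function inlining as an optimization pass, assuming a nested-tuple IR in which a call is written ("call", name, arg, ...) and each function is stored as (params, body). The names substitute and inline are hypothetical. The sketch also assumes the functions are non-recursive and the expressions have no side effects, since arguments may be duplicated.

```python
def substitute(node, env):
    """Replace parameter names in `node` with the expressions bound in `env`."""
    if isinstance(node, str):
        return env.get(node, node)
    if isinstance(node, tuple):
        keep = 2 if node[0] == "call" else 1   # operator (and callee name) stay as-is
        return node[:keep] + tuple(substitute(child, env) for child in node[keep:])
    return node


def inline(node, functions):
    """Replace each call to a known function with a copy of its body."""
    if not isinstance(node, tuple):
        return node
    if node[0] == "call":
        name = node[1]
        args = [inline(arg, functions) for arg in node[2:]]
        if name not in functions:
            return ("call", name, *args)       # unknown callee: leave the call in place
        params, body = functions[name]
        expanded = substitute(body, dict(zip(params, args)))
        return inline(expanded, functions)     # the body may itself contain calls
    return (node[0],) + tuple(inline(child, functions) for child in node[1:])
```

With functions = {"square": (("x",), ("*", "x", "x"))}, inlining ("+", ("call", "square", 3), 1) gives ("+", ("*", 3, 3), 1): the call has disappeared at compile time, at the cost of a larger program.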
Common tools used
yacc, bison, (f)lex - Not modern but stable. No built-in support for Unicode.
ANTLR - Modern.
Definite clause grammars - https://www.metalevel.at/prolog/dcg https://github.com/indocomsoft/aoc2020/blob/main/16/ans.pl http://csci431.artifice.cc/notes/prolog-parsing.html
Parser combinators - Build a larger parser by combining smaller parsers.
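A minimal sketch of the parser combinator idea, written from scratch rather than with any particular library. Here a parser is assumed to be a function (text, pos) that returns (value, new_pos) on success or None on failure. The combinator names (char, seq, alt, many1, fmap) are common conventions, but this exact interface is made up for the example.

```python
def char(expected):
    """Parser that matches one specific character."""
    def parser(text, pos):
        if pos < len(text) and text[pos] == expected:
            return expected, pos + 1
        return None
    return parser


def digit(text, pos):
    """Parser that matches a single ASCII digit."""
    if pos < len(text) and text[pos] in "0123456789":
        return text[pos], pos + 1
    return None


def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parser(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parser


def alt(*parsers):
    """Try each parser in turn and return the first success."""
    def parser(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parser


def many1(p):
    """Apply p one or more times, collecting the results."""
    def parser(text, pos):
        values = []
        while (result := p(text, pos)) is not None:
            value, pos = result
            values.append(value)
        return (values, pos) if values else None
    return parser


def fmap(p, fn):
    """Transform the value produced by p with fn."""
    def parser(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        value, pos = result
        return fn(value), pos
    return parser


# Larger parsers are built purely by combining the small ones above.
number = fmap(many1(digit), lambda digits: int("".join(digits)))
addition = fmap(seq(number, char("+"), number), lambda parts: parts[0] + parts[2])
```

For example, number("42+1", 0) returns (42, 2), and addition("12+30", 0) returns (42, 5). The grammar lives in ordinary function composition, so no separate generator step (as with yacc or ANTLR) is needed.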
Related topics
- Memory format
- Compiling with continuations
- Lambda calculus
- Executable and Linkable Format (ELF)