A compiler takes a program in a source language as input and produces a program in a target language as output.

If we think deeper, what is required of such a function?

A mapping! Different parts of the source program should map to parts of the output.

To get these different parts we use lexical analysis. It produces a lexed IR, which gives us our syntactic tokens, e.g. if, else, let.
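As a rough illustration of lexical analysis, here is a minimal sketch of a regex-based lexer. The token kinds (NUMBER, IDENT, OP, KEYWORD) and the tiny language they describe are assumptions made for this example, not part of any particular compiler.

```python
import re

# Hypothetical token kinds for a tiny illustrative language.
KEYWORDS = {"if", "else", "let"}
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source):
    """Turn source text into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue  # whitespace carries no meaning in this toy language
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"  # keywords are identifiers with reserved spellings
        tokens.append((kind, text))
    return tokens
```

Calling lex("let x = 42") yields a KEYWORD token for let, an IDENT for x, an OP for = and a NUMBER for 42. Real lexers usually also record line and column information for error messages.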

Then we use parsing to create an intermediate representation with an interface we can work with. It produces a parse tree IR. Note that loops in the source program are not given special treatment here, because they are still just parse tree structure.
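To make the parsing step concrete, here is a minimal recursive-descent sketch for arithmetic expressions. The grammar, the helper tokenize, and the nested-tuple tree shape are all assumptions chosen for illustration, and the result is closer to an abstract syntax tree than a full parse tree.

```python
import re


def tokenize(source):
    # Simplified lexer: integers and a few single-character symbols only.
    return re.findall(r"\d+|[+*()]", source)


def parse(tokens):
    """Parse a token list into a nested-tuple tree.

    Grammar (assumed for illustration):
        expr   -> term ('+' term)*
        term   -> factor ('*' factor)*
        factor -> NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("add", node, term())
        return node

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("mul", node, factor())
        return node

    def factor():
        tok = advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isdigit():
            return ("num", int(tok))
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

With this grammar, parse(tokenize("1 + 2 * 3")) nests the multiplication under the addition, so operator precedence is encoded in the shape of the tree rather than in the token stream.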

We then use semantic analysis to understand the semantics of the source program, and finally use code generation to output a program in the target language.
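Code generation can be sketched as a walk over the tree that emits target instructions. The example below assumes an expression tree of nested tuples such as ("add", ("num", 1), ("num", 2)) and a hypothetical stack machine with PUSH, ADD and MUL instructions; the small run function exists only to check the generated program.

```python
def generate(node):
    """Emit instructions for a hypothetical stack machine (post-order walk)."""
    kind = node[0]
    if kind == "num":
        return [("PUSH", node[1])]
    if kind in ("add", "mul"):
        op = "ADD" if kind == "add" else "MUL"
        # Operands first, operator last: the operator finds both values on the stack.
        return generate(node[1]) + generate(node[2]) + [(op,)]
    raise ValueError(f"unknown node kind {kind!r}")


def run(program):
    """Tiny interpreter for the target instructions, used to check the output."""
    stack = []
    for instr in program:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instr[0] == "ADD" else left * right)
    return stack.pop()
```

For the tree of 1 + 2 * 3, generate emits PUSH 1, PUSH 2, PUSH 3, MUL, ADD, and run evaluates that program to 7. A real backend would target an actual instruction set and handle registers, calling conventions and so on.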

More IRs may be produced through the various phases: type checking, semantic analysis, optimization, code generation.
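An optimization phase can be viewed as a function from one IR to another. As a small sketch, the constant-folding pass below rewrites a nested-tuple expression tree into a simpler tree of the same form. The node kinds num, var, add and mul are assumptions for this example.

```python
def fold(node):
    """Constant-fold a nested-tuple expression tree, returning a new tree."""
    kind = node[0]
    if kind in ("num", "var"):
        return node  # leaves are already as simple as they can be
    if kind not in ("add", "mul"):
        raise ValueError(f"unknown node kind {kind!r}")
    left, right = fold(node[1]), fold(node[2])
    if left[0] == "num" and right[0] == "num":
        # Both operands are known at compile time, so compute the result now.
        value = left[1] + right[1] if kind == "add" else left[1] * right[1]
        return ("num", value)
    return (kind, left, right)
```

For example, the tree for x + 2 * 3 becomes the tree for x + 6, while the variable x is left untouched because its value is not known at compile time.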

Frontend - the machine-independent parts. Backend - the machine-dependent parts.


Common tools used

  • yacc, bison, (f)lex: not modern, but stable. No built-in support for Unicode.

  • ANTLR: modern.

  • Definite clause grammars

  • Parser combinators: building larger parsers by combining smaller ones.
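To illustrate the parser combinator idea from the list above, here is a minimal sketch written without any library. Each parser is a function from (text, pos) to either (value, new_pos) or None on failure. The names char, seq, alt, many and fmap are chosen for this example and do not refer to a specific combinator library.

```python
def char(expected):
    """Parser that matches one specific character."""
    def parser(text, pos):
        if pos < len(text) and text[pos] == expected:
            return expected, pos + 1
        return None
    return parser


def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parser(text, pos):
        results = []
        for p in parsers:
            outcome = p(text, pos)
            if outcome is None:
                return None
            value, pos = outcome
            results.append(value)
        return results, pos
    return parser


def alt(*parsers):
    """Try each parser in order and return the first success."""
    def parser(text, pos):
        for p in parsers:
            outcome = p(text, pos)
            if outcome is not None:
                return outcome
        return None
    return parser


def many(p):
    """Apply p zero or more times, collecting the results."""
    def parser(text, pos):
        results = []
        while True:
            outcome = p(text, pos)
            if outcome is None or outcome[1] == pos:
                return results, pos  # stop on failure or when p makes no progress
            value, pos = outcome
            results.append(value)
    return parser


def fmap(p, fn):
    """Transform the value produced by p, leaving the position unchanged."""
    def parser(text, pos):
        outcome = p(text, pos)
        if outcome is None:
            return None
        value, new_pos = outcome
        return fn(value), new_pos
    return parser


# Larger parsers are built purely by combining smaller ones.
digit = alt(*[char(c) for c in "0123456789"])
number = fmap(seq(digit, many(digit)), lambda r: int(r[0] + "".join(r[1])))
addition = seq(number, char("+"), number)
```

Here number("42+1", 0) succeeds with the value 42 and the new position 2, and addition("42+1", 0) consumes the whole input. The design choice is that the grammar is ordinary code, so no separate generator step is needed, at the cost of the static checks a tool such as yacc or ANTLR can perform.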

  • Memory format

  • Compiling with continuations

  • Lambda calculus

  • Executable and Linkable Format (ELF)

See also

What is Flex?