
Modern Compilers

Modern compilers consist of a front-end and a back-end. The front-end maps a high-level source language to an intermediate representation (IR). The back-end optimizes the IR and emits code in a low-level language such as assembly.

1 Front-End

  1. Lexical analysis - a lexer takes a stream of characters as input and outputs a stream of tokens. The lexer also assigns each token a category, e.g. identifier, reserved word, operator, or literal.
  2. Syntactic analysis - a parser takes the stream of tokens and outputs a syntax tree.
  3. Semantic analysis - checks that the syntax tree is meaningful under the language rules, e.g. type checking.
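
As an illustration of the first step, here is a minimal sketch of a regex-based lexer. The token categories, the keyword set, and the function name `tokenize` are assumptions made for this example and belong to no particular compiler.

```python
import re

# Hypothetical token categories for a tiny illustrative language.
KEYWORDS = {"if", "else", "while", "return"}
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=<>();{}]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Yield (category, text) pairs for each token in the source string."""
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "NAME" and text in KEYWORDS:
            kind = "KEYWORD"  # reserved words look like names, so reclassify
        yield (kind, text)
```

A parser would then consume these pairs one at a time instead of looking at raw characters.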

2 Intermediate Representation

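The syntax tree can itself be used as an IR. A more useful structure for optimization is a Control Flow Graph (CFG). Nodes in this graph are basic blocks (straight-line sequences of code with no internal branches), and edges represent possible transfers of control between blocks, such as conditional branches and jumps.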
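
One possible way to represent such a graph is sketched below. The `BasicBlock` class, its field names, and the textual instruction format are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    label: str
    instructions: list = field(default_factory=list)
    # Labels of the blocks that control may transfer to after this one.
    successors: list = field(default_factory=list)


def build_example_cfg():
    """Build a CFG for: if (x > 0) y = 1; else y = 2; return y;"""
    blocks = [
        BasicBlock("entry", ["t = x > 0", "branch t then else"], ["then", "else"]),
        BasicBlock("then", ["y = 1"], ["exit"]),
        BasicBlock("else", ["y = 2"], ["exit"]),
        BasicBlock("exit", ["return y"], []),
    ]
    return {block.label: block for block in blocks}
```

The conditional in the source becomes two outgoing edges from `entry`, and both branches rejoin at `exit`.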

3 Back-End

  1. Optimization - unreachable paths through the CFG can be removed; dead code (code whose results are never used) can be removed; variables whose values are known at compile time can be replaced by those constants ahead of time.
  2. Mapping to assembly - each block in the CFG can be translated into a labeled block of assembly. Variables are assigned to registers where possible.
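
Two of the optimizations listed above can be sketched in a few lines. This is a simplified illustration, not how a production compiler does it: the CFG is assumed to be a dict from block label to successor labels, and instructions are assumed to be `(dest, op, a, b)` tuples whose operands are either integers or variable names.

```python
def reachable_blocks(successors, entry):
    """Return the set of block labels reachable from the entry block."""
    seen = set()
    worklist = [entry]
    while worklist:
        label = worklist.pop()
        if label in seen:
            continue
        seen.add(label)
        worklist.extend(successors.get(label, []))
    return seen


def remove_unreachable(successors, entry):
    """Drop blocks that no path from the entry block can reach."""
    live = reachable_blocks(successors, entry)
    return {label: succs for label, succs in successors.items() if label in live}


def fold_constants(instructions):
    """Replace variables with known constant values inside one basic block."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    known = {}  # variable name -> constant value discovered so far
    result = []
    for dest, op, a, b in instructions:
        a = known.get(a, a)  # substitute a known constant if there is one
        b = known.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            known[dest] = ops[op](a, b)
            result.append((dest, "const", known[dest], None))
        else:
            known.pop(dest, None)  # dest is no longer a known constant
            result.append((dest, op, a, b))
    return result
```

For example, folding `x = 2 + 3; y = x * 4; z = y + n` leaves only the last instruction as real arithmetic, because `n` is not known until run time.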

4 Useful links