Crafting A Compiler

Charles N. Fischer / Ron K. Cytron / Richard J. LeBlanc  
Total pages
October 2009
Related Titles


A practical yet thorough treatment of compiler construction.


Crafting a Compiler is an undergraduate-level text that presents a practical approach to compiler construction with thorough coverage of the material and examples that clearly illustrate the concepts in the book. Unlike other texts on the market, Fischer/Cytron/LeBlanc uses object-oriented design patterns and incorporates an algorithmic exposition with modern software practices. The text and its package of accompanying resources allow any instructor to teach a thorough and compelling course in compiler construction in a single semester. Crafting a Compiler  serves as an ideal reference and tutorial.


Use of modern object-oriented patterns (such as Visitor)

Many students are already familiar with object-oriented patterns, and these patterns tend to organize software in such a way that the resulting compliers are more clear and more easily understood, so their use throughout the book allows students to absorb the concepts more readily. For students who have not seen these patterns, their use in a compiler is a compelling example.


Example:  Code generation is cast as a visit of the AST


Use of pseudocode in algorithms

This book offers superior coverage of the algorithms used in the front- and back-end of the compiler.  Unlike other texts that can lead to frustration with a mix of detail and abstraction that can be confusing to students, this book patterns the pseudocode after popular algorithm texts, such as Cormen, Leiserson, Rivest that most students have seen before.


Example:  Many algorithms in a compiler involve sets.  Instead of using low-level data structures to represent sets, they are referenced in pseudocode as sets.  Most languages include implementation of set operations, and those could simply be used for implementing the algorithms.  Students and professors who wish to study the algorithms at  a deeper level can devise more efficient data structures after the basic concepts have been studied.


Exercises as Knowledge-based Inquiry

Most chapters include exercises that are meant to inspire professor- or student-directed exploration. The authors find that such “knowledge-based inquiry” is more fun for students and leads to better retention of concepts. The text forms the foundation for such exploration, but the student and professors are guided into further study by these exercises.


Abstract Syntax Tree (AST) is a pivotal element of the exposition

The book is neatly divided into the phases that construct the AST and those that process the AST.  In contrast with other books, the design and implementation of the AST is forefront in the text, and serves as a central element in type checking and code generation. Interfaces implemented by AST nodes trigger actions by the various compiler phases. As a result, students need not read or write as much code, and new elements are easily incorporated into the AST.


Example:  Type checking can be implemented by visiting the tree and processing interfaces that specify how type rules apply to a given subtree. Semantic changes or additions can be implemented by changing these interfaces and their behavior in the visitors that process the AST.


Supplementary materials

Various components of the compiler can be provided to students so they can focus on the elements of particular interest in a given course offering.


Example:  A front end could be given for parsing Java, and students can then focus on type checking or code generation.  Or, students could develop and incorporate language extensions into the front end, with the back end mostly provided by the instructor.

Table of Contents

Chapter 1 Introduction

1.1 Overview and History of Compilation

1.2 What Compilers Do

1.2.1 Distinguishing Compilers by the Machine Code Generated

1.2.2 Target Code Formats

1.3 Interpreters

1.4 Syntax and Semantics of Programming Languages

1.4.1 Static Semantics

1.4.2 Run-time Semantics

1.5 Organization of a Compiler

1.5.1 The Scanner

1.5.2 The Parser

1.5.3 The Type Checker (Semantic Analysis)

1.5.4 The Optimizer

1.5.5 The Code Generator

1.5.6 Compiler Writing Tools

1.6 Compiler Design and Programming Language Design

1.7 Architectural Influences of Computer Design

1.8 Compiler Variants

1.8.1 Debugging (Development) Compilers

1.8.2 Optimizing Compilers

1.8.3 Retargetable Compilers

1.9 Program Development Environment

Chapter 2 A Simple Compiler

2.1 An Informal Definition of the ac Language

2.2 Formal Definition of ac

2.2.1 Syntax Specification

2.2.2 Token Specification

2.3 Phases of a Simple Compiler

2.4 Scanning

2.5 Parsing

2.5.1 Predicting a Parsing Procedure

2.5.2 Implementing the Production

2.6 Abstract Syntax Trees

2.7 Semantic Analysis

2.7.1 Symbol Tables

2.7.2 Type Checking

2.8 Code Generation

Chapter 3 Scanning–Theory and Practice

3.1 Overview of a Scanner

3.2 Regular Expressions

3.3 Examples

3.4 Finite Automata and Scanners

3.4.1 Deterministic Finite Automata

3.5 The Lex Scanner Generator

3.5.1 Defining Tokens in Lex

3.5.2 The Character Class

3.5.3 Using Regular Expressions to Define Tokens

3.5.4 Character Processing Using Lex

3.6 Other Scanner Generators

3.7 Practical Considerations of Building Scanners

3.7.1 Processing Identifiers and Literals

3.7.2 Using Compiler Directives and Listing Source Lines

3.7.3 Terminating the Scanner

3.7.4 Multicharacter Lookahead

3.7.5 Performance Considerations

3.7.6 Lexical Error Recovery

3.8 Regular Expressions and Finite Automata

3.8.1 Transforming a Regular Expression into an NFA

3.8.2 Creating the DFA

3.8.3 Optimizing Finite Automata

3.9 Summary

Chapter 4 Grammars and Parsing

4.1 Context-Free Grammars: Concepts and Notation

4.1.1 Leftmost Derivations

4.1.2 Rightmost Derivations

4.1.3 Parse Trees

4.1.4 Other Types of Grammars

4.2 Properties of CFGs

4.2.1 Reduced Grammars

4.2.2 Ambiguity

4.2.3 Faulty Language Definition

4.3 Transforming Extended Grammars

4.4 Parsers and Recognizers

4.5 Grammar Analysis Algorithms

4.5.1 Grammar Representation

4.5.2 Deriving the Empty String

4.5.3 First Sets

4.5.4 Follow Sets

Chapter 5 Top-Down Parsing

5.1 Overview

5.2 LL(k) Grammars

5.3 Recursive-Descent LL(1) parsers

5.4 Table-Driven LL(1) Parsers

5.5 Obtaining LL(1) Grammars

5.5.1 Common Prefixes

5.5.2 Left-Recursion

5.6 A Non-LL(1) Language

5.7 Properties of LL(1) Parsers

5.8 Parse-Table Representation

5.8.1 Compaction

5.8.2 Compression

5.9 Syntactic Error Recovery and Repair

5.9.1 Error Recover

5.9.2 Error Repair

5.9.3 Error Detection in LL(1) Parsers

5.9.4 Error Recovery in LL(1) Parsers

Chapter 6 Bottom-Up Parsing

6.1 Introduction

6.2 Shift-Reduce Parsers

6.2.1 LR Parsers and Rightmost Derivations

6.2.2 LR Parsing as Knitting

6.2.3 LR Parsing Engine

6.2.4 The LR Parse Table

6.2.5 LR(k) Parsing

6.3 LR(0) Table Construction

6.4 Conflict Diagnosis

6.4.1 Ambiguous Grammars

6.4.2 Grammars that are not LR(k)

6.5 Conflict Resolution for LR(0) Tables

6.5.1 SLR(k) Table Construction

6.5.2 LALR(k) Table Construction

6.6 LR(k) Table Construction

Chapter 7 Syntax-Directed Translation

7.1 Overview

7.1.1 Semantic Actions and Values

7.1.2 Synthesized and Inherited Attributes

7.2 Bottom-Up Syntax-Directed Translation

7.2.1 Example

7.2.2 Rule Cloning

7.2.3 Forcing Semantic Actions

7.2.4 Aggressive Grammar Restructuring

7.3 Top-Down Syntax-Directed Translation

7.4 Abstract Syntax Trees

7.4.1 Concrete and Abstract Trees

7.4.2 An Efficient AST Data Structure

7.4.3 Infrastructure for Creating ASTs

7.5 AST Design and Construction

7.5.1 Design

7.5.2 Construction

7.6 AST Structures for Left and Right Values

7.7 Design Patterns for ASTs

7.7.1 Node Class Hierarchy

7.7.2 Visitor Pattern

7.7.3 Reflective Visitor Pattern

Chapter 8 Symbol Tables and Declaration Processing

8.1 Constructing a Symbol Table

8.1.1 Static Scoping

8.1.2 A Symbol Table Interface

8.2 Block-Structured Languages and Scope Management

8.2.1 Handling Scopes

8.2.2 One Symbol Table or Many?

8.3 Basic Implementation Techniques

8.3.1 Entering and Finding Names

8.3.2 The Name Space

8.3.3 An Efficient Symbol Table Implementation

8.4 Advanced Features

8.4.1 Records and Typenames

8.4.2 Overloading and Type Hierarchies
8.4.3 Implicit Declarations

8.4.4 Export and Import Directives

8.4.5 Altered Search Rules

8.5 Declaration Processing Fundamentals

8.5.1 Attributes in the Symbol Table

8.5.2 Type Descriptor Structures

8.5.3 Type Checking Using an Abstract Syntax Tree

8.6 Semantic Processing of Simple Declarations

8.6.1 Simple Variable Declarations

8.6.2 Handling Type Names

8.6.3 Name References

8.6.4 Type Declarations and References

8.6.5 Variable Declarations Revisited

8.6.6 Enumeration Types

8.7 Semantic Processing for Simple Names and Expressions: An Introduction to Type Checking

8.7.1 Handling Simple Identifiers and and Literal Constants

8.7.2 Processing Expressions

8.8 Type Declarations

8.9 Static and Dynamic Allocation

8.9.1 Initialization of Variables

8.9.2 Constant Declarations

8.10 Classes and Structures

8.10.1 Variant Records and Unions

8.11 Arrays

8.11.1 Static One-Dimensional Arrays

8.11.2 Multidimensional Arrays

8.12 Implementing Other Types

8.13 Key Idea Summary

Chapter 9 Expressions and Type Checking

9.1 Semantic Analysis for Control Structures

9.1.1 If Statements

9.1.2 While, Do and Repeat Loops

9.1.3 For Loops

9.1.4 Break, Continue, Return and Goto Statements

9.1.5 Switch and Case Statements

9.1.6 Exception Handling

9.2 Semantic Analysis of Calls

Chapter 10 Intermediate Representations

10.1 Overview

10.1.1 Examples

10.1.2 The Middle End

10.2 Java Virtual Machine

10.2.1 Introduction and Design Principles

10.2.2 Contents of a Class File

10.2.3 JVM Instructions

10.3 Static Single Assignment Form

10.3.1 Renaming and --functions

10.4 GCC ILs

Chapter 11 Code Generation for a Virtual Machine

11.1 Visitors for Code Generation

11.2 Class and Method Declarations

11.2.1 Class Declarations

11.2.2 Method Declarations

11.3 The MethodBodyVisitor

11.3.1 Constants

11.3.2 References to Local Storage

11.3.3 Static References

11.3.4 Expressions

11.3.5 Assignment

11.3.6 Method Calls

11.3.7 Field References

11.3.8 Conditional Execution

11.3.9 Loops

11.4 The LHSVisitor

11.4.1 Local References

11.4.2 Static References

11.4.3 Field References

Chapter 12 Runtime Support

12.1 Static Allocation

12.2 Stack Allocation

12.2.1 Accessing Frames at Run-Time

12.2.2 Handling Classes and Objects

12.2.3 Handling Multiple Scopes

12.2.4 Block-Level Allocation

12.2.5 More About Frames

12.3 Heap Management

12.3.1 Allocation Mechanisms

12.3.2 Deallocation Mechanisms

12.3.3 Automatic Garbage Collection

Chapter 13 Target Code Generation

13.1 Translating Bytecodes

13.1.1 Allocating memory addresses

13.1.2 Allocating Arrays and Objects

13.1.3 Method Calls

13.1.4 Example

13.2 Translating Expression Trees

13.3 Register Allocation and Temporary Management

13.3.1 On the Fly Register Allocation

13.3.2 Register Allocation Using Graph Coloring

13.3.3 Priority Based Register Allocation

13.3.4 Interprocedural Register Allocation

13.4 Code Scheduling

13.4.1 Improving Code Scheduling

13.4.2 Global and Dynamic Code Scheduling

13.5 Automatic Instruction Selection

13.5.1 Instruction Selection Using BURS

13.5.2 Instruction Selection Using Twig

13.5.3 Other Approaches

13.6 Peephole Optimization

13.6.1 Levels of Peephole Optimization

13.6.2 Automatic Generation of Peephole Optimizers

Chapter 14 Program Optimization 505

14.1 Introduction

14.1.1 Why Optimize?

14.1.2 Organization

14.2 Data Flow Analysis

14.2.1 Introduction and Examples

14.2.2 Formal Specification

14.2.3 Evaluation WARNING this subsection is incomplete

14.2.4 Application of Data Flow Frameworks

14.3 Advanced Optimizations

14.3.1 SSA Form

14.3.2 SSA-based Transformations

14.3.3 Loop Transformations