Maurice's Notes
Blog
Low Level Computing
Low Level Computing
  • Operating Systems
    • General Operating Systems
      • OS Structure
      • Main Memory
        • Basic Hardware
        • Address Binding
        • Memory Address Register
      • Booting
        • MBR (Master Boot Record)
        • Global Descriptor Table
      • Direct Memory Access (DMA)
        • DMA
      • Processes
        • Basics
        • Process Scheduling
    • Linux Operating System
      • Linker Scripts
      • Position Independent Code/Executable
      • Relocation
      • Understanding PLT and GOT
    • Windows Operating System
      • Page 1
    • Real-Time Embedded Systems
      • Real-Time Scheduling
        • Cyclic Executive
  • Computer Architecture
    • Architecture Fundamentals
      • Introduction
      • Cache Basics
      • Cache Memory
      • A Few CPU Formulas
    • RISC Architectures
      • ARM
        • ARM Design Philosophy
        • RISC Review
        • Exceptions, Interrupts, & Vector Table
        • ARM Pipelines
        • ARM Registers
        • ARM Branch Instructions
        • ARM CSPR (Instructions)
        • ARM Data Processing Instructions
        • Load/Store Instructions
        • Profiling Cycle Counter
        • Compiler Optimizations
      • RISCV
    • CISC Architectures
    • Cache Coherency
      • Basic Introduction
      • Memory Sequential Consistency
  • Exploits
    • Walkthrough: Return-to-Libc
    • Access Physical Memory
  • Compilers & Runtime Systems
    • Introduction
      • Programming Language Basics
      • Static Scope
    • Syntax Translation
      • Syntax Defined
      • Parsing
    • Algorithms
      • FIRST FOLLOW (Top-Down) Parsing
      • Building a Recursive Descent Parser
      • Construction: Regular Expression -> NFA
Powered by GitBook
On this page

Was this helpful?

  1. Compilers & Runtime Systems
  2. Syntax Translation

Parsing

PreviousSyntax DefinedNextAlgorithms

Last updated 8 months ago

Was this helpful?

Parsing is taking a string of terminals and figuring out how to derive it from the start symbol of the grammar. If it cannot be derived from the start symbol of the grammar, the parser must report syntax errors within the string.

Parse Trees

Parse trees pictorially represent how the start symbol of a grammar derives a string in a language. Consider the following A -> XYZ; a parse would look like this:

Parse Tree Rules:

  1. The root is labeled by the start symbol

  2. Each interior node is labeled by a nonterminal

list -> list + digit
list -> list - digit
list -> digit
digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

From left to right, the leaves of a parse tree form the yield of the tree, which is the string generated or derived from the root of the parse tree.

The process of finding a parse tree for a given string of terminals is called parsing.

Each leaf is labeled by a terminal or by ϵ\epsilonϵ

If A is the nonterminal labeling some interior node and X1,X2,X3,...,XnX_1, X_2, X_3, ..., X_nX1​,X2​,X3​,...,Xn​ are the labels of the children of that node from left to right, then there must exist a production A -> X1X2...XnX_1X_2 ... X_nX1​X2​...Xn​ , where X1X2...XnX_1X_2 ... X_nX1​X2​...Xn​ are terminal or nonterminal symbols.

A -> ϵ\epsilonϵ is a special case requiring the interior node A to have a single child labeled ϵ\epsilonϵ.

Consider the example statement from the : 9 - 5 + 2. We created the following grammar productions:

Since list -> list + digit is the first production the nonterminal list (on the left side) is the start symbol, and also the root of the parse tree. The root will have three children list, +, and digit. Recall that interior nodes are labeled by a nonterminal; you should immediately recognize the the plus operator is a terminal label, and cannot be an interior node; it is a leaf. (leaf nodes are nonterminal or ϵ\epsilonϵ ).

previous page
Drawing
Drawing