Building a Recursive Descent Parser
This is a very basic overview and will not suffice to fully implement a recursive descent parser.
Productions
All production heads are non-terminals, so a function must exist for each non-terminal.
Production Body
For all terminals in the production body, call match(t), for all non-terminals call the associated function.
Example 1
Non-Terminals = {A, B}
Terminals = {a,c,b}
A() {
if (lookahead == 'a') {
match(a);
B();
B();
}
else if (lookahead == 'c') {
match(c);
A();
match(c);
B();
}
else {
error();
}
}
B() {
if (lookahead == 'b') {
match(b);
match(c);
match(a);
}
else if (lookahead == 'c') {
match(c);
}
else {
error();
}
}
match(token t) {
if (lookahead == t) {
lookahead = nexttoken();
}
else {
error();
}
}
Example 2
expr() {
if (lookahead == 'i') {
match(i);
cond();
match(t);
expr();
print("true");
match(e);
expr();
print("false");
}
// if epsilon, then do nothing!
}
cond() {
// some code here
}
match(token t) {
if (lookahead == t) {
lookahead = nexttoken();
}
else {
error();
}
}
Derivations
A syntax-tree can be leftmost or rightmost derived; left and right refer to the direction in which non-terminals are expanded.
Notation
reads "derives"
reads "derives in zero or more steps."
reads "derives in one or more steps."
If the sentential form contains no non-terminals, it is a Sentence.
The language of some Grammar G is the set of Sentences.
A language generated by a grammar is a context-free Grammar.
LL(1) Grammar
The first 'L' denotes the order in which input is scanned; in this case, from left to right. The second 'L' denotes the derivation direction and '1' represents the number of input characters viewed at a time.
Thus, LL(1) means we scan from left to right, building a leftmost derivation tree, while observing one character at a time.
LL Grammar is suitable for top-down parsers.
Last updated
Was this helpful?