
The breadth-first solution to the top-down parsing problem is to maintain a list of all possible predictions. Each of these predictions is then processed as described in Section 6.2 above, that is, if there is a non-terminal in front, the prediction stack is replaced by several new prediction stacks, as many as there are choices for this non-terminal. In each of these new prediction stacks, the non-terminal is replaced by the corresponding choice. This prediction step is repeated for all prediction stacks it applies to (including the new ones), until all prediction stacks have a terminal in front. Then, for each of the prediction stacks we match the terminal in front with the current input symbol, and strike out all prediction stacks that do not match. If there are no prediction stacks left, the sentence does not belong to the language. So, instead of one prediction stack/analysis stack pair, our automaton now maintains a list of prediction stack/analysis stack pairs, one for each possible choice, as depicted in Figure 6.7.

    matched input | rest of input
    analysis1     | prediction1
    analysis2     | prediction2
    ...           | ...

Figure 6.7 An instantaneous description of our extended automaton

The method is suitable for on-line parsing, because it processes the input from left to right. Any parsing method that processes its input from left to right and results in a left-most derivation is called an LL parsing method. The first L stands for Left to right, and the second L for Left-most derivation.

Now, we almost know how to write a parser along these lines, but there is one detail that we have not properly dealt with yet: termination. Does the input sentence belong to the language defined by the grammar when, ultimately, we have an empty prediction stack? Only when the input is exhausted! To avoid this extra check, and to avoid problems about what to do when we arrive at the end of the sentence but haven’t finished parsing yet, we introduce a special so-called end-marker #, which is appended at the end of the sentence. Also, a new grammar rule S' --> S# is added to the grammar, where S' is a new non-terminal that serves as a new start symbol. The end-marker behaves like an ordinary terminal symbol; when we have an empty prediction, we know that the last step taken was a match with the end-marker, and that this match succeeded. This also means that the input is exhausted, so it must be accepted.
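The predict and match steps, together with the end-marker, can be sketched in Python as follows. This is a minimal recognizer under stated assumptions, not the book's implementation: it keeps only the prediction stacks (no analysis stacks), symbols are single-character strings, `#` does not occur in the grammar, and the toy grammar `TOY_GRAMMAR` (S --> aSb | ab) is hypothetical and chosen only for illustration. Like the method itself, it does not terminate on left-recursive grammars.

```python
def breadth_first_recognize(grammar, start, sentence, end_marker="#"):
    """Return True if `sentence` can be derived from `start`.

    `grammar` maps each non-terminal to a list of right-hand sides
    (tuples of symbols). Any symbol that is not a key is a terminal.
    Caveat: loops forever on left-recursive grammars.
    """
    rest = list(sentence) + [end_marker]
    # Same as starting from S' and applying the added rule S' --> S#.
    predictions = [(start, end_marker)]
    for symbol in rest:
        # Predict step: expand until every stack has a terminal in front.
        ready = []
        while predictions:
            stack = predictions.pop()
            if stack and stack[0] in grammar:
                for rhs in grammar[stack[0]]:
                    predictions.append(tuple(rhs) + stack[1:])
            else:
                ready.append(stack)
        # Match step: keep only stacks whose front terminal matches.
        predictions = [s[1:] for s in ready if s and s[0] == symbol]
        if not predictions:
            return False  # no prediction stacks left
    # The last match was with the end-marker, so an empty stack means accept.
    return any(len(s) == 0 for s in predictions)


# Hypothetical toy grammar: S --> aSb | ab
TOY_GRAMMAR = {"S": [("a", "S", "b"), ("a", "b")]}

if __name__ == "__main__":
    for text in ["ab", "aabb", "aab"]:
        print(text, breadth_first_recognize(TOY_GRAMMAR, "S", text))
```

Because the end-marker is matched like any other terminal, the loop needs no separate test for "input exhausted".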

6.3.1 An example

Figure 6.8 presents a complete breadth-first parsing of the sentence aabc#. At first there is only one prediction stack: it contains the start-symbol; no symbols have been accepted yet (a). The step leading to (b) is a simple predict step; there is no other right-hand side for S'. Another predict step leads us to (c), but this time there are two possible right-hand sides, so we obtain two prediction stacks; note that the difference of the prediction stacks is also reflected in the analysis stacks, where the different suffixes of S represent the different right-hand sides predicted. Another predict step with several right-hand sides leads to (d). Now, all prediction stacks have a terminal on top;

(a)                      | aabc#
                         | S'

(b)                      | aabc#
    S'₁                  | S#

(c)                      | aabc#
    S'₁S₁                | DC#
    S'₁S₂                | AB#

(d)                      | aabc#
    S'₁S₁D₁              | abC#
    S'₁S₁D₂              | aDbC#
    S'₁S₂A₁              | aB#
    S'₁S₂A₂              | aAB#

(e) a                    | abc#
    S'₁S₁D₁a             | bC#
    S'₁S₁D₂a             | DbC#
    S'₁S₂A₁a             | B#
    S'₁S₂A₂a             | AB#

(f) a                    | abc#
    S'₁S₁D₁a             | bC#
    S'₁S₁D₂aD₁           | abbC#
    S'₁S₁D₂aD₂           | aDbbC#
    S'₁S₂A₁aB₁           | bc#
    S'₁S₂A₁aB₂           | bBc#
    S'₁S₂A₂aA₁           | aB#
    S'₁S₂A₂aA₂           | aAB#

(g) aa                   | bc#
    S'₁S₁D₂aD₁a          | bbC#
    S'₁S₁D₂aD₂a          | DbbC#
    S'₁S₂A₂aA₁a          | B#
    S'₁S₂A₂aA₂a          | AB#

(h) aa                   | bc#
    S'₁S₁D₂aD₁a          | bbC#
    S'₁S₁D₂aD₂aD₁        | abbbC#
    S'₁S₁D₂aD₂aD₂        | aDbbbC#
    S'₁S₂A₂aA₁aB₁        | bc#
    S'₁S₂A₂aA₁aB₂        | bBc#
    S'₁S₂A₂aA₂aA₁        | aB#
    S'₁S₂A₂aA₂aA₂        | aAB#

(i) aab                  | c#
    S'₁S₁D₂aD₁ab         | bC#
    S'₁S₂A₂aA₁aB₁b       | c#
    S'₁S₂A₂aA₁aB₂b       | Bc#

(j) aab                  | c#
    S'₁S₁D₂aD₁ab         | bC#
    S'₁S₂A₂aA₁aB₁b       | c#
    S'₁S₂A₂aA₁aB₂bB₁     | bcc#
    S'₁S₂A₂aA₁aB₂bB₂     | bBcc#

(k) aabc                 | #
    S'₁S₂A₂aA₁aB₁bc      | #

(l) aabc#                |
    S'₁S₂A₂aA₁aB₁bc#     |

Figure 6.8 The breadth-first parsing of the sentence aabc# (in each state the first line shows matched input | rest of input; every following line shows one analysis stack | prediction stack pair)

all happen to match, resulting in (e). Next, we again have some predictions with a non-terminal in front, so another predict step leads us to (f). The next step is a match step, leading to (g); fortunately, some matches fail, and these are just dropped as they can never lead to a successful parse. From (g) to (h) is again a predict step. Another match where, again, some matches fail, leads us to (i). A further prediction results in (j) and then two matches result in (k) and (l), leading to a successful parse (the prediction stack is empty). The analysis is

S'₁S₂A₂aA₁aB₁bc#.

For now, we do not need the terminals in the analysis; discarding them gives

S'₁S₂A₂A₁B₁.

This means that we get a left-most derivation by first applying rule S'₁, then rule S₂, then rule A₂, etc., all the time replacing the left-most non-terminal. Check:

S' --> S# --> AB# --> aAB# --> aaB# --> aabc#.
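The check above can also be done mechanically by replaying the rule numbers from the analysis, always rewriting the left-most non-terminal. The sketch below is illustrative only: the `RULES` table is read off the predict steps shown in Figure 6.8 (the alternatives of C are never predicted there, so they are left out), and the names `RULES`, `NONTERMINALS` and `replay_leftmost` are hypothetical.

```python
# Rule numbering as it appears in the predict steps of Figure 6.8.
RULES = {
    ("S'", 1): ["S", "#"],
    ("S", 1): ["D", "C"],
    ("S", 2): ["A", "B"],
    ("A", 1): ["a"],
    ("A", 2): ["a", "A"],
    ("B", 1): ["b", "c"],
    ("B", 2): ["b", "B", "c"],
    ("D", 1): ["a", "b"],
    ("D", 2): ["a", "D", "b"],
}
# C has no entries in RULES, but it is still a non-terminal.
NONTERMINALS = {lhs for lhs, _ in RULES} | {"C"}


def replay_leftmost(analysis, start="S'"):
    """Apply the numbered rules in order to the left-most non-terminal.

    Returns the list of sentential forms, starting with the start symbol.
    """
    form = [start]
    forms = ["".join(form)]
    for lhs, choice in analysis:
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        if not positions or form[positions[0]] != lhs:
            raise ValueError(f"rule {lhs}{choice} does not apply to {''.join(form)}")
        index = positions[0]
        form[index:index + 1] = RULES[(lhs, choice)]
        forms.append("".join(form))
    return forms


# The analysis S'1 S2 A2 A1 B1 with the terminals discarded.
ANALYSIS = [("S'", 1), ("S", 2), ("A", 2), ("A", 1), ("B", 1)]

if __name__ == "__main__":
    print(" --> ".join(replay_leftmost(ANALYSIS)))
```

Raising an error when the next rule does not belong to the left-most non-terminal is a design choice that makes a corrupted analysis fail loudly instead of producing a wrong derivation.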

The breadth-first method described here was first presented by Greibach [CF 1964]. However, in that presentation, grammars are first transformed into Greibach Normal Form, and the steps taken are like the ones our initial pushdown automaton makes. The predict and match steps are combined.

6.3.2 A counterexample: left-recursion

The method discussed above clearly works for this grammar, and the question arises whether it works for all context-free grammars. One would think it does, because all possibilities are systematically tried, for all non-terminals, in any occurring prediction. Unfortunately, this reasoning has a serious flaw that is demonstrated by the following example: let us see if the sentence ab belongs to the language defined by the simple grammar

S --> Sb | a

Our automaton starts off in the following state (matched input | rest of input on the first line, then one analysis stack | prediction stack pair per line):

                         | ab#
                         | S'

As we have a non-terminal at the beginning of the prediction, we use a predict step, resulting in:

                         | ab#
    S'₁                  | S#

Now, another predict step results in:

                         | ab#
    S'₁S₁                | Sb#
    S'₁S₂                | a#

As one prediction again starts with a non-terminal, we predict once more:

                         | ab#
    S'₁S₁S₁              | Sbb#
    S'₁S₁S₂              | ab#
    S'₁S₂                | a#

By now, it is clear what is happening: we seem to have ended up in an infinite process leading us nowhere. The reason for this is that we keep trying the S --> Sb rule without ever coming to a state where a match can be attempted. This problem can occur whenever there is a non-terminal that derives an infinite sequence of sentential forms, all starting with a non-terminal, so no matches can take place. As all these sentential forms in this infinite sequence start with a non-terminal, and the number of non-terminals is finite, there is at least one non-terminal A occurring more than once at the start of those sentential forms. So, we have: A → ... → Aα. A non-terminal that derives a sentential form starting with itself is called left-recursive.

Left-recursion comes in two kinds: we speak of immediate left-recursion when there is a grammar rule A → Aα, like in the rule S --> Sb; we speak of indirect left-recursion when the recursion goes through other rules, for instance A → Bα, B → Aβ. Both forms of left-recursion can be concealed by ε-producing non-terminals. For instance in the grammar

S --> ABc
B --> Cd
B --> ABf
C --> Se
A --> ε

the non-terminals S, B, and C are all left-recursive. Grammars with left-recursive non-terminals are called left-recursive as well.
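Whether a non-terminal is left-recursive, including the case where the recursion is concealed by ε-producing non-terminals, can be checked mechanically. The following sketch is one possible way to do it, not a method taken from the text: it first computes the ε-producing (nullable) non-terminals, then records which non-terminals can appear in front after one step, and finally takes the transitive closure. The function names and the dictionary encoding of the grammar are assumptions made for this example.

```python
def nullable_nonterminals(grammar):
    """Return the set of non-terminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in nullable:
                continue
            # An empty right-hand side () makes all(...) true immediately.
            if any(all(sym in nullable for sym in rhs) for rhs in alternatives):
                nullable.add(lhs)
                changed = True
    return nullable


def left_recursive_nonterminals(grammar):
    """Return the non-terminals that derive a form starting with themselves."""
    nullable = nullable_nonterminals(grammar)
    # front[A]: non-terminals that can end up in front after one rule of A.
    front = {lhs: set() for lhs in grammar}
    for lhs, alternatives in grammar.items():
        for rhs in alternatives:
            for sym in rhs:
                if sym not in grammar:
                    break  # a terminal blocks everything behind it
                front[lhs].add(sym)
                if sym not in nullable:
                    break  # a non-nullable non-terminal also blocks
    # Transitive closure: follow the "can be in front" relation repeatedly.
    changed = True
    while changed:
        changed = False
        for lhs in grammar:
            reachable = set()
            for nt in front[lhs]:
                reachable |= front[nt]
            if not reachable <= front[lhs]:
                front[lhs] |= reachable
                changed = True
    return {lhs for lhs in grammar if lhs in front[lhs]}


# The example grammar from the text; () encodes the ε right-hand side.
EXAMPLE = {
    "S": [("A", "B", "c")],
    "B": [("C", "d"), ("A", "B", "f")],
    "C": [("S", "e")],
    "A": [()],
}

if __name__ == "__main__":
    print(sorted(left_recursive_nonterminals(EXAMPLE)))
```

For the example grammar this reports S, B and C, and not A, in agreement with the text: A only produces ε, so B can move to the front of ABc and ABf.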

If a grammar has no ε-rules and no loops, we could still use our parsing scheme if we use one extra step: if a prediction stack has more symbols than the unmatched part of the input sentence, it can never derive the sentence (no ε-rules), so it can be dropped. However, this little trick has one big disadvantage: it requires us to know the length of the input sentence in advance, so the method no longer is suitable for on-line parsing. Fortunately, left-recursion can be eliminated: given a left-recursive grammar, we can transform it into a grammar without left-recursive non-terminals that defines the same language. As left-recursion poses a major problem for any top-down parsing method, we will now discuss this grammar transformation.
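The length-based trick mentioned above can be sketched as a small change to the breadth-first recognizer: before a prediction stack is expanded or matched, it is dropped if it holds more symbols than the unmatched input (end-marker included). This is an illustration under the same assumptions as the text, namely a grammar without ε-rules and without loops; the function name and the grammar encoding are hypothetical. As noted, it needs the whole sentence up front.

```python
def recognize_with_length_pruning(grammar, start, sentence, end_marker="#"):
    """Breadth-first recognizer that survives left-recursion by pruning.

    Assumes `grammar` has no ε-rules and no loops, so every symbol on a
    prediction stack must produce at least one input symbol.
    """
    rest = list(sentence) + [end_marker]
    predictions = [(start, end_marker)]  # S' --> S# already applied
    for position, symbol in enumerate(rest):
        unmatched = len(rest) - position
        ready = []
        while predictions:
            stack = predictions.pop()
            if len(stack) > unmatched:
                continue  # too long to ever derive the remaining input
            if stack and stack[0] in grammar:
                for rhs in grammar[stack[0]]:
                    predictions.append(tuple(rhs) + stack[1:])
            else:
                ready.append(stack)
        predictions = [s[1:] for s in ready if s and s[0] == symbol]
        if not predictions:
            return False
    # Every surviving stack has just matched the end-marker and is empty.
    return True


# The left-recursive grammar from the counterexample: S --> Sb | a
LEFT_RECURSIVE = {"S": [("S", "b"), ("a",)]}

if __name__ == "__main__":
    for text in ["ab", "abbb", "b"]:
        print(text, recognize_with_length_pruning(LEFT_RECURSIVE, "S", text))
```

The predict loop now terminates on S --> Sb | a because each application of S --> Sb lengthens the stack by one symbol, so the length bound is reached after a finite number of steps.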
