We investigate a set of equivalence properties (or laws) p that state when two non-identical expressions E1 and E2 are equivalent, denoted by E1 ∼p E2. Given p, we can either reformulate E1 into E2 or E2 into E1. For example, given commutativity as a property of conjunction, the expressions A ∧ B and B ∧ A are equivalent, which we denote (A ∧ B) ∼comm (B ∧ A). Therefore, we can replace the former with the latter and vice versa.

Thus, we consider two different things. First, an equivalence property or law states when two expressions are equivalent, e.g. commutativity. Second, reformulations rewrite one expression into another representation by exploiting the equivalence property. Note that such reformulations need not be deterministic. As an example, the expression A ∧ B ∧ C can be rewritten into 5 different representations by exploiting commutativity, such as C ∧ B ∧ A or A ∧ C ∧ B. Furthermore, each such reformulation has an inverse. Note that a reformulation is beneficial if the chosen representation is not worse than the representation that is replaced.
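The count of 5 alternative representations can be checked with a minimal Python sketch. It assumes that the conjunction is treated as one flat n-ary operator over distinct operands; the helper name commutative_variants is hypothetical and only serves this illustration.

```python
from itertools import permutations

def commutative_variants(operands, op="∧"):
    """Return every ordering of the operands except the original one.

    Assumption: the operator is a flat n-ary commutative operator and
    the operands are pairwise distinct.
    """
    original = tuple(operands)
    variants = []
    for perm in permutations(original):
        if perm != original:
            variants.append(f" {op} ".join(perm))
    return variants

variants = commutative_variants(["A", "B", "C"])
print(len(variants))   # 3! - 1 = 5 alternative representations
print(variants)
```

Since each of these orderings can be picked, the reformulation is not deterministic unless a fixed target representation (for example a sorted operand order) is chosen.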

To apply reformulations in a structured, efficient way, a detection step is required, in which two (or more) equivalent but not identical subtrees are spotted in an instance. Naturally, the effort required to detect equivalences can be arbitrarily large, especially for very powerful reformulations. For instance, detecting a maximal clique of disequalities to match the global constraint alldifferent is NP-complete. Therefore, we are interested in measures with low detection effort but high node-reduction potential.

In summary, we can increase the number of identical subexpressions in an instance by exploiting an equivalence property p between two distinct expressions E1 and E2 and rewriting one representation into the other by applying a corresponding reformulation r. Ideally, this approach has the following properties:

• The detection of two equivalent expressions is cheap and integrable into tailoring.

• The reformulation from E1 to E2 (or E2 to E1) is cheap.

• It is easy to determine which of the two equivalent representations E1 or E2 is preferable, so that the reformulation does not impair the instance.

In the following, we present an algorithm that applies a generic reformulation r (with respect to a generic equivalence property p) to rewrite subexpressions into identical representations. Subsequently, we discuss a set of equivalence properties and their applicability for increasing the number of identical subexpressions.

An Algorithm for Reformulation

We consider an algorithm to increase the number of identical subexpressions by reformulating expressions using a reformulation r based on an equivalence property p. The algorithm is embedded into flattening when common subexpressions are detected. Whenever a to-be-flattened expression E has no common subexpression, the algorithm performs the following steps:

1. Reformulate E to E_r, using reformulation r

2. Generate the String representation String_Er from expression E_r

3. Check the hash-table for an entry of String_Er

4. If successful, return the corresponding auxiliary variable, aux_r, after adding another entry into the hash-table: E → aux_r

5. Otherwise continue flattening

Note that the additional hash-table entry in step 4 is necessary in case E has common subexpressions in the constraint instance: if another occurrence of E is flattened, the hash-table check will be positive during standard CSE and the reformulation need not be repeated.

More formally, the algorithm is summarised in Alg. 4.3, which illustrates the extensions (in red font) to FLATTEN_CSE, the recursive flattening procedure that performs CSE. Whenever a non-leaf child e_i of E has no common subexpression and r is applicable to it (line 8), e_i is reformulated using reformulation r (line 9), yielding the reformulated expression e_r. Next, e_r is converted to the String StringR (line 10) and the hash-table is checked for an entry of StringR (line 11). If the check is successful, the corresponding auxiliary variable aux is retrieved (line 12) and a hash-entry is added to the hash-table, linking the original expression to aux, i.e. String_ei → aux (line 13). This ensures that if e_i appears again in the instance, the hash-table will have an entry and the whole reformulation process will not be repeated. Otherwise, if the hash-table has no entry of StringR, flattening proceeds as usual.

Algorithm 4.3 Reformulation for CS-Increase. The recursive procedure FLATTEN_REF(E, flatten2Aux) is based on the CSE-flattening procedure from Alg. 4.2 (FLATTEN_CSE) and performs a general reformulation r in order to increase the number of identical common subexpressions. Extensions are given in red font.

 1: if ¬(all of E's children are leaves) then
 2:   for all e_i ∈ children(E) do
 3:     if ¬(e_i.isLeaf) then
 4:       String_ei ← toString(e_i)
 5:       if hash-table.contains(String_ei) then
 6:         aux ← hash-table.get(String_ei)
 7:       else
 8:         if r is applicable to e_i then
 9:           e_r ← r(e_i)
10:           StringR ← toString(e_r)
11:           if hash-table.contains(StringR) then
12:             aux ← hash-table.get(StringR)
13:             hash-table.add(String_ei, aux)
14:           else
15:             aux ← FLATTEN_REF(e_i, true)
16:             hash-table.add(String_ei, aux)
17:         else
18:           aux ← FLATTEN_REF(e_i, true)
19:           hash-table.add(String_ei, aux)
20:       E.replaceChildWith(e_i, aux)
21: if flatten2Aux then
22:   Aux ← createNewVariable(E.lb, E.ub); auxVars.add(Aux)
23:   constraintBuffer.add('Aux = E')
24:   return Aux
25: else
26:   return E
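The procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation analysed in this section: the expression type Expr, the class Flattener, the set COMMUTATIVE and the auxiliary-variable naming scheme are all hypothetical, and the generic reformulation r is instantiated with one concrete choice, namely sorting the operands of a commutative operator.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Expr:
    op: str
    children: List["Node"]

Node = Union[Expr, str]                  # a leaf is just a variable name

COMMUTATIVE = {"and", "or", "+", "*"}    # assumed operator set for r

def to_string(e: Node) -> str:
    if isinstance(e, str):
        return e
    return e.op + "(" + ",".join(to_string(c) for c in e.children) + ")"

def is_applicable(e: Expr) -> bool:
    return e.op in COMMUTATIVE

def reformulate(e: Expr) -> Expr:
    # Assumed r: sort the direct operands by their String representation.
    return Expr(e.op, sorted(e.children, key=to_string))

class Flattener:
    def __init__(self):
        self.table = {}          # String representation -> auxiliary variable
        self.constraints = []    # plays the role of constraintBuffer
        self.counter = 0

    def flatten(self, e: Expr, flatten_to_aux: bool) -> Node:
        for i, child in enumerate(e.children):
            if isinstance(child, str):
                continue                           # leaves stay as they are
            key = to_string(child)
            if key in self.table:                  # standard CSE hit
                aux = self.table[key]
            else:
                aux = None
                if is_applicable(child):           # line 8
                    key_r = to_string(reformulate(child))
                    aux = self.table.get(key_r)    # lines 11-12
                if aux is None:                    # lines 15 and 18
                    aux = self.flatten(child, True)
                self.table[key] = aux              # lines 13, 16 and 19
            e.children[i] = aux                    # line 20
        if flatten_to_aux:
            self.counter += 1
            name = f"aux{self.counter}"
            self.constraints.append(f"{name} = {to_string(e)}")
            return name
        return e

f = Flattener()
c1 = f.flatten(Expr("or", [Expr("and", ["A", "B"]), "C"]), False)
c2 = f.flatten(Expr("or", ["D", Expr("and", ["B", "A"])]), False)
print(to_string(c1), to_string(c2), f.constraints)
```

In this usage example, and(B,A) in the second constraint is reformulated to and(A,B), which already has an entry, so both constraints share the same auxiliary variable. As in the pseudocode, only the original String is stored in the hash-table, so in this sketch a match is found only if the reformulated form has been flattened earlier.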

Generic Time Complexity

Alg. 4.3 is very general and its complexity depends on two factors. First, the applicability of reformulation r matters: r is usually applicable only to a certain kind of expression. For instance, De Morgan's Law can only be applied to particular Boolean expressions that are composed of disjunction and conjunction. Hence the algorithm also depends on the frequency of occurrence of the expression type to which r is applicable. We denote by n_r the number of subexpressions in the instance to which r is applicable, where n ≥ n_r ≥ 0 and n is the number of subexpressions in the instance. Furthermore, we denote by n_{r,u} the number of unique nodes to which r is applicable, i.e. if n_r − n_{r,u} > 0 then there exist common subexpressions amongst the nodes to which r is applicable.

Second, it depends on the cost of reformulating expression E to E_r, i.e. the cost of applying r on E, which we denote cost_r(k), where k is the number of nodes in the expression tree. Since Alg. 4.3 is based on CSE-flattening (Alg. 4.2), we analyse the corresponding extensions to derive the time complexity:

Applicability Check. First, the to-be-flattened expression e_i is tested for applicability, the cost of which we denote applic_r(k), where k represents the number of nodes in the tested expression. This test is performed on all n′ subexpressions that have no previously flattened common subexpression. In the worst case (if the instance contains no common subexpressions) n′ = n, hence performing the check lies in O(n) · applic_r(k̂), where k̂ is the maximum number of subexpressions of any expression in the instance (in the worst case, if the constraint instance has only one constraint, k̂ = n).

Reformulation. Second, the reformulation r is applied to those nodes that pass the check, which are all unique nodes to which r is applicable, i.e. n_{r,u} nodes. Note that the other n_r − n_{r,u} nodes to which r is applicable are common subexpressions and hence have a match in the hash-table, to which we add e_i and aux (line 13). We denote the cost of the reformulation cost_r(k), where k is again the number of subexpressions in the reformulated expression. Therefore, the reformulation step lies in O(n) · cost_r(k̂), since in the worst case n_{r,u} = n_r = n, and where k̂ is the maximum number of subexpressions of any expression in the instance (note that if the constraint instance has only one constraint, k̂ = n).

toString Operation. Third, the reformulated expression e_r is converted to String format StringR, an operation that lies in O(k), where k is the number of subexpressions the to-be-flattened expression E contains. This is performed for all n_{r,u} subexpressions, where in the worst case n_{r,u} = n, yielding a runtime of O(k̂ n), where k̂ is the maximal number of subexpressions an expression contains.

Hash-Table Operations. Finally, hash-table operations are performed in order to retrieve a common subexpression. The first hash-table check (line 11) is performed on all n_{r,u} nodes, and the following two hash-table operations (lines 12 and 13) are performed on all those subexpressions that have a common subexpression. All hash-table operations take constant time on average but require reading the String representation, so we summarise the complexity as O(ŝ n), since in the worst case n_{r,u} = n, where ŝ denotes the maximal String length of a subexpression in the instance.

In summary, the additional runtime complexity of Alg. 4.3 compared to CSE-flattening (Alg. 4.2) is:

O(n) · applic_r(k̂) + O(n) · cost_r(k̂) + O(k̂ n) + O(ŝ n)    (4.1)
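As a worked illustration of Eq. 4.1, suppose that r sorts the operands of a commutative operator. This instantiation is an assumption made only for this example and is not taken from the analysis above: the applicability check then merely inspects the operator, and the reformulation is a comparison sort in which each comparison is treated as constant-time. Under these assumptions the four terms collapse as follows.

```latex
\begin{gather*}
\mathit{applic}_r(\hat{k}) \in O(1), \qquad
\mathit{cost}_r(\hat{k}) \in O(\hat{k}\log\hat{k}) \\
O(n)\cdot O(1) + O(n)\cdot O(\hat{k}\log\hat{k}) + O(\hat{k}\,n) + O(\hat{s}\,n)
  = O\bigl(n\,(\hat{k}\log\hat{k} + \hat{s})\bigr)
\end{gather*}
```

Under these assumptions the additional effort is linear in the number of subexpressions n, with a factor determined by the largest expression and the longest String representation.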

In the following, we investigate different equivalence properties on both integral and Boolean expressions: associativity, commutativity, negation, distributivity, Horn Clauses and De Morgan's Law. In each case, we will apply the reformulation in Alg. 4.3 (if necessary) and analyse the corresponding runtime from Eq. 4.1.
