In document Memòria del curs acadèmic 1999 2000 (página 126-130)

In this chapter we present an extension of S-FTs where transitions are allowed to consume more than one symbol, called extended S-FTs or S-EFTs.

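Definition 2.1. A Symbolic Extended Finite Transducer (S-EFT) with input type $\sigma$ and output type $\gamma$ is a tuple $A = (Q, q_0, R)$, where

• $Q$ is a finite set of states;

• $q_0 \in Q$ is the initial state;

• $R$ is a finite set of rules, $R = \Delta \cup F$, where

  – $\Delta$ is a set of transitions $r = (p, \ell, \varphi, f, q)$, denoted $p \xrightarrow{\varphi/f}_{\ell} q$, where

      $p \in Q$ is the start state of $r$;

      $\ell \geq 1$ is the look-ahead of $r$;

      $\varphi$, the guard of $r$, is a $\sigma^{\ell}$-predicate;

      $f$, the output of $r$, is a $\lambda$-term from $\sigma^{\ell}$ to sequences of $\gamma$;

      $q \in Q$ is the continuation state of $r$;

  – $F$ is a set of finalizers $r = (p, \ell, \varphi, f)$, denoted $p \xrightarrow{\varphi/f}_{\ell} \bullet$, with components as above and where $\ell$ may be 0.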

The look-ahead of $A$ is the maximum of all look-aheads of rules in $R$. An S-EFT where all the rules have output $\varepsilon$ is a Symbolic Extended Finite Automaton (S-EFA).
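As an illustration of Definition 2.1, the tuple structure can be sketched as a small Python data type. This is only a sketch under stated assumptions: the names `Rule`, `SEFT` and `pair_sum` are hypothetical, guards and outputs are modeled as plain Python callables rather than terms of a label theory, and a finalizer is encoded by a `target` of `None`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """One rule r of an S-EFT; target=None encodes a finalizer (target •)."""
    start: str                     # p, the start state
    lookahead: int                 # l, the number of input symbols read
    guard: Callable[..., bool]     # phi, a predicate over l input symbols
    output: Callable[..., list]    # f, maps l input symbols to an output sequence
    target: Optional[str] = None   # q, the continuation state (None for finalizers)

    def __post_init__(self):
        # Transitions need look-ahead >= 1; finalizers may have look-ahead 0.
        minimum = 0 if self.target is None else 1
        if self.lookahead < minimum:
            raise ValueError("look-ahead too small for this kind of rule")


@dataclass(frozen=True)
class SEFT:
    states: frozenset
    initial: str
    rules: Tuple[Rule, ...]

    @property
    def lookahead(self) -> int:
        """The maximum look-ahead over all rules in R."""
        return max((r.lookahead for r in self.rules), default=0)


# Hypothetical toy transducer: replaces each pair of adjacent integers by their sum.
pair_sum = SEFT(
    states=frozenset({"p"}),
    initial="p",
    rules=(
        Rule("p", 2, lambda x0, x1: True, lambda x0, x1: [x0 + x1], "p"),
        Rule("p", 0, lambda: True, lambda: []),  # finalizer with look-ahead 0
    ),
)
```

Here `pair_sum.lookahead` evaluates to 2, the maximum over its two rules.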

A finalizer is a rule without a continuation state. A finalizer with look-ahead $\ell$ is used when the end of the input sequence has been reached with exactly $\ell$ input elements remaining. A finalizer is a generalization of a final state. In the non-symbolic setting, finalizers can be avoided by adding a new symbol to the alphabet that is only used to mark the end of the input. In the presence of arbitrary input types, this is not always possible without affecting the theory, e.g., when the input type is Int then that symbol would have to be outside Int.

In the remainder of the section let $A = (Q, q_0, R)$, $R = \Delta \cup F$, be a fixed S-EFT with input type $\sigma$ and output type $\gamma$. The semantics of rules in $R$ is as follows:

$[\![p \xrightarrow{\varphi/f}_{\ell} q]\!] \stackrel{\text{def}}{=} \{\, p \xrightarrow{[a_0,\ldots,a_{\ell-1}]/[\![f]\!](a_0,\ldots,a_{\ell-1})}_{\ell} q \mid (a_0,\ldots,a_{\ell-1}) \in [\![\varphi]\!] \,\}.$

The notation $p \xrightarrow{[a_0,\ldots,a_{\ell-1}]/[\![f]\!](a_0,\ldots,a_{\ell-1})}_{\ell} q$ is a shorthand for a tuple $(p, \ell, (a_0,\ldots,a_{\ell-1}), [\![f]\!](a_0,\ldots,a_{\ell-1}), q)$. Intuitively, a rule with look-ahead $\ell$ reads $\ell$ adjacent input symbols $s = [a_0,\ldots,a_{\ell-1}]$ and produces a sequence of output symbols that is a function of the consumed input $f(s)$.

Let $[\![R]\!] \stackrel{\text{def}}{=} \bigcup_{r \in R} [\![r]\!]$. We write $s_1 \cdot s_2$ for the concatenation of sequences $s_1$ and $s_2$.

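Definition 2.2. For $u \in \Sigma^*$, $v \in \Gamma^*$, $q \in Q$, $q' \in Q \cup \{\bullet\}$, define $q \overset{u/v}{\twoheadrightarrow}_A q'$ as follows: there exists $n \geq 0$ and $\{p_i \xrightarrow{u_i/v_i} p_{i+1} \mid i \leq n\} \subseteq [\![R]\!]$ such that

$u = u_0 \cdot u_1 \cdots u_n, \quad v = v_0 \cdot v_1 \cdots v_n, \quad q = p_0, \quad q' = p_{n+1}.$

Let also $q \overset{\varepsilon/\varepsilon}{\twoheadrightarrow}_A q$ for all $q \in Q$.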

Definition 2.3. The transduction relation of $A$ is defined as $T_A(u) \stackrel{\text{def}}{=} \{\, v \mid q_0 \overset{u/v}{\twoheadrightarrow}_A \bullet \,\}$.
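Definitions 2.2 and 2.3 can be read operationally: explore every way of chaining rule instances over the input and collect the outputs of the runs that end in a finalizer consuming exactly the remaining symbols. The following is a minimal sketch of that reading, not an efficient algorithm. It assumes rules are given as hypothetical tuples `(p, l, guard, f, q)` with Python callables for the guard and output, and `q = None` standing for the finalizer target.

```python
def transduce(rules, q0, u):
    """Return T_A(u) as a set of output tuples, by exhaustive search.

    rules: iterable of (p, l, guard, f, q); q is None for finalizers.
    """
    u = tuple(u)
    results = set()

    def step(state, pos, out):
        for (p, l, guard, f, q) in rules:
            if p != state or pos + l > len(u):
                continue                      # wrong state or not enough input left
            window = u[pos:pos + l]
            if not guard(*window):
                continue                      # the window is not in [[phi]]
            new_out = out + tuple(f(*window))
            if q is None:
                # A finalizer applies only when exactly l symbols remain.
                if pos + l == len(u):
                    results.add(new_out)
            else:
                # Transitions have l >= 1, so pos grows and the search terminates.
                step(q, pos + l, new_out)

    step(q0, 0, ())
    return results


# Hypothetical, deliberately non-deterministic example over integers.
RULES = [
    ("p", 1, lambda x0: x0 % 2 == 0, lambda x0: [x0], "p"),      # copy an even number
    ("p", 2, lambda x0, x1: True, lambda x0, x1: [x0 + x1], "p"),  # sum a pair
    ("p", 0, lambda: True, lambda: [], None),                     # finalizer
]
```

With these rules, `transduce(RULES, "p", [2, 4])` contains both `(2, 4)` and `(6,)`, so this toy transducer is not single-valued, while `transduce(RULES, "p", [1])` is empty because no run consumes the whole input.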

The following example illustrates typical (realistic) S-EFTs over a label theory of linear modular arithmetic. We use the following abbreviated notation for rules, by omitting explicit $\lambda$'s. We write

$p \xrightarrow{\varphi(\bar{x})/[f_1(\bar{x}),\ldots,f_k(\bar{x})]}_{\ell} q$ for $p \xrightarrow{\lambda\bar{x}.\varphi(\bar{x})/\lambda\bar{x}.[f_1(\bar{x}),\ldots,f_k(\bar{x})]}_{\ell} q$,

where $\varphi$ and $f_i$ are terms whose free variables are among $\bar{x} = (x_0,\ldots,x_{\ell-1})$.

Example 2.4. This example illustrates the S-EFTs Base64encode and Base64decode correspond-

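Base64encode is an S-EFT with one state and four rules:

$p \xrightarrow{\mathit{true}/[\ulcorner b^7_2(x_0)\urcorner,\ \ulcorner (b^1_0(x_0) \ll 4) \mid b^7_4(x_1)\urcorner,\ \ulcorner (b^3_0(x_1) \ll 2) \mid b^7_6(x_2)\urcorner,\ \ulcorner b^5_0(x_2)\urcorner]}_{3} p$

$p \xrightarrow{\mathit{true}/[\,]}_{0} \bullet$

$p \xrightarrow{\mathit{true}/[\ulcorner b^7_2(x_0)\urcorner,\ \ulcorner b^1_0(x_0) \ll 4\urcorner,\ \texttt{'='},\ \texttt{'='}]}_{1} \bullet$

$p \xrightarrow{\mathit{true}/[\ulcorner b^7_2(x_0)\urcorner,\ \ulcorner (b^1_0(x_0) \ll 4) \mid b^7_4(x_1)\urcorner,\ \ulcorner b^3_0(x_1) \ll 2\urcorner,\ \texttt{'='}]}_{2} \bullet$

where $b^m_n(x)$ extracts bits $m$ through $n$ from $x$, e.g., $b^3_2(13) = 3$, $x \mid y$ is bitwise OR of $x$ and $y$, $x \ll k$ is $x$ shifted left by $k$ bits, and $\ulcorner x \urcorner$ is the mapping

$\ulcorner x \urcorner \stackrel{\text{def}}{=} (x \leq 25\ ?\ x + 65 : (x \leq 51\ ?\ x + 71 : (x \leq 61\ ?\ x - 4 : (x = 62\ ?\ \texttt{'+'} : \texttt{'/'}))))$

of values between 0 and 63 into a standardized sequence of safe ASCII character codes ('en' in Figure 2.3). The last two finalizers correspond to the cases when the length of the input sequence is not a multiple of three. Observe that the length of the output sequence is always a multiple of four. The character '=' (61 in ASCII) is used as a padding character and it is not a BASE64 digit, i.e., '=' is not in the range of $\ulcorner x \urcorner$.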
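The four rules of Base64encode can be transcribed almost literally into Python. The sketch below is illustrative only: the helper names `bits`, `enc`, `transition` and `FINALIZERS` are hypothetical, and because all guards are true and the S-EFT has a single state, the rule selection is reduced to a loop on the number of remaining bytes. The standard-library `base64` module is imported only so the result can be compared against it.

```python
import base64  # used only to cross-check the sketch

PAD = ord("=")


def bits(m, n, x):
    """b^m_n(x): the integer formed by bits m down to n of x (inclusive)."""
    return (x >> n) & ((1 << (m - n + 1)) - 1)


def enc(x):
    """The digit mapping from values 0..63 to ASCII character codes."""
    if x <= 25:
        return x + 65          # 'A'..'Z'
    if x <= 51:
        return x + 71          # 'a'..'z'
    if x <= 61:
        return x - 4           # '0'..'9'
    return ord("+") if x == 62 else ord("/")


def transition(x0, x1, x2):
    """Output of the look-ahead-3 transition: three bytes become four digits."""
    return [enc(bits(7, 2, x0)),
            enc((bits(1, 0, x0) << 4) | bits(7, 4, x1)),
            enc((bits(3, 0, x1) << 2) | bits(7, 6, x2)),
            enc(bits(5, 0, x2))]


# Outputs of the three finalizers, indexed by their look-ahead.
FINALIZERS = {
    0: lambda: [],
    1: lambda x0: [enc(bits(7, 2, x0)), enc(bits(1, 0, x0) << 4), PAD, PAD],
    2: lambda x0, x1: [enc(bits(7, 2, x0)),
                       enc((bits(1, 0, x0) << 4) | bits(7, 4, x1)),
                       enc(bits(3, 0, x1) << 2),
                       PAD],
}


def base64encode(data: bytes) -> bytes:
    out, pos = [], 0
    while len(data) - pos >= 3:              # the transition needs 3 symbols
        out += transition(*data[pos:pos + 3])
        pos += 3
    rest = data[pos:]                        # 0, 1 or 2 symbols remain
    out += FINALIZERS[len(rest)](*rest)      # the matching finalizer ends the run
    return bytes(out)
```

For instance, `base64encode(b"Foo")` yields `b"Rm9v"`, and on the inputs tried in the test the result agrees with `base64.b64encode`.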

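Base64decode is an S-EFT that decodes a BASE64 encoded sequence back into the original byte sequence. Base64decode also has one state and four rules:

$q \xrightarrow{\bigwedge_{i=0}^{3} \beta_{64}(x_i)\,/\,[(\llcorner x_0 \lrcorner \ll 2) \mid b^5_4(\llcorner x_1 \lrcorner),\ (b^3_0(\llcorner x_1 \lrcorner) \ll 4) \mid b^5_2(\llcorner x_2 \lrcorner),\ (b^1_0(\llcorner x_2 \lrcorner) \ll 6) \mid \llcorner x_3 \lrcorner]}_{4} q$

$q \xrightarrow{\mathit{true}/[\,]}_{0} \bullet$

$q \xrightarrow{\beta_{64}(x_0) \wedge \beta'_{64}(x_1) \wedge x_2 = \texttt{'='} \wedge x_3 = \texttt{'='}\,/\,[(\llcorner x_0 \lrcorner \ll 2) \mid b^5_4(\llcorner x_1 \lrcorner)]}_{4} \bullet$

$q \xrightarrow{\beta_{64}(x_0) \wedge \beta_{64}(x_1) \wedge \beta''_{64}(x_2) \wedge x_3 = \texttt{'='}\,/\,[(\llcorner x_0 \lrcorner \ll 2) \mid b^5_4(\llcorner x_1 \lrcorner),\ (b^3_0(\llcorner x_1 \lrcorner) \ll 4) \mid b^5_2(\llcorner x_2 \lrcorner)]}_{4} \bullet$

The function $\llcorner y \lrcorner$ is the inverse of $\ulcorner x \urcorner$, i.e., $\llcorner \ulcorner x \urcorner \lrcorner = x$, for $0 \leq x \leq 63$. The predicate $\beta_{64}(y)$ is true iff $y$ is a valid BASE64 digit, i.e., $y = \ulcorner x \urcorner$ for some $x$, $0 \leq x \leq 63$. The predicates $\beta'_{64}(y)$ and $\beta''_{64}(y)$ are restricted versions of $\beta_{64}(y)$. Unlike Base64encode, Base64decode does not accept all input sequences of bytes, and sequences that do not correspond to any encoding are rejected.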
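A corresponding sketch of Base64decode is given below. The text only says that the two primed predicates are restricted versions of the digit predicate; the sketch assumes, as a guess that makes the decoder the exact inverse of the encoder, that the first restricted predicate requires the low four bits of the decoded digit to be zero and the second requires the low two bits to be zero. All helper names (`dec`, `beta64`, `beta64_1`, `beta64_2`) are hypothetical, and rejection is modeled by returning `None`.

```python
from typing import Optional

PAD = ord("=")


def bits(m, n, x):
    """b^m_n(x): the integer formed by bits m down to n of x (inclusive)."""
    return (x >> n) & ((1 << (m - n + 1)) - 1)


def beta64(y):
    """True iff y is the ASCII code of a BASE64 digit."""
    return 65 <= y <= 90 or 97 <= y <= 122 or 48 <= y <= 57 or y in (43, 47)


def dec(y):
    """Inverse of the digit mapping; only meaningful when beta64(y) holds."""
    if 65 <= y <= 90:
        return y - 65
    if 97 <= y <= 122:
        return y - 71
    if 48 <= y <= 57:
        return y + 4
    return 62 if y == ord("+") else 63


def beta64_1(y):
    # ASSUMPTION: restricted predicate for the digit before '==' (low 4 bits zero).
    return beta64(y) and bits(3, 0, dec(y)) == 0


def beta64_2(y):
    # ASSUMPTION: restricted predicate for the digit before '=' (low 2 bits zero).
    return beta64(y) and bits(1, 0, dec(y)) == 0


def first(x0, x1):
    return (dec(x0) << 2) | bits(5, 4, dec(x1))


def second(x1, x2):
    return (bits(3, 0, dec(x1)) << 4) | bits(5, 2, dec(x2))


def third(x2, x3):
    return (bits(1, 0, dec(x2)) << 6) | dec(x3)


def base64decode(data: bytes) -> Optional[bytes]:
    """Return the decoded bytes, or None when the input is rejected."""
    out, pos = [], 0
    while len(data) - pos >= 4:
        x0, x1, x2, x3 = data[pos:pos + 4]
        last = len(data) - pos == 4          # finalizers need exactly 4 symbols left
        if all(map(beta64, (x0, x1, x2, x3))):
            out += [first(x0, x1), second(x1, x2), third(x2, x3)]
            pos += 4
        elif last and beta64(x0) and beta64_1(x1) and x2 == PAD and x3 == PAD:
            return bytes(out + [first(x0, x1)])
        elif last and beta64(x0) and beta64(x1) and beta64_2(x2) and x3 == PAD:
            return bytes(out + [first(x0, x1), second(x1, x2)])
        else:
            return None                      # no rule applies
    # Only the look-ahead-0 finalizer can end the run here.
    return bytes(out) if pos == len(data) else None
```

Under these assumptions `base64decode(b"QmFy")` gives `b"Bar"`, while `base64decode(b"Rh==")` is rejected because the digit before the padding carries non-zero low bits.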

The following subclass of S-EFTs captures transductions that behave as partial functions from $\Sigma^*$ to $\Gamma^*$.

Definition 2.5. A function $f : X \to 2^Y$ is single-valued if $|f(x)| \leq 1$ for all $x \in X$. An S-EFT $A$ is single-valued if $T_A$ is single-valued.

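A sufficient condition for single-valuedness is determinism. We define $\varphi \curlywedge \psi$, where $\varphi$ is a $\sigma^m$-predicate and $\psi$ a $\sigma^n$-predicate, as the $\sigma^{\max(m,n)}$-predicate $\lambda(x_1,\ldots,x_{\max(m,n)}).\ \varphi(x_1,\ldots,x_m) \wedge \psi(x_1,\ldots,x_n)$. We define equivalence of $f$ and $g$ with respect to $\varphi$, $f \equiv_{\varphi} g$, as: $\mathit{IsValid}(\lambda\bar{x}.(\varphi(\bar{x}) \Rightarrow f(\bar{x}) = g(\bar{x})))$.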

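Definition 2.6. $A$ is deterministic if for all $p \xrightarrow{\varphi/f}_{\ell} q,\ p \xrightarrow{\varphi'/f'}_{\ell'} q' \in R$:

(a) Assume $q, q' \in Q$. If $\mathit{IsSat}(\varphi \curlywedge \varphi')$ then $q = q'$, $\ell = \ell'$ and $f \equiv_{\varphi \curlywedge \varphi'} f'$.

(b) Assume $q = q' = \bullet$. If $\mathit{IsSat}(\varphi \curlywedge \varphi')$ and $\ell = \ell'$ then $f \equiv_{\varphi \curlywedge \varphi'} f'$.

(c) Assume $q \in Q$ and $q' = \bullet$. If $\mathit{IsSat}(\varphi \curlywedge \varphi')$ then $\ell > \ell'$.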

Intuitively, determinism means that no two rules may overlap. It follows from the definitions that if $A$ is deterministic then $A$ is single-valued. Both S-EFTs in Example 2.4 are deterministic.
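The three conditions of Definition 2.6 can be checked pairwise over the rules. The sketch below is only an approximation for illustration: instead of the satisfiability and validity checks of the label theory, it enumerates a small finite sample `domain`, so it can miss overlaps that lie outside the sample. Rules are again hypothetical tuples `(p, l, guard, f, q)` with `q = None` for finalizers, and the function names are made up for this example.

```python
from itertools import product


def conj(phi, m, psi, n):
    """Conjunction of an m-ary and an n-ary guard as a max(m, n)-ary predicate."""
    def both(*xs):
        return phi(*xs[:m]) and psi(*xs[:n])
    return both, max(m, n)


def witnesses(pred, arity, domain):
    """All tuples over the sample domain that satisfy pred (brute-force IsSat)."""
    return [xs for xs in product(domain, repeat=arity) if pred(*xs)]


def is_deterministic(rules, domain):
    """Check conditions (a)-(c) of Definition 2.6 on the sample domain only."""
    for r1 in rules:
        for r2 in rules:
            (p, l, phi, f, q), (p2, l2, phi2, f2, q2) = r1, r2
            if r1 is r2 or p != p2:
                continue
            both, k = conj(phi, l, phi2, l2)
            ws = witnesses(both, k, domain)
            if not ws:
                continue                     # guards do not overlap on the sample

            def same_output():
                return all(f(*xs[:l]) == f2(*xs[:l2]) for xs in ws)

            if q is not None and q2 is not None:          # case (a)
                if not (q == q2 and l == l2 and same_output()):
                    return False
            elif q is None and q2 is None:                # case (b)
                if l == l2 and not same_output():
                    return False
            elif q is not None and q2 is None:            # case (c)
                if not l > l2:
                    return False
    return True


# Hypothetical examples over integers.
DET = [
    ("p", 1, lambda x0: x0 >= 0, lambda x0: [x0], "p"),
    ("p", 1, lambda x0: x0 < 0, lambda x0: [-x0], "p"),
    ("p", 0, lambda: True, lambda: [], None),
]
NONDET = [
    ("p", 1, lambda x0: x0 % 2 == 0, lambda x0: [x0], "p"),
    ("p", 2, lambda x0, x1: True, lambda x0, x1: [x0 + x1], "p"),
    ("p", 0, lambda: True, lambda: [], None),
]
```

On the sample `range(-2, 3)`, `is_deterministic(DET, ...)` holds, whereas `NONDET` fails condition (a) because two overlapping transitions have different look-aheads.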

The domain of a function $f : X \to 2^Y$ is $D(f) \stackrel{\text{def}}{=} \{x \in X \mid f(x) \neq \emptyset\}$ and for an S-EFT $A$, $D(A) \stackrel{\text{def}}{=} D(T_A)$. When $A$ is single-valued and $u \in D(A)$, we treat $A$ as a partial function from $\Sigma^*$ to $\Gamma^*$ and write $A(u)$ for the value $v$ such that $T_A(u) = \{v\}$. For example, Base64encode("Foo") = "Rm9v" and Base64decode("QmFy") = "Bar".
