For the simple calculus above we were able to define semantics in terms of simple sets. More generally, however, we need spaces with additional structure, for instance when the calculus includes computation with side effects. Note that side effects are basically unavoidable, since all useful programs need to at least produce some output.

A common side effect of computation involves updating the values of mutable variables. It is customary to include mutable variables in the lambda calculus in the form of references to mutable cells, with the syntax shown below.

t, s, r ::=          terms
    . . .            as before
  | ref t            allocate a mutable cell
  | t := s           assign to a mutable cell
  | !t               dereference a mutable cell
  | t; s             sequence operations

A reference to a mutable cell holding values of type τ is itself given the type τ ref. On top of the constructs for working with mutable cells we add a sequencing operation, so that we can return a result after updating a cell.

With this extension the equivalence shown at the end of the previous subsection no longer holds. To see that, consider the function

f = (λr : N ref. (λx : N. r := !r + x; !r))(ref 0).

Now f(0) + f(1) evaluates to 1 while f(1) + f(0) evaluates to 2, assuming left-to-right evaluation order. Once we have introduced computation with global effects into the calculus, the order of evaluation becomes important. To reason about equivalences of programs using mutable cells we need to somehow account for the state of those cells in the semantics. The standard solution is to associate each type with a function that additionally updates a global state.
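The order dependence can be reproduced in any language with mutable state. The following Python sketch is an illustration only and not part of the calculus: `make_f` is a hypothetical helper that models `ref 0` with a one-element list, and Python's left-to-right evaluation of the operands of `+` plays the role of the assumed evaluation order.

```python
def make_f():
    """Build f = (λr. λx. r := !r + x; !r)(ref 0) with a fresh cell."""
    r = [0]  # one-element list standing in for the mutable cell `ref 0`

    def f(x):
        r[0] = r[0] + x  # r := !r + x
        return r[0]      # !r

    return f


f = make_f()
left_first = f(0) + f(1)   # cell goes 0 -> 0 -> 1, so the sum is 0 + 1

f = make_f()               # fresh cell, so the two runs are independent
right_first = f(1) + f(0)  # cell goes 0 -> 1 -> 1, so the sum is 1 + 1

print(left_first, right_first)  # 1 2
```

Swapping the operands changes the result because the second call observes the cell as left behind by the first.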

Let G be a set of possible global states, where a state is a partial function from locations in a set L to values. The details of this construction are not important here, and we skim over the fact that it would require restricting the types for which references can be constructed in order to avoid divergence. We associate types with functions that produce a value and a new state based on the current state. For example, ⟦N⟧ := G → N × G and ⟦τ ref⟧ := G → L × G.

Recall that for the basic calculus giving semantics to terms was trivial. With the extension to reference cells, defining the semantics requires carefully feeding the state through, which is cumbersome and repetitive. For example, the semantics of addition would be

⟦t + s⟧(ρ)(g) := let (x, g′) = ⟦t⟧(ρ)(g) in let (y, g′′) = ⟦s⟧(ρ)(g′) in (x + y, g′′).
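To make the threading concrete, here is a minimal Python sketch of this rule. It is an illustration under simplifying assumptions: a denotation is modelled as a function taking an environment `rho` and then a state `g` and returning a pair, the state is a dictionary from locations to numbers, and the helpers `deref` and `bump` are hypothetical names introduced only for this example.

```python
def deref(loc):
    """Denotation of reading the cell at a fixed location; the state is unchanged."""
    return lambda rho: lambda g: (g[loc], g)


def bump(loc, n):
    """Denotation of `loc := !loc + n; !loc`: update the cell, return its new value."""
    def denote(rho):
        def run(g):
            new_value = g[loc] + n
            return new_value, {**g, loc: new_value}  # fresh state, old one untouched
        return run
    return denote


def add(t, s):
    """Denotation of t + s, threading the state by hand as in the rule above."""
    def denote(rho):
        def run(g):
            x, g1 = t(rho)(g)    # evaluate t in the initial state
            y, g2 = s(rho)(g1)   # evaluate s in the state left behind by t
            return x + y, g2
        return run
    return denote


g0 = {"l": 0}
print(add(bump("l", 1), deref("l"))({})(g0))  # (2, {'l': 1})
print(add(deref("l"), bump("l", 1))({})(g0))  # (1, {'l': 1})
```

Every further construct would need the same two lines of unpacking and passing on the state, which is the repetition discussed next.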

We do not show the other rules here, but in almost all of them we would find the same pattern of feeding the updates to the global state through. In the simple calculus we have shown, doing this manually is manageable, but the effort would quickly become excessive once we started adding more constructs and other effects. Worse yet, performing any kind of reasoning about the program would be nearly impossible due to all the clutter in the semantics. Fortunately, the pattern in question is very repetitive, so we can hope to use additional abstractions to hide it.

At this point we step back a bit and consider more generally the problem of defining denotational semantics in a reusable fashion. In the construction presented above we map types to sets, but we might want to use spaces with more structure instead. Those could be measurable spaces for probabilistic programs or domains if the calculus allowed for non-terminating computation. Despite this change, the semantics of the constructs from the basic calculus would be essentially the same. We would therefore like to define their semantics more abstractly, assuming only that the collection of spaces in question has certain properties. Conveniently, such a framework is provided by category theory.

Categories are defined in terms of objects and morphisms which generalise sets and functions respectively. Semantics of specific language constructs are then defined in terms of abstract properties of a category. For example, to accommodate product types the category needs to have products that satisfy certain axioms, and for function types it needs exponentials. A category with these and several more properties can be used to give semantics to a simple lambda calculus and is called a Cartesian-closed category. To recover the construction presented above we simply choose the Set category where objects are sets and morphisms are functions between them.

To give semantics to programs with side effects, such as mutable reference cells, we need a categorical structure called a (strong) monad [59]. For example, the construction above is the state monad for the global store in the Set category. More generally, a monad defines two operations: how to embed a pure computation in the monad, and how to chain computations in the monad so that one computation uses the result of the other. These operations are traditionally called return and >>= (pronounced ‘bind’), and for a monad M they have the types

return : X → MX
>>= : MX → (X → MY) → MY.

Going back to the concrete case of the calculus with reference cells, we can define a monad S such that SX := G → X × G and the required operations are defined as

return(x)(g) := (x, g)

(t >>= f)(g) := let (x, g′) = t(g) in f(x)(g′).

This construction captures precisely the pattern of feeding through the global state as the computation proceeds. We can use the monadic interface to provide an equivalent definition of semantics for addition in our calculus, that is

⟦t + s⟧(ρ)(g) := ⟦t⟧(ρ) >>= λx. ⟦s⟧(ρ) >>= λy. return(x + y).
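The two operations and the monadic rule for addition can be written down directly as higher-order functions. The Python sketch below is illustrative only: `unit` stands in for return (a reserved word in Python), `bind` stands in for >>=, and `tick` is a hypothetical stateful computation over an integer state, used here to exercise the rule. The environment ρ is omitted to keep the example short.

```python
def unit(x):
    """return(x)(g) := (x, g)"""
    return lambda g: (x, g)


def bind(t, f):
    """(t >>= f)(g) := let (x, g') = t(g) in f(x)(g')"""
    def run(g):
        x, g1 = t(g)
        return f(x)(g1)
    return run


def add(t, s):
    """t >>= λx. s >>= λy. return(x + y); the state never appears explicitly."""
    return bind(t, lambda x: bind(s, lambda y: unit(x + y)))


def tick(g):
    """Hypothetical effect: return the current counter and increment it."""
    return g, g + 1


print(add(tick, tick)(10))        # (21, 12)
print(add(unit(1), unit(2))(0))   # (3, 0)
```

All state handling lives in `unit` and `bind`, so `add` would read the same for any other pair of operations with these types.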

The advantage of this definition is that it is stated in terms of an abstract monad, so it can be ported to a different setting without any change. For example, we could construct a probability monad of measurable spaces and use exactly this definition for the semantics of addition in a probabilistic program.

Readers unfamiliar with denotational semantics may be overwhelmed by the various constructions introduced in this section. Our goal is not so much to teach the techniques for constructing semantics as it is to give readers some understanding of why our developments in this dissertation make use of so many esoteric concepts. In essence, this is because constructing the semantics directly would be very complicated, likely to the point where neither the authors nor the readers would have a good grasp of what is going on. Thus, in the subsequent chapters we take advantage of the standard abstractions established in the programming languages community to separate our construction into different levels that, to a large extent, can be appreciated separately.

Generally, to construct denotational semantics for a calculus with effects we need to do the following:

1. construct a suitable category in which the programs can be interpreted,

2. show that the category has the required properties to interpret the standard constructs used in the calculus,

3. construct a suitable monad to interpret the effects.

This is precisely the approach we take in Chapter 4 to define the semantics for our probabilistic calculus. We do not review any category theory in this chapter, recognising that readers familiar with it do not need this review and readers who are not are unlikely to find such a review sufficient. All the category-theoretic content in this dissertation is confined to Chapter 4 and can be skipped without significantly affecting the understanding of the other parts of the dissertation.

As a final remark on monads, observe that we can take the definition of the semantics for addition above and introduce some line breaks to arrive at the following form:

⟦t + s⟧(ρ)(g) :=
    ⟦t⟧(ρ) >>= λx.
    ⟦s⟧(ρ) >>= λy.
    return(x + y).

We can now apply a line-by-line transformation to introduce a popular syntactic sugar for monads known as the do syntax:

⟦t + s⟧(ρ)(g) := do
    x ← ⟦t⟧(ρ)
    y ← ⟦s⟧(ρ)
    return(x + y).
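Python has no do syntax, but for the state monad its effect can be approximated with generators. The sketch below is an emulation under stated assumptions rather than a general implementation: the hypothetical decorator `do` treats each `yield` as one `x ← …` line and the generator's return value as the final `return(…)`, and it only works for monads that, like state, run the rest of the block exactly once. The names `unit`, `bind` and `tick` are likewise illustrative.

```python
def unit(x):
    """return(x)(g) := (x, g)"""
    return lambda g: (x, g)


def bind(t, f):
    """(t >>= f)(g) := let (x, g') = t(g) in f(x)(g')"""
    def run(g):
        x, g1 = t(g)
        return f(x)(g1)
    return run


def do(block):
    """Turn a generator function into a state computation built from bind and unit."""
    def run(g):
        gen = block()  # fresh generator per run, so the computation is reusable

        def step(value):
            try:
                m = gen.send(value)       # resume the block up to its next `yield`
            except StopIteration as stop:
                return unit(stop.value)   # the block's `return` becomes return(...)
            return bind(m, step)          # each `yield m` becomes m >>= (rest of block)

        return step(None)(g)
    return run


def add(t, s):
    @do
    def result():
        x = yield t      # x ← t
        y = yield s      # y ← s
        return x + y     # return(x + y)
    return result


def tick(g):
    """Hypothetical effect: return the current counter and increment it."""
    return g, g + 1


print(add(tick, tick)(10))  # (21, 12)
```

The body of `result` mirrors the do block line by line, while `do` rebuilds the chain of binds behind the scenes.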

We make extensive use of this notation, both in the calculus and in the implementation.