
In this section, we examine the Haskell code carefully and try possible improvements to obtain better performance. For a detailed description of the analysis and improvement using the heap profiling tool [Runciman92a, Runciman93a] that comes with hbc [Augustsson92a], see Appendix B.

8.4.1 Attaching 'length' to the interval structure (Version 2)

In the Haskell code shown in Figure 8-2 (Version 1), the ip function is responsible for the whole sequence from file input, through image processing, to file output. The following code appears among the subdefinitions of ip:

im2l (o,l) = (map chr . concat) (rep o bgr ++ map flatten l ++ rep (height-o-length l) bgr)
flatten (o,l) = (rep o bgp) ++ l ++ (rep (width-o-length l) bgp)

Input and output streams are long lists of characters, but image processing is defined on a 2D interval structure, a pair of an origin and a list of elements (Section 3.2.2). im2l is designed to convert a 2D interval structure to a plain list, and flatten converts a 1D interval to a plain list. These functions are needed because we want the input and output to have equal sizes, so that the results of the three versions can be made identical. The original implementation in Chapter 3 did not consider this issue at all.

The im2l function contains the code fragment "length l". Usually length itself consumes a list but does not keep the whole list. In the above code, however, the list l is shared between flatten and length, and length is not evaluated until the end, so the content of the whole list is kept until length is evaluated. And here, this is the whole image!
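The sharing pattern described above can be isolated in a few lines. The following is a minimal sketch rather than the code of Figure 8-2: the name padV1 is hypothetical, rep is assumed to behave like the standard replicate, and the background element and total size are passed as explicit parameters.

```haskell
-- Sketch of the Version 1 padding pattern (padV1 is a hypothetical name).
-- The result is: o background elements, then l, then enough background
-- elements to reach the requested total size.
--
-- The list l is referenced twice: once as part of the output and once by
-- 'length l'. A consumer that reads the result lazily walks through l long
-- before 'length l' is demanded, so every cell of l stays reachable until
-- the trailing padding is finally built.
padV1 :: Int -> a -> (Int, [a]) -> [a]
padV1 total bg (o, l) =
  replicate o bg ++ l ++ replicate (total - o - length l) bg
```

For example, padV1 6 0 (1,[7,8,9]) yields [0,7,8,9,0,0]. The result is correct; the cost lies only in how long l must be retained.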

As discussed in Chapter 23 of [Peyton Jones87a] and in [Hughes84a], the behaviour of lazy functional programs is full of subtlety. Our code may be a good example of a space leak caused by the scheduling problem. The Haskell code handles an image as a list of lists of pixels, the lists are treated lazily through input, processing and output, and the process is a simple local neighbourhood operation that does not rely on data at distant positions, so it should not cause any space leaks. However, as shown in Appendix B, our code had a space leak for the above reason.

The code has been modified to overcome the problem of accumulating the whole image. The interval structure has become a triple of (origin, length, list). The length was considered redundant in the original code in Chapter 3, but it now seems a good idea to add it to the data structure. The length of a row and the length of an image are provided as width and height respectively in the header information of a rasterfile, so there is almost no overhead in obtaining this information. The new interval structure in Haskell is:

type Interval a = (Int,Int,[a])

The subdefinitions for the improved ip function are as follows:

l2im ll = (0,height,map fn ll) where fn x = (0,width,x)
im2l (o,ln,l) = (map chr . concat) (rep o bgr ++ map flatten l ++ rep (height-o-ln) bgr)
flatten (o,ln,l) = (rep o bgp) ++ l ++ (rep (width-o-ln) bgp)
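The effect of carrying the length in the structure can be sketched independently of the rasterfile code. In the illustration below, padV2 and toInterval are hypothetical names, rep is again assumed to behave like replicate, and the total size and background element are explicit parameters.

```haskell
-- Sketch of the Version 2 representation: (origin, length, elements).
type Interval a = (Int, Int, [a])

-- Wrap a plain list whose length n is already known, e.g. from a header.
toInterval :: Int -> [a] -> Interval a
toInterval n xs = (0, n, xs)

-- The stored length ln replaces 'length l', so l is referenced only once
-- and its cells can be reclaimed as soon as the consumer has passed them.
padV2 :: Int -> a -> Interval a -> [a]
padV2 total bg (o, ln, l) =
  replicate o bg ++ l ++ replicate (total - o - ln) bg
```

Because the list is never measured, padV2 can even be applied to an unbounded list as long as only a prefix of the result is demanded, which shows that nothing forces the whole of l to be held.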

The other functions which need modification are the unary and binary pointwise operations, and the translation of intervals:

unaryInterval f (o,ln,p) = (o,ln, map f p)
binaryInterval f (o1,ln1,p1) (o2,ln2,p2)
  | (o1 < o2) = (o, ln, zipWith f (drop (o2-o1) p1) p2)
  | otherwise = (o, ln, zipWith f p1 (drop (o1-o2) p2))
  where o = max o1 o2
        ln = max 0 ((min (o1+ln1) (o2+ln2)) - o)
translateInterval d (o,ln,p) = (o+d,ln,p)
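A small worked example may help to see how the origin and length of the overlap are obtained. The definitions are repeated from the text so that the block stands alone; the type signatures are our own addition and are an assumption about the intended types.

```haskell
type Interval a = (Int, Int, [a])   -- (origin, length, elements)

-- Combine two intervals pointwise over the region where they overlap.
binaryInterval :: (a -> b -> c) -> Interval a -> Interval b -> Interval c
binaryInterval f (o1, ln1, p1) (o2, ln2, p2)
  | o1 < o2   = (o, ln, zipWith f (drop (o2 - o1) p1) p2)
  | otherwise = (o, ln, zipWith f p1 (drop (o1 - o2) p2))
  where
    o  = max o1 o2                               -- overlap starts here
    ln = max 0 (min (o1 + ln1) (o2 + ln2) - o)   -- never negative

-- Shift an interval by d positions; elements and length are untouched.
translateInterval :: Int -> Interval a -> Interval a
translateInterval d (o, ln, p) = (o + d, ln, p)
```

Adding (0,4,[1,2,3,4]) to the same-sized interval (0,4,[10,20,30,40]) translated by 2 gives (2,2,[13,24]): the overlap starts at position 2 and covers two elements. Two disjoint intervals such as (0,2,[1,2]) and (5,2,[3,4]) give the empty interval (5,0,[]).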

Modifications of convl and makeMask are also necessary, as well as of related functions such as domain, element, etc. The new definitions of these functions are as follows:

convl mul add mask im
  = accum [prod (element mask p) (shiftInterval p) | p <- domain mask]
  where
    accum = foldl1 add
    prod x = unaryInterval (mul x)
    shiftInterval p = translateInterval (p-((second mask) `div` 2)) im

domain = subscripts . third
element = (!!) . third
second (a,b,c) = b
third (a,b,c) = c
makeMask n = fn (map fn (rep n (rep n 0))) where fn x = (0,n,x)
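To make the behaviour of the convolution concrete, the definitions can be assembled into a self-contained sketch for the 1D case. The helper subscripts is not defined in this section, so the version below (the valid indices of a list) is an assumption, as are the type signatures; domain and element are inlined as local definitions for brevity.

```haskell
type Interval a = (Int, Int, [a])

unaryInterval :: (a -> b) -> Interval a -> Interval b
unaryInterval f (o, ln, p) = (o, ln, map f p)

binaryInterval :: (a -> b -> c) -> Interval a -> Interval b -> Interval c
binaryInterval f (o1, ln1, p1) (o2, ln2, p2)
  | o1 < o2   = (o, ln, zipWith f (drop (o2 - o1) p1) p2)
  | otherwise = (o, ln, zipWith f p1 (drop (o1 - o2) p2))
  where
    o  = max o1 o2
    ln = max 0 (min (o1 + ln1) (o2 + ln2) - o)

translateInterval :: Int -> Interval a -> Interval a
translateInterval d (o, ln, p) = (o + d, ln, p)

second :: (a, b, c) -> b
second (_, b, _) = b

third :: (a, b, c) -> c
third (_, _, c) = c

-- Assumed helper: the valid indices of a list.
subscripts :: [a] -> [Int]
subscripts xs = [0 .. length xs - 1]

-- One scaled, shifted copy of the image per mask element, then accumulate.
convl :: (m -> a -> b)
      -> (Interval b -> Interval b -> Interval b)
      -> Interval m -> Interval a -> Interval b
convl mul add mask im =
  accum [prod (element p) (shiftInterval p) | p <- domain]
  where
    accum           = foldl1 add
    prod x          = unaryInterval (mul x)
    shiftInterval p = translateInterval (p - second mask `div` 2) im
    domain          = subscripts (third mask)
    element p       = third mask !! p
```

With a mask of three ones, (*) as the multiplicative operation and binaryInterval (+) as the additive one, the interval (0,5,[1,2,3,4,5]) is mapped to (1,3,[6,9,12]): only the three positions where the whole mask fits are retained, and each holds the sum of its neighbourhood.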

In all of the above functions the modifications are relatively trivial. The fundamental algorithm is unchanged; an extra parameter is simply added to save calculations.

8.4.2 Eliminate identical operations (Version 3)

The basic algorithm of our median filtering is to generate a list image, in which the length of each pixel list is equal to the square of the mask size. Then the median of each pixel (list) is taken as a unary pointwise operation. The function that takes a median is therefore the same throughout the list image, but a compiler does not spot this fact. Lazy evaluation shares identical expressions only when they occur in the same function call. In our original code (Version 1) the index of the median is calculated every time, e.g. 65536 times for a 256x256 image. The following is the original code to take a median:

median list = (sort list) !! ((length list) `div` 2)

where the length does not change: e.g. if the median mask is 3x3, the index, i.e. "(length list) `div` 2", is 4 and is constant throughout the operations. Utilising this knowledge, it is possible to modify the code so that the index is calculated only once, before the median function is called. The new definition follows:

medianImage n
  = (unaryPointwise (rankFilter m)) . (localHistImage (makeMask n))
  where m = n*n `div` 2
rankFilter m list = (sort list) !! m

where n is the mask size. Since the new function takes a rank order as a parameter, it works as a general rank filter rather than only as a median. Hence the name has been changed.
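The rank filter can be tried on a single pixel list without the rest of the image machinery. In the sketch below, medianOfMask is a hypothetical wrapper introduced only for illustration, and the library sort from Data.List stands in for whichever sort the program uses.

```haskell
import Data.List (sort)

-- Select the element of rank m (counting from 0) in sorted order.
rankFilter :: Ord a => Int -> [a] -> a
rankFilter m list = sort list !! m

-- Hypothetical wrapper: the rank m depends only on the mask size n, so it
-- is bound outside the per-pixel function. When 'medianOfMask n' is applied
-- once and the resulting function is mapped over all pixels, m is expected
-- to be evaluated at most once instead of once per pixel.
medianOfMask :: Ord a => Int -> [a] -> a
medianOfMask n = rankFilter m
  where m = n * n `div` 2
```

For a 3x3 mask, medianOfMask 3 [9,1,8,2,7,3,6,4,5] returns 5, the element of rank 4. Passing rank 0 or rank 8 to rankFilter instead would give a minimum or maximum filter.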

8.4.3 Whether to fold up a list from left or right? (Version 4)

We defined the higher-order convolution function (convl) to take multiplicative and additive operations as parameters, and accumulation is defined within the function as "foldl1 add". As discussed in Chapter 6 of [Bird88a], fold operations behave very subtly: for functions such as (+) or (*), which are strict in both arguments and can be computed in constant time and space, foldl is more efficient, whereas for functions such as (&&) or (++), which are non-strict in some argument, foldr is often more efficient. Therefore it may be a mistake to hard-code the direction of accumulation within the function definition of convolution.

Based on the above consideration, the new definition of convl below takes an accumulation function instead of an additive operation. Since the append operator (++) is passed as an argument in order to produce a list image, accumulating from the right should be more efficient. The modified code is the following:
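The distinction between the two fold directions can be illustrated with three small functions. This is a sketch under our own assumptions: the function names are hypothetical, and foldl' from Data.List (the strict variant of foldl) is used for the strict case, which goes slightly beyond what the text states.

```haskell
import Data.List (foldl')

-- (+) is strict in both arguments: a strict left fold keeps one running
-- total instead of building a chain of suspended additions.
sumLeft :: [Int] -> Int
sumLeft = foldl' (+) 0

-- (&&) is non-strict in its second argument: a right fold can stop at the
-- first False without inspecting the rest of the list.
allRight :: [Bool] -> Bool
allRight = foldr (&&) True

-- (++) is non-strict in its second argument: a right fold yields the first
-- elements of the result before the later lists are touched.
concatRight :: [[a]] -> [a]
concatRight = foldr (++) []
```

Both right folds work even on unbounded input: allRight (False : repeat True) is False, and take 3 (concatRight (map (:[]) [1..])) is [1,2,3]. A left fold would not terminate in either case.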

convl mul accum mask im
  = accum [prod (element mask p) (shiftInterval p) | p <- domain mask]
  where
    prod x = unaryInterval (mul x)
    shiftInterval p = translateInterval (p-((second mask) `div` 2)) im

localHistImage = convl f accum
  where f = localHistRow
        accum = foldr1 (binaryPointwise (++))

localHistRow = convl f accum
  where f a b = [b]

8.4.4 Use of cons instead of append (Version 5)

It is generally quicker to use cons (:) instead of append (++) to attach an element to a list, because in order to append two lists the one in front must be traversed. In the median filtering it is possible to use cons in the 1D local histogramming operation. The modified function is:

localHistRow
  = convl f accum
  where f a b = b
        accum (r:rs) = foldr (binaryRow (:)) (unaryRow (:[]) r) rs
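The difference between the two styles of accumulation can be seen on plain lists, without the row and interval wrappers. The following sketch uses hypothetical names (gatherAppend, gatherCons) and mirrors only the shape of the two accum definitions, not the thesis code itself.

```haskell
-- Version 4 style: every element is first wrapped in a singleton list and
-- the singletons are joined with (++), which walks its left operand.
-- Like foldr1 itself, this is undefined on an empty list.
gatherAppend :: [a] -> [a]
gatherAppend xs = foldr1 (++) (map (: []) xs)

-- Version 5 style: only the first element is wrapped as the seed; the
-- remaining elements are attached directly with (:).
gatherCons :: [a] -> [a]
gatherCons []       = []
gatherCons (r : rs) = foldr (:) [r] rs
```

Note that gatherCons [1,2,3] gives [2,3,1] whereas gatherAppend [1,2,3] gives [1,2,3]. The order differs because the seed ends up last, but the collected elements are the same, which is all that matters when the list is subsequently sorted to take a median.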

8.4.5 Insertion sort instead of quick sort (Version 6)

As we discussed in optimising C code (Section 8.3.1), insertion sort is normally regarded as one of the best choices for sorting small lists like this. We implemented insertion sort in Haskell as follows and replaced the library quick sort with this insertion sort:

-- insertion sort
sort :: Ord a => [a] -> [a]
sort [] = []
sort (x:ys) = insert x (sort ys)
insert :: Ord a => a -> [a] -> [a]
insert x [] = [x]
insert x ys@(y:ys')
  | x <= y = x:ys
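The listing above breaks off after its first guard in this copy of the text. For reference, a complete insertion sort can be sketched as follows; the final otherwise branch is our assumption of the conventional definition and is not taken from the source.

```haskell
-- insertion sort (sketch; the last guard is an assumed completion)
sort :: Ord a => [a] -> [a]
sort []     = []
sort (x:ys) = insert x (sort ys)

-- Insert x into an already sorted list, keeping it sorted.
insert :: Ord a => a -> [a] -> [a]
insert x [] = [x]
insert x ys@(y:ys')
  | x <= y    = x : ys               -- x belongs in front of the whole list
  | otherwise = y : insert x ys'     -- assumed: keep y, insert into the tail
```

For the nine-element lists produced by a 3x3 mask, each insert performs at most nine comparisons, which is where the advantage over a general-purpose sort on such small inputs is expected to come from.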
