Test Suite Minimization Problem
FranisoChiano,JavierFerrer,andEnriqueAlba
UniversityofMalaga,Spain,
fhiano,ferrer,albagl.u ma.e s
Abstrat. Landsapetheoryprovidesaformalframeworkinwhih
om-binatorialoptimizationproblemsanbetheoretiallyharaterizedasa
sumofaspeialkindoflandsapealledelementarylandsape.The
de-omposition of theobjetive funtionof aproblem into itselementary
omponentsprovidesadditional knowledgeonthe problemthatanbe
exploited to reate new searh methods for the problem. We analyze
the Test Suite Minimization problem in Regression Testing from the
pointofviewoflandsapetheory.Wendtheelementarylandsape
de-ompositionoftheproblem andproposeapratialappliationofsuh
deompositionforthesearh.
Keywords: Fitnesslandsapes,testsuiteminimization,regression
test-ing,elementarylandsapes
1 Introdution
Thetheory oflandsapesfouseson theanalysisof thestruture ofthe searh
spaethatisinduedbytheombinedinuenesoftheobjetivefuntionofthe
optimization problem andthe hoie neighborhood operator [8℄.In theeld of
ombinatorialoptimization,thistheoryhasbeenusedtoharaterize
optimiza-tionproblems and to obtainglobalstatistis ofthe problems[11℄. However,in
reent years, researhershavebeeninterested in the appliations of landsape
theorytoimprovethesearhalgorithms[5℄.
Alandsape for aombinatorial optimization problem is atriple(X;N;f),
where f :X 7!R denestheobjetivefuntion and theneighborhoodoperator
funtion N(x) generates the set of points reahable from x 2 X in a single
appliationoftheneighborhoodoperator.Ify2N(x)theny isaneighborofx.
Thereexistsaspeialkindoflandsapes,alledelementarylandsapes,whih
areofpartiularinterestduetotheirproperties[12℄.Wedeneandanalyzethe
elementarylandsapes in Setion 2, but we anadvanethat theyare
hara-terizedbytheGrover's waveequation:
avgff(y)g
y2N(x)
=f(x)+
d
f f(x)
where d is thesize ofthe neighborhood, jN(x)j, whih weassume is the same
y2N(x)
averageoftheobjetivefuntionf omputedin itsneighborhood:
avgff(y)g
y2N(x) =
1
jN(x)j X
y2N(x)
f(y) (1)
For a given problem instane whose objetive funtion is elementary, the
values
f and an be easily omputed in an eÆient way, usually from the
problemdata.Thus,thewaveequationmakesitpossibletoomputetheaverage
value of the tness funtion f evaluated over all of the neighbors of x using
onlythe valuef(x),without evaluating any ofthe neighbors.Thismeans that
in elementary landsapes we get additional information from a singlesolution
evaluation. We get an ideaof what is the quality of the solutions around the
urrentone.Thisinformationanbeusedtodesignmoreleversearhstrategies
andoperatorswhiheetivelyusetheinformation.
Luetal.[5℄provideanieexampleoftheappliationofthelandsape
anal-ysisto improvetheperformaneof asearhmethod. Intheirwork,the
perfor-maneoftheSamplingHillClimbing isimprovedbyavoidingtheevaluationof
non-promising solutions. The average tness value in the neighborhood of the
solutionsomputedwith(1)isattheoreoftheirproposal.
Whenthelandsapeis notelementaryitis alwayspossibleto writethe
ob-jetivefuntionasasumofelementaryomponents,alledelementarylandsape
deompositionofaproblem[1℄.Then,Grover'swaveequationanbeappliedto
eahelementaryomponentandalltheresultsaresummedtogivetheaverage
tnessintheneighborhoodofasolution.Furthermore,forsomeproblemsthe
av-erageannotbelimitedtotheneighborhoodofasolution,butitanbeextended
to theseond-orderneighrbors (neighborsof neighbors),third-orderneighbors,
and, in general, to any arbitraryregion aroundagivensolution, inludingthe
whole searh spae. Sutton et al. [10℄ show how to ompute theaverages over
spheres andballsofarbitraryradiusaroundagivensolutioninpolynomialtime
using theelementary landsapedeompositionof real-valuedfuntions over
bi-nary strings. In [9℄ they propose a method that uses these averages over the
balls around a solution to esape from plateaus in the MAX-k-SAT problem.
The empirial results notied an improvement when the method wasapplied.
Langdon[4℄alsoanalyzedthespheresofarbitraryradiusfromthepointofview
of landsapetheory, highlighting that the Walsh funtions are eigenvetorsof
thespheresandthemutation matrixin GAs.
Ifweextendthelandsapeanalysisoftheobjetivefuntionf totheirpowers
(f 2
,f 3
, et.), Grover'swaveequation allowsoneto ompute higher-order
mo-mentsofthetnessdistributionaroundasolutionand,withthem,thevariane,
theskewnessandthekurtosis ofthisdistribution.Suttonet al.[10℄ providean
algorithmforthisomputation.
Weanalyzeherethe Test SuiteMinimization problem in regressiontesting
from thepointof viewoflandsape theory.This softwareengineering problem
the mathematial tools required to understand the rest of the paperand
Se-tion3formallydenestheTestSuiteMinimizationproblem.Setion 4presents
thetwomainontributions:theelementarylandsapedeomposition ofthe
ob-jetivefuntionoftheproblemanditssquare.Weprovidelosed-formformulas
forbothf andf 2
.Inthemathematialdevelopmentweinludeanovel
applia-tionof the Krawthouk matriesto thelandsape analysis.Setion 5proposes
an appliation of thedeompositions of f and f 2
and presents a short
exper-imental study showing the benets (and drawbaks) of the proposal. Finally,
withSetion 6weonludethepaper.
2 Bakground
Inthissetionwepresentsomefundamentalresultsoflandsapetheory.Wewill
only fous on the relevant information required to understand the rest of the
paper.Theinterestedreaderandeepenonthistopiin [7℄.
Let(X;N;f)bealandsape,whereX isanitesetofsolutions,f :X !R
isareal-valuedfuntion denedonX andN :X !P(X)istheneighborhood
operator.TheadjaenyanddegreematriesoftheneighborhoodN aredened
as:
A
xy =
1ify2N(x)
0otherwise
; D
xy =
jN(x)jifx=y
0 otherwise
(2)
Werestrit our attention to regular neighborhoods, where jN(x)j =d >0
for aonstantd, forallx 2X. Then,the degreematrixisD =dI, where I is
theidentitymatrix.TheLaplaianmatrixassoiatedtotheneighborhoodis
dened by=A D.Intheaseof regularneighborhoodsitis =A dI.
Any disrete funtion, f, dened over the set of andidate solutions an be
haraterizedasavetorin R jXj
.AnyjXjjXjmatrixanbeinterpretedasa
linearmap that ats on vetorsin R jXj
.Forexample,the adjaenymatrix A
atsonfuntionf asfollows
Af = 0
B
B
B
P
y2N(x1) f(y)
P
y2N(x
2 )
f(y)
.
.
.
P
y2N(x
jXj )
f(y) 1
C
C
C
A
; (Af)(x)= X
y2N(x)
f(y) (3)
Thus, the omponent x of (A f) is the sum of the funtion value of all
theneighborsofx.Stadlerdenesthelassofelementary landsapes wherethe
funtionf isaneigenvetor(oreigenfuntion)oftheLaplaianuptoanadditive
onstant[8℄.Formally,wehavethefollowing
Denition1. Let(X;N;f) bea landsape and the Laplaian matrixof the
onguration spae. The funtion f is said to be elementary if there exists a
onnetedneighborhoods(theonesweonsider here)theosetbis theaverage
value of thefuntion overthe whole searh spae: b =
f. Taking into aount
basi results of linear algebra, it an be proved that if f is elementary with
eigenvalue,af+bisalsoelementarywiththesameeigenvalue.Furthermore,
inregularneighborhoods,ifgisaneigenfuntionof witheigenvaluethen
g is also aneigenvalueof A, theadjaeny matrix,with eigenvalued .The
average valueof the tness funtion in the neighborhood of a solutionan be
omputedusingtheexpressionavgff(y)g
y2N(x) =
1
d
(Af)(x).Iff isan
elemen-taryfuntion witheigenvalue,thentheaverageisomputedas:
avgff(y)g
y2N(x)
= avg
y2N(x) ff(y)
fg+
f = 1
d (A(f
f))(x)+
f
=
d
d (f(x)
f)+
f =f(x)+
d (
f f(x))
andwegetGrover'swaveequation.Inthepreviousexpressionweusedthefat
that f
f isaneigenfuntionofAwitheigenvalued .
The previous denitions are general onepts of landsape theory. Let us
fous now on the binary strings with the one-hange neighborhood, whih is
therepresentation andtheneighborhood weusein the testsuiteminimization
problem.InthisasethesolutionsetX isthesetofallbinarystringsofsizen.
Twosolutionsxandy areneighboringifoneanbeobtainedfromtheotherby
ipping abit, that is, iftheHammingdistane betweenthesolutions,denoted
with H (x;y), is1.We denethesphereof radiusk aroundasolutionx asthe
set ofallsolutionslyingatHammingdistanek from x[10℄.Aball ofradius k
isthesetof allthesolutionslyingat Hammingdistane lowerorequalto k.In
analogytotheadjaenymatrixwedenethesphereandballmatriesofradius
kas:
S (k )
xy =
1ifH (x;y)=k
0otherwise
; B
(k )
xy =
k
X
=0 S
()
xy =
1ifH (x;y)k
0otherwise
(4)
Sinetheballmatriesarebasedonthespherematriesweanfousonthe
latter.Thespherematrixofradiusoneistheadjaenymatrixoftheone-hange
neighborhood,A,andthespherematrixofradiuszeroistheidentitymatrix,I.
Following[10℄,thematriesS (k )
anbedenedusingthereurrene:
S (0)
=I; S (1)
=A; S (k +1)
= 1
k+1
AS (k )
(n k+1)S (k 1)
(5)
With thehelp ofthe reurreneweanwrite allthematries S (k )
as
poly-nomials in A, the adjaeny matrix. For example,S (2)
= 1
2 A
2
nI
. As we
previouslynoted,theeigenvetorsoftheLaplaianmatrixareeigenvetorsof
theadjaeny matrixA. Ontheother hand,iff is eigenvetorofA, then itis
alsoaneigenvetorofanypolynomialinA.As aonsequene,allthefuntions
theexpression forS (k )
. Thesameistruefor theballmatries B (k )
, sinethey
areasumofspherematries.Letusdenethefollowingseriesofpolynomials:
S (0)
(x)=1 (6)
S (1)
(x)=x (7)
S (k +1)
(x)= 1
k+1
xS (k )
(x) (n k+1)S (k 1)
(x)
(8)
Weusethesamename forthepolynomialsandthe matriesrelatedto the
spheres.Thereadershouldnotie,however,thatthepolynomialswillbealways
presentedwiththeirargumentandthematrieshavenoargument.Thatis,S (k )
isthematrixandS (k )
(x)isthepolynomial.Usingthepreviouspolynomials,the
matrixS (k )
an bewrittenasS (k )
(A)(the polynomialS (k )
(x)evaluatedinthe
matrixA)andanyeigenvetorgofAwitheigenvalueisalsoaneigenvetorof
S (k )
(A)witheigenvalueS (k )
().
OnerelevantsetofeigenvetorsoftheLaplaianinthebinaryrepresentation
is that of Walsh funtions [11℄.Furthermore,the Walsh funtions form an
or-thogonalbasisofeigenvetorsintheongurationspae.Thus,theyhavebeen
used to nd the elementary landsape deomposition of problems with a
bi-naryrepresentationliketheSAT[6℄.Wewillusethesefuntions toprovidethe
landsapedeompositionoftheobjetivefuntionofthetestsuiteminimization
problem. Giventhespaeofbinary stringsof lengthn, B n
,a(non-normalized)
Walshfuntion withparameterw2B n
is denedas:
w (x)=
n
Y
i=1 ( 1)
wixi
=( 1) P
n
i=1 wixi
(9)
Twouseful propertiesof Walsh funtions are
w
v =
w+v
where w+v
is thebitwise sumin Z
2
ofw and v; and 2
w =
w
w =
2w =
0
=1.We
dene the order of a Walsh funtion
w
asthe valueh wjwi = P
n
i=1 w
i , that
is, thenumberofones in w. A Walshfuntion withorder piselementarywith
eigenvalue=2p[8℄. TheaveragevalueofaWalsh funtion of orderp>0is
zero,that is,
w
=0ifw hasatleast one1.TheonlyWalshfuntion of order
p=0is
0
=1,whih isaonstant.
In the mathematial development of Setion 4 we will use, among others,
Walsh funtions of order 1 and 2. Thus, we present here a speial ompat
notation forthose binary stringshaving onlyone ortwobits set to 1.We will
denote with i the binary stringwith position i set to 1 and the rest set to 0.
Wealsodenotewithi;j (i6=j)thebinarystringwithpositionsiandjsetto1
and therestto 0.Weomitthe lengthofthestringn, but itwill belearfrom
the ontext. For example, if we are onsidering binary strings in B 4
we have
1=1000and2;3=0110.Usingthisnotationweanwrite
i
(x)=( 1) x
i
=1 2x
i
(10)
in W and u, that is, W ^u = fw ^ujw 2 Wg. For example, B ^0101 =
f0000;0001;0100;0101g.
Sine the Walsh funtions form an orthogonal basis of R 2
n
, any arbitrary
pseudobooleanfuntion anbewrittenasaweightedsumofWalshfuntionsin
thefollowingway:
f = X
w2B n
a
w w
(11)
where the values a
w
are alled Walsh oeÆients. We angroup together the
Walshfuntionshavingthesameordertondtheelementarylandsape
deom-positionofthefuntion.Thatis:
f (p)
= X
w2B n
hwjwi=p a
w w
(12)
whereeahf (p)
isanelementaryfuntionwitheigenvalue2p.Thefuntionfan
bewrittenasasumofthen+1elementaryomponents,thatis:f = P
n
p=0 f
(p)
.
Thus,anyfuntionan bedeomposed inasumofatmostnelementary
land-sapes,sineweanaddtheonstantvaluef (0)
to anyoftheotherelementary
omponents.
Oneweknowthatthepossibleeigenvaluesoftheelementaryomponentsof
anyfuntion f are2pwith0pn,weanomputethepossibleeigenvalues
ofthespherematries.Sinethesizeoftheneighborhoodisd=n,weonlude
that the only possible eigenvalues for the spheres are S (k )
(n 2p) with p 2
f0;1;:::;ng.WiththehelpofEqs.(6)to(8)weanwriteareurreneformula
fortheeigenvaluesofthespherematrieswhosesolutionisS (k )
(n 2p)=K (n)
k ;p ,
whereK (n)
k ;p
isthe(k;p)elementofthen-thKrawthoukmatrix[10℄,whihisan
(n+1)(n+1) integermatrix.Wewill useKrawthoukmatries tosimplify
the expressions and redue the omputation of the elementary omponents of
the test suite minimization. The interestedreader andeepen on Krawthouk
matries in [3℄. One importantpropertyof the Krawthoukmatries that will
beusefulin Setion4is:
(1+x) n p
(1 x) p
= n
X
k =0 x
k
K (n)
k ;p
(13)
Eahomponentf (p)
oftheelementarylandsapedeomposition off isan
eigenfuntion of the sphere matrixof radius r with eigenvalueS (r)
(n 2p)=
K (n)
r;p
. Thus, we an ompute the average tness valuein a sphere of radius r
aroundasolutionxas:
avgff(y)g
yjH(y;x)=r =
n
r
1 n
X
K (n)
r;p f
(p)
rifweknowtheelementarylandsapedeompositionoff
:
=avgff
(y)g
yjH(y;x)=r =
n
r
1 n
X
p=0 K
(n)
r;p ( f
) (p)
(x) (15)
3 Test Suite Minimization Problem
When a piee of software is modied, the new software is tested using some
previoustest asesin order to hekif newerrorswereintrodued. This hek
isknownasregressiontesting.In[14℄YooandHarmanprovideaveryomplete
surveyonsearh-basedtehniquesforregressiontesting.Theydistinguishthree
dierent relatedproblems: test suite minimization,test ase seletionand test
aseprioritization.Theproblemwefaehereisthetestsuiteminimization [13℄.
We dene theproblem asfollows.LetT =ft
1 ;t
2 ;:::;t
n
gbeaset of tests for
aprogramand letM=fm
1 ;m
2 ;:::;m
k
gbeaset ofelementsof theprogram
that wewanttooverwiththetests.AfterrunningallthetestsT wendthat
eah test an overseveral program elements. This information is stored in a
matrixT thatisdenedas:
T
ij =
1ifnodem
i
isoveredbytest t
j
0otherwise
(16)
Wedenetheoverageof asubsetoftestsX T as:
overage(X)=jfij9j 2X;T
ij
=1gj (17)
Theproblem onsists in ndinga subset X T suh that the overage is
maximizedwhile thenumberoftestsasesin theset jXjisminimized. Wean
denetheobjetivefuntionoftheproblemastheweightedsumoftheoverage
andthenumberoftests. Thus, theobjetivefuntionanbewrittenas:
f(X)=overage(X) jXj (18)
whereisaonstantthatsettherelativeimportaneoftheostandoverage.It
anbeinterpretedastheostofatestmeasuredinthesameunitsasthebenet
ofanewoveredelementinthesoftware.Weassumeherethatalltheelements
inMtobeoveredhavethesamevaluefortheuserandtheostoftestingone
test in T is the samefor all of them. We defer to future work the analysis of
theobjetivefuntionwhenthis assumptionisnottrue.Although thefuntion
proposedisaweightedsum,whihsimpliesthelandsapeanalysis,non-linear
funtions anbealsousedand analyzed.
Inthefollowingwewill usebinary stringsto representthesolutionsof the
problem. Thus, we introdue the deision variables x
j
2 B for 1 j n.
objetivefuntionf an bewrittenas:
overage(x)= k
X
i=1 n
max
j=1 fT
ij x
j
g; ones(x)= n
X
j=1 x
j
(19)
f(x)= k
X
i=1 n
max
j=1 fT
ij x
j
g ones(x) (20)
4 Elementary Landsape Deomposition
Inthissetionwepresenttwoofthemainontributionsofthiswork:the
elemen-tarylandsapedeompositionoff andf 2
.Inordertosimplifytheequationslet
usintroduesomenotation.LetusdenethesetsV
i =fjjT
ij
=1g.V
i
ontains
theindiesofthetestswhihovertheelementm
i
.Wealsouseinthefollowing
the termT
i
to referto thebinary stringomposed of theelementsof the i-th
rowofmatrixT.T
i
isabinarymaskwith1sinthepositionsthatappearinV
i .
4.1 Deompositionof f
The goal of this setion is to nd theWalsh deomposition of f. We rst
de-omposethefuntionsoverage(x)andones(x)intoelementarylandsapesand
then we ombine the results. Let us start by analyzing the overage funtion
and,inpartiular,letuswritethemaximuminitsdenitionasaweightedsum
ofWalshfuntions withthehelp of(10).
n
max
j=1 fT
ij x
j g=1
n
Y
j=1 (1 T
ij x
j )=1
Y
j2V
i (1 x
j )
=1 Y
j2V
i 1+
j (x)
2
=1 2 jV
i j
Y
j2V
i (1+
j
(x)) (21)
Wean expandtheprodutof Walshfuntions in (21)using
u v =
u+v
to gettheWalshdeompositionofmax n
j=1 .
n
max
j=1 fT
ij x
j
g=1 2 jVij
Y
j2Vi (1+
j
(x))=1 2 jVij
X
W2P(Vi) Y
j2W j
(x) (22)
=1 2 jV
i j
X
w2B n
^T
i w
(x)
Using the Walsh deomposition we an obtain that elementary landsape
deomposition. The elementary omponents are the sums of weighted Walsh
ofmax n j=1 is: n max j=1 fT ij x j g (0) =1 1 2 jV i j (23) n max j=1 fT ij x j g (p) = 1 2 jV i j X
w2B n
^T
i
hw;wi=p w
(x) wherep>0 (24)
Eqs.(23)and(24)aretheelementarylandsapedeompositionofthe
over-ageofonesinglesoftwareelement.Wejusthavetoaddalltheomponentsofall
the kelementsto get theelementarylandsape deomposition ofoverage(x).
However,weshould highlight that thepreviousexpression is notveryeÆient
to ompute theomponentsof the maximum. Weanobserve that it requires
to ompute a sumof
jVij
p
Walsh funtions. Before ombiningall the piees
to gettheelementarylandsapedeomposition oftheobjetivefuntion ofthe
problem, we need rst to nd a simplerand more eÆient expression for the
elementaryomponentsoftheoverageofonesingleelement.
Up to the best of our knowledge, this is the rst time that the following
mathematial development is performed in the literature. The essene of the
development, however,isusefulby itselfand anbeapplied toother problems
withbinaryrepresentationinwhihtheWalshanalysisanbeapplied(likethe
Max-SATproblem).Wewillfousonthesummationof(24).Letusrewritethis
expressionagainas:
X
w2B n
^T
i
hw;wi=p w
(x)= X
W 2P(V
i )
jWj=p Y
j2W j
(x) (25)
Nowwean identifytheseondmemberofthepreviousexpressionwiththe
oeÆientofapolynomial.Letus onsiderthepolynomialQ (i)
x
(z)dened as:
Q (i) x (z)= Y j2V i (z+ j (x))= jVij X l=0 z l 0 B B B X
W2P(V
i )
jWj=jV
i j l Y j2W j (x) 1 C C C A = jVij X l=0 q l z l (26)
From(26)weonludethatthesummationin(25)istheoeÆientofz jV
i j p
in the polynomial Q (i)
x
(z), that is, q
jV
i j p
. Aording to (10)and (26)wean
writeQ (i)
x
(z)=(z+1) n (i) 0 (z 1) n (i) 1 wheren (i) 0 andn (i) 1
arethenumberofzeros
andones,respetively,in thepositionsx
j
ofthesolutionwith j2V
i
.Itshould
belearthat n (i) 0 +n (i) 1 =jV i
j.Nowweanprotfrom thefatthat, aording
to (13), the polynomials Q (i)
x
(z) are related to the Krawthouk matries by
Q (i)
x
(z)=( 1) n (i) 1 P jV i j l=0 K jV i j l;n (i) z l
and we anwrite q
l
X
w2B n
^T
i
hw;wi=p w
(x)= X
W2P(V
i )
jWj=p Y
j2W j
(x)=q
jV
i j p
=( 1) n (i) 1 K jVij jVij p;n (i) 1 (27)
TherstN KrawthoukmatriesanbeomputedinO(N 3
).Furthermore,
theyan beomputedone andstored inale forfuture use. Thus, we
trans-formthesummationoveralargenumberofWalshfuntionsintoaountofthe
numberofonesinabitstringandareadofavaluestoredinmemory,whihhas
omplexity O(n).Eq. (27) is animportant resultthat allows us to provide an
algorithm forevaluatingtheelementarylandsapedeomposition ofour
obje-tivefuntion.ThisalgorithmismoreeÆientthantheoneproposedbySutton
etal.in[10℄.Weannowextendtheelementarylandsapedeompositiontothe
ompleteoverageof alltheelements.That is:
overage (0) (x)= k X i=1 n max j=1 fT ij x j g (0) = k X i=1 1 1 2 jVij (28) overage (p) (x)= k X i=1 n max j=1 fT ij x j g (p) = k X i=1 1 2 jV i j ( 1) n (i) 1 K jV i j jV i j p;n (i) 1 (29)
wherep>0.Thepreviousexpressionsanbeomputedin O(nk).
Wenowneedthedeompositionof thefuntion ones(x):
ones(x)= n X j=1 x j = n X j=1 1 j (x) 2 = n 2 1 2 n X j=1 j (x) (30)
Then,weanwrite:
ones (0) (x)= n 2 ; ones (1) (x)= 1 2 n X j=1 j
(x)=ones(x) n
2
(31)
whihistheelementarylandsapedeompositionofones(x).Finally,weombine
this resultwith thedeomposition ofoverage(x)to obtainthedeomposition
off:
f (0) (x)= k X i=1 1 1 2 jVij n 2 (32) f (1) (x)= k X i=1 1 2 jV i j ( 1) n (i) 1 K jV i j jV i j 1;n (i) 1 ones(x) n 2 (33) f (p) (x)= k X 1 2 jVij ( 1) n (i) 1 K jVij jVij p;n (i) 1
mumnumber ofelementaryomponentsis equalto n,weanobtainthe ev
al-uationof all theelementary omponents of anarbitrarysolution x in O(n 2
k).
We found analgorithm with omplexity O(nk) to omputeall theelementary
omponents of f. This omplexity is lower than the O(n n
) omplexity of the
algorithmproposedin[10℄.
4.2 Deompositionof f 2
Intheprevioussetionwefoundtheelementarylandsapedeompositionoff.In
thissetionweareinterestedin theelementarylandsapedeompositionoff 2
,
sineitallowstoomputethevarianeinanyregion(sphereorball)aroundany
arbitrarysolutionx.Thederivationoftheelementarylandsapedeomposition
off 2
isbasedagainintheWalshanalysisofthefuntion.CombiningtheWalsh
deomposition in (22) withthe oneof(30) andthedenition off in (20),the
funtion f 2
anbewrittenas:
f 2 (x)= 2 4 k n 2 k X i=1 0 1 2 jV i j X
w2B n ^T i w (x) 1 A + 2 n X j=1 j (x) 3 5 2
Weneedtoexpandtheexpressioninordertondtheelementarylandsape
deomposition. Due to spae onstraints we omit the intermediate steps and
presentthenalexpressionsoftheelementaryomponentsoff 2 : f 2 (0) (x)= 2 + 2 4 n k X i=1 jV i j+2
2 jV i j + k X i;i 0 =1 1 2 jV i [V i 0j (35) f 2 (1)
(x)=(n 2ones(x)) k
X
i=1 (jV
i
j+2)( 1) n (i) 1 2 jV i j K jV i j jV i j 1;n (i) 1 ! + k X i;i 0 =1 ( 1) n (i_i 0 ) 1 2 jV i [V i 0 j K jV i [V i 0 j jV i [V i 0 j 1;n (i_i 0 ) 1 ! k X i=1
n 2ones(x) jV
i j+2n
(i) 1 2 jV i j (36) f 2 (2) (x)= 2 2 ( 1) ones(x) K n n 2;ones(x) k X i=1 (jV i
j+2)( 1) n (i) 1 2 jVij K jVij jVij 2;n (i) 1 ! + k X i;i 0 =1 ( 1) n (i_i 0 ) 1 2 jVi[V i 0j K jVi[V i 0j jVi[V i 0j 2;n
(i_i 0 ) 1 ! k X ( 1) n (i) 1 2 jVij K jVij jVij 1;n (i) 1
n 2ones(x) jV
i j+2n
(i)
1
f 2
(p)
(x)= k
X
i=1 (jV
i
j+2)( 1) n
(i)
1
2 jV
i j
K jV
i j
jV
i j p;n
(i)
1
+ k
X
i;i 0
=1 ( 1)
n (i_i
0
)
1
2 jV
i [V
i 0
j K
jV
i [V
i 0
j
jV
i [V
i 0
j p;n (i_i
0
)
1 !
k
X
i=1 ( 1)
n (i)
1
2 jV
i j
K jV
i j
jV
i j p+1;n
(i)
1
n 2ones(x) jV
i j+2n
(i)
1
(38)
where = k n=2, n (i_i
0
)
1
are the numberof ones in the positions x
j of the
solutionwithj2V
i [V
i 0
andp>2.Theelementaryomponents(36),(37)and
(38) an be omputed in O(nk 2
). Furthermore, we found an algorithm whih
omputesall(not onlyone)theomponentsinO(nk 2
).
5 Appliation of the Deomposition
InSetion4wehavederivedlosed-formformulasforeahelementaryomponent
of f and f 2
. Using this deompositions we an ompute the average
1 and
the standarddeviation of the tnessdistribution in thespheresand balls of
arbitraryradiusaroundagivensolutionx.One wehavetheevaluation ofthe
elementaryomponents,therstandseondordermomentsoff,
1 and
2 ,an
beomputed from Eqs. (32)-(34) and (35)-(38) in O(n)for anyball orsphere
aroundthe solutionusing(15). Thestandarddeviationanbeomputedfrom
thetworstmomentsusing theequation = p
2
2
1 .
Howan weuse this information?We propose herethe following operator.
Givenasolutionxomputethe
1
and ofthetnessdistributionaroundthe
solutioninallthespheresandballsuptoamaximumradiusr.Weandothisin
O(nk 2
),assumingthatrisxed.Usingtheaveragesandthestandarddeviations
omputed,wehekifthereisahighprobabilityofndingasolutioninaregion
aroundxthatisbetterthanthebestsofarsolution.Thishekisbasedonthe
expression
1
+d best,wheredisparameterandbestisthetnessvalueofthe
bestsofarsolution.Thehigherthevalueofthepreviousexpression,thehigher
the probability of ndinga solutionin the orresponding regionthat isbetter
thanthe best solution.Thepreviousexpressionisbasedonthe ideathat most
of thesamplesofadistributionanbefoundaroundtheaverageat adistane
that is a few times the standard deviation. For example, at least 75%of the
samplesanbefoundintheinterval[
1 2;
1
+2℄.Intheaseofthenormal
distribution,theperentageis95%.Inouroperator,if
1
+d >best,thenit
islikelythat asolutionbetterthanthebestfoundanbeinsidetheonsidered
region.Ifthathappens,thenaloalsearhisperformedintheregion.Thisloal
searhevaluatesallthesolutionsin thatregionandreplaestheurrentoneby
thebestsolutionfound. Thepseudoodeoftheoperatorisin Algorithm1.
We all this operator Guarded Loal Searh (GLS) beause it applies the
1: best=bestsofarsolution;
2: bestRegion=none;
3: quality= 1;
4: forr2all theonsideredregionsdo
5: (1,)=omputeAvgStdDev(x,r);
6: if
1
+d best>quality then
7: quality=1+d best;
8: bestRegion=r;
9: endif
10: endfor
11: y=x;
12: if quality>0then
13: y=applyLoalSearhInRegion(x,bestRegion)
14: endif
15: return y
bettersolutionwouldbefound,thusminimizingtheomputationostofaloal
searhinalargerregion.Weexpetourproposedoperatortohaveanimportant
intensiation omponent.Thus, apopulation-based metaheuristi would bea
goodomplementtoinreasethediversiationoftheombinedalgorithm.The
operator animprovethequalityofsolutionsofthealgorithmitisinludedin,
butitalsowillinreasetheruntime.However,thisruntimeshouldbequitelower
thanthe oneobtainediftheloal searh would beapplied at everystepofthe
algorithm.
5.1 ExperimentalStudy
Asaproofof onept,weanalyzetheperformaneoftheproposedoperatorin
this setion. For this experimental study we use a steady-state Geneti
Algo-rithm(GA) with10individualsinthepopulation,binarytournamentseletion,
bit-ipmutation withprobabilityp=0:01ofippingabit, one-pointrossover
andelitistreplaement.Thestoppingonditionistoreate100individuals(110
tnessevaluations).WeomparethreevariantsoftheGAthatdierinhowthe
loalsearhisapplied.Therstvariantdoesnotinludeanyloalsearh
opera-tor.Intheseondvariant,denotedwithGLSr,theGLSoperatorofAlgorithm1
isappliedtotheospringafterthemutation.Theregionsonsideredareallthe
spheresandballsuptoradiusr.Thethirdvariant,LSr,alwaysappliestheloal
searhafter themutationinaballofradiusr.
Forthe experimentsweseleted six programsfrom theSiemens suite. The
programsareprinttokens,printtokens2,shedule,shedule2,totinfoand
replae.They are available from the Software-artifat Infrastruture
Reposi-tory [2℄.Eah program hasalargenumberof available test suites,from whih
weseletthe rst100tests overingdierentnodes. Thus, in ourexperiments
exeutionsandweshowinTable1theaveragevaluesobtainedforthetnessof
thebestsolutionfoundand theexeutiontimeofthealgorithms,respetively.
Table 1.Fitnessofthebestsolutionfoundandomputationtime(inseonds)ofthe
algorithms(averagesover30independentruns)
Alg.
printtokens printtokens2 shedule shedule2 totinfo replae
Fit. Ses. Fit. Ses. Fit. Ses. Fit. Ses. Fit. Ses. Fit. Ses.
GA 89.20 0.03103.13 0.10 84.57 0.07 78.70 0.10 86.87 0.0371.90 0.03
GLS2105.17 37.93119.63 69.73101.60 21.10 93.60 52.63102.30 39.0788.13 37.30
LS2 113.27 10.67129.00 20.73111.07 3.80103.10 3.17110.00 3.0397.67 5.53
GLS3106.33 136.97120.87 84.10103.40 31.80 95.30 29.90103.03 33.4090.73 60.73
LS3 113.63 159.30129.80 141.33111.80 298.07103.97 90.67110.00 88.1398.00 141.37
GLS4105.27 390.03121.47 363.53103.40 237.17 96.37 212.70104.33 206.5091.13 368.97
LS4 114.003107.47129.972943.03112.002098.00104.001875.67110.001823.8098.003602.47
Weanobservein Table1that theorderingofthe algorithmsaordingto
the solutionsquality is LSr > GLSr >GA. This is the expeted result, sine
LSralwaysappliesadepthloalsearhwhileGLSrappliestheloalsearhonly
insomefavorableirumstanes.Ananalysisoftheevolutionofthebesttness
valuerevealsthatthisorderingis keptduringthesearhproess.
Ifwefousontheomputationtimerequiredbythealgorithms, weobserve
that GAisalwaysthefastestalgorithm.Whenr3,GLSris fasterthanLSr.
However,ifr=2then LSr isfaster thanGLSr. Thismeansthat theomplete
explorationofaballofradius r=2isfaster thandeterminingifaloal searh
should be applied in theGLS operator.Although weshowhere the
omputa-tiontimes, itshould benotedthatthis depends ontheimplementationdetails
andthemahinesused.Forthisreasonthestoppingonditionisthenumberof
evaluations.Thegreatamountoftimerequiredtoomputetheelementary
om-ponentsisthemaindrawbakoftheGLS operator.However,this omputation
an be parallelized, aswell as the appliation of the loal searh. In
partiu-lar,GraphiProessingUnits (GPUs)anbeused to omputetheelementary
omponentsin parallel.
6 Conlusion
Wehaveappliedlandsapetheorytondtheelementarylandsape
deomposi-tionoftheTestSuiteMinimizationproblemin regressiontesting.Wehavealso
deomposed thesquaredobjetive funtion.Using the losed-formformulas of
the deomposition wean omputethe average and thestandard deviation of
thetnessvaluesaroundagivensolutionxinaneÆientway.With thesetools
weproposedanoperator toimprovethequalityof thesolutions.This operator
appliesaloalsearharoundthesolutiononlyiftheprobabilityofndingabest
solutionishigh.Theresultsofanexperimentalstudyonrmsthattheoperator
improvesthesolutionsrequiringamoderateamountofomputation.Ablind
lo-alsearhoutperformstheresultsofourproposedoperatorbutrequiresalarge
amountofomputationasthesizeoftheexploredregioninreases.
Thefuture work shouldfousonnewappliationsof thetheorybut alsoon
aproblem.
Aknowledgements
Wethanktheanonymousreviewersfortheirinterestingandfruitfulomments.
ThisresearhhasbeenpartiallyfundedbytheSpanishMinistryofSieneand
InnovationandFEDERunderontratTIN2008-06491-C04-01(theM
projet)
and the Andalusian Government under ontrat P07-TIC-03044 (DIRICOM
projet).
Referenes
1. Chiano,F.,Whitley,L.D.,Alba,E.:Amethodologytondtheelementary
land-sapedeompositionof ombinatorialoptimizationproblems.Evolutionary
Com-putationInpress(doi:10.1162/EVCOa00039)
2. Do, H., Elbaum, S., Rothermel, G.: Supporting ontrolled
experi-mentation with testing tehniques: An infrastruture and its
poten-tial impat. Empirial Softw. Engg. 10, 405{435 (Otober 2005),
http://portal.am.org/itation.fm?id=1089922.10 89 928
3. Feinsilver, P.,Koik, J.: Krawthoukpolynomials and krawthouk matries.In:
Baeza-Yates,R.,Glaz,J.,Gzyl,H.,Husler,J.,Palaios,J.(eds.)ReentAdvanes
inAppliedProbability,pp.115{141.SpringerUS(2005)
4. Langdon,W.B.:Elementarybitstringmutationlandsapes.In:Beyer,H.G.,
Lang-don, W.(eds.)FoundationsofGeneti Algorithms.pp.25{41. ACM,
Shwarzen-berg,Austria(5-9Jan2011)
5. Lu,G., Bahsoon,R.,Yao,X.:Applyingelementarylandsapeanalysis to
searh-based softwareengineering. In:Proeedingsof the 2ndInternationalSymposium
onSearhBasedSoftware Engineering(2010)
6. Rana,S.,Hekendorn,R.B.,Whitley,D.:AtratablewalshanalysisofSATandits
impliationsforgenetialgorithms.In:ProeedingsofAAAI.pp.392{397.Menlo
Park,CA,USA(1998)
7. Reidys,C.M.,Stadler,P.F.:Combinatoriallandsapes.SIAMReview44(1),3{54
(2002)
8. Stadler, P.F.: Towarda theoryof landsapes. In: Lopez-Pe~na, R.,Capovilla, R.,
Gara-Pelayo, R.,H.Waelbroek, Zertruhe,F. (eds.)Complex Systemsand
Bi-naryNetworks.pp.77{163.Springer-Verlag(1995)
9. Sutton,A.M.,Howe,A.E.,Whitley,L.D.:DiretedplateausearhforMAX-k-SAT.
In:ProeedingsofSoCS.Atlanta,GA,USA(July2010)
10. Sutton,A.M.,Whitley,L.D.,Howe,A.E.:Computingthemomentsofk-bounded
pseudo-booleanfuntionsoverHammingspheresofarbitraryradiusinpolynomial
time.TheoretialComputerSieneInpress(doi:10.1016/j.ts.2011 .02 .00 6)
11. Sutton, A.M., Whitley, L.D., Howe, A.E.: A polynomial time omputation of
the exat orrelation struture of k-satisability landsapes. In: Proeedings of
GECCO.pp.365{372.ACM,New York,NY,USA(2009)
12. Whitley,D.,Sutton,A.M.,Howe,A.E.:Understandingelementarylandsapes.In:
ProeedingsofGECCO.pp.585{592.ACM,NewYork,NY,USA(2008)
13. Yoo,S.,Harman,M.:ParetoeÆient multi-objetivetest ase seletion.In:
Pro-eedings ofthe 2007 InternationalSymposium onSoftware Testingand Analysis
(ISSTA'07).pp.140{150.ACM,London,England(9-12July2007)
14. Yoo,S.,Harman,M.:Regressiontestingminimisation,seletionandprioritisation: