

According to the above description, the proposed algorithm for mining both membership functions and fuzzy association rules is described below.

The proposed mining algorithm:

INPUT: A body of n quantitative transaction data, a set of m items, a maximum possible number T of linguistic terms, a support threshold α, a confidence threshold λ, and a population size P.

OUTPUT: A set of fuzzy association rules with its associated set of membership functions.

STEP 1: Randomly generate m populations, one for each item; each individual in a population represents a possible set of membership functions for that item.

STEP 2: Encode each set of membership functions into a string representation in the way mentioned above.

STEP 3: Calculate the fitness value of each chromosome in each population by the following substeps:

STEP 3.1: For each transaction datum $D^{(i)}$, i = 1 to n, and for each item $I_j$, j = 1 to m, transfer the quantitative value $v_j^{(i)}$ into a fuzzy set $f_j^{(i)}$ represented as:

$$\frac{f_{j1}^{(i)}}{R_{j1}},\ \frac{f_{j2}^{(i)}}{R_{j2}},\ \cdots,\ \frac{f_{jl}^{(i)}}{R_{jl}},$$

using the corresponding membership functions represented by the chromosome, where $R_{jk}$ is the k-th fuzzy region (term) of item $I_j$, $f_{jk}^{(i)}$ is $v_j^{(i)}$'s fuzzy membership value in region $R_{jk}$, and $l$ $(= |I_j|)$ is the number of active linguistic terms for $I_j$.

STEP 3.2: For each item region $R_{jk}$, calculate its scalar cardinality on the transactions as follows:

$$\mathit{count}_{jk} = \sum_{i=1}^{n} f_{jk}^{(i)}.$$

STEP 3.3: For each $R_{jk}$, $1 \le j \le m$ and $1 \le k \le |I_j|$, check whether its $\mathit{count}_{jk}$ over n is larger than or equal to the minimum support threshold α. If $R_{jk}$ satisfies the above condition, put it in the set of large 1-itemsets ($L_1$). That is:

$$L_1 = \{ R_{jk} \mid \mathit{count}_{jk}/n \ge \alpha,\ 1 \le j \le m \text{ and } 1 \le k \le |I_j| \}.$$

STEP 3.4: Set the fitness value of the chromosome as the sum of the fuzzy supports (the scalar cardinalities / n) of the fuzzy regions in $L_1$ divided by $\mathit{suitability}(C_q)$. That is:

$$f(C_q) = \frac{\sum_{X \in L_1} \mathit{fuzzy\_support}(X)}{\mathit{suitability}(C_q)}.$$

STEP 4: Execute crossover operations on each population.

STEP 5: Execute mutation operations on each population.

STEP 6: Use the selection criteria to choose individuals in each population for the next generation.

STEP 7: If the termination criterion is not satisfied, go to Step 3; otherwise, do the next step.

STEP 8: Gather the sets of membership functions, each of which has the highest fitness value in its population.
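As a rough illustration of Step 3, the sketch below evaluates the fitness of one chromosome for a single item. It assumes isosceles-triangular membership functions stored as (center, half_width) pairs, which is an assumption made here for illustration and not necessarily the encoding referred to in Step 2. The value of suitability(Cq) is passed in as a plain number because its definition lies outside this section. The names `triangular` and `fitness` and the sample data are hypothetical.

```python
from typing import List, Tuple

# Assumed encoding (not taken from the text): one (center, half_width) pair
# per linguistic term, describing an isosceles-triangular membership function.
MembershipFunction = Tuple[float, float]


def triangular(value: float, center: float, half_width: float) -> float:
    """Membership degree of `value` in an isosceles triangle."""
    if half_width <= 0:
        return 1.0 if value == center else 0.0
    return max(0.0, 1.0 - abs(value - center) / half_width)


def fitness(values: List[float],
            chromosome: List[MembershipFunction],
            alpha: float,
            suitability: float) -> float:
    """Steps 3.1-3.4 for one item I_j and one chromosome C_q."""
    n = len(values)
    # Step 3.1: fuzzify every quantitative value v_j^(i).
    fuzzy = [[triangular(v, c, w) for (c, w) in chromosome] for v in values]
    # Step 3.2: scalar cardinality count_jk of each region R_jk.
    counts = [sum(row[k] for row in fuzzy) for k in range(len(chromosome))]
    # Step 3.3: regions whose fuzzy support count_jk / n reaches alpha form L1.
    large_supports = [c / n for c in counts if c / n >= alpha]
    # Step 3.4: sum of the fuzzy supports in L1 divided by suitability(C_q).
    return sum(large_supports) / suitability


# Hypothetical data: quantities of one item in four transactions.
quantities = [2.0, 5.0, 8.0, 5.0]
terms = [(2.0, 3.0), (5.0, 3.0), (8.0, 3.0)]  # e.g. "low", "middle", "high"
score = fitness(quantities, terms, alpha=0.3, suitability=1.0)
```

With this toy data only the middle term reaches the support threshold (fuzzy support 2/4), so the fitness is 0.5 when the suitability is taken as 1.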

The sets of the best membership functions gathered from each population are then used to mine fuzzy association rules from the given quantitative database. Our fuzzy mining algorithm proposed in [5] is then adopted to achieve this purpose. It first transforms each quantitative value into a fuzzy set of linguistic terms using the derived membership functions. It then calculates the scalar cardinality of each linguistic term on all the transaction data. The mining process based on fuzzy counts is then performed to find fuzzy association rules. The details of the fuzzy mining algorithm [5] are described as follows.

The algorithm for mining fuzzy association rules:

INPUT: A set of n quantitative transaction data, each with m item values, a set of membership functions, a predefined minimum support threshold α, a predefined confidence threshold λ, and the large 1-itemset $L_1$.

OUTPUT: A set of fuzzy association rules.

STEP 1: If $L_1$ is not null, then do the next step; otherwise, exit the algorithm.

STEP 2: Set r = 1, where r is used to represent the number of items kept in the current large itemsets.

STEP 3: Join the large itemsets $L_r$ to generate the candidate set $C_{r+1}$ in a way similar to that in the Apriori algorithm, except that two regions (linguistic terms) belonging to the same attribute cannot simultaneously exist in an itemset in $C_{r+1}$. Restated, the algorithm first joins $L_r$ and $L_r$ under the condition that r−1 items in the two itemsets are the same and the other one is different. It then keeps in $C_{r+1}$ the itemsets which have all their sub-itemsets of r items existing in $L_r$ and do not have any two items $R_{jp}$ and $R_{jq}$ ($p \neq q$) of the same attribute $R_j$.

STEP 4: Do the following substeps for each newly formed (r+1)-itemset s with items $(s_1, s_2, \ldots, s_{r+1})$ in $C_{r+1}$:

STEP 4.1: Calculate the fuzzy value of each transaction datum $D^{(i)}$ in s as $f_s^{(i)} = f_{s_1}^{(i)} \wedge f_{s_2}^{(i)} \wedge \cdots \wedge f_{s_{r+1}}^{(i)}$, where $f_{s_j}^{(i)}$ is the membership value of $D^{(i)}$ in region $s_j$. If the minimum operator is used for the intersection, then:

$$f_s^{(i)} = \min_{j=1}^{r+1} f_{s_j}^{(i)}.$$

STEP 4.2: Calculate the scalar cardinality $\mathit{count}_s$ of s in the transactions as:

$$\mathit{count}_s = \sum_{i=1}^{n} f_s^{(i)}.$$

STEP 4.3: If $\mathit{count}_s$ is larger than or equal to the predefined minimum support value α, put s in $L_{r+1}$.

STEP 5: If $L_{r+1}$ is null, then do the next step; otherwise, set r = r + 1 and repeat Steps 3–5.

STEP 6: Construct association rules for each large q-itemset s with items $(s_1, s_2, \ldots, s_q)$, $q \ge 2$, using the following substeps:

STEP 6.1: Form each possible association rule as follows:

$$s_1 \wedge \cdots \wedge s_{k-1} \wedge s_{k+1} \wedge \cdots \wedge s_q \rightarrow s_k,$$

where k = 1 to q.

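STEP 6.2: Calculate the confidence values of all association rules using:

$$\frac{\sum_{i=1}^{n} f_s^{(i)}}{\sum_{i=1}^{n} \left( f_{s_1}^{(i)} \wedge \cdots \wedge f_{s_{k-1}}^{(i)} \wedge f_{s_{k+1}}^{(i)} \wedge \cdots \wedge f_{s_q}^{(i)} \right)}.$$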

STEP 7: Output the association rules with confidence values larger than or equal to the predefined confidence threshold λ.
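The level-wise procedure in Steps 1 to 7 can be sketched compactly as follows. The sketch assumes that the membership values have already been computed and stored as one dictionary per transaction, keyed by (item, linguistic term) pairs; this data layout, the function names `count`, `join` and `mine`, and the sample data are all hypothetical. It uses the minimum operator for the intersection and, following Step 4.3 literally, compares the scalar cardinality itself with a `min_count` parameter; passing α·n instead would match the ratio test used when building the large 1-itemsets.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Region = Tuple[str, str]           # (item, linguistic term)
Itemset = FrozenSet[Region]
Transaction = Dict[Region, float]  # membership value of each region
Rule = Tuple[Itemset, Region, float]


def count(itemset: Itemset, data: List[Transaction]) -> float:
    """Steps 4.1-4.2: minimum over the regions, summed over transactions."""
    return sum(min(t.get(r, 0.0) for r in itemset) for t in data)


def join(large: Set[Itemset]) -> Set[Itemset]:
    """Step 3: Apriori-style join that rejects two regions of one item."""
    size = len(next(iter(large))) + 1
    candidates = set()
    for a, b in combinations(large, 2):
        c = a | b
        if len(c) != size:
            continue  # a and b must share r-1 regions
        if len({item for item, _ in c}) != size:
            continue  # two regions belong to the same attribute
        if all(frozenset(sub) in large for sub in combinations(c, size - 1)):
            candidates.add(c)
    return candidates


def mine(data: List[Transaction], large_1: Set[Itemset],
         min_count: float, lam: float) -> List[Rule]:
    """Steps 1-7: returns (antecedent, consequent, confidence) triples."""
    all_large: List[Itemset] = []
    current = large_1
    while current:                   # Steps 1 and 5: stop on an empty level
        candidates = join(current)   # Step 3
        current = {c for c in candidates
                   if count(c, data) >= min_count}  # Step 4
        all_large.extend(current)
    rules: List[Rule] = []
    for s in all_large:              # Step 6: one rule per consequent
        for consequent in s:
            antecedent = s - {consequent}
            denominator = count(antecedent, data)
            if denominator == 0:
                continue
            confidence = count(s, data) / denominator
            if confidence >= lam:    # Step 7
                rules.append((antecedent, consequent, confidence))
    return rules


# Hypothetical membership values for three transactions.
data = [
    {("milk", "low"): 1.0, ("bread", "high"): 0.8},
    {("milk", "low"): 0.6, ("bread", "high"): 0.6},
    {("milk", "low"): 0.0, ("bread", "high"): 1.0},
]
large_1 = {frozenset({("milk", "low")}), frozenset({("bread", "high")})}
rules = mine(data, large_1, min_count=1.0, lam=0.7)
```

With this toy data the 2-itemset has a scalar cardinality of 0.8 + 0.6 + 0 = 1.4, and the only rule that reaches λ = 0.7 is milk.low → bread.high, with confidence 1.4 / 1.6 = 0.875.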
