
Two common metrics are used to validate association rules resulting from each dataset; these are support and confidence, measuring the accuracy of the association rules. Support and confidence are calculated using the following two formulas:

Equation 6.1. Support calculation

$$\mathrm{Support}(X \rightarrow Y) = \frac{\text{number of records containing } X \text{ and } Y}{\text{total number of records}}$$

Equation 6.2. Confidence calculation

$$\mathrm{Confidence}(X \rightarrow Y) = \frac{\text{number of records containing } X \text{ and } Y}{\text{total number of records containing } X}$$
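As an illustration of the two formulas, the following minimal Python sketch counts records directly. It assumes each record is represented as a set of "ATTRIBUTE=value" strings; the function names and the toy records are hypothetical and are not taken from the study's data or tooling.

```python
from typing import FrozenSet, Sequence

Transaction = FrozenSet[str]


def support(records: Sequence[Transaction],
            x: FrozenSet[str], y: FrozenSet[str]) -> float:
    """Share of all records that contain both X and Y (Equation 6.1)."""
    both = x | y
    n_xy = sum(1 for record in records if both <= record)
    return n_xy / len(records)


def confidence(records: Sequence[Transaction],
               x: FrozenSet[str], y: FrozenSet[str]) -> float:
    """Share of the records containing X that also contain Y (Equation 6.2)."""
    both = x | y
    n_x = sum(1 for record in records if x <= record)
    n_xy = sum(1 for record in records if both <= record)
    return n_xy / n_x if n_x else 0.0


# Hypothetical toy records, for illustration only.
toy = [
    frozenset({"EDUCATION=HighSchool", "EMPLOYMENT=Unemployed"}),
    frozenset({"EDUCATION=HighSchool", "EMPLOYMENT=Unemployed"}),
    frozenset({"EDUCATION=HighSchool", "EMPLOYMENT=Employed"}),
    frozenset({"EDUCATION=Bachelor", "EMPLOYMENT=Employed"}),
]
x = frozenset({"EDUCATION=HighSchool"})
y = frozenset({"EMPLOYMENT=Unemployed"})

print(support(toy, x, y))     # 2 of the 4 records contain both X and Y
print(confidence(toy, x, y))  # 2 of the 3 records containing X also contain Y
```

In the toy data the rule has a support of 2/4 and a confidence of 2/3, so it would pass a minimum support of 0.2 but fail a minimum confidence of 0.9.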

In addition, the association rules are evaluated using k-fold cross-validation, which builds on the idea of training and testing. In k-fold cross-validation, the dataset is partitioned into K equal folds, where each fold is used once for testing and the remaining data is used for training. The process is repeated K times, each time setting the next fold aside for testing and using the remaining data for training, until each fold has been used for testing exactly once (Witten et al., 2011). This procedure helps to achieve reliable validation of the generated association rules. K-fold cross-validation is commonly used in DM to calculate the average error rate over the runs of the training and testing process. However, each mined association rule is independent of the other rules and, therefore, the training and testing procedures are performed differently. That is, association rules that do not appear in both the training and testing datasets with the same support value are filtered out. Most DM projects use 10-fold cross-validation as a standard way to estimate performance errors. Nevertheless, this method is more suitable for large datasets (Ordonez, 2006). In our experiment, two-fold cross-validation is implemented to evaluate the association rules extracted from our two datasets (Tomovic and Stansic, 2011). The two-fold cross-validation used in this study is described below:

1. Each dataset, DFemale and DMale, is equally partitioned into two subsets: DFemale1 and DFemale2 for DFemale, and DMale1 and DMale2 for DMale, where DFemale = DFemale1 ∪ DFemale2 and DFemale1 ∩ DFemale2 = ∅, and DMale = DMale1 ∪ DMale2 and DMale1 ∩ DMale2 = ∅. The number of transactions in each subset is given in Table 6.6.

Table 6.6. Number of transactions in training and testing subsets

| | DFemale1 | DFemale2 | DMale1 | DMale2 |
|---|---|---|---|---|
| Number of transactions | 51 | 51 | 87 | 88 |

2. DFemale1 and DMale1 are used for training. Their respective association rules are extracted with respect to the minimum support and confidence values, 0.2 and 0.9 respectively. Extracted rules are denoted with RFemale1 and RMale1.

3. DFemale2 and DMale2 are used for testing. Association rules in RFemale1 and RMale1 are validated with DFemale2 and DMale2 respectively. That is, any association rule in RFemale1 and RMale1 that does not satisfy the minimum metrics in RFemale and RMale is filtered out.


4. DFemale2 and DMale2 are used for training. Their respective association rules are extracted with respect to the minimum support and confidence values. Extracted rules are denoted with RFemale2 and RMale2.

5. DFemale1 and DMale1 are used for testing. Association rules in RFemale2 and RMale2 are validated with DFemale1 and DMale1 respectively. That is, any association rule in RFemale2 and RMale2 that does not satisfy the minimum metrics in RFemale and RMale is filtered out.

6. Valid rules in RFemale1 and RFemale2 and in RMale1 and RMale2 are combined. In general, RFemale1 ∪ RFemale2 = RFemale and RMale1 ∪ RMale2 = RMale.
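The six steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure's actual implementation: the brute-force `mine_rules` function is a hypothetical stand-in for a real rule miner, the random split in `split_halves` is assumed because the text does not say how the halves were formed, and a rule is treated as valid when it also meets the minimum support and confidence on the testing half, which is only one reading of steps 3 and 5.

```python
import random
from itertools import combinations
from typing import FrozenSet, List, Sequence, Set, Tuple

Transaction = FrozenSet[str]
Rule = Tuple[FrozenSet[str], str]  # (antecedent items, single consequent item)


def split_halves(data: Sequence, seed: int = 0) -> Tuple[List, List]:
    """Step 1: split a dataset into two disjoint halves (random split assumed)."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def metrics(data: Sequence[Transaction], rule: Rule) -> Tuple[float, float]:
    """Return (support, confidence) of a rule on the given transactions."""
    antecedent, consequent = rule
    n_x = sum(1 for t in data if antecedent <= t)
    n_xy = sum(1 for t in data if antecedent <= t and consequent in t)
    support = n_xy / len(data) if data else 0.0
    confidence = n_xy / n_x if n_x else 0.0
    return support, confidence


def mine_rules(data: Sequence[Transaction], min_sup: float, min_conf: float,
               max_len: int = 2) -> Set[Rule]:
    """Hypothetical brute-force miner, practical only for tiny examples."""
    items = sorted(set().union(*data))
    rules: Set[Rule] = set()
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            for consequent in items:
                if consequent in antecedent:
                    continue
                rule = (frozenset(antecedent), consequent)
                sup, conf = metrics(data, rule)
                if sup >= min_sup and conf >= min_conf:
                    rules.add(rule)
    return rules


def two_fold_validate(half1: Sequence[Transaction], half2: Sequence[Transaction],
                      min_sup: float = 0.2, min_conf: float = 0.9) -> Set[Rule]:
    """Steps 2 to 6: train on one half, test on the other, swap, then combine."""
    valid: Set[Rule] = set()
    for train, test in ((half1, half2), (half2, half1)):
        for rule in mine_rules(train, min_sup, min_conf):
            sup, conf = metrics(test, rule)
            if sup >= min_sup and conf >= min_conf:  # assumed validity check
                valid.add(rule)
    return valid


# A split of 175 transactions gives halves of 87 and 88, as for DMale in Table 6.6.
first, second = split_halves(range(175))
print(len(first), len(second))

# Hypothetical toy halves with abstract items A, B and C.
half1 = [frozenset("AB"), frozenset("AB"), frozenset("C")]
half2 = [frozenset("AB"), frozenset("AB"), frozenset("AC")]
valid = two_fold_validate(half1, half2)
print(valid)
```

In the toy halves, the rule B → A holds in both halves and is kept, whereas A → B is mined from the first half but its confidence on the second half drops to 2/3, so it is filtered out.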

The two-fold cross-validation is used to evaluate the rules extracted in RFemale and RMale. A number of the association rules obtained in R1 and R2 do not satisfy the minimum metrics used in R. The association rules which do not satisfy the specified metrics are removed from our final examination. Table 6.7 and Table 6.8 show the validation results, with the filtered-out rules in bold.

With respect to RFemale, there are three rules (1, 4, 6) whose average confidence level from RFemale1 and RFemale2 does not satisfy the minimum confidence threshold in RFemale. These rules are eliminated as follows:

• The highest confidence rule in RFemale (1), relating to the complication of remembering when to take medication, does not satisfy the confidence level in RFemale (0.95) and, therefore, is filtered out from the list of complications encountered by female diabetic citizens in Saudi Arabia.

• The second eliminated rule associates the difficulty of identifying the appropriate diet with females whose education level is high school or below and who are unemployed (4), as its average confidence does not satisfy the confidence level in RFemale (0.92). However, this complication is still valid in two other rules: it is associated with the education level of high school or below, type 1 diabetes mellitus and unemployed female citizens (3), and with females who are single and have type 1 diabetes mellitus (11), as their confidence met the confidence level in RFemale (0.93 and 0.92 respectively).


• The third eliminated rule relates to the difficulty of identifying the right medication amount in the holy month of Ramadan (6). In this rule, the complication is associated with the high school or below education level, type 1 diabetes mellitus and unemployed employment status. The rule is eliminated because the two-fold validation procedure did not extract it in either RFemale1 or RFemale2. Similar to the previous complication, the difficulty of identifying the right medication amount in the holy month of Ramadan is validly

Table 6.7. Two-fold cross-validation results of RFemale

| Rule No | Best rules (RFemale) | Minimum support (RFemale1) | Minimum support (RFemale2) | Minimum support (Average) | Minimum confidence (RFemale1) | Minimum confidence (RFemale2) | Minimum confidence (Average) |
|---|---|---|---|---|---|---|---|
| **1** | EDUCATION LEVEL=High school or below Do you have difficulties remembering when to take your medication?=Yes 22 ==> EMPLOYMENT STATUS=Unemployed 21 | - | 0.2 | 0.2 | - | 0.93 | 0.93 |
| 2 | MARITAL STATUS=Single In Ramadan, I have difficulties in identifying the right medication amount.=Yes 32 ==> TYPE OF DIABETES=Type1 | - | 0.2 | 0.2 | - | 1 | 1 |
| 3 | EDUCATION LEVEL=High school or below TYPE OF DIABETES=Type1 Do you have difficulties identifying the diet appropriate to your health condition?=Yes 25 ==> EMPLOYMENT STATUS=Unemployed 23 | - | 0.2 | 0.2 | - | 0.93 | 0.93 |
| **4** | EDUCATION LEVEL=High school or below Do you have difficulties identifying the diet appropriate to your health condition?=Yes 33 ==> EMPLOYMENT STATUS=Unemployed 3 | 0.2 | 0.2 | 0.2 | 0.92 | 0.90 | 0.91 |
| 5 | EDUCATION LEVEL=High school or below TYPE OF DIABETES=Type1 Do you have difficulties to coexist with the aspects of diabetes lifestyle?=Yes 26 ==> EMPLOYMENT STATUS=Unemployed 24 | - | 0.2 | 0.2 | - | 0.94 | 0.94 |
| **6** | EDUCATION LEVEL=High school or below TYPE OF DIABETES=Type1 In Ramadan, I have difficulties in identifying the right medication amount.=Yes 25 ==> EMPLOYMENT STATUS=Unemployed 23 | - | - | - | - | - | - |
| 7 | MARITAL STATUS=Single EMPLOYMENT STATUS=Unemployed In Ramadan, I have difficulties in identifying the right medication amount.=Yes 24 ==> TYPE OF DIABETES=Type1 22 | - | 0.2 | 0.2 | - | 1 | 1 |
| 8 | EDUCATION LEVEL=High school or below TYPE OF DIABETES=Type1 In Ramadan, I have difficulties in identifying the right medication time.=Yes 22 ==> EMPLOYMENT STATUS=Unemployed 20 | - | 0.2 | 0.2 | - | 0.92 | 0.92 |
| 9 | EDUCATION LEVEL=High school or below In Ramadan, I have difficulties in identifying the right medication amount.=Yes 32 ==> EMPLOYMENT STATUS=Unemployed 2 | 0.2 | - | 0.2 | 0.92 | - | 0.92 |
| 10 | EDUCATION LEVEL=High school or below TYPE OF DIABETES=Type1 In Ramadan, I have difficulties in identifying the right diet for my health condition.=Yes 22 ==> EMPLOYMENT STATUS=Unemployed 20 | - | 0.2 | 0.2 | - | 0.92 | 0.92 |
| 11 | MARITAL STATUS=Single Do you have difficulties identifying the diet appropriate to your health condition?=Yes 31 ==> TYPE OF DIABETES=Type1 28 | - | 0.2 | 0.2 | - | 0.92 | 0.92 |

The validation of RMale indicated that only two rules are not valid and are therefore filtered out from the final results of this study. The first rule to be filtered out associates the difficulties in identifying the right medication dosage with males who have a bachelor degree, are employed and married (10), as its average confidence level from RMale1 and RMale2 does not satisfy the confidence level in RMale (0.94). However, the difficulty in identifying the right medication dosage is extracted in other rules showing strong validity (1, 4, 9). The second rule that is filtered out associates difficulties in identifying the diet appropriate to diabetics’ health condition with males who are employed and married (16), as its average confidence level from RMale1 and RMale2 does not satisfy the confidence level in RMale (0.92). Similar to the previous complication, the difficulty in identifying the diet appropriate to diabetics’ health condition is extracted in another rule that shows strong validity but with a different combination of profile information (15).
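The elimination decisions reported for both rule sets can be checked with a short calculation. The sketch below is illustrative only: it assumes that the average is taken over the folds in which a rule was actually extracted, and that a rule stays valid when this average meets or exceeds its confidence level in the full rule set. The function names are hypothetical; the numbers are the fold confidences from Tables 6.7 and 6.8 and the confidence levels quoted in the text.

```python
from typing import Dict, Optional, Sequence, Tuple


def average_found(values: Sequence[Optional[float]]) -> Optional[float]:
    """Average the fold values that exist; None means the rule was not extracted."""
    found = [v for v in values if v is not None]
    return sum(found) / len(found) if found else None


def passes_validation(fold_confidences: Sequence[Optional[float]],
                      reference_confidence: float) -> bool:
    """Assumed criterion: the fold average must reach the full-set confidence."""
    avg = average_found(fold_confidences)
    return avg is not None and avg >= reference_confidence


# (confidence in fold 1, confidence in fold 2, confidence level in the full set)
rules: Dict[str, Tuple[Optional[float], Optional[float], float]] = {
    "RFemale rule 1": (None, 0.93, 0.95),
    "RFemale rule 3": (None, 0.93, 0.93),
    "RFemale rule 4": (0.92, 0.90, 0.92),
    "RFemale rule 11": (None, 0.92, 0.92),
    "RMale rule 10": (0.95, 0.92, 0.94),
    "RMale rule 16": (0.93, 0.90, 0.92),
}

for name, (fold1, fold2, reference) in rules.items():
    verdict = "kept" if passes_validation([fold1, fold2], reference) else "filtered out"
    print(f"{name}: {verdict}")

# A rule extracted in neither fold (as with RFemale rule 6) fails for any
# reference value, because there is nothing to average.
print(passes_validation([None, None], 0.9))
```

Under these assumptions, RFemale rules 1 and 4 and RMale rules 10 and 16 are filtered out, while RFemale rules 3 and 11 are kept, which agrees with the outcomes described in the text.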

Table 6.8. Two-fold cross-validation results of RMale

| Rule No | Best rules (RMale) | Minimum support (RMale1) | Minimum support (RMale2) | Minimum support (Average) | Minimum confidence (RMale1) | Minimum confidence (RMale2) | Minimum confidence (Average) |
|---|---|---|---|---|---|---|---|
| 1 | EMPLOYMENT STATUS=Employed TYPE OF DIABETES=Type2 Do you have difficulties identifying the right dosage for your medication type?=Yes 45 ==> MARITAL STATUS=Married 44 | 0.2 | 0.2 | 0.2 | 1 | 0.96 | 0.98 |
| 2 | EMPLOYMENT STATUS=Employed TYPE OF DIABETES=Type2 Do you have difficulties undertaking any physical activity?=Yes 37 ==> MARITAL STATUS=Married 36 | 0.2 | 0.2 | 0.2 | 0.94 | 1 | 0.97 |
| 3 | TYPE OF DIABETES=Type2 Do you have difficulties remembering when to take your medication?=Yes 36 ==> MARITAL STATUS=Married 35 | 0.2 | 0.2 | 0.2 | 1 | 0.95 | 0.97 |
| 4 | EMPLOYMENT STATUS=Employed Do you have difficulties identifying the right dosage for your medication type?=Yes 69 ==> MARITAL STATUS=Married 66 | 0.2 | 0.2 | 0.2 | 0.97 | 0.95 | 0.96 |
| 5 | EDUCATION LEVEL=Bachelor degree EMPLOYMENT STATUS=Employed Do you have difficulties undertaking any physical activity?=Yes 39 ==> MARITAL STATUS=Married 37 | - | 0.2 | 0.2 | - | 1 | 1 |
| 6 | EMPLOYMENT STATUS=Employed Do you have difficulties identifying the right medication?=Yes 65 ==> MARITAL STATUS=Married 62 | 0.2 | 0.2 | 0.2 | 0.97 | 0.94 | 0.95 |
| 7 | EMPLOYMENT STATUS=Employed TYPE OF DIABETES=Type2 Do you have difficulties identifying the right medication?=Yes 41 ==> MARITAL STATUS=Married 39 | 0.2 | 0.2 | 0.2 | 1 | 0.91 | 0.95 |
| 8 | EMPLOYMENT STATUS=Employed Do you have difficulties remembering when to take your medication?=Yes 41 ==> MARITAL STATUS=Married 39 | 0.2 | - | 0.2 | 0.96 | - | 0.96 |
| 9 | TYPE OF DIABETES=Type2 Do you have difficulties identifying the right dosage for your medication type?=Yes 53 ==> MARITAL STATUS=Married 50 | 0.2 | 0.2 | 0.2 | 1 | 0.91 | 0.95 |
| **10** | EDUCATION LEVEL=Bachelor degree EMPLOYMENT STATUS=Employed Do you have difficulties identifying the right dosage for your medication type?=Yes 47 ==> MARITAL STATUS=Married 44 | 0.2 | 0.2 | 0.2 | 0.95 | 0.92 | 0.93 |
| 11 | EDUCATION LEVEL=Bachelor degree EMPLOYMENT STATUS=Employed In Ramadan, I have difficulties in identifying the right medication amount.=Yes 49 ==> MARITAL STATUS=Married 46 | 0.2 | 0.2 | 0.2 | 0.95 | 0.93 | 0.94 |
| 12 | EDUCATION LEVEL=Bachelor degree EMPLOYMENT STATUS=Employed Do you have difficulties identifying the right medication?=Yes 43 ==> MARITAL STATUS=Married 40 | 0.2 | 0.2 | 0.2 | 0.95 | 0.91 | 0.93 |
| 13 | EMPLOYMENT STATUS=Employed TYPE OF DIABETES=Type2 In Ramadan, I have difficulties in identifying the right medication amount.=Yes 44 ==> MARITAL STATUS=Married 41 | 0.2 | 0.2 | 0.2 | 0.95 | 0.91 | 0.93 |
| 14 | EMPLOYMENT STATUS=Employed In Ramadan, I have difficulties in identifying the right medication amount.=Yes 72 ==> MARITAL STATUS=Married 67 | | | | | | |
| 15 | EMPLOYMENT STATUS=Employed TYPE OF DIABETES=Type2 Do you have difficulties identifying the diet appropriate to your health condition?=Yes 49 ==> MARITAL STATUS=Married 45 | 0.2 | - | - | 0.96 | - | 0.96 |
| **16** | EMPLOYMENT STATUS=Employed Do you have difficulties identifying the diet appropriate to your health condition?=Yes 84 ==> MARITAL STATUS=Married 77 | 0.2 | 0.2 | 0.2 | 0.93 | 0.90 | 0.91 |
| 17 | TYPE OF DIABETES=Type2 Do you have difficulties undertaking any physical activity?=Yes 48 ==> MARITAL STATUS=Married 44 | 0.2 | - | 0.2 | 0.95 | - | 0.95 |
| 18 | TYPE OF DIABETES=Type2 Do you have difficulties identifying the right medication?=Yes 49 ==> MARITAL STATUS=Married 45 | 0.2 | - | 0.2 | 1 | - | 1 |
| 19 | TYPE OF DIABETES=Type2 In Ramadan, I have difficulties in identifying the right medication amount.=Yes 54 ==> MARITAL STATUS=Married 49 | 0.2 | - | 0.2 | 0.96 | - | 0.96 |
| 20 | EDUCATION LEVEL=Bachelor degree Do you have difficulties identifying the right medication?=Yes 50 ==> MARITAL STATUS=Married 45 | 0.2 | - | 0.2 | 0.92 | - | 0.92 |

6.2.6. Discussion

The results of our DM pilot study reveal interesting associations among different diabetes non-health-related complications and different profile information of female and male diabetic citizens in Saudi Arabia.

The results show common complications that are encountered by both female and male diabetic citizens in Saudi Arabia. These complications are the difficulty of identifying the right medication amount in Ramadan and the difficulty of identifying the diet appropriate to their health condition. In relation to the profile information, complications encountered by female diabetics are associated with single marital status, type 1 diabetes mellitus, being unemployed and/or a high school or below education level. On the other hand, the profile characteristics associated with complications encountered by male diabetics are being employed, type 2 diabetes mellitus, being married and/or a bachelor degree education level. The results for both females and males do not indicate any association between complications and either the age group or the region of birth of diabetic citizens.

The association rules from our DM study were intended to be compared with previous literature for further validation. However, only a limited number of publications discussing non-health-related complications of diabetes mellitus in Saudi Arabia were found. Only one study, conducted by Memon et al. (2017), revealed that individuals who have type 1 diabetes in Saudi Arabia follow their physician’s exercise recommendations more than those with type 2 diabetes. Our study is consistent with this finding, as the complication of undertaking any physical activity is associated only with type 2 diabetes mellitus. This is also supported by the work of Sharaf et al. (2013), who stated that failure to follow a strict diet, exercise and medication are the leading causes of type 2 diabetes mellitus.

The newly discovered associations are to be utilised by providing useful diabetes self-management and education recommendations and guidelines to overcome the extracted complications with respect to the associated profile information.
