Jerarquía de valores que manifiestan actualmente los

5. ANÁLISIS Y DISCUSIÓN DE LOS RESULTADOS

5.6. Jerarquía de valores que manifiestan actualmente los

The general setup of suggesting privacy polices for individual profile attributes resembles traditional recommender systems described in Chapter2. As shown in Figure4.2, there are three classes of entities, namelySMP users, which corre-spond to the RS’s users;profile attributes, which correspond to items; andprivacy policies, which correspond to user-item ratings. Users are characterised by a set of demographic traits like age, education, etc.

Guided by this general setup, and in order to suggest privacy policies for pro-file attributes in the PAP, this study follows ademographic-based approachto rec-ommender systems, where the motivating assumption is that demographically similar people have similar privacy policies for their profile attributes. The advantage of choosing a demographic-based approach over other approaches

Privacy Policies Privacy Policies Users' Features

(Demographics)

Attribute Features

Users

Users ^Profile

Attributes Profile Attributes

Figure 4.2:General setup for suggesting privacy policies for profile attributes.

(e.g. collaborative filtering), is that demographic-based approaches do not re-quire any prior input from the target user, as they rely only on the target user’s demographical information, which is already available on the target users’ pro-file.

The process by which the PAP suggests privacy policies for an individual pro-file attribute saya_i ∈A, consists of the following three phases.

4.3.1.1 Phase I: Data Collection

The first phase is the data-collection phase. In this phase, training data is col-lected from the profiles of existing experienced users. Since the recommender is designed to be a server-side solution, it is expected to have direct access to users’ profile data. Therefore, from each existing user’s profile, demographic information is extracted, as well as the privacy policy that the user (i.e. pro-file owner) has set for thea_i attribute. This collected information is stored in a datasetD, in which every record is of the form(F , l~ _j), whereF~ =hf₁, f₂, . . . , f_mi is the user’s demographic information, representing the features, and l_j is at-tributea_i’s privacy policy, and it represents the class label.

4.3.1.2 Phase II: Classifier Training

The second phase is the classifier-training phase. In this phase, a machine learn-ing classifier is trained (i.e. built) to predict the privacy policies that the target users set for the a_i attribute. For this task a decision tree learning algorithm is selected because decision tree learning algorithms are capable of handling classification of categorical, noisy, and incomplete data [49], which are the char-acteristics of the training dataD.

In order to train this classifier, a decision tree algorithm is applied to the dataset D, which carries out the task as follows. First, the algorithm looks at the train-ing datasetDand finds the featuref_∗ that tells us the most about users’ choice of privacy policies for the a_i attribute. It does this using a statistical measure called information gain (gain for short), which is given by equation (4.1) below.

Gain(D, f∗) = H(D)−H(D|f∗)

whereH(x)is the entropy ofX

and is given byH(x) = −p(x) log(p(x))

(4.1)

The algorithm then uses the feature with the highest gain (i.e. f∗) to create the root node. Next, the training dataset D is partitioned into several partitions, such that each partition contains records that have the same value forf∗. Next, for each partitionp, the algorithm looks for the featuref∗p that has the highest gain withinpand uses it (i.e. f∗p) to create a child node. The partitionpitself is then re-partitioned and the process is repeated, until every partition has almost the same privacy policy, or the algorithm runs out of training records.

The final classifier is represented as a decision tree, wherein each tree node represents a test for a specific featuref_i ∈F~ (i.e. demographical trait), and each

edge branching from the node corresponds to one of possible values off_i. The leaf nodes correspond to class labels (i.e. privacy policies).

Figure 4.3is an example of such a tree. This decision tree predicts the privacy policies of theaiattribute, by testing the target user’s features at the root node, and moving down the tree to a child node that corresponds to the value of the tested feature. The target user’s features are then retested at the child node and moved down the tree progressively. The process is repeated until a leaf is reached. The privacy policy associated with this leaf is the predicted privacy policy for thea_i attribute.

:Age

f

₁

:Education

f

₂

f

_:Gender₃

:Married

f

₄

F M

>30

<=30

Y N Tertiary

l

₂

l

₃

l

₁

l

₁

A Privacy Policy (Leaf) A Feature

(Node)

Figure 4.3:An example of a decision tree.

4.3.1.3 Phase III: Privacy Policy Suggestion

The third and final phase is the suggestion phase. This phase is triggered by the arrival of a target user. When a target user registers on the SMP, the fea-tures F_target~ = hf₁, f₂, . . . , f_mi (i.e. demographical information) are extracted form his/her newly created profile. Next, the target user’s features F_target~ = hf₁, f₂, . . . , f_miare passed to the decision tree classifier, which in turn predicts

a privacy policy for the a_i attribute. This privacy policy is then simply sug-gested to the target user. In Figure4.4below, the process of suggesting privacy policies for individual profile attributes is depicted, as described above.

(B) Privacy Policy Suggestion Phase

(A) Data Collection & Model Construction Phases

Users Features

+

Demographics

Profiles of Existing Users

Classifier Classifier

The Target User's Profile

Demographics Privacy

Policy Training Dataset (D)

Decision Tree Algorithm Privacy Policies

Figure 4.4:Suggesting privacy policies for individual profile attributes.

In document Valores y estilo de vida de los adolescentes de 13 y 14 años de edad, estudio realizado en el colegio fiscal "Fernando Daquilema" de la ciudad de Riobamba, provincia de Chimborazo en el año lectivo 2012 - 2013 (página 107-175)