Conclusiones retomando los objetivos del Trabajo Fin de Grado

7. Conclusiones y limitaciones del Trabajo Fin de Grado

7.1. Conclusiones retomando los objetivos del Trabajo Fin de Grado

In this paper, we propose two new notions of histogram privacy, Sensitive Location Hiding and Target Avoidance/Resemblance, which lead to the following optimization problems: the Sensitive Location Hiding problem (SLH), which seeks to enforce the notion of Sensitive Location Hiding with optimal quality, and the Target Avoidance/Resemblance (TA/TR) prob- lem, which seeks to enforce Target Avoidance/Resemblance with bounded quality loss. We also propose optimal algorithms for each problem, as well as an efficient heuristic for the

TA/TR problem. Our experiments demonstrate that our methods are effective at preserving

the distribution of locations in a histogram, as well as the quality of recommendations based on these locations, while being fairly efficient.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which

permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visithttp://creativecommons.org/licenses/by/4.0/.

A Appendix

A.1 Proof of weak NP-hardness for theSLH problem

We first reduce the weakly NP-hard Multiple Choice Knapsack (MC K ) problem [24] to the special case of SLH, where|L| = 1. The MC K problem is defined as follows4:

min

i∈[1,m]

j∈Ci

ci j· xi j (A.1)

4_{The problem also appears with}_{≥ in constraint I [}₅₂_{]. This variation is referred to as MC K}

≥and can be transformed to MC K in polynomial time [24].

subject to: (I) _i_∈[1,m]_j_∈C

iwi j· xi j = b, (II)

j∈Cixi j= 1, i = 1, . . . m, and (III)

xi j∈ {0, 1}, i = 1, . . . , m, j∈ Ci.

In MC K , we are given a set of elements subdivided into m, mutually exclusive classes,

C1, . . . , Cm, and a knapsack. Each class Ci has|Ci| elements. Each element j ∈ Ci has a

cost ci j ≥ 0 and a weight wi j. The goal is to minimize the total cost (Eq.A.1) by filling

the knapsack with one element from each class (constraint II), such that the weights of the elements in the knapsack satisfy the constraint I, where b≥ 0 is a constant. The variable xi j

takes a value 1, if the element j is chosen from class Ci and 0 otherwise (constraint III).

We map a given instanceIMC Kto an instanceIS L Hof the special case of SLH in poly-

nomial time, as follows:

(I) Each class Ci, i∈ [1, m], is mapped to a location Li /∈ Lwhose count f(Li) in H is

arbitrary.

(II) A sensitive location Lm+1∈ L(without loss of generality) is considered. The count

of Lm+1in H is set to f(Lm+1) = b. Thus, H = ( f (L1), . . . , f (Lm), b).

(III) Each element xi jwith weightwi jand cost ci jis mapped to an operation on H , which

decreases f(Lm+1) by wi j and increases f(Li) by wi j (i.e., transferswi j visits from Lm+1to Li) and incurs q(H, H[i]) = ci j. If there are multiple operations such that q(H, H[i]) = ci j (e.g., when q is the L1distance), we select one arbitrarily. When

xi j = 1, its corresponding operation is applied to H. The result of applying all opera-

tions on H is referred to as the sanitized histogram H.

We prove the correspondence between a solutionStoIMC Kand a solution HtoIS L H,

as follows: we first prove that, ifS is a solution toIMC K, then His a solution toIS L H.

Since_i_∈[1,m]_j_∈C

iwi j· xi j = b, f (Lm+1) is decreased by b. Thus, H

_{[m + 1] = 0} (i.e., all visits to Lm+1are transferred to nonsensitive locations) and

i∈[1,m+1]H[i] =

i:Li/∈LH

_{[i] = |H|}

1 (i.e., Hhas the same size with H ). By construction, Hhas also the same length with H . Since_i_∈[1,m]_j_∈C_ici j· xi j is minimum,

i∈[1,m]q(H, H[i])

is minimum. Also, q(H, H[m + 1]) (i.e., the loss for transferring all visits from Lm+1to nonsensitive locations) is constant. Thus, dq(H, H) =

i∈[1,m+1]q(H, H[i]) is minimum.

Therefore, His a solution toIS L H.

We now prove that, if H is a solution toIS L H, thenS is a solution toIMC K. Since

i s.t. Li/∈LH

_{[i] = |H|}

1, it holds that H[m + 1] = 0. Thus, f (Lm+1) is decreased by b (all visits to Lm+1were transferred to nonsensitive locations) and_i_∈[1,m]_j_∈C

iwi j ·

xi j = b. Since dq(H, H) = (i∈[1,m]q(H, H[i])) + q(H, H[m + 1]) is minimum and q(H, H[m + 1]) is constant, _i_∈[1,m]q(H, H[i]) is minimum. This implies that

i∈[1,m]

j∈Cici j· xi jis minimum. Thus,Sis a solution toIMC K.

Therefore, the special case of the SLH problem with|L| = 1 is weakly NP-hard, and, clearly, the SLH problem with|L| ≥ 1, is also weakly NP-hard.

A.2 Proof of weak NP-hardness for theTR problem

We reduce the weakly NP-hard Multiple Choice Knapsack (MC K_≥) problem [24,52] to the

TR problem. The MC K_≥problem is defined as follows:

min

i∈[1,n]

j∈Ci

ci j· xi j (A.2)

subject to: (I)_i_∈[1,n]_j_∈C

iwi j · xi j ≥ b, (II)

j∈Cixi j = 1, i = 1, . . . n, and (III)

n, mutually exclusive classes, C1, . . . , Cn, and a knapsack. Each class Cihas|Ci| elements.

Each element j∈ Ci has a cost ci j ≥ 0 and a weight wi j. The goal is to minimize the total

cost (Eq.A.2) by filling the knapsack with one element from each class (constraint II), such that the weights of the elements in the knapsack satisfy the constraint I, where b ≥ 0 is a constant. The variable xi j takes a value 1, if the element j is chosen from class Ci and 0

otherwise (constraint III).

We map a given instanceIMC K≥to an instanceITRof TR in polynomial time, as follows:

(I) Each class Ci, i ∈ [1, n], is mapped to a location Li, which has an arbitrary count in H and a possibly different, arbitrary count in H.

(II) The constant is set to n −_max b

i∈[1,n], j∈Ciwi j.

(III) We choose q() and p() to be normalized in [0, 1] such that each element xi j with

weightwi jand cost ci jis mapped to a value ki jsuch that the following conditions hold: q(H, H[i] + ki j) = 1 −_max_i_{∈[1,n], j∈Ci}wi j _w_{i j} and p(H[i] + ki j, H) = _max_i_{∈[1,n], j∈Ci}ci j _c_{i j}.

The normalization of q() and p() can be done in polynomial time because q() and

p() can take O(N · n) values. If there are multiple values of ki j satisfying the two

conditions, one of these values is selected arbitrarily. When xi j = 1, ki jis added into H[i], obtaining H[i] = H[i] + ki j.

We prove the correspondence between a solutionStoIMC K≥ and a solution HtoITR:

We first prove that, ifS is a solution toIMC K≥, then H is a solution toITR. Since

i∈[1,n]j∈Cici j· xi jis minimum, i∈[1,n] (maxi∈[1,n], j∈Cici j) · p(H[i], H) is mini- mum. Thus, dp(H, H) =i∈[1,n]p(H[i], H) is minimum. Sincei∈[1,n]j∈Ciwi j·

xi j ≥ b and b = maxi∈[1,n], j∈Ciwi j · (n − ), it holds that

i∈[1,n](1 − q(H, H[i] + ki j)) ≥ n − . This implies n − dq(H, H) ≥ n − and dq(H, H) ≤ .

Therefore, H is a solution to ITR. We now prove that, if H is a solution to ITR,

then S is a solution to IMC K≥. Since dp(H, H) =

i∈[1,n]p(H[i], H) is min-

imum, _i_∈[1,n](maxi∈[1,n], j∈Cici j) · p(H[i], H[i])

is minimum. This implies that i∈[1,n] j∈Cici j· xi jis minimum. Since dq(H, H _{) =} i∈[1,n]q(H, H[i]) ≤ and = n−_max b

i∈[1,n], j∈Ciwi j, it holds that n−

i∈[1,n]q(H, H[i]) ≥ n − = maxi∈[1,n], j∈Cib wi j. This

implies_i_∈[1,n](1−q(H, H[i]+ki j)) ≥_max b

i∈[1,n], j∈Ciwi j and

i∈[1,n]j∈Ciwi j·xi j ≥ b.

Thus,Sis a solution toIMC K≥. Therefore, TR is weakly NP-hard. A.3 Reduction from theTA to the TR problem

The TA problem can be reduced to TR in polynomial time: given an instanceIT A of TA,

we can construct an instance ITR of TR in polynomial time, by mapping H and H to

histograms HTR= H and H_TR = H, defining dp(H_TR , H_TR) = _d_p_(H1+1_,H₎₊₁ ∈ (0, 2] and

dq(HTR, HTR ) = dq(H, H), and setting TR= . Given a feasible solution HTR ofITR, we

can map it back to a feasible solution HofIT Awith cost dp(H, H) = _d 1+1

p(HTR ,HTR)−1 ≥ 0

in polynomial time. This requires constructing H= H_TR , which clearly is a solution toIT A.

In document Estudio de las ideas previas del alumnado de infantil sobre ocho temáticas relacionadas con la ciencia escolar (página 55-57)