
Towards a Human-Centric Data Economy


Academic year: 2023


Unsurprisingly, unlocking the value of data has become a central policy of the European Union, which also estimated the size of the data economy at €827 billion for the EU27 in the same period. Then we present a first-of-its-kind measurement study that sheds light on the pricing of data in the market using a new methodology. The data economy will develop rapidly in the coming years, and researchers from different disciplines will work together to unlock the value of data and make the most of it.

Nevertheless, we hope that our work will illuminate the value of data and contribute to greater transparency in data pricing and, eventually, a shift to a human-centric data economy. Estimating the value of data and pricing a dataset become more difficult tasks due to the elusive nature of the "commodity" being traded. In the era of protecting the privacy of end users, tracking the use of personal data and calculating the fair value of data in the market, there are still significant economic and technical challenges.

In the business-to-business (B2B) market, entities with different business models are responding to the demand for data and looking for ways to operationalize data trading. According to the EU [62], this is the size of data markets, meaning 'the marketplace where digital data is exchanged as "products" or "services"'.

Objectives

There is also a significant multiplier, ×5 to ×10 depending on the study, between the size of data markets and the impact of data on the economy. Unlocking the value of data is central to the European Strategy for Data [186], which aims to create a digital single market around data in the EU, unlocking the enormous potential and opportunities of data for European citizens. What features of data products drive their prices in the market? a) Data pricing and selection problems.

If data consumers were forced to pay people for their data, they would have to choose among a large number of data sources from thousands of individuals to feed data for ML tasks. First, a price for the data must be agreed upon by the parties involved in the exchange; therein lies the problem of data pricing. Second, data consumers will seek to minimize the number of individuals whose information they purchase by carefully selecting those that suit their needs, as opposed to the current practice of indiscriminate collection of large amounts of data.

Furthermore, this thesis addresses the pricing problem from a market perspective by defining a methodology and developing technical components to measure the prices of data products in commercial markets. In Part IV, we leverage this research and some of its technical components to present a high-level design of a data pricing tool to answer the question, “How do I price my data?”

Contributions

This may be simple if an individual price is set by each seller, but it is not trivial in the more realistic case that the market determines a price for data sets by combining data from different sources. Moreover, data exchange and trading processes must be efficient to avoid hindering the benefits of a human-centric data economy. How can data consumers efficiently select only the data appropriate for their task from a variety of acceptable sources?

Structure

Related Works

The value of data

  • Data as an economic good
  • Measuring the value of data

It is precisely the elusive nature of data as an economic good and asset that has inspired a number of industry metaphors in recent years. He also touches on the value of data for specific purposes, stating that it is not necessarily tied to price. Uniqueness also affects the value of data: the more you share it, the lower the value becomes.

Its value is affected by external factors, such as the privacy implications of data sharing. Furthermore, data characteristics and prior considerations about their value influence how they are marketed [97, 164] and shape the data economy and new data markets. Regardless of its nature, data is becoming a cornerstone in the digital economy and is now considered a key asset of data-driven companies.

Due to the wide spectrum of data and use cases that can be found in the market [118], many different methods and works attempt to estimate their value, often resulting in seemingly contradictory estimates. Other works have assessed the value of data for concrete tasks from a perspective on the border between microeconomics and computer science.

Data pricing

Some authors argue that privacy should be taken into account in the pricing of personal data, and they have defined pricing strategies and marketplaces based on differentiated loss of privacy for consumers [76] and also for pricing queries [117]. Other authors have developed privacy-preserving data marketplaces [137], which propose to compensate sellers regardless of whether their data is included in the final set of traded data. Quality-based pricing prices different versions of data by evaluating and assigning weights to certain quality features [86].

Some query pricing mechanisms support history-based pricing (depending on customer purchase history) [53, 107]. The pricing of personal data has received considerable attention from the privacy and measurement community. Versioning approaches, which produce and offer different versions of a data product with different utility and price, actively use different pricing mechanisms.

The freshness, history, features, extent, volume, format, resolution, or accuracy of data are used to provide different versions of a data product. Finally, some authors have studied the pricing of "bundles" of data products, which makes sense as long as bundling contributes to increasing the willingness to pay for information goods [164].

Data marketplaces in the research community

We analyzed the information at the level of entities that trade data on the Internet and at the level of the data products they offer in the market. Similarly, for a subset of the vendors K ⊆ S, we will denote by d(K) their aggregate data set, and by v(d(K)) the maximum accuracy that can be achieved by using all or a subset of the data in d(K). In the base scenario, we will also assume that the mean test value is that of the test set, i.e. τ = d(S).

Since the value of a subset of players K (v(K), K ⊆ S) depends only on the elements in K and not on the order of the elements, in this case v must be computed only for the 2^|S| possible combinations of the data sources. In the base case scenario, such a target is aligned with the average age of the population in S, i.e. τ = d(S). Finally, the Shapley values also depend on the data that was already available to the buyer at the beginning of the purchase process.
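As an illustration of why 2^|S| evaluations suffice, the following minimal Python sketch computes exact Shapley values by enumerating coalitions and caching the value of each one. The function name `shapley_values`, the toy sellers and the additive toy value function are assumptions made for this example only; they are not the value function or the data used in the thesis.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a set of players (e.g. data sellers).

    `value` maps a frozenset of players to a number, playing the role of v(K).
    Each coalition is evaluated once and cached, so at most 2^|S| calls
    to `value` are made.
    """
    n = len(players)
    cache = {}

    def v(coalition):
        if coalition not in cache:
            cache[coalition] = value(coalition)
        return cache[coalition]

    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for r in range(n):
            # Weight of a coalition of size r that does not contain p.
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for coalition in combinations(others, r):
                k = frozenset(coalition)
                # Marginal contribution of p when joining coalition k.
                phi[p] += weight * (v(k | {p}) - v(k))
    return phi


# Hypothetical toy example: the value of a coalition is the total number
# of records it holds (an additive value function, chosen for simplicity).
records = {"A": 10, "B": 20, "C": 30}
toy_value = lambda k: float(sum(records[p] for p in k))
phi = shapley_values(list(records), toy_value)
```

With an additive value function such as this one, each seller's Shapley value equals its own contribution, and the values add up to the value of the full set S. Enumeration is only practical for a small number of sellers, since the number of coalitions doubles with each additional seller.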

Value function: To compare the model prediction with the actual travel time given in the test set, we resort to the R² score. At first, you might think that the value of the data coming from a provider is determined by its quantity. For each of them, we calculated the Shapley value of 16 companies using the Shapley formula from Eq.
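For reference, the R² score (coefficient of determination) compares the squared prediction error with the variance of the observed values. The sketch below is a plain-Python version of the standard definition; the function name `r2_score` and the travel-time figures are hypothetical and are not taken from the thesis.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical travel times in minutes (illustrative only).
actual = [10.0, 20.0, 30.0]
perfect = r2_score(actual, [10.0, 20.0, 30.0])   # predictions match exactly
baseline = r2_score(actual, [20.0, 20.0, 20.0])  # always predict the mean
```

A model that reproduces the test set exactly scores 1, a model that always predicts the mean scores 0, and worse models score below 0.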
