Data quality

Monitoring and data quality assessment of the ATLAS liquid argon calorimeter

The most common issue encountered during data taking is a trip of one HV line, i.e. a sudden drop of voltage due to a current spike. When a current spike occurs, the HV module automatically reduces the voltage in that sector. The HV line is usually ramped up automatically directly afterwards. If the automatic ramp-up procedure fails (or before automatic ramping was used, e.g. early 2011), the HV line can either be ramped up manually or left at zero voltage until the end of the run; in the latter case, thanks to the redundant HV supply, the affected regions remain functional, although with a worse signal/noise ratio. During data acquisition, the calibration factors associated with the HV settings are stored in registers of the ROD boards [6] and cannot be changed without a run stop; they therefore remain constant during a run, even if the effective HV value changes. As reduced HV settings induce a reduced electron drift speed, the energy computed online is underestimated, which impacts the trigger efficiency near the trigger threshold. Given the limited size of a sector and the rare occurrence of such a configuration, this had a negligible impact. As previously described, the HV trips are recorded by the DCS data quality flag, but a dedicated HV database including all the trip characteristics is also filled daily by an automated procedure.
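
As an illustration of what such an automated daily fill of a trip database might look like, here is a minimal Python sketch; the table layout, column names and the fetch_dcs_trips helper are hypothetical stand-ins, not the actual ATLAS tooling.

```python
# Hypothetical sketch of a daily job that archives HV-trip records into a trip database.
# Table layout, column names and fetch_dcs_trips() are illustrative only.
import sqlite3
from datetime import date, timedelta

def fetch_dcs_trips(day):
    """Placeholder for a query against the DCS archive; returns trip records for one day."""
    return [
        {"hv_line": "EMBA_S05_L2", "trip_time": f"{day}T03:12:44",
         "max_current_uA": 85.0, "recovery": "auto_ramp"},
    ]

def fill_trip_db(db_path="hv_trips.sqlite", day=None):
    day = day or (date.today() - timedelta(days=1)).isoformat()
    con = sqlite3.connect(db_path)
    con.execute("""CREATE TABLE IF NOT EXISTS hv_trips (
                       hv_line TEXT, trip_time TEXT,
                       max_current_uA REAL, recovery TEXT)""")
    con.executemany(
        "INSERT INTO hv_trips VALUES (:hv_line, :trip_time, :max_current_uA, :recovery)",
        fetch_dcs_trips(day))
    con.commit()
    con.close()

if __name__ == "__main__":
    fill_trip_db()
```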

ANÁLISIS DE LA CALIDAD DEL DATO Y GENERALIZABILIDAD DE UN SISTEMA DE OBSERVACIÓN DEL CONTRAATAQUE EN EL BALONMANO DE ÉLITE [Analysis of data quality and generalizability observing system counter in elite handball]

An ad hoc observation system comprising field formats and a set of mutually exclusive and exhaustive (E/ME) categories is presented for studying the counterattack in handball. Data quality has a qualitative aspect, addressed through consensual agreement. The quantitative aspect of the analysis is carried out using Cohen's kappa index, three correlation coefficients (Kendall's tau-b, Pearson and Spearman) and a generalizability analysis. For these quantitative analyses, the sequential analysis program SDIS-GSEQ and the generalizability program SAGT v1.0 were used. Generalizability analysis not only allows the reliability of the observers to be determined, but also estimates the goodness of fit of the categories and optimizes the measurement design by calculating the minimum number of sessions needed to
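
A minimal sketch of the quantitative part of such an inter-observer analysis, computed here with scipy/scikit-learn rather than SDIS-GSEQ or SAGT; the two observer vectors are invented example data.

```python
# Illustrative inter-observer agreement analysis on made-up category codes.
import numpy as np
from scipy import stats
from sklearn.metrics import cohen_kappa_score

observer_a = np.array([1, 2, 2, 3, 1, 4, 2, 3, 3, 1])  # categories coded by observer A
observer_b = np.array([1, 2, 3, 3, 1, 4, 2, 3, 2, 1])  # same events coded by observer B

kappa = cohen_kappa_score(observer_a, observer_b)          # chance-corrected agreement
tau_b, _ = stats.kendalltau(observer_a, observer_b)        # Kendall's tau-b
pearson_r, _ = stats.pearsonr(observer_a, observer_b)      # Pearson correlation
spearman_rho, _ = stats.spearmanr(observer_a, observer_b)  # Spearman correlation

print(f"kappa={kappa:.2f} tau_b={tau_b:.2f} r={pearson_r:.2f} rho={spearman_rho:.2f}")
```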

Data quality in a big data context: about Twitter’s data quality

As mentioned before, the report on Data Quality of the Data Warehousing Institute estimates that Data Quality problems cost U.S. businesses more than 600 billion dollars a year [4], and some companies in the tech industry are raising their voices on this issue. IBM's acquisition of Ascential Software (2005), a data integration tools company, highlights the role of Data Quality; one of their reports identifies Data Quality and security issues as the leading inhibitor (55% of respondents in a multi-response survey) to successful data integration projects [5]. SAP has set up a test project in the area of Data Quality, with significant savings in several internal business processes [25]. Informatica, another leading company in data integration, and Oracle also rely on Data Quality tools in their products [22] [24].

Data quality in a big data context

In spite of the relevance of the topic, there has not been much work so far, in particular regarding the implementation of quality processes over Big Data sources. This paper tackles this issue. More concretely, the contributions of this work are: (1) The definition of DQ dimensions and metrics in a Big Data scenario where data arrive as unstructured documents and in real time. Traditional DQ dimensions are redefined to address those particular characteristics. This general scenario is instantiated to study the concrete case of Twitter feeds. (2) The implementation of a system that acquires tweets in real time and computes the quality of each tweet, applying the quality metrics defined formally in the paper. The implementation includes a web user interface that allows filtering the tweets, e.g. by keywords, computing their data quality, and visualizing the DQ, not only overall but also along each dimension. (3) An experimental study of the quality of the feeds, using the tool described above. This study is aimed at showing how DQ can be used to determine the attributes that characterize the different quality of the tweets, filter out bad-quality data, or validate the conclusions drawn in the data analysis phase.
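
As a rough sketch of what per-tweet quality scoring along a few dimensions could look like; the dimensions, thresholds and weighting below are invented for illustration, not the metrics defined in the paper.

```python
# Toy per-tweet data-quality scoring; dimension definitions and weights are illustrative only.
from datetime import datetime, timezone

def tweet_quality(tweet):
    now = datetime.now(timezone.utc)
    age_s = (now - tweet["created_at"]).total_seconds()
    dq = {
        "completeness": sum(bool(tweet.get(f)) for f in ("text", "user", "geo")) / 3,
        "timeliness":   max(0.0, 1.0 - age_s / 3600),        # decays over one hour
        "credibility":  min(1.0, tweet["followers"] / 1000),  # crude proxy
    }
    dq["overall"] = sum(dq.values()) / len(dq)
    return dq

example = {"text": "air quality alert", "user": "u1", "geo": None,
           "followers": 250, "created_at": datetime.now(timezone.utc)}
print(tweet_quality(example))
```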

Data quality management and evolution of information systems

In a Peer-to-Peer information system (usually abbreviated to P2P), the traditional distinction between clients and servers, typical of distributed systems, is disappearing: every node of the system plays the role of both a client and a server. A node pays for its participation in the global exchange community by providing access to its computing resources, with no obligation regarding the quality of its services and data. A P2P system can be characterized by a number of properties: no central coordination, no central database, no peer has a global view of the system, global behaviour emerges from local interactions, peers are autonomous, and peers and connections are unreliable. It is clear that P2P systems are extremely critical from the point of view of data quality, since no obligation exists for agents participating in the system, and it is costly and risky for a single agent to evaluate the reputation of other partners.

A flexible framework for assessing the quality of crowdsourced data

The focus of this paper is to present a framework for validating and assessing the quality of data with a geographic component contributed by citizens. Proactive data improvement through stimulation of authoritative data and metadata is utilised to increase accuracy and reduce uncertainty. The standards described for data quality (ISO 19157) and for geospatial metadata (ISO 19115), together with additional GeoViQua elements, are relevant because the stakeholder overseeing crowdsourcing activities acts as a data producer but does not fully control the data measurement process. Additionally, the stakeholder is able to make judgements and evaluate the data from their own perspective, and can also harness dynamic interaction with the user to influence the way the data are captured. Therefore, additional quality elements incorporating a stakeholder model are needed to fully qualify the collected data. These elements derive from assessments concerning the user, such as sensor accuracy linked to calibration measures, data captured in relation to other knowledge (Pawlowicz et al. 2011), or their interaction seen as a source of uncertainty (Rousell et al. 2014).
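
A minimal data-model sketch of how producer-side (ISO 19157-style) and stakeholder-side quality elements might sit together on one crowdsourced observation; the field names and classes are illustrative assumptions, not the framework's actual schema.

```python
# Sketch of a record combining ISO 19157-style quality elements with stakeholder-side
# elements for one crowdsourced observation; field names are illustrative only.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ProducerQuality:          # roughly ISO 19157 territory
    positional_accuracy_m: Optional[float] = None
    thematic_accuracy: Optional[float] = None
    completeness: Optional[float] = None

@dataclass
class StakeholderQuality:       # judged by the stakeholder overseeing the campaign
    sensor_calibrated: bool = False
    contributor_track_record: float = 0.0   # 0..1, from past contributions
    contextual_consistency: float = 0.0     # agreement with other knowledge

@dataclass
class Observation:
    obs_id: str
    value: float
    producer_q: ProducerQuality = field(default_factory=ProducerQuality)
    stakeholder_q: StakeholderQuality = field(default_factory=StakeholderQuality)

obs = Observation("o-001", 17.3,
                  ProducerQuality(positional_accuracy_m=8.0, completeness=1.0),
                  StakeholderQuality(sensor_calibrated=True, contributor_track_record=0.7))
print(obs)
```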

Environment for the evaluation and certification of data products quality

During the planning phase, it was important to identify the common Data Quality issues of the Stakeholders when managing their data products in a production environment. Therefore, part of the investigation consisted in analyzing those issues and classifying them using existing Data Quality models from the literature. These models, obtained from the literature review described in section II.2, contain common characteristics referring to Data Quality and have been applied to different use cases. Unfortunately, many of them do not define specific ways to quantify the characteristics they include, leaving the decision to each use case. For that reason, the measurement results obtained in different use cases are not comparable. Another issue detected in the existing Data Quality models is that they usually address the quality of data models, metadata, data dictionaries, etc. From the researcher's and JRG's points of view, the majority of organizations do not update their data models or data dictionaries, and neither metadata nor its use is correctly defined. Thus, the challenge was to define a Data Quality model without these flaws.

Compresión de la mortalidad

Figure 1 and Table A1 present the completeness of death counts for each country, period and sex in LAC countries. The quality of mortality data improved steadily over the last half-century, as observed in other studies (Palloni and Pinto-Aguirre, 2011). In more recent years, the intercensal period 2000 to 2010, most of the countries in our analysis show near-complete death count registration. For earlier decades, we observed through diagnostic plots (results not shown) that the points for both males and females at young ages are very irregular and lie off the fitted line, leading to very unstable estimates of completeness. Also, the estimate of census coverage indicates better coverage in the first census compared to the second one, which is consistent with problems arising from low data quality, net emigration and errors in age declaration. Overall, the fit of the observations (death rates) improved over time, and the data are relatively complete for the most recent periods. The estimates are more precise when fitting the models only for age groups 35 years and older.
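
For readers unfamiliar with these death distribution methods, the sketch below shows only the core regression step of a growth-balance-style fit restricted to ages 35 and older, with completeness read off the slope; the input arrays are invented illustration data and this is an assumption about the general technique, not the exact procedure used in the paper.

```python
# Minimal core of a growth-balance fit: regress the entry rate minus growth rate on the
# registered death rate for open age groups 35+, and read completeness from the slope.
# The arrays below are invented illustration data, not estimates from the paper.
import numpy as np

ages        = np.arange(5, 85, 5)                   # lower bounds of open age groups x+
entry_rate  = np.linspace(0.08, 0.02, len(ages))    # b(x+): rate of entry into ages x+
growth_rate = np.full(len(ages), 0.02)              # r(x+): intercensal growth of ages x+
death_rate  = np.linspace(0.055, 0.001, len(ages))  # d(x+): registered death rate, ages x+

mask = ages >= 35                                   # restrict the fit to ages 35 and older
y = entry_rate[mask] - growth_rate[mask]
x = death_rate[mask]
slope, intercept = np.polyfit(x, y, 1)

completeness = 1.0 / slope                          # c: completeness of death registration
print(f"estimated completeness: {completeness:.2f}")
```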

Characteristics of Citizen contributed Geographic Information

Current Internet applications have been increasingly incorporating citizen-contributed geographic information (CCGI) with highly heterogeneous characteristics. Nevertheless, despite their differences, several terms are often used interchangeably in the existing literature to define CCGI types. As a result, the notion of CCGI has to be carefully specified in order to avoid vagueness and to facilitate the choice of a suitable CCGI dataset for a given application. To address the terminological ambiguity in the description of CCGI types, we propose a typology of GI and a theoretical framework for the evaluation of GI in terms of data quality, number and type of contributors, and cost of data collection per observation. We distinguish between CCGI explicitly collected for scientific or socially-oriented purposes. We review 27 of the main Internet-based CCGI platforms and analyse their characteristics in terms of purpose of the data collection, use of quality assurance and quality control (QA/QC) mechanisms, thematic category, and geographic extent of the collected data. Based on the proposed typology and the analysis of the platforms, we conclude that CCGI differs in terms of data quality, number of contributors, data collection cost and the application of QA/QC mechanisms, depending on the purpose of the data collection.

Evaluacion de la Calidad Linked Open Data basada en un Modelo de Lógica Difusa

Abstract. Linked Open Data has been one of the most widely used online data publishing methods in recent years. This growth means that the quality of these data matters to consumers and to people who wish to use them. There are approaches based on classical mathematical models; however, most of their results are too linear, that is, they use conventional evaluators to define both quality aspects and results. In response, a new approach based on fuzzy logic is built as an application, which aims to complement and compare traditional models without the need to restrict the quality aspects that can be measured. Methodologically, data are obtained from each dataset through the SPARQL endpoints provided by high-category datasets and classified within the accessibility and trust dimensions, represented by four values: response time, scalability, trustworthiness and timeliness. This analysis is done internally for the values within the accessibility dimension, and externally for the values within the trust dimension. In this way, it is possible to determine a better general quality approximation of the Linked Open Data according to a large number of quality evaluation variables, or even to parameterize one's own aspects in the model as a complement to the already established models, through the concept of fuzzy logic.
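
As a rough illustration of how such a fuzzy quality evaluation could work; the membership functions, cut-offs and aggregation rule below are invented, not the ones defined in the paper.

```python
# Toy fuzzy evaluation of two accessibility-related measurements of a SPARQL endpoint;
# membership functions and the aggregation rule are illustrative only.
def mu_fast_response(seconds):
    """Degree to which a response time counts as 'fast' (1 below 0.5 s, 0 above 5 s)."""
    if seconds <= 0.5:
        return 1.0
    if seconds >= 5.0:
        return 0.0
    return (5.0 - seconds) / 4.5

def mu_fresh(days_since_update):
    """Degree to which a dataset counts as 'timely' (1 under 30 days, 0 over 365 days)."""
    if days_since_update <= 30:
        return 1.0
    if days_since_update >= 365:
        return 0.0
    return (365 - days_since_update) / 335

measurements = {"response_time_s": 1.8, "days_since_update": 120}
memberships = {
    "fast_response": mu_fast_response(measurements["response_time_s"]),
    "timeliness":    mu_fresh(measurements["days_since_update"]),
}
overall = min(memberships.values())   # pessimistic (min) aggregation across values
print(memberships, "overall:", round(overall, 2))
```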

Defining a set of standardised outcome measures for newly diagnosed patients with multiple myeloma using the Delphi consensus method: the IMPORTA project

The participants in the discussion group considered that the side effects of MM treatments were important outcomes since they commonly cause considerable morbidity and low HRQoL in patients with MM. 5 13 14 To facilitate data collection, the health professionals proposed a simplified version of the Common Terminology Criteria for Adverse Events V.4, 15 clustering them into general categories (bone marrow suppression, constitutional, cardiovascular, hepatic, renal, neurological, gastrointestinal, skin, infection and others). The Delphi panellists agreed to collect each completed treatment (with or without dosage reduction) and those side effects that hamper the patient's daily activities or those that imply changes in the treatment pattern (table 1). Consensus was achieved to collect this information monthly during treatment and every 2 or 3 months during periods without treatment (table 1 and figure 1).

A Semantic Data Grid for Satellite Mission Quality Analysis

The biggest gain is that it is much more robust in the face of changing data. We can continue to use these "semantic level" queries about instruments even if we add new event types which use this instrument, or change the unique identifiers for individual DMOP event records. If further data in the system contained new sorts of events planned and carried out by the Radar Altimeter, then our queries would automatically match them. In any of these extended cases, a simple statement associates the new event type with an existing instrument, or new events with an existing event type. The exact same query (for use of an instrument) will then also report on these new events. We have shifted from talking about details of identifiers to the actual objects the user is concerned about, i.e. we have moved to a more semantic level. This process is shown in more detail in an OntoGrid demonstration video [14].
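
A toy rdflib illustration of the idea: events point to an event type, event types point to an instrument, so a query for "events that used this instrument" keeps working when a new event type is linked by a single triple. The vocabulary and identifiers here are made up, not the actual DMOP/OntoGrid ontology.

```python
# Toy RDF graph: events link to an event type, event types link to an instrument.
# Adding a new event type only needs one usesInstrument triple; the query is unchanged.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/mission#")
g = Graph()
g.add((EX.evt_001, EX.hasEventType, EX.CalibrationPass))
g.add((EX.evt_002, EX.hasEventType, EX.AltimetryPass))
g.add((EX.CalibrationPass, EX.usesInstrument, EX.RadarAltimeter))
g.add((EX.AltimetryPass, EX.usesInstrument, EX.RadarAltimeter))

q = """
SELECT ?event WHERE {
    ?event ex:hasEventType   ?etype .
    ?etype ex:usesInstrument ex:RadarAltimeter .
}
"""
for row in g.query(q, initNs={"ex": EX}):
    print(row.event)
```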

Knack S Aid Dependence & Quality

Aid dependence is measured above by country mean values over the 1982-95 period. If aid is highly variable over time within a country, dependence might be lessened in the sense that aid cannot be relied on as a stable source of funds. This reduced reliance could diminish the harmful impact of aid on the quality of governance. In Svensson's (forthcoming) model, the expectation of aid increases rent seeking and corruption. On the other hand, high aid variability in a country may indicate that donors have a shorter-term, project-oriented emphasis that disrupts existing institutions, replacing them with new ones that collapse when funding ends (Meyer, 1992).

Definitions of quality in higher education: A synthesis of the literature

This article has several implications for institutions and quality assurance practitioners. As discussed at the beginning of this article, some have argued that quality is indefinable; however, given the increasing public and governmental interest in quality in higher education, this argument may no longer be acceptable. Institutions must be able to provide evidence to support claims of quality, which often includes systematic assessment of quality. One must be able to define quality in order to assess it. As shown in Table 3, the authors offer recommendations for defining quality and quality assurance depending on the existing state of quality initiatives at an institution. The aim of the recommendations for defining quality and quality assurance is to meet institutions and quality assurance practitioners where they are, in an effort to help them bring greater clarity and alignment to existing quality assurance practices. In addition, the recommendations must be considered in the context of institutional mission and existing cultural, regulatory, and political environments.

Reputation and quality of the host in Airbnb Mallorca

perception as it emerged from past reviews, in order to provide a new and more exhaustive indicator of product quality. As is generally known, one of the risks of Airbnb is the asymmetric information that arises when someone is about to book a room; avoiding this dilemma, together with the recent news and the boom Airbnb has experienced over the last two years, was one of the motivations for this final project, which focuses on the quality that hosts in Mallorca offer the guests who come to visit the island. The method used to analyse the reviews was sentiment analysis, and the host characteristics were analysed following the steps explained above. This study tries to add to the literature a new quality index that could be applied to other cities, or even countries, since this one studied exclusively the island of Mallorca. To learn more about quality and the factors that affect host quality, relevant articles and news from the literature were reviewed concerning these issues that affect Airbnb and the hosts and guests that use this
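
A minimal sketch of turning review sentiment into a simple host-quality score, using TextBlob as a stand-in for whatever sentiment tool the project actually used; the reviews and the 0-10 rescaling are invented.

```python
# Toy host-quality index: average review sentiment polarity rescaled to 0-10.
# TextBlob stands in for the sentiment tool actually used; reviews are invented examples.
from textblob import TextBlob

reviews = [
    "Lovely flat, the host was friendly and everything was spotless.",
    "Nice location but the room was smaller than advertised.",
]

polarities = [TextBlob(r).sentiment.polarity for r in reviews]    # each in [-1, 1]
quality_index = 10 * (sum(polarities) / len(polarities) + 1) / 2  # rescale to [0, 10]
print(f"host quality index: {quality_index:.1f}")
```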

Safely managed drinking-water

A growing number of nationally representative household surveys have integrated direct testing of drinking water quality with support from the JMP. In these surveys, field teams test for an indicator of faecal contamination, E. coli, using membrane filtration and dehydrated growth plates. The results can be used to assess the level of risk for different water sources and across population groups, to identify inequalities. Water is tested from a glass of drinking water as well as directly from the place where the water was collected. Intensive training and field supervision are combined with 'blank' tests to provide quality control and quality assurance. Drinking water has also been tested for chemicals such as arsenic and fluoride, either in the field or by sending samples to a laboratory.

Measuring the quality of management in education. Review article

The objective of this study is to review and contextualize the existing definitions of educational quality from the managerial point of view. We also present the factors that have been considered to support managerial decision making within educational institutions. Relevant research related to the different models for measuring educational quality and the different factors that affect this quality is discussed. The existing methodological gap in the statistical processes, the theoretical evidence and the number of investigations at every level of education are identified. The results provide a framework for future research and can become the basis for the design and construction of multidimensional models for the educational management quality measurement needs of educational institutions. The results also show the lack of a single criterion for building the indicators, as well as a strong degree of subjectivity in the measuring processes.

Análisis de la calidad de datos en fuentes de la suite ABCD

File systems are a fundamental part of a big data architecture, since several tools are built on top of them. Moreover, working with unstructured data makes them even more important, as they are the main means of handling this type of information. Additionally, one goal that big data systems pursue is scalability, i.e. a system that can change its size (growing or shrinking) according to need, without affecting the overall performance of the whole system. This need motivated the emergence of distributed file systems, which consist of a network or cluster of interconnected computers (or nodes) configured to expose a single logical file system. One example, and one of the most widely used, is the Hadoop Distributed File System (HDFS), which is designed specifically to run on affordable, low-cost hardware and to be fault tolerant (Morros and Picañol, 2013).
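
For orientation, a minimal Python sketch of talking to HDFS through pyarrow; the namenode host, port and paths are placeholders, and a working Hadoop native client (libhdfs) configuration is assumed.

```python
# Minimal HDFS access sketch via pyarrow; namenode host/port and paths are placeholders,
# and this assumes the Hadoop native client (libhdfs) is installed and configured.
from pyarrow import fs

hdfs = fs.HadoopFileSystem(host="namenode.example.org", port=8020)

# Write a small file, then list the directory and read the file back.
with hdfs.open_output_stream("/data/example.txt") as f:
    f.write(b"hello from HDFS\n")

for info in hdfs.get_file_info(fs.FileSelector("/data")):
    print(info.path, info.size)

with hdfs.open_input_stream("/data/example.txt") as f:
    print(f.read().decode())
```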

An autonomic framework for enhancing the quality of data grid services

The 1 GB read operation bandwidth in each mode is noticeably different. The random mode client is clearly penalized because it is equally likely to select either the best or the worst resources. On the other hand, both proposed decision-based modes guarantee the use of optimal resources because decisions are focused on obtaining high performance. Thus, although the random mode client does not contact the broker, eliminating the extra overhead, it takes around 38% more time than the decision mode client and around 46% more time than the prediction & decision mode client. In the case of 1 GB read operations, the difference observed between the two decision-based modes is due to the prediction enabling more constant data access in the long term. The decision mode client is penalized by the grid changes because write operations do not take into account the behaviour prediction. The prediction & decision mode client achieves an improvement of around 10% with regard to the decision mode client. As in the 10 MB and 100 MB file size accesses, the best resource mode obtains the worst performance since clients only access a single grid resource. From a general perspective, both decision-based modes (the fundamental part of this contribution) obtain significantly better results.
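
To make the comparison concrete, here is a toy sketch of the three selection policies (random, decision on the latest measured bandwidth, and decision plus a trivial mean-based prediction); the bandwidth histories and the prediction rule are invented, not the framework's actual broker logic.

```python
# Toy comparison of resource-selection policies for a read operation.
# Bandwidth histories and the "prediction" rule are invented illustration data.
import random

history_mb_s = {                 # recent bandwidth measurements per grid resource
    "res_a": [40, 42, 41],
    "res_b": [80, 10, 85],       # fast at times but unstable
    "res_c": [60, 62, 61],       # slightly slower but steady
}

def pick_random():
    return random.choice(list(history_mb_s))

def pick_decision():
    # choose the resource with the best latest measurement
    return max(history_mb_s, key=lambda r: history_mb_s[r][-1])

def pick_prediction_decision():
    # choose on a simple prediction: the mean of recent measurements
    return max(history_mb_s, key=lambda r: sum(history_mb_s[r]) / len(history_mb_s[r]))

print("random:              ", pick_random())
print("decision:            ", pick_decision())            # picks the unstable res_b
print("prediction+decision: ", pick_prediction_decision()) # picks the steady res_c
```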

Inferring Air Quality from Traffic Data using Transferable Neural Network Models

In addition to ANNs, fuzzy sets provide another tool for dealing with uncertainty in dispersion modelling. Fisher [3] reviewed the various uncertainties existing in dispersion modelling and highlighted the feasibility of fuzzy approaches to environmental decisions. Another fuzzy-based system for air quality modelling was proposed by [16], where the use of a trapezoidal membership function was proposed. The model is based on data collected over a year at 5 locations in Tehran.
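
A trapezoidal membership function is simple enough to state directly; the sketch below shows the general shape referred to in [16], with invented breakpoints for a hypothetical "moderate pollution" class rather than the values used in that model.

```python
# Trapezoidal membership function mu(x; a, b, c, d): rises from a to b, flat until c,
# falls to zero at d. Breakpoints are invented, e.g. for a "moderate" pollution class.
def trapezoid(x, a, b, c, d):
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Example: membership of a concentration value in a hypothetical "moderate" class.
for value in (25, 45, 70, 95):
    print(value, round(trapezoid(value, a=30, b=50, c=80, d=100), 2))
```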
