Information Quality Evaluation for Grid Information Services

The work described in this paper has several objectives. First, we want to obtain a fair, systematic approach to measuring the information quality of different Grid information services, so that we can compare them and provide guidelines on the circumstances in which each of them can be used. The main challenge is that different Grid information services use different information models to represent the same type of Grid resources: some of them use LDAP to represent that information and others use relational models, and the information that they store about each resource may also differ. Unlike information quality evaluation in other domains (such as Web search, where precision and recall measurements can be obtained by counting numbers of documents), the information objects in our evaluation are heterogeneous, both in the information model used and in its access API, which makes it hard to compare the outputs. We have proposed the use of a common information model to allow comparisons between these outputs. We explain the details in Section 3.2.
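To make the idea of a common information model concrete, the following minimal Python sketch maps results from two differently shaped sources into one shared record form and then computes precision and recall against a known ground truth. The attribute names (`hostName`, `cpuCount`), the record layout and the sample data are hypothetical and are not taken from the paper; they only illustrate why normalization has to come before counting.

```python
def normalize_ldap(entry):
    """Map a hypothetical LDAP-style entry (attribute -> list of values)
    to the common record form (host, cpus)."""
    return (entry["hostName"][0].lower(), int(entry["cpuCount"][0]))


def normalize_row(row):
    """Map a hypothetical relational row (host, cpus) to the same form."""
    return (row[0].lower(), int(row[1]))


def precision_recall(returned, relevant):
    """Precision and recall of a set of returned records against the truth."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Assumed ground truth for one query, already in the common form.
truth = {("ce1.example.org", 8), ("ce2.example.org", 4)}

ldap_result = [
    {"hostName": ["ce1.example.org"], "cpuCount": ["8"]},
    {"hostName": ["ce3.example.org"], "cpuCount": ["2"]},
]
sql_result = [("CE1.example.org", "8"), ("ce2.example.org", "4")]

ldap_scores = precision_recall(map(normalize_ldap, ldap_result), truth)
sql_scores = precision_recall(map(normalize_row, sql_result), truth)
```

Once both outputs are expressed in the same form, the two services can be scored with the same counting procedure used for document retrieval.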

An autonomic framework for enhancing the quality of data grid services

Although, as far as the authors are aware, no previous work has followed the proposed approach, many projects have tried to model grid resources in order to decide how and where to replicate data. The first approach, widely used in grids, is based on measured performance values. In these systems, the load is measured once (peak value), periodically, or every time an event occurs. This observed performance serves as the basis for making decisions [4, 12, 22, 31, 39]. This approach assumes that the future will resemble the measured state, i.e. that the system is essentially static. A second approach consists of modeling with analytical models [1, 17, 27, 30, 38]. Another family of models is based on correlation between events. The idea is to predict events from their correlation with previous events [7] or workload characteristics [40]. As an evolution of these models, there have also been proposals that go beyond building the model by checking its accuracy and remodeling whenever needed [37]. The proposed models are not adequate for our objective because they cannot predict the behavior at a given time when this time is not known a priori, making it impossible to adapt dynamically to changes in the system behavior.

Quality of bus services performance: benefits of real time passenger information systems

The public transport network in Majadahonda is composed of 18 intercity bus lines, 2 urban lines and a suburban rail station. Urban bus service is made up of circular lines 1 and 2. These share a single circular route, with each line running in one direction. This service complements the service offered by the intercity bus lines, which pass through the city centre on alternative routes and run along the main urban streets. The main intercity bus to Madrid travels along the A-6 corridor, reaching the Moncloa interchange in the capital. The approximate travel time from Majadahonda to the Moncloa interchange is 30 minutes. Altogether, these lines constitute an important service with some 30,000 passengers on a working day. The suburban rail station is well connected with urban and suburban bus lines. It is served by two suburban train lines connecting Majadahonda with 5 metropolitan interchanges and 8 connections with the underground network. The daily demand for the station is more than 10,000 passengers, including boarding and alighting passengers. The station has a park and ride facility with 1,200 places. Parking fees are applied to users of public transport according to their transport ticket (CRTM 1 Travel Pass, Monthly Travel Pass Cercanias and round trip tickets).

GEOGRAPHIC INFORMATION SYSTEM FOR EVALUATION OF ROAD TRAFFIC NOISE ALONG THE ROAD

Basic Data. The basic data used in the System is shown in Table 1. The geographic data is used to draw a map of Fukuoka City. It consists of a background (boundary and districts of Fukuoka City, etc.), building data (shape, coordinates, type, etc.), and road data (name, kind, road center, road edge, etc.). These are obtained from the 1998 edition of the 1:2,500 scale topographical maps issued by the Geographical Survey Institute of the Ministry of Land, Infrastructure and Transport. The investigated data comes from a survey that Fukuoka City carried out specifically for the System. It consists of road information (width of sidewalk, number of lanes, etc.), measured noise levels at the roadside, traffic data (traffic volume for each type of vehicle) and other items (photographs of the current state of the investigation points, etc.). The data for estimation is the data needed to estimate the noise level at each building located within 50 m of the road. It consists of the town block, which is the base unit for noise estimation, noise parameters (prospective angle to the road from a predicting point, the density of buildings in the area, etc.) and other items (the environmental quality standards, etc.).

Using score distributions to compare statistical significance tests for information retrieval evaluation

Query splitting methods can only assess the consistency of a significance test with itself. But the results of the test can be consistently wrong over the splits (rejecting a true H0, a type I error, or failing to reject a false H0, a type II error). With no knowledge of the truth or falseness of H0, we should not simply equate consistency with reliability. Comparing significance tests against the permutation test, as done in (Smucker et al., 2007), is not free of problems either. The permutation test can compute a good approximation to the exact p-values, but we should not derive miss or false alarm rates from such p-values. Given a certain significance level (α) and the p-values estimated by permutation, the miss and false alarm rates of each significance test are measured based on the agreement between the test’s decisions and the permutation test’s decisions. Such an approach implicitly assumes that those cases where the p-value produced by permutation is above α are cases where H0 is true and, conversely, that those cases below α are cases where H0 is false. This rule compromises the quality of the analysis because the permutation test is not error-free. For example, with α = .05, the permutation test would make type I errors in 5% of the cases where H0 is true (no difference between the systems, but the permutation test says otherwise). In those cases, giving blind faith to the permutation test unfairly penalises any significance test that makes the correct decision (any significance test that is skeptical about the difference is actually right!). Likewise, the permutation test makes some type II errors, and accepting the permutation test’s decisions in those cases is unfair to the tests that do detect the difference. In summary, the main flaw of existing methods to assess the reliability of significance tests is that they make strong assumptions about the truth value of the null hypothesis.
Although the previous studies contributed substantially to analysing the use of significance tests in IR, we believe that a more robust methodology, based on actual knowledge about H0, can be designed. This is precisely the primary aim of our paper.
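For readers unfamiliar with the permutation test discussed above, here is a minimal Python sketch of a paired, two-sided variant that randomly flips the sign of each per-query score difference. It is an illustrative assumption, not necessarily the exact procedure used by Smucker et al. (2007); the function name, the default number of trials and the fixed seed are all choices made for this sketch.

```python
import random


def paired_permutation_test(scores_a, scores_b, trials=10000, seed=0):
    """Estimate a two-sided p-value for the mean per-query difference.

    Under H0 the two systems are exchangeable, so the sign of every
    per-query difference can be flipped with probability 0.5.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    extreme = 0
    for _ in range(trials):
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        # Small tolerance so that rounding does not hide exact ties.
        if abs(total / n) >= observed - 1e-12:
            extreme += 1
    # Add-one correction: the observed labelling counts as one permutation.
    return (extreme + 1) / (trials + 1)


# Hypothetical per-query scores for two systems over 20 queries.
system_a = [0.50 + 0.01 * i for i in range(20)]
system_b = [0.10 + 0.01 * i for i in range(20)]
p_value = paired_permutation_test(system_a, system_b, trials=2000)
```

The returned value is only an estimate of the exact p-value. As the passage argues, comparing it with α gives a decision that can itself be a type I or type II error, so it should not be treated as ground truth about H0.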

An autonomic framework for enhancing the quality of data grid services

For this purpose, the weights of the different stages are examined step by step: (i) analysis of the active grid resources to obtain their predictions and the historical information of[r]

Estimation of Perceived Quality in Convergent Services

A model for estimating quality as perceived by the users (i.e., the user Quality of Experience, QoE) in Triple-Play (3P) and Quadruple-Play (4P) services has been presented. The model is based on a matrix framework defined in terms of user types, service components, and user perceptions on the user side, and agents, agent capabilities, and performance indicators on the network side. A Global Quality Evaluation process, based on several layers of evaluation functions, has been described; it makes it possible to estimate the overall quality of a set of convergent services, as perceived by the users, from a set of performance and/or Quality of Service (QoS) parameters of the convergent IP transport network. The model has been refined for the particular case of residential (domestic) users with a specific information flow where the content server is external to the ISP and there is no content caching outside the content provider. The full sets of services, user perceptions, valuation factors, agents and agent capabilities have been provided, as well as the full matrix of matching points between agent capabilities and user perceptions. Performance indicators, as well as valuation and parameterization functions for some representative services (Digital Video Broadcast in IPTV, Voice Call in IP Telephony, and Web Browsing in Internet Access) have been provided. For Global Service Quality evaluation, weights for

Methodology for Data Loss Prevention Technology Evaluation for Protecting Sensitive Information

The authors considered the inductive method [20] suitable for testing the capabilities of DLP technology. Since technological solutions are based on the requirements of the industry, the authors considered it viable to develop the methodology around a specific need shared by all private and public companies [21]. If the investigator does not have a clear idea of what to protect or which law must be complied with, testing will be complex and may run into resource limitations, since it would aim to test everything. Private information about employees or customers has been chosen in order to support the new productive matrix of the Republic of Ecuador, which implies that services should be improved, so information security will be the starting point to ensure that the service is delivered correctly [22]. Consequently, the authors propose a methodology to test the capabilities of DLP technology by checking how it can prevent the loss of sensitive personal data, which, according to the “Ley de Comercio Electronico, Firmas y Mensajes de Datos” of Ecuador [19], must be protected because it is private information. The methodology will establish the capabilities of the DLP technology for this specific scenario, but these capabilities can be generalized using the inductive method [23].

Active Ontology: An Information Integration Approach for Dynamic Information Sources

As for the detailed evaluation, we have analysed the average and worst-case response times of our system with respect to other configurations where no metadata cache is used, as well as the accuracy of the information provided to the requestor under different alternatives, such as those based only on materialising information (with or without updates) and those based only on virtual information access on demand. We also designed experiments for information quality measurement and conducted them on the EGEE Grid testbed. The experimental results show that: 1) BDII has poor precision for complex queries because of its weak (LDAP-based) query capability; 2) RGMA is very sensitive to the registration and availability of information providers at a given point in time; 3) some complex queries cannot be answered by BDII or RGMA in isolation; 4) the ActOn-based information service is able to adopt existing information sources as its information providers and to aggregate information from these sources to answer such complex queries. The details are presented in [36].

An ActOn-based Semantic Information Service for EGEE

The main limitations of existing information services are that they do not provide enough information about large-scale distributed systems, since they only focus on a few specific aspects of such systems, and that they do not always provide accurate information about the actual status of the Grid resources that they refer to. As mentioned above, BDII [4] and MDS2 [5] capture information about hardware and software resources, but do not provide information about data sources, networking connections, services and running environments. Furthermore, in some cases the information models used by existing information services are ill-defined or cannot be handled easily to solve general-purpose queries. For example, in MDS4 [7] the keyword “MPI” is used to describe that a site is “MPI-enabled”, but this does not necessarily mean that the MPI configuration is correct at that site, a detail that is missing from that information model. Our experience in Crossgrid, LCG Grid, Open Science Grid, and EGEE Grid shows that this can lead to failures or inadequate behaviours in other middleware services that depend heavily on information services, such as resource brokers and job schedulers.

Data quality management and evolution of information systems

In a peer-to-peer (P2P) information system, the traditional distinction between clients and servers, typical of distributed systems, is disappearing. Every node of the system plays the role of both a client and a server. The node pays for its participation in the global exchange community by providing access to its computing resources, without any obligation regarding the quality of its services and data. A P2P system can be characterized by a number of properties: no central coordination, no central database, no peer has a global view of the system, global behavior emerges from local interactions, peers are autonomous, and peers and connections are unreliable. It is clear that P2P systems are extremely critical from the point of view of data quality, since no obligation exists for agents participating in the system, and it is costly and risky for a single agent to evaluate the reputation of other partners.

Information Reconciliation for Quantum Key Distribution

Although linear codes are a good solution for the reconciliation problem, since they can be tailored to a given error rate, their efficiency degrades when the error rate is not known beforehand. This is the case in QKD, where the error rate is an a priori unknown that is estimated for every exchange. The QBER might vary significantly between two consecutive key exchanges, especially when the quantum channel is transported through a shared optical fibre that can be used together with several independent classical or quantum channels that can add noise. To address this problem there are two different options: (i) it is possible to build a code once the error rate has been estimated, and (ii) a pre-built code can be modified to adjust its information rate. The computational overhead would make the first option almost unfeasible except for very stable quantum channels, something difficult to achieve in practice and impossible in the case of a shared quantum channel in a reconfigurable network environment [11]. In this paper we propose the use of the second strategy as the easiest and most effective way to obtain a code for the required rate, for which we describe a protocol that adapts pre-built codes in real time while maintaining an efficiency close to the optimal value.
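To illustrate why a fixed-rate code loses efficiency when the QBER changes, the sketch below uses a definition of reconciliation efficiency that is common in the QKD literature: f = (1 − R) / h(QBER), where R is the information rate of the code and h is the binary entropy function, with f = 1 being optimal. That this matches the paper's own definition is an assumption, and the function names and sample values are hypothetical.

```python
import math


def binary_entropy(p):
    """Binary entropy h(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def reconciliation_efficiency(rate, qber):
    """Assumed definition: disclosed fraction (1 - rate) over the minimum h(qber)."""
    if not 0.0 < qber < 0.5:
        raise ValueError("qber must lie strictly between 0 and 0.5")
    return (1.0 - rate) / binary_entropy(qber)


# A rate-0.5 code is close to optimal near QBER = 11% ...
f_matched = reconciliation_efficiency(0.5, 0.11)
# ... but discloses far more than necessary when the channel is cleaner.
f_mismatched = reconciliation_efficiency(0.5, 0.05)
```

Under this definition, the further the actual QBER drifts below the value the code was designed for, the larger f becomes. This is the motivation for adjusting the information rate of a pre-built code rather than keeping it fixed.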

Participation in voluntary organizations

Inspection of Table 10 indicates some differences in the effect of institutions across sectors. First, economic freedom has a negative impact on participation in the labor services group. Participation in professional organizations and labor unions appears to diminish with more economic freedom, that is, with less regulated markets (including labor markets). The effect is statistically significant and is also intuitively reasonable, since fewer regulations, more privatization and less government intervention may imply that employers have increased their relative bargaining power over unions, leading to a decrease in unionization rates. On the other hand, if no regulations exist, then the relative power of unions to act in defense of those regulations is hurt, and their effectiveness is diminished. Moreover, the direction of the effect on this group is the same as the effect on all groups, indicating that the negative impact of deregulation is more important in terms of participation in labor groups than in community, human capital and political groups.

Water quality and health - Review of turbidity: Information for regulators and water suppliers

High levels of turbidity in source water may limit the effectiveness of household treatment methods; for example, by overloading and clogging filters, or reducing the effectiveness of chlorination or solar disinfection (WHO, 2011). While high turbidity is not desirable, chlorination can still provide benefits. Free chlorine residuals can be produced in the presence of turbidities ranging from above 1 NTU to above 100 NTU (Crump et al., 2005; Lantagne, 2008; Mohamed et al., 2015), resulting in inactivation of bacterial indicators and reductions in diarrhoeal disease (Crump et al., 2005; Elmaksoud et al., 2014). Based on the available evidence, while water should ideally be chlorinated at turbidities less than 1 NTU, if this cannot be achieved (e.g. through pre-treatment or settling), disinfection should still be practiced with higher disinfection doses or contact times (Table 1).

Size and structure of the Chilean information economy = Tamaño y estructura de la economía de la información chilena

Industries with intensive use of natural resources such as “Mining,” “Agriculture and Forestry” and “Fishing” also showed a faster growth of information activities than their respective aggregate sectors. Traditionally, Chile’s competitive advantage has been the export of products with intensive use of natural resources, competing primarily on low costs. Recent studies indicate that economies with intensive use of natural resources tend to grow less in the long term than those that develop technologically unless they innovate to fortify their advantages around these resources or build new ones (Tokman and Zahler, 2004). Important challenges for the Chilean economy include incorporating

LOST to follow up Information in Trials (LOST IT): a protocol on the potential impact

extreme and in most cases unrealistic assumption. However, it is valuable in demonstrating the robustness of a trial's results when the effect estimate remains statistically significant even under the assumption of a worst-case scenario. The remaining 2 assumptions (assuming that LTFU participants have different event incidences than the observed group) are more plausible and are designed (in terms of differential extent and direction between intervention and control) to test the robustness of effect estimates. The challenge is to choose plausible values for RI LTFU/FU and RD LTFU/FU. The values we use in our dummy tables are just illustrative and we are continuing to explore the literature for evidence to support our choices. For example, a recent report on antiretroviral therapy scale-up programs in Africa found that the incidence of death among participants LTFU was 5 times as high as the incidence in those followed up [14]. While other imputation methods such as regression models and multiple imputation might be preferable, they are not feasible because they would require raw data from each included study. The results of this study should have important implications for trialists. Evidence of vulnerability will call for improving both trial design and implementation, to minimize LTFU. The results of our study will also have important implications for clinicians interpreting the findings of RCTs. Our findings will uncover the potential impact of plausible assumptions about the outcome of participants LTFU on the results of those positive studies that are most likely to affect clinical practice. They might provide reassurance that these results are usually robust (at least for reports in five prestigious journals) or, on the contrary, suggest that many high-profile trials are vulnerable.
We may also find that results vary in robustness and that users of the literature will have to evaluate them on a case-by-case basis using clinically plausible assumptions. Our fondest hope is to help create a culture in which investigators uniformly test alternative assumptions regarding LTFU and discuss the extent of the vulnerability of their findings to varying assumptions regarding LTFU.
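The kind of sensitivity analysis described above can be sketched in a few lines of Python. The sketch assumes that RI LTFU/FU denotes the ratio of the event incidence among participants lost to follow-up to the incidence among those followed up, and that events for LTFU participants are imputed from that ratio. The function names, the cap at an incidence of 1.0 and the trial counts are hypothetical and are not taken from the protocol.

```python
def imputed_risk(events_fu, n_fu, n_ltfu, ri):
    """Risk in one arm after imputing events for participants lost to follow-up.

    ri is the assumed relative incidence (LTFU versus followed up).
    """
    incidence_fu = events_fu / n_fu
    incidence_ltfu = min(1.0, ri * incidence_fu)  # an incidence cannot exceed 1
    return (events_fu + incidence_ltfu * n_ltfu) / (n_fu + n_ltfu)


def relative_risk(intervention, control, ri_intervention, ri_control):
    """Relative risk (intervention / control) under the given assumptions.

    Each arm is a tuple (events_fu, n_fu, n_ltfu).
    """
    return (imputed_risk(*intervention, ri_intervention)
            / imputed_risk(*control, ri_control))


# Hypothetical trial: (events among followed up, followed up, lost to follow-up).
intervention_arm = (10, 100, 20)
control_arm = (20, 100, 20)

rr_same = relative_risk(intervention_arm, control_arm, 1, 1)
rr_unfavourable = relative_risk(intervention_arm, control_arm, 5, 1)
```

With these made-up numbers, the relative risk moves from 0.5 towards 1 when the incidence among intervention participants lost to follow-up is assumed to be five times higher. Whether an effect estimate stays significant under such shifts is the kind of vulnerability the protocol sets out to quantify.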

Complex Data-intensive Systems and Semantic Grid: Applications in Satellite Missions

Ontologies make it feasible to partially annotate and identify (by several criteria) relevant pieces of information within the files, instead of keeping an actual copy of all the contents of those files. Since the amount of data produced by every Satellite Mission is enormous, it is a tremendous advantage, on a routine basis, to keep track of only the relevant information, so that the actual information needed by the users can be accessed at a later step and the full content of the product requested from the archive at that time. On-demand access to Satellite system information requires retrieving files from several areas of the system (planning, MCMD generation, etc.) that can be spread over several physical locations. This is in accordance with the current evolution of the Processing Data Centers for Satellite products towards distributed computation and selective on-demand processing.

The dicode workbench: A flexible framework for the integration of information and web services

Workbenches are Web-based applications that integrate - at the level of the user interface - various data mining and collaboration support services and make them available to the users[r]

Yemen

The infant mortality rate has declined considerably, reaching 74.8 per 1000 live births in 2003. The neonatal mortality rate is 37.3 per 1000 live births, and the under-five mortality rate is 101.9 deaths per 1000 live births. The infant mortality rate is higher in rural areas (86.3) than in urban areas (70.6). Similarly, the under-5 mortality rate is much higher in rural (117.6) than in urban (87.3) areas. Infants with low birth weight comprise 32% of all infants, and the prevalence of underweight children under 5 years of age is 46%. Low weight among children is one of the major contributing factors to the high infant and under-5 mortality rates. Other contributing factors are: high fertility; illiteracy; young age of mother at first birth; high parity; closely spaced pregnancies and limited breastfeeding compounded with poverty; low coverage with quality health services and low access to safe water and sanitation; low immunization levels among children aged 12–23 months (56% in urban areas and 20% in rural areas); and limited availability of treatment for acute respiratory infection and diarrhoea in health facilities.

Designing information for organizations

when the managers do not acquire information, the best response of the CEO is very sensitive to the level of need for cooperation and to the level of bias of managers towards their own division. Here, three conditions are required to hold. First, condition 3.7 must hold for each manager to guarantee that managers do not acquire information. As discussed in the last section, condition 3.7 can hold when the local condition does not vary too much, when the managers are not too biased towards their own division and when the need for cooperation is relatively near to −1. Second, condition 3.10 must hold. This condition depends on the term φ 2 +4δφ+3δ 2(φ+2δ) 2 2 −δφ 2 , which is an implicit function of δ and φ. Once again, the use of simulations