We use WiredTiger [140] to store the data items on disk. WiredTiger is a high-performance key-value store that provides transactional access to data stored on disk, with support for snapshot isolation semantics. It uses a log-structured storage layout and takes advantage of lock-free data structures to sustain high transaction rates even at high degrees of concurrency.

4.7 Chapter Summary

In this chapter we looked at the REST+T extensions to HTTP that enable web and service-oriented applications to be written reliably against web-service and other endpoints. We described the challenges that current HTTP-based approaches face in providing transaction guarantees, and proposed a set of additional HTTP methods that allow an HTTP interface to offer stronger transaction guarantees. In Chapter 6 we describe the implementation of Tora in detail and show how easy it is to extend the Apache HttpClient library to support these additional methods. Finally, in Chapter 8 the REST+T API is evaluated against standard HTTP with conditional writes using a simple micro-benchmark.

Chapter 5

Cherry Garcia - The Protocol

In this chapter, we describe the design of Cherry Garcia¹ [40], a client-coordinated transaction processing protocol that enables application-defined transactions involving multiple data items that may reside in separate, possibly heterogeneous, data store instances. Applications can use a library implementing the Cherry Garcia protocol to access one or more data items stored in one or more heterogeneous data stores with transactional semantics.

The library implementing the protocol exposes an API that abstracts the data store instances as a class called Datastore. The applications access data items in the data stores using this interface via a transaction coordinator abstraction, a class called Transaction.

Each data record is addressable using its key, and its value can be accessed using an object of a class called Record. For simplicity, we assume that keys identifying data items are strings. However, in practice the key can be extended to support other simple or composite types.
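The shape of this API can be illustrated with a minimal sketch. Only the class names Datastore, Transaction and Record and the use of string keys come from the text; every method name and field below is a hypothetical choice made for illustration, and the in-memory transaction shown here simply buffers writes until commit. It does not implement the Cherry Garcia protocol and offers no isolation or failure handling.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One data item: a string key and its value (fields are illustrative)."""
    key: str
    value: object = None


class Datastore:
    """Stand-in for one data store instance, backed by a plain dict."""

    def __init__(self, name):
        self.name = name
        self._items = {}

    def get(self, key):
        return Record(key, self._items.get(key))

    def put(self, record):
        self._items[record.key] = record.value


class Transaction:
    """Toy coordinator: buffers writes and applies them on commit.

    Assumption for illustration only; this is NOT the Cherry Garcia
    protocol and provides neither isolation nor atomicity under failure.
    """

    def __init__(self):
        self._writes = []  # list of (Datastore, Record) pairs

    def read(self, store, key):
        # Return the transaction's own latest buffered write, if any.
        for buffered_store, record in reversed(self._writes):
            if buffered_store is store and record.key == key:
                return Record(record.key, record.value)
        return store.get(key)

    def write(self, store, record):
        self._writes.append((store, Record(record.key, record.value)))

    def commit(self):
        for store, record in self._writes:
            store.put(record)
        self._writes.clear()

    def abort(self):
        self._writes.clear()


# Usage: one transaction touching items in two separate stores.
accounts = Datastore("accounts")
orders = Datastore("orders")
accounts.put(Record("alice", 100))

tx = Transaction()
balance = tx.read(accounts, "alice")
tx.write(accounts, Record("alice", balance.value - 30))
tx.write(orders, Record("order-1", {"buyer": "alice", "amount": 30}))
tx.commit()
```

The point of the sketch is the division of roles: the application talks to a Transaction object, which in turn reads from and writes to any number of Datastore instances.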

We begin this chapter by describing the challenges in providing transactional access to multiple data items that reside across distributed data stores in Section 5.1. Next, in Section 5.2, we describe the intuition behind our proposed approach. This is followed by Section 5.3, in which we describe a typical user application that performs operations on multiple data items. In Sections 5.4 through 5.10, we define the Cherry Garcia protocol, state its prerequisites, and describe it in detail. Later, Section 5.11 describes techniques used to detect and avoid deadlocks between concurrent transactions. Next, optimisations to the protocol are discussed in Section 5.12, while Section 5.13 covers the possible failure scenarios and how the protocol handles them. In Section 5.14 we provide a sketch of a proof of correctness of our algorithm. We discuss approaches to extending Cherry Garcia to implement fully serializable transactions in Section 5.15 and list the algorithm for one of them. Finally, in Section 5.16 we discuss the correctness of this proposal.

¹ Cherry Garcia is the name of a Ben & Jerry’s ice-cream flavour with heterogeneous aspects of chocolate and fruit.

5.1 Challenges

As described in Section 2.7 of Chapter 2, modern distributed NoSQL data stores are designed with scalability and high availability in mind. The distributed architecture enables the data items to be spread across the storage nodes in the cluster using some form of mapping from data item keys to nodes. For such an architecture to perform well for item lookups, there must be little to no coordination across the nodes that store the data. In addition, the assignment of data items to nodes may change over time depending on various factors, including the number of nodes in the cluster, the data placement and load balancing algorithm, and the number of records stored in the cluster.
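As a rough illustration of a key-to-node mapping and of how the assignment can shift when the cluster changes, the sketch below places keys by hashing them modulo the number of nodes. This placement rule, the node names and the helper functions are assumptions chosen for brevity; real data stores commonly use other schemes, such as consistent hashing or range partitioning.

```python
import hashlib


def node_for(key, nodes):
    """Map a key to a node by hashing it modulo the cluster size.

    A lookup needs no coordination between nodes: any client that knows
    the node list can compute the placement on its own.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def moved_keys(keys, before, after):
    """Return the keys whose assigned node differs between two clusters."""
    return [k for k in keys if node_for(k, before) != node_for(k, after)]


keys = [f"user:{i}" for i in range(1000)]
three_nodes = ["node-a", "node-b", "node-c"]
four_nodes = three_nodes + ["node-d"]

moved = moved_keys(keys, three_nodes, four_nodes)
print(f"{len(moved)} of {len(keys)} keys change node when a fourth node joins")
```

Because two keys in the same transaction may map to different nodes, and may move independently as the cluster grows, a multi-item transaction cannot in general be handled by a single node.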

Support for transactions across multiple data items requires coordination of more than one node in the cluster for every transaction. CloudTPS [144], for instance, implements a key migration protocol to ensure that the Local Transaction Manager (LTM) can locally coordinate transactions across multiple records in a key-group on the same node. This does not scale as the number of transactions increases and the number of nodes in the storage cluster grows with the volume of data.

The sources of this scalability bottleneck are:

• Storage space: the need to keep extra transaction state for each data record
• Network communication: the messaging overhead of transaction protocol coordination

In order to avoid these performance issues, these systems typically provide lower transactional guarantees on the data items stored in them. For instance, Amazon’s Simple Storage Service (S3) provides only eventual consistency; essentially, there is no guarantee that when a record is written to the data store, its latest version will subsequently be read by the same or another application, particularly when the system is being actively updated. Eventually, when the system settles down, the value will be propagated to all replicated storage nodes, ensuring that all readers will see the same value.

A slightly higher level of guarantee is Timeline Consistency [28], where the system guarantees that, at any point, a reader will not get a version of the data older than the one it read in a previous read operation.
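The guarantee stated above, that a reader never observes an older version than one it has already read, can be sketched with versioned replicas and a reader that remembers the highest version it has seen for each key. This is only an illustration of the property: the classes, the per-key version counters and the client-side enforcement are all assumptions, and a system offering Timeline Consistency would typically enforce the property on the server side.

```python
class Replica:
    """One copy of the data; each key holds a (version, value) pair."""

    def __init__(self):
        self.data = {}

    def apply(self, key, version, value):
        # Updates only ever move a key forward in its version timeline.
        current_version, _ = self.data.get(key, (0, None))
        if version > current_version:
            self.data[key] = (version, value)

    def read(self, key):
        return self.data.get(key, (0, None))


class SessionReader:
    """Reader that refuses any version older than one it has already seen."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.last_seen = {}  # key -> highest version returned so far

    def read(self, key):
        floor = self.last_seen.get(key, 0)
        for replica in self.replicas:
            version, value = replica.read(key)
            if version >= floor:
                self.last_seen[key] = version
                return version, value
        raise RuntimeError("no replica has caught up to the last version read")


# Usage: one replica lags behind the other.
fresh, stale = Replica(), Replica()
for replica in (fresh, stale):
    replica.apply("page", 1, "draft")
fresh.apply("page", 2, "published")  # not yet propagated to `stale`

reader = SessionReader([fresh, stale])
first = reader.read("page")          # served by the up-to-date replica

reader.replicas = [stale, fresh]     # the lagging replica is now tried first
second = reader.read("page")         # the stale copy is skipped
```

Under plain eventual consistency the second read could have returned the older "draft" version from the lagging replica; the version floor rules that out.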

While these are weak guarantees, in practice they often work quite well. This is particularly true for write-once-read-many (WORM) applications such as web-content delivery, and for infrequently updated data.

However, in recent years, new systems have begun providing higher transactional guarantees on single data item updates for the data stored in them. Two examples of commercially available systems with single data item transactional access are Google Cloud Storage (GCS) and Windows Azure Storage (WAS). Other research prototypes and open source systems with similar capabilities have also been developed.
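One common form of single-item transactional access is a conditional write: an update succeeds only if the item is still at the version the writer last read. The sketch below models this with an in-memory store. The class, its method names and the integer version counter are hypothetical and do not reproduce the API of GCS, WAS or any other real system.

```python
import threading


class VersionedStore:
    """In-memory store where every item carries a version number."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = {}  # key -> (version, value)

    def get(self, key):
        with self._lock:
            return self._items.get(key, (0, None))

    def put_if_version(self, key, expected_version, value):
        """Write only if the item is still at `expected_version`."""
        with self._lock:
            current_version, _ = self._items.get(key, (0, None))
            if current_version != expected_version:
                return False  # someone else updated the item first
            self._items[key] = (current_version + 1, value)
            return True


def increment(store, key, retries=10):
    """Read-modify-write loop that retries when the conditional write fails."""
    for _ in range(retries):
        version, value = store.get(key)
        if store.put_if_version(key, version, (value or 0) + 1):
            return True
    return False


# Usage: a write based on a stale version is rejected.
store = VersionedStore()
increment(store, "counter")
increment(store, "counter")
stale_write_accepted = store.put_if_version("counter", 1, 99)
```

A primitive of this kind gives atomicity for one item at a time; it says nothing about updates that span several items, which is the gap the rest of this chapter addresses.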

There are various approaches to implementing transactions across multiple data items. Broadly, these can be classified into three categories.

The first implements transaction support in the distributed data store infrastructure itself. It is more suitable for homogeneous systems and makes possible performance optimisations that are harder to implement in heterogeneous systems.

The second involves implementing the transaction coordination in the middleware between the application and the data store. Depending on the middleware and its architecture, this can make the middleware the performance and scalability bottleneck. This approach is suitable for access to heterogeneous data stores, even though adding and removing data stores can have significant procedural overhead.

The last technique involves coordinating transactions in the client application. We use this technique to implement transactional access to multiple data items in heterogeneous data stores.

The techniques described here are discussed in closer detail in Section 2.8 of Chapter 2 and Section 3.6 of Chapter 3. In the remaining part of this chapter we describe our solution in further detail.
