This work investigates how performance and resource consumption can be improved in structured P2P overlays. The problem this chapter addresses is that an optimal maintenance interval in P2P overlays cannot be predicted and thus has to be adapted dynamically. Some of the related work identified shares this work’s high-level objective of optimising performance and resource consumption and follows a similar approach (section 4.3.1). Other related work shares the high-level objective but addresses it via different approaches which retain statically configured maintenance intervals (section 4.3.2). All of the latter potentially face the same problem, namely that an ideal static interval cannot be predicted.
4.3.1 Dynamic Control of P2P Overlays
The work which is most closely related to this thesis is based on the Pastry overlay [57]. The authors propose to improve both performance and resource consumption in structured P2P overlays by dynamically adapting the interval between maintenance operations. The optimisation focus, however, lies only on performance, because the maintenance interval is adapted in order to achieve a specific minimum performance. Thus, their manager will not change the maintenance interval once this minimum performance is achieved. That means that in a situation in which churn keeps decreasing and the minimum performance is reached, their manager ceases to increase the maintenance interval. In such a situation an increasing maintenance interval correlates directly with decreasing unnecessary use of network resources. Thus, resource usage is only improved up to a fixed point by their manager. Further comparison of [57] and the approach of this thesis is provided in chapter 7, which reports on the experimental evaluation of the approach introduced here.
The work in [7] focuses on Chord’s stabilize operation and is based on a churn estimation mechanism. Although this is introduced as a first step towards a self-tuned maintenance mechanism, the dynamic adaptation of maintenance intervals is neither specified nor evaluated. The churn estimation mechanism is based on an analytical model into which data gathered during stabilize executions is fed. A potential problem of this approach is the scope of the input data for the churn estimator. As only data gathered during maintenance operations is processed, a high interval may result in outdated information being used by the churn estimator. Thus, in such a case, this approach potentially produces estimates of the degree of churn which do not represent the current situation.
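The staleness concern raised for [7] can be illustrated with a minimal sketch. It does not reproduce the analytical model of [7]; the class `StabilizeFedEstimator`, its simple failures-per-second estimate and the chosen numbers are assumptions made purely for illustration. The only property taken from the text is that the estimator receives data exclusively when stabilize runs.

```python
from dataclasses import dataclass


@dataclass
class StabilizeFedEstimator:
    """Hypothetical estimator that only receives data when stabilize runs."""

    last_update: float = 0.0          # time of the most recent stabilize run
    failures_per_second: float = 0.0  # current churn estimate (assumed metric)

    def on_stabilize(self, now: float, failures_seen: int) -> None:
        """Refresh the estimate from failures observed since the last run."""
        elapsed = now - self.last_update
        if elapsed > 0:
            self.failures_per_second = failures_seen / elapsed
        self.last_update = now

    def age(self, now: float) -> float:
        """Seconds since the estimate was last refreshed."""
        return now - self.last_update


# Assumed scenario: stabilize runs every 600 s and sees 6 failures at t = 600 s.
estimator = StabilizeFedEstimator()
estimator.on_stabilize(now=600.0, failures_seen=6)
stale_estimate = estimator.failures_per_second   # 6 failures / 600 s
age_before_next_run = estimator.age(now=1199.0)  # just before the next run
```

In this sketch the estimate stays unchanged between two stabilize runs, so its age can grow to almost one full maintenance interval regardless of how the actual churn develops in the meantime.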
4.3.2 Static Control of P2P Overlays
In [8] an adaptation of structured P2P overlays based on the Kademlia [59] overlay is proposed. The maintenance interval which governs Kademlia’s maintenance operations is not adapted dynamically. Instead, an additional maintenance mechanism is developed which improves performance and which adds a maintenance overhead, on top of the original one, that increases or decreases depending on the exhibited churn. In more detail: Kademlia is a combination of a P2P overlay network and a Distributed Hash Table (DHT). Kademlia’s P2P routing protocol maintains a node’s routing state lazily. Its DHT component, however, executes a periodic replica maintenance operation, during which potential routing-state errors are repaired. The modification proposed in [8] is based on a distributed data structure for maintaining the addresses of failed nodes. As the number of failed node addresses in this data structure corresponds to the membership churn, the maintenance overhead for keeping this list up to date correlates with the membership churn. Thus, the resulting network usage due to maintenance varies dynamically, but the maintenance interval is not adapted.
In [94] an additional layer on top of a Chord overlay is proposed. Here again the maintenance intervals are not adapted but an additional maintenance mechanism is introduced. This is based on a second overlay layer for finger table maintenance. This layer consists of a manually selected set of super nodes with better stability characteristics than the average P2P participants. P2P nodes which are not in the maintenance layer do not maintain their finger tables. However, they periodically maintain their immediate neighbours in the key space. The super nodes are informed about failed nodes, which triggers a distributed finger-table repair process.
In the FS-Chord project [43] a two-step joining protocol for Chord is proposed to reduce maintenance overhead and to improve stability. The modified joining protocol works as follows: a node that has sent a join request is only accepted as an overlay node after a fixed time during which the new node’s availability is monitored. If the new node does not fail during this time, it is granted permission to fully join the overlay network. This may stop unstable nodes from joining the overlay network. Consequently, any network usage that maintenance work for such unstable nodes would have caused is saved. This approach does not, however, enable Chord to adapt to a change in the environmental conditions after the monitoring period has passed.
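The two-step join described for FS-Chord can be sketched roughly as follows. This is not the protocol of [43]; the class `ProbationaryJoin`, its method names and the 60-second monitoring period are hypothetical and only mirror the two steps named in the text: monitoring for a fixed time, then admission if no failure was observed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ProbationaryJoin:
    """Hypothetical sketch of a join that is granted only after monitoring."""

    probation: float                                         # fixed monitoring time in seconds
    pending: Dict[str, float] = field(default_factory=dict)  # node id -> time of join request
    members: Set[str] = field(default_factory=set)           # fully joined nodes

    def request_join(self, node_id: str, now: float) -> None:
        """Step 1: record the join request and start monitoring the node."""
        self.pending[node_id] = now

    def report_failure(self, node_id: str) -> None:
        """A candidate that fails while being monitored is dropped."""
        self.pending.pop(node_id, None)

    def tick(self, now: float) -> List[str]:
        """Step 2: admit every candidate that survived the monitoring time."""
        admitted = [n for n, t in self.pending.items() if now - t >= self.probation]
        for node_id in admitted:
            del self.pending[node_id]
            self.members.add(node_id)
        return admitted


overlay = ProbationaryJoin(probation=60.0)
overlay.request_join("a", now=0.0)
overlay.request_join("b", now=0.0)
overlay.report_failure("b")        # b fails during the monitoring period
admitted = overlay.tick(now=60.0)  # only a is granted a full join
```

Once a node has been admitted, nothing in this sketch reacts to later changes in its behaviour, which corresponds to the limitation noted above.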
In [9] a model of a Chord network is developed which shows that increasing the size of the successor list in a Chord overlay network improves stability. The authors suggest dynamically adapting the successor list length. This mechanism is, however, not further specified or evaluated. Even though their proposal can be considered a dynamic adaptation of the peer-set, it is not based on the same principles as the approach introduced here. It may increase the stability of a Chord ring up to a certain level, but may also reach the point where maintenance is not executed frequently enough.
In [51] a number of modifications of Chord are introduced to improve stability in the presence of membership churn. None of the modifications is, however, evaluated. They comprise periodic rejoins to maintain the structure of the Chord overlay network in the event of node failures. Additionally, a modified lookup algorithm is proposed which improves stability in the event of failed successors. This involves nodes discovered during lookup operations being used for peer-set maintenance. The authors also suggest decreasing the stabilize interval when errors are detected, but no rule is specified for increasing the interval.
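The consequence of specifying only a rule for decreasing the interval can be shown with a small sketch. The halving factor, the lower bound of one second and the function names are assumptions made for illustration; [51] does not specify them.

```python
from typing import Iterable, List


def one_sided_step(interval: float, errors_detected: bool,
                   shrink: float = 0.5, floor: float = 1.0) -> float:
    """Shorten the interval when errors are detected; otherwise keep it.

    shrink and floor are hypothetical parameters. There is deliberately no
    branch that lengthens the interval again.
    """
    if errors_detected:
        return max(floor, interval * shrink)
    return interval


def trace(initial: float, observations: Iterable[bool]) -> List[float]:
    """Apply one_sided_step to a sequence of error observations."""
    intervals = []
    interval = initial
    for errors_detected in observations:
        interval = one_sided_step(interval, errors_detected)
        intervals.append(interval)
    return intervals


# Errors are detected twice; afterwards the overlay is error-free.
history = trace(30.0, [False, True, False, True, False, False])
```

Under these assumptions the interval never recovers after a period of errors, so maintenance traffic stays at the higher level even when the errors have ceased.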
In [52] it is experimentally evaluated whether a modified stabilize algorithm can improve stability in a Chord network in the presence of high churn. The Chord protocol is modified to maintain a list of predecessors and successors. The interval, however, is kept at a fixed value. It is suggested that a small interval is desirable in networks with high membership churn, and thus stated that there is a correlation between the interval and the degree of churn. However, it is not proposed to dynamically adapt the maintenance interval.
In [53] Chord is compared with an overlay network based on a hierarchical grouping schema. The maintenance mechanism of the hierarchical groups overlay network is not further specified. The authors experimentally evaluate how different stabilize intervals affect the lookup error rate in Chord but do not evaluate or propose dynamic adaptations.
In [55] Chord’s maintenance mechanism is analysed and it is concluded that its execution rate is an important configuration factor. The authors analyse the correlation between maintenance rate and performance, stress the importance of conservative network usage and ask whether an optimum maintenance rate can be learned. An analytic model is developed for computing a lower bound on the maintenance rate given the time interval during which 50% of the P2P participants leave an overlay network (this time is referred to as half-life).
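The notion of half-life can be made concrete with a short sketch. It does not reproduce the analytic model of [55] or its lower bound. It only assumes, for illustration, that nodes leave at a constant rate, so that the population of original participants decays exponentially; the function names and the example numbers are hypothetical.

```python
def fraction_remaining(elapsed: float, half_life: float) -> float:
    """Fraction of the original participants still present after `elapsed`.

    Assumes exponential decay: after one half-life 50% remain,
    after two half-lives 25% remain, and so on.
    """
    return 0.5 ** (elapsed / half_life)


def expected_departures(nodes: int, interval: float, half_life: float) -> float:
    """Expected number of departures within one maintenance interval."""
    return nodes * (1.0 - fraction_remaining(interval, half_life))


# Assumed example: 1000 nodes, half-life of one hour.
left_after_one_hour = expected_departures(1000, 3600.0, 3600.0)
left_after_five_minutes = expected_departures(1000, 300.0, 3600.0)
```

Under this assumption, the shorter the maintenance interval is relative to the half-life, the fewer departures accumulate between two maintenance runs, which gives an intuition for why a minimum maintenance rate can be tied to the half-life.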
In [54] Chord joining and leaving protocols are modified to execute faster. A positive aspect of this approach is that nodes may be able to fully establish a valid peer-set sooner than in the original Chord protocol. Its applicability may be limited as the authors assume a fault-free overlay network in which nodes leave only voluntarily.