We now have the basic ingredients to implement range propagation, both from parent to child and from child to parent. Range propagation can either stem from a SID range, as introduced, for example, by selection push-down using a min-max index, or from RID ranges, as introduced by range partitioning on the current table count during parallelization of query plans. The resulting parent and child ranges respect the original clustering, allowing us to employ highly efficient merge-based join plans.

We restrict our discussion to the most complex, and interesting, range propagation scenario: from a child-side RID range to the corresponding parent RID range. Our goal is to generate the virtual FRID column for a given child-side range, i.e. for each child tuple in the range, fill in the RID of the matching tuple in the parent, without performing a join. This requires range propagation to find the corresponding range in the parent, as that is where the JI column and its updates are stored. Furthermore, the start of the parent range gives us a JI count, but we likely have to start decompressing at a certain offset within that count, as the corresponding child tuple is not necessarily the first in its cluster. The process is outlined in Algorithm 15, where we restrict ourselves to the start offset of the RID range, ridC, as the end is simply a matter of counting.

Algorithm 15 takes as input a child RID, ridC, which it first converts to the SID associated with that tuple, sidC. At line 2, we then convert this child SID to a parent SID. For this, we use JIS.childLoToParentLo(sidC)⁵, which finds the partition sidC falls into, and returns the current value of the MIN_P field in the corresponding JIS entry, which is a guaranteed lower bound for the parent-side SID we are searching for. We then use Algorithm 14 to convert this (potentially dirty) minP SID to the nearest stable sync point.

Algorithm 15 JI.initializeDecompressedScan(ridC)

Initialize a scan over a virtual FRID column, starting from an arbitrary LRID in the child, provided as ridC.

1: sidC ← child.RidToSid(ridC)
2: minP ← JIS.childLoToParentLo(sidC)
3: (sidPsync, sidCsync) ← JIS.findSyncPoint(minP)
4: ridPsync ← parent.SidToRidLo(sidPsync)
5: ridCsync ← child.SidToRidLo(sidCsync)
6: skipC ← ridC − ridCsync
7: this.initScan(ridPsync)
8: this.skipScan(skipC)
9: return

Now that we have a stable SID sync point, we need to convert it to a conservative RID sync point by including any PDT inserts that reside at sidPsync in the parent or at sidCsync in the child. Note that the stability of the sync point guarantees that none of the child inserts refers to an earlier partition, so that the first of them is a cluster head, either for the original parent-side sync tuple or for a newly inserted one (at sidPsync).

Given the pessimistic RID sync point, at line 6 we compute how many tuples we must skip to reach our ridC of interest. We then initiate a (decompressing) join index scan from the safe sync RID. The skipScan routine performs a Merge of the JI column, producing up-to-date counts, and discards the first skipC tuples' worth of cumulative counts. The join index scan is now positioned at the destination ridC, and we are ready to produce uncompressed FRIDs.
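The decompress-and-skip step can be illustrated with a minimal Python sketch. It assumes that the JI column is already merged into a plain list of up-to-date per-parent counts and that the sync point (ridPsync, ridCsync) has been found beforehand; the function names and the list-based representation are hypothetical simplifications and do not model PDTs or the JIS.

```python
from itertools import islice
from typing import Iterator, List


def decompress_frids(ji_counts: List[int], start_parent_rid: int = 0) -> Iterator[int]:
    """Yield one parent RID (FRID) per child tuple, in child order.

    ji_counts[p] is assumed to be the up-to-date number of child tuples
    that reference parent RID p (toy stand-in for the merged JI column).
    """
    for parent_rid in range(start_parent_rid, len(ji_counts)):
        for _ in range(ji_counts[parent_rid]):
            yield parent_rid


def init_decompressed_scan(
    ji_counts: List[int], rid_p_sync: int, rid_c_sync: int, rid_c: int
) -> Iterator[int]:
    """Position a FRID scan at child RID rid_c, starting from a sync point.

    (rid_p_sync, rid_c_sync) is assumed to be a sync point: the child tuple
    at rid_c_sync is the first one referencing the parent at rid_p_sync.
    """
    skip_c = rid_c - rid_c_sync                 # line 6 of Algorithm 15
    scan = decompress_frids(ji_counts, rid_p_sync)
    return islice(scan, skip_c, None)           # discard the first skip_c FRIDs


# Parent 0 has two children, parent 1 none, parent 2 three, parent 3 one.
counts = [2, 0, 3, 1]
# Sync point: parent RID 2 pairs with child RID 2; we want to start at child RID 4.
frids_from_4 = list(init_decompressed_scan(counts, 2, 2, 4))
```

Here child RIDs 4 and 5 reference parents 2 and 3 respectively, so the scan yields exactly those two FRIDs without a join.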

7.5 Update Operators

Now that we know how to add updates to a PDT and the impact they might have on indexing structures, we are ready to provide a high-level outline of the full update operators, Insert, Delete and Modify.

7.5.1 Insert

The Insert operator adds a batch of tuples to a table. The batch should be sorted according to the sort key ordering of the destination table, and enumerated by the RID positions to insert each tuple at. The RID positions can be obtained by MergeFindInsertRID (see Section 6.4.6), the output of which can be fed directly into Insert. If we insert into the child side of a join index association, each new tuple should furthermore be annotated with the parent-side RID (FRID) of the tuple it refers to. These FRIDs are obtained from a foreign-key join between the insert batch and the parent table.

5 The naming of this routine indicates that we are converting the low end of the child-side range, and wish to convert this to a conservative (safe) lower bound in the parent. Similar routines exist for high ends, and also for the inverse direction, from parent to child.

162 CHAPTER 7. INDEX MAINTENANCE

Algorithm 16 Table.Insert(tuples, rids)

Inserts an ordered batch of tuples into a Table (this) at the RID positions given in rids. The batch should be ordered on the sort key of the destination table, which implies that rids is non-decreasing. If this acts as the referencing side in a join index relationship, each tuple must be annotated with FRID, the parent-side RID of the tuple being referenced.

1: i ← 0
2: for (tuple, rid) in (tuples, rids) do
3:     rid ← rid + i
4:     tpdt.AddInsert(rid, tuple)
5:     sid ← this.RidToSid(rid)
6:     minmax.updateAll(sid, tuple)
7:     if isJoinParent(this) then
8:         tpdt.InitJoinIndexCounts(rid)
9:     end if
10:     if isJoinChild(this) then
11:         frid ← tuple["FRID"]
12:         ji.parent.tpdt.IncrementJoinCount(frid, 1)
13:         fsid ← ji.parent.RidToSid(frid)
14:         lsid ← this.RidToSid(rid)
15:         ji.jis.TestAndSetMinForeignSid(fsid, lsid)
16:     end if
17:     i ← i + 1
18: end for

Algorithm 17 JIS.TestAndSetMinForeignSid(fsid, lsid)

Checks whether we should update the MIN_P field of the JIS partition lsid falls into. If fsid is smaller than the current MIN_P value, we update it to fsid.

1: partitionIdx ← lsid / this.partitionSize
2: mutex_lock(this.mutex)
3: if fsid < this.partition[partitionIdx].MIN_P then
4:     this.partition[partitionIdx].MIN_P ← fsid
5: end if
6: mutex_unlock(this.mutex)

The Insert operator itself is outlined in Algorithm 16, where we iterate over the tuples in the insert batch. We add each tuple to the PDT, adjusting the insert RID by i to account for the shift introduced by tuples inserted during earlier iterations. The next step finds the corresponding SID, and uses it to update the global min-max index of the destination table (using a mutex for protection from concurrent modifications).
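The RID adjustment in line 3 of Algorithm 16 can be seen in isolation with a small Python sketch. It assumes a plain list as a stand-in for the table image (the actual operator records inserts in the trans-PDT instead), and the function name insert_batch is hypothetical.

```python
from typing import List, Sequence


def insert_batch(table: List[str], tuples: Sequence[str], rids: Sequence[int]) -> List[int]:
    """Insert tuples at the given RIDs and return the RIDs actually used.

    rids must be non-decreasing and refer to positions in the table image
    as it was before the batch started (toy model, no PDT involved).
    """
    final_rids = []
    for i, (tup, rid) in enumerate(zip(tuples, rids)):
        shifted = rid + i            # compensate for the i tuples inserted so far
        table.insert(shifted, tup)
        final_rids.append(shifted)
    return final_rids


table = ["a", "c", "e"]
# "b" goes before "c" (RID 1), "d" before "e" (RID 2), "f" at the end (RID 3).
used = insert_batch(table, ["b", "d", "f"], [1, 2, 3])
```

Because the batch is ordered, every earlier insert lands at or before the current position, so adding i is sufficient and the table can be modified in place.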

If the destination table participates as a parent in one or more join index associations, we initialize the join index (JI) counts to zero. For child-side inserts, we increment the +JI field of the referenced parent tuple, at FRID, by one, to account for the new reference. Finally, we convert both FRID and the local insert RID, rid, to (FSID, LSID), which we pass to the JIS.TestAndSetMinForeignSid routine (Algorithm 17) to maintain MIN_P of the JIS partition LSID falls into⁶. As with min-max, the JIS index is maintained “optimistically”, meaning that we directly manipulate the global data structure, accepting potential pollution in case a transaction happens to abort.
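A minimal Python sketch of the MIN_P maintenance in Algorithm 17 follows. The class name JoinIndexSummary, the list-based partition layout, and the use of threading.Lock in place of the mutex are assumptions made for illustration only.

```python
import threading
from typing import List


class JoinIndexSummary:
    """Toy JIS: one MIN_P value per fixed-size partition of child SIDs."""

    def __init__(self, partition_size: int, min_p: List[int]) -> None:
        self.partition_size = partition_size
        self.min_p = list(min_p)          # current MIN_P per partition
        self._mutex = threading.Lock()    # guards concurrent updates

    def test_and_set_min_foreign_sid(self, fsid: int, lsid: int) -> None:
        """Lower MIN_P of the partition containing lsid if fsid is smaller."""
        idx = lsid // self.partition_size
        with self._mutex:
            if fsid < self.min_p[idx]:
                self.min_p[idx] = fsid


jis = JoinIndexSummary(partition_size=4, min_p=[0, 10, 25])
jis.test_and_set_min_foreign_sid(fsid=7, lsid=5)    # partition 1: 10 becomes 7
jis.test_and_set_min_foreign_sid(fsid=12, lsid=6)   # partition 1: unchanged
jis.test_and_set_min_foreign_sid(fsid=30, lsid=9)   # partition 2: unchanged
```

Note that the update only ever lowers MIN_P. If the inserting transaction later aborts, the value may be lower than strictly necessary, but under the reading above it still serves as a lower bound, only a less tight one.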

7.5.2 Delete

Delete is similar to Insert in that we need to manipulate join index counts. However, we do not perform maintenance on the min-max and JIS indices. When deleting from the parent table in a join index association, we must ensure that referential integrity constraints are not violated: a parent tuple may not have any child-side references at the time we try to delete it. Given the reference counts in the JI column, we can easily verify that the current count is 0, as is done for all incoming join indices in Algorithm 18.

Algorithm 18 Table.Delete(rids, frids)

Deletes the tuples at RID positions given in rids from a table (this). The optional frids argument must be provided in case we delete from the referencing (i.e. child) side in a join index association, and should contain, for each deleted tuple, the parent-side RID of the tuple being referenced.

1: qpdt ← pdt_create()
2: for (rid, frid) in (rids, frids) do
3:     if isJoinParent(this) then
4:         for jiColumn ← this.NextJoinIndexColumn() do
5:             if jiColumn.GetJoinCount(rid) ≠ 0 then
6:                 return “ERROR: referential integrity violation”
7:             end if
8:         end for
9:     end if
10:     if isJoinChild(this) then
11:         ji.parent.tpdt.DecrementJoinCount(frid, 1)
12:     end if
13:     qpdt.AddDeleteBySid(rid)
14: end for
15: tpdt.Propagate(qpdt)
16: qpdt.destroy()

When deleting from a child-side table, for every deleted tuple we also need to know the foreign RID (FRID) that identifies the referenced tuple in the parent. Those FRIDs can be readily obtained by scanning along the up-to-date and decompressed join index in a Delete plan. They are used in the call to DecrementJoinCount to decrement the -JI field of the referenced tuple in the parent's trans-PDT.
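The two join index checks in Algorithm 18 can be sketched in Python as follows. The sketch assumes each JI column is a plain list of current reference counts per parent RID; in the design described here the counts are instead adjusted through the parent's trans-PDT, so the function names and data layout are hypothetical.

```python
from typing import Iterable, List, Sequence


class ReferentialIntegrityError(Exception):
    """Raised when a parent tuple that is still referenced is deleted."""


def check_parent_deletes(ji_columns: Sequence[Sequence[int]], rids: Iterable[int]) -> None:
    """Verify that no incoming join index still references the given parent RIDs."""
    for rid in rids:
        for col_no, counts in enumerate(ji_columns):
            if counts[rid] != 0:
                raise ReferentialIntegrityError(
                    f"parent RID {rid} has {counts[rid]} reference(s) in JI column {col_no}"
                )


def apply_child_deletes(parent_counts: List[int], frids: Iterable[int]) -> None:
    """Account for deleted child tuples by lowering the referenced parents' counts."""
    for frid in frids:
        parent_counts[frid] -= 1


counts = [2, 0, 1]                       # references per parent RID
check_parent_deletes([counts], [1])      # parent 1 is unreferenced: allowed
try:
    check_parent_deletes([counts], [0])  # parent 0 still has two children
    rejected = False
except ReferentialIntegrityError:
    rejected = True
apply_child_deletes(counts, [0, 0])      # delete both children of parent 0
check_parent_deletes([counts], [0])      # now the parent delete is allowed
```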

6 In a real-world “vectorized” implementation, we first gather a batch of (FSID, LSID)


Algorithm 18 also shows the usage of a fourth PDT layer, the query-PDT, identified by qpdt. It starts out empty, and contains updates with respect to the RID image produced by the current trans-PDT, i.e. the SIDs in qpdt refer to the RID enumeration generated by a merge of the trans-PDT. The purpose of the query-PDT is to provide a query-local isolation layer that effectively sorts an arbitrary (i.e. unordered) sequence in rids on the fly. Recall that for Insert, where new tuples either come in sort-key order or are appended to the end of a table, we had to adjust the destination RID to compensate for previously inserted tuples, allowing in-place modification of the trans-PDT. If such an ordering cannot be assumed, we avoid direct manipulation of the trans-PDT, and treat the input rids as SIDs of the query-PDT, as illustrated by our use of AddDeleteBySid. When all input RIDs are processed, we use Propagate to migrate the updates from the query-PDT into the trans-PDT.
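The motivation for buffering unordered deletes can be shown with a toy Python comparison. It assumes a plain list as the RID image and a set of positions as a stand-in for the query-PDT; both functions are hypothetical illustrations and do not implement an actual PDT or its Propagate routine.

```python
from typing import Iterable, List


def delete_in_place_naive(table: List[str], rids: Iterable[int]) -> List[str]:
    """Apply deletes directly: later RIDs are misinterpreted after each shift."""
    out = list(table)
    for rid in rids:
        del out[rid]
    return out


def delete_buffered(table: List[str], rids: Iterable[int]) -> List[str]:
    """Collect deletes against a fixed image, then apply them in one ordered pass."""
    pending = set()                      # stand-in for the query-PDT
    for rid in rids:
        pending.add(rid)                 # rid is interpreted against the fixed image
    # Stand-in for Propagate: a single positional pass over the image.
    return [tup for pos, tup in enumerate(table) if pos not in pending]


image = ["a", "b", "c", "d", "e"]
naive = delete_in_place_naive(image, [1, 3])      # intends to delete "b" and "d"
buffered = delete_buffered(image, [3, 1])         # same intent, arbitrary order
```

In the naive variant, deleting RID 1 shifts the remaining tuples, so RID 3 then removes "e" instead of "d". The buffered variant is insensitive to input order because every RID refers to the same image.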

7.5.3 Modify

All we need to do in the case of Modify is to add a PDT update for each attribute being altered, and to inform the min-max index about the changes to relevant columns, so that it can check for changes to minimum or maximum attribute values in the relevant SID range. The process is summarized in Algorithm 19. Modify never changes the SID or RID enumeration of tuples, and modifications of sort key attributes are rewritten into a Delete followed by an Insert.

Algorithm 19 Table.Modify(colnos, valueLists, rids)

Updates a list of attributes identified by colnos, for all tuples at positions in rids with the corresponding attribute values from valueLists.

1: for (valueList, rid) in (valueLists, rids) do
2:     for i = 0; i < colnos.size(); i = i + 1 do
3:         tpdt.AddModify(rid, colnos[i], valueList[i])
4:         sid ← this.RidToSid(rid)
5:         minmax.updateColumn(sid, colnos[i], valueList[i])
6:     end for
7: end for
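As an illustration of Algorithm 19, the following Python sketch pairs a toy min-max index with a Modify loop. It assumes one (min, max) pair per column and per fixed-size SID partition, bounds that are only ever widened, and SID equal to RID; the class MinMaxIndex, its layout, and the row-list table are hypothetical simplifications.

```python
import threading
from typing import Dict, List, Sequence, Tuple


class MinMaxIndex:
    """Toy min-max index: (partition, column) -> (min, max), widened on update."""

    def __init__(self, partition_size: int) -> None:
        self.partition_size = partition_size
        self.bounds: Dict[Tuple[int, int], Tuple[int, int]] = {}
        self._mutex = threading.Lock()

    def update_column(self, sid: int, colno: int, value: int) -> None:
        key = (sid // self.partition_size, colno)
        with self._mutex:
            lo, hi = self.bounds.get(key, (value, value))
            self.bounds[key] = (min(lo, value), max(hi, value))


def modify(rows: List[List[int]], index: MinMaxIndex, colnos: Sequence[int],
           value_lists: Sequence[Sequence[int]], rids: Sequence[int]) -> None:
    """Overwrite the given columns at each RID and notify the min-max index."""
    for value_list, rid in zip(value_lists, rids):
        for colno, value in zip(colnos, value_list):
            rows[rid][colno] = value                 # stand-in for tpdt.AddModify
            index.update_column(rid, colno, value)   # toy assumption: SID == RID


rows = [[1, 10], [2, 20], [3, 30]]
index = MinMaxIndex(partition_size=2)
for sid, row in enumerate(rows):                     # seed the index from the data
    for colno, value in enumerate(row):
        index.update_column(sid, colno, value)

modify(rows, index, colnos=[1], value_lists=[[5], [99]], rids=[0, 2])
```

After the call, column 1 of partition 0 has bounds (5, 20) and partition 1 has (30, 99). The old value 30 no longer exists in partition 1, yet its lower bound stays at 30: in this sketch the index remains conservative rather than exact.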

7.6 Concurrency Issues

In Section 7.5 we described optimistic maintenance of the global min-max and JIS indices belonging to a table, where we used simple mutual exclusion mechanisms to avoid corruption caused by concurrent updates. There are, however, more subtle concurrency issues that are semantic in nature, as they are caused by the inherent volatility of positional information under updates. Section 7.6.2 discusses the issue of maintaining indices in a second database image, as generated during a background checkpointing transaction. In Section 7.6.3 we discuss obstacles during serialization of trans-PDTs from the child side of a join index association. Solutions to both problems rely on a generic solution to the problem of matching child-side PDT inserts to the (volatile!) foreign RID of the parent tuple they reference, at any moment in time, without performing a join. Therefore, Section 7.6.1 first presents a solution to that problem.
