The two important use scenarios for event matching are:
• Event forwarding to neighbouring routers.
• Event forwarding to local clients.
These two scenarios differ, because generalized filter sets are used for neighbouring routers, but local filtering requires specific filters. Remote filter sets are envisaged to be considerably smaller than local filter sets. Moreover, for neighbouring brokers the filters are typically stored in a poset, whereas local filtering may require optimized filtering structures. Optimized matching structures may also be used for filter sets from neighbouring brokers.
3.7 Poset-based Matching 45
Algorithm 4 Subscription message handlers for advertisement semantics.
IncomingSub(f, source)
1. Add (f, source) to Ps.
2. Calculate neighbours_s using Pa and Equation 3.3.
3. Send subscription message to forwards(f).
IncomingUnsub(f, source)
1. Remove (f, source) from Ps.
2. Forward unsubscription following the procedure in Algorithm 2. The forwards set may be empty if there are subscriptions from other neighbours that cover f. The forwards sets of subscriptions covered by f may change, which may require the forwarding of new subscriptions. A subscription is uncovered if its forwards set gains an additional element due to the removal of a covering filter.
The counter algorithm is the basic mechanism for efficiently matching events [28, 34, 88, 90, 104]. It keeps track of the number of attribute filter matches for each filter. Counting is based on the fact that filters are conjunctions of attribute filters. Typically, the counting algorithm is divided into a preliminary elimination phase, in which unmatchable filters and interfaces are removed, and a counting phase. If the counter of a filter becomes equal to the number of attribute filters in the filter, the filter matches the input notification and the corresponding interface is added to a set of output interfaces. The counter algorithm returns either the identifiers of matching filters or a set of output interfaces. Optimized matchers use efficient data structures for different predicates, for example hash table lookup for equality tests and interval trees [40] for range queries.
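The counting phase can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the text: an attribute filter is modelled as a hypothetical (name, operator, value) triple, the class name `CounterMatcher` and its methods are invented for illustration, attribute values are assumed to be mutually comparable, and the elimination phase and the per-predicate index structures (hash tables, interval trees) are omitted, so the distinct attribute filters are simply scanned in turn.

```python
import operator
from collections import defaultdict

# An attribute filter is a hashable triple (attribute name, operator, value).
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
       "<=": operator.le, ">=": operator.ge}


class CounterMatcher:
    """Counting matcher for filters that are conjunctions of attribute filters."""

    def __init__(self):
        self.sizes = {}                 # filter id -> number of attribute filters
        self.interfaces = {}            # filter id -> output interface
        self.index = defaultdict(set)   # attribute filter -> ids of filters using it

    def add_filter(self, filter_id, attribute_filters, interface):
        distinct = set(attribute_filters)
        self.sizes[filter_id] = len(distinct)
        self.interfaces[filter_id] = interface
        for af in distinct:
            self.index[af].add(filter_id)

    def match(self, notification):
        """Return the set of output interfaces whose filters match."""
        counters = defaultdict(int)
        output = set()
        for (name, op, value), filter_ids in self.index.items():
            # Each distinct attribute filter is evaluated once per notification.
            if name in notification and OPS[op](notification[name], value):
                for fid in filter_ids:
                    counters[fid] += 1
                    # The filter matches once all its attribute filters matched.
                    if counters[fid] == self.sizes[fid]:
                        output.add(self.interfaces[fid])
        return output


matcher = CounterMatcher()
matcher.add_filter("f1", [("price", "<", 100), ("symbol", "=", "ACME")], "if1")
matcher.add_filter("f2", [("price", "<", 50)], "if2")
result = matcher.match({"symbol": "ACME", "price": 70})
```

In this example only `f1` reaches its full count, so `result` contains the single interface `if1`. Sharing one index entry between all filters that use the same attribute filter is what keeps the number of predicate evaluations below the total number of attribute filters.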
The data structure and posets in general have two interesting properties for matching that follow from the definition of the covering relation (Properties 3.12 and 3.13).
Property 3.12 If a node n1 matches a notification then all the predecessors of n1 must also match the notification.
Property 3.13 If a node n1 does not match a notification then none of the descendants of n1 matches the notification.
The node in this case may be any object that is comparable using the covering relation, for example: filters, attribute filters, and disjuncts.
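Both properties can be checked on a small example. The sketch below assumes a hypothetical `RangeFilter` over a single numeric value, for which the covering relation reduces to interval containment; the class and its method names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeFilter:
    """Hypothetical filter: matches values with low <= value <= high."""
    low: float
    high: float

    def matches(self, value: float) -> bool:
        return self.low <= value <= self.high

    def covers(self, other: "RangeFilter") -> bool:
        # self covers other when every value matched by other is matched by self.
        return self.low <= other.low and other.high <= self.high


wide = RangeFilter(0, 100)     # predecessor (covering filter)
narrow = RangeFilter(10, 20)   # descendant (covered filter)
assert wide.covers(narrow)

# Property 3.12: a match on the covered node implies a match on its predecessor.
assert narrow.matches(15) and wide.matches(15)

# Property 3.13: a miss on the covering node implies a miss on its descendant.
assert not wide.matches(150) and not narrow.matches(150)
```

The converse does not hold: the value 50 matches `wide` but not `narrow`, which is why a matching node's descendants still have to be tested.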
Algorithm 5 Advertisement message handlers.
IncomingAdv(a, source)
1. Add (a, source) to Pa.
2. Forward advertisement message to forwards(a).
3. Determine the set of overlapping subscriptions using Ps for which a is the only advertisement from the source that overlaps, and send them to the source. In other words, any subscriptions that have not yet been sent are forwarded to the advertising node (source). Subscriptions that overlap with an existing advertisement from the source have already been forwarded, so they are not processed. The overlapping set is found by iterating over the first two levels of Ps and testing the overlap of subscriptions with the advertisement.
IncomingUnadv(a, source)
1. Remove (a, source) from Pa.
2. Forward the unadvertisement in the same way as an unsubscription is forwarded. The forwards(a) set may be empty if there are advertisements from other neighbours that cover a. Forward any uncovered advertisements in Pa.
3. Remove any subscriptions for source that are no longer needed. All subscriptions are removed from neighbours other than the source that do not have an associated overlapping advertisement from some other neighbour.
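Step 3 of IncomingAdv can be sketched as follows. This is a simplified illustration under stated assumptions, not the handler itself: filters are hypothetical one-dimensional intervals, Ps and Pa are replaced by flat lists of (filter, neighbour) pairs instead of posets, all subscriptions are scanned rather than only the first two levels of Ps, and subscriptions that originate from the advertising neighbour are assumed not to be sent back to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical filter over one numeric attribute: low <= x <= high."""
    low: float
    high: float

    def overlaps(self, other: "Interval") -> bool:
        return self.low <= other.high and other.low <= self.high


def subscriptions_to_send(a, source, subscriptions, advertisements):
    """Subscriptions that must be sent to `source` after it advertises `a`.

    `subscriptions` and `advertisements` are flat lists of (filter, neighbour)
    pairs standing in for Ps and Pa; `advertisements` already contains (a, source).
    """
    earlier = [adv for adv, nbr in advertisements if nbr == source and adv != a]
    result = []
    for sub, sub_source in subscriptions:
        if sub_source == source:
            continue  # assumption: never send a subscription back to its origin
        if not sub.overlaps(a):
            continue
        # Already forwarded because of an earlier overlapping advertisement.
        if any(sub.overlaps(adv) for adv in earlier):
            continue
        result.append((sub, sub_source))
    return result


a = Interval(20, 30)
advertisements = [(Interval(0, 10), "B"), (a, "B")]
subscriptions = [
    (Interval(5, 25), "A"),    # overlaps the earlier advertisement: already sent
    (Interval(22, 28), "A"),   # overlaps only a: must be sent now
    (Interval(40, 50), "A"),   # does not overlap a
    (Interval(21, 29), "B"),   # came from the advertising neighbour itself
]
pending = subscriptions_to_send(a, "B", subscriptions, advertisements)
```

Here `pending` holds only the subscription `Interval(22, 28)` from neighbour A, since it is the one subscription for which `a` is the sole overlapping advertisement from B.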
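The Siena poset-based matcher uses Property 3.13 to optimize matching: when a node does not match a notification, its descendants are skipped. The pseudocode for the forest is given by Algorithm 6. The poset-based matcher is similar, but must also test whether a node has already been visited, because a node in a poset may be reachable from several predecessors.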
The forest or poset also supports approximate matching. For example, we may walk the forest with the notification breadth-first and define a time bound for matching. When this time expires the algorithm simply walks the remaining nodes and records the interfaces as matched. This is approximate, because it may result in false positives.
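A time-bounded breadth-first walk of this kind might look like the following sketch. The `Node` class, the range predicate, and the function name `approximate_match` are assumptions made for illustration. "Remaining nodes" is interpreted here as the nodes still queued when the bound expires together with their descendants; subtrees that were already pruned stay excluded.

```python
import time
from collections import deque


class Node:
    """Hypothetical forest node: a range filter [low, high] with subscribers."""

    def __init__(self, low, high, subscribers, children=()):
        self.low, self.high = low, high
        self.subscribers = set(subscribers)
        self.children = list(children)

    def matches(self, value):
        return self.low <= value <= self.high


def approximate_match(roots, value, time_bound):
    """Breadth-first match; after `time_bound` seconds stop testing filters."""
    deadline = time.monotonic() + time_bound
    matched = set()
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        expired = time.monotonic() >= deadline
        # After the deadline every remaining node is recorded as matched, which
        # may add false positives but never drops a true match.
        if expired or node.matches(value):
            matched |= node.subscribers
            queue.extend(node.children)
        # Otherwise the whole subtree is pruned (Property 3.13).
    return matched


leaf_a = Node(10, 20, {"if2"})
leaf_b = Node(50, 60, {"if3"})
root = Node(0, 100, {"if1"}, [leaf_a, leaf_b])

exact = approximate_match([root], 15, time_bound=10.0)
approx = approximate_match([root], 15, time_bound=0.0)
```

With a generous bound the result is exact (`if1` and `if2`); with a zero bound no filter is tested and `if3` is reported as a false positive.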
Algorithm 6 Pseudocode for forest-based matching.
Match-Forest(n)
    let S be an empty sequence
    let FW be an initially empty set of forward interfaces
    let Imax be the number of interfaces for the event type
    let q = false
    R = Get-Roots(n.type)
    let Im be an imaginary root of a tree
    Im.children = R
    addLast(S, Im)
    while S is non-empty and not q do
        o = removeFirst(S)
        while o has unprocessed children and not q do
            c = nextChild(o)
            if subscribers(c) ⊆ FW then
                addLast(S, c)
            elseif match(c, n) then
                addLast(S, c)
                addToSet(FW, subscribers(c))
                if |FW| ≥ Imax then
                    q = true
    return FW
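For readers who prefer executable code, the following Python sketch mirrors Match-Forest step by step. It is an illustration under assumptions: the `Node` class and its callable `predicate` are hypothetical, the roots are passed in directly instead of being looked up with Get-Roots, and `interface_count` plays the role of Imax.

```python
from collections import deque


class Node:
    """Hypothetical forest node holding a filter and its subscribers."""

    def __init__(self, predicate, subscribers, children=()):
        self.predicate = predicate            # callable: notification -> bool
        self.subscribers = set(subscribers)   # interfaces subscribed with this filter
        self.children = list(children)


def match_forest(roots, notification, interface_count):
    """Return the forward interfaces for `notification` (cf. Match-Forest)."""
    forward = set()
    # The imaginary root lets the loop treat the real roots as ordinary children.
    queue = deque([Node(lambda n: True, (), roots)])
    while queue:
        node = queue.popleft()
        for child in node.children:
            if child.subscribers <= forward:
                # Interfaces already in the result: skip the matching test.
                queue.append(child)
            elif child.predicate(notification):
                queue.append(child)
                forward |= child.subscribers
                if len(forward) >= interface_count:
                    return forward   # every interface is matched: quit early
    return forward


narrow = Node(lambda n: 10 <= n["price"] <= 20, {"if2"})
other = Node(lambda n: 50 <= n["price"] <= 60, {"if3"})
root = Node(lambda n: 0 <= n["price"] <= 100, {"if1"}, [narrow, other])

interfaces = match_forest([root], {"price": 15}, interface_count=3)
```

In this run the root and `narrow` match while `other` does not, so `interfaces` contains `if1` and `if2`. The filters are opaque callables, so the walk itself depends only on the forest structure.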
The interesting feature of the algorithm is that the matching mechanism does not know the details of the filtering language; it only assumes that there are covering relations between nodes. This makes the algorithm suitable for environments where the filtering language and the operators (predicates) change dynamically. In addition, adding new operators does not require complicated changes to the matching algorithm, such as creating new indexing structures.
We propose two improvements to the basic algorithm. First, the matching test does not need to be performed for a filter whose interfaces have already been added to the result set. We found that this modification resulted in better performance. The second improvement is to use the interface index to prevent the processing of those subtrees in a balanced forest that have already been matched. This is easily accomplished by checking whether the interfaces of a particular node are already contained in the result set; if they are, the node is not processed further.