The complex computational dependency between events poses challenges for inter-EPN parallelization. Two approaches for the parallelization of EPNs are analyzed. The first uses a decentralised or asynchronous notification mechanism (a message passing system), where the event aggregator is dynamically assigned to one of the computing nodes in compliance with the rules defined in the algorithm. The second expresses the computation as centralised or map reduce functions. In the map reduce based parallelization, the master node acts as the centralised computation and aggregation point, where periodic checkpoints of the incoming events are directed to the map and reduce operators. Subject to the constraints specific to both paradigms, the communication, correlation and aggregation mechanisms can be mixed. In this section, we explain the implementation of inter-EPN parallelization with the help of a simple use case. Each approach optimizes the processing between the EPNs by selecting the group of EPNs and optimally allocating the computing resources.

Within the asynchronous or decentralised notification mechanism, a search for a pattern or an event can be initiated by any computing node in a cluster, which then declares itself the aggregator. For example, let us consider the computation of an average, as illustrated by use case 8.1, in a decentralised manner. Whichever node receives the event first declares itself the aggregator and tries to collate the sum and the count of events from the rest of the nodes. Once the node has declared itself the aggregator, all other computing nodes in the cluster subscribe to notifications from it. When the computation is accomplished, based on the n-depth traversal of the directed acyclic graph representing the process flow in the EPN, the aggregator node resets its state and is ready to accept instructions from a new aggregator in the network of computing nodes, as illustrated in Figure 8.3.

Figure 8.3: Decentralised algorithm for inter-EPN parallelization

Some of the key challenges to implementing decentralised event processing are listed below:

• During the process execution, a dynamic decision should be made to pick one of the computing nodes as an aggregator.

• When the inter-dependency between events is complex, the aggregation of events could be undertaken in multiple nodes, based on the pattern in the incoming events. When more than one computing node tries to declare itself the aggregator, a race condition occurs. To prevent race conditions, one of the nodes is arbitrarily fixed as the aggregator.

• In some complex use cases, the aggregator could be arbitrarily fixed to maintain the stability of the system.
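The race described above can be resolved by any deterministic rule that every node applies in the same way. The following is a minimal sketch, not the algorithm used in this work: the function name `resolve_aggregator` and the rule that the lowest node identifier wins are assumptions made only for illustration, since the text states only that the choice is arbitrary.

```python
def resolve_aggregator(claimants, fixed_aggregator=None):
    """Return the single node that acts as aggregator.

    claimants        -- identifiers of the nodes that declared themselves aggregator
    fixed_aggregator -- optional node that is pinned as aggregator for stability
    """
    # A pinned aggregator always takes precedence over dynamic claims.
    if fixed_aggregator is not None:
        return fixed_aggregator
    if not claimants:
        raise ValueError("no node claimed the aggregator role")
    # Assumed tie-break rule: the lowest identifier wins. Any rule works as
    # long as every node evaluates it identically.
    return min(claimants)
```

Because every node computes the same result from the same set of claimants, no further coordination round is needed once the claims are known.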

The overall outline of the decentralised event processing to achieve inter-EPN parallelization is listed in the steps below:

Step1 A copy of the EPN is instantiated on each idle computing node. The number of computing nodes to be hired in the system is identified through the reconfiguration algorithm described in chapter 6.

Step1.1 The decentralised parallelization algorithm initiates the search. Let us consider the use case that computes an average. Any node receiving the event checks whether an aggregator is available. If an aggregator is found, the node computes the sum and the count of the events. If there is no aggregator, the node declares itself the aggregator and notifies all other nodes registered in the system.

Step1.2 Node1 in Figure 8.3 passes the notification to process the incoming events. The rest of the nodes in the system act as processing nodes. The results are forwarded to the aggregator node (Node1).

Step2 Once the results are computed by the aggregator, the algorithm resets it to be one of the processing nodes. Depending on the arrival of subsequent events, any node in the network may then declare itself the aggregator.
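The steps above can be sketched as a small single-process simulation of the average use case. This is an illustration under stated assumptions rather than the implementation used here: the `Cluster` and `Node` classes and their methods are hypothetical names, and the message passing between nodes is replaced by direct method calls.

```python
class Cluster:
    """Hypothetical registry of computing nodes and the current aggregator."""

    def __init__(self):
        self.nodes = []
        self.aggregator = None  # no aggregator until a node declares itself


class Node:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.events = []
        cluster.nodes.append(self)  # register with the system

    def receive(self, value):
        """Step1.1: buffer the event; become aggregator if none exists."""
        self.events.append(value)
        if self.cluster.aggregator is None:
            self.cluster.aggregator = self  # stands in for notifying all nodes

    def partial(self):
        """Processing-node role: local sum and count of buffered events."""
        return sum(self.events), len(self.events)

    def aggregate(self):
        """Step1.2 and Step2: collate partial results, then reset the role."""
        if self.cluster.aggregator is not self:
            raise RuntimeError("only the aggregator may aggregate")
        total, count = 0.0, 0
        for node in self.cluster.nodes:
            node_sum, node_count = node.partial()
            total += node_sum
            count += node_count
            node.events.clear()
        self.cluster.aggregator = None  # any node may claim the role next
        return total / count if count else None
```

For example, if three nodes receive the values 2.0, 4.0 and 6.0 and the node that saw the first event calls `aggregate()`, the result is 4.0 and the aggregator slot is free again.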

As an alternative approach, the centralised or map reduce paradigm uses a customized event processing routine (EPN) for the given use case, called 'Map', which receives the input events as key/value pairs and generates intermediate events. The map reduce library groups the set of intermediate events and passes them to the reduce function. The reduce function, which is also a customized event processing routine (EPN), groups the events based on the intermediate key and merges them into a smaller set of values. Typically zero or one output value is produced per Reduce invocation. Splitting the events into several groups (Map) provides the opportunity to distribute the EPN across several machines: within the same group, a few of the EPNs may be executed on virtual machines on the same server or in a different location.

Figure 8.4 shows the overall outline of the map reduce operation for the inter-EPN parallelization. The map reduce engine, or an appropriate API, is integrated into the user-defined programs called mappers and reducers. The master receives the incoming events and communicates with the mapper, the reducer and the algorithm containing the instances of the EPN. A simple case such as finding the average can be easily accomplished through the master node. The master splits the events based on user-defined criteria. Each mapper processes its events and sends the sum and count of events. The reducer sorts, merges and forwards the results to the master. The sequence of actions is illustrated in the figure, corresponding to the numbered steps listed below.

Figure 8.4: Consecutive events split using map reduce

Step1 The master splits the input events into M pieces, based on the size or count of events as specified by user-defined parameters.

Step1.1 The master picks the idle nodes and assigns map and reduce tasks to a few of them. The number of idle nodes can be picked based on the reconfiguration algorithm defined in chapter 7. The master starts multiple copies of the EPNs in the mappers.

Step1.2 A mapper is assigned a task to process the contents of the incoming events. It uses the query information from the master to process the events. The intermediate events are passed to the reducer as illustrated in Step1.3.

Step1.3 Periodically, the buffered events from each mapper are passed to the reducer. The reduce worker reads the intermediate events, then sorts and groups the results. If the volume of intermediate events is too large to fit in memory, additional computing nodes can be used. The master defines the architecture for the reducers during system initialisation.

SELECT person_name, avg(accel-x) /* the output attribute(s) */
FROM merged_accelerometer_logs.win:length(64) /* the input relation */
group by person_id output every 32 events; /* the predicate */

Figure 8.5: Example of the scan of acceleration logs to compute rolling average

Step1.4 The reduce worker iterates over the intermediate results and generates the final output.

Step2 When all the map tasks and reduce tasks are accomplished, the master sends the final aggregated results to the algorithm.
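The map reduce steps above can be sketched for the average use case as follows. This is a single-process illustration under stated assumptions, not the engine used in this work: the function names `split_events`, `mapper`, `reducer` and `master` are hypothetical, events are modelled as `(key, value)` tuples, and the mappers run sequentially instead of on separate nodes.

```python
from collections import defaultdict


def split_events(events, piece_size):
    """Step1: the master splits the input events into pieces by count."""
    return [events[i:i + piece_size] for i in range(0, len(events), piece_size)]


def mapper(piece):
    """Step1.2: emit one intermediate (key, (sum, count)) event per key."""
    partial = defaultdict(lambda: [0.0, 0])
    for key, value in piece:
        partial[key][0] += value
        partial[key][1] += 1
    return [(key, (s, c)) for key, (s, c) in partial.items()]


def reducer(intermediate):
    """Step1.3 and Step1.4: sort, group by key, merge into one average per key."""
    grouped = defaultdict(lambda: [0.0, 0])
    for key, (s, c) in sorted(intermediate):
        grouped[key][0] += s
        grouped[key][1] += c
    return {key: s / c for key, (s, c) in grouped.items()}


def master(events, piece_size):
    """Step1 to Step2: split, map each piece, reduce, return the final result."""
    intermediate = []
    for piece in split_events(events, piece_size):
        intermediate.extend(mapper(piece))
    return reducer(intermediate)
```

Each mapper forwards a sum and a count rather than a local average, because averages of unequal pieces cannot be merged correctly, whereas sums and counts can.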

The MapReduce framework is used to maintain the pool of EPN jobs and to adjust the number of EPNs as the workload changes. The incoming data is split into Map tasks. Map and Reduce tasks are each implemented as a separate Java Virtual Machine (JVM) residing within each virtual machine. Multiple map tasks connect with multiple reduce tasks as specified in the inter-EPN parallelization algorithms. The streaming data is directed to each EPN hosted in the virtual machines. The total number of Map tasks is proportional to the number of EPNs hosted by the virtual machines. Each Map task performs a user-defined Map function and generates the intermediate key-value pair data. The intermediate events are organized in the cache within the virtual machines. Each virtual machine holds key-value data pairs whose keys are classified into one group. A hash function is used to aggregate the events belonging to the same group across all the virtual machines, and these events are merged together.

Many implementations of the EPN parallelization are possible. The correct choice depends on the complexity and the computational dependency of the events. For example, in the ambient kitchen activity recognition, a rolling average is computed over windows of arriving events. The 32 events from the current window of the event stream are merged with the 32 events in the previous window. Instead of one final result (average), the system needs to deliver continuous results (averages) to derive the pattern for the activity recognition. The query is illustrated in Figure 8.5. During the parallelization of the rolling average, each computing node should pass the last 64 events to any new computing node registering in the system to parallelise the event processing.
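The rolling average and the handover of the last 64 events can be sketched as follows. This is a minimal illustration assuming a window of 64 events with one output every 32 events, as in the query of Figure 8.5. The class `RollingAverage` and its methods are hypothetical names, and the sketch does not claim to reproduce the exact output semantics of the event processing engine.

```python
from collections import deque


class RollingAverage:
    """Average over the last `window` events, emitted every `every` events."""

    def __init__(self, window=64, every=32, history=()):
        # `history` lets a newly registered node start from the events handed
        # over by an existing node instead of from an empty window.
        self.window = deque(history, maxlen=window)
        self.every = every
        self.since_output = 0

    def push(self, value):
        """Add one event; return the average on every `every`-th event, else None."""
        self.window.append(value)
        self.since_output += 1
        if self.since_output == self.every:
            self.since_output = 0
            return sum(self.window) / len(self.window)
        return None

    def snapshot(self):
        """Events to pass to a new node (at most the last `window` events)."""
        return list(self.window)
```

A node that joins later is constructed with `RollingAverage(history=old_node.snapshot())`, so its first output already covers the 32 events of the previous window as well as the 32 new ones.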

The complexity of the use case in each event processing scenario determines the criteria for parallelization. The next section describes algorithms that use either a decentralised (asynchronous) notification mechanism or a centralised (map reduce) approach with range partitioning to improve the EPN performance. The algorithm design is defined for a few complex use cases to demonstrate the inter-EPN parallelization.

Figure 8.6: Consecutive events split between two windows