We propose the Cresco Application Description Language (CADL) to model distributed applications managed by the Cresco framework. CADL is a graph language in which nodes represent Cresco Plugin configurations, edges represent relationships between Cresco Plugins, and graphs composed of nodes and edges represent applications. The format of CADL node fields and their requirements is shown below:
• [node_id] (required, unique): Node IDs are unique identifiers for nodes within specific pipelines.
• [node_name] (required): Node names are used as short descriptions for nodes within specific pipelines.
• [type] (required): Node types represent specific Cresco Plugin implementations.
• [description] (optional): Node descriptions are used to describe nodes.
• [params] (optional): Params are a collection of key-value pairs that specify plugin-specific configurations. Manifest descriptors within Cresco Plugin implementations determine parameter requirements.
• [isStateless] (optional): The isStateless parameter is a boolean value representing the ability of the configuration instantiation to be migrated between Cresco Agents without maintaining plugin memory state. For example, the configuration of a stateless plugin that processes data (filtering, format conversion, etc.) from a source with delivery guarantees, such as a durable queue, can be migrated without memory migration or data loss.
• [isSource] (optional): The isSource parameter is a boolean value designating the node as a data source for a potential external pipeline. For example, a destination node that removes sensitive data in one pipeline might serve as the data source for another pipeline.
• [location] (optional): The location parameter is used to relate nodes to specific agents or locations. For example, to sample the network traffic at a specific location, the location parameter would need to match the Cresco Agent location parameter at the desired location.
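The location rule can be illustrated with a minimal sketch. The agent records and the candidate_agents helper below are hypothetical and are not part of the Cresco API; the sketch only assumes that both nodes and agents expose a location value and that an unset node location means the node is unconstrained.

```python
def candidate_agents(node: dict, agents: list[dict]) -> list[dict]:
    """Return the agents eligible to host `node` (illustrative only)."""
    wanted = node.get("location")
    if wanted is None:
        # Assumption: a node without a location can run on any agent.
        return list(agents)
    # Otherwise the node location must equal the agent location.
    return [agent for agent in agents if agent.get("location") == wanted]


# Hypothetical agent inventory; the field names are assumptions.
agents = [
    {"agent": "agent-a", "location": "X"},
    {"agent": "agent-b", "location": "Y"},
]
node = {"node_id": "0", "node_name": "pStart", "type": "amqp", "location": "X"}

print([agent["agent"] for agent in candidate_agents(node, agents)])  # ['agent-a']
```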
Listing 4.1 shows the node description for a plugin that serves as an AMQP data exchange at location X.
Listing 4.1: CADL Node
    "node_id": "0",
    "node_name": "pStart",
    "type": "amqp",
    "params": {
        "amqp_server": "localhost",
        "outExchange": "eQuery",
        "amqp_login": "login",
        "amqp_password": "password"
    },
    "isStateless": true,
    "isSource": true,
    "location": "X"
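The node field requirements listed above can be checked mechanically. The following is a minimal sketch rather than the validation performed by Cresco itself: the node_errors helper is hypothetical, and the underscored field names (node_id, node_name) are an assumption about the intended identifiers.

```python
REQUIRED_FIELDS = ("node_id", "node_name", "type")
BOOLEAN_FLAGS = ("isStateless", "isSource")


def node_errors(node: dict) -> list[str]:
    """Collect violations of the CADL node field requirements."""
    errors = [f"missing required field: {name}"
              for name in REQUIRED_FIELDS if name not in node]
    for flag in BOOLEAN_FLAGS:
        # Optional flags may be absent, but must be booleans when present.
        if flag in node and not isinstance(node[flag], bool):
            errors.append(f"{flag} must be a boolean")
    if "params" in node and not isinstance(node["params"], dict):
        errors.append("params must be a collection of key-value pairs")
    return errors


amqp_node = {
    "node_id": "0",
    "node_name": "pStart",
    "type": "amqp",
    "params": {"amqp_server": "localhost", "outExchange": "eQuery"},
    "isStateless": True,
    "isSource": True,
    "location": "X",
}

print(node_errors(amqp_node))  # []
print(node_errors({"node_name": "broken", "isSource": "yes"}))
```

Uniqueness of node IDs is a property of the pipeline rather than of a single node, so it is not checked here.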
As previously mentioned, edges represent the relationship between nodes. The format of CADL edge fields and requirements is shown below:
• [edge_id] (required, unique): Edge IDs are unique identifiers for edges within specific pipelines.
• [node_from] (required): The node_from parameter designates the source node ID for the edge within a specific pipeline.
• [node_to] (required): The node_to parameter designates the destination node ID for the edge within a specific pipeline.
Listing 4.2 shows an example of a CADL edge description relating two node IDs.
Listing 4.2: CADL Edge
    "edge_id": 0,
    "node_to": "1",
    "node_from": "0"
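Because an edge refers to nodes only by ID, a useful consistency check is that both endpoints exist in the same pipeline. The sketch below is illustrative: dangling_edges is a hypothetical helper, and the underscored field names are an assumption about the intended identifiers.

```python
def dangling_edges(nodes: list[dict], edges: list[dict]) -> list:
    """Return the edge_id of every edge whose endpoints are not both known nodes."""
    known_ids = {node["node_id"] for node in nodes}
    return [
        edge["edge_id"]
        for edge in edges
        if edge["node_from"] not in known_ids or edge["node_to"] not in known_ids
    ]


nodes = [{"node_id": "0"}, {"node_id": "1"}]
edges = [
    {"edge_id": 0, "node_to": "1", "node_from": "0"},  # both endpoints exist
    {"edge_id": 1, "node_to": "2", "node_from": "1"},  # node "2" is undefined
]

print(dangling_edges(nodes, edges))  # [1]
```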
CADL node and edge descriptions are combined to form a pipeline description, which represents a Cresco application. The format of CADL pipeline fields and their requirements is shown below:
• [pipeline_name] (required): Pipeline names are used as short descriptions for pipelines maintained by a specific Cresco Global Controller.
• [nodes] (required): Nodes are collections of CADL node descriptions. At least one node description must exist for a pipeline to be considered valid.
• [edges] (optional): Edges are collections of CADL edge descriptions.
• [description] (optional): Pipeline descriptions are used to describe the operation of pipelines.
• [isFaultTolerant] (optional): The isFaultTolerant parameter is a boolean value designating that pipeline components should be rescheduled if failures are detected.
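The pipeline-level requirements can be sketched in the same spirit. The check_pipeline helper is hypothetical, the underscored field names are assumed, and the defaults it fills in for optional fields (no edges, empty description, isFaultTolerant set to false) are assumptions made for illustration; the text does not state what Cresco assumes when these fields are omitted.

```python
def check_pipeline(pipeline: dict) -> dict:
    """Validate required pipeline fields and fill assumed defaults for optional ones."""
    if not pipeline.get("pipeline_name"):
        raise ValueError("pipeline_name is required")
    nodes = pipeline.get("nodes") or []
    if not nodes:
        raise ValueError("at least one node description is required")
    node_ids = [node["node_id"] for node in nodes]
    if len(node_ids) != len(set(node_ids)):
        raise ValueError("node_id values must be unique within a pipeline")
    # Assumed defaults; values supplied by the caller take precedence.
    defaults = {"edges": [], "description": "", "isFaultTolerant": False}
    return {**defaults, **pipeline}


minimal = check_pipeline({
    "pipeline_name": "single node",
    "nodes": [{"node_id": "0", "node_name": "pStart", "type": "amqp"}],
})
print(minimal["edges"], minimal["isFaultTolerant"])  # [] False
```

A pipeline with a single node and no edges passes this check, which matches the requirement that edges are optional while at least one node must exist.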
CADL pipelines can be used to describe a number of applications. Suppose we want to construct the following application pipeline:
1. Read JSON-formatted Netflow records from an AMQP data source and emit data to a downstream node.
2. Read data from an upstream node, marshal JSON data into a strongly typed Netflow class, calculate the top ten network flows in a one-minute sliding window, and emit JSON-formatted data to a downstream node.
3. Read data from an upstream node and place results in a FIFO (first-in-first-out) memory buffer, which is externally accessible through a RESTful interface provided by the plugin.
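The computation in step 2 can be approximated outside of a query engine. The sketch below is not the plugin implementation: the TopFlows class is hypothetical, the record field names (ip_src, ip_dst, bytes) are assumed, and timestamps are assumed to arrive in non-decreasing order. It keeps the records seen in the last 60 seconds and ranks them by byte count.

```python
from collections import deque


class TopFlows:
    """Keep records from a sliding time window and rank them by bytes."""

    def __init__(self, window_seconds: float = 60.0, limit: int = 10):
        self.window_seconds = window_seconds
        self.limit = limit
        self.records = deque()  # (timestamp, record) pairs, oldest first

    def add(self, timestamp: float, record: dict) -> None:
        self.records.append((timestamp, record))
        # Evict records that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self.records and self.records[0][0] <= cutoff:
            self.records.popleft()

    def top(self) -> list[dict]:
        ranked = sorted((record for _, record in self.records),
                        key=lambda record: record["bytes"], reverse=True)
        return ranked[: self.limit]


window = TopFlows()
window.add(0, {"ip_src": "10.0.0.1", "ip_dst": "10.0.0.9", "bytes": 500})
window.add(30, {"ip_src": "10.0.0.2", "ip_dst": "10.0.0.9", "bytes": 100})
window.add(61, {"ip_src": "10.0.0.3", "ip_dst": "10.0.0.9", "bytes": 300})

print([record["bytes"] for record in window.top()])  # [300, 100]
```

The record added at time 0 is evicted once the record at time 61 arrives, so it no longer competes for a place in the ranking.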
Listing 4.3 shows a three-stage CADL pipeline for the previously described application.
Listing 4.3: CADL pipeline
    {"nodes": [
        {"node_name": "pStart", "type": "amqp", "node_id": "0",
         "params": {"amqp_server": "localhost", "outExchange": "someexchange",
                    "amqp_password": "somepassword", "amqp_login": "somelogin"}},

        {"node_name": "netFlow Query", "type": "esperquery", "node_id": "1",
         "params": {"query_class": "netFlow",
                    "query_string": "select ip_src, ip_dst, bytes from netFlow.win:time(1 min).ext:sort(10, bytes desc)"}},

        {"node_name": "", "type": "membuffer", "node_id": "2",
         "params": {"data_url": "http://localhost/API/buff0"}}],

     "edges": [
        {"edge_id": 0, "node_to": "1", "node_from": "0"},
        {"edge_id": 1, "node_to": "2", "node_from": "1"}],

     "pipeline_name": "Top 10 Netflows"}
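One way to read such a pipeline is as a directed graph whose edges fix the order in which data passes through the plugins. The sketch below parses an abbreviated copy of the pipeline (params are left empty, and the underscored field names and the esperquery type spelling are assumptions) and derives an upstream-to-downstream order with Python's standard graphlib module. The stage_order helper is hypothetical and does not reflect how a Cresco Global Controller interprets the description.

```python
import json
from graphlib import TopologicalSorter

# Abbreviated copy of the pipeline; params omitted for brevity.
PIPELINE_JSON = """
{"nodes": [
    {"node_name": "pStart", "type": "amqp", "node_id": "0", "params": {}},
    {"node_name": "netFlow Query", "type": "esperquery", "node_id": "1", "params": {}},
    {"node_name": "", "type": "membuffer", "node_id": "2", "params": {}}],
 "edges": [
    {"edge_id": 0, "node_to": "1", "node_from": "0"},
    {"edge_id": 1, "node_to": "2", "node_from": "1"}],
 "pipeline_name": "Top 10 Netflows"}
"""


def stage_order(pipeline: dict) -> list[str]:
    """Return node types ordered so that every node follows its upstream nodes."""
    predecessors = {node["node_id"]: set() for node in pipeline["nodes"]}
    for edge in pipeline.get("edges", []):
        predecessors[edge["node_to"]].add(edge["node_from"])
    types = {node["node_id"]: node["type"] for node in pipeline["nodes"]}
    ordered_ids = TopologicalSorter(predecessors).static_order()
    return [types[node_id] for node_id in ordered_ids]


print(stage_order(json.loads(PIPELINE_JSON)))  # ['amqp', 'esperquery', 'membuffer']
```

TopologicalSorter raises graphlib.CycleError if the edges form a cycle, which gives a simple way to reject descriptions that cannot be ordered.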
CADL descriptions are submitted to Cresco Global Controllers for interpretation and resource scheduling. Figure 4.1 shows the steps taken in the deployment of a Cresco application.
[Figure: CADL description (Pipeline) -> Component Representation (AppSchedulerEngine) -> Resource Placement (ResourceSchedulerEngine)]
Figure 4.1: Cresco Application Process

In the next section, we will cover how CADL components are represented within the Cresco framework.