
Adaptive and Distributed Service Oriented Processes: An Architectural Methodology

Michael Pantazoglou, George Athanasopoulos, Aphrodite Tsalgatidou



Data-Driven Process Adaptation

The prime assumption of the Data-Driven Process Adaptation approach (DDA) is that “a service-oriented process comprising heterogeneous services should be able to use the information available within its environment and adapt its execution accordingly”. To facilitate the provision of such adaptable processes, Athanasopoulos et al. [DBLP:conf/icsoft/AthanasopoulosT10] proposed an approach that exploits information contained within a specific ‘space’ to adapt a service-oriented process. The space is considered to be the process’s environment, which is open to other processes and systems for information exchange. Appropriate algorithms specify adaptation paths for given processes along with queries that can be executed in the shared space; these queries search for relevant information, which, when found, is fed to a process execution engine. The execution engine uses the discovered information to control the execution and adaptation of running process instances according to the adaptation paths specified by the provided algorithms.

The proposed solution provides the components needed to address three basic functional needs: the collection of contextual information, the execution of heterogeneous service processes, and process adaptation driven by the collected information. Accordingly, the infrastructure comprises three main components:


  • A Semantic Context Space Engine (SCS Engine) that supports the exchange of contextual information,

  • A Service Orchestration Engine that executes heterogeneous service processes and uses contextual information to adapt a running process, and

  • A Process Optimizer that generates Data Driven Adaptable Service-Oriented Processes (DDA-SoP).

The SCS Engine provides an open space where one may (i) write and retrieve information annotated with meta-information, (ii) logically group information of interest, e.g. information pertaining to a specific domain such as weather conditions, and (iii) specify associations among groups that contain information from related or depending domains; for instance, a weather conditions group can be associated with a group holding information on the aquatic conditions of a specific region.
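
To make these operations concrete, the following minimal Java sketch outlines the space interface implied above; the chapter does not prescribe an API, so all names are illustrative assumptions.

  import java.util.List;

  // Hypothetical interface for the open space described above; the actual
  // SCS Engine API is not given in the chapter, so all names are illustrative.
  interface SemanticContextSpace {
      // (i) write and retrieve information annotated with meta-information
      void write(Object value, String conceptUri);
      List<Object> read(String conceptUri);

      // (ii) logically group information of interest, e.g. weather conditions
      void createScope(String scopeName);
      void addToScope(String scopeName, String conceptUri);

      // (iii) associate groups holding information from related domains, e.g.
      // weather conditions with the aquatic conditions of a specific region
      void associateScopes(String scopeA, String scopeB);
  }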

The Service Orchestration Engine provides a BPEL-based engine executing heterogeneous service orchestrations, e.g. comprising Web, Grid, P2P, and OGC [OGC:REPORT] services. The orchestration engine supports the monitoring and reconfiguration of running process instances according to the suggestions made by the Process Optimizer.

The Process Optimizer component implements an AI planner to discover process plans controlling the execution and adaptation of service processes upon the emergence of related information. Specifically, according to Athanasopoulos and Tsalgatidou [DBLP:icsoft/Athanasopoulos] the problem of Data-Driven Adaptation can be modeled as a non-deterministic, partially observable planning problem [Nau:2004]. In this context, solutions are modeled as conditional plans, which contain branching control structures, i.e. if-then-else, that decide on the execution path that will be followed based on the values of specified conditions.
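
As an illustration of such conditional plans, the sketch below models a plan as a small tree whose internal nodes branch on observation values; this representation is our own simplification for exposition, not the cited planner's actual data structure.

  // Simplified model of a conditional plan: Action nodes invoke a service
  // operation, Branch nodes choose a path (if-then-else) based on the value
  // of an observation evaluated at runtime.
  abstract class PlanNode { }

  final class Action extends PlanNode {
      final String serviceOperation;   // e.g. "Invoke2"
      final PlanNode next;             // remainder of the plan (null = done)
      Action(String op, PlanNode next) { this.serviceOperation = op; this.next = next; }
  }

  final class Branch extends PlanNode {
      final String observation;        // condition evaluated at runtime
      final PlanNode ifTrue, ifFalse;  // alternative execution paths
      Branch(String obs, PlanNode t, PlanNode f) {
          this.observation = obs; this.ifTrue = t; this.ifFalse = f;
      }
  }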

A crucial step in the provision of data-driven adaptable service-oriented processes is the introduction of extensions to the planning problem representation. This step can be regarded as the incorporation of additional ‘sensors’ for monitoring the process, along with appropriate actions for handling the resulting ‘observations’. More specifically, this extension process comprises the following steps:



  • The semantic-based extension of the set of observations,

  • The extension of the action set with actions capable of exploiting the introduced observations, and

  • The consolidation of the extended action and observation sets.

The execution of the aforementioned expansion actions aims to support the introduction of appropriate adaptation steps that enable the execution of alternate process paths upon the discovery of related information. In the context of the DDA approach, adaptation steps are selected points in a process model where the existence or absence of appropriate information, i.e. observations, can be exploited to decide whether an alternate service or service chain should be used. Adaptation steps are introduced so as to reduce the set of actions that have to be executed to achieve the process goal.

Details on the architecture and main properties of these components are provided next.



MAIN FOCUS OF THE CHAPTER

The Adaptive Process Execution Infrastructure

This section presents the architecture of the proposed Adaptive Execution Infrastructure and describes its main components and functionality.



Figure 1 illustrates a high-level architectural view of the Adaptive Execution Infrastructure. As can be seen, it comprises three main components, namely the Deployment Service, the Semantic Context Space (SCS) Engine, and the Service Orchestration Engine. The infrastructure also provides a set of interfaces, namely the Deployment Interface, the Data Acquisition Interface, and the Execution Interface. These components and interfaces enable the deployment and adaptive execution of environmental models as BPEL processes in a distributed and scalable manner.



Figure 1. High-level architecture of the Adaptive Execution Infrastructure.

Let us briefly describe how the abovementioned components cooperate with each other during the deployment and execution of an environmental model that has been implemented as a BPEL process.



Deployment Service

The Deployment Service is the entry point to the Adaptive Execution Infrastructure. It supports the deployment of environmental models as WS-BPEL processes, as well as the un-deployment of environmental models that were previously deployed but are no longer used or need to be substituted. In a typical usage scenario, this component accepts a bundle from the client, which contains the BPEL process file and all its accompanying artifacts (i.e. the WSDL (Christensen, Curbera, Meredith, & Weerawarana 2001) documents of the constituent services, the external XSD (Sperberg-McQueen & Thompson 2000) files, and any required XSLT (Clark 1999) files).

The contents of the submitted bundle are processed by the Process Optimizer, which is an internal component of the Deployment Service and is further described below. The outcome of this processing is an expanded BPEL process definition that is dispatched to the Service Orchestration Engine. In turn, the latter binds the deployed BPEL process to a unique Web service endpoint address. Hence, in compliance with standards and common practices, all BPEL processes that are deployed to the Adaptive Execution Infrastructure can be conveniently invoked as standard SOAP Web services.
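
This deployment flow can be summarized by the following sketch; the class and method names are assumptions made for illustration and do not correspond to the actual ENVISION code base.

  import java.nio.file.Path;

  // Hypothetical outline of the deployment flow; all names are illustrative.
  class DeploymentService {
      private final ProcessOptimizer optimizer;
      private final OrchestrationEngine engine;

      DeploymentService(ProcessOptimizer optimizer, OrchestrationEngine engine) {
          this.optimizer = optimizer;
          this.engine = engine;
      }

      // Accepts a bundle (BPEL file plus WSDL/XSD/XSLT artifacts) and returns
      // the unique Web service endpoint the expanded process is bound to.
      String deploy(Path bundle) {
          Path expandedBpel = optimizer.expand(bundle); // render process adaptive
          return engine.bind(expandedBpel);             // expose as SOAP service
      }

      void undeploy(String endpointAddress) {
          engine.unbind(endpointAddress);
      }
  }

  interface ProcessOptimizer { Path expand(Path bundle); }
  interface OrchestrationEngine { String bind(Path bpel); void unbind(String endpoint); }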

The Process Optimizer performs all necessary work to expand and render the originally submitted BPEL process adaptive. Its main objectives are to (i) expand the provided BPEL processes with extension points that are evaluated at runtime, and (ii) accommodate process adaptation based on the exploitation of available information elements, which are external to the process execution context.

These objectives are met by the Process Optimizer by specifying (i) the set of information elements which are relevant to a given environmental model, and should be pushed by the SCS Engine to the environmental model instances upon execution, and (ii) the adaptation steps (equivalently referred to as plans) that should be performed upon the discovery of such information at runtime.

Figure 2. Phases of BPEL process expansion.

Taking a closer look into the operational semantics of the Process Optimizer, there are four distinct phases in the generation of the extended environmental process model (see Figure 2). These phases support the transformation of the provided input (i.e. the bundle fed to the Deployment Service) into an internal finite state machine representation (referred to hereinafter as the State Transition System model, or STS model); the expansion of the generated STS model with observations and additional actions (i.e. service operations); the generation of Planning Domain Definition Language (PDDL)-based representations of the extended planning problem domain and goal descriptions, which are fed to an external Artificial Intelligence (AI) planner; and the extraction of the extended WS-BPEL specification from the planner outcomes.
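
The four phases can be read as a simple pipeline, sketched below with placeholder types, since the chapter specifies only the data flow between the phases.

  // Placeholder types for the artifacts named in the text.
  final class Bundle { }        // BPEL + WSDL + XSD + XSLT artifacts
  final class StsModel { }      // state transition system representation
  final class PddlProblem { }   // (Nu)PDDL domain and goal descriptions
  final class Plan { }          // planner outcome
  final class BpelProcess { }   // extended WS-BPEL definition

  class ExpansionPipeline {
      BpelProcess expand(Bundle bundle) {
          StsModel sts = translateInput(bundle);   // phase 1: input -> STS model
          StsModel ext = expandModel(sts);         // phase 2: observations + actions
          PddlProblem problem = producePddl(ext);  // phase 3: PDDL domain + goal
          Plan plan = invokePlanner(problem);      // external AI planner
          return extractBpel(plan);                // phase 4: extended WS-BPEL
      }
      // Stubs standing in for the components described below (Figure 3).
      StsModel translateInput(Bundle b) { return new StsModel(); }
      StsModel expandModel(StsModel m) { return m; }
      PddlProblem producePddl(StsModel m) { return new PddlProblem(); }
      Plan invokePlanner(PddlProblem p) { return new Plan(); }
      BpelProcess extractBpel(Plan p) { return new BpelProcess(); }
  }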



Figure 3. Overview of the internal architecture of the Process Optimizer.

The above steps are implemented by the internal components of the Process Optimizer, which are shown in Figure 3. More specifically, the Input Translator is responsible for transforming the provided input into a finite state machine model (STS model), while its sub-components, namely the Observation & Service Expansion Engine and the Planner Input Producer, are responsible for the expansion of (i) the generated STS model and (ii) the planning problem domain and goal descriptions, respectively. The planning problem domain and goal descriptions provide an abstract representation of the set of available activities (along with complementary descriptions of variables, constants, states, transitions, etc.) and of the expected initial and final states of the requested controlling automaton (i.e. the STS representation of the extended process). This abstract representation is expressed in Non-deterministic Planning Domain Definition Language (NuPDDL) [ref] constructs. The PlannerProxy component provides a wrapper service to the employed planner, and the Output Provider facilitates the extraction of the extended BPEL descriptions based on the outcome of the planner.



Figure 4. Process Optimizer execution flow.

The execution flow, along with the artifacts exchanged between the components of the Process Optimizer and the external components, is presented in Figure 4. As can be seen, the provided process specification (i.e. the BPEL file) and the associated service descriptions (i.e. the WSDL documents) are all fed to the Input Translator. The Input Translator generates the corresponding STS model representations using, in addition to the provided BPEL and WSDL descriptions, the semantic descriptions of all related services (i.e. WSMO-Lite specifications [32]).

All these artifacts are then pushed to the Observation & Service Expansion Engine, which defines an initial set of observations on the given process model and then expands it using appropriate semantic similarity measures. The Observation & Service Expansion Engine is then able to identify additional services, which are also transformed to STS representations. All the STS representations are combined by the Planner Input Producer and jointly constitute the planning problem domain. The planning problem goals are extracted from the originally specified process model using a backward searching approach that is able to identify expected final states from a given BPEL description. The generated planning problem descriptions are submitted to the PlannerProxy component, which pushes them to the planner. Finally, the generated planning problem solution is retrieved by the Output Provider, which uses it to extract the expanded BPEL process.

Figure 5. The landslide BPEL process.

To better exemplify the expansion process employed by the Process Optimizer, let us consider the BPEL process of Figure 5, which was developed by one of our partners in the ENVISION project, and is part of their decision support system dedicated to landslide risk assessment. Overall, the process consists of 15 activities, which are represented in the diagram as rounded boxes, and 10 variables, which are shown as cornered boxes. The control flow of the process is indicated by normal arrows connecting the various activities, while the dashed arrows pointing from the variables to the activities and vice versa display its data flow. The various assign activities in the process are used to copy data from one variable to another; the invoke activities allow the process to interact with external Web services; the receive and reply activities are used by the process in order to retrieve the user input and send the final output, respectively; finally, the structured sequence and flow activities dictate the order in which their included activities will be executed.

In essence, the landslide process orchestrates four OGC Web services. First, a digital elevation model of the specified area is retrieved by invoking a Web Coverage Service (WCS) (“OGC Web Coverage Service” 2012) through activity Invoke1. In parallel, a Sensor Observation Service (SOS) (“OGC Sensor Observation Service” 2007) is called through activity Invoke2 in order to retrieve the precipitation data of the user-specified area. These data along with a set of user input parameters are further fed through activity Invoke3 to a Web Processing Service (WPS) (“OGC Web Processing Service” 2007), which simulates the main mechanisms of the water cycle by a system of reservoirs. The produced digital elevation model and the hydrological model containing the produced map of groundwater level in that area are finally passed as input to another WPS, which is invoked through activity Invoke4 and performs static mechanical analysis, in order to calculate the landslide probabilities in the area of study, in the form of a map of safety factors ranging between zero and one. That map is finally returned to the user as the process output.

The expansion of the landslide process starts with the transformation of the process into its corresponding STS representation. A simplified illustration of the generated STS model for the nominal process flow (i.e. without any consideration of potential exceptions) is presented in Figure 6. A set of original observations is identified and associated with the end states of service invocations; these include invocations of external services that are not part of a loop construct in the process model.



Figure 6. The landslide process STS model.

Each of the identified observations is linked to a specific ontology concept. For example, the observation used for monitoring the outcome of the SOS service returning the precipitation of a given area (activity Invoke2 in Figure 5) is linked to the geoevents:#precipitation ontology concept. Given an expansion ratio of 0.80, the Observation and Service Expansion Engine proposes a set of candidate observations, which consists of {geoevents:#precipitation, geoevents:#flow}. This set of candidate observation concepts provides the input required for the discovery of alternate service chains. Starting from the geoevents:#flow concept and using a forward search strategy, the Observation and Service Expansion Engine identifies a candidate service chain that returns the hydrological model. This chain comprises a single WPS service named “Water Flow Level Estimation” that accepts water flow measurements and calculates an estimate of the hydrological model.
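
One plausible reading of the expansion ratio, sketched below, is as a minimum semantic-similarity threshold over ontology concepts; the chapter does not detail the similarity measure the engine actually uses, so similarity() is a stand-in.

  import java.util.HashSet;
  import java.util.Set;

  // Hedged sketch: treat the expansion ratio as a similarity threshold.
  class ObservationExpander {
      private final double expansionRatio;   // e.g. 0.80 in the landslide example

      ObservationExpander(double expansionRatio) { this.expansionRatio = expansionRatio; }

      Set<String> expand(Set<String> originalObs, Set<String> ontologyConcepts) {
          Set<String> candidates = new HashSet<>(originalObs);
          for (String obs : originalObs)
              for (String concept : ontologyConcepts)
                  if (similarity(obs, concept) >= expansionRatio)
                      candidates.add(concept);   // e.g. geoevents:#flow
          return candidates;
      }

      // Placeholder for an ontology-based semantic similarity measure.
      private double similarity(String a, String b) { return a.equals(b) ? 1.0 : 0.0; }
  }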

Assuming that the input required by this service chain, i.e. the properties monitored by the proposed candidate observation, is available to the landslide process prior to the execution of the SOS precipitation service, this chain can be used as an optimization of the set of activities specified in Sequence3. Along the same lines, similar suggestions can be identified for the rest of the process activities, i.e. wherever there are services that can exploit the related candidate observations.

The STS models of the identified candidate service chains and the STS model of the original landslide process are combined to formulate the planning problem domain. The extraction of the planning problem goals is achieved through the discovery of the expected goal states of the landslide process model. According to Figure 5, the (nominal) goal state is reached when the landslide process response is assembled out of the SafetyFactorsMapOutput, which is returned by the Invoke4 activity. Both the domain and goal planning problem descriptions are sent to the planner for the calculation of the planning problem solution.



Figure 7. Expanded version of the landslide BPEL process.

The expanded process extracted by the Output Provider from the planner’s solution is graphically illustrated in Figure 7. To avoid unnecessary clutter, our example focuses on the extensions introduced to the Sequence3 activities, but similar extensions can be provided for the whole list of process activities. As can be seen in Figure 7, the provided extensions (marked with a gray background) introduce, in addition to the original one, two alternative paths: the activity sequence executed when the floodObs condition is true, and the sequence executed when the PrecipitationObs condition is false. These paths enable the exploitation of flow- and/or precipitation-related information, which may become available on the SCS Engine, for the adaptation of the landslide process. For example, if flow-related information emerges at the SCS Engine prior to the execution of the Invoke2 activity (i.e. the [floodObs==true] condition holds), the Service Orchestration Engine will execute the Invoke6 activity. Considering that this alternative path includes a smaller set of activities, it is likely to lead to shorter execution times. Similarly, the emergence of precipitation-related information at the SCS Engine prior to the execution of the Invoke2 activity can save execution time, as the invocation of the corresponding SOS will be skipped.



Semantic Context Space Engine

The Semantic Context Space (SCS) Engine facilitates the provision of adaptable processes by offering an open mechanism for the collection and sharing of external information elements, i.e. structured data items, annotated semantically and/or spatio-temporally, and contained within a specific space. The provided mechanism is independent of the metadata primitives used for the annotation of information elements and supports their logical organization into groups, also called scopes. The SCS Engine provides its clients with a basic set of operations, which include the writing, grouping, and retrieval of information elements. Specifically, the core features of the SCS Engine are:



  • The acquisition of semantically and spatio-temporally enhanced information elements. The need for semantic annotations leads to the support of the WSML [ref] and RDFS [ref] meta-information models, along with associated meta-information search engines.

  • The support for the logical grouping of information, the so-called ‘information islands’ (e.g. information pertaining to weather conditions), as well as the specification of associations among those information scopes.

  • The provision of a loosely coupled coordination model, and more particularly, a subscribe-notify model, which ensures the decoupling between the client and the SCS Engine.

Figure 8. Information model of the SCS Engine.

Each information entity stored in the SCS Engine abides by a specific form, which is illustrated in Figure 8. In particular, the main attributes of an information entity are: (i) a unique identifier Id of each information element, (ii) a Lease that represents a fixed period of time in which the information element is considered to be valid, and (iii) a set of MetaInformation objects, which are responsible for holding the attributed meta-information properties. Instances of the Scope class are used for maintaining details about the logical groups that an information entity pertains to.

The MetaInformation class is further refined via the RDFSMetaInformation and WSMLMetaInformation classes, which hold semantic extensions described in RDFS and WSML notation respectively, as well as the SpatialFeature and TemporalFeature classes, which store spatial and temporal characteristics of the inserted information, respectively. The implementation can easily be extended to offer other types of MetaInformation if needed.
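
Rendered as Java classes, the information model of Figure 8 looks roughly as follows; the concrete field types are assumptions where the figure leaves them open.

  import java.util.List;

  // The information model of Figure 8; field types are assumed.
  class InformationEntity {
      String id;                               // unique identifier
      long leaseMillis;                        // validity period of the element
      List<MetaInformation> metaInformation;   // attributed meta-information
      List<Scope> scopes;                      // logical groups it pertains to
  }

  abstract class MetaInformation { }
  class RDFSMetaInformation extends MetaInformation { String rdfsFragment; }
  class WSMLMetaInformation extends MetaInformation { String wsmlFragment; }
  class SpatialFeature extends MetaInformation { double latitude, longitude; } // assumed
  class TemporalFeature extends MetaInformation { long validFrom, validTo; }   // assumed
  class Scope { String name; }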

Clients of the SCS Engine are typically external data sources, which use these operations to write information elements, optionally enhanced with semantic, spatial, and temporal metadata annotations.

Let us exemplify the role of the SCS Engine in the execution of the expanded landslide BPEL process of Figure 7. For the sake of our example, we assume that a sensor is available and plays the role of an external source, and that an application wrapping the sensor is also available. This application is directly connected to the RMI interface provided by the SCS Engine, can execute the provided operations, and is aware of the landslide ontology. To keep the example simple, we omit spatiotemporal annotations and assume that the sensor is located in the same area that the SOS refers to.

Figure 9. Interaction of the SCS Engine with an external source and the Service Orchestration Engine.

Figure 9 graphically illustrates the described example, showing the SCS Engine’s interaction with a sensor, which plays the role of an external source, and with the Service Orchestration Engine. The sensor periodically gathers the precipitation value of the area and writes this value along with its meta-information into the SCS. In detail, the sensor performs a write operation with an input of the form:

<valueOfPrecipitation, geoevents:#precipitation>

Turning back to Figure 7, let us suppose that the process starts executing Sequence3. To evaluate the If2 condition, the Service Orchestration Engine needs the value of the precipitation. The Service Orchestration Engine will first search the SCS Engine for the precipitation value. As the Service Orchestration Engine is integrated with the SCS Engine, this is translated into a simple read call to the SCS for values that are annotated with the meta-information geoevents:#precipitation. If the read operation returns a value for the given query, then the Assign2 and Invoke2 operations will not be executed, as the precipitation data is already available through the result of the read operation. This way, we save time by omitting the execution of the SOS (Invoke2).
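
The check performed at If2 amounts to a read-before-invoke pattern, sketched below; SemanticContextSpace is the hypothetical interface sketched earlier, and all other names are illustrative.

  import java.util.List;

  // Read-before-invoke: query the space first, call the SOS only on a miss.
  class PrecipitationStep {
      private final SemanticContextSpace scs;

      PrecipitationStep(SemanticContextSpace scs) { this.scs = scs; }

      double getPrecipitation() {
          // If2 condition: is a precipitation value already in the space?
          List<Object> values = scs.read("geoevents:#precipitation");
          if (!values.isEmpty())
              return (Double) values.get(0);   // skip Assign2 and Invoke2
          return invokeSos();                  // fall back to the SOS call (Invoke2)
      }

      private double invokeSos() { /* original Invoke2 against the SOS */ return 0.0; }
  }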



Service Orchestration Engine

The Service Orchestration Engine is the main component of the ENVISION Adaptive Execution Infrastructure, and is responsible for the decentralized execution of environmental models that are implemented and deployed as BPEL processes. Central to its architecture is the underlying P2P infrastructure, dubbed P2P Engine hereinafter, which implements a binary hypercube topology to organize an arbitrary number of available nodes. Each node hosts an instance of the Service Orchestration Engine and cooperates with the rest of the available nodes in the hypercube for the distributed deployment, execution, and monitoring of BPEL processes.



Figure 10. A three-dimensional hypercube topology.

Figure 10 illustrates a complete three-dimensional binary hypercube topology. The number on each edge denotes the dimension in which the two connected nodes are neighbors, while each node is identified by its position, which is conveniently given in Gray code. In general, a complete binary hypercube consists of N = 2^d nodes, where d is the number of dimensions, which equals the number of neighbors each node has. Hence the network diameter, i.e. the smallest number of hops connecting the two most distant nodes in the topology, is ∆ = d = log2(N).
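
The following self-contained sketch captures the basic hypercube arithmetic: flipping the k-th bit of a node identifier yields its neighbor in dimension k, so a d-cube has 2^d nodes with d neighbors each.

  // Basic hypercube arithmetic: a node's neighbor in dimension k differs
  // from it in exactly one bit, so a d-cube has N = 2^d nodes of degree d.
  class Hypercube {
      final int dimensions;   // d

      Hypercube(int dimensions) { this.dimensions = dimensions; }

      int size() { return 1 << dimensions; }   // N = 2^d

      int neighbor(int node, int dimension) {  // flip bit 'dimension'
          return node ^ (1 << dimension);
      }

      public static void main(String[] args) {
          Hypercube cube = new Hypercube(3);        // the 3-cube of Figure 10
          for (int k = 0; k < cube.dimensions; k++) // neighbors of node 000
              System.out.printf("dimension %d -> %3s%n",
                  k, Integer.toBinaryString(cube.neighbor(0, k)));
          // prints: dimension 0 -> 1, dimension 1 -> 10, dimension 2 -> 100
      }
  }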

Hypercubes have been widely used in P2P computing (Schlosser, Sintek, Decker, & Nejdl 2002), (Ren, Wang, & Liu 2006), (Anceaume, Ludinard, et al. 2008), and are particularly known for a series of attributes that are also fundamental to the applicability of our approach:



  • Network symmetry. All nodes in a hypercube topology are equivalent. No node occupies a more prominent position than the others, and any node is inherently allowed to issue a broadcast. Consequently, in our case, any node can become the entry point for the deployment and execution of a process.

  • Efficient broadcasting. It is guaranteed that, upon a broadcast, a total of exactly N − 1 messages is required to reach all N nodes in the hypercube network, with the last ones being reached after ∆ steps, regardless of the broadcasting source (see the sketch after this list). Since broadcasts are extensively used in our approach for the deployment and un-deployment of BPEL processes, this property proves to be critical in terms of performance.

  • Cost-effectiveness. The topology exhibits an O(log N) complexity with respect to the messages that have to be sent for a node to join or leave the network. Hence, the execution of the respective join and leave protocols does not impair the overall performance of the distributed BPEL engine.

  • Churn resilience. It is always possible for the hypercube topology to recover from sudden node losses. This makes it possible to deploy the distributed BPEL engine, if needed, in less controlled WAN environments, where churn rates are naturally higher than those met in centrally administered LANs.
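
The broadcast guarantee referenced in the list above follows from the classic dimensional-ordering scheme, sketched below: a node that received the message over dimension k forwards it only over dimensions lower than k, so every node receives the message exactly once.

  // Classic hypercube broadcast: exactly N - 1 messages, last nodes reached
  // after d steps. The source itself calls broadcast(source, d).
  class HypercubeBroadcast {
      static int messages = 0;

      static void broadcast(int node, int fromDimension) {
          for (int k = fromDimension - 1; k >= 0; k--) {
              int neighbor = node ^ (1 << k);   // flip bit k
              messages++;                       // one message per edge used
              broadcast(neighbor, k);           // neighbor forwards on dims < k
          }
      }

      public static void main(String[] args) {
          int d = 3;                            // 3-cube: N = 8 nodes
          broadcast(0b000, d);
          System.out.println(messages + " messages sent");   // 7 = N - 1
      }
  }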

Each node participating in the P2P Engine is capable of executing one or more individual BPEL activities as part of one or more process instance executions, while also maintaining one or more of the instances’ data variables. Thus, one or more nodes are recruited to contribute in the execution of a given process instance, and coordinate with each other in a completely decentralized manner that is exclusively driven by the structure of the corresponding process.

Figure 11. Internal architecture and main components of the P2P Engine node.

The main internal components of a node participating in the P2P Engine are shown in Figure 11. The P2P Connection Listener acts as the entry point of each node, accepting incoming requests from other nodes in the hypercube. Each request is bound to a new P2P connection, which is then passed to a P2P Connection Handler for further processing. Since the latter runs in a separate thread, it is possible for a node to simultaneously serve more than one incoming request.

Depending on its type, a request is always associated with a particular P2P service, which the P2P Connection Handler selects, instantiates and executes. P2P services fall into two distinct categories:


  1. Hypercube services are used by the node to perform the various tasks needed for the maintenance of the hypercube topology. Such tasks implement the join and leave algorithms of the hypercube protocol, as well as additional functionality such as broadcasting, random walks, heartbeat, etc., which is essential for the network.

  2. BPEL services encapsulate all functionality necessary for the distributed deployment, execution, and monitoring of BPEL processes by the nodes of the P2P Engine. Such services provide for the execution of individual BPEL activities (by employing the appropriate BPEL activity executors), the read/write of process variables, the response to notifications such as the completion of an activity or the completion of a process, etc.

P2P services may follow a simple one-way communication, or otherwise implement the request-response pattern, in which case the corresponding P2P Connection Handler is used to send back the response message. The execution of most supported P2P services includes the invocation of one or more P2P services on other nodes within the hypercube. This is typical for instance in the hypercube service implementing the broadcast scheme, or the BPEL service that is used to execute a particular activity. To support such situations, each node is equipped with a P2P Service Client, which is responsible for establishing a P2P connection with a specified node and consequently submitting the prepared service request. Finally, the majority of the supported P2P services make use of a local database that is embedded within the node. The database holds all information that is needed by a node to participate in the hypercube topology, and also maintains the various tuples, which are generated upon deployment and execution of a BPEL process.

Figure 12. Recruitment of workers for the execution of the landslide BPEL process.

Let us describe how the node recruitment algorithm works in the case of the landslide BPEL process of Figure 5. For the sake of simplicity, we assume that the Service Orchestration Engine has just started and comprises a hypercube of eight nodes (a 3-cube). Figure 12, read from left to right and top to bottom, demonstrates the sequence in which the hypercube nodes are visited upon receipt of an execution request by node 000, while Table 1 shows the recruitment results, i.e. the distribution of the BPEL activities and variables to the hypercube nodes. As can be seen, the recruitment algorithm managed to engage all available nodes while taking into account their frequency of use when distributing the workload; a simplified sketch of such a recruitment loop is given after Table 1.



Table 1. Distribution of the landslide BPEL activities and variables to the hypercube nodes.

Hypercube Node    Assigned Activities        Assigned Variables
000               Receive, Reply, Invoke2    LandslideInput, Precipitation
100               Sequence1, Assign3         HydroModelInput
110               Flow, Invoke3              HydrologicalModel
010               Sequence2, Assign4         SafetyFactorsMapInput
011               Assign1, Invoke4           DEMInput, SafetyFactorsMapOutput
111               Invoke1, Assign5           DEM, LandslideOutput
101               Sequence3                  -
001               Assign2                    PrecipitationInput
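
The chapter does not spell out the recruitment algorithm itself, but its effect in Table 1 can be approximated by the following round-robin sketch, which walks the nodes in the order they appear in the table and spreads activities evenly; the real algorithm additionally weighs each node's frequency of use.

  import java.util.*;

  // Hedged approximation of recruitment: round-robin over the visited nodes.
  class Recruiter {
      static Map<String, List<String>> recruit(List<String> nodes, List<String> activities) {
          Map<String, List<String>> assignment = new LinkedHashMap<>();
          for (String node : nodes) assignment.put(node, new ArrayList<>());
          int i = 0;
          for (String activity : activities)
              assignment.get(nodes.get(i++ % nodes.size())).add(activity);
          return assignment;
      }

      public static void main(String[] args) {
          // Node order taken from Table 1; a few of the landslide activities.
          List<String> nodes = List.of("000", "100", "110", "010", "011", "111", "101", "001");
          List<String> activities = List.of("Receive", "Sequence1", "Flow", "Sequence2",
                                            "Assign1", "Invoke1", "Sequence3", "Assign2");
          System.out.println(recruit(nodes, activities));
      }
  }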

