evolving through ...: BPEL

Service-oriented architecture (SOA) has become mainstream technology for integrating disparate systems and applications. For building composite applications, Business Process Execution Language (BPEL) has emerged as the standard for business process flow orchestration and application integration within organizations. Many IT organizations are deploying composite applications that use BPEL to automate critical business processes. If your application needs to be available 24/7, you need a reliable BPEL infrastructure. In this article, we first present a typical BPEL engine architecture and then outline its manageable entities, and the typical management functions you need to worry about as an administrator. Finally, we conclude with approaches and best practices for managing your complete BPEL infrastructure.

To effectively manage BPEL infrastructure, administrators need to manage all the components that make up the BPEL ecosystem. Before diving into management functions you need to concern yourself with as an administrator, let's look briefly at the BPEL ecosystem.

The BPEL Ecosystem

By BPEL ecosystem, we mean the system components and resources used for executing the BPEL process and those that ensure guaranteed availability of the BPEL process. Although some of those systems may not be directly under your control, you need to take some action to monitor their availability. Unequivocally, if one or more of these components go down, it will put the business process execution at risk. Figure 1 depicts the architecture of a typical BPEL engine.

A BPEL process typically depends on one or more partner links. A partner link might be a service owned by the organization, such as an EJB running an application server or a database stored procedure over which you have immediate control. It might also be a Web service provided by a partner or vendor over which you do not have any control.

The BPEL processes are run in a runtime environment called a BPEL server or engine. The availability and service level associated with BPEL processes are completely dependent on the BPEL engine. BPEL engines can run in an application server environment. You can have a single instance or a clustered application server environment if your composite application requires high availability. A BPEL server may depend on some of the resources, such as data sources or messaging services, made available by the application server environment.

BPEL processes are long running, and hence they need a persistent store to persist state. Typically the persistence store is a relational database management system. This persistence store is called a dehydration store.

All of these software components run on several nodes, or hosts, that constitute the critical piece of your BPEL ecosystem.

To summarize, a BPEL ecosystem comprises the following:

BPEL processes and partner links
BPEL engine
Dehydration store
Gateway to BPEL engine
The application server on which the BPEL server is running
Hosts on which the BPEL server, the dehydration store, and the gateway are running

The health of a BPEL process that you manage is a function of any of the entities that constitute the BPEL ecosystem, and you need to concern yourself with managing and monitoring all of these entities.

Let's dive down into the management functions for each component and which approaches you should follow to manage them, starting with the management aspects of BPEL processes.

BPEL Runtime Governance and Suitcase Deployment

As an administrator, you are responsible for deploying BPEL processes to a single server or multiple servers. Think about transitioning a BPEL process from test to production or deploying the process to additional BPEL engines to add scalability and availability to your system.

Typically BPEL processes are packaged in an archive known as a BPEL suitcase, which contains a .bpel file, WSDL for partner links, and classes and schemas required by the BPEL process. If you are using an SCA-compliant composite application, it may contain a BPEL module. As an administrator, you may be deploying the BPEL suitcase to more than one BPEL engine. You may want to deploy BPEL processes during off-peak times, so you want to automate the deployment procedure.

To automate deployment of BPEL processes, you can build scripts by using tools provided by your BPEL server vendors. If you're using a management tool, it may provide support for uploading the BPEL suitcases to the software library and then enable you to deploy them as needed. Figure 2 outlines the deployment of BPEL suitcases from a software library.

If your BPEL processes depend on resources such as adapters, it makes sense for you to automate the deployment and configuration of them if you want to avoid any manual errors.

Manage Versioning of BPEL Processes

In a dynamic environment, BPEL processes may change, depending on business needs. You need to implement a method of storing different versions of BPEL that have previously been deployed to the BPEL engine. This will help you track changes between different versions and back track any changes, should you run into issues. You may need to compare different versions of processes within one BPEL server or across BPEL servers. Management tools may provide the ability to store/retrieve and compare versions of a process and dependent artifacts such as WSDL and XML Schema that helps in the quick resolution of issues. The software library will allow you to roll back to a previous version of the BPEL process, should the need arise.

Monitor BPEL Process and Partner Links

Monitoring the health of BPEL processes is critical to meeting service-level agreements for your processes. You can probably ensure availability and meet service-level requirements by ensuring performance and functioning of the partner links your BPEL processes depend on and identifying problems in a proactive manner. You can develop some SOAP tests to verify the availability and performance of partner links. Use either a SOAP testing tool provided by a vendor or an open source Web service testing tool such as soapUI. The correct approach is to automate tests to be performed at regular intervals. You want to be alerted when a particular partner link has an unscheduled service shutdown or a longer-than-expected response time. If you're using a management tool to manage your BPEL infrastructure, check whether it provides such a capability.

SOAP Test Example

The Elevation Query Web Service returns the elevation in feet or meters for a specific latitude/longitude (WGS 1984) point from the USGS Seamless Elevation datasets hosted at http://eros.usgs.gov.

WSDL

Operation
getElevation
Input Parameter X: 45.890610
Input Parameter Y: 7.157390
other parameters: left empty
Output Parameter: elevation of latitude/longitude provided.

You can monitor this Web Service by executing a SOAP test at regular intervals through automated tests. These SOAP tests monitor the availability and perceived performance of the Web service. Figure 3 depicts automating testing of Web services.

Monitoring BPEL Processes from the End-User Perspective

Your BPEL process may be accessed from different locations, and you want to monitor the availability and performance from those locations. Say you're part of a global organization and users in North America, Europe, and Asia access your BPEL processes. Although it's challenging to monitor the performance of BPEL processes from the end-user perspective, you can create an agent at the customer locations to test the performance from their end. However, manually performing these types of tasks from a BPEL process that is accessed from multiple customer sites could be challenging. Hence, you can use the help of management tools to monitor the performance of your BPEL applications from the end-user perspective as depicted in Figure 3.

Error/Fault Monitoring

A BPEL process may raise errors and faults for several reasons, including data exceptions or non-availability of partner links, although you need to limit the number of faults in BPEL processes by proactively monitoring the processes, as we described earlier. You need to drill down into the processes that have raised faults, and you should be able to diagnose the cause of the faults and make sure that this does not happen again. You may want to monitor the processes that failed and help diagnose issues by using tools provided by your BPEL vendor and take corrective action as appropriate. It will also help if you implement some methodology to monitor SOAP messages sent to and received by the BPEL process and the partner links.

Gateway to BPEL Engines

BPEL process instances are created when invoked by a client and the traffic is routed through Web servers and load balancers. You want to be alerted when these Web servers or load balancers are not available or not meeting the expected performance goals.

Managing the BPEL Engine

The BPEL engine is by far the most important part of the BPEL ecosystem. You have to monitor the availability and performance of BPEL engines and need to be alerted when a BPEL engine is not performing as expected. You may have multiple instances of clustered BPEL engines. Hence, you need to know if any one of the engines is not performing or whether the load balancer is working properly. More important, you want to find out proactively when all the engines are not performing because it will lead to total disruption of service.

Configuration Management and Versioning

You may change the configuration of your BPEL engines from time to time. You may want to track any interim changes and to backtrack changes to a previous version. If you follow ITIL practices, this should sound familiar. You can store the configuration of a BPEL engine in a configuration management database (CMDB) or version control system. It can make your life easier to have your management tool provide integration with the CMDB in which you store your configuration.

Keeping a gold image of your BPEL ecosystem configuration and using it to compare with your current configuration will help in identifying issues attributable to configuration changes. And keeping track of components added to and/or removed from your BPEL system will also help in analyzing performance issues - for example, performance was bad during the first week of January but improved afterward, because there was one extra BPEL engine in the cluster after the first week of January. By comparing the current configuration with the gold image, you can monitor changes in some parameters that can have a big impact on BPEL instance throughput, such as dispatcher thread min/max parameters.

BPEL Engine Monitoring

The health of the BPEL engine is critical for your applications. Besides the status of the BPEL engine, there are some statistics you should focus on. They include system statistics such as memory; CPU consumed; and business metrics such as open and closed instances, synchronous and asynchronous process latency, and load factors. In the event of abnormal behavior such as high process latency or load factors, you have to proactively resolve the issues before they affect BPEL processes deployed in the engine.

As an administrator for your BPEL engine, you have to perform some of the following operations:

Archive information about completed BPEL processes
Remove from the database all XML messages that have been successfully delivered and resolved
Purge stale instances
Re-execute failed processes

Ideally, these mundane tasks should be automated in a production environment. You can build some automated scripts to perform these operations and schedule them to be performed at regular intervals. Many management tools help you with automation by providing a graphical user interface.

Dehydration Store

The BPEL engine stores process data in the dehydration store. This is a critical piece of the puzzle. As we discussed earlier, the dehydration store is typically a relational database. If you want high availability for your dehydration store, you will probably use a clustered database. Your database administrators should make sure that the database is available and performing properly.

Monitoring Adapters

Your BPEL processes may depend on adapters that access resources such as databases, messaging services, and EIS that may reside locally or remotely. The status and performance of adapters may have a heavy impact on the response time of your BPEL engine and processes.

Managing the Application Server

A typical BPEL engine runs in an application server environment. For example, Oracle BPEL Process Manager can be deployed on a J2EE-compliant application server such as Oracle Application Server, Oracle WebLogic, IBM WebSphere, or JBoss Application Server. The BPEL engine may depend on several resources and services provided by the application server, such as JDBC DataSource, JMS providers, JCA connectors, and shared libraries. The health of the application server, along with the performance of these resources, may directly influence the performance of your BPEL engine. Each application server provides several health indicator metrics. You should automate a mechanism that would proactively issue an alert before anything goes wrong with your application server.

Managing the Hosts/Nodes

Your BPEL infrastructure may be running on several nodes or machines. These server nodes may run into several resource issues during runtime. For example, they may run out of disk space due to excessive logging or run out of memory or CPU due to a spinning process. You need to proactively monitor these metrics and fix any issues before they become a problem for your infrastructure.

Bringing It All Together

It's now clear that the BPEL infrastructure includes a lot of entities. Monitoring these as independent entities could be a challenge.

A typical SOA environment will have hundreds of artifacts like Web services, BPEL processes, EJBs, adapters, and other resources. Monitoring a flat list of these artifacts is not productive. The dependency between these components and the impact of any changes you make is critical to determine. Distributed ownership of these components also requires that service-level agreements be established between providers and consumers of these components. These issues are addressed with a runtime governance offering. So it's important to employ a management tool that provides:

Translation of design-time discovery into production to help manage all the components and their dependencies in context
SLA capabilities to capture and measure SL compliance of your SOA application and SOA components

Think about a complicated BPEL process that uses several heterogeneous systems and applications. Say you're using Oracle's BPEL engine on an Oracle WebLogic server running on a Linux box. Your business processes involve partner links that depend on adapters that connect to IBM MQSeries applications and applications from Oracle's PeopleSoft Human Resources product family that run IBM WebSphere. Your BPEL engine uses an Oracle Real Application Clusters Database as the dehydration store. You want to monitor all these products together from a single management console. There are several management products on the market that may meet your needs. Figure 4 shows how you can manage the Oracle BPEL engine running on an Oracle WebLogic server from Oracle Enterprise Manager Grid Control 10g

An integrated management solution lets IT administrators manage complexity within a BPEL and SOA environment. Without this management solution, IT will see an increase in incidents, lower service levels, and increased end-user frustration. Moreover, as IT scales out with a new BPEL infrastructure, new personnel will be required to manage this complex distributed architecture. In addition, a move from legacy to SOA architecture can increase costs without a management strategy and toolset. You should consider investing in the right management product that makes sense for your infrastructure.

Another compelling reason why you should consider a management tool is for alignment between IT and your business. As an IT administrator, you need to monitor and report issues in terms of business processes. You should have monitoring statistics aligned with business processes. To accomplish this you need to have a view of business process flows that should be synched up with actual flows in the BPEL server. BPEL management tools provide you with a view of business processes and reports monitoring statistics for each business process.

Best Practices

Here are some of the best practices you can follow to ensure the availability of your BPEL infrastructure:

Establish service-level objectives for your BPEL processes and partner links.
Keep a library of BPEL suitcases in your software library. It will help in rebuilding a system in case of server failure.
Automate routine operations such as purging old process instances.
Monitor the performance of partner links.
Monitor the whole BPEL ecosystem, not just the BPEL engine.
Keep track of BPEL ecosystem membership/topology changes.
Keep a gold image of your configuration when everything is stable and keep updating it after every configuration change. This will help you find the cause of any possible problem due to configuration changes.
Monitor BPEL server-specific J2EE artifacts such as the JMS queues and data sources used by the BPEL server in addition to J2EE constructs used by BPEL processes.
Make sure that you select a management solution that can maximize your productivity and help you deliver maximum service through automation.

Conclusion

With SOA, managing the complete infrastructure - involving lots of heterogeneous components and products - has become challenging. For IT administrators, managing the service-level agreements for composite applications can be a monumental task. Using the right tools and methodology to make sure your BPEL infrastructure is highly available can make your job much easier.

Steve Smith has post a discussion on the basics of BPEL correlations. In this post, with his help, it is presented the challenge faced when trying to apply correlations to my business process, which unfortunately for me does not align with any vendor tutorials I have seen. Imagine that :)

The portion of my business process which requires correlation involves calling out to a notification web service and supplying an unbounded set of people to which I want to notify. This notification web service then responds with a single notification identifier (notification ID) representing the notification sent. At some future point in time, the people notified will respond (or the notification system will let me know there was no response). These responses can be received in any order at any time, and are received via a web service call into my ESB. The payload of this web service call includes the notification ID and the response of the person. After receiving the response, my business process updates the person information with the response and continues on. This is exactly what correlations are made for. Perfect!

Challenge 1: If I define correlation initialization on the notification web service invocation activity, this will persist 1 instance of the business process. But I have multiple responses coming back. After the first response is received, I have no business process instances waiting.

Solution 1a: I'll just loop around the web service invocation for each person in my set of people and make individual calls out to the notification web service. This will create multiple notification IDs and therefore persist multiple business process instances. So as responses come back, each one will correspond to its own instance. The business process would look something like this:

Problems: First, the business process will always run to completion as the response web service implementation gets invoked after the while loop (i.e. there is no receive task for the process to wait on). So, if we replace the web service implementation with a JMS receive task we can get over that hurdle. All we have to do is create another business process that implements a web service, and within the implementation it uses a JMS send activity to send the notification response to the corresponding JMS queue that this process is monitoring. The new business processes look like this:

Now the problem is that we really only have one business process instance being created. Although it seems like multiple business process instances are being persisted, each one really maps to the same instance. So, once the 1st response is received, the persisted instance is gone and other responses are left hanging.

Solution 1b: So, we remove the loop here and will need to create another business process (we are now up to 3 business processes) that sends messages with a single person to the inbound JMS queue. This will cause a new business process instance to be created for each person, so each response can now be correlated to one of these instances.

Challenge 2: How do we define the correlation set so that it actually works? Our correlation aliases were initially defined as the notification ID returning from the notification web service invocation and the JMS receive message text. Since the message text value (an XML message) does not match the notification ID (an int data type), the correlation is not working.

Solution: This was a difficult one. I went through many iterations with problems....err challenges encountered with each one. Here they are:

(1) Add an unmarshal task so that we can correlate on the resulting notification ID value. This will require using correlation on the Unmarshal activity. Since the JMS receive activity, which occurs prior to the unmarshal, causes the BPEL engine to retrieve an instance from persistence, and no correlation set is defined for this task, the engine grabs any instance. So essentially there is no real correlation going on.

(2) Try using the JMS message header correlation ID. This involved mapping the response notification ID to the correlation ID of the JMS message header prior to placing this on the response queue (in our second business process). The issue here was the correlation ID message property is defined as a string, while the notification ID is an int. So CAPS will not let you define an appropriate correlation key/correlation set that will work. The data types of each alias need to be the same, which makes sense. Also, since my business process has 2 JMS receive tasks, the BPEL engine gets confused when trying to create the instance identifier using the correlation ID of the JMS message header.

(3) Call for help. I discussed this problem with someone I know from Sun and he was able to provide a pre-release of some documentation that provided a good amount of insight into how to deal with correlations. The result of this new found knowledge was the creation of one more business process (for a total of 4) that actually had 2 JMS receive tasks. I could then create a correlation set based on the JMS message header correlationID and use correlations on each of the JMS receive tasks. The following images show 3 of the 4 business processes (the one not shown is simply a loop that places messages on the inbound JMS queue for the notification web service invocation process).

Notification Web Service Invocation

Notification Response Web Service Implementation

Combined Business Process Using Correlation

First, a message is placed on the inbound queue of the notification web service invocation process. This calls the external web service and uses the response to place a message on the correlation.queue JMS queue. Also, the notification ID that is the response from the web service invocation is set to the JMS message header correlationID property.

Recall that the Combined Business Process has a correlation set defined to be the correlationID message header property. The 2 JMS receive activities are set to use correlations. The first receive task initializes it and the second uses it.

So, once the first message is received, an instance of the Combine Business Process is created and then persisted (since it is now waiting on the second JMS receive). Now, at some future point in time, a person responds to the notification (or the service responds with "no response"). The response includes the notification ID and the response. The Notification Response Web Service Implementation maps the notification ID to the correlationID message property and places the response on the response.queue.

The second JMS receive task on the Combined Business Process is using this queue. This causes the BPEL engine to retrieve the business process instance that corresponds to the notification ID contained within the persons response. The process instance is retrieved (along with its state), and the business process continues to completion. This implementation of correlation finally worked.

evolving through ...

Monday

BPEL Environments and Complexity management

The BPEL Ecosystem

BPEL Runtime Governance and Suitcase Deployment

Manage Versioning of BPEL Processes

Monitor BPEL Process and Partner Links

SOAP Test Example

Monitoring BPEL Processes from the End-User Perspective

Error/Fault Monitoring

Gateway to BPEL Engines

Managing the BPEL Engine

Configuration Management and Versioning

BPEL Engine Monitoring

Dehydration Store

Monitoring Adapters

Managing the Application Server

Managing the Hosts/Nodes

Bringing It All Together

Best Practices

Conclusion

Tuesday

BPEL flows and correlation

recents

share this quickies

GeoVisite

trying to step further...

Labels

Blog Archive

follow up...

who's blog is

TOGAF

blog log

Give me a drop...

Followers

Visitors on this blog

visiting stamp

clouding

evolving through ... enhancements

evolving through .. net-stamp

BlogCatalog

Search in "evolve through..."

live traffic stamp