Sunday, February 9, 2014

Message routing over an ESB

How is the routing done over the ESB / bus?

Routing configuration can reside on the client or on the bus and can be changed at runtime.
Some bus products let you configure message routing with a DSL.
Here is an example:

from("invoices")
    .choice()
        .when(header("invoiceType").isEqualTo("clearing"))
            .to("clearingQueue")
        .otherwise()
            .to("costcenterQueue");

This shows content-based routing: depending on message metadata, the message from one queue is routed to one of two other queues.

Further example with Camel:

For routing purposes we take a quick look at Camel.
Camel is an integration platform, a framework whose goal is to provide EIP-based components.
Is Camel itself already an ESB? As there is no standard definition, the answer is yes and no.
But the core functionality it brings is definitely a routing and transformation engine.
Routes are defined in Camel with XML configuration or a DSL like the one above.
Messages are collected at the endpoints and processed through the defined routes.
A route itself consists of flow and integration logic.
A message always has a producer and a consumer - inside the Camel context (the runtime system of Camel)
there are processors that process the message further by filtering, enriching, routing, transforming etc.
The component inside Camel that manages this processing between the service provider and consumer
is called the MEC (Message Exchange Container). This component carries further information such as the unique message ID, exception information, etc.

The routes are defined in, or more precisely added to, the Camel context.
This runtime system brings all the components defined in Camel together.
A route definition is a plain Java class that extends RouteBuilder and implements the configure method.
In there the route must start with from(…) and end with to(…).

As in the snippet above, all the processing logic happens between these two points.
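A minimal route definition along these lines might look as follows. The jms: endpoint URIs are an assumption for illustration - any component configured in the context (activemq:, direct:, file:, ...) works the same way:

```java
import org.apache.camel.builder.RouteBuilder;

// A complete route definition corresponding to the DSL snippet above.
// The jms: URIs are assumptions; substitute whatever component is configured.
public class InvoiceRoute extends RouteBuilder {

    @Override
    public void configure() throws Exception {
        from("jms:invoices")                // the route starts at an endpoint
            .choice()                       // content-based router
                .when(header("invoiceType").isEqualTo("clearing"))
                    .to("jms:clearingQueue")
                .otherwise()
                    .to("jms:costcenterQueue");
    }
}
```

The route becomes active once it is added to the Camel context, e.g. context.addRoutes(new InvoiceRoute());.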

Thursday, February 6, 2014

What is the difference between an ESB, SOA and EAI?

ESB - Enterprise Service Bus

An ESB is an integration platform into which applications that want to communicate with each other have to integrate. It also defines a backbone for your enterprise landscape so that applications and services can communicate easily.

SOA - Service Oriented Architecture

A SOA describes an architecture style in which software resources of an enterprise are made accessible and discoverable on the network as dedicated, well-defined services.

EAI - Enterprise Application Integration

An EAI is driven by the business need to achieve a certain business objective by connecting applications inside an enterprise and external partner systems. It is thus a concept for integrating business functions along the value chain with the help of a dedicated IT infrastructure. As these functions are by nature provided by different applications and platforms, EAI deals with data and business process integration.


So what is the difference?

To make it short: 
SOA is an architecture style based on services; EAI is a concept about connecting applications and services into new, valuable services. The ESB, in contrast, is a concrete means of establishing an integration platform for inter-application communication.
Both SOA and EAI need concrete components for their realization, and the ESB plays a very important role in that realization.

Monday, February 3, 2014

ESB: Drawbacks and risks

Depending on the choice of broker architecture:

Very often a central hub is chosen as the broker architecture. That means all messages on the bus go through the hub.
- Scalability
-- less scalable than a distributed broker architecture

- Maintenance
++ easy to monitor
++ easy to understand
-- maintenance work on the hub becomes critical

- Stability
-- single point of failure


Service locator
The service locator is responsible for locating / identifying the service of an application that needs to be called to process a certain message.
Every service provider has to register its service at the service registry. Service consumers ask for the service and retrieve the endpoint to start consuming it.
The danger of a centralized service locator is similar to that of the central hub: if the service locator is unavailable at a certain moment, no messages are processed, as no services can be identified.
It is important that there are multiple instances of the service registry and that the configuration information is synchronized between them.
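The register-then-lookup interaction described above can be sketched in plain Java. The class and method names are illustrative only, not a real product API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of a service registry: providers register an endpoint
// under a service name, consumers look it up before calling the service.
public class ServiceRegistry {

    private final Map<String, String> endpoints = new ConcurrentHashMap<>();

    // Called by a service provider on startup.
    public void register(String service, String endpoint) {
        endpoints.put(service, endpoint);
    }

    // Called by a service consumer to resolve the endpoint to call.
    public String lookup(String service) {
        String endpoint = endpoints.get(service);
        if (endpoint == null) {
            throw new IllegalStateException("no provider registered for " + service);
        }
        return endpoint;
    }
}
```

If the single registry instance is down, every lookup fails - which is exactly why the registry itself has to be replicated.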

Messaging
Leads to decoupling between the communication partners - delivery of the messages is guaranteed by buffering, saving and forwarding.
But there are situations where asynchronous communication just adds overhead and complexity, swallowing the advantages.

Batch processing
The bus should not be misused for batch processing, as throughput may be much lower than with direct batch processing.
Mass data processing should still be accomplished by corresponding batch applications that might publish their result or special cases to the bus.

Business logic
A big danger is the possible development that business logic leaks into the bus - leading to maintainability and scalability problems.
Business logic should not be handled directly in the bus. 

Commands
The bus should not be thought of as an assembly line for placing orders.

It is much better to place business events that have occurred on the bus and let the connected applications / services consume those events.

ESB: Typical tasks

- Transports
Transportation of the messages
- Messaging
Allowing asynchronous and synchronous messaging.
- Security
An application that wants access to the bus must authenticate and must be authorized for the requested operation.
- Data transformations
A message in data model A is transformed into a message in data model B without losing content.
- Service locator
On receipt of a message the bus identifies / locates the service of an application to be called.
- Interceptors
At certain stages of message processing, the processing can be intercepted by configured service calls - e.g. allowing the realization of cross-cutting concerns.
- Protocol bindings
The bus allows forwarding a request from one endpoint to another, either leaving the message unchanged or enriching or transforming it.
- Service model

Sources that will / should be connected to the bus:
- Web services
- Queues
- Portals
- File / FTP / ...
- BPEL

Systems that would / should be connected to the bus:
- business applications
- mobile devices
- partners
- browsers
- rich clients

Friday, January 31, 2014

ESB concept / bus

1) Explanation
The ESB (Enterprise Service Bus) is the holy grail of application integration.
It is essentially an integration platform used to let applications integrate more easily into an enterprise landscape.
The ESB defines the infrastructure, the foundation, in which the different applications and services of an enterprise landscape integrate. It stands for a specific architecture style enforcing a communication bus for inter-application or inter-service communication. Depending on the structuring of the enterprise landscape - service oriented or application based - the bus is used for service integration, as opposed to "only" connecting different applications over a common communication bus.

2) Initial situation
An enterprise landscape with many applications that need to communicate with each other
leads to many point-to-point connections with all their disadvantages:
implementing a retry mechanism in each application that interacts with another application,
and dealing with the complexity of timeouts (connection and read timeout) - finding the right timeout window and distinguishing between operations that can and that shouldn't be repeated.

3) Benefits
Business events that have happened are published by the source application to the bus.
Typically there is an application interested in this event, so it consumes it - just as the same processing would have been done with a point-to-point communication.
But now the benefits of an enterprise bus come into play: if business requirements change and the enterprise landscape grows, more and more applications might potentially be interested in that kind of event - and the producing application does not have to be changed at all.
Additional systems that are interested in the event just subscribe to the topic; or, if there is a dispatcher application in place that consumes special kinds of messages on the bus, it dispatches the requests to the corresponding systems.
Integration scenarios are typically configured / implemented with the help of a DSL, leading to better maintainability and faster realization.


ESB allows:
- Heterogeneous environments become interoperable
- Service virtualization, also known as location transparency
The service consumer and the service provider are decoupled by the ESB.
With service virtualization the service consumer does not have to be reconfigured if the service provider's endpoint information changes.

- Service versioning
With message transformation, a message of an interface version that is no longer supported is translated into a message of the new interface version.
This again benefits from the fact that the service consumer is decoupled from the service provider.
The additional "layer" can be used for a technical mapping between the old interface and the new one,
allowing business adaptation independent of interface availability.

- Transaction-based message processing
The message is taken from the queue, and during the processing of the defined workflow - possibly involving different services and service calls - the whole transaction is only committed once the flow has reached the last service, for example a database adapter.
So the ESB can be used to coordinate distributed transactions with different services involved. The client only has to mark the begin and end of a transaction.

The coordination work is done by the bus.


4) Background
Basic groundwork was done by Hohpe and Woolf with their work on EIP - Enterprise Integration Patterns.
They described common integration scenarios, abstracted them into different patterns, and categorized them into the following 6 categories:

- Message endpoints
All kinds of patterns for connecting applications that should integrate with each other.
E.g.:
- Polling consumer - an adapter that periodically polls a data source and consumes the data that needs to be processed.
- Service activator - a component that locates a service of an application which will then be called.

- Message construction
All kinds of patterns that deal with the messages themselves.
E.g.:
- Correlation identifier - the correlation id is used to map messages to a transaction. Messages that took part in one transaction should have the same correlation id.
- Message expiration - messages have a period of time in which they need to be processed. If this time passes without processing, they are dropped by the message broker or moved to a dead letter queue.

- Message channels
All kinds of patterns that describe how messages are delivered to message endpoints.
E.g.:
- Guaranteed delivery - the producer can rely 100% on the fact that once a topic or queue has confirmed receipt of a message, the message will be processed later on. After the confirmation the resources of the JMS client are freed and it can continue with further actions/processing.
- Point-to-point channel - a message channel pattern to realize synchronous communication, e.g. based on JMS. The sender blocks until the receiver of the message has processed it and delivered the result back to the sender.

- Message routing
All kinds of patterns that do not change the semantics of a message - no content change - but forward messages to different endpoints based on certain rules.
E.g.:
- Aggregator - split messages are composed into a new resulting message.
- Splitter - a message is split into several parts that result in new messages.
- Message transformation
All kinds of patterns that change the content of a message.
E.g.:
- Enricher - a message is enriched with further information from other services, data sources etc.
- Message translator - transforms a message of data model A into a message of data model B without changing the semantics of the message.
- System management
All kinds of patterns that can't be categorized into the above categories or that have general supporting characteristics.
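The correlation identifier pattern from the message construction category can be sketched in plain Java. The classes below are illustrative only, not a real messaging API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch of the Correlation Identifier pattern: a reply is matched back to
// its original request via a shared correlation id.
public class Correlator {

    public static class Message {
        public final String correlationId;
        public final String body;
        public Message(String correlationId, String body) {
            this.correlationId = correlationId;
            this.body = body;
        }
    }

    private final Map<String, Message> pendingRequests = new HashMap<>();

    // Remember an outgoing request under its correlation id.
    public void sendRequest(Message request) {
        pendingRequests.put(request.correlationId, request);
    }

    // Return the original request an incoming reply belongs to, if any.
    public Optional<Message> match(Message reply) {
        return Optional.ofNullable(pendingRequests.remove(reply.correlationId));
    }
}
```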


Saturday, January 25, 2014

GAE and JPA

How does JPA work in a GAE environment?

The Google App Engine (GAE) supports JPA, but persistence is not done in a relational database.
It uses a NoSQL database based on BigTable technology.
So there are some restrictions:
1) Polymorphic queries
2) Aggregation functions
3) Transactional behavior: in a transaction only objects of the same entity group may be changed
4) ...

1) In an object-oriented abstraction the data model knows about inheritance relations between the entities, and they get persisted accordingly, e.g. each class is saved into a separate table.
When querying such a structure with JPQL, GAE does not allow the use of polymorphic queries:
A extends B extends C
If you are not interested in retrieving one specific entity type, it is very handy to retrieve all entities based on C, the top superclass, and apply abstracted treatment where common treatment is possible.
So a "from C where …" JPQL query is possible on a relational database, but unfortunately fails in a GAE environment.

2) Aggregation functions like SUM, AVG, … are not usable on GAE.
The like operator is limited in use - it can only be used at the end of a search token, e.g. ... like 'Adam R%' ....

3) The transactional behavior JPA offers is limited on the GAE platform.
In one single transaction only objects of the same entity group may be changed.
Their changes are then applied accordingly in the database.
An entity group is a collection or grouping of objects that forms a data structure. This data structure consists of a root object and dependent objects. The members of an entity group are
- a root object (the starting entity)
- and objects depending on it.
On creation of an instance, an object can point to a parent entity.
The entity without a parent entity is the so-called root entity.
Datasets of these entity groups reside on different nodes of the cluster in the distributed data storage. But one single dataset is normally physically stored on one node, so that communication overhead during data processing is reduced.

One may ask how the relationships inside an entity group are realized, as GAE does not run on a relational database.
The relationship is not handled as on a relational database by using attributes (in JPA language: properties). Instead, the primary key of the entity is used to route through the hierarchy:
in such an entity group we have a parent with a primary key, and every child has a pk that contains the parent pk. A pk normally consists of the type and a certain id; hence a child pk consists of the type + id of the parent + its own id.
For example: Invoice(5)/InvoiceItem(1)

Because of this, @ManyToMany and joins are not usable either.
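How a child primary key embeds its parent's key, as in the Invoice(5)/InvoiceItem(1) example, can be sketched in plain Java. The class and its path format are illustrative only, not the GAE datastore API:

```java
// Illustrative sketch of hierarchical keys in an entity group: a child key
// is the parent key plus the child's own type and id.
public class EntityKey {

    public final String path;

    // Root entity: no parent, the key is just type(id).
    public EntityKey(String kind, long id) {
        this.path = kind + "(" + id + ")";
    }

    // Child entity: the parent key is a prefix of the child key.
    public EntityKey(EntityKey parent, String kind, long id) {
        this.path = parent.path + "/" + kind + "(" + id + ")";
    }
}
```

Because the parent key is a prefix of every child key, all members of one entity group can be routed to the same node.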

Tuesday, March 27, 2012

Optimistic / pessimistic locking LockModeType

What is the difference between optimistic locking and pessimistic locking?
First of all, locking is intended for managing transactions.
If transactions run serially, that means one after another,
nothing can go wrong:
The transaction fulfills the ACID properties:
- atomic
- consistent
- isolated
- durable

From database view a transaction should always have these properties:
It is executed as a whole or not at all - atomic.
It drives the database from one consistent state to another consistent state - consistent.
Transactions do not influence each other - isolated.
Transactions and their changes are saved durably in the database - durable.
From a JPA or Hibernate view the easiest way is using the highest isolation level:
Serializable.
But serial transactions in highly frequented systems have bad performance and scalability.
Therefore the degree of serialization of an application / system is reduced to increase performance and throughput.
You pay for this with problems that occur as a consequence of the lower isolation levels:
  • Dirty reads: changes are read by other transactions before they are committed. If a rollback takes place, you have read something wrong.
  • Non-repeatable reads: rows are read, another transaction makes changes, the rows are read again - the data now differs from that at the beginning of the transaction.
  • Phantom reads: a query delivers different results during the course of a transaction.
Normally there are some places in your application or system which have to handle transactions in a highly secure way: the business logic needs high transaction safety, and this is paid for with less performance.

How is transaction management done in JPA?
In JPA you normally have 2 scenarios:
  1. a JEE environment with a JTA transaction manager
  2. no JEE environment - transactions are managed by the application itself
Assuming scenario 2, the entity manager is used for transaction management via the EntityTransaction interface, which is delivered by EntityManager.getTransaction().
It defines the usual methods for transaction management:
  • begin()
  • commit()
  • rollback()
In a Spring environment transactions are managed by an annotation declared on the method:
@Transactional(isolation=Isolation.READ_COMMITTED, propagation=Propagation.REQUIRED)

If a method is annotated with @Transactional you get the behaviour shown above:
the isolation level is set to the database default (usually READ_COMMITTED) and a transaction is required.

Locking strategies
JPA behaves as you already know it from Hibernate:
if an entity is annotated with @Version, optimistic locking takes place for this entity.
Optimistic locking is the mechanism where objects are not explicitly locked at the beginning of a transaction.
It optimistically assumes that no conflict will take place. Therefore, at the moment of persisting/writing on commit, a version check takes place:
every update on an entity increments the version of the entity.
The version property of an entity can only be written by the JPA provider.
On commit the entity is checked for a differing version value.
If the versions differ, an OptimisticLockException is thrown.
If not, the entity with the new version number is persisted in the database.
The OptimisticLockException should be handled by the application.
This behaviour is delivered by using @Version.
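The commit-time version check can be sketched in plain Java. No JPA provider is involved; the class and method names are illustrative, and IllegalStateException stands in for OptimisticLockException:

```java
// Plain-Java sketch of the version check a JPA provider performs on commit
// for an @Version-annotated entity.
public class VersionedEntity {

    private long version;

    public VersionedEntity(long version) {
        this.version = version;
    }

    // Simulates the provider's commit: compare the in-memory version with
    // the version currently stored in the database.
    public void commit(long versionInDatabase) {
        if (version != versionInDatabase) {
            throw new IllegalStateException(
                "OptimisticLockException: entity was changed concurrently");
        }
        version++; // every successful update increments the version
    }

    public long getVersion() {
        return version;
    }
}
```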
With Hibernate as the JPA provider and the transaction isolation level set to Repeatable Read or Serializable, the version check is done explicitly with a select on the entity to retrieve the actual version stored in the database. This is the Hibernate-specific LockMode.READ.
If the cache is used for version checking this corresponds to LockMode.NONE.
LockMode.UPGRADE corresponds to the JPA mode LockModeType.READ.

A more restrictive optimistic locking mechanism can be configured by explicitly locking the objects in question:
EntityManager.lock(object, LockModeType t);
If LockModeType.READ is set, the corresponding object is normally locked during commit by a select ... from table for update:
an exclusive lock is set on row level for a very short time span.
There is also LockModeType.WRITE, which increments the version - even if nothing has changed on the entity.

Pessimistic locking - that means locking objects at the beginning of a transaction and keeping the lock for the duration of the transaction - is done with these 2 pessimistic lock modes:
- LockModeType.PESSIMISTIC_READ -->
the entity can be read by other transactions but no changes can be made
- LockModeType.PESSIMISTIC_WRITE -->
the entity can not be read or written by other transactions
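As an analogy only - not JPA API - the two modes behave like a Java read-write lock: PESSIMISTIC_READ corresponds to a shared read lock, PESSIMISTIC_WRITE to the exclusive write lock:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Analogy: a shared/exclusive lock pair mirrors the semantics of the two
// pessimistic lock modes. This is java.util.concurrent, not JPA.
public class PessimisticLockDemo {

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // PESSIMISTIC_READ: a shared lock - further readers are still allowed
    public boolean acquireRead() {
        return lock.readLock().tryLock();
    }

    // PESSIMISTIC_WRITE: an exclusive lock - fails while a read lock is held
    public boolean acquireWrite() {
        return lock.writeLock().tryLock();
    }
}
```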