Thursday, February 27, 2014

JPA2 new features

What features came with JPA2?

JPA 2.0 was delivered with Java EE 6.
JPA 2.1 shipped with Java EE 7 and is currently the latest version that can be used.
Features:
  1. configuration properties have been standardized
  2. support for using cache solutions
  3. better and finer-grained locking support
  4. JPQL enhancements
  5. support for the validation API
1. In the first JPA version the properties in the XML configuration were proprietary, so the property names differed per JPA provider:
in Hibernate the URL of the datasource was named "hibernate.connection.url", in TopLink it was named "toplink.jdbc.url". Now common properties have been standardized:
<property name="javax.persistence.jdbc.driver" value="XXX"/>
<property name="javax.persistence.jdbc.url" value="XXX"/>
<property name="javax.persistence.jdbc.user" value="XXX"/>
<property name="javax.persistence.jdbc.password" value="XXX"/>


2) Cache support allows the main operations (sketched below):
  • check whether an entity is in the cache: boolean contains(Class cls, Object primaryKey)
  • remove an entity from the cache: evict(Class cls, Object primaryKey)
  • remove all entities of a type: evict(Class cls)
  • clear the cache: evictAll()
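The Cache interface is obtained from the EntityManagerFactory. A minimal sketch of these operations - Customer and customerId are made up for illustration:

import javax.persistence.Cache;
import javax.persistence.EntityManagerFactory;

public class CacheDemo {

    public void inspectCache(EntityManagerFactory emf, Long customerId) {
        Cache cache = emf.getCache();                 // the provider's second-level cache
        if (cache.contains(Customer.class, customerId)) {
            cache.evict(Customer.class, customerId);  // remove a single entity
        }
        cache.evict(Customer.class);                  // remove all Customer entities
        cache.evictAll();                             // clear the whole cache
    }
}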

3) Better support for locking modes
  • OPTIMISTIC
  • OPTIMISTIC_FORCE_INCREMENT
  • PESSIMISTIC_READ / PESSIMISTIC_WRITE
  • PESSIMISTIC_FORCE_INCREMENT
With the retrieval API of the entity manager you can specify one of the above-mentioned modes, or lock the entity after obtaining it:
      EntityClassX entity = em.find(EntityClassX.class, id, LockModeType.PESSIMISTIC_WRITE);
vs.
     em.lock(entity, LockModeType.PESSIMISTIC_WRITE);

Of course it is also possible to read the entity without a strict lock mode, apply business logic to it, and only obtain the lock towards the end of the business transaction:
     em.refresh(entity, LockModeType.PESSIMISTIC_WRITE);
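Put together, a minimal sketch of this late-locking pattern (EntityClassX and the business logic step are placeholders from above):

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.LockModeType;

public class LateLockDemo {

    public void updateWithLateLock(EntityManagerFactory emf, Long id) {
        EntityManager em = emf.createEntityManager();
        em.getTransaction().begin();

        EntityClassX entity = em.find(EntityClassX.class, id); // plain read, no DB lock yet
        // ... apply business logic to the entity ...
        em.refresh(entity, LockModeType.PESSIMISTIC_WRITE);    // reload and lock the row

        em.getTransaction().commit();                          // lock is released on commit
        em.close();
    }
}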



4) Enhancements of JPQL
  • date and time support via JDBC escape syntax, e.g. {d '2014-02-27'} or {t '14:00:00'}
  • member support: FROM Order o WHERE 'RECURRING_INVOICES' MEMBER OF o.types
  • comparing collections against empty: FROM Order o WHERE o.orderItems IS EMPTY
  • index support (INDEX(t) refers to the position of an item in an ordered list mapping, i.e. one with @OrderColumn - not to a database table index): WHERE INDEX(t) BETWEEN x AND y
  • ...

5) Validation
The validation part used with JPA2 is based on the JSR-303 specification, whose reference implementation is Hibernate Validator.
It is important to mention that JPA2 does not explicitly define its own bean validation support.
So the JPA provider can integrate a bean validation provider. With Hibernate as the JPA provider, Hibernate Validator is used.
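A minimal sketch of a JSR-303 annotated entity (Customer and its fields are made up for illustration). With a bean validation provider on the classpath, the constraints are checked when the entity is persisted or updated:

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.validation.constraints.NotNull;
import javax.validation.constraints.Size;

@Entity
public class Customer {

    @Id @GeneratedValue
    private Long id;

    @NotNull          // must be set before persisting
    @Size(max = 80)   // constrains the value length
    private String name;
}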




Saturday, February 22, 2014

Spring batch - special aspects of batch processing

Spring batch what for?

Spring Batch is a framework specially designed for batch processing.
It is intended for processing e.g. files with large amounts of data, and it provides a clear, XML-based DSL.
The framework comes with abstractions and defaults that have "extension points" where business or processing logic can be placed.

Why should I use spring batch?

It is regarded as the standard framework for batch processing, and a lot of developers know how to use it.
It provides useful abstractions and can be configured in many regards to support more advanced requirements:
  • transaction support
  • retry support
  • skip functionality
  • perfectly integrated into the Spring world (DI etc.)
  • strong layered architecture
  • very scalable due to support of step partitioning, multi-threaded steps, …

Basic concept

In Spring Batch the processing starts with a Spring Batch job, which consists of steps.
A step can be chunk-oriented or a so-called TaskletStep; the latter mainly exists for supporting legacy code.
The main components of Spring Batch are (see the sketch after this list):
  • ItemReader - reads one item
  • ItemProcessor - processes one item; it is optional
  • ItemWriter - writes a list of items
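A minimal ItemProcessor sketch - Invoice and its isValid() method are hypothetical:

import org.springframework.batch.item.ItemProcessor;

public class InvoiceProcessor implements ItemProcessor<Invoice, Invoice> {

    @Override
    public Invoice process(Invoice item) throws Exception {
        // Returning null filters the item out: it is not passed on to the ItemWriter.
        return item.isValid() ? item : null;
    }
}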
Besides these there are different types of listeners for placing business
logic:
  • job/step execution listener
  • chunk listener
  • ItemRead/Process/Write-Listener
  • SkipListener
The transaction boundary is never placed around whole steps or a complete job.
Metadata like execution start/end times, the number of commits/rollbacks, the step status etc. are saved at several points:
  • step execution context - a map that is used for serializing data
  • chunk execution context - used inside a chunk transaction to know the current item in process
The default rollback behavior: if an uncaught exception occurs during the processing of a chunk, that chunk is rolled back.
All chunks committed until then stay committed, but the complete job fails.

The metadata of a step is initialized at the beginning of the step and updated at its end. This happens in separate transactions: the step status must be updated in a transaction of its own, because the processing of the step itself can fail and must then be rolled back.

A Spring Batch job consists of steps, as we know now, and these steps consist of chunks. Each chunk is executed in its own transaction.
How does Spring Batch know how much data has to be read into a chunk?
This is specified by a policy: the CompletionPolicy.
Specifying the commit interval on the chunk tag leads to a SimpleCompletionPolicy.
As soon as the amount of items read satisfies the completion policy, the read and processed items are passed to the ItemWriter - see the configuration sketch below.
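A hedged configuration sketch using the Java config style (available since Spring Batch 2.2); the XML equivalent sets commit-interval on the chunk tag. Invoice and the reader/processor/writer beans are assumptions:

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class InvoiceJobConfig {

    @Bean
    public Step invoiceStep(StepBuilderFactory steps,
                            ItemReader<Invoice> reader,
                            ItemProcessor<Invoice, Invoice> processor,
                            ItemWriter<Invoice> writer) {
        return steps.get("invoiceStep")
                .<Invoice, Invoice>chunk(10)  // SimpleCompletionPolicy: chunk completes after 10 items
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }
}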

Restart of a job

How is a restart of a job done?
For batch processing, data that is retrieved not from a file but from database table(s) or a messaging system is read via a dedicated non-transactional datasource, because on a processing error a rollback would otherwise close the data retrieval channel as well.
No other job, component or module should operate on this non-transactional datasource. The data is normally read using a database cursor in order to avoid memory issues; Spring Batch provides the JdbcCursorItemReader for this, sketched below.
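A hedged sketch of such a reader - Invoice and the invoice table are made up, and the datasource is the dedicated non-transactional one:

import javax.sql.DataSource;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.jdbc.core.BeanPropertyRowMapper;

public class ReaderFactory {

    public JdbcCursorItemReader<Invoice> invoiceReader(DataSource nonTxDataSource) {
        JdbcCursorItemReader<Invoice> reader = new JdbcCursorItemReader<Invoice>();
        reader.setName("invoiceReader");                             // key for restart state
        reader.setDataSource(nonTxDataSource);                       // dedicated, non-transactional
        reader.setSql("SELECT id, amount FROM invoice ORDER BY id"); // explicit ordering, see below
        reader.setRowMapper(new BeanPropertyRowMapper<Invoice>(Invoice.class));
        return reader;
    }
}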
The restart of a job is detected by Spring Batch itself: if a job is launched with the same job parameters and the previous execution ended in a failure, Spring Batch treats the launch as a restart.
To accomplish this, the state is saved in the execution context of the chunk.
Every reader persists the counter of read items inside the transaction of the chunk, and every chunk commits its work inside its own transaction towards the end of its work. Meaning: if the completion policy is set to 10, the chunk tries to commit its work after the 10th item. On job restart the execution point of the last successfully committed chunk is taken, and the items not yet committed are reprocessed.
The execution context is seen by all readers - that means the state of the counter can be modified by different readers. This setup is not thread-safe!

Ordering

For the restartability of a job the ordering of the read data must be well-defined.
Whether the data comes from a database or from a file, the ordering of the data retrieval must be set explicitly, so that on a restart the job gets the items that have to be reprocessed in the same order as on the first run.














Friday, February 21, 2014

What is the impact of the ESB on the work of a requirement analyst?



  • Work gets easier
  • Helping the requirement analysis
  • Moves the focus to the business events
  • Volère model



Why?

Let's look at the requirement analysis starting point:

Everything begins with the system idea document.
Next step on the to-do list: the stakeholder matrix.
Leading to the business context diagram:
  It mainly covers the actors, parties and processes
  of the business process that is to be supported or automated.
  It should discover all business events coming to the work.

The next starting point for requirements trawling:
Start from the business events.
Every business event is answered by a business use case of the work.
The requirement analyst should analyse each business use case with use case templates,
then talk to the solution designers and architects.


What is easier?

Business events arrive on the bus!
The BA should start the work here.



Wednesday, February 19, 2014

Benefits and usage of spring data JPA

What are the benefits of using spring data JPA?

Spring Data JPA addresses the following situations:
  • it is unclear how the persistence layer will develop
    • because of time and focus, the first prototype starts with a map; later it will probably be replaced with a long-term persistence solution
  • the persistence layer might change from relational to NoSQL or vice versa
  • for the sake of a set of fast-running unit tests, the persistence layer might be configured to use a light-weight persistence mechanism like a simple map

To stay really open regarding the persistence layer, the domain layer should be separated from the data access layer. An approach like the repository pattern described by Martin Fowler is common practice here.
The repository enforces treating the objects of a type as a "conceptual set", like a collection.
With a simple DAO approach you see the DAO as a gateway for accessing the database.
Such DAOs tend to grow extensively as new querying or update functionality is needed,
which leads to poorly focused responsibilities. With the repository you treat all the objects as a conceptual set.
For querying and update extensions the repository makes use of DAO(s).
So the DAOs stay well-focused and have a single responsibility for gathering / changing data.
The set of objects of a type is handled in the repository.
At the beginning of development you can use a simple in-memory store such as a map
in order to focus on the domain logic etc. Later you can delegate the storage and access to dedicated DAO(s).
So the domain objects used by the business logic in the domain layer are developed against the interfaces exposed by the repository interfaces.

The repository layer is placed on top of the JPA entities.
The repository interfaces are placed next to the domain objects. They present to the outside exactly the interfaces the domain layer works with, and they provide basic CRUD functionality.

Example:
domain layer: Customer implements ICustomer
repository layer: CustomerRepository delivers ICustomer
persistence layer: CustomerRepositoryImpl implements the CustomerRepository

The CustomerRepositoryImpl could also make further use of DAOs to access the objects.
The CustomerRepositoryImpl makes use of the JPA EntityManager and defines the transactional context:

public class CustomerRepositoryImpl implements CustomerRepository {

    @PersistenceContext
    private EntityManager entityManager;

    @Transactional
    public ICustomer save(ICustomer customer) {
        Customer c = new Customer(customer); // wrap the domain interface in the JPA entity
        entityManager.persist(c);
        return c;
    }
}

Usage of spring data jpa

Spring Data JPA has the objective to simplify the development of the repository layer mentioned above, as this code is boilerplate code. With Spring Data JPA you only have to define the interface, an implementation for delegation, and the corresponding Spring configuration.
The rest is instantiated and delivered by Spring.
So first of all we have to define the repository interface:

public interface CustomerJpaRepository extends JpaRepository<Customer, Long> {
    Customer save(Customer customer);
}

Second, we need the Spring configuration:
<jpa:repositories base-package="…">
   <jpa:repository id="customerJpaRepository" />
</jpa:repositories>

Unfortunately Spring Data JPA can't operate directly on the domain interface (ICustomer); it always needs the specific JPA entity.
Therefore we implement the CustomerJpaRepository interface ourselves,
inject the instance generated by Spring Data JPA, and delegate all operations to it:

public class CustomerJpaRepositoryImpl implements CustomerJpaRepository {

    @Autowired
    private CustomerJpaRepository repo; // the instance generated by Spring Data JPA

    public Customer save(Customer customer) {
        return repo.save(customer);
    }
}

In the background Spring Data JPA dynamically provides an instance of the interface and makes it available under the id customerJpaRepository.


Advantages of spring data jpa

Spring Data JPA provides finder methods out of the box.
Based on naming conventions, a findByX method is provided by Spring Data JPA dynamically and returns all entities whose field X equals the given parameter value.
Besides that there are other useful features like paging (including sorting) and more - see the sketch below.
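A short sketch of such derived finders - it assumes the Customer entity has lastName and city fields:

import java.util.List;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.jpa.repository.JpaRepository;

public interface CustomerFinderRepository extends JpaRepository<Customer, Long> {

    // Derived query: SELECT c FROM Customer c WHERE c.lastName = ?1
    List<Customer> findByLastName(String lastName);

    // Paging and sorting come for free via the Pageable parameter
    Page<Customer> findByCity(String city, Pageable pageable);
}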





Friday, February 14, 2014

Requirements analysis: Business context diagram

Benefit of a business context diagram?

Defines the scope of the work we have to study.
It shows the work as a single, as-yet-uninvestigated process.
It is surrounded by adjacent systems and actors.
Arrows show the data flows between the work and the adjacent systems → carried as business events.

The business context diagram shows where the responsibilities of the adjacent systems start and end.

The data flow makes it clear what has to be done by the adjacent systems and what has to be done by the work itself:
preplanned business use cases that are activated as soon as an actor initiates a business event.


Why do business events and business use cases help?

  • A way to partition the work in a non-subjective way by identifying responses to outside stimuli.
  • Benefit of a clear view of the needed functionality.
  • Internal partitions are mainly the result of technologies, design and history.
  • Business events point out what belongs together.
  • Perfect vehicle for further requirement analysis work!






Benefits of an ESB

The source application / service does not have to be changed

Business events that have happened are published by the source application to the bus.
Typically there is an application interested in such an event, so it consumes it - just as this kind of processing would have been done with a point-to-point communication.
→ If business requirements change and the enterprise landscape grows, more and more applications might become interested in these kinds of events. The producing application and the already existing consumers do not have to be changed at all.

Easy integration through a DSL

Additional systems that are interested in an event simply subscribe to the corresponding topic; alternatively, a dispatcher application consumes special kinds of messages lying on the bus and dispatches requests to the corresponding systems.
Integration scenarios are typically configured / implemented with the help of a DSL, leading to better maintainability and faster realization.

Further benefits


  • Heterogeneous environments get interoperable
  • Service virtualization aka location transparency
  • Service versioning
  • Transaction-based message processing






Sunday, February 9, 2014

Message routing over an ESB

How is the routing done over the ESB / bus?

Routing configuration can be located on the client or on the bus and can be changed at runtime.
Certain bus products allow you to configure the routing of messages by using a DSL.
Here is an example:

from("invoices").choise().when(header("invoiceType").isEqualTo("clearing")
.to("clearingQueue")
.otherwise("costcenterQueue")

This shows content-based routing: depending on message metadata (a header), the message from one queue is routed to one of two other queues.

Further example with Camel:

For routing purposes we take a quick look at Camel.
Camel is an integration platform - a framework with the goal of providing EIP-based (Enterprise Integration Patterns) components.
Is Camel itself already an ESB? As there is no standard definition, the answer is yes and no.
But the core functionality it brings is definitely a routing and transformation engine.
Routes are defined in Camel with XML configuration or a DSL like the one above.
The messages are collected at the endpoints and processed through the defined routes.
A route itself consists of flow and integration logic.
A message always has a producer and a consumer - inside the Camel context (the runtime system of Camel)
there are processors that process the message further by filtering, enriching, routing, transforming etc.
The component inside Camel that manages this processing between service provider and consumer
is called the MEC (Message Exchange Container) - in Camel's API this is the Exchange. It carries further information such as the unique message ID, exception information, etc.

The routes are defined in - or, to be more precise, added to - the Camel context.
This runtime system brings all the components defined in Camel together.
A route definition is a plain Java class that extends RouteBuilder and implements the configure method.
In it, a route must start with from(…) and end with to(…).
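A minimal RouteBuilder sketch, assuming a JMS component is registered under "jms" in the context:

import org.apache.camel.CamelContext;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.impl.DefaultCamelContext;

public class InvoiceRoute extends RouteBuilder {

    @Override
    public void configure() {
        // Content-based routing, as in the DSL snippet above.
        from("jms:invoices")
            .choice()
                .when(header("invoiceType").isEqualTo("clearing"))
                    .to("jms:clearingQueue")
                .otherwise()
                    .to("jms:costcenterQueue");
    }

    public static void main(String[] args) throws Exception {
        CamelContext context = new DefaultCamelContext();
        context.addRoutes(new InvoiceRoute()); // register the route definition
        context.start();
        // ... let the route run; stop with context.stop()
    }
}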

As in the snippets above, all the processing logic happens between these two points.

Thursday, February 6, 2014

What is the difference between an ESB, SOA and EAI?

ESB - Enterprise Service Bus

An ESB is an integration platform into which applications that want to communicate with each other are integrated. It forms a backbone of the enterprise landscape so that applications and services can communicate easily.

SOA - Service Oriented Architecture

A SOA describes an architecture style in which software resources of an enterprise become accessible and discoverable on the network as dedicated, well-defined services.

EAI - Enterprise Application Integration

EAI is driven by business needs: a certain business objective is achieved by connecting applications inside an enterprise and external partner systems. It is thus a concept for integrating business functions along the value chain with the help of a dedicated IT infrastructure. As these functions are by nature provided by different applications and platforms, EAI deals with data and business process integration.


So what is the difference?

To make it short:
SOA is an architecture style based on services; EAI is a concept for connecting applications and services into new, valuable services. The ESB, in contrast, is a concrete means of establishing an integration platform for inter-application communication.
Both SOA and EAI need components for their concrete realization, and the ESB plays a very important role in it.

Monday, February 3, 2014

ESB: Drawbacks and risks

Depending on the choice of the broker architecture:

Very often a central hub is chosen as the broker architecture. That means all messages on the bus go through the hub.
- Scalability
-- less scalable than a distributed broker architecture

- Maintenance
++ easy to monitor
++ easy to understand
-- maintenance work on the hub becomes critical

- Stability
-- single point of failure


Service locator
The service locator is responsible for locating / identifying the service of an application that needs to be called to process a certain message.
Every service provider has to register its service at the service registry. Service consumers ask for the service and retrieve the endpoint to start consuming it.
The danger of a centralized service locator is similar to that of the central hub: if the service locator is unavailable at some point in time, no messages are processed, as no services can be identified.
It is important that there are multiple instances of the service registry and that the configuration information is synchronized between them.

Messaging
Messaging leads to decoupling between the communication partners - delivery of the messages is guaranteed through buffering, saving and forwarding.
But there are situations where asynchronous communication just adds overhead and complexity, swallowing the advantages.

Batch processing
The bus should not be misused for batch processing, as the throughput may be much lower than with direct batch processing.
Mass data processing should still be accomplished by dedicated batch applications, which might publish their results or special cases to the bus.

Business logic
A big danger is the creeping development by which business logic leaks into the bus - leading to maintainability and scalability problems.
Business logic should not be handled directly in the bus.

Commands
The bus should not be thought of as an assembly line for placing orders.

Much better is to place business events that have occurred on the bus and let the connected applications / services consume those events.

ESB: Typical tasks

- Transports
Transportation of the messages.
- Messaging
Allowing asynchronous and synchronous messaging.
- Security
Applications that want access to the bus must authenticate and must be authorized for the requested operation.
- Data transformations
A message of data model A is transformed into data model B without losing content.
- Service locator
On receipt of a message the bus identifies / locates the service of the application to be called.
- Interceptors
Message processing can be intercepted at certain stages by configured service calls - e.g. enabling the realization of cross-cutting concerns.
- Protocol bindings
The bus allows forwarding requests from one endpoint to another, either leaving the message unchanged or enriching / transforming it.
- Service model

Sources that will / should be connected to the bus:
- Web services
- Queues
- Portals
- File / FTP / ...
- BPEL

Systems that would / should be connected to the bus:
- business applications
- mobile devices
- partners
- browsers
- rich clients