Bare bones SOA
Software vendors have hijacked a potentially useful concept by pushing heavyweight, complex tools like ESBs. The goal of this article is to find out which of those tools we really need, so we can stay away from unnecessary complexity. I'll do that by describing the infrastructure services we really need and how they can be implemented in the simplest possible way.
Software depends on other software because we don't want to build systems from scratch. Each piece of software your code depends on may:
- change the address or name by which it is known
- change the technology it is implemented in
- be unavailable when you need it
- live on a different server or in the same process
- be connected through infrastructure that cannot be trusted
- speak a different language
A name is no more than an abstraction: using an algorithm, a name can be translated into something useful.
In networks we're already used to names: nobody types IP addresses to locate a server on the Internet, we just use an easy-to-remember name. The mapping of this name to a physical address may change over time, but it tends to change much more slowly than the software the name represents.
Basically, we can get away with a DNS-like registry to translate service names into information that allows software to call a method on a server.
A message broker could be useful, but if all we need is an extended naming service we might as well use a directory. Plumbing software, the code that our business code uses to call a service, performs a lookup of a service and stores the result. This avoids costly round trips to a broker. Compare this to the way a browser uses DNS: the name is translated to an IP address and the result is cached until an error occurs. When an error occurs the lookup is executed again and the new result is used until further notice.
This solution is simple and robust, avoids a single point of failure for most calls, and still allows services to move.
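The lookup-and-cache behaviour of the plumbing code can be sketched in a few lines of Python. This is only an illustration under assumptions of mine: the `ServiceLocator` class, the `lookup` callable that stands in for the directory, and the `send` callable that stands in for the actual remote call are all hypothetical names, not part of any product.

```python
from typing import Callable, Dict


class ServiceLocator:
    """Caches directory lookups and refreshes them only after a failed call."""

    def __init__(self, lookup: Callable[[str], str]) -> None:
        self._lookup = lookup            # hypothetical query against the directory
        self._cache: Dict[str, str] = {}
        self.lookups = 0                 # number of round trips to the directory

    def resolve(self, name: str) -> str:
        """Return the cached address, asking the directory only on a cache miss."""
        if name not in self._cache:
            self._cache[name] = self._lookup(name)
            self.lookups += 1
        return self._cache[name]

    def call(self, name: str, send: Callable[[str], str]) -> str:
        """Call the service at its cached address; on an error, look it up once more."""
        try:
            return send(self.resolve(name))
        except ConnectionError:
            self._cache.pop(name, None)  # the cached address is stale, forget it
            return send(self.resolve(name))
```

As long as the service stays where it is, the directory is asked exactly once; only a failed call triggers a new lookup, which mirrors the browser and DNS comparison above.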
Most software changes location over time, but changes occur at a slow rate. Messages shouldn't be forced to pass through expensive infrastructure each and every time.
As an example, let's consider the interaction between two applications:
App1 sends a message to a broker via a firewall
The broker sends a message to App2 via a firewall
App2 returns results to App1, using broker and firewall
By the way, App1 and App2 are deployed as two EARs in a single WebLogic instance. Such a simple interaction should not require the complex process outlined above.
Most processes require an answer right here and now, but unfortunately some systems are unavailable for several hours a day. In that case you can accept the data and promise to handle the request later. To do this safely you can either store the data in a database or put a message on a queue.
Queues are an important piece of middleware: they are well understood, widely available on many different platforms (most notably mainframes), and can be configured to never lose a message and to be always available.
Therefore I consider queues an essential part of any non-trivial infrastructure.
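To make the "accept now, handle later" idea concrete, here is a minimal sketch of the database variant, using Python's built-in sqlite3 module as a stand-in for a real database or queue manager. The table name `pending` and the functions `open_store`, `accept` and `drain` are my own illustrative choices, and the sketch assumes the backend signals unavailability with a `ConnectionError`.

```python
import sqlite3
from typing import Callable


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store; pass a file path instead of ':memory:' to survive restarts."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def accept(conn: sqlite3.Connection, payload: str) -> None:
    """Store the request; once this returns, the caller can be told it was accepted."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute("INSERT INTO pending (payload) VALUES (?)", (payload,))


def drain(conn: sqlite3.Connection, handler: Callable[[str], None]) -> int:
    """Hand stored requests to the backend in arrival order; stop when it is down."""
    done = 0
    rows = conn.execute("SELECT id, payload FROM pending ORDER BY id").fetchall()
    for row_id, payload in rows:
        try:
            handler(payload)
        except ConnectionError:
            break  # backend still unavailable, keep the remaining rows for later
        with conn:
            conn.execute("DELETE FROM pending WHERE id = ?", (row_id,))
        done += 1
    return done
```

Note that a row is deleted only after the handler has succeeded, so a crash in between means the request is handled again on the next run. A real queue product gives you the same trade-off with better tooling.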
The basic concepts of security are identity and confidentiality. Within a company infrastructure both are addressed by plain SSL connections. There's more to security, but we'll save that for later. My rule is that simple problems require simple solutions, so I'm claiming that SSL will do most of the time.
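How little it takes to get both identity and confidentiality on a connection can be shown with Python's standard ssl module (which implements what is nowadays called TLS). The function name `client_context` and its parameters are hypothetical; the optional client certificate is my assumption about how a caller would prove its own identity.

```python
import ssl
from typing import Optional


def client_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build connection settings for calling an internal service."""
    # The default context requires a valid server certificate and checks
    # that it matches the host name we connect to (identity of the server).
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file:
        # Optional: present our own certificate so the server knows the caller.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

A socket wrapped with this context (`ctx.wrap_socket(sock, server_hostname=...)`) is encrypted on the wire, which covers confidentiality between the two endpoints.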
Angle brackets are performance hogs. XML is inefficient in network bandwidth use and requires costly marshalling and unmarshalling. It may be easy to use for developers, but the cost of infrastructure in terms of servers and networks may easily outweigh the benefits of development efficiency.
I'll even go one step further: XML was never intended to be used at runtime. With concepts like XML Schema we have a powerful, machine-readable way to define messages. But does this imply we need all that dead weight between angle brackets all of the time? Again, following the simplicity rule, I would rather choose a message format that can be parsed and transported efficiently. As with security, there is more to say on this topic later.
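To give a feeling for the difference, the sketch below encodes the same message once as XML and once as a fixed binary record using Python's struct module. The payment message, its three fields and the record layout are invented for this illustration; a real format would come out of your message definitions.

```python
import struct
import xml.etree.ElementTree as ET
from typing import Tuple

# Hypothetical record layout in network byte order, no padding:
# 4-byte account id, 8-byte amount in cents, 3-byte currency code
RECORD = struct.Struct("!IQ3s")


def to_xml(account: int, cents: int, currency: str) -> bytes:
    """Encode the message as a small XML document."""
    root = ET.Element("payment")
    ET.SubElement(root, "account").text = str(account)
    ET.SubElement(root, "amountInCents").text = str(cents)
    ET.SubElement(root, "currency").text = currency
    return ET.tostring(root, encoding="utf-8")


def to_binary(account: int, cents: int, currency: str) -> bytes:
    """Encode the same message as a fixed 15-byte record."""
    return RECORD.pack(account, cents, currency.encode("ascii"))


def from_binary(data: bytes) -> Tuple[int, int, str]:
    """Decode a record produced by to_binary."""
    account, cents, currency = RECORD.unpack(data)
    return account, cents, currency.decode("ascii")
```

The binary record is always 15 bytes, while the XML version of the same three fields is several times larger and has to be parsed as text on the receiving side.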
Batch processing differs from OLTP. Batches are about hundreds or thousands of messages each hour or day. If you handle each message as a unique item (looking up the location of the service that should handle it, translating the contents to XML, sending it to a broker over a network, translating it to Esperanto, sending it to another server over a network, translating it to more XML, calling a service, and then sending the result back the other way around), batches will take down your infrastructure when volume starts to grow.
You may get away with this for OLTP, but it will fail miserably for large numbers of messages.
So treat batches with the respect they deserve and design a solution that balances runtime performance and design time efficiency.
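One possible way to do that is to group the messages per target service, look up each service once, and send the payloads in chunks. The sketch below assumes a `resolve` callable for the directory lookup and a `send_many` callable that can ship a list of payloads in one network call; both are hypothetical, as is the `chunk_size` default.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def process_batch(
    messages: Iterable[Tuple[str, bytes]],
    resolve: Callable[[str], str],
    send_many: Callable[[str, List[bytes]], None],
    chunk_size: int = 1000,
) -> int:
    """Group (service name, payload) pairs per service and send them in chunks."""
    by_service: Dict[str, List[bytes]] = defaultdict(list)
    for service, payload in messages:
        by_service[service].append(payload)

    sent = 0
    for service, payloads in by_service.items():
        address = resolve(service)  # one lookup per service, not per message
        for start in range(0, len(payloads), chunk_size):
            chunk = payloads[start:start + chunk_size]
            send_many(address, chunk)  # one network call per chunk
            sent += len(chunk)
    return sent
```

With this approach the number of lookups grows with the number of services and the number of network calls with the number of chunks, rather than both growing with the number of messages.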
The list so far
What we need in infrastructure so far is the list of concepts and tools below:
- a directory that allows software to find other software, even if the target moves
- a mechanism to postpone processing to a later point in time
- a solution to ensure confidentiality and identity
- an efficient message format
- a batch processing strategy that can cope with large numbers of messages
Life isn't always easy
You may end up composing a service out of smaller services to offer easy access or capture knowledge about how to use services.
Services may not speak the same language.
Sometimes you are forced to encrypt a portion of a message because it is sent as part of a larger message that shouldn't be readable while it passes through infrastructure.
The items above are valid concerns that become important at the fringes of your company or department infrastructure:
Composing services into a larger service stores knowledge and is therefore a useful concept. It also improves efficiency if the services are called over a slow network.
It is useful to standardize on a common dialect to be used throughout the enterprise. If you cannot start from scratch you will need something to translate between dialects. Also, standards are fine for exchanging data with business partners (maybe there's a standard XML format for the exchange of messages in your industry), but they may be too cumbersome to use inside the enterprise. So we need something to translate at the borders.
In business-to-business message exchange we can imagine a broker service that forwards messages based on characteristics in a header section. The body of the message may be highly confidential, so it should be encrypted. This is where SSL falls short and we need something more sophisticated.
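The split between a readable header and an unreadable body can be sketched as a simple envelope. Everything here is an assumption for illustration: the envelope layout (length prefix, JSON header, opaque body), the `recipient` header field, and the function names. The body is assumed to have been encrypted by the sender with a message-level scheme that is not shown; the broker never needs the key.

```python
import json
import struct
from typing import Dict, Tuple

_LEN = struct.Struct("!I")  # 4-byte length prefix for the header


def wrap(header: Dict[str, str], encrypted_body: bytes) -> bytes:
    """Build an envelope: header length, clear-text JSON header, opaque body."""
    raw = json.dumps(header).encode("utf-8")
    return _LEN.pack(len(raw)) + raw + encrypted_body


def split(envelope: bytes) -> Tuple[Dict[str, str], bytes]:
    """Separate the readable header from the body, which stays untouched."""
    (size,) = _LEN.unpack_from(envelope)
    start = _LEN.size
    header = json.loads(envelope[start:start + size].decode("utf-8"))
    return header, envelope[start + size:]


def route(envelope: bytes, routes: Dict[str, str]) -> str:
    """Pick a destination from the header only; the body is never inspected."""
    header, _body = split(envelope)
    return routes[header["recipient"]]
```

The point of the sketch is the division of labour: the broker reads only what it needs for forwarding, and only the final recipient can turn the body back into something meaningful.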
In this article I have argued that within an enterprise we rarely need brokers and can get by with simple measures like directories and SSL connections. If the enterprise grows or you start exchanging information with business partners, you may need more sophisticated infrastructure. So brokers, XML and the full plethora of WS-* standards fit nicely in your DMZ but are overkill elsewhere.