Archive for SOA

The Myth of the Silo

Warning: This article requires a lot of editing love before it is very useful. It might be somewhat incoherent. Read at your own risk. ;-)

Silo (software): A silo system cannot easily integrate with any other system.

In software, the term “silo” is used to refer to a system that is constructed as one unit from the front end to the data storage. Everything is tied together to work with the rest of the silo, but not with other elements.

This is considered a bad thing.

The problem comes when we want to integrate the system with other systems or reuse parts of the system.

Many of the new ideas in software development have as one of their big goals to rectify the silo problem. In general this is achieved by splitting up the system into services that may or may not be distributed across different computers.

But the badness of the silo hinges on two claims: first, that applications built as an integrated stack cannot be integrated with other systems, and second, that a system of distributed services can.

Four claims about SOA (Fire påstander om SOA)

This article is a Norwegian-language version of my article Four bold claims about SOA.

This is a draft of an article I would like to get published. I greatly appreciate feedback on unclear thoughts and wording.

Two of the hardest problems we face in software development are integration and what is often called “business-IT alignment”, that is: the whole organization works toward the same goals.

Service Oriented Architecture (SOA) claims to solve both of these problems. Lately I have tried to listen more actively to the claims made by SOA evangelists, and I am beginning to understand what SOA proposes as the solution to the problems we face. This is my understanding of the claims about how SOA will solve the problems of integration and business alignment:

  • Web service standards will solve the technical integration problems (the “WS-* claim”)
  • Centralizing integration through a single hub will solve the organizational challenges around integration (the “ESB claim”)
  • Modeling functionality as a workflow between services will give us better business alignment (the “BPM claim, part 1”)
  • Being able to restructure the workflow between the services will give us agile business logic (the “BPM claim, part 2”)

Link: Open Source in the Enterprise

CIO JP Rangaswami at investment bank Dresdner Kleinwort Wasserstein talks about why he considers open source a corporate IT asset. In this talk, Rangaswami describes how DrKW wanted to create an internal incubator environment in order to combat skill attrition in the late 90s. In the course of doing this, they acquired OpenAdaptor and discovered, almost accidentally, the benefits of the open source development model.

A Brief Adventure with Universal Repositories and REST Web Services

Inspired by Per Mellqvist (and myself, to be fair), I wanted to explore the possibility of using a generic DAO or Repository interface for REST. Based on this simple idea, I was able to create a very cute and testable prototype of a full stack for REST-based Web Services. The most interesting aspect was creating a universal test case for Repositories.

This article shows how little code is required to implement and test a REST-based Web Service in Java, despite the horror of the Java HTTP client API. The source code can be downloaded from my Subversion repository. I also want to illustrate how to create black box tests that can be reused efficiently with different implementations of a Repository.
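
The idea of a universal test case can be sketched roughly as follows. This is a minimal illustration in Python rather than the article's Java, and everything in it is an assumption: the `Person` entity, the method names on `Repository`, and the `check_repository_contract` helper are hypothetical and not taken from the downloadable source. The point is only that the checks touch nothing but the interface, so the same function could be run against an in-memory, database-backed, or REST-backed implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol


@dataclass(frozen=True)
class Person:
    # Hypothetical entity, used only for illustration.
    id: int
    name: str


class Repository(Protocol):
    """Generic repository interface (assumed shape, not the article's API)."""

    def save(self, entity: Person) -> None: ...
    def find(self, entity_id: int) -> Optional[Person]: ...
    def find_all(self) -> List[Person]: ...
    def delete(self, entity_id: int) -> None: ...


class InMemoryRepository:
    """Simplest possible implementation, backed by a dict."""

    def __init__(self) -> None:
        self._items: Dict[int, Person] = {}

    def save(self, entity: Person) -> None:
        self._items[entity.id] = entity

    def find(self, entity_id: int) -> Optional[Person]:
        return self._items.get(entity_id)

    def find_all(self) -> List[Person]:
        return list(self._items.values())

    def delete(self, entity_id: int) -> None:
        self._items.pop(entity_id, None)


def check_repository_contract(repo: Repository) -> None:
    """Black-box checks that any Repository implementation should pass."""
    alice, bob = Person(1, "Alice"), Person(2, "Bob")
    repo.save(alice)
    repo.save(bob)
    assert repo.find(1) == alice
    assert sorted(p.name for p in repo.find_all()) == ["Alice", "Bob"]
    repo.delete(1)
    assert repo.find(1) is None
    assert repo.find_all() == [bob]


# The same contract check can be reused for every implementation.
check_repository_contract(InMemoryRepository())
```

A second implementation would be verified by passing an empty instance of it to the same `check_repository_contract` function, without writing any new test logic.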

CRUD, REST, DDD, Rails - these are a few of my favorite things

Some time back, I watched a video of David Heinemeier Hansson giving a talk on ActiveResource at RailsConf. What struck me is how closely Rails’ ideas are connected to those of Domain-Driven Design. Watching DHH is like seeing a version of Eric Evans on speed.

On Integration: Consolidated View

In my last post, I wrote about four integration scenarios using databases: Reference data, Consolidated view, Subscription and Publishing. Of these, the Consolidated View scenario requires the most interaction between the server and the client roles. This post will examine how to make the pieces fit together.

The Consolidated View scenario joins the data of multiple clients into a single view. This lets you build administrative applications that span a set of subapplications without having to change the central view when a new application joins the mix.
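
One possible reading of this scenario can be sketched as below. It is only a guess at the shape of the solution, not the design the post goes on to describe: the `work_item` table, its columns, and the `publish` and `open_items` functions are all hypothetical, and Python's built-in sqlite3 module stands in for a real shared database. Each client application writes rows tagged with its own name, and the administrative query never mentions any particular client.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical shared table: every client application writes its own rows,
# tagged with its name, and the administrative application only reads.
conn.execute(
    """
    CREATE TABLE work_item (
        source_app TEXT NOT NULL,
        item_id    INTEGER NOT NULL,
        title      TEXT NOT NULL,
        status     TEXT NOT NULL,
        PRIMARY KEY (source_app, item_id)
    )
    """
)


def publish(app, item_id, title, status):
    """Called by a client application to add or update one of its rows."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO work_item VALUES (?, ?, ?, ?)",
            (app, item_id, title, status),
        )


def open_items():
    """The administrative query; it names no client application."""
    return conn.execute(
        "SELECT source_app, title FROM work_item "
        "WHERE status = 'open' ORDER BY source_app, item_id"
    ).fetchall()


publish("billing", 1, "Invoice 17 unpaid", "open")
publish("shipping", 1, "Parcel 4 delayed", "open")
# A new application joins without any change to open_items().
publish("support", 7, "Ticket awaiting reply", "open")
# An existing application updates one of its own rows.
publish("billing", 1, "Invoice 17 unpaid", "closed")

for source_app, title in open_items():
    print(source_app, title)
```

The composite primary key keeps each application's identifiers in its own namespace, so two clients can both use item number 1 without colliding.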

On Integration: Organizing the data

In my last post on using the database for integration, I argued that the best metaphor for creating systems that are interconnected is that of One Large Database. Carl-Henrik asked some very relevant questions about this, which I will interpret (for now) mainly as “how do you avoid drowning in complexity”. This post will address the issue of maintainability, especially when things grow to be large.

The One Large Database metaphor is mostly useful as a starting point. The first thing I notice when I look at our systems that have been organized this way is that the data structure is far from flat. Each application will deal with four categories of tables. First, an application has a large private domain, containing tables that the rest of the world has no business messing with. Second, it exports some tables that other applications use. Third, it imports tables exported by other applications. And finally, there are some tables that are shared by a number of applications with no clear owner. The last case is something we normally want to avoid.
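
The first three categories can be made visible in the schema itself. The sketch below is an assumption-laden illustration, not the author's actual layout: the `owner_category_name` prefix convention, the `crm` and `orders` applications, and every table name are invented for the example. It uses Python's sqlite3 module, which has no schemas or grants; in a full database server those features would typically carry the same distinction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Exported by a hypothetical 'crm' application; imported by 'orders'.
    CREATE TABLE crm_export_customer (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );

    -- Private to the hypothetical 'orders' application.
    CREATE TABLE orders_private_order (
        id           INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES crm_export_customer(id),
        amount_cents INTEGER NOT NULL
    );

    -- Exported by 'orders' as a view, so the private table can change shape
    -- without breaking the applications that import the export.
    CREATE VIEW orders_export_customer_total AS
        SELECT c.id AS customer_id, c.name, SUM(o.amount_cents) AS total_cents
        FROM crm_export_customer c
        JOIN orders_private_order o ON o.customer_id = c.id
        GROUP BY c.id, c.name;

    INSERT INTO crm_export_customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders_private_order VALUES (1, 1, 1500), (2, 1, 2500), (3, 2, 900);
    """
)


def tables_by_category(connection):
    """Group table and view names by the (owner, category) name prefix."""
    result = {}
    rows = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    )
    for (name,) in rows:
        owner, category, _ = name.split("_", 2)
        result.setdefault((owner, category), []).append(name)
    return result


# What another application would read: only the exported view.
totals = conn.execute(
    "SELECT name, total_cents FROM orders_export_customer_total "
    "ORDER BY customer_id"
).fetchall()
```

A table that fits none of the prefixes would be the fourth, ownerless category, which is exactly the case the text suggests avoiding.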

In my experience, the integration scenarios can further be divided into four cases: Reference data, Consolidated data, Published data, and Subscription data.

On Integration: Why I enjoy working with databases

Status: This article is currently pretty dry. I’d like feedback on how to make it more eloquent.

In my previous blog post, I promised to write more about using databases as the main integration strategy. In the current post, I plan to cover maybe the most important question: “Why?”

Imagine an application where every time it wants to communicate with another system, it reads or writes to the database. For now, let’s ignore how this would work, and how it would evolve, which will be the subject of later posts. What advantages does this offer?

The alternative is usually to integrate with another system through a variety of means. In Java, the most common ones are Web Services, RMI, EJBs (which offer their own quirks in addition to those of RMI), sockets, and various tricks using the file system.

The most important issue to me is invariably productivity. When I work with databases, I generally can use Object-Relational Mapping tools. This is a very productive way of accessing database data in an application. RMI offers similar advantages, but you will have to build lazy loading on top of the domain model if you want to have a rich model where the objects are interconnected. Web Services generally have some bindings to Java, but in my experience, these are really inadequate. Either the Java side suffers, for example by forcing you to have getters and setters, by forcing you to use arrays instead of collections, or by forcing you to use strings as the main data type, or the XML side suffers by having non-specific types (if you use collections). Sockets, of course, are very unproductive. They give up productivity for simplicity.

The data that is managed by the remote service will generally come from a database anyway. This means that the data access code has to be developed somewhere regardless, and a remoting layer has to be developed on top of it.

To maintain sustainable productivity, we need unit tests. Unit testing has for me proved to be hard to do well for both Web Services and RMI, and EJBs are of course out of the question. As my regular readers know, using a test database for standalone unit testing is quite simple. As an added bonus, tests that use the database will essentially have verified the integration. When I use a remoting protocol, I always run into strange problems very late in the test process.
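
A standalone database test of the kind mentioned above might look roughly like this. The sketch assumes an in-memory SQLite database created fresh for every test, which is only one way to set up a test database and not necessarily the approach used in the earlier posts; the `CustomerDao` class, the `customer` table, and the `run_tests` helper are hypothetical names.

```python
import sqlite3
import unittest

# Hypothetical schema shared by the application and its tests.
SCHEMA = "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"


class CustomerDao:
    """Hypothetical data access object; the names are illustrative only."""

    def __init__(self, conn):
        self._conn = conn

    def add(self, name):
        with self._conn:  # commit on success, roll back on error
            cur = self._conn.execute(
                "INSERT INTO customer (name) VALUES (?)", (name,)
            )
        return cur.lastrowid

    def find_name(self, customer_id):
        row = self._conn.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


class CustomerDaoTest(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database per test keeps the tests independent.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(SCHEMA)
        self.dao = CustomerDao(self.conn)

    def tearDown(self):
        self.conn.close()

    def test_round_trip(self):
        customer_id = self.dao.add("Acme")
        self.assertEqual(self.dao.find_name(customer_id), "Acme")

    def test_missing_customer_is_none(self):
        self.assertIsNone(self.dao.find_name(42))

    def test_duplicate_name_is_rejected(self):
        self.dao.add("Acme")
        with self.assertRaises(sqlite3.IntegrityError):
            self.dao.add("Acme")


def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CustomerDaoTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the tests exercise the real SQL and the real constraints, a passing run already says something about the integration point itself, which is the bonus the paragraph above describes.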

Both unit testing and productivity benefit from the fact that dealing with databases is something we’ve done for a long time. The tools and techniques for doing so are very mature, compared to other methods of integration.

Secondly, there is the problem of reliability. If you use a single database, everything you do is within one transaction. Either all work will be committed, or it will be rolled back. This vastly simplifies your logic if you care about your correctness. For distributed systems, this will in theory be solved by the 2-phase commit protocol. However, my experience is that this adds so much complexity to a solution that the system can metaphorically collapse under its own weight. As a result, most solutions I’ve seen (and, I suspect, most solutions I haven’t) simply ignore this problem. This means that the odd resource error that occurs might very well have very unpredictable results.
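
The all-or-nothing behavior of a single database transaction can be shown in a few lines. This is a minimal sketch using Python's sqlite3 module; the `account` and `audit_log` tables and the `transfer` and `snapshot` functions are hypothetical and exist only for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    CREATE TABLE audit_log (message TEXT NOT NULL);
    INSERT INTO account VALUES ('billing', 100), ('payroll', 0);
    """
)


def transfer(amount, source, target):
    """Move money and log it; all three statements succeed or none do."""
    try:
        with conn:  # one transaction: commit on success, rollback on error
            conn.execute(
                "INSERT INTO audit_log VALUES (?)",
                (f"{source}->{target}: {amount}",),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def snapshot():
    """Return the current balances and the number of audit log rows."""
    balances = dict(conn.execute("SELECT name, balance FROM account"))
    log_rows = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
    return balances, log_rows


first = transfer(60, "billing", "payroll")   # fits within the balance
second = transfer(60, "billing", "payroll")  # would overdraw 'billing'
```

The second call fails on its last statement, after the log row and the credit have already been written inside the transaction, yet neither survives the rollback. Getting the same guarantee across two remote services is where two-phase commit, or hand-written recovery code, would come in.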

A remote layer will also introduce another place where things can go wrong. Many developers end up coding recovery routines for dealing with these kinds of errors. In my experience, this is some of the most error-prone code you can write.

Third, performance-wise it is hard to beat the database. Most other methods will eventually hit the database anyway, and as a general rule, adding more steps to a solution seldom makes it faster. There are some issues with scalability, however, that I will address in a later post.

Last, and maybe most importantly, I have never seen a standard interface for dealing with remote services. Solutions generally end up having half-a-dozen or more different policies for accessing different back end systems. There is one thing we will always be sure of, though: There’ll always be a database among these backend systems, no matter what else you have to talk to. Every extra communication mechanism you remove will reduce the shoestring-and-paperclip-factor of your system.

By using a single data source as the place for communicating with other systems, we will reduce complexity and improve testability, performance and reliability.

I hope that in this post, I have demonstrated why, in an ideal world, you would want to use a single database as your primary integration mechanism. However, the world is rarely ideal. Database schemas change, more load is added than a single database can handle, you have to understand a forest of database schemas, and some applications should not be allowed to access all the data. In my next blog post, I will talk about how to solve these problems with databases without giving up the single database vision. Stay tuned for evolution, scalability, security, reuse, and understandability.

On Integration: The vision of a single database

Before Web Services, there was CORBA. Before CORBA, there was DCOM. Before DCOM, there was RPC. Before RPC, there were BSD sockets. Before sockets, there were databases. And as it was in the beginning, so shall it too be in the end.

The only systematically successful strategy in the history of computing is databases. I have discovered more and more lately that integration using a database is well-defined (DDLs - a WSDL that works!), flexible (views and triggers can hide many old sins), well-supported (today, powerful Object-Relational Mapping tools should be de rigueur for any sensible project), and performant (sooner or later, you’re gonna hit the database anyway). Using modern database features it can be made secure and scalable as well. In the end, databases are the best thing since, well, since databases.
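
How views and triggers can hide old sins is easiest to see with a small example. The following is a sketch under stated assumptions: the legacy table `CUST_TBL`, its cryptic columns, the `customer` view, and the `customer_insert` trigger are all invented, and SQLite (through Python's sqlite3 module) stands in for whatever database is really in use. Other databases offer similar features with their own syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical legacy table with cryptic names and a status code.
    CREATE TABLE CUST_TBL (
        C_ID   INTEGER PRIMARY KEY,
        C_NM   TEXT NOT NULL,
        C_STAT TEXT NOT NULL
    );

    -- The published interface: readable names and a boolean-like flag.
    CREATE VIEW customer AS
        SELECT C_ID AS id, C_NM AS name, (C_STAT = 'A') AS active
        FROM CUST_TBL;

    -- A trigger lets other applications write through the view as well.
    CREATE TRIGGER customer_insert INSTEAD OF INSERT ON customer
    BEGIN
        INSERT INTO CUST_TBL (C_ID, C_NM, C_STAT)
        VALUES (NEW.id, NEW.name, CASE WHEN NEW.active THEN 'A' ELSE 'I' END);
    END;
    """
)

# Client applications only ever see the clean 'customer' interface.
with conn:
    conn.execute(
        "INSERT INTO customer (id, name, active) VALUES (?, ?, ?)",
        (1, "Acme", True),
    )
    conn.execute(
        "INSERT INTO customer (id, name, active) VALUES (?, ?, ?)",
        (2, "Globex", False),
    )

rows = conn.execute("SELECT id, name, active FROM customer ORDER BY id").fetchall()
legacy = conn.execute("SELECT C_STAT FROM CUST_TBL ORDER BY C_ID").fetchall()
```

The DDL for the view plays the role of the interface definition: it states exactly which columns exist, and the database enforces it for every client.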

I want to write a series of blog posts detailing strategies I use and explore to make database integration work. For now, let me just share my vision with you: One huge enterprise database that appears flat to any application that uses it. All applications in the enterprise using the single database instance. This vision has many practical issues in terms of performance, security, maintainability and understandability. I will spend the blog posts exploring these issues.

First, though: Where is integration using databases applicable, and how does it relate to the main buzzword of the day, Service Oriented Architecture (SOA)?

Database-based integration is only applicable for applications that are not distributed across multiple organizations or distributed widely within the same organization. This is what we can call “application-to-application” (A2A), as opposed to “business-to-business” (B2B). For B2B, technologies associated with SOA are still going to be your best bet. Also, I would not use database integration from desktop clients (“2-tier architecture”). I am not sure whether this is just because everyone has been so excited about the 3-tier architecture for so long. Maybe you could make it work. However, I don’t much care for desktop clients, so I will leave this subject to someone else.

However, when the “services” are just internal services between different parts of my application portfolio, I think integration on the database layer is greatly underused.

Why I Love SOA: Design Business-Related Services

What happens when a customer asks for a simple new bit of functionality? Do you have to make changes to four different systems, test each in isolation and in combination, and involve separate testing, infrastructure and operations teams?

If so, your architecture is probably not service oriented. In this post, I will examine the real meaning of coupling, and how it relates to SOA.

Creative Commons Attribution 3.0 Unported