Archive for the ‘Architecture’ Category

The trouble with software architecture is that it keeps getting reinvented: new acronyms appear, followed by a slew of large unreadable books explaining why this new architecture is going to change everything. This is a widespread phenomenon in the software industry, where many emerging approaches, solutions, tools, languages, frameworks, patterns and protocols compete and adoption rules supreme, resulting in a form of natural selection. It is perhaps inherent in the nature of software that such flexibility results in so many solutions to the same problem. A good guide through this maze is a pragmatically tuned intuition that tells you when something is too complex to be effective. Keeping things simple means more people will adopt, use, discuss and improve it. A good example is RESTful services, which are gaining adoption due to their simple, clear approach to exposing services through HTTP.

What is AOA?

With all of the above taken into consideration, I want to introduce yet another architectural meme, namely Assembly Oriented Architecture (AOA). This is more of an approach with some guidelines, and it doesn’t require any standards or reference documentation to understand and apply. It has evolved from real practical experience and is actively used on all projects that Optaros works on, so it is well proven in the field.

At Optaros we focus on assembling open source solutions, which are often very strong on supporting open standards that lend themselves to assembly. However, proprietary solutions can also be assessed in terms of their ability to be part of an assembled solution.

Guiding principles for selecting AOA solutions

  • Lightweight, standards-based interfaces covering key functionality and data access. For web-based solutions these interfaces should be web oriented, such as RESTful services, and should support returning different formats such as XML, JSON and HTML.
  • Supports open standards such as OpenID, OAuth, RDF, CMIS etc.
  • Can be easily disassembled – i.e. the built-in search or authentication mechanism can be easily switched to use another
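
To make the "disassembly" principle concrete, here is a minimal Python sketch. It is not taken from any real product: `SearchBackend`, `BuiltInSearch`, `ExternalSearch` and `Application` are hypothetical names invented for illustration. The point is only that the application depends on a small contract, so the built-in search can be switched for another one.

```python
from __future__ import annotations

from typing import Protocol


class SearchBackend(Protocol):
    """Hypothetical contract that any search component must satisfy."""

    def search(self, query: str) -> list[str]: ...


class BuiltInSearch:
    """Stand-in for the search that ships with an application."""

    def __init__(self, documents: list[str]) -> None:
        self._documents = documents

    def search(self, query: str) -> list[str]:
        # Naive case-insensitive substring match.
        return [d for d in self._documents if query.lower() in d.lower()]


class ExternalSearch:
    """Stand-in for a dedicated external search engine (hypothetical)."""

    def __init__(self, index: dict[str, list[str]]) -> None:
        self._index = index  # term -> matching document titles

    def search(self, query: str) -> list[str]:
        return self._index.get(query.lower(), [])


class Application:
    """Depends only on the contract, never on a concrete backend."""

    def __init__(self, search_backend: SearchBackend) -> None:
        self.search_backend = search_backend

    def find(self, query: str) -> list[str]:
        return self.search_backend.search(query)


docs = ["Assembly guide", "SOA primer"]
app = Application(BuiltInSearch(docs))
builtin_hits = app.find("assembly")

# Swapping the component is a configuration change, not a rewrite.
app.search_backend = ExternalSearch({"assembly": ["Assembly guide"]})
external_hits = app.find("assembly")
```

An application that hard-wires its own search or authentication offers no such seam, which is exactly what the third principle is meant to screen out.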

Why use AOA?

AOA is about fast delivery of robust, flexible architectures. It is inherently pragmatic, accepting that most real-world solutions consist largely of disparate applications combined together rather than nicely packaged services. More explicitly, the benefits of using it are as follows:

  • Results in clean standards based architecture without getting locked in to a particular solution
  • Less coding – gaps are plugged by identifying suitable applications or components that meet the need and are assembly friendly
  • Faster to deploy than custom build
  • Best components for the job and ease of changing them out when something better comes along
  • Lower cost of ownership compared to either custom build or customising an off the shelf application due to the above benefits

How does it differ from SOA?

Enterprise architects at this point might be thinking that surely this is what SOA is intended to provide: a clean architecture that allows the different systems to be replaced as needed without breaking any of the interfaces. I should firstly say that AOA is not an alternative to SOA. They are completely compatible architectural approaches, and I would go further and suggest that both should be adopted to ensure a clean, flexible and robust architecture. SOA differs from AOA in a number of areas, namely:

  • SOA is concerned with defining clean services independent of any specific application, whereas AOA is about selecting applications that are Assembly friendly
  • AOA looks for applications that can themselves be disassembled and easily configured to use external components for some areas of functionality (such as workflow, rules, search etc.), whereas SOA would define services for key capabilities and invoke the relevant application interface
  • SOA is more about providing a layer of abstraction on top of applications, whereas AOA is about effectively combining applications to deliver a solution
  • Although not directly tied to SOA, there is the whole area of Web Services and associated specifications – AOA doesn’t go to the level of detailed specifications but relies on guiding principles

In my experience SOA can be taken too far, with a lot of time spent agreeing every possible service to cover the combined functionality of all of the main applications. It can then turn into a time and money pit with no clear business value. SOA seems to work best when common services that will be called by many systems are developed, rather than trying to boil the functional ocean. The other area where a lot of time can be lost is in the dark depths of the many WS-* standards that exist. Again, that pragmatic intuition should steer you clear of distractions from the task at hand when developing useful services.

AOA patterns

A number of patterns are starting to emerge for different types of assembly architecture – the following is a list of the common ones.

  • Plug-in Platform – Assemble a solution around a central component covering the core functionality and acting as the integration platform for assembling the missing parts, thanks to its extensible architecture.
  • Container Assembly – Assemble a solution around a central container that provides no business functionality but focuses on cross-cutting concerns (security, logging, access to resources, …). This framework should be a standard (or de facto standard) for the other components you want to assemble.
  • Service Oriented Assembly – Assemble a solution using an SOA approach. Each component to be assembled should provide a public interface that is used for integration.
  • Mash-up Assembly – Assemble a solution using the web browser as both rendering layer and integration platform, assembling different applications through JavaScript, DOM manipulation, REST APIs and iframes. Each component to be assembled should provide a RESTful API.
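
As a rough illustration of the first pattern in the list above (Plug-in Platform), the sketch below shows a central component that covers the core functionality itself and exposes one extension point for everything else. The `Platform` class and its capability names are hypothetical, made up for this example rather than drawn from any particular product.

```python
from __future__ import annotations

from typing import Callable


class Platform:
    """Hypothetical central component: core features plus an extension point."""

    def __init__(self) -> None:
        self._plugins: dict[str, Callable[[str], str]] = {}

    def core_publish(self, text: str) -> str:
        # Core functionality provided by the platform itself.
        return f"published: {text}"

    def register(self, capability: str, handler: Callable[[str], str]) -> None:
        # Missing parts are assembled in by registering a handler.
        self._plugins[capability] = handler

    def capabilities(self) -> list[str]:
        return sorted(["publish", *self._plugins])

    def handle(self, capability: str, payload: str) -> str:
        if capability == "publish":
            return self.core_publish(payload)
        try:
            handler = self._plugins[capability]
        except KeyError:
            raise LookupError(f"no plug-in provides {capability!r}") from None
        return handler(payload)


platform = Platform()
platform.register("search", lambda q: f"results for {q}")
platform.register("workflow", lambda doc: f"{doc} sent for review")
```

The other patterns differ mainly in where this registry lives: in a neutral container, behind service interfaces, or in the browser.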

In the end pragmatism wins. Technologies continue to change, and no matter what is done to allow for that in an architecture, effort is ultimately needed to accommodate those changes. Given that reality check, it should be clear that spending months developing intricate service definitions for everything is probably not good for anyone. AOA offers good guidelines and helps deliver solutions faster, while allowing applications and components to be changed in the future as required.


Web Content Management (WCM) seems to mean different things to different people, which of course can lead to confusion. The term has been around since the mid-1990s, but two key things have changed since it was first adopted: the web itself and the type of content available over the web.

The web has become a much richer visual experience in recent years, with digital content such as video, Flash and images becoming far more prevalent on all sites. It has also become much more interactive, with users generating their own content: comments, reviews, blogs and wikis as well as images, presentations, music, profile pages, videos and applications. The web has evolved from a fairly static publishing tool into a dynamic social media platform.

The technical infrastructure underpinning web sites has also evolved significantly since WCM was born. We have moved far from the early days of HTML pages and CGI scripts that added dynamic content, often from a single database. Today’s platforms provide presentation templating and layout as well as content creation and editing tools, with content (both text and digital media) aggregated from multiple sources. Expectations have changed as well, with content creation and management now readily available to non-technical users.

Given these changes it is no surprise that WCM has evolved to adapt to the ever-changing landscape. Broadly, you can divide the approaches being taken in the WCM space into those that are coupled and those that are decoupled.

Coupled WCM (content repository + presentation combined)

A coupled WCM solution combines the presentation and navigation of the site with managing the content that is available to be included in pages. These types of solutions typically rely on a database to manage and store content and presentation details, with files for templating/layout and styling.

Examples of coupled WCMs include Drupal, Liferay, Joomla and Plone.

Strengths

  • Rich and easy to use editorial process allowing content to be easily combined and seen as it will be displayed on the site
  • Easy to associate and combine user generated content to published content
  • Often many additional modules available supporting authentication, rich media, ecommerce etc which all work off the content model
  • Requires fewer technical skills to manage and maintain the site
  • Usually has a strong multi-site model allowing content and templates to be reused across different sites
  • Built in authentication to control access rights of users to content

Weaknesses

  • Not so strong for managing file based assets, including versioning, grouping, transformation and workflow
  • Poor API support to expose content externally
  • Design of site needs to be aligned to templating model of solution
  • Challenges of distributing development due to configuration being stored in the database
  • Poor support for managing deployment and versions of a site

Decoupled WCM (separate repository(ies) and presentation layer)

The decoupled approach focusses on managing content independently of any presentation of the content. The content is managed in a repository which provides versioning, metadata and workflow, while the presentation is managed in a front-end platform that allows pages and navigation to be easily managed and often provides user management. Some decoupled repository-based solutions also offer features such as giving users their own sandboxed version of a site, so they can preview just their own changes before those updates are deployed to the main site. However, this type of approach does assume that changes are being made to files rather than to config/content in a database through a social media front-end such as Drupal.
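
The separation can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `InMemoryRepository` stands in for a real content repository (which would be reached over an API), and `render_page` stands in for a front-end platform. The repository knows about versions but nothing about HTML; the front-end knows about HTML but nothing about storage.

```python
from __future__ import annotations

from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class ContentItem:
    item_id: str
    title: str
    body: str
    version: int


class InMemoryRepository:
    """Hypothetical stand-in for a content repository with versioning."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._items: dict[str, ContentItem] = {}

    def save(self, item_id: str, title: str, body: str) -> ContentItem:
        previous = self._items.get(item_id)
        version = previous.version + 1 if previous else 1
        item = ContentItem(item_id, title, body, version)
        self._items[item_id] = item
        return item

    def get(self, item_id: str) -> ContentItem:
        return self._items[item_id]


def render_page(title: str, sources: list[tuple[InMemoryRepository, str]]) -> str:
    """Front-end concern: build one composite page from several repositories."""
    sections = []
    for repository, item_id in sources:
        item = repository.get(item_id)
        sections.append(
            f"<section><h2>{escape(item.title)}</h2><p>{escape(item.body)}</p></section>"
        )
    return f"<h1>{escape(title)}</h1>" + "".join(sections)


documents = InMemoryRepository("documents")
media = InMemoryRepository("digital-assets")
documents.save("press-1", "Launch", "First draft")
latest = documents.save("press-1", "Launch", "Final text")
media.save("img-7", "Logo", "logo.png")

page = render_page("News", [(documents, "press-1"), (media, "img-7")])
```

Either side can be replaced without touching the other, provided the small `get` contract between them is preserved.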

Examples of content repositories: Alfresco, Nuxeo

Examples of front-end presentation layers: web frameworks such as Django, Symfony and Ruby on Rails; coupled WCMs like Drupal, where only User Generated Content (UGC) is stored in the front-end and all other content is retrieved from one or more repositories; and portals such as JBoss and Liferay

Strengths

  • Clean separation between content and presentation allowing different tools to be used that best suit the solution or enable use of new tools/technologies as they emerge
  • Strong API access to content within the repository
  • Ability to have several repositories that focus on certain content types such as documents or digital assets and leverage the specialised  functionality of these tools
  • It is possible to use a coupled WCM for the front-end and gain the benefits that provides, while reducing its limitations by accessing content from a backend repository

Weaknesses

  • Challenge of providing an easy to use front end for managing composite pages which combine content from multiple repositories
  • Requires integration between front-end presentation platform and backend repositories
  • Content creation might require different UIs if this is provided by each backend repository
  • Need to define a clear separation of responsibilities between front-end and backend, such as where taxonomies are mastered, how search is managed across both UGC and content in backend repositories, and where access control to content is managed

In cases where there is a lot of disparate content, potentially from many sources, the decoupled approach makes the most sense: content is combined from multiple sources and can be presented using one or more front-end platforms. Developments such as CMIS will help facilitate accessing content from various sources from the front-end platform. The greater challenge is providing easy-to-use editorial screens for managing composite pages that combine content from several sources. Utilising a rich social media platform such as Drupal for the front end will help ease this process, but there is still work to be done to make this even slicker. There is already a CMIS connector for Drupal, currently tested against the Alfresco implementation of CMIS. For good coverage of some of the future trends being discussed in content management see What is the Future of Content Management?

If anything is sure it is that WCM will need to continue to evolve regardless of whether the acronym itself remains or is replaced by a broader content consolidation and publishing meme.  Understanding the current state and trade-offs will help ensure an informed decision is made as to the right approach for any particular enterprise strategy to exposing content over the web.


Cloud computing is most often associated with scalability (see Amazon CTO Werner Vogels’ definition of scalability). One commonly held view is that you can simply move an application onto cloud-based infrastructure and it will then “magically” scale on demand. The reality is that there is no free lunch. Simply throwing additional CPU cycles or storage at an application is not going to deliver linear scalability unless the application was designed to scale in such a manner.

The cloud era heralds the development of new enterprise application platforms available on demand, as well as new social platforms. However, this isn’t as simple as taking the current crop of relational database centric solutions and deploying them on Amazon EC2. Of course this isn’t stopping vendors from taking that approach and offering on-demand versions of their products. The challenge is that these applications are not designed to scale dynamically and in a distributed manner. The result is that as traffic and usage grow there will be a continual cycle of monitoring and patching to try to keep the application performing at an acceptable level. While monitoring and improvement will always be necessary, there are lessons to be learnt from some of the largest concurrent, multi-user sites that can help reduce the pain.

How much consideration to give cloud-based scaling clearly depends on the nature of the application and the anticipated volume of usage. If the application is, for example, very read heavy and low on write transactions, then replicating databases with good caching could well be sufficient. However, solutions that require massively concurrent, write-heavy access to the database need to be architected specifically for scalability.
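
For the read-heavy case, the combination of replicas and caching might look something like the following. This is a deliberately simplified sketch: `ReplicatedStore` is a hypothetical class, the "databases" are plain dictionaries, and replication is synchronous, which a real deployment would not assume.

```python
from __future__ import annotations

import itertools


class ReplicatedStore:
    """Hypothetical model: one primary for writes, replicas plus a cache for reads."""

    def __init__(self, replica_count: int = 2) -> None:
        self.primary: dict[str, str] = {}
        self.replicas: list[dict[str, str]] = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))
        self.cache: dict[str, str] = {}
        self.replica_reads = 0  # how often a read actually reached a database

    def write(self, key: str, value: str) -> None:
        self.primary[key] = value
        for replica in self.replicas:
            replica[key] = value  # simplification: synchronous replication
        self.cache.pop(key, None)  # invalidate so readers never see the old value

    def read(self, key: str) -> str:
        if key in self.cache:
            return self.cache[key]
        # Cache miss: spread the load across replicas in round-robin order.
        replica = self.replicas[next(self._next_replica)]
        self.replica_reads += 1
        value = replica[key]
        self.cache[key] = value
        return value


store = ReplicatedStore()
store.write("post:1", "draft")
first_reads = [store.read("post:1") for _ in range(3)]
reads_before_update = store.replica_reads

store.write("post:1", "final")
after_update = store.read("post:1")
```

Note that every write still goes through the single primary, which is why this approach stops helping once the workload becomes write heavy.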

Distributed database versus relational database

Relational databases are primarily designed for managing updates and transactions on a single instance. This is a problem when you need massively concurrent access with millions of users initiating write transactions. The approach usually taken to address this is clustering or sharding, but this is really an attempt to patch up the problem rather than addressing it head on. That said, there are many large-scale examples that use a relational database and apply these approaches.
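
To show what sharding means in practice, here is a minimal sketch of hash-based shard routing. The names `shard_for` and `ShardedStore` are hypothetical, and each "database instance" is just a dictionary; the routing logic is the part that matters.

```python
from __future__ import annotations

import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard using a stable hash (not Python's per-process hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Hypothetical wrapper: each shard stands in for a separate database instance."""

    def __init__(self, shard_count: int) -> None:
        self.shards: list[dict[str, str]] = [{} for _ in range(shard_count)]

    def put(self, key: str, value: str) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> str:
        return self.shards[shard_for(key, len(self.shards))][key]


sharded = ShardedStore(shard_count=4)
sharded.put("user:42", "alice")
holders = [index for index, shard in enumerate(sharded.shards) if "user:42" in shard]
```

The "patching up" shows in what this sketch leaves out: with modulo routing, changing the number of shards remaps most keys, and queries or transactions that span shards have to be handled by the application.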

Given a clean sheet and current developments, what approaches can be used to address massively concurrent, write-heavy applications? A number of different distributed database solutions have emerged in the last few years, based on some form of key-value distributed hash table (DHT), a column-oriented store or a document-centric model. They are often built to address precisely the issue of scaling for write-heavy applications. However, they should not be considered a direct replacement for a relational database. They often lack support for complex joins and foreign keys as well as reporting and aggregation, although some of these areas are beginning to be addressed. Also, there is not currently an SQL or object mapping such as Active Record to cleanly and transparently access them from code, so extra development effort is required. They should nevertheless be considered as part of an overall architecture and leveraged to reduce write-heavy bottlenecks in the solution.

Amazon SimpleDB – simple key value DHT, based on the Dynamo solution created by Amazon

Apache CouchDB – document centric approach built using Erlang

Cassandra – DHT variant that supports a rich data model,  originally born at Facebook, now an Apache incubator project

HBase – column-oriented store similar to Google’s BigTable, built on Hadoop and its distributed file system (HDFS), with support for map/reduce jobs.
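
The distributed hash table idea behind the key-value stores can be sketched with consistent hashing, a common way of placing keys on nodes so that adding a node only moves a fraction of the data. This is a generic illustration, not the implementation of any product listed above; `HashRing`, the node names and the `vnodes` parameter are all assumptions made for the example.

```python
from __future__ import annotations

import bisect
import hashlib


def _position(value: str) -> int:
    """Stable position on the ring derived from a SHA-256 digest."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes: list[str], vnodes: int = 64) -> None:
        self._vnodes = vnodes
        self._ring: list[tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node owns several points on the ring to even out the load.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_position(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring has no nodes")
        # First ring entry at or after the key's position, wrapping around.
        index = bisect.bisect_right(self._ring, (_position(key), "")) % len(self._ring)
        return self._ring[index][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(200)]
before = {key: ring.node_for(key) for key in keys}

ring.add_node("node-d")
after = {key: ring.node_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

Compared with simple modulo routing, every key in `moved` has gone to the new node and the rest stay where they were, which is what lets such stores grow by adding machines.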

Here is a great blog post from the cofounder of Last.fm on the multitude of alternatives to a traditional RDBMS for heavy, distributed, write-based applications: http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/

Another blog worth reading on distributed key stores is http://randomfoo.net/2009/04/20/some-notes-on-distributed-key-stores.

Stateless immutable services

One of the guiding principles for linear scalability is to have lightweight, independent, stateless operations that can be executed anywhere and run transparently on newly deployed threads, processes, cores or machines as needed to service an increasing number of requests. These types of services should share nothing with any other services; they simply process asynchronous messages. This type of async message passing has been proven to scale in languages such as Erlang. One paradigm that is closely aligned to this approach is known as the Actor model, which is all about passing immutable messages and the share-nothing philosophy. A lightweight, stateless approach such as REST is well suited to allowing these services to be accessed across the internet through HTTP.
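
The share-nothing idea can be sketched in Python using threads and queues. This is only in the spirit of the Actor model, not a full actor implementation (the workers here share one inbox rather than each owning a mailbox), and `Order`, `price_order` and `run` are hypothetical names. The handler is a pure function over an immutable message, so the number of workers can change without affecting the result.

```python
from __future__ import annotations

import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Immutable message: workers cannot modify it after it is sent."""

    order_id: int
    quantity: int
    unit_price_cents: int


def price_order(message: Order) -> tuple[int, int]:
    """Stateless handler: the result depends only on the message."""
    return (message.order_id, message.quantity * message.unit_price_cents)


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    while True:
        message = inbox.get()
        if message is None:  # sentinel telling this worker to stop
            break
        outbox.put(price_order(message))


def run(messages: list[Order], worker_count: int) -> dict[int, int]:
    inbox: queue.Queue = queue.Queue()
    outbox: queue.Queue = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for message in messages:
        inbox.put(message)
    for _ in threads:
        inbox.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    results = []
    while not outbox.empty():  # safe: all workers have finished
        results.append(outbox.get())
    return dict(results)


orders = [Order(1, 2, 500), Order(2, 1, 250), Order(3, 4, 100)]
totals_one_worker = run(orders, worker_count=1)
totals_four_workers = run(orders, worker_count=4)
```

Because no worker holds state between messages, the same handler could equally run in separate processes or on separate machines behind a message queue.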

Speaking the language of scalability

As always, choice of programming language can end up being more an emotional than a necessary decision. But it is true that it can help to pick the right tool for the job at hand. Some languages have better support for developing highly concurrent, distributed and scalable applications. The characteristics to look for are languages that encourage immutable data structures and referentially transparent methods, typically being functional in nature and supporting asynchronous message passing. Two popular languages that are receiving a lot of attention are Scala and Erlang. Scala runs on the JVM and was famously used to provide scalability for Twitter by implementing a message queuing solution. Erlang has its roots in embedded systems and so was optimised to run on minimal resources. It utilises processes which are much lighter and faster than even OS threads, and it supports both multiple cores and multiple machines transparently. Both Scala and Erlang have good support for the Actor model, again encouraging scalable, independent, async message-driven design.

In the end there is still more learning and maturing to be done in developing the next generation of cloud-based solutions, and not all of them will need to scale to high volumes. It will be an interesting time, and there is much that can be learnt from others who are already dipping their feet in this pool. A good site for keeping track of what others are doing in the whole space of scalability is http://highscalability.com/. Being aware of these changes is especially important when embarking on new projects where scale and the use of cloud infrastructure are factors.
