/usr/portage

Drupal as a Content Repository

As one of my first projects at InterNations we want to introduce rich content management functionality for internal usage. We have a custom made PHP application and want to publish a bunch of content to provide our customers with an even richer experience and greater service. Our requirements can be read along the lines of:

And these are our restrictions:

  • Content management is not our core domain
  • Developing a social community is
  • A tendency towards a PHP based solution

Build, buy, customize, integrate

As always, build, buy, customize, or integrate are the options. Buying does not seem necessary, as the Open Source landscape of content management systems is so rich. That leaves build, customize, or integrate. I don’t think I can add anything valuable to the dozens of ways content management has been tackled, and building would have been too expensive and too cumbersome to seriously consider. Plain customizing was not an option either, as adjusting the look and feel was not our main problem. Our main problem was integration with our existing system. We want to show content elements like teasers next to existing functionality in our application, and we want to deeply integrate, for example, forum posts or places with our content. While customization can be part of that integration task, it’s not enough. We thought about a CMS providing customized markup snippets for later integration, but it wouldn’t have helped. It is also bad from a TCO point of view: two systems, both need customization, upgrades are harder, and the learning curve is steeper for everybody. Not the best thing.

So, why not leave the CMS as it is, use its media and content management facilities and its rich administrative functionality, but treat it as a storage for semi-structured content? Similar to what JackRabbit is and Jackalope aims to be, but with the benefit of not writing the administrative interface again. Basically: give me some kind of API to ask the content management system for a specific content item, be it media, content or hierarchy (which is media, nodes and terms in Drupal’ish).

Having some experience with Drupal, we started diving deeper into what would be needed to use Drupal as a content repository and interface with it via web services.

Building Blocks

After a little research it was clear that we were going down the Drupal 7 route. Experience teaches that a year-old Drupal might have its stability and compatibility issues, but I figured having CCK built in was worth the trouble. And it was.

We use an unmodified Drupal 7 with a few modules to extend its functionality. The most important ones are services, to expose web services, and services_views, to expose views via web services. Exposing views over web services is incredibly powerful, as it allows us to fully control the content rules. Additionally we use media for media management, field_group to make the admin interface look nicer, and rules for cache invalidation: when a node changes, we redirect the browser to our combined preview and cache invalidation action, which resides in our application. The action then invalidates our cache so that we can serve content purely from caches.
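The preview and cache invalidation flow can be sketched roughly like this. It is a minimal illustration in Python, not the actual application code: `ContentCache`, `fetch_node` and `preview_and_invalidate` are made-up names, and the dictionary stands in for whatever cache backend is really used.

```python
from typing import Callable, Dict


class ContentCache:
    """In-memory stand-in for the application's content cache (hypothetical)."""

    def __init__(self, fetch_node: Callable[[int], dict]) -> None:
        self._fetch_node = fetch_node  # remote call to the CMS, injected
        self._entries: Dict[int, dict] = {}

    def get(self, nid: int) -> dict:
        # Serve from the cache; only ask the CMS on a miss.
        if nid not in self._entries:
            self._entries[nid] = self._fetch_node(nid)
        return self._entries[nid]

    def invalidate(self, nid: int) -> None:
        # Drop the entry so the next read fetches fresh content.
        self._entries.pop(nid, None)


def preview_and_invalidate(cache: ContentCache, nid: int) -> dict:
    """Action the CMS redirects the browser to after a node was saved."""
    cache.invalidate(nid)
    return cache.get(nid)  # re-warms the cache and yields the preview data
```

The point of combining both steps is that the editor sees the fresh content immediately, while regular visitors never trigger a remote call for content that is already cached.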

The Rules

Our Drupal lives in its own Git repository, fully separated from the main application. We use a specific revision of Drupal’s Git repository as our base version and include all required modules as Git submodules. This makes updates easy.
We never ever customize Drupal (that’s a lie: we have three minor patches, which we keep totally separate to make it as hard as possible to mess around). This means we can configure anything, but we don’t write any extension code for Drupal and we don’t touch existing code. The “do not customize” rule is obviously the most important one.

Additionally we write integration tests against Drupal’s data structure for every remote call we use so we’ll find out what exactly breaks when updating.
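Such a structure check could look roughly like the following. This is only a sketch in Python: `node_payload_problems` is a hypothetical helper, and the expected shape (a language map containing lists of items with a `value` key) is taken from the example payload in this post, not from any official schema.

```python
from typing import Any, List


def node_payload_problems(payload: Any, field_names: List[str]) -> List[str]:
    """Return a list of deviations from the payload shape we rely on."""
    if not isinstance(payload, dict):
        return ["payload is not an object"]
    problems = []
    for key in ("nid", "title"):
        if key not in payload:
            problems.append(f"missing key: {key}")
    for name in field_names:
        field = payload.get(name)
        if not isinstance(field, dict):
            problems.append(f"{name}: expected language map")
            continue
        for language, items in field.items():
            if not isinstance(items, list):
                problems.append(f"{name}[{language}]: expected list of items")
            elif any(not isinstance(i, dict) or "value" not in i for i in items):
                problems.append(f"{name}[{language}]: item without 'value'")
    return problems
```

An integration test would run this against a real response of each remote call and fail with the returned list, so that an update points at the exact field that changed.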

Integration, Schmintegration

On the side of our application we have three components taking care of the content. The first one is a ContentRepository, which provides domain-specific queries (the domain being the one of our application) to our content store and hides all the nitty-gritty service internals. The ContentRepository uses a Transformer component to transform the Drupal structure returned by the services (a bit more on that later) into a ViewModel. The ViewModel helps us to render everything into the page. These components are somewhat heavy-weight (around 1100 NCLOC for ViewModel, ~450 NCLOC for Repository and ~100 NCLOC for Transformer), but only the Repository and the Transformer, about 550 NCLOC together, are overhead caused by the integration. And that’s much less than a custom content management system would take.
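To illustrate how the three components relate, here is a heavily reduced sketch in Python. The class names follow the text, but everything else is an assumption made for the example: the `teaser_for_node` method, the `retrieve_node` callable wrapping the remote call, and the `/content/<nid>` URL convention.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TeaserViewModel:
    """What the templates see; no CMS details leak in here."""
    title: str
    url: str


class Transformer:
    def to_teaser(self, node: dict) -> TeaserViewModel:
        # Assumed convention: node pages live under /content/<nid>.
        return TeaserViewModel(title=node["title"], url=f"/content/{node['nid']}")


class ContentRepository:
    def __init__(self, retrieve_node: Callable[[int], dict],
                 transformer: Transformer) -> None:
        self._retrieve_node = retrieve_node  # wraps the remote service call
        self._transformer = transformer

    def teaser_for_node(self, nid: int) -> TeaserViewModel:
        # Callers get a view model and never touch the raw service payload.
        return self._transformer.to_teaser(self._retrieve_node(nid))
```

Because the remote call is injected, the repository can be exercised with a canned payload, for example `ContentRepository(lambda nid: {"nid": nid, "title": "Some title"}, Transformer())`.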

The most widely used calls are node.retrieve to retrieve a node with all its fields, views.retrieve to retrieve views, taxonomy_term.retrieve to retrieve a single term, taxonomy_vocabulary.getTree to get hierarchical information, and taxonomy_term.create and taxonomy_term.update to sync taxonomies. We store mapping identifiers with the taxonomy terms so that our system has a way to say “give me content for XYZ, which is called something else on your side but I don’t care”. We further use Views with contextual parameters to provide the queries to use with our application-specific identifiers.
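The taxonomy sync boils down to deciding, per mapping identifier, whether a term has to be created or updated. A rough Python sketch of that decision follows; `plan_term_sync` and the `mapping_id` key are hypothetical names, and the dictionaries are illustrative rather than the exact arguments the service calls expect.

```python
from typing import Dict, Iterable, List, Tuple


def plan_term_sync(
    local_items: Dict[str, str],
    remote_terms: Iterable[dict],
) -> List[Tuple[str, dict]]:
    """Decide which terms to create or update.

    local_items maps our own identifier to the current name.
    remote_terms are dicts with 'tid', 'name' and 'mapping_id'
    ('mapping_id' is a made-up key for the stored identifier).
    """
    by_mapping_id = {term["mapping_id"]: term for term in remote_terms}
    plan = []
    for mapping_id, name in sorted(local_items.items()):
        term = by_mapping_id.get(mapping_id)
        if term is None:
            # Unknown on the CMS side: candidate for taxonomy_term.create.
            plan.append(("create", {"name": name, "mapping_id": mapping_id}))
        elif term["name"] != name:
            # Known but renamed: candidate for taxonomy_term.update.
            plan.append(("update", {"tid": term["tid"], "name": name}))
    return plan
```

Terms that already match are left alone, so a repeated sync run produces an empty plan.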

The Ugly

The payload Drupal’s services module returns is not transformed in any way and is exactly the internal one. Custom fields of a node exported via the API, for example, look like this:

{
  nid: 123,
  title: 'Some title',
  field_my_custom_field: {
    'en': [
       {
         'value': 'This is the value'
       }
    ]
  }
}

Boo. Not the nicest structure to work with, especially since there are a number of variations.
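Flattening such a field into plain values might look like the sketch below, written in Python for illustration. The helper `field_values` is hypothetical. The fallback to `und` relies on Drupal 7 using that key for language-neutral values; treating a missing or non-object field as empty is an assumption about one of the variations.

```python
from typing import Any, List

LANGUAGE_NONE = "und"  # Drupal 7's key for language-neutral field values


def field_values(node: dict, field_name: str, language: str = "en") -> List[Any]:
    """Return the plain values of a field, or [] if it is empty or missing."""
    field = node.get(field_name)
    if not isinstance(field, dict):
        # Assumption: empty fields may arrive as an empty list or be absent.
        return []
    items = field.get(language) or field.get(LANGUAGE_NONE) or []
    # Items without a 'value' key (other field types) are skipped here.
    return [item["value"] for item in items
            if isinstance(item, dict) and "value" in item]
```

With the payload above, `field_values(node, "field_my_custom_field")` yields a list with the single string value, which is much closer to what a template wants to see.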

Is it fast? No. But it doesn’t really need to be, as everything is served from caches anyway.

If you go down the road of using Drupal as a Content Repository, you’ll need to read some Drupal source, especially regarding service resources. It’s not hard, but it’s necessary because documentation is sparse.

All in all, it has been a successful project so far. It is easy to manage, Drupal updates are simple, and it’s incredible what is possible with just Drupal and a few modules.

Filed on 24-11-2011, 17:05
