
Technical Overview of the Alfresco / Jive Toolkit

Recently I transitioned from my long-standing role leading Alfresco’s Professional Services team to being the in-house technologist for the Business Development team, and one of my first tasks in the new role has been to work on an integration between Alfresco and Jive Engage. This work is being done in partnership with SolutionSet (a partner of both Alfresco and Jive), and I wanted to discuss some of the technical design work that has gone into the Toolkit, ahead of its availability (which will be soon after the Jive 5.0 launch – the version of Jive that the Toolkit is targeting).

Functional Overview, aka “What will it do?”

As announced at Gartner’s Portals, Collaboration & Content Summit this week, the integration (known as the “Jive Toolkit”) is a set of pre-built components that allows Jive to store documents in Alfresco while still offering all of the same social features as “native” Jive documents (commenting, rating, discussions, etc.). While not yet all-encompassing – Jive’s “social” content cannot yet be stored or managed within Alfresco – the Toolkit will provide a foundational level of document-centric integration, allowing implementers to focus on more use-case specific integrations as required (hence the positioning as a “toolkit”, rather than a fully fledged solution).

More specifically, the initial version of the Toolkit will allow users of Alfresco and/or Jive to create “managed” documents in any of the following 3 ways:

  1. By uploading a document to Alfresco, using the Jive UI.
  2. By “publishing” an existing document from Alfresco to Jive, using Alfresco’s Share UI.
  3. By “linking” an existing document stored in Alfresco to Jive, using the Jive UI.

In all 3 cases, the result is the same: the document is visible and accessible via the Jive UI in exactly the same way as any “native” document, but the content of the document is stored and managed in Alfresco only. Jive will maintain some metadata about the document – for example the document’s filename and a pointer to the document in Alfresco – but it will not store the binary content of the document. This approach ensures that the document is a first-class citizen in both the Alfresco and Jive worlds, while minimising the risk of synchronisation issues between the two systems.

Here are some screenshots that demonstrate uploading a document to Alfresco using the Jive UI:

Step 1 – Navigating to a community in Jive

Step 2 – Managing a document

Step 3 – Select a file to upload

Step 4 – Select the target space in Alfresco

Document details (Alfresco)

Document details (Jive)

Technical Details, aka “Rubber, meet road”

As mentioned above, there are a variety of ways that the initial “linkage” of a document between Alfresco and Jive can be achieved; however, all 3 creation mechanisms produce the same end state: Alfresco has the document in its entirety (including the filename, content, etc.) while Jive has a “proxy object” (a structured data-only object that has the filename and a pointer to the document in Alfresco, but does not have the actual binary content).

This means that all downstream events (updates, metadata modifications, deletes) can be handled the same way, irrespective of how the content was linked between the two systems in the first place – a major simplification in the logic for those downstream events.
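To make the “proxy object” idea concrete, here is a minimal Java sketch. It is not the Toolkit’s actual code: the class, enum and method names are all hypothetical, and the only thing it is meant to show is that the three creation paths converge on one metadata-only object.

```java
import java.util.Objects;

/** Hypothetical stand-in for the Jive-side "proxy object": metadata only, no binary content. */
final class ManagedDocumentProxy {
    private final String fileName;
    private final String alfrescoObjectId; // pointer to the document held in Alfresco

    ManagedDocumentProxy(String fileName, String alfrescoObjectId) {
        this.fileName = Objects.requireNonNull(fileName, "fileName");
        this.alfrescoObjectId = Objects.requireNonNull(alfrescoObjectId, "alfrescoObjectId");
    }

    String getFileName() { return fileName; }
    String getAlfrescoObjectId() { return alfrescoObjectId; }
}

/** The three creation mechanisms described above (names are illustrative). */
enum CreationPath { UPLOAD_VIA_JIVE, PUBLISH_FROM_SHARE, LINK_VIA_JIVE }

final class ProxyObjectDemo {
    /** The path only affects how the document reached Alfresco, not what Jive ends up holding. */
    static ManagedDocumentProxy create(CreationPath path, String fileName, String objectId) {
        return new ManagedDocumentProxy(fileName, objectId);
    }

    public static void main(String[] args) {
        for (CreationPath path : CreationPath.values()) {
            // "example-object-id" is a made-up identifier, not a real repository id.
            ManagedDocumentProxy proxy = create(path, "report.pdf", "example-object-id");
            System.out.println(path + " -> " + proxy.getFileName() + " @ " + proxy.getAlfrescoObjectId());
        }
    }
}
```

Because every path yields the same shape of object, any code that later handles an update or a delete never needs to know which path created it.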

Integration Mechanism, aka “CMIS, by any other name would smell as sweet…”

Another nice characteristic of this approach is that the calls from Jive to Alfresco (to create content, update and retrieve it) can be accomplished using the CMIS API. This has several benefits, from reduced development effort in the Toolkit itself (due to the ready availability of client-side CMIS libraries), to the potential for portability to other CMIS compliant repositories in the future.

One important thing to note is that the Alfresco-to-Jive API calls are not standards-based: they make use of Jive’s proprietary REST API. There are two reasons for this. Firstly, Jive does not expose a standards-based API (indeed, no suitable standard exists for social business systems yet). Secondly, CMIS doesn’t provide any kind of callback mechanism for clients to be notified when repository events of interest occur (i.e. a mechanism equivalent to Alfresco’s Component Policies), so Alfresco has to push those notifications to Jive itself.

Tricky Bits, aka “The Devil is in the Details”

As with any integration between complex enterprise applications, there is some trickery in some parts of the integration, and it’s critical to understand these if you’re evaluating the Jive Toolkit.

Deletion

The first piece of trickery involves deletion of the content, specifically deletion in Alfresco. Because Jive maintains a pointer to the document in Alfresco (specifically, the document’s “cmis:objectId”), rather than the content itself, if the document is deleted in Alfresco without Jive being notified, attempts within Jive to retrieve that content will fail. To prevent this, the Toolkit is currently designed to veto deletes in Alfresco if the document has been socialised in Jive. To delete a document, it will first need to be deleted in Jive, at which point it can be deleted from Alfresco too. The reason the Toolkit doesn’t simply synchronise deletes between Jive and Alfresco is that there are common use cases where the document may be removed from Jive, but needs to be retained in Alfresco – replicating deletes between the two systems would have ruled out these use cases.
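The veto rule is easier to reason about as code. What follows is a purely illustrative in-memory model in Java, with hypothetical names throughout. The real Toolkit hooks into Alfresco itself, which this sketch does not attempt to reproduce.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory model of the delete veto; not the Toolkit's actual code. */
final class DeleteVetoSketch {
    private final Set<String> repository = new HashSet<>(); // document ids held in "Alfresco"
    private final Set<String> socialised = new HashSet<>(); // ids that "Jive" currently points at

    void store(String id) { repository.add(id); }

    void socialise(String id) {
        if (!repository.contains(id)) {
            throw new IllegalArgumentException("Unknown document: " + id);
        }
        socialised.add(id);
    }

    /** Removing the document from Jive deliberately leaves the Alfresco copy in place. */
    void removeFromJive(String id) { socialised.remove(id); }

    /** Deleting in Alfresco is vetoed while Jive still references the document. */
    void deleteFromAlfresco(String id) {
        if (socialised.contains(id)) {
            throw new IllegalStateException("Delete vetoed: " + id + " is still socialised in Jive");
        }
        repository.remove(id);
    }

    boolean existsInAlfresco(String id) { return repository.contains(id); }
}
```

Note that removeFromJive() never touches the repository set, which is what keeps the “removed from Jive but retained in Alfresco” use case possible.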

Full Text Indexing

The second item of trickery revolves around full text indexing of the document in Jive. To accomplish this, Jive will retain a copy of the content of the document just long enough to index it into Jive’s full text index, and once indexing is complete the content of the document will be removed from Jive. As you’d expect, Alfresco will also notify Jive of any updates to the document, so that the content can be re-indexed on the Jive side.
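As a rough illustration of “hold the content only long enough to index it”, here is a small Java sketch built around a toy inverted index. Jive’s real indexing pipeline is not described in this post, so every name below is an assumption made for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: content is held transiently, indexed, then discarded. */
final class TransientIndexSketch {
    private final Map<String, Set<String>> termToDocs = new HashMap<>();
    private final Map<String, String> transientContent = new HashMap<>();

    /** Called on initial linkage and again whenever an update notification arrives. */
    void index(String docId, String content) {
        transientContent.put(docId, content); // held only while indexing
        try {
            removeFromIndex(docId); // drop stale terms so re-indexing is safe
            for (String term : content.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!term.isEmpty()) {
                    termToDocs.computeIfAbsent(term, t -> new HashSet<>()).add(docId);
                }
            }
        } finally {
            transientContent.remove(docId); // the content itself is never retained
        }
    }

    private void removeFromIndex(String docId) {
        termToDocs.values().forEach(docs -> docs.remove(docId));
    }

    Set<String> search(String term) {
        return termToDocs.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    boolean holdsContent(String docId) { return transientContent.containsKey(docId); }
}
```

Calling index() a second time with new content models the re-index that follows an update notification from Alfresco.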

Access Control and Identity

Access control to the documents is also tricky, primarily because the Alfresco and Jive ACL models differ in their level of granularity. Jive’s access control is primarily Community-centric (i.e. defined and enforced at the level of the Community), while Alfresco has a fine-grained, per-node (file or folder) ACL mechanism. In this first release, the Toolkit will initially create the document in both systems in such a way that the ACLs are in sync, but modification of those ACLs in either system will not be replicated to the other system. The upshot is that direct manipulation of the document’s ACLs in Alfresco may cause errors in Jive (i.e. users who can see the document in the Jive UI, but are unable to download it).

Furthermore, in order for Alfresco and Jive to agree on the principal set, the initial version of the Toolkit assumes that both Alfresco and Jive are configured to use the same LDAP repository for user identity and authentication. During the design sessions it was felt that this was likely to be a requirement for an integrated solution anyway and hence wouldn’t be an impediment, but we’re keen to have that assumption validated as broadly as possible.
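The ACL drift described above can be modelled in a few lines of Java. This is a deliberately simplified sketch with hypothetical names: it treats Jive’s access control as one set of community members and Alfresco’s as a set of readers per node, and it assumes both systems share the same user names (the LDAP assumption).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of community-level versus per-node access control. */
final class AclDriftSketch {
    private final Set<String> communityMembers = new HashSet<>();          // "Jive": per community
    private final Map<String, Set<String>> nodeReaders = new HashMap<>();  // "Alfresco": per node

    /** On creation the two ACLs start out in sync. */
    void createManagedDocument(String nodeId, Set<String> members) {
        communityMembers.addAll(members);
        nodeReaders.put(nodeId, new HashSet<>(members));
    }

    /** A direct ACL change in Alfresco, which is not replicated to Jive. */
    void revokeInAlfresco(String nodeId, String user) {
        Set<String> readers = nodeReaders.get(nodeId);
        if (readers != null) {
            readers.remove(user);
        }
    }

    boolean canSeeInJive(String user) { return communityMembers.contains(user); }

    boolean canDownload(String nodeId, String user) {
        return canSeeInJive(user)
            && nodeReaders.getOrDefault(nodeId, Collections.emptySet()).contains(user);
    }

    /** Users who can see the document in Jive but would fail to download it. */
    Set<String> driftedUsers(String nodeId) {
        Set<String> drifted = new TreeSet<>(communityMembers);
        drifted.removeAll(nodeReaders.getOrDefault(nodeId, Collections.emptySet()));
        return drifted;
    }
}
```

A check along the lines of driftedUsers() is one way an implementer might detect the “visible but not downloadable” condition before users run into it.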

In Conclusion

So there you have it – a whirlwind tour of the upcoming Jive Toolkit! As a v1.0 there are some more sophisticated use cases that the Toolkit doesn’t address yet, including multi-document / library based integration, and capture of Jive’s social content (discussions, ratings, wiki pages, etc.) in Alfresco. The intention with the Toolkit is to initially provide Alfresco+Jive Systems Integrators (such as SolutionSet) with a small but solid base on which such extensions could be built, and if/when common requirements are identified for these more sophisticated use cases they can be rolled back into the Toolkit.

We’re keen to hear your feedback and look forward to your participation in the project!

Alfresco and Groovy, Baby!

For quite a few years now I’ve been a fan of scripting languages that run on the JVM, initially experimenting with the venerable BeanShell, then tinkering with JavaScript (via Rhino) and JRuby, and finally discovering Groovy in late 2007. A significant advantage that Groovy has over most of those other languages (with the possible exception of BeanShell) is that it is basically a superset of Java, so most valid Java code is also valid Groovy code and can therefore be executed by the Groovy “interpreter” [1] without requiring compilation, packaging or deployment – three things that significantly drag down one’s productivity with “real” Java.

To that end I decided to see if there was a way to implement Alfresco Web Scripts using Groovy, ideally in the hope of gaining access to the powerful Alfresco Java APIs with all of the productivity benefits of working in a scripting-like interpreted environment.

It turns out that the Spring Framework (a central part of Alfresco) moved in this direction some time ago, with support for what they refer to as dynamic-language-backed beans. Given that a Java backed Web Script is little more than a Spring bean plus a descriptor and some view templates, initially it seemed like Groovy backed Web Scripts might be possible in Alfresco already, merely by adding the Groovy runtime JAR to the Alfresco classpath and then configuring a Java-backed Web Script with a dynamic-language-backed Spring bean.

Oh behave!

Unfortunately this approach ran into one small snag: Alfresco requires that Java Web Script beans have a “parent” of “webscript”, as follows:

<bean id="webscript.my.web.script.get"
      class="com.acme.MyWebScript"
      parent="webscript">
    <constructor-arg index="0" ref="ServiceRegistry" />
</bean>

but Spring doesn’t allow dynamic-language-backed beans to have a “parent” clause.

It’s freedom baby, yeah!

There are several ways to work around this issue, but the simplest was to implement a “proxy” Web Script bean in Java that simply delegates to another Spring bean, which itself could be a dynamic-language-backed Spring bean implemented in any of the dynamic languages Spring supports.

This class ends up looking something like (imports and comments removed in the interest of brevity):

public class DelegatingWebScript
    extends DeclarativeWebScript
{
    private final DynamicDeclarativeWebScript dynamicWebScript;

    public DelegatingWebScript(final DynamicDeclarativeWebScript dynamicWebScript)
    {
        this.dynamicWebScript = dynamicWebScript;
    }

    @Override
    protected Map<String, Object> executeImpl(WebScriptRequest request, Status status, Cache cache)
    {
        return dynamicWebScript.execute(request, status, cache);
    }
}

While DynamicDeclarativeWebScript looks something like:

public interface DynamicDeclarativeWebScript
{
    Map<String, Object> execute(WebScriptRequest request, Status status, Cache cache);
}

This Java interface defines the API the Groovy code needs to implement in order for the DelegatingWebScript to be able to delegate to it correctly when the Web Script is invoked.

The net effect of all this is that a Web Script can now be implemented in Groovy (or any of the dynamic languages Spring supports for beans), by implementing the DynamicDeclarativeWebScript interface in a Groovy class, declaring a Spring bean with the script file containing that Groovy class and then configuring a new DelegatingWebScript instance with that dynamic bean. This may sound complicated, but as you can see in this example, it is pretty straightforward:

<lang:groovy id="groovy.myWebScript"
             refresh-check-delay="5000"
             script-source="classpath:alfresco/extension/groovy/MyWebScript.groovy">
    <lang:property name="serviceRegistry" ref="ServiceRegistry" />
</lang:groovy>

<bean id="webscript.groovy.myWebScript"
      class="org.alfresco.extension.webscripts.groovy.DelegatingWebScript"
      parent="webscript">
    <constructor-arg index="0" ref="groovy.myWebScript" />
</bean>
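For completeness, here is roughly what the class behind that script-source might look like. It is written in plain Java syntax, which (as noted earlier) is for the most part also valid Groovy. To keep the sketch compilable on its own, WebScriptRequest, Status and Cache are replaced with tiny placeholder types, and the “greeting” model entry and “name” parameter are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Placeholder stand-ins for the real Web Script framework types, declared here only
// so that this sketch is self-contained. In a real deployment these come from Alfresco.
interface WebScriptRequest { String getParameter(String name); }
final class Status { }
final class Cache { }

interface DynamicDeclarativeWebScript {
    Map<String, Object> execute(WebScriptRequest request, Status status, Cache cache);
}

/** Hypothetical body of MyWebScript.groovy, restricted to Java-compatible syntax. */
class MyWebScript implements DynamicDeclarativeWebScript {
    // Typed as Object here because the real ServiceRegistry type is not available in this sketch.
    private Object serviceRegistry;

    /** Setter matching the "serviceRegistry" property in the Spring configuration. */
    public void setServiceRegistry(Object serviceRegistry) {
        this.serviceRegistry = serviceRegistry;
    }

    @Override
    public Map<String, Object> execute(WebScriptRequest request, Status status, Cache cache) {
        // Build the model that the Web Script's view templates would render.
        Map<String, Object> model = new HashMap<>();
        String name = request.getParameter("name");
        model.put("greeting", "Hello, " + (name == null ? "world" : name));
        return model;
    }
}
```

The setter is what lets Spring inject the property declared on the dynamic bean, and the returned map plays the same role as the model returned by a Java-backed Web Script.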

While a little more work than I’d expected, this approach meets all of my goals of being able to write Groovy backed Web Scripts, and in the interests of sharing I’ve put the code up on Google Code (it was previously on the Alfresco forge).

I demand the sum… …OF 1 MILLION DOLLARS!

But wait – there’s more! Not content with simply providing a framework for developing custom Web Scripts in Groovy, I decided to test out this framework by implementing a “Groovy Shell” Web Script. The idea here is that rather than having to develop and register a new Groovy Web Script each and every time I want to tinker with some Groovy code, instead the Web Script would receive the Groovy code as a parameter and execute whatever is passed to it.

Before we go any further, I should mention one very important thing: this opens up a massive script-injection-attack hole in Alfresco, and as a result this Web Script should NOT be used in any environment where data loss (or worse!) is unacceptable!! It is trivial to upload a script that does extremely nasty things to the machine hosting Alfresco (including, but by no means limited to, formatting all drives attached to the system) so please be extremely cautious about where this Web Script gets deployed!

Getting back on track, I accomplished this using Groovy’s GroovyShell class to evaluate a form POSTed parameter to the Web Script as Groovy code (this is conceptually identical to JavaScript’s “eval” function, hence the warning about injection attacks). Effectively we have a Groovy-backed Web Script that interprets an input parameter as Groovy code, and then goes ahead and dynamically executes it! It’s turtles all the way down!

The code also transforms the output of the script into JSON format, since there are existing Java libraries for transforming arbitrary object graphs (as would be returned by an arbitrary Groovy script) into JSON format.
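To give a feel for what “transforming an arbitrary object graph into JSON” involves, here is a deliberately tiny hand-rolled serialiser in Java. It is a stand-in for illustration only, not the library the Web Script actually uses, and it handles only maps, iterables, numbers, booleans, strings and null.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical minimal serialiser: a stand-in for a real JSON library. */
final class JsonSketch {
    static String toJson(Object value) {
        if (value == null) return "null";
        // JSON has no representation for NaN or infinities, so emit null instead.
        if (value instanceof Double && !Double.isFinite((Double) value)) return "null";
        if (value instanceof Float && !Float.isFinite((Float) value)) return "null";
        if (value instanceof Number || value instanceof Boolean) return value.toString();
        if (value instanceof Map) {
            StringJoiner object = new StringJoiner(",", "{", "}");
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                object.add(quote(String.valueOf(entry.getKey())) + ":" + toJson(entry.getValue()));
            }
            return object.toString();
        }
        if (value instanceof Iterable) {
            StringJoiner array = new StringJoiner(",", "[", "]");
            for (Object element : (Iterable<?>) value) {
                array.add(toJson(element));
            }
            return array.toString();
        }
        // Anything else falls back to its string form.
        return quote(value.toString());
    }

    /** Wraps a string in double quotes, escaping characters JSON does not allow raw. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }
}
```

One caveat: a script result that contains a cycle would make this sketch recurse until the stack overflows, which is one of several reasons to prefer an established library in practice.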

Here’s a screenshot showing the end result:

Alfresco Groovy Shell - Vanilla Groovy Script

The more observant reader will have noticed the notes in the top right corner, particularly the note referring to a “serviceRegistry” object. Before evaluating the script, the Web Script injects the all important Alfresco ServiceRegistry object into the execution context of the script, in a Groovy variable called “serviceRegistry”. The reason for doing so is obvious – this allows the script to interrogate and manipulate the Alfresco repository:

Alfresco Groovy Shell - Groovy Script that Interrogates the Alfresco Repository

Sharks with lasers strapped to their heads!

Now if you look carefully at this script, you’ll notice that it (mostly) looks like Java, and this is where the value of this Groovy Shell Web Script starts to become apparent: because most valid Java code is also valid Groovy code, you can use this Web Script to prototype Java code that interacts with the Alfresco repository, without going through the usual Java rigmarole of compiling, packaging, deploying and restarting!

I recently conducted an in-depth custom code review for an Alfresco customer who had used Java extensively, and this Web Script was a godsend – not only did I eliminate the drudgery of compiling, packaging and deploying the customer’s custom code (not to mention restarting Alfresco each time), I also completely avoided the time-consuming (and, let’s be honest, painful) task of trying to reverse engineer their build toolchain so that I could build the code in my environment. This alone was worth the price of admission, but coupled with the rapid turnaround on changes (the mythical “edit / test / edit / test” cycle), I was able to diagnose their issues in a much shorter time than would otherwise have been possible.

Conclusion

As always I’m keen to hear of your experiences with this project should you choose to use it, and am keen to have others join me in maintaining and enhancing the code (which is surprisingly little, once all’s said and done).


[1] Technically Groovy does not have an interpreter; rather it compiles source scripts into JVM bytecode on demand. The net effect for the developer, however, is the same: there is no need to build, package or deploy code prior to execution, which is a serious productivity boost.