New Portlet Development Options for Alfresco!

One of the new features added to Alfresco Enterprise 3.2r is the ability to turn Web Scripts into standalone portlets that run on Liferay or other portals and include support for Single Sign-On (SSO) and generating portal-friendly URLs.

At the moment, the way to get this working requires deploying share.war into the portal environment even though the Alfresco Share app itself is not going to work as a portlet due to browser-side JavaScript/AJAX issues. A colleague of mine is reportedly working on creating a smaller WAR but the mechanism to make all this work is surprisingly simple and is what I’d like to illustrate now.

I assume you’ve installed the Alfresco and Liferay Tomcat bundles into separate directories. I’ll refer to these as <ALFRESCO_HOME> and <LIFERAY_HOME>.

Step 1. Configure Liferay’s Tomcat server to avoid port conflicts by editing <LIFERAY_HOME>/<tomcat>/conf/server.xml:

Change all the port numbers so they don’t conflict with Alfresco’s Tomcat server. Here are the changes I made:

...
<Server port="8105" shutdown="SHUTDOWN">
...

    <Connector port="9090" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="9443" URIEncoding="UTF-8" />
...
  <Connector port="8109" protocol="AJP/1.3" redirectPort="8443" URIEncoding="UTF-8" />
...
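
Before starting Liferay, it can help to confirm that nothing is already listening on the ports chosen above. Here is a minimal Python sketch, assuming Alfresco is already running on the same machine. The function names are my own (hypothetical); only the three port numbers come from the server.xml changes above.

```python
import socket

# Ports taken from the server.xml changes above (shutdown, HTTP, AJP).
LIFERAY_PORTS = {"shutdown": 8105, "http": 9090, "ajp": 8109}


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 when the TCP connection succeeds.
        return sock.connect_ex((host, port)) == 0


def find_conflicts(ports, host="127.0.0.1"):
    """Return the subset of {name: port} that is already taken."""
    return {name: port for name, port in ports.items() if port_in_use(port, host)}
```

Calling find_conflicts(LIFERAY_PORTS) while Alfresco is up and Liferay is stopped should return an empty dictionary; any entry it does return is a port you still need to change.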

Step 2. Edit <LIFERAY_HOME>/<tomcat>/conf/catalina.properties, locate the “shared.loader” entry, and replace it with this:

shared.loader=${catalina.base}/shared/classes,${catalina.base}/shared/lib/*.jar

Step 3. Edit <ALFRESCO_HOME>/tomcat/shared/classes/alfresco-global.properties and add the following lines to the end of the file:

authentication.chain=alfrescoNtlm1:alfrescoNtlm,external1:external
external.authentication.proxyUserName=

NOTE: Yes, the value for proxyUserName needs to be blank.

Step 4. Copy <ALFRESCO_HOME>/tomcat/webapps/share.war to <LIFERAY_HOME>/deploy

Step 5. Copy the entire <ALFRESCO_HOME>/tomcat/shared directory to <LIFERAY_HOME>/<tomcat>/

NOTE: Technically, we only need the shared/classes/alfresco/web-extension directory, but copying the whole shared directory is a convenient alternative.

Step 6. Rename <LIFERAY_HOME>/<tomcat>/shared/classes/alfresco/web-extension/webscript-framework-config-custom.xml.sample to webscript-framework-config-custom.xml, then edit the file and uncomment the second config block to enable the remote authenticator. The comment markers to remove are on lines 44 and 73. The result should look like this:

...
   <!--       sessions" feature of your load balancer must be used -->

   <config evaluator="string-compare" condition="Remote">
        <remote>
            <!-- SSL client certificate + trusted CAs. Optionally used to authenticate share to an external SSO system such as CAS -->

            <keystore>
           ...
            </keystore>

            <connector>
           ...
            </connector>

            <endpoint>
                <id>alfresco</id>
                <name>Alfresco - user access</name>
                <description>Access to Alfresco Repository WebScripts that require user authentication</description>
                <connector-id>alfrescoCookie</connector-id>
                <endpoint-url>http://localhost:8080/alfresco/wcs</endpoint-url>
                <identity>user</identity>
                <external-auth>true</external-auth>
            </endpoint>          

        </remote>
    </config>

NOTE: If you’re running Alfresco’s Tomcat server on a different host or port, please change the corresponding endpoint.
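
If you want to double-check that the block really is uncommented and points at the right server, a small script can list the endpoints the file defines. This is a minimal sketch using only Python’s standard library; the function name is my own (hypothetical), and it assumes the file is well-formed XML using the element names shown above. Anything still inside an XML comment is ignored by the parser, so a missing endpoint means the block is still commented out.

```python
import xml.etree.ElementTree as ET


def list_endpoints(xml_text):
    """Return id, URL and external-auth flag for every active <endpoint>."""
    root = ET.fromstring(xml_text)
    endpoints = []
    for ep in root.iter("endpoint"):
        flag = (ep.findtext("external-auth") or "").strip().lower()
        endpoints.append({
            "id": (ep.findtext("id") or "").strip(),
            "url": (ep.findtext("endpoint-url") or "").strip(),
            "external_auth": flag == "true",
        })
    return endpoints
```

Read webscript-framework-config-custom.xml into a string and pass it to list_endpoints; you should see the “alfresco” endpoint with external_auth set to True and the URL of your Alfresco server.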

Step 7. Start Alfresco, then start Liferay.

Step 8. Once Alfresco and Liferay are started, add and try out the basic CMIS Repo Browser portlet. It should appear under the “Alfresco” category.

This integration makes use of an external authenticator that will automatically create an Alfresco user account whenever a new Liferay user accesses an Alfresco portlet.

To create your own custom portlet, I recommend you look at the CMIS Repo code in the share.war file under WEB-INF/classes/alfresco/webscripts/org/alfresco/test/cmisrepo.* and cmisfolder.*.

In short, you’ll need to create a Web Script, which can be placed inside the share.war file under WEB-INF/classes/alfresco/webscripts or in <LIFERAY_HOME>/<tomcat>/shared/classes/alfresco/web-extension/webscripts (create the directory if it doesn’t already exist).

You’ll also need to edit the portlet.xml file in the share.war/WEB-INF directory to add a new entry for your portlet using the Alfresco “ProxyPortlet” as the portlet class.

	<portlet>

		<description>CMIS Folder Browser</description>
		<portlet-name>CMISFolder</portlet-name>
		<portlet-class>org.alfresco.web.portlet.ProxyPortlet</portlet-class>
		<init-param>
			<name>scriptUrl</name>
			<value>/share/service/sample/cmis/repo</value>
		</init-param>

		<supports>
			<mime-type>text/html</mime-type>
			<portlet-mode>VIEW</portlet-mode>
		</supports>

		<portlet-info>
			<title>CMIS Folder Browser</title>
			<short-title>CMIS Folder</short-title>
		</portlet-info>

	</portlet>

Finally, add your portlet to the liferay-display.xml and liferay-portlet.xml files.
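
It’s easy to forget one of the three descriptors, so here is a rough Python sketch that reports portlets declared in portlet.xml but missing from the two Liferay files. It assumes the usual layout of those descriptors as I understand it: <portlet-name> children under <portlet> in portlet.xml and liferay-portlet.xml, and <portlet id="..."> entries in liferay-display.xml. The function names are my own and purely illustrative.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip any XML namespace prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def portlet_names(xml_text):
    """Collect <portlet-name> values found directly under <portlet> elements."""
    names = set()
    for el in ET.fromstring(xml_text).iter():
        if _local(el.tag) == "portlet":
            for child in el:
                if _local(child.tag) == "portlet-name" and child.text:
                    names.add(child.text.strip())
    return names


def display_ids(xml_text):
    """Collect the id attribute of every <portlet> element."""
    return {
        el.get("id")
        for el in ET.fromstring(xml_text).iter()
        if _local(el.tag) == "portlet" and el.get("id")
    }


def missing_registrations(portlet_xml, liferay_portlet_xml, liferay_display_xml):
    """Map each Liferay descriptor to the declared portlets it does not mention."""
    declared = portlet_names(portlet_xml)
    return {
        "liferay-portlet.xml": sorted(declared - portlet_names(liferay_portlet_xml)),
        "liferay-display.xml": sorted(declared - display_ids(liferay_display_xml)),
    }
```

Pass in the text of the three files from share.war/WEB-INF; two empty lists mean every portlet in portlet.xml is registered in both Liferay descriptors.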

To make sure that your URLs are correctly generated, please use the “scripturl()” function in your Freemarker templates to wrap them:

<a href="${scripturl(url.serviceContext + "/sample/cmis/repo", false)}">CMIS Repository</a>

This new mechanism is the first significant step toward better tying Alfresco together with portal platforms, and you can expect new developments in future releases.

True Elastic Cloud Computing with Alfresco and RightScale

Alfresco today announced the availability of RightScale templates that facilitate the deployment of scalable and elastic Alfresco clusters. I emphasize “elastic” because I believe elasticity is one of the key principles that distinguish a vendor’s “true” cloud computing strategy from simply making single-server, non-clustered product images available.

Understand that there’s nothing wrong with having single-server images. Alfresco has had EC2 images available since October and we will continue to maintain and evolve them. But if a vendor claims to have a “cloud strategy” and that having public EC2 images encompasses the entirety of that strategy, then they either a) don’t get it or b) are misleading their customers.

Here at Alfresco, we’ve been quietly yet actively working with cloud platforms such as EC2, GoGrid and RackSpace and feel that the timing is right to put forth our strategy not just for the cloud, but also for emphasizing Alfresco’s utility as a Content Platform that can be leveraged to develop Content Rich applications that can then be deployed to any kind of runtime environment, be it a traditional Data Center, Virtualized Infrastructure or Public & Private Clouds.

This strategy relies on a few trends that Alfresco has been actively endorsing as well as some specific product features that we hope will make Alfresco a very accessible and powerful framework that is attractive to developers.

The most significant of these trends and features are:

  • CMIS: The new “lingua franca” of content management. Empowers developers with a standardized mechanism for leveraging content repositories.
  • Spring Surf: Alfresco recently donated the Surf Framework to SpringSource as a Spring Extension. Spring Surf offers Spring developers a more convenient way to develop SpringMVC web applications and can optionally connect these apps into a content repository using CMIS.
  • Scalability, Manageability & Configurability: We’ve made great strides in making Alfresco one of the most scalable ECM platforms in the industry, but have gone beyond that by adding JMX support and revamping the configuration mechanism, thereby dramatically improving the way that IT operations can manage their Alfresco servers. These capabilities are absolutely pivotal to Alfresco’s cloud strategy.

Virtually all web applications make use of content, yet very few of them are actually backed by any kind of content repository. The aforementioned three points are the pillars on top of which developers can create and deploy a new breed of content-enabled applications that are backed by Alfresco’s CMIS-compliant repository. These new applications can be social, e-commerce, multimedia, business-oriented or anything else. More importantly, developers now have the flexibility to easily deploy to their preferred environment knowing that Alfresco can be properly managed regardless of where it’s deployed.

We emphasize the “cloud” because it conveniently allows for an environment to scale on-demand based on metrics such as performance or query volume, but understand that the same principles can be applied to more traditional environments as well.

In any case, I invite readers to attend the webinars we’ve got scheduled for after the new year and send any questions to cloud -at- alfresco -dot- com.

Records Management and Cloud Storage

Alan Pelz-Sharpe brings up an interesting topic about Records Management in the cloud and concludes that the cloud is “dangerous” to RM. I’m no expert in the intimate details of RM, but I must observe that Alan made certain blanket assumptions about cloud storage that are not accurate and, at the very least, cannot be broadly applied to all cloud providers.

To begin, Alan notes that users of cloud storage services have no control as to where content gets physically stored and that one could never know where the content is at any given time, implying that the content could be replicated all over the globe. I think this is based on incomplete knowledge of how cloud storage solutions work.

From a technical standpoint, the primary aspects that distinguish “cloud storage” from more traditional storage are:

  • Use of commodity hardware and/or software.
  • ‘Chunking’ of files and replication of said ‘chunks’ across multiple storage nodes.
  • Simple retrieval and insertion API, often delivered through REST and/or SOAP interfaces.

It is the second point that I wish to elaborate on, as I believe this is where Alan is mistaken in his observations. To make things more concrete, I’m going to focus on Amazon S3 as a specific example, but other cloud storage providers such as Nirvanix, EMC and Iron Mountain operate under similar principles.

Many cloud-storage solutions are implemented in a similar fashion to the Google File System (GFS) where in order to achieve fault tolerance and redundancy of data, files (particularly large files) are broken up into chunks and replicated to a certain minimum number of storage nodes. This allows for one or more nodes to fail while ensuring the integrity of the data. In the case of Amazon S3, the data is replicated across different “availability zones” within one geographic “region”.
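
To make the chunk-and-replicate idea concrete, here is a toy Python model. It is only a sketch under my own assumptions: the round-robin placement and the function names are illustrative and are not how GFS or Amazon S3 actually place data. It splits a file into fixed-size chunks, puts each chunk on a minimum number of distinct nodes, and checks whether the file survives a given set of node failures.

```python
def chunk(data, size):
    """Split bytes into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def place_chunks(chunk_count, nodes, replicas):
    """Assign each chunk index to `replicas` distinct nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        i: [nodes[(i + r) % len(nodes)] for r in range(replicas)]
        for i in range(chunk_count)
    }


def recoverable(placement, failed_nodes):
    """True if every chunk still has at least one copy on a healthy node."""
    return all(
        any(node not in failed_nodes for node in holders)
        for holders in placement.values()
    )
```

With four nodes and two replicas per chunk, any single node can fail without losing data, while losing two nodes that happen to hold both copies of the same chunk makes the file unrecoverable. That is why real systems let you raise the minimum replica count.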

Amazon’s US region is in Virginia while their EU region resides in Ireland. An availability zone is a collection of servers within a region that share common infrastructure such as power and cooling. Amazon further describes how they ensure fault separation between availability zones as follows:

“Amazon EC2 provides customers the flexibility to place instances within multiple geographic regions as well as across multiple Availability Zones. Each Availability Zone is designed with fault separation. This means that availability zones are physically separated within a typical metropolitan region, on different flood plains, in seismically stable areas. In addition to discrete uninterruptable power source (UPS) and onsite backup generation facilities, they are each fed via different grids from independent utilities to further reduce single points of failure. They are all redundantly connected to multiple tier-1 transit providers.”

Users of S3 can choose which region their content is stored in and hence they know which region’s datacenters hold the data. Content is simply NOT replicated across regions unless the user specifically inserts the content into more than one region, which requires TWO API calls: one to insert into the US and another to insert into the EU. Note that Amazon allows users to transmit the data through HTTPS and to encrypt said content prior to insertion so as to virtually guarantee that the content is unreadable in the very unlikely event that someone gains illicit access to the data.
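
Here is a small Python sketch of what “one call per region” looks like in practice. It assumes a client object with a put_object(Bucket=..., Key=..., Body=...) method, which is the shape of the S3 client in today’s boto3 library; the helper names and any bucket names you pass in are my own, hypothetical choices.

```python
def store_in_regions(clients, buckets, key, body):
    """Write the same object once per region and return the regions written.

    `clients` maps a region name to an S3-style client, and `buckets` maps the
    same region names to a bucket that lives in that region.
    """
    written = []
    for region, client in clients.items():
        # One explicit API call per region: nothing is copied automatically.
        client.put_object(Bucket=buckets[region], Key=key, Body=body)
        written.append(region)
    return written


def make_boto3_clients(regions):
    """Build real S3 clients (requires the third-party boto3 package)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed
    return {region: boto3.client("s3", region_name=region) for region in regions}
```

For example, make_boto3_clients(["us-east-1", "eu-west-1"]) gives you one client for Virginia and one for Ireland. Leave a region out of the dictionary and the content never goes there.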

So, now that we know we can control where the data resides, let’s discuss disposition… Amazon’s security whitepaper states:

“Securing data at rest involves physical security and data encryption. [...] Amazon employs multiple layers of physical security measures to protect customer data at rest. For example, physical access to Amazon S3 datacenters is limited to an audited list of Amazon personnel. Encryption of sensitive data is generally a good security practice, and Amazon encourages users to encrypt their sensitive data before it is uploaded to Amazon S3.

When an object is deleted from Amazon S3, removal of the mapping from the public name to the object starts immediately, and is generally processed across the distributed system within several seconds. Once the mapping is removed, there is no remote access to the deleted object. The underlying storage area is then reclaimed for use by the system.

[...]

When a storage device has reached the end of its useful life, AWS procedures include a decommissioning process that ensures customer data are not exposed to unauthorized individuals. AWS uses the techniques detailed in DoD 5220.22-M (“National Industrial Security Program Operating Manual”) or NIST 800-88 (“Guidelines for Media Sanitization”) to destroy data, as part of the decommissioning process.”

I believe this is more than adequate in an RM scenario, but should further evidence be required, I suggest reading Amazon’s security and HIPAA whitepapers.

I also highly recommend that all interested parties first have in-depth discussions with the different players in this space before coming to any hasty conclusions.

To help you begin, I offer this list of cloud storage vendors:

Amazon
Nirvanix
EMC (Atmos)
Iron Mountain
ParaScale
Caringo
Gluster

Alfresco Public EC2 Image Now Available

Alfresco has recently announced the availability of the first “official” Alfresco server image for Amazon EC2. Along with that, we’re also among the first members of the Amazon Solution Provider Program.

Though in actuality we at Alfresco have had plenty of experience running Alfresco on the Amazon cloud, this marks the first time we’ve created a ready-to-use public image.

Because this is the first public image and it’s built on Alfresco Community 3.2, we’re initially targeting developers and early adopters with this release. We’d like to encourage interested parties to try it out and report any issues by emailing us at cloud (at) alfresco (dot) com.

Here are some possible uses you can put this cloud server to if you need some ideas:

  • Try Alfresco Explorer and Alfresco Share
  • CMIS Development and Integration Sandbox
  • Web Script Development
  • Alfresco Share Dashlet Development
  • Customize Alfresco and bundle a new image based on this one

Longer-term, we’re looking at doing the following:

  • Offer Alfresco Enterprise Images.
  • Auto-mounting of EBS volumes and perhaps Amazon S3-content storage.
  • Self-configuring, elastic clustering for true cloud-style scalability.
  • Images for other cloud providers and platforms such as GoGrid, Rackspace Cloud and RightScale.

You can learn more about this image by visiting the Alfresco EC2 Wiki Page and viewing this demo video.

Host your own cloud storage array with Parascale

Last Friday, I visited with Parascale where I met with Mike Maxey (Dir. of Product Management), Sajai Krishnan (CEO), and Cameron Bahar (CTO and Founder). Parascale has developed software that allows you to turn 2 or more Linux servers into an Amazon S3-like cloud storage array that runs in userspace and can be easily mounted via NFS.

Parascale competes with Caringo’s CAStor in that it too can automatically replicate files across your storage nodes so as to ensure that your files are safe in the event one or more nodes fail.

The main differences from Caringo are that Parascale’s software runs in userspace on a fairly vanilla Linux server, while Caringo’s CAStor is delivered as a completely self-contained Linux software appliance, and that Parascale can be mounted via NFS without requiring a “Content File Server” or use of a REST API (for which there exists a very good Alfresco-CAStor Adapter).

A neat thing about Parascale is that it’s rather lightweight, which gave me the ability to install Alfresco on one of the Linux boxes that also ran the storage services. This can be put to good use in a production environment as it allows enterprises to maximize their hardware investment by consolidating Alfresco and reliable storage.

I decided to run a 100,000 document benchmark while we stepped out to lunch. Upon our return I was quite pleased that the rather under-powered server I was running on managed about 8 documents per second. Unlike my GoGrid benchmarks, the benchmark client was running on the same physical machine as Alfresco and the storage software so I didn’t need to deal with network latency.

Parascale has already proven the longevity and scalability of their storage cloud but you can expect even more enhancements over the coming months. Check them out at http://parascale.com

Mike Maxey will be present at the special cloud event in San Francisco on May 13th. You can learn more about the event and register here.

Alfresco in the Cloud: GoGrid

I spent a few hours this weekend installing and configuring an Alfresco cluster on GoGrid and ran a 200,000 document Alfresco Benchmark against the cluster. Setup was pretty easy: I simply used the graphical tools to drag in 4 Linux servers (1 benchmark client, 1 MySQL, and 2 Alfresco) plus a load-balancer and a cloud storage appliance.

I struggled a few minutes with some Linux firewall issues but after resolving those and raising the ulimit to max, I had a happy cluster. The benchmarks ran for a few hours and averaged a very respectable 10 documents per second.

You can learn more about GoGrid by visiting their site. I’ll also be discussing this and other topics at the San Francisco user conference on May 14th and during the Special Cloud Event on the evening of May 13th. There are still seats available for the cloud computing event so please register by emailing harinder.sunder@alfresco.com.

Cloud Computing Special Event in San Francisco!

Interested in learning about Cloud Computing?

Curious to know how Alfresco can be deployed to the cloud?

If you’re in the San Francisco Bay Area, you may be interested to learn that Alfresco has partnered with GoGrid, provider of cloud-based hosting services, to organize a special event to be held the evening of May 13th, the night before the Alfresco Meetup.

Some of the topics that we will explore include:

  • What is cloud computing?
  • What are the benefits of leveraging the cloud?
  • How does Alfresco fit into the cloud?

This event will feature experts from Alfresco, GoGrid (Randy Bias and Michael Sheehan) and CNet cloud computing blogger James Urquhart (Wisdom of the Clouds). We’ll also have guests from Parascale, which develops cloud-style storage software that you can run in your own datacenter.

There will be opportunities to network with fellow Alfresco and cloud enthusiasts as well as ask questions.

IMPORTANT NOTE: There’s room for only 30 guests, so we require registration for this event.

Date: May 13th, 2009
Time: 7:00pm (Please show up early as registration with security is required)
Place: GoGrid/ServePath Corporate Office - 2 Harrison St. Suite 200 - San Francisco
Registration: To register, please send an email to harinder.sunder@alfresco.com

* GoGrid Cloud Hosting is a service of ServePath Dedicated Hosting

Are you doing something interesting or cool with Alfresco Labs or Enterprise?

I’m doing a quick, informal survey of what people are doing with Alfresco Labs and/or Alfresco Enterprise. Just answer these 6 easy questions at http://snurl.com/gs4mc.

I’d love to hear about cool implementations, integrations with other systems, etc. Anything goes!

CMIS-based Alfresco + Fujitsu ScanSnap Integration

Check out this demo video of a CMIS-based integration between Alfresco and the Fujitsu ScanSnap scanner.

The CMS Vendor Meme

Well, this is kind of amusing… Our fellow open sourcerors over at Magnolia have tagged Alfresco, among other CMSs, to participate in the “CMS Vendor Meme”.

The rules are as follows:
  • A CMS vendor is challenged to honestly answer all items on the “Reality checklist for vendors” suggested by CMSWatch’s Kas Thomas (aka the “we-get-it checklist for vendors”).
  • If possible, the vendor has to supply screenshots, links or other means to make it easy to verify the answers.
  • The answers also need to be supplied in a short form of one to three stars (denoting “no”, “sort-of”, “yes”).
  • Answering all questions on his blog allows the vendor to tag some other WCMS vendors.
  • A tagged vendor should provide a link back to the blog that tagged him.

I’m hereby invoking our right to tag: Documentum, Oracle/Stellent and finally our good friends at Acquia/Drupal.

Now, in the spirit of goodwill, fun and transparency, I offer you our response…

“WE GET IT” CHECKLIST FOR VENDORS

1. Our software comes with an installer program.

Installers are available for Windows, Linux and Mac. Non-installer distributions are also available. Even the WAR distribution generally only requires that the software be unzipped and it’s ready to go.

2. Installing or uninstalling our software does not require a reboot of your machine.

No reboot is required…

3. You can choose your locale and language at install time, and never have to see English again after that.

Users pick their language at login time, not install time. Language packs are downloaded and installed separately, as described on our Wiki.

4. Eval versions of the latest edition(s) of our software are always available for download from the company website.

We make both our Labs and Enterprise editions available for download. Registration is required for the Enterprise edition, not for the Labs edition. Nightly builds of the Labs edition are also available.

5. Our WCM software comes with a fully templated “sample web site” and sample workflows, which work out-of-the-box.

A sample “Alfresco” site is included but I’m not personally too happy with it. I have my own demo sites that I use, but I’m not allowed to distribute them since we don’t own the copyright on the templates that I used. Now that Web Studio is coming along, I expect us to be able to offer a fully template-driven sample site very soon.

6. We ship a tutorial.

We certainly do… Documentation is very important to us and we continue to make every effort to improve upon it.

7. You can raise a support issue via a button, link, or menu command in our administrative interface.

Yup, a link is available on the upper right corner of every screen.

8. All help files and documentation for the product are laid down as part of the install.

Help docs are accessible via the Web UI and the remainder of our documentation is available online via our Wiki and Customer site.

[UPDATE: We no longer include the tutorial document as part of the distribution but we do link to the online help on every product screen. Nonetheless, I've knocked down our score to 1 star instead of 2 because of this.]

9. We run our entire company website using the latest version of our own WCM products.

Indeed we do…

10. Our salespeople understand how our products work.

Inasmuch as a non-technical/non-web-development savvy person can, yes. But that’s why we have a stellar team of Solution Engineers ready to lend prospects, customers and community members a helping hand.

11. Our software does what we say it does.

Yup… That’s the whole idea, ain’t it? Download it and see!

12. We don’t charge extra for our SDK.

The SDK is completely open source and free.

13. Our licensing model is simple enough for a 5-year-old to understand.

You tell me: the Labs version is completely free, while the Enterprise edition’s pricing is based on a yearly support subscription fee metered by the number of CPUs (aka ‘sockets’) on the server(s) you’re installing on. Up to 4 cores are allowed per CPU.

14. We have one price sheet for all customers.

There is indeed only one price-sheet…

15. Our top executives are on Skype, Twitter, or some similar channel, and you can feel free to contact them directly at any time.

Absolutely. The most complete list of Twitter IDs is on our Wiki.