Archive for December, 2008

Great Surf / Share Learning Resource - Ed Loves Java

Friday, December 12th, 2008

I recently discovered a great set of blog posts digging into Surf and Share. Ed of the EdLovesJava blog has been experimenting and doing some interesting things, first with the Surf platform and now with Share. There are six posts on learning Surf and two on Share. Good reading to expand your knowledge of Surf.

For more introductory material on Surf, see the Surf Platform category on the wiki.

There is also a webinar I did a few weeks ago that runs through a basic technical overview of Surf and some basic extensions.

Occasional Windows Issue

Wednesday, December 3rd, 2008

I use Windows XP for development, and every so often I hit a strange issue. I would kill Alfresco and try to start it up again, but would get errors consistent with Alfresco still being up (the JVM_Bind error below). The error is caused by Java trying to bind to a port that is already taken.

java.net.BindException: Address already in use: JVM_Bind
    at java.net.PlainSocketImpl.socketBind(Native Method)
    at java.net.PlainSocketImpl.bind(PlainSocketImpl.java:359)
    at java.net.ServerSocket.bind(ServerSocket.java:319)

I double-checked that all of my Java processes were killed, but no cigar. I even tried redirecting Tomcat to another port until the next reboot, but then I’d also have to redirect the RMI ports.


I finally figured out the problem. The ports were still held by the OpenOffice process soffice.bin. Killing it resolved the issue.
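
To see which ports are still held before restarting, a small standalone Java check can help. This is a minimal sketch and not part of Alfresco: the class name PortCheck is made up for illustration, and the port numbers in main are example values only (8080 is Tomcat's usual default HTTP port; the second is a placeholder for whichever RMI port your install binds).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

class PortCheck {

    /** Returns true if a server socket can currently be bound to the given port. */
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(port));
            return true;
        } catch (IOException e) {
            // A BindException (a subclass of IOException) means something still holds the port
            return false;
        }
    }

    public static void main(String[] args) {
        // Example ports only (assumption): replace with the ports your install uses
        int[] ports = {8080, 50500};
        for (int port : ports) {
            System.out.println(port + " -> " + (isPortFree(port) ? "free" : "in use"));
        }
    }
}
```

If a port shows as in use while no Java process is running, look at other processes that Alfresco started, such as soffice.bin.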


XML Metadata Extraction for WCM

Monday, December 1st, 2008

While the XSDs in WCM (AVM) are the equivalent of content models in DM, there is no effective way to search the content created from them. More specifically, it’s useful to be able to search on specific metadata elements in the generated XML files, something that you need to do frequently in highly dynamic sites. In this post, we’ll discuss the sample I created for this that’s in the content community here (registration required).

In this example, I have a WCM content type defined through the XSD press_release.xsd. I want to extract some metadata from it, for example the expiration date of the press release. The extraction process works similarly to the way things function on the DM side, with some gotchas. When extracting metadata, you need somewhere (properties) to store it.

For WCM (where all the XML nodes are stored as the wcm:avmplaincontent type), you do this by creating an aspect with the properties that you want to extract. This is important: it has to be an aspect. This aspect will automatically get applied appropriately, as we’ll see later. In the included example, I have an XSD that creates a press release. I want to extract and index three properties: abstract (string), expiration date (date), and numtimes (int). Here is the code (I removed all the indexing properties for simplicity):

<aspects>
    <aspect name="my:press_release_metadata">
        <title>Sample Aspect for WCM - Press Release</title>
        <properties>
            <property name="my:abstract">
                <type>d:text</type>
            </property>
            <property name="my:expiration_date">
                <type>d:datetime</type>
            </property>
            <property name="my:numtimes">
                <type>d:int</type>
            </property>
        </properties>
    </aspect>
</aspects>

You’ll also need to expose your aspect properties in the UI through web-client-config-custom.xml:

<config evaluator="aspect-name" condition="my:press_release_metadata">
    <property-sheet>
        <show-property name="my:abstract" />
        <show-property name="my:expiration_date" />
        <show-property name="my:numtimes" />
    </property-sheet>
</config>

Once I have the aspect in my content model (customModel.xml, which I introduce to the Data Dictionary through custom-model-context.xml), I can start configuring the extraction process as outlined in wcm-xml-metadata-extracter-context.xml. There are two key sections:

  1. Selector section (extracter.xml.sample.selector.XPathSelector bean), which looks inside the XML and maps it to the correct Extractor Bean.  Since all the XForms of any type get saved as XML, we need to select the appropriate one (in this case pr:press_release). This configuration associates the specific XForm with a specific extraction definition.

    <bean id="extracter.xml.sample.selector.XPathSelector"
          class="org.alfresco.repo.content.selector.XPathContentWorkerSelector"
          init-method="init">
        <property name="workers">
            <map>
                <entry key="/pr:press_release">
                    <ref bean="extracter.xml.sample.AlfrescoCustomModelMetadataExtracter" />
                </entry>
            </map>
        </property>
    </bean>

  2. Extractor bean for each of the Web Content Types you defined.  These have two parts:

    A. xpathMappingProperties: each entry takes an XPath expression that extracts a value from the XML file and stores it in an internal map. For example, the abstract property can be found through the XPath expression "/press_release/abstract". It then gets stored under the "abstract" key in the internal map.

    <prop key="abstract">/press_release/abstract</prop>

    Note that we also have to specify the namespace so Alfresco can resolve the prefix appropriately:

    <prop key="namespace.prefix.pr">http://www.alfresco.org/alfresco/pr</prop>

    B. mappingProperties: takes each value out of the internal map and puts it into the specified data dictionary property. Here is the key: the extractor finds the corresponding aspect and automatically ("automagically") adds it to the Alfresco node. Before setting the property, it checks the target data type and attempts to convert the value to that type. In our example, it takes the internal map entry "abstract" and sets it on the property my:abstract. In this case the property is a string, so no conversion is really required.

    <prop key="abstract">my:abstract</prop>
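
The two mapping stages described above can be illustrated outside Alfresco with the JDK's own XPath API. This is only a sketch under stated assumptions, not the extractor's actual code: the class and method names are hypothetical, the sample XML is invented, and the element name numtimes is a guess. It also assumes the elements live in the pr namespace, so each XPath step carries the pr: prefix. Type conversion is left out; values stay as strings.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathMappingSketch {

    // Namespace URI taken from the namespace.prefix.pr setting
    static final String PR_NS = "http://www.alfresco.org/alfresco/pr";

    /** Stage A: evaluate each XPath against the XML and keep the text under its internal key. */
    static Map<String, String> extractRaw(String xml, Map<String, String> xpathMappings) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for prefixed XPath steps to match
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "pr".equals(prefix) ? PR_NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String uri) {
                return PR_NS.equals(uri) ? "pr" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String uri) {
                return PR_NS.equals(uri)
                        ? Collections.singletonList("pr").iterator()
                        : Collections.<String>emptyIterator();
            }
        });

        Map<String, String> raw = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : xpathMappings.entrySet()) {
            // evaluate(...) returns an empty string when nothing matches
            raw.put(entry.getKey(), xpath.evaluate(entry.getValue(), doc));
        }
        return raw;
    }

    /** Stage B: rename internal keys to their target model properties. */
    static Map<String, String> mapToProperties(Map<String, String> raw, Map<String, String> mapping) {
        Map<String, String> properties = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : mapping.entrySet()) {
            String value = raw.get(entry.getKey());
            if (value != null) {
                properties.put(entry.getValue(), value);
            }
        }
        return properties;
    }

    public static void main(String[] args) throws Exception {
        // Invented sample document (assumption): real XForm output will differ
        String xml = "<pr:press_release xmlns:pr=\"" + PR_NS + "\">"
                + "<pr:abstract>Quarterly results</pr:abstract>"
                + "<pr:numtimes>3</pr:numtimes>"
                + "</pr:press_release>";

        Map<String, String> xpathMappings = new LinkedHashMap<>();
        xpathMappings.put("abstract", "/pr:press_release/pr:abstract");
        xpathMappings.put("numtimes", "/pr:press_release/pr:numtimes");

        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("abstract", "my:abstract");
        mapping.put("numtimes", "my:numtimes");

        System.out.println(mapToProperties(extractRaw(xml, xpathMappings), mapping));
    }
}
```

Keeping the two maps separate mirrors the configuration: the XPath expressions can change with the XSD while the internal keys and target properties stay the same.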

Note on converting dates: When I first did this, I got an exception while converting dates. XForms store dates in the format 2008-04-28, and the automatic cast did not work. To remedy that, I added a configuration setting that specifies the correct date format for the extractor to use:

<property name="supportedDateFormats">
    <list>
        <value>yyyy-MM-dd</value>
    </list>
</property>

Note that metadata extraction runs when you create content (in the user sandbox). The key here is that the aspect gets applied automatically by the extraction process; you don’t need to make sure it’s added. You don’t even see the aspect mentioned anywhere in the configuration files. This is what it should look like in the example:


The Lucene indexing will happen when you promote the content to the staging sandbox.

Indexing is governed by the <index> element on each aspect property you define:

<property name="my:expiration_date">
    <type>d:datetime</type>
    <index enabled="true">
        <atomic>true</atomic>
        <stored>false</stored>
        <tokenised>false</tokenised>
    </index>
</property>

You cannot search on these properties from the Search UI, since they belong to WCM content, but you can query them in the Node Browser for testing. The ultimate goal, of course, is likely to expose them through web scripts.
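
Lucene property queries are fiddly to type by hand because the colon in the property prefix has to be escaped. The helper below is a rough sketch that builds query strings in the @prefix\:property form used for property queries in Alfresco's Lucene search. The class and method names are hypothetical, the values are examples, and the date syntax accepted in range queries may differ between versions, so treat the output as a starting point to paste into the Node Browser rather than a guaranteed-valid query.

```java
class LuceneQuerySketch {

    /** Turns a prefixed property name such as my:numtimes into the field form @my\:numtimes. */
    static String field(String prefixedName) {
        return "@" + prefixedName.replace(":", "\\:");
    }

    /** Exact-match query on a property, with the value quoted. */
    static String exact(String prefixedName, String value) {
        return field(prefixedName) + ":\"" + value.replace("\"", "\\\"") + "\"";
    }

    /** Inclusive range query on a property. */
    static String range(String prefixedName, String from, String to) {
        return field(prefixedName) + ":[" + from + " TO " + to + "]";
    }

    public static void main(String[] args) {
        // Example values only (assumption)
        System.out.println(exact("my:numtimes", "3"));
        System.out.println(range("my:expiration_date", "2008-12-01T00:00:00", "2008-12-31T23:59:59"));
    }
}
```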

