7 thoughts on “Metadata Embedding”

  1. Axel Faust

    Embedding / transferring metadata into OOXML documents is a request that is relevant quite often. In the past, we’ve used custom integrations with DOCX4J (Apache License) to transfer metadata into customer content. Combined with metadata extractors, this could enhance the SharePoint protocol experience (effectively providing the same Content Type Document Properties feature). As far as “other formats” are concerned, I can’t think of any that would be nearly as relevant / significant.
    Concerning EXIF, I can’t wait for a GeoTagger UI extension in Share to manage / correct those files where my camera just wouldn’t get a clear GPS signal.

  2. nafets

    Question from a non-programmer: Is it possible to call exiftool from within say a rule and have it extract custom metadata from e.g. a pdf (i.e. XMP tags) and get those to populate the Alfresco metadata fields?
    Cheers

  3. Ray Gauss II Post author

    Hi nafets,

    The Media Management module does something similar using exiftool, though you wouldn’t normally have a rule interact directly with the command line exiftool.

    Normally the way extraction works is that once the content is ingested the metadata extraction is automatically performed, without the need for an explicit rule, by executing an action which:

    1. Finds the proper metadata extractor based on the content’s mimetype and your repository’s configuration and tells that extractor to extract the metadata.

    2. Has that extractor read the ‘raw’ metadata fields and then map them to Alfresco data model properties based on its mapping configuration.

    Now, in the case of PDFs, Apache Tika (the primary underlying metadata extraction library) can already read custom XMP, so all you have to do is add your custom XMP fields to the mapping config for the PDF extractor!

    See http://wiki.alfresco.com/wiki/Metadata_Extraction#Configuring_the_Extractor
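The two steps above can be sketched in plain Java. This is only an illustration under stated assumptions: `ExtractionSketch`, its registry, the stubbed PDF extractor, and the `myxmp:projectCode` / `my:projectCode` names are all hypothetical and are not Alfresco or Tika APIs. The point is the shape of the flow: pick an extractor by mimetype, read raw fields, then keep only the fields that the mapping configuration names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of the extraction flow; none of these names are Alfresco APIs. */
class ExtractionSketch {

    // Step 1: extractors keyed by mimetype (a stand-in for the repository configuration).
    static final Map<String, Function<byte[], Map<String, String>>> EXTRACTORS = new LinkedHashMap<>();

    // Step 2: per-mimetype mapping from raw field name to content model property.
    static final Map<String, Map<String, String>> MAPPINGS = new LinkedHashMap<>();

    static {
        // Stub extractor: a real one would parse the content; this returns fixed raw fields.
        EXTRACTORS.put("application/pdf", content -> {
            Map<String, String> raw = new LinkedHashMap<>();
            raw.put("author", "Jane Doe");
            raw.put("myxmp:projectCode", "P-42"); // hypothetical custom XMP field
            raw.put("producer", "SomeTool");      // extracted but not mapped below
            return raw;
        });

        Map<String, String> pdfMapping = new LinkedHashMap<>();
        pdfMapping.put("author", "cm:author");
        pdfMapping.put("myxmp:projectCode", "my:projectCode"); // the added custom mapping
        MAPPINGS.put("application/pdf", pdfMapping);
    }

    /** Returns model properties for the content, or an empty map if no extractor matches. */
    static Map<String, String> extract(String mimetype, byte[] content) {
        Map<String, String> properties = new LinkedHashMap<>();
        Function<byte[], Map<String, String>> extractor = EXTRACTORS.get(mimetype);
        if (extractor == null) {
            return properties;
        }
        Map<String, String> mapping = MAPPINGS.getOrDefault(mimetype, new LinkedHashMap<>());
        for (Map.Entry<String, String> rawField : extractor.apply(content).entrySet()) {
            String target = mapping.get(rawField.getKey());
            if (target != null) { // unmapped raw fields are dropped
                properties.put(target, rawField.getValue());
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        System.out.println(extract("application/pdf", new byte[0]));
    }
}
```

In this sketch, supporting a new custom field means adding one entry to the mapping; the extractor itself does not change, which mirrors the "just add it to the mapping config" advice.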

    Hope that helps,

    Ray

  4. Mittal

    This is a great feature indeed! It opens up a whole new field of opportunities to implement complex requirements within Alfresco that we could not meet earlier, including use cases we could not implement (or even imagine) before. It will encourage customers to go for Alfresco. Good work, Alfresco team. Keep it up :)

  5. stefan de laet

    Hi Ray,

    I am investigating how to embed EXIF tags when Alfresco metadata changes.
    If I understand the exiftool code correctly, it only supports embedding of IPTC and XMP tags (in ExiftoolTikaIptcMapper.java) but ignores changes to EXIF tags, right?
    Am I correct that, in order to embed exif tag changes, I would need to create a new ExiftoolTikaMapper class and use this as mapper for the ExiftoolExternalEmbedder constructor?
    Or is there an easier way?

    kr

    Stefan

    1. Ray Gauss II Post author

      Hi Stefan,

      You are correct. You would extend ExiftoolTikaIptcMapper or create a new ExiftoolTikaMapper for EXIF exclusively and pass that into the ExiftoolExternalEmbedder constructor.

      As you’ve probably seen, the main job of that mapper is to translate between Exiftool fields and Tika fields, in both directions.
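A minimal sketch of what such a two-way translation can look like, using plain strings in place of Tika `Property` objects. `FieldMapperSketch` and its methods are hypothetical names for illustration and are not the Media Management module's classes; the actual mapper may be structured differently.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical two-way field mapper; plain strings stand in for Tika Property objects. */
class FieldMapperSketch {

    // One Tika field may be written to several Exiftool fields.
    private final Map<String, List<String>> tikaToExiftool = new LinkedHashMap<>();
    // The reverse index is derived as mappings are registered.
    private final Map<String, String> exiftoolToTika = new LinkedHashMap<>();

    /** Registers a mapping and keeps the reverse index in sync. */
    void map(String tikaField, String... exiftoolFields) {
        tikaToExiftool.put(tikaField, Arrays.asList(exiftoolFields));
        for (String exiftoolField : exiftoolFields) {
            exiftoolToTika.put(exiftoolField, tikaField);
        }
    }

    /** Embedding direction: Tika field to the Exiftool fields to write. */
    List<String> toExiftool(String tikaField) {
        return tikaToExiftool.getOrDefault(tikaField, Collections.emptyList());
    }

    /** Extraction direction: Exiftool field to its Tika field, or null if unmapped. */
    String toTika(String exiftoolField) {
        return exiftoolToTika.get(exiftoolField);
    }

    public static void main(String[] args) {
        FieldMapperSketch mapper = new FieldMapperSketch();
        mapper.map("exif:IsoSpeedRatings", "ISOSpeedRatings", "ISO");
        System.out.println(mapper.toExiftool("exif:IsoSpeedRatings"));
        System.out.println(mapper.toTika("ISO"));
    }
}
```

Deriving the reverse index from the forward map keeps the two directions from drifting apart when an EXIF-only mapper adds new entries.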

      Regards,

      Ray

      1. stefan de laet

        Thank you, Ray. Just in case you want to include this EXIF data embedding feature in future releases, here is the code I’ve added to ExiftoolTikaIptcMapper to make it work:

        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("tiff:ImageWidth"),
            Arrays.asList(Property.internalTextBag("ImageWidth")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("tiff:ImageLength"),
            Arrays.asList(Property.internalTextBag("ImageLength")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("tiff:Make"),
            Arrays.asList(Property.internalTextBag("Make")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("tiff:Model"),
            Arrays.asList(Property.internalTextBag("Model")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("tiff:Software"),
            Arrays.asList(Property.internalTextBag("Software")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("tiff:Orientation"),
            Arrays.asList(Property.internalTextBag("Orientation#"))); // # allows writing numbers
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("tiff:XResolution"),
            Arrays.asList(Property.internalTextBag("XResolution")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("tiff:YResolution"),
            Arrays.asList(Property.internalTextBag("YResolution")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("tiff:ResolutionUnit"),
            Arrays.asList(Property.internalTextBag("ResolutionUnit#"))); // # allows writing numbers
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("exif:Flash"),
            Arrays.asList(Property.internalTextBag("Flash")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("exif:ExposureTime"),
            Arrays.asList(Property.internalTextBag("ExposureTime")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("exif:FNumber"),
            Arrays.asList(Property.internalTextBag("FNumber")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("exif:FocalLength"),
            Arrays.asList(Property.internalTextBag("FocalLength")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("exif:IsoSpeedRatings"),
            Arrays.asList(
                Property.internalTextBag("ISOSpeedRatings"),
                Property.internalTextBag("ISO")));
        _tikaToExiftoolMetadataMap.put(
            Property.internalTextBag("exif:DateTimeOriginal"),
            Arrays.asList(Property.internalTextBag("DateTimeOriginal")));
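To illustrate why the `#` suffix matters, here is a rough sketch of turning such a mapping into exiftool write arguments. exiftool's command line accepts `-TAG=VALUE` for writing, and a `#` after the tag name disables print conversion so the numeric value is written as-is. How ExiftoolExternalEmbedder actually assembles its command line is not shown in this thread, so `ExiftoolArgsSketch` and `buildArgs` are hypothetical and use plain strings instead of Tika `Property` objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: builds exiftool "-TAG=VALUE" arguments from a Tika-to-Exiftool map. */
class ExiftoolArgsSketch {

    static List<String> buildArgs(Map<String, List<String>> tikaToExiftool,
                                  Map<String, String> tikaValues) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> value : tikaValues.entrySet()) {
            List<String> targets = tikaToExiftool.get(value.getKey());
            if (targets == null) {
                continue; // fields without a mapping are not embedded
            }
            for (String target : targets) {
                // A target such as "Orientation#" yields "-Orientation#=6",
                // which asks exiftool to write the raw number.
                args.add("-" + target + "=" + value.getValue());
            }
        }
        return args;
    }

    public static void main(String[] args) {
        Map<String, List<String>> mapping = new LinkedHashMap<>();
        mapping.put("tiff:Make", Arrays.asList("Make"));
        mapping.put("tiff:Orientation", Arrays.asList("Orientation#"));

        Map<String, String> values = new LinkedHashMap<>();
        values.put("tiff:Make", "Canon");
        values.put("tiff:Orientation", "6");

        System.out.println(buildArgs(mapping, values));
    }
}
```

Without the `#`, exiftool expects the print-converted form of the value (for Orientation, a description such as "Rotate 90 CW") rather than the number stored in the file.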

