Exporting Data from Adobe Experience Manager


Adobe Experience Manager (AEM) has no single, formal feature for exporting data to files, but there are several ways to pull it off.

Nodes can be saved outside the repository in two basic ways: as serialized representations of the nodes or as simple folders and files. Serialized nodes retain most of the information about the node and its properties, and in most cases the serialized data can be used to recreate a replica of the node in a repository. In a simple file/folder representation, a node is saved as a file if its node type is nt:file and as a folder if it is any other type. Property information is not retained; only the hierarchy and the binary/text data of the nt:file nodes are. Unless the original content is nothing more than a simple folder/file structure, bringing the files back into AEM or Sling probably will not reproduce the original nodes.
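To make the loss of information concrete, here is a minimal sketch of the file/folder mapping described above. The nested-dictionary shape of the tree and the function name to_files_and_folders are my own assumptions for illustration; they are not an AEM or Sling API.

```python
def to_files_and_folders(name, node, parent=""):
    """Return (kind, path) pairs that a plain file/folder export would keep.

    node: a dict whose "jcr:primaryType" entry holds the node type, whose
          other dict values are child nodes and whose non-dict values are
          properties (a hypothetical in-memory shape, used only here).
    """
    path = f"{parent}/{name}"
    if node.get("jcr:primaryType") == "nt:file":
        # Only the binary/text data survives, as a file; properties below
        # the file node (such as jcr:mimeType) are not kept.
        return [("file", path)]
    entries = [("folder", path)]
    for child_name, child in node.items():
        # Dict values are child nodes; plain values are properties and are dropped.
        if isinstance(child, dict):
            entries.extend(to_files_and_folders(child_name, child, path))
    return entries


# A made-up page with one title property and one image below it.
sample_tree = {
    "jcr:primaryType": "cq:Page",
    "jcr:content": {"jcr:primaryType": "cq:PageContent", "jcr:title": "Home"},
    "logo.png": {
        "jcr:primaryType": "nt:file",
        "jcr:content": {"jcr:primaryType": "nt:resource", "jcr:mimeType": "image/png"},
    },
}
exported = to_files_and_folders("home", sample_tree, "/content")
```

In this sketch the jcr:title value and the node types never reach the output, which is why re-importing the result would not give back the original page.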

The simplest serialization is the JSON representation of a node. Where a JSON rendering is available, requesting the node's path with the .json extension returns JSON data. Selectors can be used to control how deep to serialize the tree (a number of levels, or infinity) and whether the JSON should be formatted so that it is easy for people to read (tidy). See the Apache Sling documentation on rendering content for a quick description. JSON representations of nodes can also be imported into repositories.
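As a rough sketch of how such a request is put together, the helper below builds the URL for a node's JSON rendering. The function name sling_json_url, the host and the content path are placeholders of mine; the tidy, numeric depth and infinity selectors are the ones handled by Sling's default JSON rendering.

```python
from urllib.parse import quote


def sling_json_url(base_url, node_path, depth=None, tidy=False):
    """Build the URL of the JSON rendering of a node.

    depth: None for the default rendering, an int for that many levels,
           or the string "infinity" for the whole subtree.
    tidy:  add the "tidy" selector for indented, human-readable output.
    """
    selectors = []
    if tidy:
        selectors.append("tidy")
    if depth is not None:
        selectors.append(str(depth))
    # Selectors sit between the resource path and the .json extension.
    encoded_path = quote(node_path.rstrip("/"), safe="/:")
    return base_url.rstrip("/") + ".".join([encoded_path] + selectors + ["json"])


# Placeholder host and path: a local author instance and a made-up page.
url = sling_json_url("http://localhost:4502", "/content/my-site/en", depth=2, tidy=True)
```

The resulting URL can be fetched with any HTTP client that supplies valid credentials. Be careful with infinity on large trees, since the server may limit or refuse very large responses.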

The other serialization format I will mention is the Jackrabbit FileVault format. This serialization is a mixture of metadata files, folders, XML descriptors and binary files. As with the plain file/folder structure, the binary data of nt:file nodes is saved as files and all other nodes are saved as folders. The XML descriptors carry the additional information that the files and folders cannot hold.

Packages from Package Manager are zip files of data serialized in the FileVault format. Expand the zip and the metadata, files, folders and descriptors are there. The FileVault console tool (vlt) can be used to extract serialized data from a repository without the use of packages. Package Manager has both a GUI and, for the GUI-averse, a RESTful API. Both are described in the Adobe documentation.
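Because a package is an ordinary zip file, it can be inspected with standard tools. The sketch below groups the entries of a package into metadata, descriptors and content, assuming the usual FileVault layout of a META-INF/ directory next to a jcr_root/ directory with .content.xml descriptor files. The function names and the sample entries are made up for illustration.

```python
import io
import zipfile


def split_package_entries(package):
    """Group the file entries of a content package zip by their role.

    package: a path or file-like object for the zip file.
    Returns a dict of sorted entry names:
      "metadata"    - files under META-INF/
      "descriptors" - .content.xml files under jcr_root/
      "content"     - every other file under jcr_root/
    """
    groups = {"metadata": [], "descriptors": [], "content": []}
    with zipfile.ZipFile(package) as zf:
        for name in sorted(zf.namelist()):
            if name.endswith("/"):
                continue  # directory entry, carries no data of its own
            if name.startswith("META-INF/"):
                groups["metadata"].append(name)
            elif name.startswith("jcr_root/"):
                is_descriptor = name.rsplit("/", 1)[-1] == ".content.xml"
                groups["descriptors" if is_descriptor else "content"].append(name)
    return groups


def build_sample_package():
    """Create a tiny in-memory zip laid out like a package (demo data only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/vault/filter.xml", "<workspaceFilter/>")
        zf.writestr("jcr_root/content/site/.content.xml", "<jcr:root/>")
        zf.writestr("jcr_root/content/site/logo.png", b"\x89PNG")
    buf.seek(0)
    return buf


groups = split_package_entries(build_sample_package())
```

For a real package, pass the path of the downloaded zip file instead of the in-memory sample.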

The simplest way to get data out of a repository in file/folder form, without descriptor files or metadata, is to use WebDAV. WebDAV should not be enabled on production publish instances for security reasons, and its use in other environments may or may not be acceptable under a company's security restrictions. Where it is allowed, WebDAV is a relatively easy way to place content into and get content out of the repository. Adobe documents the use of WebDAV.

There is one more tool that can be used, the Static Agent. This replication agent does not replicate to another server; it writes the replicated node to the file system. After it is enabled and configured, activated content is saved to the file system as file/folder content, as if it were cached on a web server. Because it is a replication agent, content can also be passed to it directly, although using the Static Agent this way is much more complicated. Yogesh Upadhyay describes how to set up the Static Agent, use it for activation or call it directly: How to use static agent in CQ / WEM.



About The Author

Deke departed Southern Alabama as a young man, leaving the humidity and fire ants behind. The fire ants are catching up with him. He works for Adobe. Despite that, the opinions he expresses on this site are his own and should never, ever, be attributed to or blamed on Adobe. Ask him what he thinks of chiggers sometime.