Repo Exporter Guide

The Sophora Repo Exporter makes it easy to export parts of a repository. This is useful, e.g., for copying current live data to a fresh test system.

The Sophora Repo Exporter is a standalone tool. It connects to a Sophora server and exports data to XML. A flexible configuration lets you define which part of the repository is exported.

Capabilities

The Sophora Repo Exporter exports a part of a repository. The repository can hold a wide variety of documents and configuration data. Therefore the exporter can be configured to export exactly the data you want. The following parts can be configured and combined as you want:

  • Complete administration areas
  • Single items of administrative data
  • Documents

Here is the list of administrative data that can be exported:

  • Node types and their appropriate configuration
  • Structure nodes (and their hierarchy documents)
  • Roles
  • Users
  • Dictionaries
  • Proposal sections
  • A full set of all system documents

For documents, you can select precisely which ones you want to export:

  • By ID (UUID, Sophora ID or external ID)
  • By query (XPath or Solr)
  • By tag
  • By document type
  • By date (modification date, creation date)
  • By relative time since modification
  • By structure path
  • By document URL
  • Referenced documents by defining a recursion level
  • Only documents which have changed since the last export

For the simple exports from the DeskClient you can also configure which version of the Sophora XML should be produced, which properties should be ignored and which text properties should be treated as references.

Installation

When installing the Sophora Repo Exporter, it is recommended to use the following folder hierarchy:

cms-install-directory
		apps
				sophora-exporter-1.54.0
						sophora-exporter.sh
						...
				sophora-exporter > Symbolic link to sophora-exporter-1.54.0
				...
		repoexporter
				config
						sophora-exporter.json
				logs
				sophora-exporter.sh > Symbolic link to ../apps/sophora-exporter/sophora-exporter.sh
		...

This hierarchy is analogous to the directory structure of the Sophora server.
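
The recommended layout can be created with a few shell commands. The following is a minimal sketch, assuming a POSIX shell and an "ln" that supports the -n option (GNU and BSD do). CMS_HOME and VERSION are illustrative variables that are not part of the exporter, and the empty versioned directory merely stands in for the unpacked distribution.

```shell
#!/bin/sh
# Sketch only: CMS_HOME and VERSION are illustrative names, adjust them to your installation.
set -eu

CMS_HOME="${CMS_HOME:-./cms-install-directory}"
VERSION="1.54.0"

# Versioned application directory (the unpacked distribution goes here)
# plus the working directory with its config and logs folders.
mkdir -p "$CMS_HOME/apps/sophora-exporter-$VERSION" \
         "$CMS_HOME/repoexporter/config" \
         "$CMS_HOME/repoexporter/logs"

# Version-independent link: apps/sophora-exporter -> sophora-exporter-<VERSION>
# (-n replaces an existing link instead of descending into its target).
ln -sfn "sophora-exporter-$VERSION" "$CMS_HOME/apps/sophora-exporter"

# Link to the start script inside the working directory.
# The target is relative to the location of the link, not to the current directory.
ln -sf "../apps/sophora-exporter/sophora-exporter.sh" \
       "$CMS_HOME/repoexporter/sophora-exporter.sh"
```

With this layout, an update only needs a new versioned directory under "apps" and a re-pointed "sophora-exporter" link; the configuration and logs under "repoexporter" stay untouched.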

Configuration

To tell the Repo Exporter what to export, you need to create a configuration file. Everything that should be exported is defined in that file. Most parts are optional and can be combined with all other optional parts.

Syntax

The format of the configuration file is JSON. This means that the whole configuration is surrounded by curly brackets ({}). Each keyword is optionally surrounded by quotes (") and followed by a colon (:). Text values must be given in quotes. Options that belong together are surrounded by curly brackets. A list of configuration options is given in square brackets ([]). Different configuration blocks are separated by a comma (,).

For example, a single "documents" configuration that defines a list of UUIDs of documents to export is configured as follows.
Note that the first keyword "documents" is written without quotes and "uuids" with quotes. This is just to show that both are possible. The quotes are correct JSON syntax. Omitting them can be done for convenience. Choose the style that you prefer.

documents: [
		{
			"uuids": [
				"78f91fd7-740e-419b-9698-29856b56f4d6",
				"a595f746-d893-4c4c-8a25-fb82bd69f314"
			]
		}
	]

Required Settings

You must specify to which Sophora server the Repo Exporter should connect. You need to specify the URL for connecting and the username with the password for the login.

"sophoraServer": {
		"host": "http://localhost:1196",
		"username": "admin",
		"password": "admin"
	},

You also need to define to which folder files will be written. In that folder subfolders will be created for the different configuration blocks. The keyword is "exportDir":

"exportDir": "/path/to/exportData",

Export Settings

There are three main configuration parts for defining the data to export. Each has its own configuration block with its specific settings.

  1. Complete administration areas
  2. Single items of administrative data
  3. Documents

Administration Areas

Under "adminExport" you specify that a whole area of the administration view should be exported. The special keyword "full" exports all areas, which is the same as using the context menu item "Export all..." in the DeskClient.

Options for "adminExport"

Option | Description
full | The whole administration data is exported. This includes all options below. Users will be exported with password hashes.
fullWithoutPasswords | The same as 'full' but without user passwords.
nodetypes | All document types and their configuration (CND, node type configurations and default configurations).
structure | All sites and their full structure (all structure nodes).
categories | The legacy categories.
allSystemDocuments | All system documents such as select values, paragraph types and configuration documents (all documents with the mix-in sophora-mix:systemDocument).
roles | All roles.
users | All users with their hashed passwords.
usersWithoutPasswords | All users without their passwords.
proposalSections | All proposal sections (no proposals!).
dictionaries | All dictionaries.

Example:

"adminExport": [
		"roles",
		"dictionaries",
		"proposalSections"
	]
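
Combining the required settings with an "adminExport" block already gives a complete, if small, configuration file. The sketch below writes such a file from the shell. The file location follows the folder hierarchy recommended under Installation, CONFIG_DIR is a hypothetical variable, and the server URL, credentials and export path are the placeholder values from the fragments above, not working defaults.

```shell
#!/bin/sh
# Sketch only: CONFIG_DIR is an illustrative name; all values are placeholders.
set -eu

CONFIG_DIR="${CONFIG_DIR:-./repoexporter/config}"
mkdir -p "$CONFIG_DIR"

# The quoted heredoc delimiter keeps the JSON exactly as written (no shell expansion).
cat > "$CONFIG_DIR/sophora-exporter.json" <<'EOF'
{
    "sophoraServer": {
        "host": "http://localhost:1196",
        "username": "admin",
        "password": "admin"
    },
    "exportDir": "/path/to/exportData",
    "adminExport": [
        "roles",
        "dictionaries",
        "proposalSections"
    ]
}
EOF

# The file contains login credentials, so restrict it to the owning user.
chmod 600 "$CONFIG_DIR/sophora-exporter.json"
```

All keywords are quoted here, so the file is also valid for strict JSON tools that do not accept the unquoted style.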

Administrative Data Elements

Use "adminElementExport" to export single items of administrative data. This can be a single node type, specific users or only a part of the structure. So for each option you can define a list of items which should be exported.

Options for "adminElementExport"

Option | Values | Description
nodetypes | List of document type names | The CNDs and node type configurations of the given document types will be exported.
structureNodes | List of UUIDs or paths | The given structure nodes and their substructure will be exported.
exportRelevantIndexDocuments | true or false | For each exported structure node the default document will also be exported (default false).
exportRelevantHierarchyDocuments | true or false | For each exported structure node the hierarchy document will also be exported (default false).
roles | List of UUIDs | The given roles with their configuration.
usersWithPasswords | List of user names | The given users will be exported with all their configuration including the hashed passwords.
usersWithoutPasswords | List of user names | The given users will be exported without their passwords.
proposalSections | List of UUIDs | The given proposal sections will be exported.

Example:

"adminElementExport": {
		"nodetypes": [
			"example-nt:story"
		],
		"structureNodes": [
			"eb55f5da-f4f8-4f18-8965-bfcaf9e9d10a",
			"/test/structure/path"
		],
		"proposalSections": [
			"2b852949-5056-4a09-8b50-54cc00f8bd58"
		]
	}

Documents

In a "documents" block a list of multiple document filters can be specified. Inside a single block multiple keywords can be used to specify which documents are meant to be exported. All these keywords are ANDed together to collect documents.

Options for "documents"

Option | Values | Description
uuids | List of UUIDs | Documents can directly be given by their UUID of the source repository.
externalIds | List of external IDs | Documents can directly be given by their unique external ID across repositories.
xpathQueries | List of XPath queries | A full JCR query which searches for arbitrary documents in the source repository.
solrQueries | List of Solr queries | A Solr query which searches for arbitrary documents in the source repository.
documentUrls | List of document URLs | Documents can be given by their URL from the delivery. Note that the Sophora server must know the deliveries for this feature.
tags | List of tags | All documents with all of the given tags will be exported.
indexDocuments | true or false | Export the default documents of all structure nodes (default false).
hierarchyDocuments | true or false | Export the hierarchy documents of all structure nodes (default false).
exportRelevantStructure | true or false | If true, for all exported documents their corresponding structure node hierarchy is exported (default false).
exportRelevantIndexDocuments | true or false | If true and exportRelevantStructure is also true, for each relevant structure node its default document will also be exported (default false).
exportRelevantHierarchyDocuments | true or false | If true and exportRelevantStructure is also true, for each relevant structure node its hierarchy document will also be exported (default false).
criteria | List of criteria blocks | Specify documents by some criteria (see below).
maxRecursionDepth | integer | If greater than zero, referenced documents will be exported for each level until the given depth is reached. The referenced documents are exported in individual files.
uuidsUrl | URL | A web address which returns a documents configuration block. The defined options beside the URL will be merged with the fetched options.

The options of a "criteria" block are ANDed together, so all given options must match a document. Multiple criteria blocks can be listed; these are ORed.

Options for "criteria"

Option | Values | Description
documentTypes | node type names | Matches all documents with any of the given types
structurePath | structure path | Matches all documents located below the given structure path recursively
minModificationDate | date with optional time | Matches all documents modified since the given date (can also be combined with maxModificationDate)
maxModificationDate | date with optional time | Matches all documents modified until the given date (can also be combined with minModificationDate)
minCreationDate | date with optional time | Matches all documents created since the given date (can also be combined with maxCreationDate)
maxCreationDate | date with optional time | Matches all documents created until the given date (can also be combined with minCreationDate)
modifiedInLastDays | positive integer | Matches all documents modified within the given number of most recent days

Example:

"documents": [
		{
			"uuids": [
				"78f91fd7-740e-419b-9698-29856b56f4d6",
				"a595f746-d893-4c4c-8a25-fb82bd69f314"
			],
			"exportRelevantStructure": true,
			"exportRelevantIndexDocuments": true,
			"exportRelevantHierarchyDocuments": true
		}, {
			"externalIds": [
				"68ad8b0b-dc2e-4a29-a6aa-e22f582dfd6f",
				"sophora.configuration.relevance"
			],
			"xpathQueries": [
				"//element(*, sophora-content-nt:story)[@sophora-content:importend = 'true']"
			]
		}, {
			"solrQueries": [
				"sophora_id_s: le-bon-marche-logo100"
			],
			"criteria": [{
				"modifiedInLastDays": 7
			}]
		}, {
			"documentUrls": [
				"http://www.subshell.com/demosite/trendcities/london/New-Shapes-New-Styles-from-London-City-Report100.html",
				"http://www.subshell.com/demosite/trendcities/index.html"
			]
		}, {
			"criteria": [{
				"documentTypes": ["sophora-demo-nt:textfields", "sophora-demo-nt:datefields"],
				"structurePath": "/demosite/home"
			}]
		}, {
			"criteria": [{
				"minModificationDate": "2014-08-01", "maxModificationDate": "2014-08-02"
			}, {
				"minCreationDate": "2014-06-01T12:34", "maxCreationDate": "2014-06-02T16:54:32.999"
			}]
		}, {
			"tags": [
				"vision",
				"hamburg"
			],
			"uuids": [
				"e6907385-410d-4611-a73b-736928fffe17"
			],
			"maxRecursionDepth": 3
		}
	]

Other Options

The following options can be given to fine-tune the export. They must be specified inside the main configuration block and cannot be nested. They apply to all exported elements.

Option | Values | Description
deltaExport | true or false (Default: false) | Only documents modified since the last export will be exported (again).
daemonMode | true or false | Listens for document modifications and exports documents immediately if they match the options. Can be combined with deltaExport. The execution will continue until you stop the Exporter.
exportDocumentsWithTimestamp | true or false | Appends a timestamp to the export filenames.
propertiesNotToExportInSophoraXml | List of property names | These properties will not be exported to XML. If not defined, a default set will be used.
stringToReferenceProperties | Map of document types to a list of property names | For each document type you can specify string properties which hold a reference to other documents.
groovyExtensionDir | Path to a directory | Groovy scripts in this directory can modify the created XML. They are executed for every document in the created XML files.
maxRecursionDepthPerFile | integer | If greater than zero, referenced documents are exported for each level until the given depth is reached. The referenced documents are included in the XML file of the main document. (Since 1.53.3, 1.54.4, 2.0.2, 2.1.0)
includeLiveVersionInXml | true or false (Default: true) | When true, the live version of the exported document is included in the XML.
xmlVersion | Version | The Sophora XML version to create, like "2.8". This can be used if the exported data is intended to be imported into a repository which runs an older Sophora version. If not set, the newest Sophora XML version will be used.
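
Because these options sit at the top level of the configuration, next to "sophoraServer" and "exportDir", a delta export for a target system running an older Sophora version might be set up as sketched below. The option names come from the table above. The document type "example-nt:story", the property name "example:relatedStory" and the file name held in the hypothetical CONFIG_FILE variable are made-up placeholders, and the value shapes are inferred from the "Values" column rather than confirmed against the exporter.

```shell
#!/bin/sh
# Sketch only: CONFIG_FILE, the document type and the property name are placeholders.
set -eu

CONFIG_FILE="${CONFIG_FILE:-./delta-export.json}"

cat > "$CONFIG_FILE" <<'EOF'
{
    "sophoraServer": {
        "host": "http://localhost:1196",
        "username": "admin",
        "password": "admin"
    },
    "exportDir": "/path/to/exportData",
    "deltaExport": true,
    "exportDocumentsWithTimestamp": true,
    "xmlVersion": "2.8",
    "stringToReferenceProperties": {
        "example-nt:story": ["example:relatedStory"]
    },
    "documents": [
        {
            "criteria": [{
                "documentTypes": ["example-nt:story"],
                "modifiedInLastDays": 7
            }]
        }
    ]
}
EOF
```

The top-level options apply to every exported element, whereas the "criteria" block only narrows down the documents of its own "documents" entry.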

Groovy scripts

(Since: 1.53.3, 1.54.4, 2.0.2, 2.1.0)
Groovy scripts can modify the generated XML. The scripts are executed for each document included in the created XML files. The following packages are available:

  • org.jdom2
  • com.subshell.sophora.api
  • com.subshell.sophora.api.content
  • com.subshell.sophora.api.content.value
  • com.subshell.sophora.api.exceptions
  • com.subshell.sophora.api.nodetype
  • com.subshell.sophora.api.search
  • com.subshell.sophora.api.structure

The following variables are also accessible:

  • sophoraClient: An instance of ISophoraClient
  • config: The exporter configuration
  • document: The INode of the exported document
  • xml: The JDOM element of the created XML

This example script adds the URL of the document to the created XML as an additional property:

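// Element is org.jdom2.Element, from the packages listed above
def live = (document.getBoolean("sophora:isOnline") == true)
def url = sophoraClient.getDocumentUrl(document.getUUID(), live)
// Skip documents without a URL (Groovy treats null and "" as false)
if (url) {
	// Append <property name="ext:sourceurl"><value>URL</value></property> to <properties>
	def ps = xml.getChild("properties", xml.getNamespace())
	def p = new Element("property", xml.getNamespace())
	p.setAttribute("name", "ext:sourceurl")
	ps.addContent(p)
	def v = new Element("value", xml.getNamespace())
	p.addContent(v)
	v.setText(url)
}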

Exporting

To run the Sophora Repo Exporter you need a configuration file as described above. Execute the tool with the following command line in its folder:

# Linux or other UNIX-like OS
./sophora-exporter.sh
# Windows
sophora-exporter.bat

If you want to use a custom name for the configuration file, use the following command line:

# Linux or other UNIX-like OS
./sophora-exporter.sh -Dapp.config=myexporter-config.json
# Windows
sophora-exporter.bat -Dapp.config=myexporter-config.json
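
When several configuration files are kept side by side, the exporter can be started once per file. The sketch below relies only on the -Dapp.config option shown above. The function name run_exports and the EXPORTER variable are hypothetical, and treating a non-zero exit status as a failure is an assumption, since this guide does not document the exporter's exit codes.

```shell
#!/bin/sh
# Sketch only: run_exports and EXPORTER are illustrative names.
EXPORTER="${EXPORTER:-./sophora-exporter.sh}"

# run_exports CONFIG_FILE...
# Starts the exporter once per configuration file, one after the other.
# Returns 1 if any run ended with a non-zero status, otherwise 0.
run_exports() {
    failed=0
    for config_file in "$@"; do
        echo "Exporting with $config_file"
        if ! "$EXPORTER" -Dapp.config="$config_file"; then
            echo "Exporter returned a non-zero status for $config_file" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Usage (commented out so that the sketch itself starts nothing):
# run_exports admin-export.json documents-export.json
```

The runs are sequential on purpose, so that two exports never write into the same export folder at the same time.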

Replacing a Repository

The Sophora Repo Exporter also includes a script to replace a complete repository (reset_repository.sh). This script will do the following steps:

  • stop a local importer
  • stop a local server
  • back up the current repository into a timestamped backup folder
  • delete the cache and the log folder of the local server
  • clear the incoming/failure/success/temp folders of the importer
  • restart the local server and importer
  • start the Sophora Repo Exporter which will export a (remote) repository specified by its configuration (provided by the user)
  • copy all exported data into the watchfolder of the importer

A configuration to export everything of a repository may look like this:

{
	"sophoraServer": {
		"host": "remote.host.com:1199",
		"username": "admin",
		"password": "admin"
	},
	"deltaExport": true,
	"exportDir": "exportData",
	"adminExport": [
		"full"
	],
	"documents": [
		{
			"xpathQueries": [
				"element(*, sophora-mix:document)"
			]
		}
	]
}

Note that you may have to adapt the folder names in the script if they differ from the default names.

The configuration of the importer has to include the following entry:

sophora.importer.watchfolder.includeSubfolder=true

This is necessary because the exporter creates subfolders which contain the actual exported XML files.
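
Before running the reset script, it can be worth checking that the importer configuration really contains this entry. The following is a minimal sketch. The function name has_subfolder_option is hypothetical, and the path in the usage line is a placeholder because this guide does not name the importer's configuration file.

```shell
#!/bin/sh
# Sketch only: has_subfolder_option is an illustrative name.

# has_subfolder_option FILE
# Succeeds if FILE contains an active (not commented out) line that sets
# sophora.importer.watchfolder.includeSubfolder to true.
has_subfolder_option() {
    grep -Eq '^[[:space:]]*sophora\.importer\.watchfolder\.includeSubfolder[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}

# Usage (placeholder path, commented out):
# has_subfolder_option /path/to/importer.properties \
#     || echo "includeSubfolder is not enabled in the importer configuration" >&2
```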