Importer: Composition of the Import XML

How to import documents using Sophora's import XML.

The general structure of a Sophora import XML file looks like the following:

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="NODETYPE"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <childNode nodeType="NODETYPE" name="CHILDNODE-NAME">
      [...]
    </childNode>
  </childNodes>
  <resourceList>
    <document nodeType="NODETYPE" externalID="EXTERNAL-ID">
      [...]
    </document>
  </resourceList>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

The root element is <document>. Its attribute nodeType specifies the node type to which this document is assigned. This attribute is required for new documents and for all child nodes. A document comprises its properties (<properties> element), its childnodes (<childNodes> element) and assigned documents (<resourceList> element). Furthermore, the <fields> element contains the document's metadata.

Normally the version of the Sophora XML is given as namespace in the form <major>.<minor>. For testing, it may be necessary to declare the schema location explicitly so that your development environment can validate the XML. In that case, declare the full file name, including the bugfix version and the file extension, in the root element, as in the following example:
<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content:story"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.sophoracms.com/import/2.8 http://www.sophoracms.com/import/2.8/sophora-import-2.8.0.xsd">

Properties <properties>

Properties represent the primary characteristics of a document and are given in individual <property> elements. Which properties are available for which kind of document is defined in the node type configuration within Sophora (see the node type configuration documentation for details).

The subsequent example is the node type configuration of an image object:

<'sophora-content-nt'='http://www.subshell.com/sophora-content-nt/1.0'>
<'sophora-extension-nt'='http://www.subshell.com/sophora-extension-nt/1.0'>
<'sophora-content'='http://www.subshell.com/sophora-content/1.0'>
 
['sophora-content-nt:imageobject'] > 'sophora-extension-nt:image'
  orderable
  - 'sophora-content:tags' (string)
  - 'sophora-content:title' (string)
  - 'sophora-content:chargeable' (boolean)
  - 'sophora-content:credit' (string)
  - 'sophora-content:source' (string)
  - 'sophora-content:url' (string)
  - 'sophora-content:displayStyle' (string)

The import XML of a corresponding document may look like the following:

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content-nt:imageobject"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    <property name="sophora-content:title">
      <value>Die Menschen in Peking strömen in den Olympiapark.</value>
    </property>
    <property name="sophora-extension:alttext">
      <value>Menschenmenge in Peking</value>
    </property>
    <property name="sophora-content:chargeable">
      <value>true</value>
    </property>
  </properties>
  <childNodes>
    <childNode nodeType="sophora-extension-nt:imagedata" name="sophora-extension:imagedata">
      [...]
    </childNode>
  </childNodes>
  <resourceList>
  </resourceList>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

Inheritance

The property sophora-extension:alttext (amongst others) has not been defined in the exemplary node type configuration above. Nonetheless, this property may be used here because this document type extends the sophora-extension-nt:image node type and thereby inherits its properties.

This is the configuration of the super node type sophora-extension-nt:image:

<'sophora-extension-nt'='http://www.subshell.com/sophora-extension-nt/1.0'>
<'sophora-extension'='http://www.subshell.com/sophora-extension/1.0'>
 
['sophora-extension-nt:image']
  orderable
  - 'sophora-extension:caption' (string)
  - 'sophora-extension:alttext' (string)
  - 'sophora-extension:iptc' (string)
  + 'sophora-extension:imagedata' ('sophora-extension-nt:imagedata') multiple

If a property of an existing document is protected in the repository, you have to unprotect the property explicitly - see Protecting and unprotecting Properties.

XHTML Tags in Property Values

Certain properties within Sophora may contain XHTML tags for special text formatting (e.g. copytext or teaser). Permitted elements are: <ul>, <li>, <strong>, <em> and <br/>. Therefore, <value> elements may incorporate such HTML tags. For example:

[...]
  <properties>
    <property name="sophora-content:shorttext">
      <value>Die Menschen in Peking <strong>strömen</strong> in den Olympiapark.<br />In großen Mengen!</value>
    </property>
    [...]
  </properties>
[...]

Date Values

By default, dates are specified in the ISO 8601 format:

[...]
  <properties>
    <property name="sophora-content:date">
      <value>2008-08-05T09:00:00+02:00</value>
    </property>
    [...]
  </properties>
[...]

If the Importer cannot parse a date string with the ISO 8601 pattern, it tries a "pseudo" ISO 8601 format in which a space is used instead of the character 'T'. If this also fails, the Importer tries to parse the date string using the format "dd.MM.yyyy HH:mm", where the "HH:mm" part is optional. This makes it possible to provide dates in the following way as well:

[...]
  <properties>
    <property name="sophora-content:date">
      <value>05.08.2008 09:00</value>
    </property>
    [...]
  </properties>
[...]
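
The fallback order described above can be illustrated with a small Python sketch. It is not the Importer's implementation: the function name parse_import_date and the strptime patterns are assumptions chosen to approximate the three formats named in the text.

```python
from datetime import datetime

# Patterns tried in order, approximating the fallback chain described above.
# These strptime patterns are assumptions of this sketch, not the Importer's own.
_FORMATS = (
    "%Y-%m-%dT%H:%M:%S%z",  # ISO 8601 with UTC offset
    "%Y-%m-%d %H:%M:%S%z",  # "pseudo" ISO 8601: space instead of 'T'
    "%d.%m.%Y %H:%M",       # dd.MM.yyyy HH:mm
    "%d.%m.%Y",             # dd.MM.yyyy (time part omitted)
)


def parse_import_date(text):
    """Return a datetime for the first pattern that matches, else raise ValueError."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported date value: {text!r}")
```

A helper like this can be used to check date strings before writing them into an import file; a value that no pattern accepts raises ValueError.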

Binary Properties

Binary properties must be provided with the additional attribute mimetype, which tells the importer to import the referenced binary file with the given MIME type.

The following example shows a typical image import. In the imagedata childnode you can see the binary property sophora-extension:binarydata with its attribute mimetype (set to "image/jpeg"):

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content:image" xmlns="http://www.sophoracms.com/import/2.8">
  <properties>
     [...]
  </properties>
  <childNodes>
    <childNode nodeType="sophora-extension-nt:imagedata" name="sophora-extension:imagedata">
      <properties>
        <property name="sophora-extension:imagetype">
          <value>original</value>
        </property>
        <!-- The binary property with its attribute "mimetype" relates to a binary file in the filesystem. -->
        <property name="sophora-extension:binarydata" mimetype="image/jpeg">
          <value>image_4711_binary_1.jpeg</value>
        </property>
      </properties>
      <childNodes />
      <resourceList />
    </childNode>
  </childNodes>
  <resourceList />
  <fields>
     [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

The referenced image file image_4711_binary_1.jpeg must be located in the same directory as the import XML file. Alternatively, you can specify a relative path such as 'images/import/image_4711_binary_1.jpeg'. For security reasons, a relative path cannot reference a file higher up in the folder hierarchy. (This is only possible if you have explicitly configured that folder as accessible via the property 'sophora.importer.fileaccess.basedir' - see Properties in the instance configuration file(s) 'sophora-importer_instance-NNN.properties'.)

Additionally you may opt to reference binary data via URLs:

<!-- HTTP URL -->
<property name="sophora-extension:binarydata" mimetype="image/jpeg">
  <value>http://www.example.com/my-picture.jpg</value>
</property>
 
<!-- HTTPS URL -->
<property name="sophora-extension:binarydata" mimetype="image/jpeg">
  <value>https://www.example.com/my-picture.jpg</value>
</property>
 
<!-- File URL -->
<property name="sophora-extension:binarydata" mimetype="image/jpeg">
  <value>file:C:/temp/image_4711_binary_1.jpeg</value>
</property>
 
<!-- inline data -->
<property name="sophora-extension:binarydata" mimetype="image/gif">
  <value>data:;base64,R0lGODlhEAAQAMQAAP797/332f732f322f322vztufzkmfvjmfvkmf3tufvdg/zjmbuCFb2EFrl/
FLJ4E7V7FLl/FbuBFruCFqtwEapwEahuEa5zEqtwErJ3E6pxEqZrEKhtEf///wAAAAAAACH5BAEA
AB0ALAAAAAAQABAAAAVVYCeOZGmeKNk0adkAbCu+sDndzC0B/ERGvKCQ5xhBBoMAIRlACgQDiChT
KCQSVqy1+hBdDoeFYXFAGMaGCwlDoWAqGopijpFx5hZZZ6PY6Pd+f4ItIQA7</value>
</property>

If you use a file URL - for example 'file:C:/temp/image_4711_binary_1.jpeg' (on a Windows operating system) - you can only point to binary files located in the same directory as the import XML file (or in its subfolders), or in an additional folder (or its subfolders) made accessible by configuring the property 'sophora.importer.fileaccess.basedir' (see Properties in the instance configuration file(s) 'sophora-importer_instance-NNN.properties').
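
The access rule for file references can be modelled as a simple containment check. The following Python sketch is only an illustration of the rule as described here; the function name is_accessible and its parameters are hypothetical, and the Importer's actual check may differ in detail.

```python
import os


def is_accessible(import_dir, candidate, extra_basedir=None):
    """Return True if candidate lies in import_dir or extra_basedir (or a subfolder).

    import_dir    -- directory containing the import XML file
    candidate     -- relative or absolute path of the referenced binary file
    extra_basedir -- stands in for the folder configured via
                     sophora.importer.fileaccess.basedir (assumption of this sketch)
    """
    target = os.path.realpath(os.path.join(import_dir, candidate))
    roots = [import_dir] + ([extra_basedir] if extra_basedir else [])
    for root in roots:
        root = os.path.realpath(root)
        try:
            if os.path.commonpath([root, target]) == root:
                return True
        except ValueError:
            # Raised e.g. for paths on different Windows drives: not contained.
            continue
    return False
```

With this model, 'images/a.jpeg' is accepted, while '../secret/a.jpeg' is rejected unless the resolved location falls inside the additionally configured folder.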

Inline binary data uses the "data" URI scheme. Please note that contrary to the specification of the scheme, only the following form is supported by the importer:

data:;base64,...your data here...
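
A value in this form can be produced from raw bytes with any Base64 encoder. As a minimal sketch in Python (the function name to_inline_value is hypothetical):

```python
import base64


def to_inline_value(data: bytes) -> str:
    """Build the 'data:;base64,...' form described above from raw binary data."""
    return "data:;base64," + base64.b64encode(data).decode("ascii")
```

The result is placed as the text of the <value> element; the MIME type is still given via the mimetype attribute of the property.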

Automatic Downscaling of Oversized Images on Import

To have oversized images scaled down automatically on import, set the attribute autoScale="true" on the binary property of the image. Any image that exceeds the dimensions configured for the original image variant is then scaled down. The proportions of the image are maintained.

The optional attribute autoScale is only valid on binary properties that contain image files.

Example:

[...]
        <property name="sophora-extension:binarydata" mimetype="image/jpeg" autoScale="true">
          <value>image_4711_binary_1.jpeg</value>
        </property>
    [...]

Additional Information at Reference Property Values

If you export Sophora XML via the Sophora Deskclient or programmatically via the Sophora Client, you may notice the following attributes when exporting documents with references to other documents:

[...]
        <property name="sophora:reference">
          <value site="demosite" structureNode="/multimedia/images" sophoraId="image142" uuid="dd74be59-c921-4d88-aa2c-7094b6dd1384">image_4711</value>
        </property>
    [...]

The optional attributes "site", "structureNode", "sophoraId" and "uuid" contain meta information about the referenced document. This metadata is exported if a Sophora version greater than or equal to 2.1 is selected for the export. The attributes are only set on reference properties.

The attributes site and structureNode are only used to reference structure node documents. In this case the importer first uses the structure node specified by the site and structureNode attributes. Only if this structure node does not exist is the specified external ID used for the reference. The attributes sophoraId and uuid have no effect on the import. Please note that the text content of the <value> element contains the external ID of the referenced document.

Multi Values

Multi values can be provided easily by adding multiple <value> elements to the <property> element:

[...]
  <properties>
    <property name="MULTIVALUE_PROPERTY_NAME">
      <value>Value 1</value>
      <value>Value 2</value>
      <value>Value 3</value>
    </property>
    [...]
  </properties>
[...]
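
If the import XML is generated by a script, a multi-value property is simply a <property> element with one <value> child per entry. The following Python sketch uses the standard library's ElementTree; the helper name make_property is hypothetical, and namespace handling is left out because in a complete file the element inherits the default namespace from the root <document> element.

```python
import xml.etree.ElementTree as ET


def make_property(name, values):
    """Create a <property> element with one <value> child per entry in values."""
    prop = ET.Element("property", {"name": name})
    for value in values:
        ET.SubElement(prop, "value").text = value
    return prop


xml_text = ET.tostring(
    make_property("MULTIVALUE_PROPERTY_NAME", ["Value 1", "Value 2"]),
    encoding="unicode",
)
```

A single-value property is the same call with a one-element list.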

Removing Properties

When updating an existing document via the Importer, you can remove properties from this document. To do so, set the attribute "remove" of a property to true. If the import creates a new document in the repository, such a property is simply ignored.

The example below displays the removal of property sophora-content:name under the assumption that the underlying document already exists:

[...]
  <properties>
    <property name="sophora-content:name" remove="true" />
    [...]
  </properties>
[...]

In such a case, the value of the property will be ignored.

The removal of properties works in the same way for childnode properties of the document and also when merging document structure over several levels.

Protecting and unprotecting Properties

It is possible to protect and unprotect properties via the optional attribute protectionInstruction (refer also to the user and the administration manual). The possible values for protectionInstruction are:

Value | Description
--- | ---
none | If the corresponding property in the repository is protected, its value will not be changed to the value in the XML. Omitting the attribute protectionInstruction has the same meaning as using the value none!
unprotect | If the corresponding property in the repository is protected, the protection will be removed and then the value from the XML will be set. After setting the value, the protection will not be re-established.
protect | If the corresponding property in the repository is not protected (or if the document is newly created), the value from the XML will be set and after setting the value, the property will be protected. If the corresponding property in the repository is protected, the value from the XML will be ignored!
forceProtect | If the corresponding property in the repository is protected, the protection will be forced to be removed and then the value from the XML will be set. After setting the value, the property will be protected regardless of whether the property was protected before or not.
reprotect | If the corresponding property in the repository is protected, the protection will be removed and then the value from the XML will be set. After setting the value, the property will only be protected if the property was protected before. (Therefore the property will never be protected if the document is newly created.)

It is possible to use the attribute protectionInstruction without providing a value to the property. In this way you can modify the protection of  a property without changing its value.
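
The effect of the individual protectionInstruction values on a property that carries a value can be summarised as a small decision function. This Python sketch only restates the descriptions above; the function name apply_protection and its return shape are assumptions for illustration, not part of the Importer.

```python
def apply_protection(instruction, was_protected):
    """Return (value_written, protected_after) for a property with a value.

    was_protected is False for properties of newly created documents.
    """
    if instruction == "none":
        # A protected property keeps its old value and stays protected.
        return (not was_protected, was_protected)
    if instruction == "unprotect":
        return (True, False)
    if instruction == "protect":
        # The value is ignored if the property is already protected.
        return (not was_protected, True)
    if instruction == "forceProtect":
        return (True, True)
    if instruction == "reprotect":
        return (True, was_protected)
    raise ValueError(f"unknown protectionInstruction: {instruction!r}")
```

For example, apply_protection("protect", was_protected=True) yields (False, True): the XML value is ignored and the property remains protected.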

The following XML snippet shows different protection use cases:

[...]
  <properties>
    <!-- Unprotect property (if it is protected) and setting value to "Hallo Welt!". -->
    <property name="sophora-content:headline1" protectionInstruction="unprotect">
      <value>Hallo Welt!</value>
    </property>
    <!-- Setting value to "Hallo Welt!" and afterwards protecting the property. (If the property is
         protected in the repository, its value will not be changed!) -->
    <property name="sophora-content:headline2" protectionInstruction="protect">
      <value>Hallo Welt!</value>
    </property>
    <!-- Unprotect property (if it is protected), setting value to "Hallo Welt!" and
         afterwards protecting the property.  -->
    <property name="sophora-content:headline3" protectionInstruction="forceProtect">
      <value>Hallo Welt!</value>
    </property>
    <!-- Unprotect property (if it is protected), setting value to "Hallo Welt!" and
         afterwards protecting the property only if it was protected before.  -->
    <property name="sophora-content:headline3" protectionInstruction="reprotect">
      <value>Hallo Welt!</value>
    </property>
    <!-- Protect property if it is not already protected. The value of the property will not be modified. -->
    <property name="sophora-content:headline4" protectionInstruction="protect" />
    <!-- Remove property value and then protect property. (The property will only be removed, if it
         is not already protected!) -->
    <property name="sophora-content:headline5" protectionInstruction="protect" remove="true" />
    [...]
  </properties>
[...]

You can only protect properties of nodetypes which implement the mixin sophora-mix:protectable.

The protection of properties is only possible on document level - box properties or other childnode properties cannot be protected or unprotected! (If you use the attribute protectionInstruction in these cases, it will be ignored.)

Childnodes <childNodes>

The element <childNodes> comprises content that has not been modeled as a separate document and therefore depends on the parent document for its existence.

An image object, for example, comprises the corresponding image data (childnode of type sophora-extension-nt:imagedata).

In addition to the node type, the childnode's name has to be provided (the name attribute of a <childNode> element), since it is possible to add childnodes that have the same type but different names:

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content-nt:imageobject"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <childNode nodeType="sophora-extension-nt:imagedata" name="sophora-extension:imagedata">
      <properties>
        <property name="sophora-extension:imagetype">
          <value>original</value>
        </property>
        <property name="sophora-extension:binarydata">
          <value>olympiapeking_menschenmenge.jpg</value>
        </property>
        <property name="sophora:mimetype">
          <value>image/jpeg</value>
        </property>
      </properties>
      <childNodes/>
      <resourceList/>
    </childNode>
  </childNodes>
  <resourceList>
  </resourceList>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

Resource List <resourceList>

Within the resource list (<resourceList> element) arbitrary documents from the Sophora repository may be referenced:

[...]
  <resourceList>
    <document nodeType="NODETYPE">
      [...]
    </document>
    <document nodeType="NODETYPE">
      [...]
    </document>
  </resourceList>
  [...]

Placing documents in a <resourceList> element is useful if you want to import a document which is directly related to one or more other document(s). As an example consider an image gallery (a document itself!) which is to be created together with all the images that belong to the image gallery - in this case the image documents could be placed in the <resourceList> element of the image gallery document.

Due to the nested structure of the import XML, documents that are nested more deeply in the XML are imported first. The next section therefore deals with the linking of different documents in more detail.

Linking of Different Documents Within Sophora XML

To link different documents within the Sophora XML you can use the optional attribute externalID of the <document> element. When this external ID is used as the value of a reference property somewhere else in the XML document, the Importer automatically links the two documents.

In the following example an image gallery containing a reference to an image object is created:

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content-nt:imagegallery"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <childNode nodeType="sophora-content-nt:imageref" name="sophora-content:image">
      <properties>
        <property name="sophora:reference">
          <value>refImage01</value>
        </property>
      </properties>
      <childNodes/>
      <resourceList>
        <document nodeType="sophora-content-nt:imageobject" externalID="refImage01">
          <properties>
            [...]
          </properties>
          <childNodes>
            <childNode nodeType="sophora-extension-nt:imagedata" name="sophora-extension:imagedata">
              <properties>
                [...]
              </properties>
              <childNodes/>
              <resourceList/>
            </childNode>
          </childNodes>
          <resourceList/>
          <fields>
            [...]
          </fields>
          <instructions>
            [...]
          </instructions>
        </document>
      </resourceList>
    </childNode>
  </childNodes>
  <resourceList/>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

Using this mechanism you can even import documents that reference each other. This might be the case when two news documents reference each other as a teaser.
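
Before handing a file to the Importer, it can be useful to see which reference values are not declared as externalID inside the same file, since those must then refer to documents that already exist in the repository. The following Python sketch is a hypothetical pre-check, not part of Sophora; it assumes the 2.8 namespace and reference properties named sophora:reference as in the examples, and the sample XML is reduced to the elements the check needs.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.sophoracms.com/import/2.8}"


def unresolved_references(xml_text):
    """Return reference values that no <document externalID="..."> in the file declares."""
    root = ET.fromstring(xml_text)
    declared = {
        doc.get("externalID")
        for doc in root.iter(NS + "document")
        if doc.get("externalID")
    }
    referenced = set()
    for prop in root.iter(NS + "property"):
        if prop.get("name") == "sophora:reference":
            referenced.update(
                value.text.strip() for value in prop.findall(NS + "value") if value.text
            )
    return referenced - declared


sample = """<document nodeType="sophora-content-nt:imagegallery"
          xmlns="http://www.sophoracms.com/import/2.8">
  <childNodes>
    <childNode nodeType="sophora-content-nt:imageref" name="sophora-content:image">
      <properties>
        <property name="sophora:reference"><value>refImage01</value></property>
      </properties>
      <resourceList>
        <document nodeType="sophora-content-nt:imageobject" externalID="refImage01"/>
      </resourceList>
    </childNode>
    <childNode nodeType="sophora-content-nt:imageref" name="sophora-content:image">
      <properties>
        <property name="sophora:reference"><value>image4711</value></property>
      </properties>
    </childNode>
  </childNodes>
</document>"""

missing = unresolved_references(sample)
```

Here missing contains only image4711, because refImage01 is declared in the resource list of the first childnode.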

Referencing Existing Documents

Referencing existing documents by external ID

When importing documents you can also connect them with documents that already exist in the repository.

If you want to update an existing document, simply specify the document's external ID as an attribute (externalID) of the main <document> element. Note that the nodeType attribute is not required when updating existing documents.

In the following example an existing document with the external ID "image4711" will be updated by the import:

<?xml version="1.0" encoding="UTF-8"?>
<document externalID="image4711"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    [...]
  </childNodes>
  <resourceList />
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

If you wish to reference an existing document from the repository as a childnode, provide its external ID in the corresponding reference property of the childnode. To include an existing image (external ID "image4711") in a newly created image gallery, the XML should contain the following snippet:

<?xml version="1.0" encoding="UTF-8"?>
<document xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <childNode nodeType="sophora-content-nt:imageref" name="sophora-content:image">
      <properties>
        <property name="sophora:reference">
          <value>image4711</value>
        </property>
      </properties>
      <childNodes/>
      <resourceList />
    </childNode>
  </childNodes>
  <resourceList/>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

Conditional Import: Importing a document only if it exists in the repository already

Sometimes you may want to import a document only if it exists in the repository already - otherwise the import of the document (and all dependent content in the resource list of the document) should be skipped.

To achieve this goal you can use the optional attribute importOnlyIfDocumentExists (default value: false) which is a direct attribute of the element <document>.

In the following example the documents with the external ids story4711 and story4711-image are only imported, if the external id story4711 exists in the repository already. However, the document with the external id story4811 is imported because it is not enclosed in the resource list of document story4711 but placed on the same XML level (underneath the element <documents>) as document story4711.

<?xml version="1.0" encoding="UTF-8"?>
<documents xmlns="http://www.sophoracms.com/import/2.8"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <document externalID="story4711"
            importOnlyIfDocumentExists="true">
    <properties>
      [...]
    </properties>
    <childNodes>
      [...]
    </childNodes>
    <resourceList>
      <document nodeType="sophora-content-nt:imageobject"
                externalID="story4711-image">
        [...]
      </document>
    </resourceList>
    <fields>
      [...]
    </fields>
    <instructions>
      [...]
    </instructions>
  </document>
  <document nodeType="sophora-content-nt:story"
            externalID="story4811">
    [...]
  </document>
</documents>

The field <forceCreate>true</forceCreate> has no effect if you use it together with importOnlyIfDocumentExists="true" and the document does not exist: in this case the document is not created (importOnlyIfDocumentExists "wins" over forceCreate).

Setting importOnlyIfDocumentExists="true" without providing the attribute externalID makes little sense. In this case the import of the document is skipped as well.

Referencing Existing Documents by xPath Expression

In some situations you may need a more flexible mechanism to reference existing documents than just identifying them by external ID. For this purpose you can declare arbitrary xPath expressions, which you must provide with a unique 'idString'. The importer resolves the xPath expressions and replaces every occurrence of the 'idString' with the external ID(s) of the xPath expression result(s).

An xPath expression is declared in the special element <documentIdentificationExpression>, which is a direct child of the element <documentIdentificationExpressions>. The latter is the optional first element of the element <documents>.

The first example shows the update of a document with the sophora id 'broadcastimage142':

<?xml version="1.0" encoding="UTF-8"?>
<documents xmlns="http://www.sophoracms.com/import/2.8" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <documentIdentificationExpressions>
    <documentIdentificationExpression idString="$externalId1$">element(*, sophora-content-nt:imageobject)[@sophora:id = 'broadcastimage142']</documentIdentificationExpression>
  </documentIdentificationExpressions>
  <document externalID="$externalId1$">
    <properties>
      [...]
    </properties>
    <childNodes>
      [...]
    </childNodes>
    <resourceList />
    <fields>
      [...]
    </fields>
    <instructions>
      [...]
    </instructions>
  </document>
</documents>

If you wish to reference an existing document from the repository as a childnode or as a reference property, provide the 'idString' of the xPath expression element in the corresponding reference property. To include an existing image (with a sophora id of 'broadcastimage142') in a newly created image gallery, the XML should contain the following snippet:

<?xml version="1.0" encoding="UTF-8"?>
<documents xmlns="http://www.sophoracms.com/import/2.8" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <documentIdentificationExpressions>
    <documentIdentificationExpression idString="$externalId1$">element(*, sophora-content-nt:imageobject)[@sophora:id = 'broadcastimage142']</documentIdentificationExpression>
  </documentIdentificationExpressions>
  <document>
    <properties>
      [...]
    </properties>
    <childNodes>
      <childNode nodeType="sophora-content-nt:imageref" name="sophora-content:image">
        <properties>
          <property name="sophora:reference">
            <value>$externalId1$</value>
          </property>
        </properties>
        <childNodes/>
        <resourceList />
      </childNode>
    </childNodes>
    <resourceList/>
    <fields>
      [...]
    </fields>
    <instructions>
      [...]
    </instructions>
  </document>
</documents>

The last snippet shows a more complex example: Every story document (i.e. the node has the type 'sophora-content-nt:story') with the headline 'Test' (see first element <documentIdentificationExpression>) is updated by setting its headline to the value 'A real headline'. Additionally, a 'teasersInTeaser' childnode is set on each of these story documents, pointing to the story with the sophora id 'teaserinteaser100' (see second element <documentIdentificationExpression>):

<?xml version="1.0" encoding="UTF-8"?>
<documents xmlns="http://www.sophoracms.com/import/2.8">
  <documentIdentificationExpressions>
    <documentIdentificationExpression idString="$externalId1$" maxNumberOfResults="unbounded">element(*, sophora-content-nt:story)[@sophora-content:headline = 'Test']</documentIdentificationExpression>
    <documentIdentificationExpression idString="$externalId2$">element(*, sophora-content-nt:story)[@sophora:id = 'teaserinteaser100']</documentIdentificationExpression>
  </documentIdentificationExpressions>
  <document externalID="$externalId1$">
    <properties>
      <property name="sophora-content:headline">
        <value>A real headline</value>
      </property>
    </properties>
    <childNodes>
      <childNode nodeType="sophora-content-nt:teaserRef" name="sophora-content:teasersInTeaser">
        <properties>
          <property name="sophora:reference">
            <value>$externalId2$</value>
          </property>
        </properties>
        <childNodes />
        <resourceList />
      </childNode>
    </childNodes>
    <resourceList />
    <fields>
      [...]
    </fields>
    <instructions>
      [...]
    </instructions>
  </document>
</documents>

The following table explains the element <documentIdentificationExpression> in detail. The attribute idString is mandatory; the other attributes are optional:

Content | Description
--- | ---
(text content of the element) | The text content of the element <documentIdentificationExpression> defines the JCR xPath query expression which is used to search the repository. The returned nodes must be Sophora documents. A very useful tool for testing JCR xPath query expressions against a repository is Toromiro.
idString (attribute) | The mandatory attribute 'idString' defines the string which is used as placeholder in the XML document and is replaced with the external ID(s) of the xPath expression result(s).
minNumberOfResults (attribute) | The optional attribute 'minNumberOfResults' (default: 1) specifies the minimum number of results expected from the xPath expression. If fewer results are found, the import aborts with an error message.
maxNumberOfResults (attribute) | The optional attribute 'maxNumberOfResults' (default: 1) specifies the maximum number of results expected from the xPath expression. If more results are found, the import aborts with an error message. If set to the value 'unbounded', no restriction is made.
numberOfResultsToProcess (attribute) | The optional attribute 'numberOfResultsToProcess' (default: 'unbounded') specifies how many of the found results are used to replace the 'idString' in the XML document. If this attribute is not set, all results are used. In some situations you might want to know how many documents are affected by your xPath expression before starting the update process - for this purpose you can set 'numberOfResultsToProcess' to '0'. The importer's log file then shows how many search results your xPath expression returns, without any documents being affected.
createIfNoDocumentFound (attribute) | The optional attribute 'createIfNoDocumentFound' (default: 'false') specifies whether a document should be newly created if the xPath expression returns no result. Notice: It only makes sense to set this attribute to 'true' if the attribute 'minNumberOfResults' has the value '0'.
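
How the result-count attributes interact can be sketched as follows. This Python function is an illustrative model of the descriptions above, not the Importer's code: the name select_results is hypothetical, 'createIfNoDocumentFound' is not modelled, and taking the first N results for 'numberOfResultsToProcess' is an assumption, since the text does not say which results are chosen.

```python
def select_results(external_ids, min_results=1, max_results=1,
                   number_to_process="unbounded"):
    """Validate the number of query results and return the IDs to process.

    external_ids      -- external IDs returned by the xPath expression
    min_results       -- models minNumberOfResults
    max_results       -- models maxNumberOfResults (int or "unbounded")
    number_to_process -- models numberOfResultsToProcess (int or "unbounded")
    """
    count = len(external_ids)
    if count < min_results:
        raise ValueError(f"expected at least {min_results} result(s), found {count}")
    if max_results != "unbounded" and count > max_results:
        raise ValueError(f"expected at most {max_results} result(s), found {count}")
    if number_to_process == "unbounded":
        return list(external_ids)
    # Assumption of this sketch: the first N results are the ones processed.
    return list(external_ids[:number_to_process])
```

Calling select_results(ids, max_results="unbounded", number_to_process=0) corresponds to the dry run described in the table: the counts are checked, but no document is selected for update.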

Protecting and unprotecting Documents

Via the optional attribute protectionInstruction on the element <document>, you can protect and unprotect a whole document in the same way as properties. The possible values for protectionInstruction are:

Value | Description
--- | ---
none | If a document with the given external id already exists in the repository and if this document is protected, the import of the document cannot be executed and an import error is thrown. Omitting the attribute protectionInstruction has the same meaning as using the value none!
unprotect | If a document with the given external id already exists in the repository and if this document is protected, the protection will be removed and then the document's XML will be imported. After the import of the document, the protection will not be (re-)established.
protect | If a document with the given external id already exists in the repository and if this document is not protected (or if the document is newly created), the document's XML will be imported and then the document will be protected. If a document with the given external id already exists in the repository and if this document is protected, the import of the document cannot be executed and an import error is thrown.
forceProtect | The document's XML will be imported in any case and then the document will be protected.
reprotect | The document's XML will be imported in any case. If a document with the given external id had already existed in the repository and if this document was protected, the protection will be re-established. In any other case (document was not protected in the repository / document was newly created) the document will not be protected.

The following XML snippet forces the imported document to be protected (regardless of whether it was protected before):

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content:story" protectionInstruction="forceProtect" xmlns="http://www.sophoracms.com/import/2.8">
  <properties>
     [...]
  </properties>
  <childNodes>
     [...]
  </childNodes>
  <resourceList />
  <fields>
     [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>
You can only protect documents of nodetypes which implement the mixin sophora-mix:protectable.
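
As a further illustration of the values listed in the table above, the following minimal sketch uses reprotect: the XML is imported in any case, and the protection is re-established only if the existing document was protected before. The external ID "story_0001" is a hypothetical value chosen for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical external ID; "reprotect" restores a protection that existed before the import -->
<document nodeType="sophora-content:story"
          externalID="story_0001"
          protectionInstruction="reprotect"
          xmlns="http://www.sophoracms.com/import/2.8">
  <properties>
     [...]
  </properties>
  <childNodes>
     [...]
  </childNodes>
  <resourceList />
  <fields>
     [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>
```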

Managing the Update Behaviour of Documents

When updating an existing document (e.g. by using its external ID; see previous section) the default behaviour is as follows:

  • Properties within the repository are merged with those from the import XML, with the imported properties taking precedence. That means properties from the import XML override properties in the repository if they exist in both the XML and the repository. Properties that are not mentioned in the XML remain untouched. New properties (which do not exist in the repository yet) are added.
  • If a document in the XML file has one or more childnodes of a certain name (attribute "name"), all childnodes of that name in the corresponding repository document are removed and replaced by the childnodes from the XML.
  • If childnodes of a certain name exist in the repository document but not in the corresponding XML document (i.e. no childnode with this name exists in the XML document), the existing repository childnodes remain untouched.
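
The default behaviour described above can be sketched as follows. The names are borrowed from the podcast examples in this chapter purely for illustration; that the podcast document itself carries a "sophora-content:headline" property is an assumption. Since no <updateBehaviour> is given, all existing childnodes named "sophora-content:audiolist" would be replaced by the single childnode from the XML, while childnodes with other names and properties not mentioned here stay untouched.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<document externalID="podcast_mittags_vorgelesen"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    <!-- Assumed property: overrides the value in the repository; unlisted properties remain untouched -->
    <property name="sophora-content:headline">
      <value>New headline</value>
    </property>
  </properties>
  <childNodes>
    <!-- No <updateBehaviour>: existing "sophora-content:audiolist" childnodes are removed and replaced by this one -->
    <childNode nodeType="sophora-content-nt:audioref" name="sophora-content:audiolist">
      <properties>
        <property name="sophora:reference">
          <value>podcast_audio_42</value>
        </property>
      </properties>
      <childNodes />
      <resourceList />
    </childNode>
  </childNodes>
  <resourceList />
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>
```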

Explicitly Removing Existing Childnodes

If you wish to remove childnodes with a certain name from an existing document, insert an <updateBehaviour> element into the <childNodes> element. Inside it, add a <childNode> element and specify the attribute "name" as well as the attribute "behaviour", which must be set to the value "delete": behaviour="delete".

In the subsequent example, all audios and videos (as childnodes) should be removed from a podcast document (external ID is "podcast_mittags_vorgelesen"):

<?xml version="1.0" encoding="UTF-8"?>
<document externalID="podcast_mittags_vorgelesen"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <updateBehaviour>
      <childNode name="sophora-content:audiolist" behaviour="delete" />
      <childNode name="sophora-content:videolist" behaviour="delete" />
    </updateBehaviour>
  </childNodes>
  <resourceList/>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

Merging Existing and New Childnodes

If you want to merge childnodes of an existing document with childnodes from an import XML, use the <updateBehaviour> element within the <childNodes> element to define this behaviour. To do so, add a <childNode> element and set its "behaviour" attribute to "merge". To define which childnodes should be merged, set the mandatory "name" attribute of the <childNode> element to the name of the childnodes to be merged.

How does the Importer determine which childnode from the XML belongs to which childnode in the repository?

  • If a childnode id is given in the XML (property sophora:childNodeId), identical childnodes in the repository are identified via this childnode id and merged.
  • If no childnode id is given in the XML, identical childnodes in the repository are identified by the document they reference, so childnodes of the repository document and the import XML that reference the same document are merged (and not added twice). (This procedure is relevant only if the childnode is a reference node!)
  • Alternatively you can define the identity of childnodes (from the import XML and the repository document) by using a merge property (attribute mergeProperty="PROPERTY_NAME"). This feature might be useful, if rows of a dynamic table or the content of boxes should be updated.
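
The first case in the list above, identification via childnode id, might look like the following fragment. This is a sketch under assumptions: the childnode name, node type and childnode id are borrowed from the box examples later in this chapter, and the title value is hypothetical.

```xml
<childNodes>
  <updateBehaviour>
    <childNode name="sophora-content:box" behaviour="merge" />
  </updateBehaviour>
  <!-- The box in the repository with the same childnode id is merged with this one -->
  <childNode nodeType="sophora-content-nt:storylist" name="sophora-content:box">
    <properties>
      <property name="sophora:childNodeId">
        <value>-5896417165016167967</value>
      </property>
      <!-- Hypothetical new value for the title of the matched box -->
      <property name="sophora-extension:title">
        <value>Updated box title</value>
      </property>
    </properties>
    <childNodes />
    <resourceList />
  </childNode>
</childNodes>
```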

The optional attribute "insertPosition" configures whether an additional/new childnode is added at the beginning or at the end of the list of childnodes. Possible values are "start" and "end"; the latter is the default (used when this attribute is omitted).

The attribute "maxNumber" is also optional. It defines the maximum number of childnodes of this very name. Depending on the insert position ("start" or "end") the list of childnodes of this name is reduced accordingly (at the beginning or end), if the maximum number of childnodes is exceeded.

The following example displays a standard way of merging using the Sophora reference property (no childnode ids are given in the XML). It shows a podcast document (external ID "podcast_mittags_vorgelesen") where an audio is added as a childnode (ID="podcast_audio_42"). Already existing audios should not be affected or modified. Furthermore, the podcast should not contain more than 30 audios and new ones shall be added at the beginning:

<?xml version="1.0" encoding="UTF-8"?>
<document externalID="podcast_mittags_vorgelesen"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <updateBehaviour>
      <childNode name="sophora-content:audiolist" behaviour="merge" insertPosition="start" maxNumber="30" />
    </updateBehaviour>
    <childNode nodeType="sophora-content-nt:audioref" name="sophora-content:audiolist">
      <properties>
        <property name="sophora:reference">
          <value>podcast_audio_42</value>
        </property>
        <property name="sophora-content:headline">
          <value>Überschriebene Headline</value>
        </property>
      </properties>
      <childNodes />
      <resourceList />
    </childNode>
  </childNodes>
  <resourceList/>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

If the podcast already contains the audio "podcast_audio_42" childnode, it will not be added a second time. Instead, the properties are merged (in this case the "sophora-content:headline" property will be overridden).

The next example shows the merge with an arbitrary property. Consider a document with the external ID "icehockey_vancouver_group_a": It contains a dynamic table with results in the different rows which should be updated now. Assume that the rows have the childnode name "olympia:matchRow" and the nodetype is "olympia-nt:olyMatchRow". The property "olympia:matchId" is used as the mergeProperty.

<?xml version="1.0" encoding="UTF-8"?>
<document externalID="icehockey_vancouver_group_a"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <updateBehaviour>
      <childNode name="olympia:matchRow"
                 behaviour="merge"
                 mergeProperty="olympia:matchId" />
    </updateBehaviour>
    <childNode nodeType="olympia-nt:olyMatchRow" name="olympia:matchRow">
      <properties>
        <property name="olympia:result">
          <value>3:1</value>
        </property>
        <property name="olympia:matchId">
          <value>eh_v_gr_a_0001</value>
        </property>
      </properties>
      <childNodes />
      <resourceList />
    </childNode>
  </childNodes>
  <resourceList/>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

The next example presents the merge over multiple levels of childnodes. Here, an article has an assigned box. Within this box is a reference to a citation document. The update of the citation document requires a merge on two levels: First, the box needs an identifier. This is achieved by setting mergeProperty="sophora-extension:title". On the second level, the Importer's standard behaviour applies because the childnodes can be identified by the reference property "sophora:reference". Therefore, no explicit merge property is necessary.

<?xml version="1.0" encoding="UTF-8"?>
<document externalID="test_00001"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <updateBehaviour>
      <childNode name="core:box"
                 behaviour="merge"
                 mergeProperty="sophora-extension:title" />
    </updateBehaviour>
    <childNode nodeType="core-nt:citationBox" name="core:box">
      <properties>
        <property name="sophora-extension:title">
          <value>Zitate</value>
        </property>
      </properties>
      <childNodes>
        <updateBehaviour>
          <childNode name="sophora-extension:teaser" behaviour="merge" />
        </updateBehaviour>
        <childNode nodeType="core-nt:citationRef" name="sophora-extension:teaser">
          <properties>
            <property name="sophora:reference">
              <value>citation_007</value>
            </property>
            <property name="core:headline">
              <value>Geschüttelt oder gerührt</value>
            </property>
            <property name="sophora:overridingProperties">
              <value>core:headline</value>
            </property>
          </properties>
          <childNodes />
          <resourceList />
        </childNode>
      </childNodes>
      <resourceList />
    </childNode>
  </childNodes>
  <resourceList/>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

During a merge, individual childnodes can also be removed selectively by setting a childnode's "remove" attribute to "true". Every childnode on this level with the same identity is removed, i.e. if it references the same document or if it matches the merge property's value (when an <updateBehaviour> with a corresponding merge property is configured for this childnode name).

The last example of this section explains how the explicit removal of a childnode works. The following XML snippet specifies three different actions:

  1. Removing a reference to a text teaser: On the first level of the childnodes, all childnodes of the document named "core:textteaser" are removed if they refer to the document with the external ID "story_9908".
  2. Removing rows from a dynamic table for audio metadata: Also on the first level, all childnodes named "core:audiodata" are removed if their merge property "core:name" has the value "audio_file_12432".
  3. Removing audio references from an audio box: On the second level of the childnodes, childnodes named "core:teaser" that reference the document ID "con2748038" are removed if they are childnodes of a childnode "core:box" which has the value "Test" assigned to its merge property "core:title".
<?xml version="1.0" encoding="UTF-8"?>
<document xmlns="http://www.sophoracms.com/import/2.8" externalID="audio_4711">
  <properties />
  <childNodes>
    <updateBehaviour>
      <childNode name="core:audiodata" behaviour="merge" insertPosition="end" mergeProperty="core:name" />
      <childNode name="core:box" behaviour="merge" insertPosition="end" mergeProperty="core:title" />
      <childNode name="core:textteaser" behaviour="merge" />
    </updateBehaviour>
    <childNode nodeType="core-nt:audiodata" name="core:audiodata" remove="true">
      <properties>
        <property name="core:name">
          <value>audio_file_12432</value>
        </property>
      </properties>
      <childNodes />
      <resourceList />
    </childNode>
    <childNode nodeType="core-nt:audiolist" name="core:box">
      <properties>
        <property name="core:title">
          <value>Test</value>
        </property>
      </properties>
      <childNodes>
        <updateBehaviour>
          <childNode name="core:teaser" behaviour="merge" />
        </updateBehaviour>
        <childNode nodeType="core-nt:audioref" name="core:teaser" remove="true">
          <properties>
            <property name="sophora:reference">
              <value>con2748038</value>
            </property>
          </properties>
          <childNodes />
          <resourceList />
        </childNode>
      </childNodes>
      <resourceList />
    </childNode>
    <childNode nodeType="core-nt:storyref" name="core:textteaser" remove="true">
      <properties>
        <property name="sophora:reference">
          <value>story_9908</value>
        </property>
      </properties>
      <childNodes />
      <resourceList />
    </childNode>
  </childNodes>
  <resourceList />
  <fields>
    [...]
  </fields>
  <instructions>
    <lifecycleActivities />
    <proposals />
    <stickyNotes />
  </instructions>
</document>

Protecting and unprotecting Childnodes

There are two ways to protect/unprotect childnodes:

  • On document level you can protect/unprotect childnodes via the name of the childnode.
  • On every level of the component tree you can protect/unprotect component boxes via the childnode id of the box.

Protecting and unprotecting Childnodes on Document Level

To protect/unprotect childnodes on document level via the name of the childnode, provide a <childNode> element inside an <updateBehaviour> element as described in chapter Managing the Update Behaviour of Documents. The optional attribute protectionInstruction of the <childNode> element specifies the desired protection behaviour. Analogous to protecting/unprotecting properties (see chapter Protecting and unprotecting Properties) and protecting/unprotecting the whole document (see chapter Protecting and unprotecting Documents) there are five possible values for the attribute protectionInstruction:

Value | Description
none | If the corresponding childnode name in the repository is protected, the existing childnodes with the given name are not modified or removed. Childnodes with the given name in the import XML are ignored. Omitting the attribute protectionInstruction has the same effect as the value none.
unprotect | If the corresponding childnode name in the repository is protected, the protection is removed and then the corresponding childnodes from the XML are set (according to the attribute behaviour, which can be set to merge or delete - see Managing the Update Behaviour of Documents). After setting the childnodes, the protection of the childnode name is not re-established.
protect | If the corresponding childnode name in the repository is not protected (or if the document is newly created), the corresponding childnodes from the XML are set and the childnode name is then protected. If the corresponding childnode name in the repository is protected, the corresponding childnodes from the XML are ignored!
forceProtect | If the corresponding childnode name in the repository is protected, the protection is removed and then the corresponding childnodes from the XML are set (according to the attribute behaviour, which can be set to merge or delete - see Managing the Update Behaviour of Documents). After setting the childnodes, the childnode name is protected regardless of whether it was protected before.
reprotect | If the corresponding childnode name in the repository is protected, the protection is removed and then the corresponding childnodes from the XML are set (according to the attribute behaviour, which can be set to merge or delete - see Managing the Update Behaviour of Documents). After setting the childnodes, the childnode name is only protected if it was protected before. (Therefore the childnode name is never protected if the document is newly created.)

The following example shows different protection use cases:

<document nodeType="sophora-content-nt:story" externalID="import_000001" xmlns="http://www.sophoracms.com/import/2.8">
  <properties>
    [...]
  </properties>
  <childNodes>
    <updateBehaviour>
      <!-- Remove the copytext in the repository (if it exists) and protect it after setting it new. -->
      <childNode name="sophora-content:copytext" behaviour="delete" protectionInstruction="forceProtect" />
      <!-- Merge the teaser images in the repository (if existing) with the teaser images in the XML and protect them after merging.
           (But: If the teaser images are already protected in the repository, the existing teaser images are not touched and the
           teaser images in the XML are ignored!) -->
      <childNode name="sophora-content:teaserImage" behaviour="merge" protectionInstruction="protect" />
      <!-- Remove the protection of the story teasers in the repository (if it exists) and set the story teasers from the XML. -->
      <childNode name="sophora-content:teaser" behaviour="delete" protectionInstruction="unprotect" />
      <!-- Remove the protection of the dynamic table in the repository (if it exists), remove all existing rows of the dynamic table
           and afterwards set the rows of the dynamic table from the XML. Protect the dynamic table if it was protected before. -->
      <childNode name="sophora-content:dynamictableRow" behaviour="delete" protectionInstruction="reprotect" />
    </updateBehaviour>
    <childNode nodeType="sophora-extension-nt:copytext" name="sophora-content:copytext">
      [...]
    </childNode>
    <childNode nodeType="sophora-content-nt:imageRef" name="sophora-content:teaserImage">
      [...]
    </childNode>
    <childNode nodeType="sophora-content-nt:storyRef" name="sophora-content:teaser">
      [...]
    </childNode>
    <childNode nodeType="sophora-content-nt:dynamictableRow" name="sophora-content:dynamictableRow">
      [...]
    </childNode>
    [...]
  </childNodes>
  [...]
</document>
Notice that the attribute protectionInstruction within an <updateBehaviour> element only takes effect for childnodes on document level!

Protecting and unprotecting Component boxes via Childnode Id

To protect/unprotect a component box in the component tree you have to provide the attribute protectionInstruction at the corresponding element <childNode>. This childnode must have a property sophora:childNodeId to match an existing box in the repository. If you don't provide a childnode id while using the protectionInstruction protect or forceProtect, the child node id will be created during the import process. Analogous to protecting/unprotecting childnodes via childnode name (see previous chapter), properties (see chapter Protecting and unprotecting Properties) and protecting/unprotecting the whole document (see chapter Protecting and unprotecting Documents) there are five possible values for the attribute protectionInstruction:

Value | Description
none | If the corresponding box in the repository is protected, this existing box with the given childnode id is not touched. The box in the XML is ignored. Omitting the attribute protectionInstruction has the same effect as the value none.
unprotect | If the corresponding box in the repository (identified via childnode id) is protected, the protection is removed and then the box from the XML is set (according to the attribute behaviour of a possible update behaviour for this childnode name, which can be set to merge or delete - see Managing the Update Behaviour of Documents). After setting the box, the protection of the box is not re-established.
protect | If the corresponding box in the repository (identified via childnode id) is not protected (or if the document is newly created), the box from the XML is set and the box is then protected. If the corresponding box in the repository (identified via childnode id) is protected, the box from the XML is ignored!
forceProtect | If the corresponding box in the repository (identified via childnode id) is protected, the protection is removed and then the corresponding box from the XML is set (according to the attribute behaviour of a possible update behaviour for this childnode name, which can be set to merge or delete - see Managing the Update Behaviour of Documents). After setting the box, the box is protected regardless of whether it was protected before.
reprotect | If the corresponding box in the repository (identified via childnode id) is protected, the protection is removed and then the corresponding box from the XML is set (according to the attribute behaviour of a possible update behaviour for this childnode name, which can be set to merge or delete - see Managing the Update Behaviour of Documents). After setting the box, the box is only protected if it was protected before. (Therefore the box is never protected if the document is newly created.)

The following example shows different protection use cases:

<document nodeType="sophora-content-nt:story" externalID="import_000001" xmlns="http://www.sophoracms.com/import/2.8">
  <properties>
    [...]
  </properties>
  <childNodes>
    <!-- Remove the existing box with childnode id "-5896417165016167967" from repository, set this box from XML
         and protect it after setting it. -->
    <childNode nodeType="sophora-content-nt:storylist" name="sophora-content:box" protectionInstruction="forceProtect">
      <properties>
        <property name="sophora:childNodeId">
          <value>-5896417165016167967</value>
        </property>
        <property name="sophora-extension:title">
          <value>Box title</value>
        </property>
      </properties>
      <childNodes>
        <childNode nodeType="sophora-content-nt:storyref" name="sophora-extension:teaser">
          <properties>
            <property name="sophora:reference">
              <value>external_id_000001</value>
            </property>
          </properties>
          <childNodes />
          <resourceList />
        </childNode>
      </childNodes>
      <resourceList />
    </childNode>
    <!-- Create this box new (no childnode id is set!) and protect it. -->
    <childNode nodeType="sophora-content-nt:storylist" name="sophora-content:box" protectionInstruction="protect">
      <properties>
        <property name="sophora-extension:title">
          <value>Box title</value>
        </property>
      </properties>
      <childNodes>
        <childNode nodeType="sophora-content-nt:storyref" name="sophora-extension:teaser">
          <properties>
            <property name="sophora:reference">
              <value>external_id_000002</value>
            </property>
          </properties>
          <childNodes />
          <resourceList />
        </childNode>
      </childNodes>
      <resourceList />
    </childNode>
    <!-- Remove the existing box with childnode id "-5896417165016167968" from repository, set this box from XML
         and protect it after setting it. (But: If the box exists in the repository and is already protected
         in the repository, the existing box is not touched and the box in the XML is ignored!) -->
    <childNode nodeType="sophora-content-nt:storylist" name="sophora-content:box" protectionInstruction="protect">
      <properties>
        <property name="sophora:childNodeId">
          <value>-5896417165016167968</value>
        </property>
        <property name="sophora-extension:title">
          <value>Box title</value>
        </property>
      </properties>
      <childNodes>
        <childNode nodeType="sophora-content-nt:storyref" name="sophora-extension:teaser">
          <properties>
            <property name="sophora:reference">
              <value>external_id_000003</value>
            </property>
          </properties>
          <childNodes />
          <resourceList />
        </childNode>
      </childNodes>
      <resourceList />
    </childNode>
  </childNodes>
</document>
Technically, the attribute protectionInstruction can be used at any childnode where the property sophora:childNodeId can be set. However, this makes little sense because the DeskClient only supports the protection of boxes in the component tree; individual components cannot be protected.
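
The example above covers forceProtect and protect. For completeness, a box could be unprotected along the same lines. The following is only a sketch that reuses the external ID, node types and childnode id from the example above; it removes an existing protection of the box, sets the box from the XML and does not re-establish the protection afterwards.

```xml
<document nodeType="sophora-content-nt:story" externalID="import_000001" xmlns="http://www.sophoracms.com/import/2.8">
  <properties>
    [...]
  </properties>
  <childNodes>
    <!-- If the box with this childnode id is protected, remove the protection and set the box from the XML.
         The protection is not re-established afterwards. -->
    <childNode nodeType="sophora-content-nt:storylist" name="sophora-content:box" protectionInstruction="unprotect">
      <properties>
        <property name="sophora:childNodeId">
          <value>-5896417165016167967</value>
        </property>
        <property name="sophora-extension:title">
          <value>Box title</value>
        </property>
      </properties>
      <childNodes />
      <resourceList />
    </childNode>
  </childNodes>
</document>
```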

Fields <fields>

Within the <fields> element, various metadata about the document to import can be provided. The subsequent table contains the possible child elements in the exact order in which they must appear inside the <fields> element. Unless stated otherwise, the elements may be empty.

Field | Description | Example
<site> | The site to import the document to. This element may only be empty if the site can be determined from the <structureNode> element, if an existing document is updated (the existing site structure is kept), if the site is defined by a site mapping (see site mappings) or if the property sophora.importer.defaultSite is defined (see Properties in the instance configuration file(s) 'sophora-importer_instance-NNN.properties'). | <site>demo</site>
<structureNode> | The structure node of the site. This element may only be empty if an existing document is updated (the existing location is kept), if the structure node is defined by a site mapping or if the property sophora.importer.defaultStructureNode is defined (see Properties in the instance configuration file(s) 'sophora-importer_instance-NNN.properties'). | <structureNode>/multimedia/bilder</structureNode>
<categories> | The categories that are assigned to this document. Every assigned category has to be provided in an individual <category> child element. The content of a <category> element has to contain the entire path of the category (path elements are separated by semicolons). If no <category> element is present, either no categories are assigned or the existing ones are kept (if it is an update of an existing document). Additionally, the attribute remove="true" on the <categories> element resets a document's categories. | <categories><category>Politics</category><category>Politics;ForeignAffairs</category><category>Law</category><category>Law;Justice;judgements</category></categories> or <categories remove="true" />
<idstem> | The ID stem is the basis of the Sophora ID. It may consist of lower and upper case letters in combination with "_" and "-". This element may only be empty if an existing document is updated or the property sophora:id is set. In the latter case the Importer tries to use this Sophora ID. If this ID already exists in the repository, the import fails. Note: Explicitly setting the Sophora ID only works when the property sophora.contentmanager.migrationMode is set to true. This feature might be useful when migrating old data. | <idstem>importimage</idstem>
<forceLock> | Defines whether to break a document's lock if it is locked by another user. If this value is false and the document to import is locked by another Sophora user, the import fails. If it is true, the lock is broken, i.e. taken from the user. Possible values are true or 1 and false or 0. The default value is false. Optional attribute timeout: Specifies a timeout in minutes during which the Importer tries to obtain the lock of the document. If the document is locked by a user, the Importer shows a dialog in the DeskClient to the user holding the lock, asking them to release it. The dialog is shown every two minutes until the lock is released or the timeout is reached. If the timeout is reached, the lock is broken if forceLock is true; otherwise, the import fails. While waiting for the user to release the lock, the current import is put on hold and the Importer continues to process other imports. If the timeout attribute is not set on the forceLock element and the document is locked, the Importer immediately breaks the lock or fails the import, depending on the element value. Optional attribute retryInterval: Specifies an interval in minutes that the Importer uses when retrying to obtain the lock of the document. (Default: 2) | <forceLock>true</forceLock>
<forceCreate> | Determines whether a new document is created if the specified external ID already exists. Possible values are true or 1 and false or 0; false is the default value. If the external ID of the document to import is given and the repository already contains this ID, the external ID of the document to import is modified and a new document is created. | <forceCreate>true</forceCreate>
<channels> | Lists of channel names for which the document is activated or deactivated. See "Channels" below. |

The following example shows typical fields of an import XML. This XML forces the lock of the document to import (locks held by other Sophora users will be broken!) and does not automatically create a new document if the given external ID already exists in the Sophora repository:

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content-nt:imageobject"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    [...]
  </childNodes>
  <resourceList />
  <fields>
    <site>news</site>
    <structureNode>/multimedia/bilder</structureNode>
    <categories>
      <category>Inland</category>
      <category>Sport;Fußball;Bundesliga</category>
    </categories>
    <idstem>bundesliga</idstem>
    <forceLock>true</forceLock>
    <forceCreate>false</forceCreate>
    <enabledChannels />
    <disabledChannels />
  </fields>
  <instructions>
    [...]
  </instructions>
</document>
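
The optional attributes timeout and retryInterval of <forceLock>, described in the table above, do not appear in this example. A minimal sketch of how they might be combined is shown below; the values 10 and 1 (both in minutes) are arbitrary choices for illustration.

```xml
<fields>
  [...]
  <!-- Try to obtain the lock for up to 10 minutes, retrying every minute.
       Because the element value is true, the lock is broken once the timeout is reached. -->
  <forceLock timeout="10" retryInterval="1">true</forceLock>
  <forceCreate>false</forceCreate>
  [...]
</fields>
```

With the element value set to false instead, the import would fail once the timeout is reached.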

Instructions <instructions>

Instructions (<instructions>) define actions that are executed when a document is imported, regardless of whether a new document is created or an existing one is updated.

Possible use cases for instructions are adding a document to one or more proposal sections or publishing it immediately.

Instructions are only executed if the underlying import itself is successful.

Instructions Concerning a Document's Life Cycle (Publishing, Deletion etc.)

For every document to import you can define life cycle actions. Each of these actions has to be provided via a <lifecycleActivity> element. Each such element requires the "type" attribute to be specified.

<lifecycleActivity> elements are placed within the <lifecycleActivities> element. You can define as many activities as you want. They will be performed in the order in which you put them in the XML. The following example shows the common case of publishing a document after importing it:

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content:story"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    [...]
  </childNodes>
  <resourceList>
    [...]
  </resourceList>
  <fields>
    [...]
  </fields>
  <instructions>
    <lifecycleActivities>
      <lifecycleActivity type="publish" />
    </lifecycleActivities>
    <proposals />
    <stickyNotes />
  </instructions>
</document>

The table below lists all possible values for the type attribute. (It doesn't matter whether the import created a new document or updated an existing document.)

Type | Description | Resulting labeling in the DeskClient | Preconditions | Example
publish | The imported document will be published, or put into the state 'publishAt' (if a 'publishAt' date is set) or into the state 'prePublished' (if the document type of the document has the Sophora mixin 'prePublishRequired'). | Green status icon (possibly with clock or cog wheel icon) | The document is not deleted, is not in the state 'prePublished' and this very version of the document is not online already. | <lifecycleActivity type="publish" />
finishPrePublishing | The imported document will be published, even if it is in the state 'prePublished'. | Green status icon | The document is not deleted and this very version of the document is not online already. | <lifecycleActivity type="finishPrePublishing" />
release | The imported document will be released. | Dark yellow status icon with three dots | The document is not deleted, is not released already or in the state 'publishAt' or 'prePublished' and this very version of the document is not online already. | <lifecycleActivity type="release" />
setOffline | The imported document will be set offline. | Red overlay upon the status icon | The document is not deleted, is not offline already and has a live version. | <lifecycleActivity type="setOffline" />
delete | The imported document will be deleted. | Red status icon with "X" on it | The document is not deleted already. | <lifecycleActivity type="delete" />
deletePermanently | The imported document will be moved from the trash bin to the delete archive or deleted permanently (depending on your server configuration). | Red status icon with "X" on it | The document is in the trash bin. | <lifecycleActivity type="deletePermanently" />
restore | The imported document will be restored. | Brown "in process" icon with a yellow exclamation mark overlay | The document is deleted. | <lifecycleActivity type="restore" />
createVersion | Creates a version of the given document. | Does not affect the status of the document. | The document is in working state and was altered since its last versioning. | <lifecycleActivity type="createVersion" />
keepState | Preserves the previous state of the document. | Depends on previous state. | None. | <lifecycleActivity type="keepState" />
keepState (with fallback) | A fallback activity can be defined for the keepState activity, in case the document does not already exist. A fallback activity can be any lifecycle activity except for keepState. | Depends on previous state. | None. | <lifecycleActivity type="keepState" fallback="publish" />
The life cycle activities are performed after the imported document is saved so that they cannot cause the Importer to fail. If a precondition of a life cycle action is violated, an corresponding warning is written to the logfile and the Importer proceeds with the next task.

The subsequent example displays a (slightly contrived) case where the imported document is released, published twice, set offline, deleted and restored again:

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content:story"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    [...]
  </childNodes>
  <resourceList>
    [...]
  </resourceList>
  <fields>
    [...]
  </fields>
  <instructions>
    <lifecycleActivities>
      <lifecycleActivity type="release" />
      <lifecycleActivity type="publish" />
      <!-- This action is not possible, because the document is published already: -->
      <lifecycleActivity type="publish" />
      <lifecycleActivity type="setOffline" />
      <lifecycleActivity type="delete" />
      <lifecycleActivity type="restore" />
    </lifecycleActivities>
    <proposals />
    <stickyNotes />
  </instructions>
</document>

The third activity (the second "publish") does not abort the whole process but merely produces a warning that the document could not be published again. The subsequent activities are executed properly.
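When import files are generated by a script, the type values from the table in this section and the fallback rule for keepState can be checked before the XML is written. The following is a minimal Python sketch, not part of the Importer: the function name build_lifecycle_activities is made up for illustration, and the elements are built without the import namespace for brevity.

```python
import xml.etree.ElementTree as ET

# Type values as listed in the table of this section.
LIFECYCLE_TYPES = {
    "publish", "finishPrePublishing", "release", "setOffline", "delete",
    "deletePermanently", "restore", "createVersion", "keepState",
}


def build_lifecycle_activities(activities):
    """Build <lifecycleActivities> from (type, fallback) pairs (hypothetical helper).

    The pairs are written in the given order, because the activities are
    performed in the order in which they appear in the XML.
    """
    parent = ET.Element("lifecycleActivities")
    for activity_type, fallback in activities:
        if activity_type not in LIFECYCLE_TYPES:
            raise ValueError(f"unknown lifecycle activity type: {activity_type}")
        attributes = {"type": activity_type}
        if fallback is not None:
            # The documentation describes a fallback only for keepState,
            # and the fallback itself must not be keepState.
            if activity_type != "keepState":
                raise ValueError("a fallback is only described for keepState")
            if fallback == "keepState" or fallback not in LIFECYCLE_TYPES:
                raise ValueError(f"invalid fallback: {fallback}")
            attributes["fallback"] = fallback
        ET.SubElement(parent, "lifecycleActivity", attributes)
    return parent


element = build_lifecycle_activities([("release", None), ("keepState", "publish")])
print(ET.tostring(element, encoding="unicode"))
```

The sketch only checks the attribute values; the preconditions from the table depend on the document's state in the repository and are evaluated by the Importer itself.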

Adding a Document to Proposal Sections

For every imported document you can create one or more proposals. Each proposal can be sent to several proposal sections. The XML fragment for a single proposal looks like this:

<proposal startDate="2009-08-20T09:00:00+02:00" endDate="2009-09-20T09:00:00+02:00">
  <proposalSections>
    <proposalSection priority="LOW">
      <path>Politik</path>
      <path>Inland</path>
    </proposalSection>
  </proposalSections>
  <comment>Bitte schnellstmöglich bearbeiten.</comment>
  <sender>mueller</sender>
</proposal>

The following table summarises the possible elements and attributes of this XML part:

Element/Attribute | Mandatory | Description | Example
<proposal> | No | The root element for a proposal. It has to be placed within the <proposals> element. | <proposal endDate="2009-09-20T09:00:00+02:00"> [...] </proposal>
startDate (attribute) | No | Determines when a document starts being offered in the proposal section. Format: ISO 8601. | startDate="2009-08-20T09:00:00+02:00"
endDate (attribute) | No | Determines when a document stops being offered in the proposal section. Format: ISO 8601. | endDate="2009-09-20T09:00:00+02:00"
newDocumentProposal (attribute) | No | Defines whether only newly created documents are proposed. If this attribute is true (or "1"), no proposal is created when the import merely updates an existing document. Default value is false. | newDocumentProposal="true"
<proposalSections> | Yes | Contains the proposal sections where the document will be offered. | <proposalSections> [...] </proposalSections>
<proposalSection> | Yes | Defines an individual proposal section. This element is required at least once. | <proposalSection> [...] </proposalSection>
<path> | Yes | The <path> elements within <proposalSection> specify the path of the proposal section. | <proposalSection> <path>Politik</path> <path>Inland</path> </proposalSection>
priority (attribute) | No | Attribute of the <proposalSection> element. It specifies the priority of the proposal within a particular section. Supported values: "LOW", "MEDIUM", "HIGH". It defaults to "MEDIUM" if not present. | <proposalSection priority="HIGH">
<comment> | No | A comment about this proposal. This element may be empty. | <comment>Bitte schnellstmöglich bearbeiten.</comment>
<sender> | No | The originator of this proposal, which has to be a valid Sophora user from the repository. If this element is omitted, the user who connected the Importer with the repository is used instead. | <sender>mueller</sender>
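A <proposal> fragment following the table above can also be generated programmatically. The sketch below uses Python's standard library; the function name build_proposal and its parameters are assumptions made for this illustration and are not part of Sophora. Dates are passed as timezone-aware datetime objects so that isoformat() yields the ISO 8601 form used in the examples; namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

PRIORITIES = {"LOW", "MEDIUM", "HIGH"}  # supported values per the table


def build_proposal(sections, start=None, end=None, comment=None, sender=None):
    """Build a <proposal> element (hypothetical helper).

    sections is a list of (path_segments, priority) pairs; priority may be
    None, in which case the attribute is left out.
    """
    if not sections:
        raise ValueError("at least one proposal section is required")
    proposal = ET.Element("proposal")
    if start is not None:
        proposal.set("startDate", start.isoformat())
    if end is not None:
        proposal.set("endDate", end.isoformat())
    container = ET.SubElement(proposal, "proposalSections")
    for path_segments, priority in sections:
        if not path_segments:
            raise ValueError("a proposal section needs at least one <path>")
        section = ET.SubElement(container, "proposalSection")
        if priority is not None:
            if priority not in PRIORITIES:
                raise ValueError(f"unsupported priority: {priority}")
            section.set("priority", priority)
        for segment in path_segments:
            ET.SubElement(section, "path").text = segment
    if comment is not None:
        ET.SubElement(proposal, "comment").text = comment
    if sender is not None:
        ET.SubElement(proposal, "sender").text = sender
    return proposal


cest = timezone(timedelta(hours=2))
proposal = build_proposal(
    [(["Politik", "Inland"], "LOW")],
    start=datetime(2009, 8, 20, 9, 0, tzinfo=cest),
    end=datetime(2009, 9, 20, 9, 0, tzinfo=cest),
    sender="mueller",
)
print(ET.tostring(proposal, encoding="unicode"))
```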

The subsequent example presents a document where three different proposals will be created after importing. The first proposal is placed in two different proposal sections ("Mediathek" and "Marktplatz/Medien") whereas the other two proposals are assigned to only one proposal section each ("Sport" and "Team Kultur/Ausstellungen" respectively). Various priorities are assigned to proposals in different sections.

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content:story"
          xmlns="http://www.sophoracms.com/import/2.8"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    [...]
  </childNodes>
  <resourceList>
    [...]
  </resourceList>
  <fields>
    [...]
  </fields>
  <instructions>
    <lifecycleActivities />
    <proposals>
      <proposal startDate="2009-08-20T09:00:00+02:00" endDate="2009-09-20T09:00:00+02:00">
        <proposalSections>
          <proposalSection priority="HIGH">
            <path>Mediathek</path>
          </proposalSection>
          <proposalSection>
            <path>Marktplatz</path>
            <path>Medien</path>
          </proposalSection>
        </proposalSections>
        <comment>Bitte schnellstmöglich bearbeiten.</comment>
        <sender>mueller</sender>
      </proposal>
      <proposal endDate="2009-09-23T09:00:00+02:00">
        <proposalSections>
          <proposalSection priority="MEDIUM">
            <path>Sport</path>
          </proposalSection>
        </proposalSections>
      </proposal>
      <proposal>
        <proposalSections>
          <proposalSection priority="LOW">
            <path>Team Kultur</path>
            <path>Ausstellungen</path>
          </proposalSection>
        </proposalSections>
      </proposal>
    </proposals>
    <stickyNotes />
  </instructions>
</document>

Sticky Notes of a Document

Sticky notes can also be created, edited and removed when importing a document.

The XML fragment to add or edit a sticky note looks like this:

<stickyNote stickyNoteId="my-sticky-note" >This is an important message.</stickyNote>

If a sticky note with the given ID already exists for the document, it is overwritten with the one from the XML. Otherwise a new sticky note with the given ID is created.

To remove a sticky note from a document via the importer you will have to use the remove attribute:

<stickyNote stickyNoteId="my-sticky-note" remove="true"></stickyNote>

The following table summarises the possible attributes of this XML part:

Attribute | Mandatory | Description | Example
stickyNoteId | Yes | The ID of the sticky note to add/edit/remove. | stickyNoteId="my-sticky-note"
color | No | The background color of the sticky note. Must be of the format "red,green,blue", where the three values are integers in the range 0-255. | color="255,255,0"
remove | No | If set to "true", the sticky note is removed from the document. | remove="true"
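The format of the color attribute can be checked before an import file is written. The following Python sketch is an illustration only; the function name parse_sticky_note_color is made up here, and the check is deliberately strict (no spaces, no signs), which is an assumption rather than documented Importer behaviour.

```python
def parse_sticky_note_color(value):
    """Parse a "red,green,blue" string into a tuple of three ints (0-255).

    Raises ValueError if the string does not match the documented format.
    """
    parts = value.split(",")
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated values: {value!r}")
    channels = []
    for part in parts:
        # Only plain ASCII digits are accepted.
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"not a non-negative integer: {part!r}")
        number = int(part)
        if number > 255:
            raise ValueError(f"value out of range 0-255: {number}")
        channels.append(number)
    return tuple(channels)


print(parse_sticky_note_color("255,255,0"))
```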

The subsequent example presents a document where three different sticky notes are added, edited and removed, respectively. As you may notice, the <stickyNote> elements for adding and editing a sticky note look exactly the same. Whether a sticky note is edited is determined internally by checking whether it already exists in the document.

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content:story"
          xmlns="http://www.sophoracms.com/import/3.3"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    [...]
  </childNodes>
  <resourceList>
    [...]
  </resourceList>
  <fields>
    [...]
  </fields>
  <instructions>
    <lifecycleActivities />
    <proposals />
    <stickyNotes>
        <stickyNote stickyNoteId="new-sticky-note" >New Sticky Note</stickyNote>
        <stickyNote stickyNoteId="edit-sticky-note" color="123,123,123" >Edited Sticky Note</stickyNote>
        <stickyNote stickyNoteId="delete-sticky-note" remove="true" ></stickyNote>
    </stickyNotes>
  </instructions>
</document>

Multiple Documents in XML

In some cases it might be handy to provide multiple documents within one XML import file. To do so, you have to encapsulate the <document> elements in the parent element <documents>:

<?xml version="1.0" encoding="UTF-8"?>
<documents xmlns="http://www.sophoracms.com/import/2.8"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <document nodeType="sophora-content:story">
    <properties>
      [...]
    </properties>
    <childNodes>
      [...]
    </childNodes>
    <resourceList>
      [...]
    </resourceList>
    <fields>
      [...]
    </fields>
    <instructions>
      [...]
    </instructions>
  </document>
  [...]
  <document nodeType="sophora-content:video">
    [...]
  </document>
  [...]
  <document nodeType="sophora-content:image">
    <properties>
      [...]
    </properties>
    <childNodes>
      [...]
    </childNodes>
    <resourceList>
      [...]
    </resourceList>
    <fields>
      [...]
    </fields>
    <instructions>
      [...]
    </instructions>
  </document>
</documents>

Importing Nodetype Information

Be Careful
The import of nodetype information is a powerful feature that can make migrations much easier. In everyday import scenarios, however, you may not want to allow nodetype modifications (and creations) via import. You should therefore consider carefully which roles and rights you give to the Sophora user the Importer is started with (see the Sophora configuration property sophora.contentmanager.user). Modifying nodetypes requires the Sophora system right "administrator", so if the Importer user does not have this right, it cannot modify nodetypes.

It is possible to import nodetype definitions, default property configurations, default childnode configurations and nodetype configurations. To do so, the root element of the import XML file has to be <sophora>, and the element <nodetypes> has to be the direct and first child element of <sophora>.
The following table shows which element you have to place inside the <nodetypes> element in order to import the desired nodetype information:

Importing... | Element inside the <nodetypes> element
Nodetype definitions (CNDs) | <nodetypeDefinitions>
Default property configurations | <defaultPropertyConfigurations>
Default childnode configurations | <defaultChildnodeConfigurations>
Nodetype configurations | <nodetypeConfigurations>

Each of those four elements must have exactly one child element, namely <data>. Its required attribute "specificationMethod" specifies how the data is represented: either as "reference", in which case the text content of the <data> element is just a (relative or absolute) file reference to the corresponding CND or XML file, or as "inline", in which case the whole content of the CND file or the XML configuration file (including the inline DTD!) is placed as CDATA content in the <data> element.
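Both representations of the <data> element can be produced with Python's xml.dom.minidom, which supports CDATA sections. This is a minimal sketch; the function name build_data_element is an assumption made for illustration, and namespaces are omitted.

```python
from xml.dom.minidom import Document


def build_data_element(method, content):
    """Return a serialized <data> element (hypothetical helper).

    method  -- "reference" (content is a file reference) or
               "inline" (content is the full CND or XML text)
    """
    if method not in ("reference", "inline"):
        raise ValueError(f"unsupported specificationMethod: {method}")
    doc = Document()
    data = doc.createElement("data")
    data.setAttribute("specificationMethod", method)
    if method == "inline":
        # A CDATA section ends at the first "]]>", so such content
        # cannot be placed into a single section.
        if "]]>" in content:
            raise ValueError("inline content must not contain ']]>'")
        data.appendChild(doc.createCDATASection(content))
    else:
        data.appendChild(doc.createTextNode(content))
    return data.toxml()


print(build_data_element("reference", "story.cnd"))
print(build_data_element("inline", "['content-nt:story'] > nt:base"))
```

Using a CDATA section means that characters such as "<" and ">" in a CND file do not have to be escaped.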

The first example shows the import of a nodetype definition (CND) via reference:

<?xml version="1.0" encoding="UTF-8"?>
<sophora xmlns="http://www.sophoracms.com/import/2.8"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <nodetypes>
    <nodetypeDefinitions>
      <data specificationMethod="reference">story.cnd</data>
    </nodetypeDefinitions>
  </nodetypes>
</sophora>

The referenced file story.cnd has to be located in the same directory as the import XML file. Alternatively, you can specify a relative path like import/cnds/story.cnd or an absolute path like file:C:/importer/import/cnds/story.cnd. For security reasons, neither a relative nor an absolute path can reference a file higher up in the folder hierarchy.
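A script that generates import files can reject relative references that climb upwards before the Importer does. The sketch below assumes that "higher" is meant relative to the directory of the import XML file; this interpretation, the function name stays_below_start_directory and the handling of backslashes are assumptions, and absolute file: references are not covered.

```python
import posixpath


def stays_below_start_directory(reference):
    """Return True if a relative reference never leaves its start directory.

    Assumption: the start directory is the directory of the import XML file.
    """
    unified = reference.replace("\\", "/")
    if unified.startswith("/") or unified.startswith("file:"):
        raise ValueError("only relative references are checked by this sketch")
    # normpath collapses "a/../b" to "b"; anything that still starts
    # with ".." points above the start directory.
    normalized = posixpath.normpath(unified)
    return normalized != ".." and not normalized.startswith("../")


print(stays_below_start_directory("import/cnds/story.cnd"))
print(stays_below_start_directory("../cnds/story.cnd"))
```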

The second example has the same effect as the first example - but in this case the content of the beforehand referenced CND file is put inline in the sophora import XML document:

<?xml version="1.0" encoding="UTF-8"?>
<sophora xmlns="http://www.sophoracms.com/import/2.8"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <nodetypes>
    <nodetypeDefinitions>
      <data specificationMethod="inline"><![CDATA[<'content-nt'='http://www.content.de/content-nt/1.0'>
<'nt'='http://www.jcp.org/jcr/nt/1.0'>
<'content'='http://www.content.de/content/1.0'>
<'sophora-nt'='http://www.subshell.com/sophora-nt/1.0'>
 
['content-nt:story'] > nt:base
  orderable
  - content:emailContact (string)
  - content:contentType (string)
  - content:referenceDate (date)
  + content:broadcasts ('sophora-nt:reference') multiple]]></data>
    </nodetypeDefinitions>
  </nodetypes>
</sophora>

The next example shows the import of nodetype configurations via reference:

<?xml version="1.0" encoding="UTF-8"?>
<sophora xmlns="http://www.sophoracms.com/import/2.8"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <nodetypes>
    <nodetypeConfigurations>
      <data specificationMethod="reference">content-nt_storyRef.config.xml</data>
    </nodetypeConfigurations>
  </nodetypes>
</sophora>

In the following example the content of the config XML file of the previous example is put inline in the sophora import XML document:

<?xml version="1.0" encoding="UTF-8"?>
<sophora xmlns="http://www.sophoracms.com/import/2.8"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <nodetypes>
    <nodetypeConfigurations>
      <data specificationMethod="inline"><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE descriptions [
<!ELEMENT descriptions (nodetype*)>
 
[...]
 
<!ATTLIST childnode summary (true|false) #IMPLIED>]>
 
<descriptions>
  <nodetype name="content-nt:storyRef" authorisation="false" unsearchable="false">
    <mixins />
    <label>Artikelreferenz</label>
    <searchResultProperties />
    <referenceNodeTypes />
    <variantValidChildNodeNames />
    <icon>
      <binarydata mimetype="image/gif">data:;base64,R0l [...]</binarydata>
    </icon>
    <displayTab name="system">
      <property name="sophora:overridingProperties" />
      <property name="sophora:reference" />
    </displayTab>
    <displayTab name="base">
      <property name="content:headline" />
      <property name="content:subHeadline" />
      <property name="content:teaserHeadline" />
      <property name="content:teaserText" />
      <property name="content:teaserLabel" />
      <property name="content:teaserIcon" />
      <property name="content:teaserAttribute" />
      <childnode name="content:teaserImages" summary="false">
        <inputFieldType>com.subshell.sophora.eclipse.imageReferenceEditorComponent</inputFieldType>
      </childnode>
    </displayTab>
    <displayTab name="meta">
      <property name="sophora:endDate" />
      <property name="sophora:startDate" />
    </displayTab>
  </nodetype>
</descriptions>]]></data>
    </nodetypeConfigurations>
  </nodetypes>
</sophora>

It is also possible to put the different kinds of nodetype modifications within one Sophora XML document. Additionally, you can have multiple elements of the same kind of nodetype modification.
The following example shows this use case: Two different nodetype definitions are imported together with default property configurations, default childnode configurations and nodetype configurations. All data is specified by using file references - it would also be possible to use inline data or to mix the use of file references and inline data.

<?xml version="1.0" encoding="UTF-8"?>
<sophora xmlns="http://www.sophoracms.com/import/2.8"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <nodetypes>
    <nodetypeDefinitions>
      <data specificationMethod="reference">story.cnd</data>
    </nodetypeDefinitions>
    <nodetypeDefinitions>
      <data specificationMethod="reference">image.cnd</data>
    </nodetypeDefinitions>
    <defaultPropertyConfigurations>
      <data specificationMethod="reference">default_property_configurations.config.xml</data>
    </defaultPropertyConfigurations>
    <defaultChildnodeConfigurations>
      <data specificationMethod="reference">default_childnode_configurations.config.xml</data>
    </defaultChildnodeConfigurations>
    <nodetypeConfigurations>
      <data specificationMethod="reference">nodetype_configurations.config.xml</data>
    </nodetypeConfigurations>
  </nodetypes>
</sophora>

Importing Structure Nodes

It is also possible to import structure nodes. To do so, the root element of the import XML file has to be <sophora>. This element contains the two child elements <structure> and <documents> for the structure node information and for the structure node documents, respectively. A structure node is connected with its structure node document via the external ID of the document. If no structure node document is given in the import XML, the Importer implicitly creates a new one. The following example demonstrates the layout of a structure import XML:

<?xml version="1.0" encoding="UTF-8"?>
<sophora xmlns="http://www.sophoracms.com/import/2.8"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <structure>
        <site path="/demosite" menuText="Home" url="http://localhost:8080/live"
            isActive="true" state="inProcess" isLiveVersionAvailable="true" redirectUrl="" redirectActive="false"
            structureNodeDocumentExternalId="f3d1d72c-5667-493d-899c-338c49e65466" hierarchyDocumentExternalId=""
            defaultDocumentExternalId="1d61d398-ed81-4969-b5f9-0bb4f335a7ee">
            <imagevariant imagevariantExternalId="2971a523-7b7a-4bc3-8cbf-fc0ef3862b28" />
            <imagevariant imagevariantExternalId="8a2104b0-d3ec-48cd-951a-fc263aa6af4e" />
            <imagevariant imagevariantExternalId="8538bdca-8d9e-45c6-8b8e-73b62c5c326b" />
            <imagevariant imagevariantExternalId="aeeacd6f-a5b4-4a26-9d0f-91d78c41c7af" />
            <imagevariant imagevariantExternalId="fc5a3cb3-320d-480d-b8d6-7cb705c359ec" />
            <imagevariant imagevariantExternalId="95cd64c6-6611-3f2b-a958-375aa61ee7a8" />
            <imagevariant imagevariantExternalId="39249fa6-b17b-4581-adc5-171e52441208" />
        </site>
        <structureNode path="/demosite/home" menuText="Home"  isActive="true" [...]/>
        <structureNode path="/demosite/trendcities" menuText="City Reports" isActive="true" [...]/>
        <structureNode path="/demosite/trendcities/copenhagen" menuText="Copenhagen" isActive="true" [...]/>
        <structureNode path="/demosite/trendcities/copenhagen/cphvision" menuText="CPH Vision" isActive="true" [...]/>
        <structureNode path="/demosite/trendcities/barcelona" menuText="Barcelona" isActive="true" [...]/>
        <structureNode path="/demosite/trendcities/paris" menuText="Paris" isActive="true" [...]/>
        <structureNode path="/demosite/trendcities/london" menuText="London" isActive="true" [...]/>
        <structureNode path="/demosite/forecasts" menuText="Forecast" isActive="true" [...]/>
        <structureNode path="/demosite/forecasts/2012" menuText="2012" isActive="true" [...]/>
        <structureNode path="/demosite/forecasts/2013" menuText="2013" isActive="true" [...]/>
        <structureNode path="/demosite/timeline" menuText="Timeline" isActive="true" [...] />
        <structureNode path="/demosite/demo" menuText="" isActive="false" [...]/>
    </structure>
    <documents>
        <document nodeType="sophora-nt:structureNodeDocument" externalID="f3d1d72c-5667-493d-899c-338c49e65466">
            <properties />
            <childNodes />
            <resourceList />
            <fields>
                <site>demosite</site>
                <structureNode>/</structureNode>
                <categories />
                <idstem>structurenodedocument</idstem>
                <forceLock>true</forceLock>
                <forceCreate>false</forceCreate>
                <enabledChannels>mediathek</enabledChannels>
                <disabledChannels />
            </fields>
            <instructions>
                <lifecycleActivities />
                <proposals />
                <stickyNotes />
            </instructions>
        </document>
        [...]
    </documents>
</sophora>

Sites and Structure Nodes

The elements <site> and <structureNode> are encapsulated by the <structure> element. In addition to the attributes of a <structureNode> element, the <site> element has the attribute "url". Furthermore, a site contains image variants as child elements <imagevariant>, each of which has to be provided with an image variant's external ID. If the site already exists, the image variants are replaced by those from the XML. This only happens when at least one image variant is given in the import XML. Otherwise the image variants of the existing site are kept.

In the following explanations about the structure import, no distinction is made between sites and structure nodes. They are treated equally.

Properties of Sites and Structure Nodes

The properties of sites and structure nodes are modeled as attributes in the Sophora XML. Only the image variants are modeled as child elements. The table below gives an overview of the existing attributes.

Attribute | Site | Structure node | Description
path | Yes | Yes | The path, including the site.
menuText | Yes | Yes | This text appears in the menus of the deliveries.
url | Yes | No | The URL of the site.
isActive | Yes | Yes | Defines whether the corresponding menu item shall be displayed.
state | Yes | Yes | A structure node's state ("inProcess", "published" or "disabled").
isLiveVersionAvailable | Yes | Yes | Whether a structure node has been published already.
redirectUrl | Yes | Yes | A redirect URL for this structure node.
redirectActive | Yes | Yes | Whether a redirection is activated or not.
structureNodeDocumentExternalId | Yes | Yes | The external ID of the structure node document.
hierarchyDocumentExternalId | Yes | Yes | The external ID of the hierarchy document.
defaultDocumentExternalId | Yes | Yes | The external ID of the index document.

Ordering Structure Nodes in the XML

The order of the structure nodes has to be respected.

Structure node elements have to be specified according to the depth-first search principle. Structure nodes are created in the exact order in which they occur in the XML. Therefore, a child structure node cannot be created before its parent node.
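The parent-before-child requirement can be checked on the list of path attributes before the XML is written. The following Python sketch is an illustration under assumptions: the function name nodes_missing_parent is made up, the list is expected to start with the site path, and only parents inside the same list are considered (parents that already exist in the repository are not taken into account).

```python
import posixpath


def nodes_missing_parent(paths):
    """Return the paths whose parent path does not occur earlier in the list."""
    seen = set()
    problems = []
    for path in paths:
        parent = posixpath.dirname(path)
        # A site such as "/demosite" has the parent "/" and needs no predecessor.
        if parent != "/" and parent not in seen:
            problems.append(path)
        seen.add(path)
    return problems


ordered = [
    "/demosite",
    "/demosite/trendcities",
    "/demosite/trendcities/copenhagen",
    "/demosite/trendcities/copenhagen/cphvision",
    "/demosite/forecasts",
]
print(nodes_missing_parent(ordered))
```

An empty result means every node follows its parent. The sketch does not verify a strict depth-first order beyond that.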

Merging Structure Node Properties

When a structure node that already exists is imported, the properties given in the XML and in the repository are merged. An overview of the possible cases is given below. The new version of the structure node refers to the import XML and the old version is the one from the repository.

  1. A property is set in the new version but not in the old: The property is set (initially) in the repository.
  2. A property already exists in the old version and
     a) the corresponding attribute does not exist in the new version: The property is not modified.
     b) the corresponding attribute is present but contains an empty string: The property is deleted.
  3. A property exists in both the new and the old version: The property value in the repository is overwritten by the new version's value.
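The cases above can be expressed as a small function over two dictionaries. This Python sketch only illustrates the rules and is not the Importer's implementation; the function name merge_structure_node_properties is made up, and treating an empty string for a property that does not exist yet as "nothing to do" is an assumption, since the cases above do not cover it.

```python
def merge_structure_node_properties(old, new):
    """Merge attribute values from the import XML (new) into the repository state (old)."""
    merged = dict(old)  # case 2a: attributes missing in the XML stay untouched
    for name, value in new.items():
        if value == "":
            # case 2b: an empty string deletes an existing property
            # (assumption: no effect if the property does not exist)
            merged.pop(name, None)
        else:
            # cases 1 and 3: set initially or overwrite
            merged[name] = value
    return merged


old = {"menuText": "Home", "redirectUrl": "http://localhost:8080/old", "isActive": "true"}
new = {"menuText": "Start", "redirectUrl": "", "state": "inProcess"}
print(merge_structure_node_properties(old, new))
```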

Merging structure node documents works in the same way as the merge of common documents (see section Managing the Update Behaviour of Documents).

Merging Structures

With an import operation you cannot delete structure nodes. If a structure node in the repository has child nodes but the import XML lacks these child nodes, they remain unmodified in the repository.

The Status of a Structure Node

The status of a new structure node depends on the status of its structure node document. If there is no structure node document, the structure node's status is taken. The decision is made according to the value of the attribute isLiveVersionAvailable of the <structureNode> element. If it is set to true, the structure node will be published.

The procedure implies that the structure after importing does not have the exact same status as the structure that has been exported. This is due to the export mechanism for structure nodes: While exporting a structure node its current status is exported as well. The last live version is not considered.

Importing Disabled Structure Nodes

It is possible to import disabled structure nodes. In that case the attribute state of a <structureNode> element has the value "disabled". If a structure node is enabled in the new version (the version that updates the existing one in the repository) but the old version is disabled, the structure node will be updated so that it is enabled afterwards.

Importing Structure Nodes That Are Marked as Deleted

Structure nodes in Sophora cannot be deleted immediately. Instead, they are marked as deleted. The same happens to their structure node documents.

If you import a structure node that is marked as deleted in the repository, it will be recovered. If the new version of the structure node has no reference to a structure node document, a new one is created and referenced. However, if an external ID of the structure node document is provided in the new version and it matches the external ID of the old structure node document, the old structure node document is recovered as well. Subsequently, both structure node documents are merged (which works in the same way as for normal documents; see above). This is only valid if no alternative rules are specified (see Managing the Update Behaviour of Documents).

Referencing Documents

During the import operation the Importer tries to refer to the default document, to the structure node document and to the hierarchy document using the corresponding UUIDs. If such a document does not exist (yet) in the repository, an external reference is established (using the external ID). As soon as the referenced document is imported, the external reference is removed and the UUID is used instead.

Importing Multiple Sites

It is possible to import an entire structure tree with all sites from a repository.

Importing Users and Roles

Importing Roles

To import roles, the root element of an import XML file has to be <sophora>. This element must contain a child element <roles>, which encapsulates the <role> elements. Each <role> element represents one role to import. Since version 2.4 of the Sophora import XML, a role is identified by its roleId; in older versions, the name is the identifier. Since 2.4 the name must also be unique: if a role with the given name already exists but has a different roleId, the import fails. If a role with the given roleId already exists, it is overwritten with the role specified in the XML.

A role consists of the parts <systemPermissions>, <structureNodePermissions>, <documentPermissions>, <tabPermissions> and <proposalSectionPermissions>. For further information about these permissions see the documentation for administrators. Each of these parts consists of a set of concrete permissions. If a role grants, for example, all system permissions, not every permission needs to be listed in the XML. Instead, the pseudo permission 'all' can be used.

A structure node permission specifies the individual permissions per structure node. These permissions can be passed on to sub nodes by using the attribute applyToAllSubNodes="true".

If the role exists, the permissions are overwritten. After the import, the role will have only the permissions from the import XML. If e.g. a proposal section exists in the repository but is not mentioned in the import XML, or has no <permission> elements in the XML, the role will have no permissions for it after the import.

If referenced nodetypes or referenced structure nodes do not exist, they are ignored.

The following example shows an import XML for importing a role:

<?xml version="1.0" encoding="UTF-8"?>
<sophora xmlns="http://www.sophoracms.com/import/2.8"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
[...]
 <roles>
    [...]
    <role name="example" roleId="exampleRoleId">
      <systemPermissions>
        <permission>breakLock</permission>
        <permission>editCategory</permission>
        <permission>editHtmlParagraph</permission>
      </systemPermissions>
 
      <structureNodePermissions>
        <structureNodePermission structureNode="/demosite">
          <permission>all</permission>
        </structureNodePermission>
        <structureNodePermission structureNode="/demosite/home">
          <permission>editStructure</permission>
          <permission>editNavigation</permission>
          <permission>editConfiguration</permission>
        </structureNodePermission>
        <structureNodePermission structureNode="/demosite/trendcities" applyToAllSubNodes="true">
          <permission>readDocuments</permission>
        </structureNodePermission>
        [...]
      </structureNodePermissions>
 
      <documentPermissions>
        <documentPermission nodetype="sophora-content-nt:story">
          <permission>all</permission>
        </documentPermission>
        <documentPermission nodetype="sophora-demo-nt:basicfields">
          <permission>release</permission>
          <permission>publish</permission>
          <permission>restore</permission>
          <permission>delete</permission>
          <permission>save</permission>
          <permission>create</permission>
          <permission>read</permission>
        </documentPermission>
        <documentPermission nodetype="sophora-content-nt:filter">
          <permission>restore</permission>
          <permission>delete</permission>
        </documentPermission>
        [...]
      </documentPermissions>
 
      <tabPermissions>
        <tabPermission tabExternalId="6729d6b8-cd5b-3de4-b835-0963e3062d44">
          <permission>all</permission>
        </tabPermission>
        <tabPermission tabExternalId="56dc55eb-56c8-34b2-8fe9-39956895bb36">
          <permission>readTab</permission>
        </tabPermission>
        <tabPermission tabExternalId="external_id_tab_0001">
          <permission>readTab</permission>
        </tabPermission>
        [...]
      </tabPermissions>
 
      <proposalSectionPermissions>
        <proposalSectionPermission>
          <proposalSection>
            <path>homepage</path>
          </proposalSection>
          <permission>readProposals</permission>
        </proposalSectionPermission>
        <proposalSectionPermission>
          <proposalSection>
            <path>homepage</path>
            <path>readtopublish</path>
          </proposalSection>
          <permission>addProposals</permission>
        </proposalSectionPermission>
        <proposalSectionPermission>
          <proposalSection>
            <path>news</path>
          </proposalSection>
          <permission>all</permission>
        </proposalSectionPermission>
        <proposalSectionPermission>
          <proposalSection>
            <path>news</path>
            <path>sport</path>
          </proposalSection>
          <permission>readProposals</permission>
          <permission>editProposals</permission>
          <permission>addProposals</permission>
        </proposalSectionPermission>
        <proposalSectionPermission>
          <proposalSection>
            <path>news</path>
            <path>sport</path>
            <path>handball</path>
          </proposalSection>
          <permission>readProposals</permission>
          <permission>editProposals</permission>
        </proposalSectionPermission>
        <proposalSectionPermission applyToAllSubSections="true">
          <proposalSection>
            <path>news</path>
            <path>sport</path>
            <path>bundesliga</path>
          </proposalSection>
          <permission>readProposals</permission>
          <permission>editProposals</permission>
          <permission>addProposals</permission>
        </proposalSectionPermission>
      </proposalSectionPermissions>
    </role>
    [...]
  </roles>
</sophora>
Permissions

The following sections list the different permission types and their valid values.

System permissions
  • administrator
  • breakLock
  • deleteReferenced
  • deleteFromTrash
  • editCategory
  • editHtmlParagraph
  • finishPrePublish
  • massImageUpload
  • massOperations
  • setOfflineReferenced
  • all
Structure permissions
  • editDocuments
  • readDocuments
  • editStructure
  • editNavigation
  • editConfiguration
  • publishDefaultDocument
  • all
Document permissions
  • release
  • publish
  • restore
  • delete
  • save
  • create
  • read
  • offline
  • clone
  • protect
  • all
Tab permissions
  • readTab
  • editTab
  • all
Proposal Section permissions
  • readProposals
  • editProposals
  • addProposals
  • deleteProposals
  • all
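
The value lists above lend themselves to a quick check before an import file is handed to the importer. The following Python sketch is one possible way to flag <permission> values that do not appear in the list for their enclosing element. It assumes the element names used in the role example (structureNodePermission, documentPermission, tabPermission, proposalSectionPermission). System permissions are left out because their enclosing element is not shown in this part of the document. The names find_invalid_permissions and SAMPLE are hypothetical and not part of Sophora.

```python
import xml.etree.ElementTree as ET

# Valid values per permission element, copied from the lists above.
VALID_PERMISSIONS = {
    "structureNodePermission": {
        "editDocuments", "readDocuments", "editStructure", "editNavigation",
        "editConfiguration", "publishDefaultDocument", "all",
    },
    "documentPermission": {
        "release", "publish", "restore", "delete", "save", "create",
        "read", "offline", "clone", "protect", "all",
    },
    "tabPermission": {"readTab", "editTab", "all"},
    "proposalSectionPermission": {
        "readProposals", "editProposals", "addProposals",
        "deleteProposals", "all",
    },
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def find_invalid_permissions(xml_text):
    """Return (element name, value) pairs for every unknown permission value."""
    problems = []
    for element in ET.fromstring(xml_text).iter():
        allowed = VALID_PERMISSIONS.get(local_name(element.tag))
        if allowed is None:
            continue  # not one of the permission elements we know about
        for child in element:
            value = (child.text or "").strip()
            if local_name(child.tag) == "permission" and value not in allowed:
                problems.append((local_name(element.tag), value))
    return problems


# Hypothetical input: "writeTab" is deliberately not a valid tab permission.
SAMPLE = """<sophora xmlns="http://www.sophoracms.com/import/2.8">
  <roles>
    <role>
      <tabPermissions>
        <tabPermission tabExternalId="external_id_tab_0001">
          <permission>readTab</permission>
          <permission>writeTab</permission>
        </tabPermission>
      </tabPermissions>
    </role>
  </roles>
</sophora>"""

print(find_invalid_permissions(SAMPLE))
```

The check only compares values against the lists in this document; it does not replace validation against the Sophora XML Schema.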

Importing Users

To import users, the root element of an import XML file has to be <sophora>. This element must contain a child element <users> which encapsulates the <user> elements. Each <user> element represents one user to import. During the import process, each user is identified by its name, so if a user with a given name already exists, this user will be overwritten. Properties that are absent from the XML are not overwritten. For security reasons, passwords are not exported by default. It is nevertheless possible to export the passwords as hashes by activating the corresponding option in the export dialog; these password hashes are imported automatically. Referenced roles or sites that do not exist are ignored.

The following example demonstrates the structure of a user import XML:

<?xml version="1.0" encoding="UTF-8"?>
<sophora xmlns="http://www.sophoracms.com/import/2.8"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  [...]
  <users>
    <user>
      <username>doe</username>
      <passwordChangeable>true</passwordChangeable>
      <firstName>John</firstName>
      <lastName>Doe</lastName>
      <comment>comment</comment>
      <company>Doe Services</company>
      <department>engineering</department>
      <mail>doe@services.com</mail>
      <phone>0123456789</phone>
      <initials>j.d</initials>
      <validUntil>2099-07-26T15:32:00.000+02:00</validUntil>
      <incorrectLogins>2</incorrectLogins>
      <lastLogin>2012-09-07T09:37:53.716+02:00</lastLogin>
      <roles>
        <role name="admin" roleId="adminRoleId"/>
        <role name="ReadOnlyRole" roleId="readOnlyRoleId"/>
      </roles>
      <previews>
        <preview externalID="f3d1d72c-5667-493d-899c-338c49e65466">http://www.example.org/previewurl</preview>
        <preview externalID="550e8400-e29b-41d4-a716-446655440000">http://www.example.org/previewurl2</preview>
      </previews>
    </user>
    [...]
  </users>
  [...]
</sophora>
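
Because properties that are absent from the XML are not overwritten, an import file can be limited to the values that should change. The following Python sketch illustrates this under a few assumptions: the helper name build_user_update is hypothetical, only the simple text elements from the example above are supported, and the elements are written in the order in which they appear in that example (whether the schema enforces this order is not stated here).

```python
import xml.etree.ElementTree as ET

NS = "http://www.sophoracms.com/import/2.8"

# Simple text elements of <user>, in the order used by the example above.
USER_FIELDS = [
    "passwordChangeable", "firstName", "lastName", "comment", "company",
    "department", "mail", "phone", "initials", "validUntil",
]


def build_user_update(username, **changes):
    """Build a user import XML that contains only the given properties."""
    unknown = set(changes) - set(USER_FIELDS)
    if unknown:
        raise ValueError(f"unsupported user fields: {sorted(unknown)}")

    ET.register_namespace("", NS)  # serialize NS as the default namespace
    root = ET.Element(f"{{{NS}}}sophora")
    users = ET.SubElement(root, f"{{{NS}}}users")
    user = ET.SubElement(users, f"{{{NS}}}user")
    # The user name identifies the user to create or overwrite.
    ET.SubElement(user, f"{{{NS}}}username").text = username
    for field in USER_FIELDS:
        if field in changes:
            ET.SubElement(user, f"{{{NS}}}{field}").text = str(changes[field])
    return ET.tostring(root, encoding="unicode")


# Only the mail address is present, so all other properties stay untouched.
print(build_user_update("doe", mail="john.doe@example.org"))
```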

Importing Categories

To import categories, the root element of an import XML file has to be <sophora>. This element must contain a child element <categories> which encapsulates the <category> elements. Each <category> element represents one category. The content of a <category> element has to contain the entire path of the category, with the path elements separated by semicolons. If a given category already exists in the repository, it is ignored during the import process.

The following example demonstrates the XML of a category import:

<?xml version="1.0" encoding="UTF-8"?>
<sophora xmlns="http://www.sophoracms.com/import/2.8"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <categories>
    <category>Styles</category>
    <category>Styles;Casual</category>
    <category>Styles;Catwalk</category>
    <category>Styles;Feminine</category>
    <category>Styles;Formal</category>
    <category>Styles;Printed / Embellished</category>
    <category>Types</category>
    <category>Types;Accessories</category>
    <category>Types;Accessories;Bags</category>
    <category>Types;Accessories;Belts</category>
    <category>Types;Accessories;Jewellery</category>
    <category>Types;Accessories;Shoes</category>
    <category>Types;Blouses</category>
    <category>Types;Dresses</category>
    <category>Types;Jackets</category>
    <category>Types;Knitwear</category>
    <category>Types;Outerwear</category>
    <category>Types;Prints</category>
    <category>Types;Shirts</category>
    <category>Types;Skirts</category>
    <category>Types;Sweatshirts</category>
    <category>Types;T-Shirts</category>
  </categories>
</sophora>
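
When the categories already exist as a tree in another system, the semicolon-separated paths can be generated instead of being written by hand. The following Python sketch shows one way to do this. The nested dictionary input and the function names category_paths and build_category_import are assumptions made for illustration; category names that themselves contain a semicolon are not handled.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sophoracms.com/import/2.8"


def category_paths(tree, prefix=()):
    """Yield every category as a semicolon-separated path, parents first."""
    for name, children in tree.items():
        path = prefix + (name,)
        yield ";".join(path)
        yield from category_paths(children, path)


def build_category_import(tree):
    """Wrap the generated paths in <sophora><categories>...</categories>."""
    ET.register_namespace("", NS)  # serialize NS as the default namespace
    root = ET.Element(f"{{{NS}}}sophora")
    categories = ET.SubElement(root, f"{{{NS}}}categories")
    for path in category_paths(tree):
        ET.SubElement(categories, f"{{{NS}}}category").text = path
    return ET.tostring(root, encoding="unicode")


# A small excerpt of the category tree from the example above.
tree = {
    "Styles": {"Casual": {}, "Formal": {}},
    "Types": {"Accessories": {"Bags": {}, "Belts": {}}, "Blouses": {}},
}
for path in category_paths(tree):
    print(path)
```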

Importing Proposal Sections

It is also possible to import an entire structure of proposal sections. To import proposal sections, the root element of an import XML file has to be <sophora>. This element must contain a child element <proposalSections> which encapsulates the <proposalSection> elements. Each <proposalSection> element represents one proposal section to import. During the import process, each proposal section is identified by its full path, so a proposal section whose name and path already exist will not be imported.

The following example demonstrates the structure of a proposal section import XML:

<?xml version="1.0" encoding="UTF-8"?>
<sophora xmlns="http://www.sophoracms.com/import/2.8"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <proposalSections>
    <proposalSection>
      <path>broken-links</path>
    </proposalSection>
    <proposalSection>
      <path>Homepage</path>
    </proposalSection>
    <proposalSection>
      <path>Homepage</path>
      <path>News</path>
    </proposalSection>
    <proposalSection>
      <path>Homepage</path>
      <path>Contact</path>
    </proposalSection>
  </proposalSections>
</sophora>

Mergeable Properties and Child Nodes

Properties or child nodes that are marked as mergeable will not be imported by the importer. Instead they will be written into a merge info field of the imported document. This field will be used by the DeskClient to propose those properties and child nodes to be merged manually.

To mark properties or child nodes as mergeable, set the attribute merge to true:

<?xml version="1.0" encoding="UTF-8"?>
<documents xmlns="http://www.sophoracms.com/import/3.4">
  <document externalID="$externalId1$">
    <properties>
      <property name="sophora-content:headline" merge="true">
        <value>The new headline</value>
      </property>
    </properties>
    <childNodes>
      <childNode nodeType="sophora-extension-nt:copytext" name="sophora-content:copytext" merge="true">
        <properties />
        <childNodes>
          <childNode nodeType="sophora-extension-nt:paragraph" name="sophora-extension:paragraph">
            <properties>
              <property name="sophora-extension:style">
                <value>headline</value>
              </property>
              <property name="sophora-extension:text">
                <value>...</value>
              </property>
            </properties>
            <childNodes />
            <resourceList />
          </childNode>
        </childNodes>
        <resourceList />
      </childNode>
      <childNode nodeType="sophora-content-nt:storyref" name="sophora-content:teaser" merge="true">
        <properties>
          <property name="sophora:reference">
            <value>eeeed196-343f-4c8a-bbe1-632bb0ef5fa9</value>
          </property>
        </properties>
        <childNodes />
        <resourceList />
      </childNode>
      <childNode nodeType="sophora-content-nt:storyref" name="sophora-content:teaser" merge="true">
        <properties>
          <property name="sophora:reference">
            <value>3ea5af02-b6e3-4609-84f0-1e2b67a357f1</value>
          </property>
        </properties>
        <childNodes />
        <resourceList />
      </childNode>
    </childNodes>
    <resourceList />
    <fields>
      [...]
    </fields>
    <instructions>
      [...]
    </instructions>
  </document>
</documents>
Note that only properties, the copytext and component lists (child nodes with a reference type on the document level) can be marked as mergeable.
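
Since merge-marked content is not imported directly, it can help to see at a glance which parts of an import file carry the marker. The following Python sketch lists every <property> and <childNode> with merge="true". The function name merge_marked and the shortened sample document are assumptions made for illustration; the sketch does not check whether a marked element is actually allowed to be mergeable.

```python
import xml.etree.ElementTree as ET


def merge_marked(xml_text):
    """Return (element, name) pairs for everything carrying merge="true"."""
    marked = []
    for element in ET.fromstring(xml_text).iter():
        kind = element.tag.rsplit("}", 1)[-1]  # drop the namespace prefix
        if kind in ("property", "childNode") and element.get("merge") == "true":
            marked.append((kind, element.get("name")))
    return marked


# Shortened, hypothetical variant of the example above.
SAMPLE = """<documents xmlns="http://www.sophoracms.com/import/3.4">
  <document externalID="example-1">
    <properties>
      <property name="sophora-content:headline" merge="true">
        <value>The new headline</value>
      </property>
    </properties>
    <childNodes>
      <childNode nodeType="sophora-content-nt:storyref"
                 name="sophora-content:teaser" merge="true">
        <properties>
          <property name="sophora:reference">
            <value>eeeed196-343f-4c8a-bbe1-632bb0ef5fa9</value>
          </property>
        </properties>
      </childNode>
    </childNodes>
  </document>
</documents>"""

for kind, name in merge_marked(SAMPLE):
    print(kind, name)
```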

Asking the user to release a document lock

If the importer tries to import a document that is locked by another Sophora user, it can ask that user to release the lock through a dialog shown in the DeskClient. To use this feature, include the forceLock field with a timeout attribute given in minutes. Until the timeout is reached, the importer shows the dialog to the user every two minutes. During this time, the importer continues to process other imports as long as they do not change the same document. Once the user releases the document lock, the importer continues with the blocked import. In the example below, the importer will try to obtain the lock for 10 minutes:

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content-nt:imageobject"
          xmlns="http://www.sophoracms.com/import/2.8">
  [...]
  <fields>
    <site>news</site>
    <structureNode>/multimedia/bilder</structureNode>
    <idstem>bundesliga</idstem>
    <forceLock timeout="10">true</forceLock>
    <forceCreate>false</forceCreate>
    <enabledChannels />
    <disabledChannels />
  </fields>
</document>

For more details, see the description of the forceLock element.

Channels

The document can be activated or deactivated for specific channels. This is done using the <channels> element in the <fields> element:

<?xml version="1.0" encoding="UTF-8"?>
<document nodeType="sophora-content-nt:imageobject"
          xmlns="http://www.sophoracms.com/import/3.7">
  [...]
  <fields>
    [...]
    <channels>
      <enabledChannels remove="true">
        <channel>mediathek</channel>
        <channel startDate="2012-09-07T09:37:53.716+02:00" endDate="2012-10-07T09:37:53.716+02:00">teletext</channel>
      </enabledChannels>
      <disabledChannels remove="true">
        <channel>rss</channel>
      </disabledChannels>
    </channels>
    [...]
  </fields>
</document>

The <enabledChannels> element contains zero or more <channel> elements. Each <channel> element specifies the name of a channel which the document should be activated for.

The optional remove attribute of the <enabledChannels> element can be used to reset existing channels.

The optional startDate and endDate attributes of the <channel> element can be used to enable the channel during that time only.

The <disabledChannels> element contains zero or more <channel> elements. Each <channel> element specifies the name of a channel which the document should be deactivated for.

The optional remove attribute of the <disabledChannels> element can be used to reset existing channels.
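
The time-limited activation is easy to get wrong when the timestamps are typed by hand. The following Python sketch builds the <channels> fragment from datetime values and writes them in the format used in the example above. The function names channel_element and channels_element are hypothetical, and the fragment is created without the import namespace, so it would still have to be placed inside the <fields> element of a namespaced document.

```python
from datetime import datetime, timedelta, timezone
import xml.etree.ElementTree as ET


def channel_element(name, start=None, end=None):
    """Create a <channel> element, optionally limited to a time span."""
    channel = ET.Element("channel")
    channel.text = name
    for attribute, moment in (("startDate", start), ("endDate", end)):
        if moment is None:
            continue
        if moment.tzinfo is None:
            raise ValueError(f"{attribute} needs a UTC offset")
        # Produces e.g. 2012-09-07T09:37:53.716+02:00
        channel.set(attribute, moment.isoformat(timespec="milliseconds"))
    return channel


def channels_element(enabled=(), disabled=(), remove=False):
    """Create <channels> with <enabledChannels> and <disabledChannels>."""
    channels = ET.Element("channels")
    for tag, items in (("enabledChannels", enabled),
                       ("disabledChannels", disabled)):
        group = ET.SubElement(channels, tag)
        if remove:
            group.set("remove", "true")
        group.extend(items)
    return channels


offset = timezone(timedelta(hours=2))
teletext = channel_element(
    "teletext",
    start=datetime(2012, 9, 7, 9, 37, 53, 716000, tzinfo=offset),
    end=datetime(2012, 10, 7, 9, 37, 53, 716000, tzinfo=offset),
)
fragment = channels_element(
    enabled=[channel_element("mediathek"), teletext],
    disabled=[channel_element("rss")],
    remove=True,
)
print(ET.tostring(fragment, encoding="unicode"))
```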

Sophora XML Schema

The Sophora XML Schema is available for download: sophora-import-3.8.0.xsd