Import XML

Managing the Update Behaviour of Documents

How to use the Importer's options when updating existing documents, e.g. to remove childnodes or to merge new and existing childnodes.

When an existing document is updated (e.g. identified by its external ID; see the previous section), the default behaviour is as follows:

  • Properties in the repository are merged with those from the import XML, with the imported properties taking precedence. That is, if a property exists in both the XML and the repository, the value from the XML overrides the value in the repository. Properties that are not mentioned in the XML remain untouched. New properties (which do not exist in the repository yet) are added.
  • If a document in the XML file has one or more childnodes with a certain name (attribute "name"), all childnodes with that name in the corresponding repository document are removed and replaced by the childnodes from the XML.
  • If the repository document has childnodes with a certain name and the corresponding XML document contains no childnode with that name, the existing repository childnodes remain untouched.
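
As a minimal sketch of the default behaviour, consider the following fragment of an import document. It contains only the <childNodes> element; the enclosing <document> element and its other parts are omitted. The childnode name, node type and reference value are borrowed from the podcast examples on this page and serve purely as an illustration:

```xml
<!-- Fragment only: the enclosing <document> element is omitted. -->
<!-- No <updateBehaviour> is given, so the default behaviour applies. -->
<childNodes>
  <!-- Existing repository childnodes named "sophora-content:audiolist"
       are removed and replaced by this single childnode. -->
  <childNode nodeType="sophora-content-nt:audioref" name="sophora-content:audiolist">
    <properties>
      <property name="sophora:reference">
        <value>podcast_audio_42</value>
      </property>
    </properties>
    <childNodes />
    <resourceList />
  </childNode>
  <!-- Childnodes with other names (e.g. "sophora-content:videolist")
       are not mentioned here and therefore remain untouched. -->
</childNodes>
```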

Explicitly Removing Existing Childnodes

If you wish to remove childnodes with a certain name from an existing document, add an <updateBehaviour> element to the <childNodes> element. Inside it, add a <childNode> element and specify its "name" attribute as well as its "behaviour" attribute, which must be set to "delete": behaviour="delete".

In the following example, all audio and video childnodes are removed from a podcast document (external ID "podcast_mittags_vorgelesen"):

<?xml version="1.0" encoding="UTF-8"?>
<document externalID="podcast_mittags_vorgelesen"
          xmlns="http://www.sophoracms.com/import/4.2"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <updateBehaviour>
      <childNode name="sophora-content:audiolist" behaviour="delete" />
      <childNode name="sophora-content:videolist" behaviour="delete" />
    </updateBehaviour>
  </childNodes>
  <resourceList/>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

Merging Existing and New Childnodes

If you want to merge childnodes of an existing document with childnodes from an import XML, use the <updateBehaviour> element within <childNodes> to define this behaviour. To do so, add a <childNode> element and set its "behaviour" attribute to "merge". To specify which childnodes are merged, set the mandatory "name" attribute of the <childNode> element to the name of the childnodes to be merged.

How does the Importer determine which childnode from the XML corresponds to which childnode in the repository?

  • If a childnode id is given in the XML (property sophora:childnodeId), the matching childnode in the repository is identified by this id and merged.
  • If no childnode id is given in the XML, matching childnodes are identified by the document they reference: a childnode in the import XML and a childnode in the repository document that reference the same document are merged (and the reference is not added twice). This procedure applies only if the childnode is a reference node.
  • Alternatively, you can define the identity of childnodes (from the import XML and the repository document) by a merge property (attribute mergeProperty="PROPERTY_NAME"). This can be useful if rows of a dynamic table or the content of boxes should be updated.
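
The first of these rules, matching by childnode id, might look roughly like the fragment below. This is a sketch under assumptions: it assumes that sophora:childnodeId is written as an ordinary <property> of the childnode, and the id value shown is a made-up placeholder. The childnode name and node type are borrowed from the podcast examples on this page, and the enclosing <document> element is omitted:

```xml
<!-- Fragment only: the enclosing <document> element is omitted. -->
<childNodes>
  <updateBehaviour>
    <childNode name="sophora-content:audiolist" behaviour="merge" />
  </updateBehaviour>
  <childNode nodeType="sophora-content-nt:audioref" name="sophora-content:audiolist">
    <properties>
      <!-- Assumption: the childnode id is given as a regular property.
           The value is a hypothetical placeholder. -->
      <property name="sophora:childnodeId">
        <value>example-childnode-id-0001</value>
      </property>
      <!-- Merged into the repository childnode that has the same id. -->
      <property name="sophora-content:headline">
        <value>Updated headline</value>
      </property>
    </properties>
    <childNodes />
    <resourceList />
  </childNode>
</childNodes>
```
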

The optional attribute "insertPosition" configures whether an additional (new) childnode is added at the beginning or at the end of the list of childnodes. Possible values are "start" and "end"; "end" is the default and is used when the attribute is omitted.

The attribute "maxNumber" is also optional. It defines the maximum number of childnodes with this name. If the maximum number is exceeded, the list of childnodes with this name is shortened accordingly (at the beginning or at the end), depending on the insert position ("start" or "end").

The following example shows the standard way of merging via the Sophora reference property (no childnode ids are given in the XML). It shows a podcast document (external ID "podcast_mittags_vorgelesen") to which an audio (ID "podcast_audio_42") is added as a childnode. Existing audios should not be affected or modified. Furthermore, the podcast should contain no more than 30 audios, and new ones are added at the beginning:

<?xml version="1.0" encoding="UTF-8"?>
<document externalID="podcast_mittags_vorgelesen"
          xmlns="http://www.sophoracms.com/import/4.2"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <updateBehaviour>
      <childNode name="sophora-content:audiolist" behaviour="merge" insertPosition="start" maxNumber="30" />
    </updateBehaviour>
    <childNode nodeType="sophora-content-nt:audioref" name="sophora-content:audiolist">
      <properties>
        <property name="sophora:reference">
          <value>podcast_audio_42</value>
        </property>
        <property name="sophora-content:headline">
          <value>Overridden headline</value>
        </property>
      </properties>
      <childNodes />
      <resourceList />
    </childNode>
  </childNodes>
  <resourceList/>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

If the podcast already contains a childnode referencing the audio "podcast_audio_42", it is not added a second time. Instead, the properties are merged; in this case, the "sophora-content:headline" property is overridden.

The next example shows a merge using an arbitrary property. Consider a document with the external ID "icehockey_vancouver_group_a". It contains a dynamic table whose rows hold match results, and these rows are to be updated. Assume that the rows have the childnode name "olympia:matchRow" and the node type "olympia-nt:olyMatchRow". The property "olympia:matchId" is used as the mergeProperty.

<?xml version="1.0" encoding="UTF-8"?>
<document externalID="icehockey_vancouver_group_a"
          xmlns="http://www.sophoracms.com/import/4.2"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <updateBehaviour>
      <childNode name="olympia:matchRow"
                 behaviour="merge"
                 mergeProperty="olympia:matchId" />
    </updateBehaviour>
    <childNode nodeType="olympia-nt:olyMatchRow" name="olympia:matchRow">
      <properties>
        <property name="olympia:result">
          <value>3:1</value>
        </property>
        <property name="olympia:matchId">
          <value>eh_v_gr_a_0001</value>
        </property>
      </properties>
      <childNodes />
      <resourceList />
    </childNode>
  </childNodes>
  <resourceList/>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

The next example shows a merge over multiple levels of childnodes. Here, a box is assigned to an article, and the box contains a reference to a citation document. Updating the citation document requires a merge on two levels: On the first level, the box needs an identifier, which is provided by setting mergeProperty="sophora-extension:title". On the second level, the Importer's standard behaviour applies because the childnodes can be identified by the reference property "sophora:reference". Therefore, no explicit merge property is necessary.

<?xml version="1.0" encoding="UTF-8"?>
<document externalID="test_00001"
          xmlns="http://www.sophoracms.com/import/4.2"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <properties>
    [...]
  </properties>
  <childNodes>
    <updateBehaviour>
      <childNode name="core:box"
                 behaviour="merge"
                 mergeProperty="sophora-extension:title" />
    </updateBehaviour>
    <childNode nodeType="core-nt:citationBox" name="core:box">
      <properties>
        <property name="sophora-extension:title">
          <value>Zitate</value>
        </property>
      </properties>
      <childNodes>
        <updateBehaviour>
          <childNode name="sophora-extension:teaser" behaviour="merge" />
        </updateBehaviour>
        <childNode nodeType="core-nt:citationRef" name="sophora-extension:teaser">
          <properties>
            <property name="sophora:reference">
              <value>citation_007</value>
            </property>
            <property name="core:headline">
              <value>Shaken or stirred</value>
            </property>
            <property name="sophora:overridingProperties">
              <value>core:headline</value>
            </property>
          </properties>
          <childNodes />
          <resourceList />
        </childNode>
      </childNodes>
      <resourceList />
    </childNode>
  </childNodes>
  <resourceList/>
  <fields>
    [...]
  </fields>
  <instructions>
    [...]
  </instructions>
</document>

While merging, individual childnodes can be removed selectively by setting a childnode's "remove" attribute to "true". Every childnode on this level with the same identity is removed, i.e. every childnode that references the same document or whose merge property has the same value (if an <updateBehaviour> with a corresponding merge property is configured for this childnode name).

The last example of this section explains how the explicit removal of a childnode works. The following XML snippet specifies three different actions:

  1. Removing a reference to a text teaser: On the first level of the childnodes, it is defined that all of the document's childnodes named "core:textteaser" are removed if they refer to the document with the external ID "story_9908".
  2. Removing rows from a dynamic table for audio metadata: Also on the first level, it is defined that all childnodes named "core:audiodata" are removed if their merge property "core:name" has the value "audio_file_12432".
  3. Removing audio references from an audio box: On the second level of the childnodes, it is defined that childnodes named "core:teaser" which reference the document ID "con2748038" are removed if they are childnodes of a childnode "core:box" whose merge property "core:title" has the value "Test".

<?xml version="1.0" encoding="UTF-8"?>
<document xmlns="http://www.sophoracms.com/import/4.2" externalID="audio_4711">
  <properties />
  <childNodes>
    <updateBehaviour>
      <childNode name="core:audiodata" behaviour="merge" insertPosition="end" mergeProperty="core:name" />
      <childNode name="core:box" behaviour="merge" insertPosition="end" mergeProperty="core:title" />
      <childNode name="core:textteaser" behaviour="merge" />
    </updateBehaviour>
    <childNode nodeType="core-nt:audiodata" name="core:audiodata" remove="true">
      <properties>
        <property name="core:name">
          <value>audio_file_12432</value>
        </property>
      </properties>
      <childNodes />
      <resourceList />
    </childNode>
    <childNode nodeType="core-nt:audiolist" name="core:box">
      <properties>
        <property name="core:title">
          <value>Test</value>
        </property>
      </properties>
      <childNodes>
        <updateBehaviour>
          <childNode name="core:teaser" behaviour="merge" />
        </updateBehaviour>
        <childNode nodeType="core-nt:audioref" name="core:teaser" remove="true">
          <properties>
            <property name="sophora:reference">
              <value>con2748038</value>
            </property>
          </properties>
          <childNodes />
          <resourceList />
        </childNode>
      </childNodes>
      <resourceList />
    </childNode>
    <childNode nodeType="core-nt:storyref" name="core:textteaser" remove="true">
      <properties>
        <property name="sophora:reference">
          <value>story_9908</value>
        </property>
      </properties>
      <childNodes />
      <resourceList />
    </childNode>
  </childNodes>
  <resourceList />
  <fields>
    [...]
  </fields>
  <instructions>
    <lifecycleActivities />
    <proposals />
    <stickyNotes />
  </instructions>
</document>

Last modified on 10/16/20

The content of this page is licensed under the CC BY 4.0 License. Code samples are licensed under the MIT License.
