Sophora Backup and Recovery Tutorial

An overview of how to back up and recover content within, and with the help of, Sophora.

System Overview

Which Components Are Required for a Backup?

Architectural Viewpoint

  • Master
  • Fall-back slave – The fall-back slave serves as a substitute when the master is temporarily unavailable, for example while the master is being updated.
  • Backup system – Technically also a fall-back slave, but with a different role: it should run exclusively as a backup system.
  • Staging slaves (n instances)

From the Components' Point of View

Each master and each archive repository consists of:

  • Repository database
  • Repository with the Lucene index within the file system

General Remarks

Lucene Index and Database

When a database backup of the repository is created, the Lucene index has to be kept in sync with it. This can be achieved by restoring a Lucene index backup from the same point in time, e.g. a copy of the entire repository directory including configuration files. Alternatively, the entire index can be deleted so that it is rebuilt from scratch when the system is restarted. Rebuilding may take a long time, depending on the size of the repository, so this is usually not viable.
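
The index deletion can be scripted. The following is a minimal sketch, assuming Jackrabbit's default layout in which the search index lives in index directories below the repository; the exact paths depend on your configuration:

    import shutil
    from pathlib import Path

    SOPHORA_HOME = Path("/opt/sophora")  # assumption: adjust to your installation

    def delete_search_index(repository_dir: Path) -> None:
        """Remove the Lucene index so it is rebuilt on the next start.

        Only run this while the server is stopped; rebuilding can take a
        long time on large repositories.
        """
        # Assumption: default Jackrabbit index locations.
        for index_dir in (repository_dir / "index",
                          repository_dir / "workspaces" / "default" / "index"):
            if index_dir.exists():
                shutil.rmtree(index_dir)

    delete_search_index(SOPHORA_HOME / "repository")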

Data Replication

Sophora uses the Java Message Service (JMS; refer to http://www.oracle.com/technetwork/java/index-jsp-142945.html) to transport content from the master repository to the delivery servers. The transport mechanism is thus not located at the database level but within the application. As its JMS implementation, Sophora uses ActiveMQ (see http://activemq.apache.org/).

Two use cases have to be distinguished:

  1. Replication of content for the fall-back or backup system
  2. Publishing content (i.e. staging)

The replication operates in a master-slave configuration: changes in the master repository are replicated to all connected slave systems.

Besides the documents concerned, the replicated data also contains system properties such as user information and node types. This ensures that the fall-back system is a complete copy of the master.

In normal mode the main system acts as master and the fall-back systems act as slaves. If the master system is unavailable (for whatever reason), the fall-back system can temporarily be declared the master. Testing the availability and switching the master and slave roles is done manually by a Sophora administrator, who is also responsible for informing the editorial staff which system currently acts as master. Editorial journalists are supposed to work on the master system exclusively.

Once the designated master system is available again, it initially takes the role of a slave and needs to synchronise with the fall-back system. To guarantee a complete synchronisation, both systems have to be locked for a short period. Afterwards the roles can be switched back. Do not forget to inform the editorial office about the switchover so that they continue working on the correct master.

The ActiveMQ database does not need to be backed up since it is only used for a short time while the master system is down. In the ideal case the queue stored in this database is empty, which implies that the connected slaves are synchronised. If that is not the case, restoring a filled queue after a breakdown is more destructive than helpful, since the slaves would then not synchronise correctly.
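
Cleaning the queue can be as simple as removing the broker's persistent store while everything is stopped. The following sketch assumes an embedded broker persisting to a KahaDB directory; the actual location depends on your ActiveMQ configuration:

    import shutil
    from pathlib import Path

    # Assumption: the broker persists its queue in a KahaDB directory;
    # the real location depends on your ActiveMQ configuration.
    ACTIVEMQ_DATA = Path("/opt/sophora/activemq-data")

    def clean_activemq_store() -> None:
        """Drop the persisted queue so slaves resynchronise cleanly.

        Run only while the master and the broker are stopped; keeping or
        restoring a stale queue would corrupt the slaves' synchronisation.
        """
        if ACTIVEMQ_DATA.exists():
            shutil.rmtree(ACTIVEMQ_DATA)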

Archive Repository

The entire system can run even without an archive repository. An archive repository only contains older, non-live versions of documents. This repository may therefore be deleted.

Creating a Backup

A component's backup needs to be created in a consistent state. This is achieved by creating the backup while the component is shut down, so that no inconsistencies can occur while the backup process is running.

Backing Up a Master

The following directories need to be saved:

  • <sophora.home>/repository
  • <sophora.home>/repository_archive
  • <sophora.home>/config

If the database for Jackrabbit has been configured as an embedded database, the database files are located in the directories repository and repository_archive and are thus included in the backup. Otherwise, the database files of the main repository as well as those of the archive repository have to be backed up additionally.
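
A minimal sketch of such a master backup, assuming the server has already been shut down (the installation and backup paths are assumptions):

    import shutil
    from datetime import datetime
    from pathlib import Path

    SOPHORA_HOME = Path("/opt/sophora")    # assumption
    BACKUP_ROOT = Path("/backup/sophora")  # assumption

    def backup_master() -> Path:
        """Copy the master's directories into a timestamped backup folder.

        The server must be shut down first so the backup is consistent.
        With an embedded database the database files live inside
        'repository' and 'repository_archive' and are covered here; an
        external database must be dumped separately.
        """
        target = BACKUP_ROOT / datetime.now().strftime("master-%Y%m%d-%H%M%S")
        for name in ("repository", "repository_archive", "config"):
            shutil.copytree(SOPHORA_HOME / name, target / name)
        return target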

Backing Up a Staging Slave

There are two ways to create a slave's backup:

  • A staging slave that is going to be backed up needs to be disconnected from the delivery and shut down. Afterwards, its data is copied to the backup location and the slave can be restarted. The staging slave then synchronises automatically with the master server to catch up with changes that happened in the meantime. Such a backup can be automated with a script in order to refresh it at regular intervals, e.g. every 24 hours (a sketch of such a script follows below, after the list of delivery files).
  • A separate staging slave, which never actually connects to the delivery, can be run concurrently as a dedicated backup staging slave. It synchronises like any other staging slave but never forwards the received data to a delivery. The actual backup procedure is the same as in the first alternative, with the benefit that a potential outage of the delivery is avoided entirely.

In either case, the following directories of the staging slave have to be saved:

  • <sophora.home>/repository
  • <sophora.home>/config

For each configured/connected delivery (i.e. Tomcat context), the following files additionally have to be saved:

  • Docbase of the Tomcat context containing the templates
  • <localConfigDirectory> (as defined in the Tomcat context's configuration)
  • <sophora.delivery.cache.directory> (as defined in the sophora.properties within <localConfigDirectory>)

If the Tomcat (and Apache) configuration is not generated automatically by a configuration management system, you should also save the corresponding files.
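
The scripted slave backup mentioned above could look like the following sketch. The paths and the service name are assumptions; how a slave is disconnected from the delivery and restarted depends on your installation:

    import shutil
    import subprocess
    from datetime import datetime
    from pathlib import Path

    SOPHORA_HOME = Path("/opt/sophora-slave")    # assumption
    BACKUP_ROOT = Path("/backup/sophora-slave")  # assumption

    def backup_staging_slave() -> None:
        # Disconnecting the slave from the delivery (load balancer etc.)
        # is installation-specific and omitted here.
        subprocess.run(["systemctl", "stop", "sophora-slave"], check=True)  # assumption: service name
        try:
            target = BACKUP_ROOT / datetime.now().strftime("slave-%Y%m%d-%H%M%S")
            # Copy repository and config; per delivery, the docbase,
            # <localConfigDirectory> and the delivery cache directory
            # have to be copied as well.
            for name in ("repository", "config"):
                shutil.copytree(SOPHORA_HOME / name, target / name)
        finally:
            # After the restart the slave synchronises with the master
            # automatically and can be reconnected to the delivery.
            subprocess.run(["systemctl", "start", "sophora-slave"], check=True)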

A staging slave backup cannot be restored as a (new) master server! A staging slave only contains published content, but neither working copies nor older versions of documents. Nor does it contain system data such as user information or any deskclient configurations.

Backing Up the Dashboard

The Dashboard also maintains its own data store for the information it collects over time. This information can be found in:

  • <dashboard.home>/data

Create a backup of this directory after stopping the Dashboard process.
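
A minimal sketch, assuming the Dashboard process has been stopped (paths are assumptions):

    import shutil
    from datetime import datetime
    from pathlib import Path

    DASHBOARD_HOME = Path("/opt/sophora-dashboard")  # assumption
    BACKUP_ROOT = Path("/backup/dashboard")          # assumption

    # Copy the Dashboard's data directory while its process is stopped.
    target = BACKUP_ROOT / datetime.now().strftime("dashboard-%Y%m%d-%H%M%S")
    shutil.copytree(DASHBOARD_HOME / "data", target)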

Recovery

A component's backup can only be restored when the corresponding process has been stopped beforehand. When a master system is recovered, the connected slaves also need to be recovered.

Restoring a Master's Backup

To restore the backup of a master the following directories need to be copied:

  • <sophora.home>/repository
  • <sophora.home>/repository_archive
  • <sophora.home>/config

The following directories have to be removed:

  • <sophora.home>/cache
  • <sophora.home>/data

If the backup is restored from a backup or fall-back system, you might need to adjust some ports and IP addresses within the sophora.properties file.

If an embedded database is used for the repositories, this information is already contained in the restored data. Otherwise, the database files corresponding to the database system in use have to be restored from the backup.

It is essential that the database connection parameters are customised according to the new environment before starting the server.

Such parameters can be found in the following files (a restore sketch follows after the list):

  • <sophora.home>/repository_archive/repository.xml
  • <sophora.home>/repository_archive/workspaces/default/workspace.xml
  • <sophora.home>/repository/repository.xml
  • <sophora.home>/repository/workspaces/default/workspace.xml
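
Putting the steps together, a restore could look like the following sketch (paths are assumptions). Run it only while the server is stopped, and adjust the database parameters in the files listed above before restarting:

    import shutil
    from pathlib import Path

    SOPHORA_HOME = Path("/opt/sophora")                          # assumption
    BACKUP_DIR = Path("/backup/sophora/master-20240101-000000")  # assumption

    def restore_master() -> None:
        # Restore the backed-up directories ...
        for name in ("repository", "repository_archive", "config"):
            dest = SOPHORA_HOME / name
            if dest.exists():
                shutil.rmtree(dest)
            shutil.copytree(BACKUP_DIR / name, dest)
        # ... and remove the stale runtime directories.
        for name in ("cache", "data"):
            stale = SOPHORA_HOME / name
            if stale.exists():
                shutil.rmtree(stale)
        # Before starting the server, adjust ports/IPs in sophora.properties
        # and the database parameters in repository.xml / workspace.xml.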

Restoring a Staging Slave's Backup

The subsequent directories have to be restored:

  • <sophora.home>/repository
  • <sophora.home>/config

The following directories have to be removed:

  • <sophora.home>/cache
  • <sophora.home>/data

If the backup has been created on a different machine, you might need to adjust the configured ports and IP addresses within the sophora.properties file.

Additionally, you have to restore the following directories for each configured Tomcat context:

  • Docbase of the Tomcat context containing the templates
  • <localConfigDirectory> (as defined in the Tomcat context's configuration)
  • <sophora.delivery.cache.directory> (as defined in the sophora.properties within <localConfigDirectory>)

Here again you might have to adjust ports and IP addresses within <localConfigDirectory>/sophora.properties.

Breakdown Scenarios

Scenario 1: The Master Drops Out

In this case, none of the editorial journalists can access the system, so no changes can be made to the repository. The delivery is not affected by a breakdown of the master.

  • Usually the master can simply be restarted. The staging and fall-back slaves as well as the deskclients log in again on their own.
  • If the master cannot be restarted, the fall-back slave should be declared the new master. To do so, adjust its configuration and restart it (a configuration sketch follows after this list). The staging slaves, deskclients and all other Sophora modules (such as the Sophora Indexer or the Sophora Importer) also need to be restarted manually in order to connect to the new master. As soon as the problems with the original master are solved, it can be declared the new fall-back slave and will then synchronise with the new master automatically after a restart. If you want the former master to become master again, you have to stop both the new master and the fall-back slave (the old master), change their modes of operation, and restart them afterwards.
  • If the fall-back slave is unavailable too, you have to use the backup system as a temporary master (refer to the previous item).
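
As an illustration of such a switchover, the sketch below flips a replication-mode property in a configuration file. The property name sophora.replication.mode is hypothetical; the actual configuration keys and restart procedure depend on your Sophora installation:

    from pathlib import Path

    CONFIG = Path("/opt/sophora/config/sophora.properties")  # assumption

    def set_mode(mode: str) -> None:
        """Rewrite the (hypothetical) replication-mode property.

        Both affected servers must be stopped before their modes are
        changed and restarted afterwards, as described above.
        """
        lines = CONFIG.read_text().splitlines()
        rewritten = ["sophora.replication.mode=" + mode  # hypothetical key
                     if line.startswith("sophora.replication.mode=")
                     else line
                     for line in lines]
        CONFIG.write_text("\n".join(rewritten) + "\n")

    set_mode("master")  # promote the fall-back slave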

Scenario 2: A Staging Slave Drops Out

When a staging slave drops out, the delivery on this server fails. Thus, the server should be removed from the corresponding delivery (group) until the problem has been fixed.

Usually the affected staging slave can simply be restarted. It automatically synchronises with the Sophora master. Afterwards, it can be reintegrated into the delivery.

If the staging slave cannot be restarted, restore it from the backup of a different slave (see section "Backing Up a Staging Slave"). Restart the staging slave so that it synchronises with the master (this happens automatically, starting from the configured date). Finally, it can be reintegrated into the delivery.

Alternatively, you can restore the staging slave from the backup system and proceed as in the previous step.

Scenario 3: Master and Fall-Back Slave Drop Out and the Backup System Is Outdated or Unavailable as Well

In this scenario the Sophora system has to be restored from the backup system. All information that was not included in the backup before the total crash is lost. That means the system can only be restored to an outdated state, since no synchronisation is possible from the point in time of the breakdown.

The staging slaves might contain more recent content than the master does. Nonetheless, to get back to a consistent state, they need to synchronise with the outdated master. Otherwise, correct operation of the staging slaves cannot be guaranteed.

For this case, a backup of the repository and of the master's repository directories needs to exist.

A backup of the ActiveMQ queue must not be restored. In fact, you should clean the queue, because it is most likely not in sync with the backup.

Once the backups have been restored, the master can be restarted. Afterwards, the fall-back slaves and backup systems, as well as the Sophora components connected to the master, can be restarted one by one.