Recommended configuration for production Sophora environments

This page lists the most important server and delivery settings to consider when running Sophora in production environments.

The configuration parameters given here should be treated as a guideline. The list is not necessarily complete and should be reviewed for every installation. The best configuration for your environment may differ from the one shown here. If you are in doubt about a configuration parameter, contact Subshell.

Java Version

We recommend using the latest release of Oracle Java 8.

Java Garbage Collection

The exact parameters for the Java garbage collection depend heavily on the environment (CPU, memory), the Java version (especially on whether you choose an Oracle JDK or an OpenJDK) and the use case (number of concurrent users, number of imports). We recommend using the concurrent mark-sweep collector (-XX:+UseConcMarkSweepGC).

Sophora Server

The Java VM arguments should be maintained in the file under the configuration key vmargs=...

Maximum number of open files

The maximum number of open files must be increased on Unix systems. Add the following lines to the file /etc/security/limits.conf:

*               soft    nofile            4096
*               hard    nofile            4096

Check the result with the command ulimit -a.
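
The limit can also be read from inside a process. The following Python sketch is an illustration and not part of Sophora: it uses the standard resource module (Unix only) to read the soft and hard limit for open files and compares the soft limit with the value 4096 used above. The function name nofile_limit_ok is hypothetical.

```python
import resource


def nofile_limit_ok(minimum=4096):
    """Return True if the soft limit for open files is at least `minimum`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "no limit", which always satisfies the minimum.
    return soft == resource.RLIM_INFINITY or soft >= minimum


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"open files: soft={soft} hard={hard}")
    print("ok" if nofile_limit_ok() else "soft limit is below 4096")
```

The result reflects the limits of the user who runs the script, so it should be run as the same user that starts the Sophora process.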

Master and Slaves

Persistence Technology

We recommend using Oracle or MySQL as database technology.

-d64 -server -XX:+UseConcMarkSweepGC -XX:MaxMetaspaceSize=512M -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses -XX:+CMSParallelRemarkEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 -XX:+ScavengeBeforeFullGC -XX:+CMSScavengeBeforeRemark -Xss640K -Xmn3G -Xms12G -Xmx12G -XX:-OmitStackTraceInFastThrow -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:-UseGCOverheadLimit -Djava.awt.headless=true -XX:+PrintGCApplicationStoppedTime -XX:+HeapDumpOnOutOfMemoryError
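
The argument line above sets -Xms and -Xmx to the same value (12G) and the young generation (-Xmn) to 3G. When these values are adapted to another machine, a small check can catch typing mistakes. The following Python sketch assumes that equal -Xms/-Xmx values and a young generation smaller than the heap are intended. The function names heap_sizes and heap_warnings are hypothetical, and the script is not part of Sophora.

```python
import re

# Multipliers for the size suffixes accepted by -Xms, -Xmx and -Xmn.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

# Match whole tokens only, e.g. "-Xmx12G", but not "-Xss640K".
_PATTERN = re.compile(r"(?<!\S)-(Xms|Xmx|Xmn)(\d+)([kKmMgG]?)(?!\S)")


def heap_sizes(vmargs):
    """Extract -Xms, -Xmx and -Xmn from a JVM argument string, in bytes."""
    sizes = {}
    for name, number, unit in _PATTERN.findall(vmargs):
        sizes[name] = int(number) * _UNITS[unit.upper()]
    return sizes


def heap_warnings(vmargs):
    """Return a list of human-readable warnings for suspicious heap settings."""
    sizes = heap_sizes(vmargs)
    xms, xmx, xmn = sizes.get("Xms"), sizes.get("Xmx"), sizes.get("Xmn")
    warnings = []
    if xms is not None and xmx is not None and xms != xmx:
        warnings.append("-Xms and -Xmx differ")
    if xmn is not None and xmx is not None and xmn >= xmx:
        warnings.append("-Xmn is not smaller than -Xmx")
    return warnings


if __name__ == "__main__":
    print(heap_warnings("-Xss640K -Xmn3G -Xms12G -Xmx12G"))
```

An empty list means that neither of the two checks found a problem. It says nothing about whether the sizes fit the available memory.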

Master, Slaves and Stagingslaves

Jackrabbit Configuration

The following Jackrabbit-related setting should be applied in these files:

  • ./repository/repository.xml
  • ./repository/workspaces/default/workspace.xml
  • ./repository/workspaces/live/workspace.xml
  • ./repository_archive/repository.xml
  • ./repository_archive/workspaces/default/workspace.xml

<param name="bundleCacheSize" value="1024" />
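
Because the same parameter has to be set in several files, it can help to check the value automatically. The following Python sketch, which is not part of Sophora, collects the values of all bundleCacheSize parameters in an XML document. The function name bundle_cache_sizes is hypothetical, and the element and class names in the sample are placeholders rather than a real Sophora configuration.

```python
import xml.etree.ElementTree as ET

# Placeholder document: only the <param> line is taken from this page.
SAMPLE = """
<Workspace name="default">
  <PersistenceManager class="example.PersistenceManager">
    <param name="bundleCacheSize" value="1024" />
  </PersistenceManager>
</Workspace>
"""


def bundle_cache_sizes(xml_text):
    """Return the values of all bundleCacheSize params found in the XML text."""
    root = ET.fromstring(xml_text.strip())
    return [
        param.get("value")
        for param in root.iter("param")
        if param.get("name") == "bundleCacheSize"
    ]


if __name__ == "__main__":
    print(bundle_cache_sizes(SAMPLE))
```

For the real files, the content can be read from each of the paths listed above and passed to the function. An empty result means that the parameter is not set in that file.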

Search Index

ResultFetchSize and MaxMergeDocs

The parameters resultFetchSize and maxMergeDocs should be removed. Note that the first start of the Sophora server after removing maxMergeDocs takes a long time, because the Lucene indexes are merged into larger segments. Depending on the size of the indexes, this may take 60 minutes or longer.

<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
	<param name="initializeHierarchyCache" value="false" />
	<!-- further parameters omitted -->
</SearchIndex>
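
To find configuration files that still contain the two parameters that should be removed, a check along the following lines can be used. This Python sketch is not part of Sophora. The function name params_to_remove is hypothetical, and the sample document, including the value of maxMergeDocs, is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Parameters that this page recommends removing from the SearchIndex element.
PARAMS_TO_REMOVE = ("resultFetchSize", "maxMergeDocs")

# Made-up sample: the maxMergeDocs value is a placeholder.
SAMPLE = """
<Workspace name="default">
  <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="initializeHierarchyCache" value="false" />
    <param name="maxMergeDocs" value="100000" />
  </SearchIndex>
</Workspace>
"""


def params_to_remove(xml_text):
    """Return the names of SearchIndex params that should be removed."""
    root = ET.fromstring(xml_text.strip())
    found = []
    for index in root.iter("SearchIndex"):
        for param in index.findall("param"):
            if param.get("name") in PARAMS_TO_REMOVE:
                found.append(param.get("name"))
    return found


if __name__ == "__main__":
    print(params_to_remove(SAMPLE))
```

The function only reports the parameters. Removing them, and planning for the long first restart, remains a manual step.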

Important Security Parameters to set

  • sophora.solr.username
  • sophora.solr.password
  • sophora.replication.userName
  • sophora.replication.password
  • sophora.jmx.username
  • sophora.jmx.password
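
A simple way to make sure none of these parameters is forgotten is to check the configuration for missing or empty values. The following Python sketch is not part of Sophora and assumes a plain key=value properties format. The function name missing_security_keys is hypothetical, and the parsing is deliberately simplified (no line continuations or escapes).

```python
REQUIRED_KEYS = (
    "sophora.solr.username",
    "sophora.solr.password",
    "sophora.replication.userName",
    "sophora.replication.password",
    "sophora.jmx.username",
    "sophora.jmx.password",
)


def missing_security_keys(properties_text):
    """Return the required keys that are absent or have an empty value."""
    defined = {}
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if separator:
            defined[key.strip()] = value.strip()
    return [key for key in REQUIRED_KEYS if not defined.get(key)]


if __name__ == "__main__":
    example = "sophora.solr.username=solr\nsophora.solr.password=\n"
    print(missing_security_keys(example))
```

The check only confirms that a value is present. It does not judge whether a password is strong or still a default.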


Persistence Technology

We recommend using LevelDB as database technology.

-d64  -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8
-XX:+UseConcMarkSweepGC -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m
-XX:CMSFullGCsBeforeCompaction=1 -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70
-XX:CMSTriggerPermRatio=80 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled
-XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts -Xss640K -Xms6G -Xmx6G
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:-UseGCOverheadLimit

The best GC settings, however, depend on your installation, the Java version (especially on whether you choose an Oracle JDK or an OpenJDK) and the number of incoming requests. We suggest trying different GC settings for fine-tuning.

In case you are using DerbyDB instead of LevelDB additionally use this parameter:


Tomcat Configuration

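Tomcat tag pooling should be disabled. In a standard Tomcat installation this is controlled by the enablePooling init parameter of the JSP servlet in conf/web.xml.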

Document Cache

The delivery holds a cache for document requests to its staging slave. This cache is crucial for the performance of the delivery. Its size must be adjusted to the heap space available to the Java process. Take the following configuration as guidance:

Configuration of the Delivery Cache

Java VM Arguments

-Dderby.system.durability=test -Dnet.sf.ehcache.use.classic.lru=true
-Djava.awt.headless=true -Djava.rmi.server.hostname=.... -Xmx8G -Xms8G -XX:ReservedCodeCacheSize=512M
-XX:+UseG1GC -XX:+UseStringDeduplication