Add-on YouTube Connector: Documentation

The YouTube Connector (internal name "AVTool") is an independent Java program that can upload videos to YouTube.

Features

This is a paid Sophora add-on. For details, please refer to our website.

Installation

When installing the YouTube Connector, we recommend the following folder hierarchy:

cms-install-directory
    youtube-connector
        youtube-connector-2.5.7-executable.conf
        youtube-connector-2.5.7-executable.jar
        application.properties
        mediaconfig.xml
        data
        groovy
        logs
    ...

Starting and Stopping

You can use init.d to start and stop the YouTube Connector. Simply create a symlink in /etc/init.d that points to the jar. The service should be started by root; it then switches to the user who owns the jar file.

To start the YouTube Connector directly, you have to change the location of the PID and log files. Place a .conf file with the same base name as the jar next to it, with content like the following:

JAVA_OPTS="-Xmx1g -Davtool.groovy.dir=/cms-install-directory/youtube-connector/groovy"
PID_FOLDER="/cms-install-directory/youtube-connector/logs/"
LOG_FOLDER="/cms-install-directory/youtube-connector/logs/"

To start the YouTube Connector, invoke youtube-connector-2.5.7-executable.jar as follows:

> cd cms-install-directory/youtube-connector
> ./youtube-connector-2.5.7-executable.jar start

To stop the YouTube Connector, enter the following:

> cd cms-install-directory/youtube-connector
> ./youtube-connector-2.5.7-executable.jar stop

For all available options, see the Spring Boot documentation.

Document Model

For each video there must be a corresponding Sophora document. Either the configuration of the document type complies with the rules stated below, or the video document is processed by a preprocessor script. Every time the YouTube Connector processes a video, the preprocessor script is called with the ISophoraDocument node of the video document. The script can then change the node to match the rules used by the YouTube Connector.

Table of Files

The video files which are available for a document have to be listed in a dynamic table. Each child node representing a row of the table has to have the name avtool:file and has to be of type avtool-nt:file. The following properties are supported:

  • avtool:format (String): The video format of the file. The YouTube Connector supports videos in different formats, e.g. MP4 hq, MP4 low, Windows Media. The configuration of the YouTube channel contains a sorted list of formats which are enabled for this channel. The YouTube Connector goes through this list in the given order and uploads the first file whose format matches.
  • avtool:sequence (String): The name of the video sequence. This is usually the file name without suffix and used as the file ID for the YouTube Content ID API.
  • avtool:name (String): The name of the file, including the suffix. May be a sub path inside the server folders.
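
The selection order described for avtool:format can be sketched in Java as follows. This is only an illustration under stated assumptions, not the Connector's actual implementation: the class FormatSelection, the nested class VideoFile and the method selectFile are hypothetical names, and the channel's format list is passed in as a plain list of strings.

```java
import java.util.List;
import java.util.Optional;

final class FormatSelection {
    // Hypothetical model of one avtool:file row; not part of the Connector's API.
    static final class VideoFile {
        final String format;   // value of avtool:format
        final String sequence; // value of avtool:sequence
        final String name;     // value of avtool:name

        VideoFile(String format, String sequence, String name) {
            this.format = format;
            this.sequence = sequence;
            this.name = name;
        }
    }

    // Walks the channel's formats in their configured order and returns the
    // first file whose avtool:format equals the current format.
    static Optional<VideoFile> selectFile(List<String> channelFormats, List<VideoFile> files) {
        for (String format : channelFormats) {
            for (VideoFile file : files) {
                if (format.equals(file.format)) {
                    return Optional.of(file);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<VideoFile> files = List.of(
                new VideoFile("MP4 low", "clip01", "clip01_low.mp4"),
                new VideoFile("MP4 hq", "clip01", "clip01_hq.mp4"));
        // "MP4 hq" comes first in the channel's list, so the hq file is chosen
        // even though the low-quality file is listed first in the table.
        Optional<VideoFile> chosen = selectFile(List.of("MP4 hq", "MP4 low"), files);
        System.out.println(chosen.map(f -> f.name).orElse("no matching file"));
    }
}
```

The order of the channel's format list decides which file is chosen, not the order of the rows in the table of files.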

Property for YouTube Channel Selection

The YouTube Connector supports multiple YouTube channels, with the restriction that each video document can only be associated with one channel at a time. The video document must have a property that associates it with a YouTube channel (youtubeAccountPropertyName in the mediaconfig.xml). Only videos for which a YouTube channel is set are uploaded to YouTube. The property must be a string property containing the ID of a YouTube channel configuration document. These documents are found in the Sophora Admin View. Usually, this property is configured as a Select Value property.

Properties for YouTube Video ID and Status Messages

After uploading a video to YouTube, the Connector writes information about the upload back into the video document into three string properties. The names of these properties must be given in the configuration file. The following information is written into the document:

  1. The YouTube ID of the uploaded video (youtubeIdPropertyName in the configuration).
  2. A result code string (ERR_TYPE in the configuration).
  3. An error message in case of errors (ERR_MSG in the configuration).

The actual string values of the result codes must be configured in the mediaconfig.xml.

Result codes (configuration key: description)

  • OK: The upload or update of the video was successful.
  • MANY: There were multiple errors.
  • TAGGING: There was an error tagging the video file.
  • TRANSPORT_MEDIASERVER: There was an error downloading the video file from the media server.
  • TRANSPORT_STREAMINGSERVER: Not used for YouTube.
  • YOUTUBE: There was an error uploading the video to YouTube.
  • OTHER: Other errors, e.g. due to erroneous configuration.

Pre-Publishing Workflow

If you wish to use the pre-publishing workflow for video documents, the video document type must have the mixin sophora-mix:prePublishRequired. If that is the case, the video document will go into the pre-published state when it is published by a user using the DeskClient. The YouTube Connector will then upload the video to YouTube and, once it is finished, finish the publication. If the mixin is not configured, the video is uploaded when the document is published as well, but the document will directly go into the published state.

Configuration

The YouTube Connector uses two configuration files: "youtube-connector.properties" and "mediaconfig.xml". Samples are given below and can also be found in the "config" folder of the YouTube Connector distribution.

youtube-connector.properties

The following properties are supported:

  • vmargs: Additional arguments to the JVM.
  • sophoraServer.host: URL to the Sophora server.
  • sophoraServer.username, sophoraServer.password: Credentials for the Sophora server.
  • sophora.client.dataDir: Defines a directory which may be used by the Sophora Client API for persisting information such as the available nodes in a cluster. The directory must be specified as an absolute path. Example: /cms-install-directory/youtube-connector/data
  • mediaConfig: Path to the mediaconfig.xml file.
  • server.port: TCP port on which a status website is provided. If not set, a web server is started on port 8080.
  • digest.dir: Directory for hashes of media files. These hashes are used to avoid unnecessary transfers of unchanged media files.
  • job.store.path: Path to the file for the persistent queue. Events that trigger an operation of the YouTube Connector (e.g., publishing media documents) are added to the queue and will then be processed by the YouTube Connector. Example: queue.xml
  • tagging.propertyPrefix.regexp: During tagging, this regular expression distinguishes between references to Sophora properties and normal, static text. Example: ^(your-prefix:|sophora:).*$
  • tagging.scriptClassPrefix.regexp: During tagging, this regular expression distinguishes between references to Groovy scripts and normal, static text. Example: ^(scriptClass:)(.+)$
  • youtube.chunking: Enables/disables chunking ("resumable media upload protocol") during YouTube upload. Setting this to false can avoid problems with certain proxy servers. However, the entire media file is then buffered in memory during the upload.
  • document.preprocessorScripts: A comma-separated list of class names. The classes have to be located in the Groovy script folder and are required to implement the interface IPreprocessorScript. The YouTube Connector applies these scripts in the specified order to documents before processing. This way, virtual properties, which are not in the CND/repository, can be set. Because the YouTube Connector expects a specific document model, a preprocessor script might be required to create the properties and child nodes that conform to this model on the fly.
  • uploader.retry.count: If the upload of a file fails because of I/O problems, the upload is retried up to the given number of times.
  • uploader.retry.waitSeconds: Time in seconds to wait between upload attempts if an I/O problem occurred and uploader.retry.count is greater than zero.
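
How uploader.retry.count and uploader.retry.waitSeconds interact can be illustrated with a small Java sketch. It assumes that the count means "additional attempts after the first I/O error", as the comment in the example configuration suggests. The class RetryingUpload, the interface UploadAttempt and the method uploadWithRetry are hypothetical names and not part of the Connector.

```java
import java.io.IOException;

final class RetryingUpload {
    // Hypothetical stand-in for a single upload attempt.
    interface UploadAttempt {
        void run() throws IOException;
    }

    // Runs the attempt once and, on IOException, up to retryCount more times,
    // sleeping waitSeconds between attempts. Returns the number of attempts made.
    static int uploadWithRetry(UploadAttempt attempt, int retryCount, int waitSeconds)
            throws IOException, InterruptedException {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                attempt.run();
                return attempts;
            } catch (IOException e) {
                if (attempts > retryCount) {
                    throw e; // retries exhausted, report the last I/O error
                }
                Thread.sleep(waitSeconds * 1000L);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Fails twice with an I/O error, then succeeds on the third attempt.
        int attempts = uploadWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated I/O problem");
            }
        }, 2, 0);
        System.out.println("attempts: " + attempts); // prints "attempts: 3"
    }
}
```

With uploader.retry.count = 2, this sketch gives up after at most three attempts in total.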

Example configuration

# Connection to the Sophora master server.
sophora.master.url = http://localhost:1196
sophora.master.user = youtubeconnector
sophora.master.password = XXXXXXX
sophora.client.dataDir = data
# URL/Path to the config for media formats and media servers.
mediaConfig = file:config/mediaconfig.xml
# Web server
server.port = 5063
# JMX
jmx.registry.port = 5060
rmi.registry.port = 5061
jmx.registry.username = 
jmx.registry.password = 
# Upload protocol to use for YouTube videos.
# false = the entire media content will be uploaded in a single request (not resumable)
# true = the request will use the resumable media upload protocol to upload in data chunks
# For details see https://developers.google.com/youtube/v3/guides/using_resumable_upload_protocol 
youtube.chunking = true
# tries after first IO error
uploader.retry.count = 2
uploader.retry.waitSeconds = 15
# Path for the MD5 digest database.
digest.dir = /cms-install-directory/youtube-connector/digest
# Path for the persistent queue file.
job.store.path = /cms-install-directory/youtube-connector/queue.xml
# Test mode: Set to true to only log operations instead of executing them.
logonly.fileoperations = false
logonly.publishdocuments = false
# In the tagging configuration of the YouTube channel configuration, values with these prefixes are considered to be
# property references.
tagging.propertyPrefix.regexp = ^(example:|sophora:|youtube.).*$
# In the tagging configuration of the YouTube channel configuration, values with these prefixes are considered to be
# references to Groovy tagging scripts.
tagging.scriptClassPrefix.regexp = ^(scriptClass:)(.+)$
# Groovy scripts (comma-separated) that can modify the document before processing
document.preprocessorScripts = 
# Path of the proposal section, path elements must be separated by ';'.
# When a transport error occurs, a proposal will be created in this section.
# Leave empty to not create proposals.
proposalsection.path =

mediaconfig.xml

The mediaconfig.xml file (its location is set by the mediaConfig property in the youtube-connector.properties file) describes the document types processed by the YouTube Connector and the available media formats. It also contains the global YouTube configuration.

Example Configuration

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xmlns:util="http://www.springframework.org/schema/util"
	xsi:schemaLocation="http://www.springframework.org/schema/beans
		http://www.springframework.org/schema/beans/spring-beans.xsd
		http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd">
	<!--####################################################################
		Document type and media formats
		#################################################################### -->
	<!-- List of document types -->
	<util:list id="documentDescriptionList">
		<!-- Video -->
		<bean class="com.subshell.sophora.avtool.api.MediaDocumentDescription">
			<property name="formats">
				<set>
					<value>youtube</value>
				</set>
			</property>
			<!-- Only events for this document type are handled. -->
			<property name="nodeTypeName" value="example-nt:video" />
			<!-- Optional: Name of a playout channel. The presence of a document on this channel corresponds to the visibility of the media files on YouTube. -->
			<property name="playoutChannelName" value="YouTube" />
		</bean>
	</util:list>
	<!-- Feedback to DeskClient: property names -->
	<util:map id="deskclientFeedbackPropConf" key-type="com.subshell.sophora.avtool.DeskclientFeedbackProperty">
		<!-- Name of the property into which the error message is written -->
		<entry key="ERR_MSG" value="example:avtoolResultText" />
		<!-- Name of the SelectValue property into which the error code is written -->
		<entry key="ERR_TYPE" value="example:avtoolResultCode" />
	</util:map>
	<!-- Feedback to DeskClient. Map: error type to error string (you can use a select value for readable labels). -->
	<util:map id="deskclientFeedbackErrTypeConf">
		<entry key="TRANSPORT_MEDIASERVER" value="transport_mediaserver" />
		<entry key="YOUTUBE" value="youtube" />
		<entry key="OTHER" value="other" />
		<entry key="MANY" value="many" />
		<entry key="OK" value="ok" />
	</util:map>
	<!--####################################################################
		Source for video files
		#################################################################### -->
	<bean id="localTransporter" class="com.subshell.sophora.avtool.transporter.LocalFSTransporterFactory" />
	<bean id="mediaServerDescription" class="com.subshell.sophora.avtool.api.ServerDescription">
		<!-- Server name from which the video files should be retrieved. When local file transfer is used, 
			 this field is only used for logging. -->
		<property name="host" value="mediaserver" />
		<property name="basedir" value="src/test/resources/mediasource" />
		<property name="transporterFactory" ref="localTransporter" />
		<property name="formatDirectoryMapping">
			<map>
				<entry key="youtube" value="." />
			</map>
		</property>
		<!-- Optional: A regex to parse the date from a file's name, which will then be applied to all entries
			 in the formatDirectoryMapping that contain format specifiers, e.g. "%1$tY/%1$tm%1$td".
			 The date in the file name must be written in the format "yyyyMMdd". -->
		<property name="dateRegex" value="#?.{3}(\d{8}).+" />
		<property name="concatVideosScript" value="/opt/addOpenerCloser.sh" />
	</bean>
	<util:list id="streamingServerDescriptionList" />
	<!--####################################################################
		YouTube
		#################################################################### -->
	<!-- YouTube settings for all channels -->
	<bean id="youtubeGlobalConfig" class="com.subshell.sophora.avtool.api.youtube.YoutubeConfiguration">
		<!-- After uploading a video to YouTube, the YouTube id is written back to the document into this property.
			 Required for removing videos from YouTube when the document goes offline. -->
		<property name="youtubeIdPropertyName" value="example:youtubeId" />
		<!-- Property of the document that indicates which channel to use. -->
		<property name="youtubeAccountPropertyName" value="youtube.channel" />
		<!-- The YouTube action to take when a video document is deleted or set offline. Options are DELETE, SET_PRIVATE, and DO_NOTHING. -->
		<property name="deleteEventAction" value="DELETE" />
		<property name="offlineEventAction" value="SET_PRIVATE" />
		<!-- Application for adding opener and closer to the beginning/end of video. The specific videos are configured per channel. -->
		<property name="concatVideosScript" value="/cms-install-directory/ytc_opener_und_closer/addOpenerCloser_ffmpeg.sh" />
	</bean>
</beans>

Playout channel

In the above MediaDocumentDescription bean (com.subshell.sophora.avtool.api.MediaDocumentDescription) you can optionally set the name of a playout channel. If it is set, files are only uploaded to YouTube if the document is enabled in the configured playout channel. This is checked when the document is published. You can also use the timed channel affiliation settings for an automatic upload/removal of the files.

Please keep in mind that if you change the name of the playout channel in the DeskClient, you have to change it in the configuration file as well.

Date Pattern Matching

In the above mediaServerDescription bean (com.subshell.sophora.avtool.api.ServerDescription) an optional regular expression to parse the date from file names can be given. This is useful if the files are structured according to their dates in different subfolders. The regex defines the position of the date in the file name (which must be in the format "yyyyMMdd"). Once the date has been found, it will be applied to the entries in the format directory mapping that contain format specifiers, e.g. "%1$tY/%1$tm%1$td".

Please note that if you configure the date regex property, exceptions will be thrown if file names do not match the given regular expression.
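
The following Java sketch illustrates the mechanism using the sample values from the example configuration. It assumes that the whole file name must match the regex, that the date is taken from the first capturing group, and that the format specifiers are standard java.util.Formatter specifiers. The class DateDirectoryExample, the method resolveDirectory and the file name are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DateDirectoryExample {
    // Extracts the yyyyMMdd date captured by group 1 of dateRegex and applies it
    // to a directory mapping that may contain java.util.Formatter date specifiers.
    static String resolveDirectory(String dateRegex, String directoryMapping, String fileName) {
        Matcher matcher = Pattern.compile(dateRegex).matcher(fileName);
        if (!matcher.matches()) {
            // Mirrors the documented behavior: non-matching file names are an error.
            throw new IllegalArgumentException("File name does not match date regex: " + fileName);
        }
        LocalDate date = LocalDate.parse(matcher.group(1), DateTimeFormatter.BASIC_ISO_DATE);
        return String.format(Locale.ROOT, directoryMapping, date);
    }

    public static void main(String[] args) {
        String dir = resolveDirectory("#?.{3}(\\d{8}).+", "%1$tY/%1$tm%1$td", "abc20240131_clip.mp4");
        System.out.println(dir); // prints 2024/0131
    }
}
```

A mapping without format specifiers, such as ".", is returned unchanged by this sketch.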

Timeouts

In the above mediaServerDescription bean (com.subshell.sophora.avtool.api.ServerDescription) additional timeout properties can be set:

ServerDescription Timeout Properties (name: description)

  • connectTimeout: Timeout (in milliseconds) to open a connection. A timeout of zero is interpreted as an infinite timeout. Default is 1 minute.
  • socketTimeout: Timeout (in milliseconds) for data transfers. A timeout of zero is interpreted as an infinite timeout. Default is 5 minutes.

Example:

<?xml version="1.0" encoding="UTF-8"?>
<bean id="mediaServerDescription" class="com.subshell.sophora.avtool.api.ServerDescription">
    <property name="host" value="mediaserver" />
    <property name="transporterFactory" ref="localTransporter" />
    <property name="connectTimeout" value="30000" /><!-- 30 seconds -->
    <property name="socketTimeout" value="600000" /><!-- 10 minutes -->
</bean>

YouTube Channel Configuration in the Sophora DeskClient

The YouTube Channels are configured directly in the Sophora Admin View, using the document type "YouTube Channel" (sophora-extension-nt:youtubeChannelConfiguration). This document type is created by the YouTube Connector when it connects to the Sophora Master for the first time.

Each YouTube channel configuration document has an ID (property sophora-extension:id). In each video, there must be a property referencing the ID of a YouTube channel document. The name of this property is defined by the property youtubeAccountPropertyName in the global YouTube configuration in the mediaconfig.xml. If this property is not set in a video document, it will not be uploaded to YouTube.

Properties in the YouTube channel configuration document (property: description)

  • ID: The ID used by the video documents to reference this channel. This can be an arbitrary string.
  • YouTube Channel ID (optional): Only required if the account has access to multiple channels. For determining the ID from YouTube, see https://support.google.com/youtube/answer/3250431.
  • Client ID, Client secret, AuthCode: See the section on authentication.
  • Video format: The first format from this list that is checked in the video document will be uploaded to YouTube.
  • MIME type: This MIME type is used for all formats uploaded to YouTube.
  • Title: The title of the video.
  • Description: The description of the video.
  • Category: The category of the video.
  • Keywords: The keywords for this video.
  • Visibility: If set to "private", the YouTube Connector will upload videos to YouTube, but not make them publicly available.
  • Location: The location name of the video.
  • Latitude, Longitude: The geo location of the video.
  • Thumbnail: The name of the child node of the video document that contains the reference to the thumbnail image of the video.
  • Image variant for thumbnail: The name of the image variant that will be used for the thumbnail image of the video. If this property has not been set, the "original" image variant will be used by default.
  • Teaser image overlay: The external ID of the image document that contains the overlay for the thumbnail image of the video. If left blank, no overlay image will be placed on the thumbnail image.
  • Caption child node: The name of the child node of the video document that contains the binaries of the caption.
  • Name of the Caption: This name is visible to the YouTube user as an option during playback.
  • Content Owner: Used to manage the rights for this channel. You must be a YouTube Partner to use this.
  • Match Policy: The name of the policy which is applied to videos of other users at YouTube.
  • Usage Policy: The name of the policy which is applied to own videos at YouTube.
  • Service Account Id: If the Client ID does not belong to a partner account, the service account is used to access the YouTube Content ID API. For creating a service account see https://developers.google.com/youtube/partner/guides/oauth2_for_service_accounts#setup.
  • Service Account Key: The private key file in PKCS#12 format (without password) to authenticate the service account.
  • Intro- and Outro-Videos: If you have configured an application for concatenating videos using the property concatVideosScript in the global YouTube configuration in the mediaconfig.xml, the videos given here will be passed to this application for each video processed by the YouTube Connector.

When uploading videos with thumbnails, please note the following:

  • Both the Teaser image of the video and the overlay image should be as large as possible, as the generated thumbnail image will also be used as the preview image in the embedded YouTube video player. The combined size of both images should not exceed 2 MB in total due to upload limitations on YouTube.
  • Preferably, the images should follow a 16:9 aspect ratio.
  • The overlay image should contain transparency and be in one of the supported image formats (.GIF or .PNG). It should also have the same dimensions as the Teaser image. For this purpose, it is recommended to use a custom image variant that complies with the suggestions above for both images, instead of the "original" image variant.

For further recommendations, please also check the best practices from YouTube.

The Video Data Section

The properties in the "Video Data" section of the channel configuration define the metadata for each video that is uploaded to YouTube. Each field can reference a property in the video document, reference a Groovy script, or contain plain string content.

  • If the property value matches the regular expression given by the property tagging.scriptClassPrefix.regexp in the youtube-connector.properties, it references a Groovy tagging script. See Tagging Script.
  • If the property value matches the regular expression given by the property tagging.propertyPrefix.regexp in the youtube-connector.properties, it is assumed to be the name of a property in the video document.
  • If neither regular expression matches, the value is used as-is for all videos.
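
The three cases above can be sketched in Java with the regular expressions from the example youtube-connector.properties. The class TaggingValueClassifier and its methods are hypothetical names. Checking the script pattern before the property pattern is an assumption based on the order in which the cases are listed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TaggingValueClassifier {
    enum Kind { SCRIPT, PROPERTY, STATIC }

    // Regular expressions taken from the example youtube-connector.properties.
    static final Pattern SCRIPT_PREFIX = Pattern.compile("^(scriptClass:)(.+)$");
    static final Pattern PROPERTY_PREFIX = Pattern.compile("^(example:|sophora:|youtube.).*$");

    // Checks the script pattern first, then the property pattern;
    // anything else is treated as static text.
    static Kind classify(String value) {
        if (SCRIPT_PREFIX.matcher(value).matches()) {
            return Kind.SCRIPT;
        }
        if (PROPERTY_PREFIX.matcher(value).matches()) {
            return Kind.PROPERTY;
        }
        return Kind.STATIC;
    }

    // Returns the Groovy class name, i.e. the part after the script prefix (group 2).
    static String scriptClassName(String value) {
        Matcher matcher = SCRIPT_PREFIX.matcher(value);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a script reference: " + value);
        }
        return matcher.group(2);
    }

    public static void main(String[] args) {
        for (String value : new String[] {"scriptClass:YoutubeTitle", "example:headline", "My channel"}) {
            System.out.println(value + " -> " + classify(value));
        }
    }
}
```

Note that the unescaped dot in "youtube." matches any character, so a value such as "youtube:title" is also classified as a property reference.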

YouTube Content Partner

If you are a YouTube Content Partner and want to use the YouTube Content ID API to claim your videos and apply policies, you need to specify a content owner. Additional authorization is also necessary. You have two options for authenticating:

  1. Access the YouTube Data API and YouTube Content ID API with different accounts.
  2. Access the YouTube Data API and the YouTube Content ID API with a single account.

In the first case, a Client ID directly assigned to the YouTube channel is used to upload videos. This is the same configuration as when you are not using the YouTube Content ID API. To access the YouTube Content ID API, you have to create a service account and configure its ID and key in Sophora.

In the second case, a YouTube CMS user is used to access both APIs. The given content owner is also used for the YouTube Data API to act on behalf of the YouTube CMS user. If the CMS user has access to multiple channels, you need to configure the channel ID so that the uploaded videos are assigned to that channel. For authentication, you need to provide a Client ID with a secret that has access to both APIs (no service account is used).

Authentication (OAuth)

The YouTube Connector uses the OAuth 2.0 flow for installed apps.

  • Follow the steps for obtaining authorization credentials to register the YouTube Connector with YouTube. When you are done, copy the "client id" and "client secret" to the YouTube channel configuration document in Sophora.
  • Start the YouTube Connector, publish a video and watch the log file. There will be an entry saying that no authentication code for YouTube is configured. This entry will also show a URL.
  • Visit the URL using your browser. After you have accepted the access permissions requested by the YouTube Connector, the browser will show an authentication code. Copy this code to the field "AuthCode" of the YouTube channel configuration document.
  • Restart the YouTube Connector. The Connector will now use the auth code to generate credentials for YouTube. The credentials are saved in the directory "~/.oauth-credentials".

Scripting

There are two kinds of Groovy scripts that can be used to customize the YouTube Connector: tagging scripts and preprocessor scripts. Scripts of both types must be placed in the folder configured by the Java system property "avtool.groovy.dir" (set with "-Davtool.groovy.dir=/path/to/groovy/scripts" in the .conf file).

Tagging Script

The content for fields in the "Video Data" section of the YouTube channel configuration document can be computed by Groovy scripts. If the content starts with the script class prefix as given by the property tagging.scriptClassPrefix.regexp in the youtube-connector.properties, e.g. "scriptClass:", the remaining portion of the string is interpreted as the name of a Groovy class. For example, if the field "Title" contains the value scriptClass:YoutubeTitle, the Connector looks for a script defining the class "YoutubeTitle", calls its processDocument() method and uses the result as the actual title of the YouTube video. Here is an example of a tagging script:

import com.subshell.sophora.api.content.INode
import com.subshell.sophora.avtool.api.scripting.AbstractMediaTaggingScript

class YoutubeTitle extends AbstractMediaTaggingScript {
	// Builds the title from the headline, the label of the selected broadcast and a fixed suffix.
	String processDocument(INode doc) {
		def parts = []
		parts << doc.getString('your-prefix:headline')
		parts << context.getSelectValueLabel(doc, 'your-prefix:broadcast')
		parts << "TV"
		// Drop missing or empty parts so that no empty segment appears in the title.
		parts.removeAll { it == null || it == '' }
		return parts.join(' | ')
	}
}

Preprocessor Script

Every time the YouTube Connector processes a video, the preprocessor script is called with the ISophoraDocument node of the video document. The script can then change the node, i.e. change or add properties or child nodes. The changed document is not required to match the node type configuration of the video document, i.e., the script can add arbitrary properties, which can then be referenced by the YouTube channel configuration document. The following example shows a simple preprocessor script.

import org.slf4j.Logger
import com.subshell.sophora.api.content.INode
import com.subshell.sophora.avtool.api.scripting.IPreprocessorContext
import com.subshell.sophora.avtool.api.scripting.IPreprocessorScript
class YouTubePreprocessor implements IPreprocessorScript {
	private Logger log
	private IPreprocessorContext context
	@Override
	public void init(IPreprocessorContext context) {
		log = context.logger
		this.context = context
	}
	@Override
	public void preprocess(INode video) {
		// Set a fixed YouTube channel for all videos.
		video.setString("your-prefix:youtubeChannelId", "channel1")
		// Set a new property which can then be referenced by the channel configuration.
		if (video.hasProperty("your-prefix:title")) {
			video.setString("youtube:title", video.getString("your-prefix:title"))
		} else {
			video.setString("youtube:title", "My video")
		}
		// The script has access to the Sophora Client.
		String description = context.sophoraClient.getDocumentByExternalId("your-special-document").getString("your-prefix:youtubeDescription")
		video.setString("youtube:description", description)
	}
}