Chapter 7 Administration

This chapter is relevant for administrators who want to install Octra and for project administrators who want to customize Octra for their projects.

If you have any questions send an email to octra[at]phonetik.uni-muenchen.de.

7.1 Configuration

To make OCTRA configurable, it uses two kinds of configuration files: files for the OCTRA installation itself and files for the current project. For the Online Mode, the project administrator defines the project files, which are saved on the web-backend. For the other modes, administrators can change a single project configuration that is saved in the Octra folder.

7.2 App configuration

  • ./config/appconfig.json: General options for OCTRA (e.g. setting languages, enabling online mode).
  • ./i18n/: Folder that contains translation files (JSON files)

These two entries are important for customizing an Octra installation.

You can change OCTRA’s application preferences easily by changing its appconfig.json file. You can find this file in the ./config folder. If you are using an older version of OCTRA please make sure to update your appconfig.json to the new structure (of the new appconfig_sample.json) after upgrading the other files. If you made a fresh install, you have to duplicate and rename the appconfig_sample.json to appconfig.json.

For developers: You can find a JSON Schema file on GitHub.

Here is the default appconfig_sample.json:

{
  "version": "2.0.0",
  "octra": {
    "database": {
      "name": "octra"
    },
    "login": {
      "enabled": true
    },
    "supportEmail": "no-email-provided@email.com",
    "allowed_browsers": [
      {
        "name": "Chrome",
        "version": ""
      },
      {
        "name": "Firefox",
        "version": ""
      },
      {
        "name": "Opera",
        "version": ""
      },
      {
        "name": "Microsoft Edge",
        "version": ""
      }
    ],
    "languages": ["en", "de", "nl", "it", "ko", "zh"],
    "audioExamples": [
      {
        "language": "de",
        "url": "media/Bahnauskunft.wav",
        "description": "Individuelle Beschreibung der Audiosequenz"
      },
      {
        "language": "en",
        "url": "media/Bahnauskunft.wav",
        "description": "Description for the audio sequence"
      }
    ],
    "inactivityNotice": {
      "showAfter": 60
    }
  }
}

7.2.1 Options:

? at the end of an attribute means it’s optional.

  • version: The version shows which version of OCTRA is compatible with this configuration.

  • api?.url: URL to the Octra-Backend API

  • api?.appToken: App token offered by the Octra-Backend.

  • octra.database.name: Set the name of the local database that is found in the user’s browser. This attribute must be set.

  • octra.supportEmail: Email address visible if the server is offline.

  • octra.login?.enabled: Defines if users are allowed to use the Online Mode.

  • octra.allowed_browsers: You can define the browsers which can be used. Because OCTRA was tested in Chrome it’s recommended to use Chrome. If there is no entry all browsers are allowed. If you want to limit the valid browsers you need to add Objects like this:

    {
       "name": "Chrome",
       "version": ""
    }
    • name: The allowed user agent. Please have a look at all user agents available here.
    • version: Regular expression used to validate the version string. In OCTRA v.1.2 this attribute is not used.
  • octra.languages: If you translated OCTRA into other languages, you can define them in this array. For each language there has to be one octra_[lang].json in /assets/i18n/octra and one guidelines_[lang].json.

  • octra.audioExamples: Set audio examples for the demo mode.

  • octra.inactivityNotice?: Set the time after which an inactivity notice is shown.

  • octra.maintenanceNotification?: Settings for the maintenance notification.

  • octra.tracking: Set “active” to “matomo” and set further options to analyze web traffic with Matomo.

  • octra.plugins?.asr: ASR configuration. At the moment ASR and automatic word alignment are not supported for third-party installations.

  • octra.oldVersion?.url: If set, Octra shows a link to a previous version on the login page.

  • octraBackend?.enabled: Defines if the OCB shall be integrated into OCTRA.

  • octraBackend?.url: URL to the web-backend.
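
The options above can be checked before deployment. The following is a minimal sketch of such a check, not part of OCTRA: the function name checkAppConfig and the wording of the messages are assumptions made for illustration. It only tests what the list states, namely that octra.database.name is set and that each allowed_browsers entry has a name and a version that compiles as a regular expression.

```javascript
// Hypothetical helper, not part of OCTRA: lists problems in a parsed appconfig.json.
function checkAppConfig(config) {
  var problems = [];
  var octra = config.octra || {};

  // octra.database.name must be set
  if (!octra.database || typeof octra.database.name !== "string" || octra.database.name === "") {
    problems.push("octra.database.name is missing");
  }

  // every allowed_browsers entry needs a name; version is the source of a regular expression
  (octra.allowed_browsers || []).forEach(function (browser, index) {
    var path = "octra.allowed_browsers[" + index + "]";
    if (typeof browser.name !== "string" || browser.name === "") {
      problems.push(path + ".name is missing");
    }
    try {
      new RegExp(browser.version || "");
    } catch (e) {
      problems.push(path + ".version is not a valid regular expression");
    }
  });

  return problems;
}
```

An empty array means that none of these checks failed. The JSON Schema file mentioned above remains the authoritative description of the format.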

7.3 Project specific configuration

Additionally, Octra lets you set project-related options such as guidelines and validation rules. The Demo Mode, the URL Mode and the Local Mode share the same project configuration, which is stored in the config/localmode folder of the Octra installation. For the Online Mode you set the configuration on the web-backend (served by Octra-Backend).

Octra offers the following project specific configuration files:

  • ./config/localmode/projectconfig.json: Settings for the current project. For Online Mode you define this online and for others you define it in the OCTRA installation directory.
  • ./config/localmode/guidelines/: Definition of guidelines used for the current project. For each language there should be a guidelines_[lang].json file, where [lang] is the two-letter language code.
  • ./config/localmode/functions.js: Validation rules and cleanup rules defined as JavaScript functions

7.3.1 projectconfig.json

The projectconfig.json file contains all important options for the current project. You can find the JSON Schema definition on Github.

In version 2.0 the sample projectconfig.json looks like this:

{
  "version": "2.0.0",
  "logging": {
    "forced": true
  },
  "navigation": {
    "export": true,
    "interfaces": true
  },
  "languages": ["de", "en", "it", "nl", "ko", "zh"],
  "interfaces": ["2D-Editor", "Dictaphone Editor", "Linear Editor"],
  "octra": {
    "validationEnabled": true,
    "tools": ["combine-phrases", "cut-audio"],
    "sendValidatedTranscriptionOnly": false,
    "showOverviewIfTranscriptNotValid": false,
    "theme": "shortAudioFiles",
    "asrEnabled": true
  }
}

7.3.1.1 Options:

? at the end of an attribute means it’s optional.

  • version: The version shows which version of OCTRA is compatible with this configuration.
  • logging?.forced?: Set this to true if the user must not be able to turn off the logging of interactions.
  • navigation.export?: Users can export their progress.
  • navigation.interfaces: Users can switch the editor used for transcription.
  • interfaces: Defines the names of editors that may be used for transcription. The user can only switch between these editors.
  • octra.validationEnabled: If you added a working functions.js you need to enable this in order to have validation enabled for your project.
  • octra?.tools?: List of available tools. Valid items are combine-phrases and cut-audio.
  • octra?.sendValidatedTranscriptionOnly?: Submitting the finished transcript is blocked as long as something is invalid.
  • octra?.showOverviewIfTranscriptNotValid?: If the user tries to submit an invalid transcript, the overview modal with errors is opened.
  • octra?.theme?: Theme for the project. For now only shortAudioFiles is allowed.
  • octra?.asrEnabled?: Allow users to use ASR. Only working if ASR is enabled in appconfig.json. (ASR and word alignment only supported on the main installation by LMU Munich).
  • octra?.importOptions?: Options for the converter used on import. Only relevant for Online Mode.
  • guidelines?.showExampleNumbers: Shows example numbers in guidelines.
  • guidelines?.showExampleHeader: Shows header for each example in guidelines.
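
Some of these options accept only a fixed set of values. As a rough sketch (not part of OCTRA; the function name checkProjectConfig and the message texts are hypothetical), a parsed projectconfig.json could be checked against the values named in the list above. Requiring at least one entry in interfaces is an assumption of this sketch, not a documented rule.

```javascript
// Values taken from the option list above.
var VALID_TOOLS = ["combine-phrases", "cut-audio"];
var VALID_THEMES = ["shortAudioFiles"];

// Hypothetical helper, not part of OCTRA: lists problems in a parsed projectconfig.json.
function checkProjectConfig(config) {
  var problems = [];
  var octra = config.octra || {};

  (octra.tools || []).forEach(function (tool) {
    if (VALID_TOOLS.indexOf(tool) === -1) {
      problems.push("unknown tool: " + tool);
    }
  });

  // theme is optional, but if present it has to be a known theme
  if (octra.theme !== undefined && VALID_THEMES.indexOf(octra.theme) === -1) {
    problems.push("unknown theme: " + octra.theme);
  }

  // assumption of this sketch: at least one editor has to be listed
  if (!Array.isArray(config.interfaces) || config.interfaces.length === 0) {
    problems.push("interfaces must list at least one editor");
  }

  return problems;
}
```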

7.3.2 Guidelines

OCTRA lets you define transcription guidelines. The guideline files can be found in ./config/localmode/guidelines/. On the Octra-Backend you can define the guidelines in the tool configuration for your project.

For each language set in the projectconfig.json file you need to create one guidelines_[lang].json file. For example, if “languages” in projectconfig.json is set to [“de”, “en”] you need to create one guidelines_de.json and one guidelines_en.json.

To create a new guidelines_[lang].json file you can duplicate one of the existing files and adapt it as needed.
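
The naming rule can be expressed in a few lines. This sketch is not part of OCTRA, and both function names are hypothetical. It derives the expected file names from the “languages” array and reports the ones missing from a given list of existing file names.

```javascript
// Returns the guideline file name expected for each language code.
function guidelineFileNames(languages) {
  return languages.map(function (lang) {
    return "guidelines_" + lang + ".json";
  });
}

// Returns the expected guideline files that are not in existingFiles.
function missingGuidelineFiles(languages, existingFiles) {
  return guidelineFileNames(languages).filter(function (name) {
    return existingFiles.indexOf(name) === -1;
  });
}
```

For example, missingGuidelineFiles(["de", "en"], ["guidelines_de.json"]) returns an array that contains only guidelines_en.json.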

7.3.2.1 Structure of guidelines

Example structure:

{
  "meta": {
    "object_language": "eng",
    "language": "en",
    "project": "Speakers",
    "authors": "Test Person (testperson@email.de)",
    "version": "1.0",
    "date": "2017-04-10",
    "encoding": "UTF-8",
    "validation_url": "config/localmode/functions.js"
  },
  "instructions": [
    {
      "group": "Spelling",
      "entries": [
        {
          "code": "R01",
          "priority": 100,
          "title": "Make use of correct spelling",
          "description": "Please take attention to correct spelling.",
          "examples": []
        }
      ]
    },
    {
      "group": "Punctuation",
      "entries": [
        {
          "code": "R06",
          "priority": 100,
          "title": "No punctuation characters",
          "description": "Do not use any punctuation characters.",
          "examples": [
            {
              "annotation": "Do not use any of <code>(.,!?;-)</code>",
              "url": ""
            }
          ]
        }
      ]
    }
  ],
  "markers": [
    {
      "id": 1,
      "name": "truncation marker start",
      "code": "[~abc]",
      "type": "normal",
      "icon": "assets/img/components/transcr-editor/default_markers/truncation_start.png",
      "button_text": "~abc",
      "description": "Set this marker at the beginning of the annotation if a word was cutted off.",
      "shortcut": {
        "mac": "ALT + 1",
        "pc": "ALT + 1"
      }
    },
    {
      "id": 2,
      "name": "filled pause",
      "code": "<nib>",
      "type": "normal",
      "icon": "assets/img/components/transcr-editor/default_markers/fil.png",
      "button_text": "filled pause",
      "description": "This marker is for the speakers hesitations like 'hm', 'ähm' and others.",
      "shortcut": {
        "mac": "ALT + 2",
        "pc": "ALT + 2"
      }
    },
    {
      "id": 3,
      "name": "intermittent noise",
      "code": "[int]",
      "type": "normal",
      "icon": "assets/img/components/transcr-editor/default_markers/int.png",
      "button_text": "intermittent noise",
      "description": "This marker is for noise like door slam, touching the microphone or something like that.",
      "shortcut": {
        "mac": "ALT + 3",
        "pc": "ALT + 3"
      }
    },
    {
      "id": 4,
      "name": "speaker noise",
      "code": "[spk]",
      "type": "normal",
      "icon": "assets/img/components/transcr-editor/default_markers/spk.png",
      "button_text": "speaker noise",
      "description": "This marker is for noise which was produced by the speaker, e.g. breathing loudly, laughing or something like that.",
      "shortcut": {
        "mac": "ALT + 4",
        "pc": "ALT + 4"
      }
    },
    {
      "id": 5,
      "name": "stationary noise",
      "code": "[sta]",
      "type": "normal",
      "icon": "assets/img/components/transcr-editor/default_markers/sta.png",
      "button_text": "stationary noise",
      "description": "This marker is for continuous, loud noise like traffic jam, music or radio in the background",
      "shortcut": {
        "mac": "ALT + 5",
        "pc": "ALT + 5"
      }
    },
    {
      "id": 6,
      "name": "unclear word",
      "code": "**",
      "type": "normal",
      "icon": "assets/img/components/transcr-editor/default_markers/stars.png",
      "button_text": "**",
      "description": "Set this marker before a word if it is not recognizable or if it is from another language.",
      "shortcut": {
        "mac": "ALT + 6",
        "pc": "ALT + 6"
      }
    },
    {
      "id": 7,
      "name": "truncation marker end",
      "code": "[abc~]",
      "type": "normal",
      "icon": "assets/img/components/transcr-editor/default_markers/truncation_end.png",
      "button_text": "abc~",
      "description": "Set this marker only at the end of the annotation if the last word was cutted off",
      "shortcut": {
        "mac": "ALT + 7",
        "pc": "ALT + 7"
      }
    },
    {
      "id": 8,
      "name": "break",
      "code": "<P>",
      "type": "break",
      "icon": "assets/img/components/transcr-editor/default_markers/break.png",
      "button_text": "Break",
      "description": "This marker represents a break that means a audio chunk without any speaking",
      "shortcut": {
        "mac": "ALT + P",
        "pc": "ALT + P"
      }
    }
  ]
}

7.3.2.2 Options

  • meta: Contains all meta data
    • meta.object_language: The spoken language
    • meta.project: Project which uses these guidelines
    • meta.authors: Names of the authors separated by commas
    • meta.version: Version of these guidelines
    • meta.date: Date of last change
    • meta.encoding: Encoding of this file (e.g. UTF-8)
    • meta.validation_url: URL where the functions.js file is hosted. See section 7.3.3 for more information about functions.js.
  • instructions: Array of guidelines (by name)
    • group: Name for the group. This is equal to the title that will be visible
    • entries: Array of instructions
      • code: Unique identifier. This is required for validation.
      • priority: Priority as a number. This could be needed for validation.
      • title: Title for instruction
      • description: Description for instruction
      • examples: Array of examples
        • annotation: Annotation text
        • url: Media URL (to audio file or mp4 file)
  • markers: Array of markers
    • id: Unique number for this marker. It is recommended to start with 1
    • name: Name for this marker in English
    • code: Text marker that is used to replace the associated image marker
    • type: “normal” or “break”. “break” means that there is no speech
    • icon: URL to the image icon
    • button_text: Text that is visible to the user. This should be translated for each language.
    • description: Description. This should be translated for each language.
    • shortcut:
      • mac: Shortcut for MacOS systems.
      • pc: Shortcut for Windows or Linux systems.
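
Because marker ids have to be unique, a quick consistency check on a guidelines file can save debugging time. The following sketch is not part of OCTRA and the function name findMarkerConflicts is hypothetical. Treating duplicated codes and duplicated shortcuts as conflicts is an assumption of this sketch; the list above only requires unique ids.

```javascript
// Hypothetical helper, not part of OCTRA: reports marker ids, codes and
// shortcuts that are used by more than one marker.
function findMarkerConflicts(guidelines) {
  var seen = { id: {}, code: {}, mac: {}, pc: {} };
  var conflicts = [];

  function check(kind, value, markerName) {
    if (value === undefined) {
      return;
    }
    var key = String(value);
    if (Object.prototype.hasOwnProperty.call(seen[kind], key)) {
      conflicts.push(kind + " \"" + key + "\" is used by \"" + seen[kind][key] +
        "\" and \"" + markerName + "\"");
    } else {
      seen[kind][key] = markerName;
    }
  }

  (guidelines.markers || []).forEach(function (marker) {
    var shortcut = marker.shortcut || {};
    check("id", marker.id, marker.name);
    check("code", marker.code, marker.name);
    check("mac", shortcut.mac, marker.name);
    check("pc", shortcut.pc, marker.name);
  });

  return conflicts;
}
```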

7.3.3 Validation

OCTRA needs a URL to a functions.js file in order to enable validation. If you want transcripts to be validated against your guidelines, you need to create a functions.js file. This file should be hosted on a web server and referenced in each guidelines_[lang].json file (meta.validation_url).

To create a new functions.js you can duplicate the functions.js in ./config/localmode/ and change it as needed.

The functions.js has the following structure:

    /**
     * Validates a given transcript against the given guidelines.
     * @param annotation transcript of a transcript unit
     * @param guidelines parsed JSON guidelines
     * @returns {Array<{start: number, length: number, code: string}>}
     */
    function validateAnnotation(annotation, guidelines) {
        var result = [];
    
        // ...
    
        //the next line has to be before returning the result
        result = sortValidationResult(result);
        return result;
    }
    
    /**
     * Cleans up the transcript of a given transcript unit. This method is called before the annotation is saved.
     * @param annotation transcript of a transcript unit
     * @param guidelines parsed JSON guidelines
     * @returns string
     */
    function tidyUpAnnotation(annotation, guidelines) {
        var result = annotation;
    
        // ...
        
        return result;
    }
    
    
    /*
    ###### Default methods.
     */
    function escapeRegex(regex_str) {
        //escape special chars in regex
        return regex_str.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
    }
    
    function sortValidationResult(result) {
        return result.sort(function (a, b) {
            if (a.start === b.start)
                return 0;
            if (a.start < b.start)
                return -1;
            if (a.start > b.start)
                return 1;
        });
    }

7.3.3.1 Methods

7.3.3.2 validateAnnotation(annotation, guidelines)

If you implement this method you need to go through the annotation text and parse it using regular expressions.

  • annotation: Raw annotation text. Noise markers have already been replaced by their codes.
  • guidelines: The whole JSON object of the currently loaded guidelines (e.g. en or de)

For each error found you need to add an element to an array. After all errors have been collected you must sort and return this array. The structure of the elements is:

    {
        start: <number>,    // text position
        length: <number>,   // length of the invalid text part
        code: <string>      // affected guideline
    }
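
Most rules follow the same pattern: run a regular expression over the annotation and create one element per match. A small helper can capture that pattern. This is only a sketch; collectMatches is a hypothetical name and is not one of the default methods shipped in functions.js.

```javascript
// Hypothetical helper: turns every match of a global regular expression
// into an error element with start, length and code.
function collectMatches(annotation, re, code) {
  if (!re.global) {
    // without the g flag exec() would return the same match forever
    throw new Error("collectMatches needs a regular expression with the g flag");
  }
  var errors = [];
  var match;
  re.lastIndex = 0;
  while ((match = re.exec(annotation)) !== null) {
    errors.push({
      start: match.index,
      length: match[0].length,
      code: code
    });
    // avoid an endless loop if the expression matches an empty string
    if (match[0].length === 0) {
      re.lastIndex++;
    }
  }
  return errors;
}
```

Inside validateAnnotation the results of several such calls could be concatenated before the array is sorted and returned.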

7.3.3.3 tidyUpAnnotation(annotation, guidelines)

Before a transcript is saved it often needs to be cleaned up (e.g. by removing superfluous white space). You can clean up the annotation text using regular expressions and return the result.

7.3.3.3.1 Example

/**
 * Validates a given transcript against the given guidelines.
 * @param annotation transcript of a transcript unit
 * @param guidelines parsed JSON guidelines
 * @returns {Array<{start: number, length: number, code: string}>}
 */
function validateAnnotation(annotation, guidelines) {
    var result = [];
    var match;

    // R06: punctuation
    var re = /[\(\.,\!\?;\)]/g;
    while ((match = re.exec(annotation)) != null) {
        result.push({
            start: match.index,
            length: match[0].length,
            code: "R06"
        });
    }

    //M01
    for (var i = 0; i < guidelines.markers.length; i++) {
        var marker = guidelines.markers[i].code;

        re = new RegExp("(" + escapeRegex(marker) + ")( *(" + escapeRegex(marker) + "))+", "g");
        while ((match = re.exec(annotation)) != null) {
            result.push({
                start: match.index,
                length: match[0].length,
                code: "M01"
            });
        }
    }

    //the next line has to be before returning the result
    result = sortValidationResult(result);
    return result;
}

/**
 * Cleans up the transcript of a given transcript unit. This method is called before the annotation is saved.
 * @param annotation transcript of a transcript unit
 * @param guidelines parsed JSON guidelines
 * @returns string
 */
function tidyUpAnnotation(annotation, guidelines) {
    var result = annotation;

    // surround markers in angle brackets (e.g. <nib>) with whitespace
    result = result.replace(/<[~^a-z0-9]+>/g, function (x) {
        return " " + x + " ";
    });
    // insert a whitespace before a * that stands between two word characters
    result = result.replace(/(\w|ä|ü|ö|ß|Ü|Ö|Ä)\*(\w|ä|ü|ö|ß|Ü|Ö|Ä)/g, "$1 *$2");
    // set whitespace before and after **
    result = result.replace(/(\*\*)|(\s\*\*)|(\*\*\s)/g, " ** ");

    // collapse runs of whitespace into a single space
    result = result.replace(/\s+/g, " ");
    // remove whitespace at the start and the end
    result = result.replace(/^\s+/g, "");
    result = result.replace(/\s$/g, "");
    return result;
}
    
    
/*
###### Default methods.
 */
function escapeRegex(regex_str) {
    // escape special chars in regex; "$&" inserts the matched character
    return regex_str.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

function sortValidationResult(result) {
    return result.sort(function (a, b) {
        if (a.start === b.start)
            return 0;
        if (a.start < b.start)
            return -1;
        if (a.start > b.start)
            return 1;
    });
}
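
When testing a functions.js outside of OCTRA it helps to turn the returned elements into readable messages. The sketch below is not part of OCTRA; findInstruction and describeErrors are hypothetical names. It looks up each code in the instructions of the guidelines and quotes the invalid part of the text.

```javascript
// Hypothetical helper: finds the guideline entry with the given code, or null.
function findInstruction(guidelines, code) {
  var groups = guidelines.instructions || [];
  for (var i = 0; i < groups.length; i++) {
    var entries = groups[i].entries || [];
    for (var j = 0; j < entries.length; j++) {
      if (entries[j].code === code) {
        return entries[j];
      }
    }
  }
  return null;
}

// Hypothetical helper: builds one readable line per validation error.
function describeErrors(annotation, errors, guidelines) {
  return errors.map(function (error) {
    var text = annotation.substring(error.start, error.start + error.length);
    var instruction = findInstruction(guidelines, error.code);
    var title = instruction ? instruction.title : "unknown guideline";
    return error.code + " (" + title + ") at position " + error.start + ": \"" + text + "\"";
  });
}
```

Codes without a matching entry in the instructions, such as M01 in the example above, are reported as "unknown guideline".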