Learn and Explore Transcription Data (Speech-to-text)

Many speech-to-text (S2T) transcription service providers are available. Popular ones include Speechmatics, Azure, Google, and Fraunhofer.

DAVID has also created a proof of concept for a Whisper-based S2T service provider that can be hosted on-premises.

How transcription data (S2T) is created in DigaSystem

In DigaSystem, transcription data can be created in two ways:

  • Centralized approach:
    A workflow (e.g. PreProductionAudio) watches DigaSystem tables for new content and adds transcription data to new entries. The resulting DigaSystem S2T information is stored with the entry, so all applications can use it.

  • Application approach:
    Alternatively or additionally, some applications, such as CAE (ContextualAudioEditing) or MTE, can talk to S2T service providers directly and immediately, and create transcription data inside the application.

Technically, talking to an S2T service provider and creating DigaSystem-compatible S2T data consists of two steps:

  • Using the API of the service provider to create S2T data in the provider's data format

  • Transforming the provider's data format into the DigaSystem S2T data format
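The two steps can be sketched as follows. This is a minimal illustration only: the provider response, its field names (`w`, `from`, `to`, `conf`), and the target structure are all hypothetical placeholders. The actual DigaSystem S2T data format is not described on this page and must be taken from its own description.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Word:
    # Hypothetical target fields; NOT the real DigaSystem S2T schema.
    text: str
    start_ms: int
    end_ms: int
    confidence: float


def fetch_provider_transcript(audio_path: str) -> dict:
    """Step 1 (stub): return the provider's native result for an audio file.

    A real implementation would call the provider's API here. The canned
    response and its field names are invented for illustration.
    """
    return {
        "words": [
            {"w": "hello", "from": 0.0, "to": 0.42, "conf": 0.98},
            {"w": "world", "from": 0.5, "to": 0.9, "conf": 0.91},
        ]
    }


def to_target_format(provider_result: dict) -> str:
    """Step 2: map the provider's fields onto the (placeholder) target format."""
    words = [
        Word(
            text=item["w"],
            start_ms=round(item["from"] * 1000),  # seconds to milliseconds
            end_ms=round(item["to"] * 1000),
            confidence=item["conf"],
        )
        for item in provider_result["words"]
    ]
    return json.dumps({"words": [asdict(w) for w in words]}, indent=2)


if __name__ == "__main__":
    print(to_target_format(fetch_provider_transcript("example.wav")))
```

Keeping the two steps in separate functions means that a different provider only requires a new version of step 1 and an adjusted field mapping in step 2.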

Currently supported S2T service providers

The transcription service providers supported by each workflow or application can differ.

PreProductionAudio workflow

  • Speechmatics v2 protocol: this covers Speechmatics itself as well as custom services that simulate the Speechmatics v2 protocol (API and data), such as our own Whisper-based web service

MTE

  • Speechmatics v2 protocol, including the Whisper-based DAVID solution

ContextualAudioEditing

  • Speechmatics v1 protocol (old and deprecated)

  • Speechmatics v2 protocol, including the Whisper-based DAVID solution

  • Microsoft Azure protocol

  • Deepgram protocol

  • Auphonic protocol

How to support additional S2T service providers

To optimize for feature set, stability, and flexibility, DAVID will not develop a new technical interface for each additional provider.

Instead, we provide information that helps others integrate any S2T service provider into DigaSystem:

  • Description of the DigaSystem S2T data format

  • Example code

Approach A: Simulate a supported protocol, e.g. Speechmatics v2

  • Write a web service that simulates the API protocol of a supported S2T service provider. This service internally talks to your transcription provider

  • This web service must also transform the S2T data format of your transcription provider to the DigaSystem S2T format

We did this with our Whisper-based web service, which simulates the Speechmatics v2 protocol.
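As a rough illustration of what such a wrapper service has to keep track of, the sketch below models an asynchronous job lifecycle (submit, poll, fetch result) in memory. It assumes that the simulated protocol is job-based. The class name, method names, and status strings are hypothetical and are not taken from the Speechmatics API or from DAVID's Whisper-based service; the HTTP layer is left out entirely.

```python
import uuid
from typing import Callable, Dict


class TranscriptionJobs:
    """In-memory job registry mimicking a submit / poll / fetch lifecycle."""

    def __init__(self, transcribe: Callable[[bytes], dict]) -> None:
        # `transcribe` wraps the real provider and returns data already
        # converted to the format the calling client expects.
        self._transcribe = transcribe
        self._jobs: Dict[str, dict] = {}

    def submit(self, audio: bytes) -> str:
        """Register a new job and return its id."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = {"status": "running", "audio": audio, "result": None}
        return job_id

    def process(self, job_id: str) -> None:
        """Run the backend; a real service would do this in a background worker."""
        job = self._jobs[job_id]
        try:
            job["result"] = self._transcribe(job["audio"])
            job["status"] = "done"
        except Exception as exc:  # keep the failure visible to polling clients
            job["status"] = "rejected"
            job["error"] = str(exc)

    def status(self, job_id: str) -> str:
        return self._jobs[job_id]["status"]

    def transcript(self, job_id: str) -> dict:
        job = self._jobs[job_id]
        if job["status"] != "done":
            raise RuntimeError(f"job {job_id} is {job['status']}")
        return job["result"]
```

The web service would expose these operations under the routes and response shapes of the protocol being simulated, so that existing clients cannot tell the difference.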

Approach B: Create DigaSystem S2T files for entries through a custom workflow

  • Write a workflow that talks to your transcription provider and transforms the received S2T data into DigaSystem-compatible S2T data

  • Use the DPE ContentService API to add the created S2T file to an entry
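The overall shape of such a workflow might look like the sketch below. All names are hypothetical: `transcribe`, `transform`, and `attach` are injected callables, and `attach` merely stands in for the DPE ContentService API call, whose actual interface is not shown here. The entry fields (`id`, `audio_path`, `has_s2t`) are likewise assumptions made for the example.

```python
from typing import Callable, Iterable, List


def run_s2t_workflow(
    entries: Iterable[dict],
    transcribe: Callable[[str], dict],
    transform: Callable[[dict], str],
    attach: Callable[[str, str], None],
) -> List[str]:
    """Process entries that lack S2T data and return the ids that were handled."""
    processed: List[str] = []
    for entry in entries:
        if entry.get("has_s2t"):
            continue  # already transcribed, nothing to do
        provider_result = transcribe(entry["audio_path"])  # talk to the provider
        s2t_document = transform(provider_result)          # convert to target format
        attach(entry["id"], s2t_document)                  # placeholder for the API call
        processed.append(entry["id"])
    return processed
```

Passing the three operations in as parameters keeps the loop independent of any specific provider and lets it be tested without network access.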

See...
