Learn and Explore Transcription Data (Speech-to-text)

There are many speech-to-text (S2T) transcription service providers available. Popular providers include Speechmatics, Azure, Google, and Fraunhofer, to name a few.

DAVID has also created a proof of concept for a Whisper-based S2T service provider that can be hosted on-premises.

How transcription data (S2T) is created in DigaSystem

In DigaSystem, transcription data can be created in various ways:

Centralized approach:
A workflow (e.g. PreProductionAudio) watches for new content in DigaSystem tables and adds transcription data for new entries. The resulting DigaSystem S2T information is stored together with the entry, and all applications can make use of it.
Application approach:
Alternatively or additionally, some applications, such as CAE (ContextualAudioEditing) or MTE, can talk to S2T service providers directly and immediately, and create transcription data inside the application.

Technically, talking to an S2T service provider and creating DigaSystem-compatible S2T data consists of two steps:

Using the API of the service provider to create S2T data in the provider's data format
Transforming the provider's data format into the DigaSystem S2T data format
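
The two steps above can be sketched roughly as follows. This is only an illustration: the input shape is modelled on a Speechmatics-style JSON result (a "results" list whose items carry start_time, end_time and alternatives), and the output records (text, start_ms, end_ms, confidence) are a hypothetical placeholder, not the actual DigaSystem S2T data format. The names transcribe, to_word_list and fake_provider are made up for this example.

```python
from typing import Any, Callable


def to_word_list(provider_result: dict[str, Any]) -> list[dict[str, Any]]:
    """Step 2: flatten a provider result into simple word records.

    The output layout is a placeholder; replace it with the real
    DigaSystem S2T structure when implementing this for production.
    """
    words = []
    for item in provider_result.get("results", []):
        alternatives = item.get("alternatives") or []
        if not alternatives:
            continue  # nothing recognised for this item
        best = alternatives[0]  # assume the first alternative is the best one
        words.append({
            "text": best["content"],
            "start_ms": round(item["start_time"] * 1000),
            "end_ms": round(item["end_time"] * 1000),
            "confidence": best.get("confidence"),
        })
    return words


def transcribe(audio_path: str,
               call_provider: Callable[[str], dict[str, Any]]) -> list[dict[str, Any]]:
    """Run both steps: call the provider, then transform its result."""
    provider_result = call_provider(audio_path)  # step 1: provider API call
    return to_word_list(provider_result)         # step 2: format transformation


def fake_provider(audio_path: str) -> dict[str, Any]:
    """Stand-in for a real provider call so the sketch runs offline."""
    return {"results": [
        {"type": "word", "start_time": 0.0, "end_time": 0.5,
         "alternatives": [{"content": "hello", "confidence": 0.98}]},
        {"type": "word", "start_time": 0.5, "end_time": 1.25,
         "alternatives": [{"content": "world", "confidence": 0.91}]},
    ]}


if __name__ == "__main__":
    for word in transcribe("demo.wav", fake_provider):
        print(word)
```

Passing the provider call in as a function keeps the transformation step independent of any particular provider, so only to_word_list has to change when the target format is filled in.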

Currently supported S2T service providers

The transcription service providers supported by each of the above alternatives might differ.

PreProductionAudio workflow

Speechmatics v2 protocol: this includes Speechmatics itself as well as custom services that simulate the Speechmatics v2 protocol (API and data), such as our own Whisper-based web service

MTE

Speechmatics v2 protocol, including the Whisper-based DAVID solution

ContextualAudioEditing

Speechmatics v1 protocol (old and deprecated)
Speechmatics v2 protocol, including the Whisper-based DAVID solution
Microsoft Azure protocol
Deepgram protocol
Auphonic protocol

How to support additional S2T service providers

To optimize for feature set, stability, and flexibility, DAVID will not develop a new technical interface for each additional provider.

Instead, we provide information that helps others integrate any S2T service provider into DigaSystem:

Description of the DigaSystem S2T data format
Example code

Approach A: Simulate a supported protocol, e.g. Speechmatics v2

Write a web service that simulates the API protocol of a supported S2T service provider. Internally, this service talks to your transcription provider.
This web service must also transform the S2T data format of your transcription provider to the DigaSystem S2T format.

We did this with our Whisper-based web service, which simulates the Speechmatics v2 protocol.
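
As a rough illustration of Approach A, the sketch below shows a tiny facade service built on the Python standard library. The endpoint layout (POST /v2/jobs, GET /v2/jobs/<id>, GET /v2/jobs/<id>/transcript) reflects our understanding of the Speechmatics v2 batch API and should be checked against the official Speechmatics documentation. The function run_backend is a hypothetical stand-in for the call to your own transcription provider, and the returned payload is a placeholder, not a real transcript. This is not the code of the DAVID Whisper-based service.

```python
import json
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer

JOBS: dict[str, dict] = {}  # in-memory job store; a real service would persist jobs


def run_backend(audio: bytes) -> dict:
    """Hypothetical stand-in: call your provider and convert its result here."""
    return {"results": []}  # placeholder payload


def route(method: str, path: str, body: bytes = b"") -> tuple[int, dict]:
    """Map a request onto the emulated job endpoints; returns (status, JSON payload)."""
    parts = [p for p in path.split("?")[0].split("/") if p]
    if parts[:2] != ["v2", "jobs"]:
        return 404, {"error": "not found"}
    if method == "POST" and len(parts) == 2:
        job_id = uuid.uuid4().hex[:10]
        # The raw request body is passed on unparsed (a real service would
        # decode the multipart upload) and transcribed synchronously.
        JOBS[job_id] = {"status": "done", "transcript": run_backend(body)}
        return 201, {"id": job_id}
    if method == "GET" and len(parts) >= 3 and parts[2] in JOBS:
        job = JOBS[parts[2]]
        if len(parts) == 3:
            return 200, {"job": {"id": parts[2], "status": job["status"]}}
        if len(parts) == 4 and parts[3] == "transcript":
            return 200, job["transcript"]
    return 404, {"error": "not found"}


class Handler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around route()."""

    def _reply(self, method: str) -> None:
        length = int(self.headers.get("Content-Length") or 0)
        status, payload = route(method, self.path, self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self) -> None:
        self._reply("GET")

    def do_POST(self) -> None:
        self._reply("POST")


def main() -> None:
    """Serve the facade on localhost:8080 (host and port are arbitrary choices)."""
    HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()
```

Calling main() starts the service. Keeping the routing in the plain function route() makes the protocol emulation testable without opening a network socket.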

Approach B: Create DigaSystem S2T files for entries through a custom workflow

Write a workflow that talks to your transcription provider and transforms the received S2T data into DigaSystem-compatible S2T data.
Use the DPE ContentService API to add the created S2T file to an entry.

See...

Roundtrip Example Code

DAVID S2T Transcription Data Format