Learn and Explore Transcription Data (Speech-to-text)
There are many speech-to-text (S2T) transcription service providers available. Popular providers include Speechmatics, Azure, Google, and Fraunhofer, to name a few.
DAVID has also created a proof of concept for a Whisper-based S2T service provider that can be hosted on-premises.
How transcription data (S2T) is created in DigaSystem
In DigaSystem, transcription data can be created in various ways:
Centralized approach:
A workflow (e.g. PreProductionAudio) watches for new content in DigaSystem tables and adds transcription data for new entries. The resulting DigaSystem S2T information is stored together with the entry, and all applications can make use of it.
Application approach:
Alternatively or additionally, some applications, such as CAE (ContextualAudioEditing) or MTE, can talk to S2T service providers directly and immediately and create transcription data inside the application.
Technically, talking to an S2T service provider and creating DigaSystem-compatible S2T data consists of two steps:
Using the API of the service provider to create S2T data in the provider's data format
Transforming the provider's data format into the DigaSystem S2T data format
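The two steps above can be sketched as a small pipeline. This is only an illustration of the separation between the provider call and the format transformation: TimedWord is a hypothetical stand-in for the DigaSystem S2T data format (which is not specified here), and fake_provider and fake_transform are made-up stubs, not real provider or DigaSystem APIs.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TimedWord:
    """Hypothetical stand-in for one word in the target S2T format."""
    text: str
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio


def transcribe(
    audio_path: str,
    call_provider: Callable[[str], dict[str, Any]],
    transform: Callable[[dict[str, Any]], list[TimedWord]],
) -> list[TimedWord]:
    # Step 1: use the provider's API to get S2T data in the provider's format.
    provider_result = call_provider(audio_path)
    # Step 2: transform the provider's format into the target format.
    return transform(provider_result)


def fake_provider(audio_path: str) -> dict[str, Any]:
    # Made-up provider response; a real client would send the audio
    # to the provider and wait for the finished transcript.
    return {
        "words": [
            {"w": "hello", "from": 0.0, "to": 0.5},
            {"w": "world", "from": 0.5, "to": 1.0},
        ]
    }


def fake_transform(result: dict[str, Any]) -> list[TimedWord]:
    # Map the made-up provider fields onto the stand-in target structure.
    return [TimedWord(w["w"], w["from"], w["to"]) for w in result["words"]]


words = transcribe("example.wav", fake_provider, fake_transform)
```

Keeping the two steps behind separate functions means a different provider only requires a new call_provider and transform pair, while the code that stores the result stays unchanged.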
Currently supported S2T service providers
The transcription service providers supported by each of the above alternatives might differ.
PreProductionAudio workflow
Speechmatics v2 protocol: this includes Speechmatics itself as well as custom services that simulate the Speechmatics v2 protocol (API and data), such as our own Whisper-based web service
MTE
Speechmatics v2 protocol, including the Whisper-based DAVID solution
ContextualAudioEditing
Speechmatics v1 protocol (old and deprecated)
Speechmatics v2 protocol, including the Whisper-based DAVID solution
Microsoft Azure protocol
Deepgram protocol
Auphonic protocol
How to support additional S2T service providers
To optimize for feature set, stability, and flexibility, DAVID will not develop a new technical interface for each additional provider.
Instead, we provide information that helps others integrate any S2T service provider into DigaSystem:
Description of the DigaSystem S2T data format
Example code
Approach A: Simulate a supported protocol, e.g. Speechmatics v2
Write a web service that simulates the API protocol of a supported S2T service provider. Internally, this service talks to your transcription provider.
This web service must also transform the S2T data format of your transcription provider to the DigaSystem S2T format.
We did this with our Whisper-based web service, which simulates the Speechmatics v2 protocol.
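As an illustration of the data side of such a simulation, the following sketch maps word-level Whisper output onto a Speechmatics-v2-style transcript. It is not the DAVID implementation. It assumes the input has the shape produced by the open-source whisper package with word timestamps enabled (segments containing words with word, start, end, and probability), and that the output follows the general layout of the Speechmatics v2 JSON transcript (a results list of word items with start_time, end_time, and alternatives). The function name, the format version string, and the language default are assumptions; the exact fields should be checked against the Speechmatics documentation. The HTTP endpoints that a simulating service would also need are not shown.

```python
from typing import Any


def whisper_to_speechmatics_v2(
    whisper_result: dict[str, Any],
    language: str = "en",          # assumed default, adjust as needed
    format_version: str = "2.0",   # assumed placeholder version string
) -> dict[str, Any]:
    """Map Whisper-style word timestamps to a Speechmatics-v2-style dict."""
    results = []
    for segment in whisper_result.get("segments", []):
        for word in segment.get("words", []):
            results.append({
                "type": "word",
                "start_time": float(word["start"]),
                "end_time": float(word["end"]),
                "alternatives": [{
                    # Whisper words usually carry a leading space.
                    "content": word["word"].strip(),
                    "confidence": float(word.get("probability", 1.0)),
                    "language": language,
                }],
            })
    return {"format": format_version, "results": results}


example = whisper_to_speechmatics_v2({
    "segments": [{
        "words": [
            {"word": " Hello", "start": 0.0, "end": 0.5, "probability": 0.9},
            {"word": " world", "start": 0.5, "end": 1.0},
        ]
    }]
})
```

A client that expects the Speechmatics v2 protocol would receive this structure from the simulating web service in place of the real provider's response.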
Approach B: Create DigaSystem S2T files for entries through a custom workflow
Write a workflow that talks to your transcription provider and transforms the received S2T data into DigaSystem-compatible S2T data
Use the DPE ContentService API to add the created S2T file to an entry
See...