Transcribe Speech Node

The Transcribe Speech node converts audio input into text using a selected OpenAI Whisper model and, optionally, reference text. It is suited to transcription services, voice input processing, accessibility features, and language learning applications.

Basic Usage

A basic flow uses four nodes: Speech Input, Text, Transcribe Speech, and Display Text.

Inputs

The Transcribe Speech node accepts the following inputs:

Input: The audio file or recording to be transcribed into text.
Reference (Optional): Reference text that can help improve transcription accuracy, especially for specialized vocabulary, names, or technical terms.
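
The Reference input resembles the prompt that OpenAI's transcription API accepts alongside the audio file. The sketch below is a minimal illustration under that assumption: `build_transcription_params` is a hypothetical helper, `whisper-1` is the model name used by OpenAI's hosted API, and how the node actually forwards reference text is not documented here.

```python
from typing import Optional


def build_transcription_params(reference: Optional[str] = None,
                               model: str = "whisper-1") -> dict:
    """Assemble keyword arguments for a Whisper transcription request.

    Assumption: the node's Reference input is forwarded as Whisper's
    `prompt`. This helper is illustrative, not the node's implementation.
    """
    params = {"model": model}
    # Only include a prompt when reference text was actually supplied.
    if reference and reference.strip():
        params["prompt"] = reference.strip()
    return params


# Reference text listing names and terms the recording is likely to contain.
print(build_transcription_params("Speakers: Dr. Okonkwo, Ms. Nakamura. Topics: Kubernetes, gRPC."))
```

The resulting dictionary could be passed as keyword arguments, together with an open audio file, to a call such as `client.audio.transcriptions.create(...)` in the OpenAI Python SDK.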

Outputs

Output: The transcribed text from the audio input.

Configuration

Model Selection

Select the OpenAI Whisper model to use for transcription:

OpenAI Whisper: Advanced speech recognition model supporting multiple languages and accents
- High accuracy for various audio qualities
- Supports 99+ languages
- Handles background noise and multiple speakers
- Automatic language detection

Available Models

OpenAI Whisper: Speech recognition model by OpenAI, used for all transcription in this node.

Usage

Setting Up Transcribe Speech

Add the node to your flow canvas.
Select Model: Choose OpenAI Whisper for speech transcription.
Connect Input: Link from upstream nodes (e.g., Speech Input Node, File Upload) to provide the audio content.
Connect Reference (Optional): Link reference text to improve accuracy for specialized content.
Connect Output: Link to downstream nodes (e.g., Display Text, Text Join, AI General Prompt) to use the transcribed text.
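
The wiring described in these steps can be pictured as plain function composition. The following is a rough sketch only: `transcribe_speech`, `fake_whisper`, and `run_flow` are hypothetical names, and the transcriber is an offline stub standing in for a real Whisper call so the example runs without credentials.

```python
from typing import Callable, Optional

# A transcriber takes audio bytes plus optional reference text and returns text.
Transcriber = Callable[[bytes, Optional[str]], str]


def transcribe_speech(audio: bytes, reference: Optional[str],
                      transcriber: Transcriber) -> str:
    """Stand-in for the Transcribe Speech node: Input + Reference -> Output."""
    if not audio:
        raise ValueError("Input is empty: connect an audio source first")
    return transcriber(audio, reference).strip()


def fake_whisper(audio: bytes, reference: Optional[str]) -> str:
    """Offline stub used in place of a real Whisper call (hypothetical)."""
    return f" transcript of {len(audio)} bytes "


def run_flow(audio: bytes, reference: Optional[str] = None) -> str:
    # Upstream node supplies `audio`; Reference is optional.
    text = transcribe_speech(audio, reference, fake_whisper)
    # Downstream node (for example Display Text) consumes the Output.
    return f"Display Text: {text}"


print(run_flow(b"\x00\x01\x02"))
```

Passing the transcriber in as an argument keeps the sketch testable; a real implementation would swap the stub for an actual speech-to-text call.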

Example Workflows

Convert Audio Input into a Note Transcription

Scenario: Create a voice note transcription tool that converts spoken recordings into written text for easy review and editing.

Steps to Create the Flow:

Add a Start Node.
Add and connect a Speech Input Node for audio recording.
- Configure to allow voice recording
- Set appropriate recording duration limits
Add and connect a Transcribe Speech Node.
- Select Model: OpenAI Whisper
- Connect the Speech Input Node's Output to Input
Add and connect a Display Text node to show the transcribed text.

Result: Users can record voice notes and immediately see them transcribed into text, making it easy to capture thoughts and ideas verbally.
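
As a way to check the example flow on paper, the connections can be written as an edge list and walked from the Start Node. This is an illustrative sketch: the edge-list format and the `execution_order` helper are assumptions for this example, not the product's export format or API.

```python
# Hypothetical description of the voice-note flow; node names follow the steps above,
# but this edge-list format is illustrative only.
FLOW_EDGES = [
    ("Start", "Speech Input"),
    ("Speech Input", "Transcribe Speech"),
    ("Transcribe Speech", "Display Text"),
]


def execution_order(edges, start="Start"):
    """Walk a linear flow from `start`, returning nodes in the order they run."""
    next_node = dict(edges)  # each node has at most one outgoing edge here
    order, current = [start], start
    while current in next_node:
        current = next_node[current]
        if current in order:
            raise ValueError(f"cycle detected at {current!r}")
        order.append(current)
    return order


print(" -> ".join(execution_order(FLOW_EDGES)))
```

If a connection is missing, the walk stops early, which makes an unconnected Display Text node easy to spot.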
