Audio Input
The Audio Input node lets users record audio through their microphone or upload existing audio files during workflow execution. Unlike the Audio node, which stores static audio files, the Audio Input node captures user-generated audio while the flow is running. This makes it the node to use for voice responses, speech assessments, audio submissions, and interactive voice-based activities.

Basic Usage
Use the Audio Input node to enable users to record audio through their microphone or upload audio files as part of the workflow interaction.
Inputs
The Audio Input node does not accept inputs from other nodes. It collects audio directly from users during flow execution.
Outputs
Audio Out (Blue Port)
Audio Data: Outputs the recorded or uploaded audio file.
- Connects to blue input ports of audio-processing nodes
- Passes the audio file/data to connected nodes
- Can be used for transcription, analysis, or storage
- Available after user completes recording or upload
Audio File URL (Green Port)
File URL: Outputs the URL of the recorded/uploaded audio file.
- Connects to green input ports of text-compatible nodes
- Provides the URL as a text string
- Can be stored, displayed, or used in API calls
- Useful for referencing the audio file location
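As a minimal sketch of the "used in API calls" case, the snippet below wraps the URL text from the green port in a JSON body for a downstream service. The function name `build_audio_payload`, the field names, and the example URL are hypothetical and not part of the platform; the only thing taken from this page is that the port emits the URL as a text string.

```python
import json
from urllib.parse import urlparse


def build_audio_payload(audio_url: str, label: str) -> str:
    """Wrap an audio file URL in a JSON body for a hypothetical downstream API."""
    parsed = urlparse(audio_url)
    # Reject anything that is not an absolute http(s) URL before passing it on.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {audio_url!r}")
    # Field names are illustrative assumptions, not a documented schema.
    return json.dumps({"label": label, "audio_url": audio_url})


# Placeholder URL (assumed); real URLs come from the node at runtime.
body = build_audio_payload("https://example.com/recordings/answer-1.webm", "answer-1")
```

Validating the URL before forwarding it keeps a malformed or empty value from reaching the external service.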
Configuration
The Audio Input node requires no configuration. It automatically provides recording and upload capabilities to users when the flow executes.
User Interface (Runtime)
When users reach this node during flow execution, they see:
Recording Interface:
- Microphone button to start/stop recording
- Recording timer
- Playback controls to review recording
- Option to re-record
Upload Interface:
- Button to upload existing audio files
- File browser to select audio files
- Supported formats: MP3, WAV, M4A, OGG, WebM, etc.
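If you prepare files outside the platform, a quick pre-check against the listed formats can be sketched as follows. This is only an illustration: the extension set mirrors the formats named above, the list ends in "etc." so the platform may accept more, and `is_listed_audio_format` is a hypothetical helper rather than a platform function.

```python
from pathlib import Path

# Formats named on this page; the platform's full list may be longer.
# The extensions are the conventional ones for each format (an assumption).
LISTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".webm"}


def is_listed_audio_format(filename: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    return Path(filename).suffix.lower() in LISTED_EXTENSIONS
```

The check is case-insensitive, so `Answer.MP3` and `answer.mp3` are treated the same. It inspects only the file name, not the actual audio encoding.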
Example Workflows
Audio Transcription Workflow
Scenario: Record user audio and convert it to text using AI transcription.

Steps to Create the Flow:
- Add a Start node.
- Add an Audio Input node:
  - No configuration needed
  - User will record or upload audio when flow runs
- Add a Transcribe Speech node:
  i. Configure transcription model:
    - Model Dropdown: Select "OpenAI Whisper" (or other available models)
    - Language Detection: Select "Auto Detect" (or specific language)
  ii. Connect audio:
    - Audio Input "Audio Out" (blue) → Transcribe Speech "Input" (blue)
- Add a Display Text node:
  - Connect: Transcribe Speech "Output" (green) → Display Text "Input" (green)
  - This shows the transcribed text to the user
- Connect flow control:
  - Start → Audio Input (red to red)
  - Audio Input → Transcribe Speech (red to red)
  - Transcribe Speech → Display Text (red to red)
Preview:
[Start] → [Audio Input: User records audio]
→ [Transcribe Speech: Convert to text]
→ [Display Text: Show transcript]
Result: User records audio through their microphone, the AI transcribes it to text, and the transcript is displayed.
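The wiring in this workflow follows one rule: a connection always joins two ports of the same color (blue for audio, green for text, red for flow control). The sketch below models the connections as plain data and checks that rule. It is a mental model only, not the platform's internal format; the `Port` and `Connection` classes and the red port names "Flow Out" and "Flow In" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    node: str
    name: str
    color: str  # "blue" = audio, "green" = text, "red" = flow control


@dataclass(frozen=True)
class Connection:
    source: Port
    target: Port


def mismatched(connections: list[Connection]) -> list[Connection]:
    """Return the connections whose source and target port colors differ."""
    return [c for c in connections if c.source.color != c.target.color]


def flow(src: str, dst: str) -> Connection:
    """Build a red flow-control connection (port names are hypothetical)."""
    return Connection(Port(src, "Flow Out", "red"), Port(dst, "Flow In", "red"))


connections = [
    # Data connections from the steps in this workflow.
    Connection(Port("Audio Input", "Audio Out", "blue"),
               Port("Transcribe Speech", "Input", "blue")),
    Connection(Port("Transcribe Speech", "Output", "green"),
               Port("Display Text", "Input", "green")),
    # Flow-control connections.
    flow("Start", "Audio Input"),
    flow("Audio Input", "Transcribe Speech"),
    flow("Transcribe Speech", "Display Text"),
]

# An empty result means every connection joins same-colored ports.
problems = mismatched(connections)
```

Wiring "Audio Out" (blue) straight into a green text input would show up in `problems`, which is why the transcription node sits between the audio and the text display.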
Related Nodes
- Transcribe Speech: Converts audio to text (primary processing node)
- Audio: Static audio files (different from user input)
- Play Sound: Plays audio back to users
- Text to Speech: Generates audio from text (opposite direction)
- Speech Input: Alternative audio capture node
- Display Text: Shows instructions and transcripts
- AI General Feedback: Analyzes transcribed content
- Data Dump: Stores audio URLs and metadata
Audio Input vs Other Audio Nodes
| Feature | Audio Input | Audio Node | Speech Input |
|---|---|---|---|
| Purpose | User records/uploads | Static audio files | User voice recording |
| Audio Source | User microphone/file | Pre-uploaded by creator | User microphone |
| When Captured | During flow execution | Before flow creation | During flow execution |
| Use Case | Assignments, responses | Lessons, instructions | Voice responses |
| File Management | Runtime capture | Stored in workflow | Runtime capture |
| Outputs | Audio + URL | Audio data | Audio data |
When to use Audio Input:
- Student audio submissions
- Voice-based assessments
- Speech practice and recording
- Audio feedback collection
- Voice interviews
- Dictation and notes
When to use Audio Node:
- Pre-recorded instructions
- Music and sound effects
- Lecture recordings
- Static audio content
When to use Speech Input:
- Simple voice recording
- Quick audio capture
- Speech-to-text only
- Voice commands
Summary
The Audio Input node enables powerful voice-based interactions:
✓ Interactive: Real-time user audio capture
✓ Flexible: Recording or file upload options
✓ Integrated: Works with transcription and AI nodes
✓ Accessible: Multiple output formats (audio + URL)
✓ Versatile: Supports diverse voice-based workflows
Use the Audio Input node to build voice-based assessments, language learning activities, and other interactive audio experiences.