Describe Image Node

The Describe Image node provides descriptions for given images using reference text and selected AI models. This powerful node enables image-to-text conversion, making it ideal for generating image captions, accessibility descriptions, content analysis, and automated image documentation.

Describe Image node


Basic Usage

A basic flow uses an Image Input node to supply the image, a Text node for optional reference text, a Describe Image node to generate the description, and a Display Text node to show the result.


Inputs

The Describe Image node accepts the following inputs:

Input

  • Type: Image to be described
  • Mandatory: Required
  • Works best with: Image Input, File Upload, Generate Image output

Reference

  • Type: Reference text to guide the description style or focus
  • Mandatory: Optional
  • Works best with: Text Input, Text node

Use reference text to specify what aspects of the image to focus on or what style of description is needed.

Overwrite System Prompt

  • Type: Custom system prompt to replace default behavior
  • Mandatory: Optional
  • Works best with: Text Input, Text node

Use this to completely customize how the AI analyzes and describes images.
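To make the relationship between the three inputs concrete, here is a minimal sketch of how they might be assembled into a vision-model chat request. The default system prompt, the function name, and the exact payload shape are assumptions for illustration; the node's internal prompt is not documented.

```python
def build_messages(image_url, reference=None, overwrite_system_prompt=None):
    """Assemble a chat request from the node's three inputs (illustrative)."""
    # Assumed default; the node's actual built-in prompt is not documented.
    DEFAULT_SYSTEM_PROMPT = "Describe the supplied image in detail."

    # Overwrite System Prompt, when connected, replaces the default entirely.
    system = overwrite_system_prompt or DEFAULT_SYSTEM_PROMPT

    # The required Input image is always part of the user message.
    user_content = [{"type": "image_url", "image_url": {"url": image_url}}]

    # Optional Reference text guides the focus or style of the description.
    if reference:
        user_content.insert(0, {"type": "text", "text": reference})

    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_content},
    ]

messages = build_messages(
    "https://example.com/chart.png",
    reference="Focus on the text content of the chart.",
)
```

Note how the three inputs play distinct roles: the image is mandatory, the reference steers the existing behavior, and the overwrite prompt replaces that behavior altogether.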


Outputs

Output

  • Type: Text description of the image
  • Works best with: Display Text, Document Download, AI General Prompt

The output provides a detailed description of the image based on the selected model and any reference text provided.


Configuration

Model Selection

Select the AI model to use for image description. The node supports a wide range of vision-capable models:

GPT-4o (Default)

GPT-4o is OpenAI's advanced multimodal model with excellent vision capabilities for detailed and accurate image descriptions.

Available Models

Vision Models:

  • Mistral OCR: Specialized for optical character recognition and text extraction from images
  • GPT-5 Nano: Compact model for quick image analysis
  • Gemini 2.5 Flash Lite: Google's lightweight vision model for fast descriptions
  • GPT-4o-mini: Efficient model for general image descriptions
  • GPT-5 Mini: Compact OpenAI model with vision capabilities
  • Claude 3 Haiku: Anthropic's fast and efficient vision model
  • Claude 3.5 Haiku: Enhanced version of Claude Haiku
  • Gemini 2.5 Flash: Google's fast vision model
  • GPT-5 Codex: Specialized for code and technical diagrams
  • Gemini 2.5 Pro: Google's professional-grade vision model

Advanced Models:

  • GPT-5: Advanced OpenAI model with superior vision understanding
  • GPT-5 Chat Latest: Latest conversational model with vision
  • Grok 4: Advanced vision and reasoning capabilities
  • Claude Sonnet 4: Anthropic's balanced vision model
  • Claude Sonnet 4.5: Enhanced Sonnet with improved vision
  • GPT-4o: OpenAI's flagship multimodal model
  • Claude Opus 4.1: Anthropic's most powerful vision model
  • Claude Opus 4: High-capability vision analysis
  • Claude 3 Opus: Anthropic's advanced vision model
  • GPT-5 Pro: Professional-grade vision analysis
  • GPT-4.5 Preview: Preview of next-generation GPT vision
  • GPT-4 Vision: OpenAI's vision-specialized model
  • GPT-4.1 Mini: Compact advanced vision model
  • GPT-4.1 Nano: Ultra-compact vision model
  • Mathpix OCR: Specialized for mathematical notation and equations
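One practical way to use the lists above is to select a model per task rather than hard-coding a single choice. The mapping below is an illustrative sketch drawn from the model descriptions above; the pairings are assumptions, not product recommendations.

```python
# Illustrative task-to-model mapping based on the descriptions above.
MODEL_FOR_TASK = {
    "ocr": "Mistral OCR",                 # text extraction from images
    "math": "Mathpix OCR",                # mathematical notation and equations
    "quick_caption": "GPT-4o-mini",       # efficient general descriptions
    "detailed_description": "GPT-4o",     # detailed, accurate descriptions
    "technical_diagram": "GPT-5 Codex",   # code and technical diagrams
}

def pick_model(task, default="GPT-4o"):
    # Fall back to the node's default model for unlisted tasks.
    return MODEL_FOR_TASK.get(task, default)
```

For example, `pick_model("ocr")` selects Mistral OCR, while an unrecognized task falls back to the GPT-4o default.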

Example Workflows

Accessibility Image Descriptions

Scenario: Generate detailed accessibility descriptions for images in educational content.

Describe Image Example

Steps to Create the Flow:

  1. Add a Start Node.

  2. Add and connect an Image Input or File Upload for the image to describe.

  3. Add and connect a Text node with reference instructions.

    • Example reference text:
    Provide a detailed accessibility description suitable for screen readers. Include:
    - Main subject and composition
    - Colors and visual elements
    - Text content if present
    - Spatial relationships
    - Important details for understanding
  4. Add and connect a Describe Image Node.

    • Select Model: GPT-4o for detailed descriptions
    • Connect Image Input to Input
    • Connect Text node to Reference
  5. Add and connect a Display Text to show the description.

Result: Images are automatically described with detailed, accessibility-friendly text suitable for screen readers and visually impaired users.
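The accessibility workflow above can be sketched in code as well. The payload shape follows the widely used chat-completions vision format (base64 data URL inside an `image_url` content part); the function names are hypothetical, and the actual node handles this assembly internally.

```python
import base64

def image_to_data_url(image_bytes, mime="image/png"):
    # Vision APIs commonly accept inline images as base64 data URLs.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# The reference text from step 3, condensed into one string.
REFERENCE = (
    "Provide a detailed accessibility description suitable for screen readers. "
    "Include: main subject and composition, colors and visual elements, "
    "text content if present, spatial relationships, and important details."
)

def build_request(image_bytes, model="gpt-4o"):
    """Assemble the describe-image request (payload shape is illustrative)."""
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": [
                {"type": "text", "text": REFERENCE},
                {"type": "image_url",
                 "image_url": {"url": image_to_data_url(image_bytes)}},
            ]},
        ],
    }

# In a live flow this request would be sent to the selected model's API,
# and the text of the response would feed the Display Text node.
```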