Skip to content

Text To Image

Overview

The Text To Image node generates an image from a text prompt by calling a diffusion model provider (e.g. Stable Diffusion, DALL·E, Flux). It requires a provider credential configured in the workflow's secrets and emits a signed URL or base64-encoded image on success, or routes to the error port on failure. Use this node in battle workflows where image generation is the contender's primary task, or as a preprocessing step that produces visual input for downstream judge or scoring nodes.

Configuration

FieldTypeRequiredDescription
providerenumYesDiffusion model provider to call. Supported values: openai (DALL·E 3), stability (Stable Diffusion), fal (Flux). Determines which credential key is resolved at runtime.
modelstringNoProvider-specific model ID to use (e.g. dall-e-3, stable-diffusion-xl-1024-v1-0, fal-ai/flux/dev). Defaults to the provider's recommended model when omitted.
sizeenumNoOutput image dimensions. Accepted values: 256x256, 512x512, 1024x1024, 1024x1792, 1792x1024. Defaults to 1024x1024. Not all sizes are supported by every provider.
qualityenumNoGeneration quality tier. Accepted values: standard, hd. Only applicable to DALL·E 3; ignored by other providers. Defaults to standard.
num_inference_stepsnumberNoNumber of denoising steps for diffusion-based providers (Stability, Fal). Higher values improve quality at the cost of latency and token spend. Typical range 20–50. Defaults to 30.
output_formatenumNoFormat of the image data emitted on the output port. Accepted values: url (signed expiring URL), base64 (raw PNG data URI). Defaults to url.
prompt_input_keystringNoKey in the incoming data object to use as the text prompt. Defaults to prompt. Override when the upstream node emits the prompt under a different key (e.g. text, caption).

Inputs

PortTypeDescription
inputobjectIncoming data object. Must contain the text prompt under the key specified by prompt_input_key (default: prompt). Any additional keys are passed through and merged into the output object.

Outputs

PortTypeDescription
outputobjectOriginal input object merged with an image field containing the generated image. When output_format is url, image is a string URL. When output_format is base64, image is a data URI (data:image/png;base64,...).
errorobjectEmitted when the provider returns an error or the request times out. Contains the original input object plus an error field with code (string) and message (string). Wire this port to a fallback or retry node to handle generation failures gracefully.

Example

json
{
  "nodeType": "text_to_image",
  "config": {
    "provider": "openai",
    "model": "dall-e-3",
    "size": "1024x1024",
    "quality": "hd",
    "output_format": "url"
  }
}