Chatterbox Text to Speech

Text to speech workflow using Chatterbox

Chatterbox

TTS

floyoofficial

Nodes & Models

ComfyUI Official: WorkflowGraphics, LoadAudio, SaveAudio, PreviewAudio

ComfyUI-ChatterboxTTS: ChatterboxTTS, ChatterboxVC

ComfyUI-For-ChatterBox: ChatterboxTTS

ComfyUI-Chatterbox: ChatterboxTTS, ChatterboxVC

Chatterbox TTS is an open-source text-to-speech and voice-cloning system. It turns text into natural-sounding speech, clones a voice from a few seconds of reference audio, and gives fine control over emotion and intensity.

What it does

Converts text into high-quality speech, with controls for pitch, speed, and emotion ranging from neutral to highly dramatic.
Performs zero-shot voice cloning: given a short reference clip (around 5 seconds), it mimics that voice with no separate training step.
Supports multilingual output (around 22 languages) and can keep a cloned voice consistent across languages, which suits dubbing and localization.
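
Outside the workflow UI, zero-shot cloning can be sketched in a few lines of Python. This assumes the open-source chatterbox-tts package and the ChatterboxTTS.from_pretrained / generate(..., audio_prompt_path=...) interface shown in its README; check both against the version you install. The helper names cloning_kwargs and synthesize are hypothetical and exist only for this illustration.

```python
from typing import Optional


def cloning_kwargs(reference_clip: Optional[str] = None) -> dict:
    """Return the extra generate() arguments for zero-shot cloning.

    With no reference clip the dict is empty, so the model would fall
    back to its built-in voice.
    """
    if reference_clip is None:
        return {}
    # Assumption: generate() accepts the clip path as `audio_prompt_path`.
    return {"audio_prompt_path": str(reference_clip)}


def synthesize(text: str, out_path: str,
               reference_clip: Optional[str] = None,
               device: str = "cpu") -> str:
    """Generate speech for `text` and write it to `out_path`."""
    # Imported lazily so this module loads without the heavy dependencies.
    import torchaudio
    from chatterbox.tts import ChatterboxTTS

    model = ChatterboxTTS.from_pretrained(device=device)
    wav = model.generate(text, **cloning_kwargs(reference_clip))
    # Assumption: the model exposes its sample rate as `model.sr`.
    torchaudio.save(out_path, wav, model.sr)
    return out_path
```

A call such as synthesize("Hello there.", "out.wav", reference_clip="speaker.wav") would then speak the text in the voice of the reference clip, with no training step in between.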

Voice change and control

Works as a voice changer: it clones a target voice and then speaks any input text in that voice, with adjustable accent, pacing, and emotional intensity.
Provides explicit "exaggeration" (intensity) parameters so you can dial emotion and expressiveness up or down programmatically.
Includes watermarking/provenance options (PerTh) in some deployments so synthetic audio can be detected and traced.
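
One way to drive the intensity control from code is to map named presets onto a single numeric value. The sketch below is illustrative only: the preset names, their values, and the 0.0 to 2.0 clamp range are assumptions, not figures from Chatterbox itself.

```python
from typing import Union

# Hypothetical presets; tune these by ear for your own voices.
EMOTION_PRESETS = {
    "neutral": 0.25,
    "default": 0.5,
    "expressive": 0.8,
    "dramatic": 1.2,
}


def exaggeration_for(level: Union[str, float],
                     lo: float = 0.0, hi: float = 2.0) -> float:
    """Turn a preset name or a raw number into a clamped intensity value.

    Unknown preset names raise KeyError so that typos fail loudly.
    """
    value = EMOTION_PRESETS[level] if isinstance(level, str) else float(level)
    return max(lo, min(hi, value))
```

The result could then be passed to the synthesis call, for example as an exaggeration keyword, assuming the installed Chatterbox version accepts one.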

How it’s typically used

Via web UIs where you paste text, choose or clone a voice, adjust emotion/pacing, and download audio.
As a self‑hosted or API‑based engine for agents, NPCs, audiobooks, podcasts, accessibility tools, or localized dubbing.
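
For long-form uses such as audiobooks or podcasts, a self-hosted setup would usually feed the engine one short passage at a time rather than a whole chapter. The sketch below shows one possible way to split text on sentence boundaries. The function name and the 300-character limit are assumptions chosen for illustration, not limits documented by Chatterbox.

```python
import re
from typing import List


def split_into_chunks(text: str, max_chars: int = 300) -> List[str]:
    """Group whole sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` is kept intact as its own
    chunk rather than being cut mid-sentence.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would be synthesized separately and the resulting audio segments concatenated in order.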
