Seedance 2.0 Reference-to-Video with LLM Prompt

Generates in about 4 mins 29 secs

ashree

Nodes & Models

Floyo Partner Nodes

LLM_floyo

Ver Private

Comm Use

Seedance20FastReferenceToVideo_floyo

Ver Private

Comm Use

VideoToFrames

ComfyUI Official

LoadImage

WorkflowGraphics

ComfyUI-VideoHelperSuite

VHS_VideoCombine

ComfyUI_StarNodes

VHS_VideoCombine

ComfyUI-S3-IO

VHS_VideoCombine

Cinematic video generation with Seedance 2.0 Fast, paired with an LLM prompt builder.

Upload up to nine reference images. Write your scene in plain prose, or use labeled sections (visual, dialog, audio, camera, style). The LLM rewrites it into Seedance's exact prompt format. Then Seedance 2.0 generates the video with native audio.

Default output is 8 seconds at 720p, 16:9. The video is re-encoded as an H.265 MP4 with the audio track preserved.
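For anyone who wants to reproduce the final re-encode step outside the workflow, here is a minimal sketch that builds an equivalent ffmpeg command. It is an assumption about what a comparable standalone step looks like, not a description of what the workflow's video-combine node runs internally. The file names and the CRF value are hypothetical.

```python
import shlex

def build_h265_command(src: str, dst: str, crf: int = 23) -> list[str]:
    """Build an ffmpeg command that re-encodes video to H.265 and copies audio."""
    return [
        "ffmpeg",
        "-i", src,
        "-c:v", "libx265",  # H.265/HEVC software encoder
        "-crf", str(crf),   # quality target; lower means higher quality (assumed value)
        "-c:a", "copy",     # keep the existing audio track untouched
        dst,
    ]

# Hypothetical file names, for illustration only.
cmd = build_h265_command("seedance_raw.mp4", "seedance_h265.mp4")
print(shlex.join(cmd))
```

Copying the audio stream avoids a second lossy encode, which is one way to keep the track "preserved" in the sense described above.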

How do you use Seedance 2.0 reference-to-video?

Upload your reference images. Write a scene in prose or labeled sections like [VISUAL], [DIALOG], [AUDIO], [CAMERA], [STYLE]. The LLM converts your input into Seedance's format with proper dialog quoting, layered audio cues, and reference image tags. Set duration, resolution, aspect ratio, and audio toggle. Run.

Reference images: Up to nine slots. The LLM tags them as [Image1], [Image2], [Image3] in the final prompt so Seedance knows what to anchor on. Use them for characters you want to keep consistent, environments you want to match, or lighting moods you want to carry through. Want one strong reference? Use a single image and leave the rest empty. Want a multi-element scene? Spread your references across slots and tell the LLM which is which in your prompt.
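The slot-to-tag mapping can be sketched in a few lines. This is only an illustration of the numbering idea, assuming tags follow slot order starting at 1. The function name and role descriptions are hypothetical, and the LLM node does this tagging for you in the actual workflow.

```python
MAX_REFERENCE_IMAGES = 9  # the workflow exposes up to nine slots

def tag_references(roles: list[str]) -> dict[str, str]:
    """Map each role description to an [ImageN] tag in slot order."""
    if not roles:
        raise ValueError("at least one reference image is expected")
    if len(roles) > MAX_REFERENCE_IMAGES:
        raise ValueError(f"at most {MAX_REFERENCE_IMAGES} reference images are supported")
    return {f"[Image{i}]": role for i, role in enumerate(roles, start=1)}

# Hypothetical roles for a multi-element scene.
tags = tag_references(["lead character", "rainy alley environment", "neon lighting mood"])
role_notes = "; ".join(f"{tag} is the {role}" for tag, role in tags.items())
print(role_notes)
```

A sentence like the printed one is the kind of "which is which" note you can paste into your prompt.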

Prompt: Two ways to write it. Plain prose works fine: describe the scene, the motion, the mood. Or use labeled sections for tighter control: [VISUAL] for the scene, [DIALOG] for any spoken lines, [AUDIO] for sound design, [CAMERA] for the move, [STYLE] for the visual treatment. The LLM handles the formatting Seedance expects, including dialog quote rules and safe rewrites of risky language.
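If you draft prompts in a script or notes tool, a small helper can assemble the labeled sections consistently. This is a sketch of the input you type, not of the prompt the LLM emits. The one-section-per-line layout and the function name are assumptions.

```python
SECTION_ORDER = ["VISUAL", "DIALOG", "AUDIO", "CAMERA", "STYLE"]

def build_labeled_prompt(sections: dict[str, str]) -> str:
    """Join the non-empty sections in a fixed order, one labeled line each."""
    unknown = set(sections) - set(SECTION_ORDER)
    if unknown:
        raise ValueError(f"unknown sections: {sorted(unknown)}")
    lines = [
        f"[{name}] {sections[name].strip()}"
        for name in SECTION_ORDER
        if sections.get(name, "").strip()
    ]
    return "\n".join(lines)

# Hypothetical scene; omitted sections are simply skipped.
prompt = build_labeled_prompt({
    "VISUAL": "A courier sprints across a rain-slick rooftop at night.",
    "AUDIO": "Heavy rain, distant sirens, low synth pulse.",
    "CAMERA": "Tracking shot from behind, slow push in.",
})
print(prompt)
```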

LLM model: Gemini 2.5 Flash by default. Fast and good at structured prompt rewriting. Swap to a stronger model if your prompts have heavy dialog or layered audio that needs more careful handling.

Duration: 8 seconds is the default and the sweet spot for cinematic shots. Want a quick gesture or beat? Try 5s. Want a longer establishing sequence? Push to 10s. Note that Seedance pacing tightens as duration drops, so short clips feel snappier.

Resolution: 720p is the default. Use 1080p for final delivery and 480p for fast iteration when you're testing prompts.

Aspect ratio: 16:9 for cinematic and wide compositions. 9:16 for vertical short-form. 1:1 for square posts.

Generate audio: On by default. Seedance 2.0 generates audio natively, and the LLM layers ambient sound, foreground effects, and music tone into the prompt for you. Turn it off if you're scoring the clip yourself in post.

Seed: Randomize to explore variations of the same prompt. Lock a seed when you want to compare different prompt edits without motion drifting.
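The settings above can be summarized as a small validated record. This is a sketch for keeping your own notes or presets. The field names are hypothetical and are not the node's actual input names, and the allowed values are only the options mentioned on this page, so the node may accept others.

```python
from dataclasses import dataclass, replace
from typing import Optional

DURATIONS_S = (5, 8, 10)                  # options mentioned on this page
RESOLUTIONS = ("480p", "720p", "1080p")
ASPECT_RATIOS = ("16:9", "9:16", "1:1")

@dataclass(frozen=True)
class GenerationSettings:
    duration_s: int = 8
    resolution: str = "720p"
    aspect_ratio: str = "16:9"
    generate_audio: bool = True
    seed: Optional[int] = None  # None stands for "randomize"

    def __post_init__(self) -> None:
        if self.duration_s not in DURATIONS_S:
            raise ValueError(f"duration_s must be one of {DURATIONS_S}")
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {RESOLUTIONS}")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")

# Iterate cheaply with a locked seed, then switch only the resolution for delivery.
draft = GenerationSettings(duration_s=5, resolution="480p", seed=1234)
final = replace(draft, resolution="1080p")
print(draft)
print(final)
```

Keeping the seed fixed between `draft` and `final` mirrors the advice above: change one thing at a time so differences come from your edit and not from a new random seed.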

What is Seedance 2.0 reference-to-video good for?

Cinematic short clips where visual continuity matters. Reference images keep characters, environments, and art direction locked while Seedance handles motion and audio. Strong fit for anime sequences, film-style action shots, music videos, concept trailers, and any 5-to-10-second beat where mood and look outweigh long narrative.

Best for cinematic moments built around a specific look. Anime fight sequences, atmospheric establishing shots, character-driven beats, action cuts with a specific lighting mood. The LLM prompt builder helps when you want layered audio and clean camera language but don't want to memorize Seedance's exact format.

Skip this for long-form narrative or talking-head dialog scenes. At 5 to 10 seconds per clip there is little room for storytelling, and Seedance's audio is better at ambient sound and music tone than at extended dialog. For multi-shot sequences, generate clips one at a time and edit them together.
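For simple cuts-only assembly, one option outside the workflow is ffmpeg's concat demuxer. The sketch below only generates the list file's text. The clip names are hypothetical, and stream copy is assumed to work because clips exported with identical settings usually share codec parameters.

```python
def concat_list_text(clips: list[str]) -> str:
    """Return the text of an ffmpeg concat-demuxer list, one clip per line."""
    lines = []
    for clip in clips:
        escaped = clip.replace("'", "'\\''")  # escape single quotes inside quoted paths
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"

# Hypothetical clip names, in playback order.
listing = concat_list_text(["shot_01.mp4", "shot_02.mp4", "shot_03.mp4"])
print(listing, end="")

# Save the text as clips.txt, then run:
#   ffmpeg -f concat -safe 0 -i clips.txt -c copy sequence.mp4
```

If the clips differ in resolution or codec settings, re-encode them to a common format first or use a regular video editor.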

Also skip this if you have no reference images and want pure text-to-video. The reference-to-video setup is built around image anchoring. A plain text-to-video workflow will be faster and simpler.

FAQ

What's the difference between Seedance 2.0 reference-to-video and text-to-video?

Reference-to-video uses your uploaded images as visual anchors for characters, environments, and style. Text-to-video generates everything from the prompt alone. Reference-to-video gives you tighter control over what your subject and scene look like. Text-to-video gives Seedance more freedom to interpret.

How many reference images can Seedance 2.0 take?

Up to nine images, three videos, and three audio clips. Most shots only need one or two image references. The LLM tags them in order as [Image1], [Image2], [Image3] so Seedance knows which reference plays which role.

Does Seedance 2.0 generate audio natively?

Yes. Audio generation is built in and on by default in this workflow. The LLM layers your audio cues (ambient base, sound effects, music tone) into the prompt so the output has a complete sound bed. Turn it off if you plan to score the clip yourself.

What durations does Seedance 2.0 Fast support?

Common options are 5, 8, and 10 seconds. Eight is the default and the best balance of motion and pacing. Shorter clips feel snappier. Longer clips give more breathing room for camera moves and atmosphere.

Why does the workflow use an LLM before Seedance?

Seedance 2.0 has specific formatting rules: dialog needs straight double quotes with speaker tags, audio cues belong in a specific spot, camera moves draw from a fixed vocabulary. The LLM handles all of that automatically and applies safe rewrites if your prompt has risky language. You write naturally; the LLM produces the prompt Seedance wants.
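To make the dialog rule concrete, here is a minimal sketch of one normalization the LLM step takes care of: turning curly quotes into straight double quotes and attaching a speaker. The `Speaker: "line"` layout is an assumption for illustration, since the page does not spell out Seedance's exact speaker-tag syntax.

```python
# Map common curly double quotes to the straight ASCII double quote.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"', "\u201e": '"'})

def format_dialog_line(speaker: str, line: str) -> str:
    """Return a dialog line wrapped in straight double quotes with a speaker label."""
    text = line.translate(CURLY_TO_STRAIGHT).strip().strip('"')
    return f'{speaker.strip()}: "{text}"'  # assumed layout, not Seedance's documented syntax

# Hypothetical speaker and line, pasted with curly quotes from a word processor.
formatted = format_dialog_line("Mara", "\u201cWe leave at dawn.\u201d")
print(formatted)
```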

How do you run Seedance 2.0 online?

You can run Seedance 2.0 online through Floyo. No installation, no setup. Open the workflow in your browser, upload your references, write your scene, and hit run.
