floyo logo
Workflows
Pricing
floyo logo
Workflows
Pricing

ACE-Step 1.5 XL - Text to Music

Generate full songs with ACE-Step 1.5 XL Base. Write a style prompt, add structured lyrics like [Intro] [Verse] [Chorus], pick BPM and key, get an MP3.

124

Generates in about -- secs

Nodes & Models

UNETLoader
acestep_v1.5_xl_base_bf16.safetensors
VAELoader
ace_1.5_vae.safetensors
DualCLIPLoader
qwen_0.6b_ace15.safetensors
qwen_4b_ace15.safetensors
PrimitiveInt
WorkflowGraphics
ModelSamplingAuraFlow
EmptyAceStep1.5LatentAudio
TextEncodeAceStepAudio1.5
ConditioningZeroOut
KSampler
VAEDecodeAudio
SaveAudioMP3

ACE-Step 1.5 XL Base turns text into full songs. Write a style description (genre, mood, instruments, tempo) and structured lyrics with section markers like [Intro] [Verse] [Chorus], then run it. You get back an MP3 up to a few minutes long. Works for instrumental tracks or vocal songs, in multiple languages.

How do you generate music with ACE-Step 1.5 XL Base?

Write two inputs: a style prompt describing the sound (genre, mood, tempo, instruments, key) and structured lyrics with section markers like [Intro], [Verse], [Chorus]. Set BPM, key, and duration. ACE-Step 1.5 XL Base turns that into an MP3 with a coherent arrangement and matching vocals if you want them.

Style prompt (tags) This is where the sound comes from. Write it like a producer note: genre, mood, key instruments, tempo, energy. "Cinematic mecha anime launch theme, pulsing synths, industrial percussion, heroic orchestral brass, 150 BPM, E minor" gives the model a clear target. Stack 4 to 8 descriptors. Name specific instruments and reference artists or eras. Vague mood words get amplified and become hard to escape.

Lyrics Use section markers to control structure: [Intro], [Verse], [Pre-Chorus], [Chorus], [Bridge], [Drop], [Breakdown], [Outro]. The model treats these as arrangement cues. For instrumentals, write stage directions inside sections: "(Pulsing synth, radio chatter)", "(Full orchestra, choir: Aaah!)". For vocal tracks, write real lyrics line by line inside each section.

BPM (default 150) Match BPM to your genre. 70 to 90 for ballads and lofi. 100 to 120 for pop and rock. 125 to 135 for house and techno. 140 to 160 for drum and bass, trance, or high-energy cinematic. 170+ for hardcore.

Key scale (default E minor) Minor keys feel darker and more cinematic. Major keys feel brighter and more upbeat. If you want happy pop, try C major or G major. If you do not care, leave it on E minor.

Duration (default 120 seconds) Two minutes covers a short song with intro, verse, chorus, outro. Push to 180 or 240 seconds for full song structures. Going much longer starts to test the model's long-range coherence between sections.

CFG scale (default 6 in KSampler) Higher CFG follows your prompt more tightly. Lower CFG gives the model more freedom and often sounds more musical. The default 6 balances both. If your output ignores your style cues, push it up to 7 or 8. If it feels stiff and over-directed, drop to 4 or 5.

Temperature (0.85) and top_p (0.9) These control how adventurous the generation gets. Lower values give predictable, safer outputs. Higher values open up more variation between seeds. Defaults are a good starting point. Try temperature 0.95 if your results feel samey.

Steps (default 50) 50 is the sweet spot. 30 is faster but can sound mushy on complex arrangements. Past 70 rarely helps. Leave it alone unless you are benchmarking.

What is ACE-Step 1.5 XL Base good for?

Demos, mood tracks, game soundtracks, jingles, and any project where you want control over structure, lyrics, and genre. Strongest on pop, rock, electronic, cinematic, and anime themes. Handles instrumental and vocal tracks in multiple languages. Not a finished studio master, but fast enough to generate twenty variations before lunch.

Best for producers prototyping hooks before they hit the studio. Film editors who need a 90-second cue in a specific style. Indie game devs building a menu theme or boss track. Songwriters testing melodies against lyrics without booking musicians. Creators who want a B-roll track that matches the vibe instead of fighting it.

Less useful when you need a polished commercial master, or when you need a specific artist's voice reproduced. Weaker on complex jazz harmony and dense classical orchestration. For those, keep your DAW open.

FAQ

What BPM should I use with ACE-Step 1.5 XL Base? Match BPM to the genre. 70 to 90 for ballads and lofi. 100 to 120 for pop and rock. 125 to 135 for house and techno. 140 to 160 for drum and bass, trance, or cinematic anime themes. Higher BPMs work but get harder to keep coherent past 180.

Does ACE-Step 1.5 XL Base support vocals and lyrics? Yes. Write real lyrics inside section markers like [Verse] and [Chorus]. The model sings them in the style of your prompt. For instrumental tracks, write stage directions like "(Pulsing synth, radio chatter)" inside sections, and the model treats those as arrangement cues rather than words to sing.

How long can ACE-Step 1.5 XL Base songs be? Default duration is 120 seconds. You can push to 180 or 240 seconds for fuller song structures with intro, multiple verses, chorus, bridge, and outro. Past that, the model starts losing long-range coherence between sections. For longer tracks, generate parts and stitch them in a DAW.

What languages does ACE-Step 1.5 XL Base support? The language field accepts English and several others through the dropdown. For non-English vocals, match the language setting to the lyrics you write. Style prompts still work in English even when the vocals are in another language.

How do you run ACE-Step 1.5 XL Base online? You can run ACE-Step 1.5 XL Base online through Floyo. No installation, no setup. Open the workflow in your browser, write your style prompt and lyrics, hit run. Free to try.

Read more

N