Qwen LoRA Trainer with Auto-Caption
Train a custom Qwen Image Edit 2511 LoRA from a folder of images. Florence 2 auto-captions, Musubi trains, and a built-in test section previews results.
Image to Image
Lora Trainer
Qwen
Nodes & Models
UNETLoader
qwen_image_edit_2511_bf16.safetensors
LoadImage
Text Multiline
Florence2ModelLoader
OrchestratorNodeGroupBypasser
Note
LoraLoaderModelOnly
Qwen-Image-Edit-2511-Lightning-4steps-V1.0-bf16.safetensors
ImageScaleToTotalPixels
Text Concatenate
ModelSamplingAuraFlow
TextEncodeQwenImageEditPlus
VAEEncode
JWImageResizeToClosestSDXLResolution
CFGNorm
FluxKontextMultiReferenceLatentMethod
Florence2Run
KSampler
easy promptConcat
VAEDecode
FL_ImageCaptionSaver
SaveImage
CR Prompt List
LoadImageListFromDir //Inspire
LoadImageListFromDir //Inspire
ShowText|pysssss
ShowText|pysssss
Description:
Train your own LoRA on Qwen Image Edit 2511 from a folder of reference images.
Drop in 15-25 images of your character, product, or style. The workflow auto-captions them with Florence 2, runs the training, and saves the LoRA after every epoch so you can pick the best one. A test section is built in so you can preview the result on new prompts before downloading.
Three sections: dataset prep, captioning, training. Enable one at a time and run them in order.
How do you train a LoRA on Qwen Image Edit 2511?
Drop a folder of 15-25 images into the dataset section, then run the captioning step. Florence 2 writes a detailed caption for each image. Point the trainer at the captioned folder, pick a name, and run. The LoRA saves to your outputs folder after every epoch so you can compare and pick the best version.
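Under the hood, the captioned dataset is just each image paired with a same-named .txt caption file in the same folder, which is what the trainer reads. A minimal Python sketch of that layout (the folder, filenames, and captions here are placeholders, not the workflow's actual paths — in the real workflow Florence 2 generates the captions):

```python
from pathlib import Path

def save_caption_sidecars(dataset_dir: str, captions: dict[str, str]) -> list[str]:
    """Write one .txt caption file next to each image, mirroring what
    the captioning step produces. `captions` maps image filename -> caption."""
    root = Path(dataset_dir)
    written = []
    for image_name, caption in captions.items():
        txt_path = root / (Path(image_name).stem + ".txt")
        txt_path.write_text(caption.strip() + "\n", encoding="utf-8")
        written.append(txt_path.name)
    return sorted(written)

# Placeholder captions standing in for Florence 2 output.
demo = {
    "char_001.png": "A woman with short red hair, studio lighting.",
    "char_002.png": "The same woman in profile, outdoor daylight.",
}
```

If you want to hand-edit captions before training, these sidecar files are the ones to open.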
Most defaults are tuned. You only need to handle three things before training: your images, your output name, and a quick check that the dataset path matches across sections.
Dataset path: The folder where your images live. For character or product LoRAs, 15-25 photos work well: varied angles, varied lighting, same subject. For style LoRAs, push to 50+ images that share a clear visual language.
Output name: What you want the LoRA file called. Use the character or style name. This becomes the filename and the folder where checkpoints save.
Caption mode: Florence 2 runs in "more_detailed_caption" mode by default. That suits character and object LoRAs. Want shorter, tag-style captions for a style LoRA? Switch the task to "caption" or "detailed_caption" instead.
Network dim: The size knob. Default is 16. Want a stronger, more rigid LoRA that locks the subject hard? Try 32. Want a smaller file that leaves more room for prompt variation? Drop to 8.
Learning rate: 5e-05 is the safe default. Want faster training? Bump to 1e-04 and watch for overcooking. Training a careful style LoRA where subtlety matters? Drop to 1e-05.
Max train epochs / save every n epochs: 16 epochs by default, saves every epoch. The trainer dumps a checkpoint each pass so you can test them side by side and pick the strongest one. Want more granularity for comparison? Push epochs to 30.
Resolution: 1024x1024 by default. Drop to 768 if you hit VRAM limits.
Gradient checkpointing: On by default. Saves a lot of VRAM at the cost of slightly slower steps. Leave it on unless you're on a 48GB+ card and want maximum speed.
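As a sanity check on the defaults above, total optimizer steps scale with dataset size and epochs. A rough back-of-envelope sketch (repeats and batch size here are illustrative assumptions, not settings the workflow necessarily exposes):

```python
def training_budget(num_images: int, epochs: int = 16,
                    repeats: int = 1, batch_size: int = 1,
                    save_every: int = 1) -> dict:
    """Estimate steps and checkpoint count for a LoRA run.
    repeats and batch_size are assumed values for illustration."""
    steps_per_epoch = (num_images * repeats) // batch_size
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * epochs,
        "checkpoints_saved": epochs // save_every,
    }
```

With the defaults and a 20-image dataset, that works out to 20 steps per epoch, 320 steps total, and 16 checkpoints to compare.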
What can you train a LoRA on with Qwen Image Edit 2511?
Any custom subject, style, or look you want to apply through image edits. Character consistency across multiple edits, product retention through scene changes, brand style enforcement, signature artist looks. Qwen Image Edit 2511 is an editing model, so your LoRA controls how edits behave, not how images get generated from scratch.
Concrete cases: train a character LoRA so the same person shows up reliably across outfit swaps, background changes, and pose edits. Train a product LoRA so a sneaker keeps its exact shape and logo when dropped into different scenes. Train a style LoRA so every edit comes out in the same painted, cinematic, or anime treatment.
Where it falls short: Qwen Image Edit LoRAs are tied to the edit model. If you want a LoRA for text-to-image generation from scratch, train it on a base generation model instead. Also: the dataset matters more than the settings. 15 great photos beat 100 messy ones every time.
The workflow uses Musubi for training and Florence 2 for captioning, both running on Qwen Image Edit 2511.
FAQ
How many images do I need to train a Qwen Image Edit 2511 LoRA? 15-25 images is the sweet spot for character or product LoRAs. Style LoRAs do better with 50+. Quality beats quantity. Varied angles, varied lighting, clean backgrounds when possible. Skip blurry shots and heavy filters since the model learns whatever artifacts are in your dataset.
What's the best learning rate for a Qwen Image Edit 2511 LoRA? 5e-05 is the default and works for most subjects. For character LoRAs, stick close to it. For style LoRAs where you want subtle influence, try 1e-05 or 2e-05. Going higher than 1e-04 risks overcooking, where the LoRA forces the look on every prompt regardless of what you ask for.
How do I pick the best epoch from a Qwen Image Edit 2511 LoRA training run? The trainer saves a checkpoint after each epoch. Load each one in the test section and run the same prompt across all of them. Earlier epochs tend to be looser and respond better to prompt variation. Later epochs lock the subject harder. Pick the one that holds the subject without ignoring your prompt.
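To line checkpoints up for that side-by-side test, sort them by the epoch number in the filename rather than alphabetically. A sketch assuming a zero-padded `name-000001.safetensors` suffix convention (the exact pattern your trainer writes may differ):

```python
import re
from pathlib import Path

def checkpoints_by_epoch(output_dir: str, lora_name: str) -> list[str]:
    """Return checkpoint filenames sorted by their trailing epoch number."""
    pattern = re.compile(re.escape(lora_name) + r"-(\d+)\.safetensors$")
    found = []
    for p in Path(output_dir).glob(f"{lora_name}-*.safetensors"):
        m = pattern.search(p.name)
        if m:
            found.append((int(m.group(1)), p.name))
    return [name for _, name in sorted(found)]
```

Sorting on the parsed integer keeps epoch 10 after epoch 3, which a plain string sort would only get right if the numbers are zero-padded.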
Do I need to write captions manually for my Qwen Image Edit 2511 LoRA dataset? No. The captioning section uses Florence 2 to generate detailed captions automatically. Run it once after your dataset is ready. You can review and edit the captions in the saved text files if you want to clean them up before training, but auto-captions work fine for most LoRAs.
How to run Qwen Image Edit 2511 LoRA training online? You can run Qwen Image Edit 2511 LoRA training online through Floyo. No installation, no setup. Open the workflow in your browser, upload your images, and step through the three sections. Free to try.