floyo logo
Workflows
Pricing

Qwen3 ASR: Transcribe Audio

Upload audio and Qwen3's ASR engine returns the transcript, word-level timing for SRT subtitles, and an optional translation to English. The language is detected automatically.


Nodes & Models

Qwen3TTSEngineNode
LoadAudio
MarkdownNote
UnifiedASRTranscribeNode
PreviewAny

Qwen3 audio transcription with optional English translation.

Upload an audio file, pick transcribe or translate, and get back clean text. Keep the forced aligner on and you also get timing data you can use to build an SRT subtitle file.

Auto language detection means you don't need to know what's in the audio before running it.

How do you transcribe audio with Qwen3?

Load your audio file, leave the engine on its 0.6B default, and run it. Qwen3 detects the language and returns the text. For SRT subtitles, keep the forced aligner on and the workflow outputs word-level timing alongside the transcript. To get English instead of the source language, switch the task to translate.

Audio file
Drop in any audio in the usual formats, such as MP3 or WAV. There is no hard length limit because the workflow chunks longer files automatically.

Task: transcribe or translate
Want the original words written out? Use transcribe. Want the audio rendered in English no matter the source language? Switch to translate. The translate target defaults to English, but you can set a different target language if you need one.

Forced aligner (asr_use_forced_aligner)
On by default. Leave it on if you need timing data for subtitles or word-level timestamps. Turn it off if you only want raw text and the fastest possible run.

Language
Auto by default, so the engine detects the language for you. Lock it to a specific language if your audio is mixed and Qwen3 keeps switching languages partway through the transcript.

Model size (0.6B)
The default. It is small, fast, and accurate enough for most spoken audio, and it handles podcasts, interviews, and meeting recordings without issue.

Chunk size and overlap
Defaults: 30-second chunks with a 2-second overlap. Long audio gets split into pieces and stitched back together. The overlap stops words from getting cut off at chunk boundaries. Most people never touch these.
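
To make the chunking arithmetic concrete, here is a minimal Python sketch of how 30-second windows with a 2-second overlap could be laid out over a file. The function name chunk_windows and its parameters are illustrative assumptions, not part of the workflow or its nodes, and the workflow's actual splitting logic may differ.

```python
def chunk_windows(duration_s, chunk_s=30.0, overlap_s=2.0):
    """Return (start, end) windows in seconds that cover [0, duration_s].

    Hypothetical helper: consecutive windows share `overlap_s` seconds,
    so each new window starts `chunk_s - overlap_s` after the previous one.
    """
    duration_s = float(duration_s)
    if overlap_s >= chunk_s:
        raise ValueError("overlap must be smaller than the chunk size")
    step = chunk_s - overlap_s
    windows = []
    start = 0.0
    while True:
        # The last window is clipped to the end of the audio.
        end = min(start + chunk_s, duration_s)
        windows.append((start, end))
        if end >= duration_s:
            break
        start += step
    return windows


# A 70-second file needs three windows with the default settings.
print(chunk_windows(70.0))
```

With these defaults each new chunk starts 28 seconds after the previous one, so a 10-minute file comes out to 22 chunks.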

What is Qwen3 ASR good for?

Transcribing podcasts, interviews, meetings, and voice notes when you want both the text and the timing data in one pass. The ability to switch between transcribe and translate makes it useful for working across languages without bringing a second tool into your pipeline.

Solid for: podcast transcripts, video subtitles via SRT export with the forced aligner, interview transcription where you need timestamps for citation, and turning foreign-language audio into English text.

Less ideal for: noisy field recordings or heavily overlapping speakers. ASR models still struggle with both.

The translate task renders the audio into the target language in one shot. It doesn't run a separate translation pass after transcription, so the output reads as direct translated speech rather than a strict word-for-word match.

FAQ

Does Qwen3 ASR support timestamps for SRT subtitles?
Yes. Keep the asr_use_forced_aligner option on (it's the default) and the workflow returns timing data alongside the text. You can use that timing data to build an SRT file with start and end times for each word or segment.
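
As an illustration, the sketch below turns word-level timing into SRT text. It assumes the timing data has already been converted to a list of dicts with word, start, and end keys (times in seconds). That shape, the function names, and the grouping of words into cues are assumptions made for the example, so adapt them to whatever the workflow actually outputs.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group word timings into cues of up to `max_words` and render SRT text.

    `words` is an assumed format: [{"word": str, "start": float, "end": float}].
    """
    cues = []
    for i in range(0, len(words), max_words):
        group = words[i:i + max_words]
        start = srt_timestamp(group[0]["start"])
        end = srt_timestamp(group[-1]["end"])
        text = " ".join(w["word"] for w in group)
        # Each cue is: index, time range, text, then a blank line.
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Made-up timing data for demonstration only.
words = [
    {"word": "Hello", "start": 0.00, "end": 0.42},
    {"word": "and", "start": 0.50, "end": 0.61},
    {"word": "welcome", "start": 0.65, "end": 1.10},
]
print(words_to_srt(words, max_words=2))
```

Grouping by a fixed word count is the simplest option. Splitting cues at pauses or punctuation usually reads better on screen.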

What languages does Qwen3 ASR transcribe?
Set language to Auto and the engine detects it. Qwen3 handles dozens of languages out of the box. If you have mixed-language audio and want the transcript to stay in one language, set the language manually instead of leaving it on Auto.

Can Qwen3 transcribe and translate audio at the same time?
The task setting picks one mode per run. Transcribe gives you the source language text. Translate renders the audio directly into your target language (English by default). Run it twice if you want both outputs.

How long can audio files be for Qwen3 ASR?
Long ones. The workflow chunks audio into 30-second pieces with a 2-second overlap and stitches the result back together. You don't need to split files yourself before running.
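
How the workflow stitches chunks internally isn't described here. One common approach, sketched below in Python, is to cut each overlap at its midpoint and keep each word from whichever chunk owns that side of the cut. The function name merge_chunks, the input shape, and the sample words are all assumptions for illustration, not the workflow's actual implementation.

```python
def merge_chunks(chunks):
    """Merge per-chunk word lists into one timeline without duplicates.

    `chunks` is an assumed format: a time-ordered list of
    (chunk_start, chunk_end, words), where each word is a dict with
    "word", "start", and "end" in absolute seconds.
    """
    merged = []
    last = len(chunks) - 1
    for i, (c_start, c_end, words) in enumerate(chunks):
        # Cut points sit at the midpoint of the overlap with each neighbour.
        lo = (c_start + chunks[i - 1][1]) / 2 if i > 0 else float("-inf")
        hi = (chunks[i + 1][0] + c_end) / 2 if i < last else float("inf")
        for w in words:
            mid = (w["start"] + w["end"]) / 2
            # Keep a word only if its midpoint falls in this chunk's share.
            if lo <= mid < hi:
                merged.append(w)
    return merged


# Made-up example: the first chunk ends mid-word, the second hears it in full.
chunks = [
    (0.0, 30.0, [
        {"word": "good", "start": 28.2, "end": 28.6},
        {"word": "morning", "start": 28.8, "end": 29.4},
        {"word": "every", "start": 29.6, "end": 30.0},
    ]),
    (28.0, 58.0, [
        {"word": "good", "start": 28.2, "end": 28.6},
        {"word": "morning", "start": 28.8, "end": 29.4},
        {"word": "everyone", "start": 29.6, "end": 30.2},
    ]),
]
print([w["word"] for w in merge_chunks(chunks)])
```

Cutting at the midpoint means the word clipped at the end of the first chunk is dropped in favour of the complete version from the second chunk.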

How do you run Qwen3 ASR online?
You can run Qwen3 ASR online through Floyo. No installation, no setup. Open the workflow in your browser, upload your audio, and hit run. Free to try.
