eziclip.com

Add captions and subtitles to Instagram Reels

Drop a Reel into the tool above and it transcribes the speech, times every word, and lays captions over the 9:16 frame, all in your browser. Pick a style, fix any wording, and export a 1080 by 1920 MP4 ready to post, or a subtitle file. It is free, there is no watermark, and the video never leaves your device.

Instagram Reels · 9:16 · 1080 × 1920 · up to 90 seconds

Caption a Reel in the right shape, the first time

Instagram Reels live in a vertical 9:16 frame, 1080 by 1920, running up to 90 seconds. That format shapes where text can sit. The tool reads your clip in its native vertical ratio, so the words land inside the safe area rather than tucked behind the like, comment, and share buttons down the right edge, or hidden under the caption and audio strip along the bottom.

Bring a finished Reel, whether a talking-head clip, a tutorial, or a voiceover over b-roll, and the captions go on top of the picture you already cut. Nothing is re-encoded into the wrong aspect ratio or letterboxed. What you preview vertically is what you post vertically, which matters when a single line wrapping the wrong way can push your hook off-screen.

Why captions decide whether a Reel gets watched

Reels start playing silently the moment they enter the frame, while someone is thumbing through the feed and has not chosen your video yet. Sound is a tap they have to opt into, and most never do on the first pass. So the on-screen text does the persuading, and it is what makes a thumb pause instead of flicking past.

Captions also protect the timing of your content. A punchline, a stat, or the turn in a story only works if it registers on the exact beat you intended. With the words on screen, the joke can land and the point can connect before anyone reaches for volume. Without them, a muted viewer gets motion and a face, and scrolls on before your idea arrives.

Word-level timing that keeps pace with fast Reels

An on-device AI speech model listens to your clip, detects the language automatically across 99 languages, and transcribes it with timing pinned to each individual word rather than loose blocks of a sentence. That precision lets a word appear in sync with the syllable being spoken, so quick, energetic Reels stay readable instead of lagging a half-second behind the mouth.

Every word sits in an editable transcript. Correct a name, a brand, or a piece of slang, or trim filler, and the caption updates to match. If auto-detection picks the wrong language, override it, or regenerate the captions in a different language entirely. You can also fine-tune any single line's start and end times before export, so a fast cut never clips a word in half.
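To make the idea concrete, here is a minimal sketch of how word-level timings could be grouped into caption lines. It is an illustration only, not the tool's actual code: the `Word` and `Cue` types, the `group_words` function, and the sample timings are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the clip
    end: float    # seconds from the start of the clip


@dataclass
class Cue:
    text: str
    start: float
    end: float


def group_words(words: list[Word], words_per_line: int = 3) -> list[Cue]:
    """Chunk timed words into caption lines of at most `words_per_line` words.

    Each line starts when its first word starts and ends when its last
    word ends, so a line never cuts a word in half.
    """
    if words_per_line < 1:
        raise ValueError("words_per_line must be at least 1")
    cues = []
    for i in range(0, len(words), words_per_line):
        chunk = words[i:i + words_per_line]
        text = " ".join(w.text for w in chunk)
        cues.append(Cue(text, chunk[0].start, chunk[-1].end))
    return cues


# Hypothetical timings for a five-word hook.
words = [
    Word("Stop", 0.00, 0.30),
    Word("scrolling", 0.30, 0.85),
    Word("for", 0.90, 1.05),
    Word("a", 1.05, 1.10),
    Word("second", 1.10, 1.60),
]
cues = group_words(words, words_per_line=3)
```

With three words per line, the hook splits into "Stop scrolling for" and "a second", and each line inherits its boundaries from the words it contains. Editing a word's text changes only the line's text, while its timing stays pinned to the speech.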

Your Reel stays on your device, start to finish

This runs entirely in the browser. The AI model downloads once, and then everything (transcription, styling, and export) happens locally on your machine. Your footage is never uploaded, never sent to a server, and never used to train anything. There is no account to make and no queue to wait in.

For unreleased Reels, client work under embargo, or anything you would rather not hand to a third party, that is a literal guarantee, not a privacy clause. The file you drop in is the only copy, and it does not move.

Four caption styles built for the feed

Choose the motion that suits the Reel. Karaoke highlights each word as it is spoken, which keeps the eye anchored through a fast monologue. Highlighted drops the active word into a colored box for a punchy, trend-friendly look. Minimal shows one clean word at a time, and Dynamic gives that single word a small pop on entry. Both keep the vertical frame uncluttered.

From there you control typeface and weight across Inter, Montserrat, Oswald, Lora, and JetBrains Mono, plus size, text and highlight colors, outline, shadow, and words per line. Position is the part that matters most on Reels: place captions toward the center or upper third and nudge them clear of the right-rail buttons and the bottom UI, so nothing the platform overlays ever covers a word.
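For a sense of what "nudging clear" means in pixels, the sketch below clamps a caption box into a safe rectangle inside the 1080 by 1920 frame. The margin values are placeholder assumptions for illustration, not figures published by Instagram or used by the tool, and `Box`, `SAFE`, and `nudge_into_safe_area` are hypothetical names.

```python
from dataclasses import dataclass

FRAME_W, FRAME_H = 1080, 1920  # native Reels frame, in pixels


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    width: int
    height: int

    @property
    def right(self) -> int:
        return self.left + self.width

    @property
    def bottom(self) -> int:
        return self.top + self.height


# ASSUMPTION: placeholder margins, not official platform values.
RIGHT_RAIL = 160  # like, comment, and share buttons on the right edge
BOTTOM_UI = 400   # caption and audio strip along the bottom
TOP_UI = 200      # header area at the top
SIDE_PAD = 60     # breathing room on the left edge

SAFE = Box(
    left=SIDE_PAD,
    top=TOP_UI,
    width=FRAME_W - SIDE_PAD - RIGHT_RAIL,
    height=FRAME_H - TOP_UI - BOTTOM_UI,
)


def nudge_into_safe_area(caption: Box, safe: Box = SAFE) -> Box:
    """Move a caption box the shortest distance needed to sit inside `safe`."""
    if caption.width > safe.width or caption.height > safe.height:
        raise ValueError("caption is larger than the safe area; "
                         "reduce the size or the words per line")
    left = min(max(caption.left, safe.left), safe.right - caption.width)
    top = min(max(caption.top, safe.top), safe.bottom - caption.height)
    return Box(left, top, caption.width, caption.height)


# A caption placed too low and too far right gets pulled back in.
placed = nudge_into_safe_area(Box(left=300, top=1600, width=700, height=120))
```

Under these assumed margins the safe rectangle spans x from 60 to 920 and y from 200 to 1520, so the example caption moves to left 220, top 1400. Swap in margins you have measured on a real device before relying on the numbers.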

Export a post-ready Reel, or a subtitle file

When the captions look right, burn them into the video as a 9:16 MP4 at 1080 by 1920, the native Reels size. The text becomes part of the picture and shows for everyone the instant the clip autoplays, whatever their sound setting. Export is hardware-accelerated, with a choice between a setting tuned for sharing and one tuned for source quality, and there is no watermark on the result.

Prefer to keep captions as a separate track? Download an SRT or VTT file instead and upload it alongside your video. Either path is free, and so is every style, every language, and every export. There is no paywall at the end.
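If you are curious what those subtitle files contain, here is a small sketch that writes the same caption lines as SRT and as WebVTT. It is not the tool's exporter, and `format_timestamp`, `to_srt`, `to_vtt`, and the sample cues are hypothetical. The formats themselves are standard: SRT numbers each cue and uses a comma before the milliseconds, while WebVTT opens with a `WEBVTT` header and uses a period.

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Render seconds as HH:MM:SS<sep>mmm (',' for SRT, '.' for WebVTT)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Build SRT text: numbered cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(cues: list[tuple[float, float, str]]) -> str:
    """Build WebVTT text: a WEBVTT header, then cues separated by blank lines."""
    blocks = ["WEBVTT\n"]
    for start, end, text in cues:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)


# Hypothetical cues as (start seconds, end seconds, text).
cues = [(0.0, 1.05, "Stop scrolling for"), (1.1, 1.6, "a second")]
srt_text = to_srt(cues)
vtt_text = to_vtt(cues)
```

Both files carry the same timing and text, so the choice comes down to which format the destination accepts.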

Questions

Burning them in is the most reliable option, because the text becomes part of the picture and appears the moment a Reel autoplays muted in the feed, with no setting required from the viewer. Export a 9:16 MP4 with captions baked in. If you would rather keep them editable on the platform, download an SRT or VTT file and add it separately.

Big enough to read on a phone at arm's length, but not so large the lines wrap awkwardly in the narrow vertical frame. Keep to a few words per line and position the text in the center or upper third, clear of the right-side buttons and the bottom caption bar. Preview it in the 9:16 frame and adjust size and placement until nothing important is covered.

Yes. Transcription, all four caption styles, every language, timing edits, and export are free for everyone, with no watermark, no account, and no paywall at export. It is free by choice, not as a trial.

No. Everything runs in your browser on your own device. The AI model downloads once and then works locally, so your video is never uploaded to a server and is never used to train anything. The file stays with you the whole time.

Yes. The speech model auto-detects the language across 99 languages and transcribes with word-level timing. You can override the detected language or regenerate the captions in a different language, and every word is editable in the transcript before you export.
