Why Are My Captions Wrong?

At Captionmax, we take pride in our expertise and accuracy when it comes to the art of captioning. We also understand that sometimes our captions may not look correct to our viewers. Blame is often placed on human captioners, managed automatic speech recognition (ASR), or a lack of proofreading, but the truth is more complicated than that. With the help of our top captioning experts in live and prerecorded media, we’ve compiled the top five reasons your closed captions might look wrong.


#1 – Technology Blips 

The captioner transcribed this sentence as “These are pop-on captions.” However, a corrupted version displays it as yellow, garbled text.

TV is a fast-moving industry. With the addition of new web and streaming platforms, technology has quickly advanced while also introducing new complications that impact the closed captions you see.  

In live programming, issues with antennas and cable can corrupt data streams and the transmission of captions. CEA-608 and CEA-708 closed captioning data is decoded to make captions appear overlaid on a video stream. Poor weather, weak transmission signals, internet quality, and satellite issues can all affect these captioning streams, resulting in missing words, strange characters, shifted caption placement, or even changes to the captions’ color.
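To see where those “strange characters” come from, it helps to know that CEA-608 transmits each character as a byte protected by an odd-parity bit: if transmission flips a bit, the parity check fails and the decoder can only show garbage or drop the character. Here is a minimal Python sketch of that idea. It is deliberately simplified: a real CEA-608 decoder also handles control codes (placement, color, pop-on mode) and a handful of characters that differ from ASCII, all of which this sketch ignores.

```python
def strip_odd_parity(byte):
    """Return the 7-bit value if the byte has valid odd parity, else None."""
    if bin(byte).count("1") % 2 != 1:
        return None  # parity failure: transmission corrupted this byte
    return byte & 0x7F


def decode_pair(b1, b2):
    """Decode one CEA-608 byte pair into displayable text (simplified).

    Real decoders also process control codes and the small set of
    non-ASCII characters in the 608 character set; here, anything
    printable is treated as plain ASCII to illustrate the idea.
    """
    out = []
    for b in (b1, b2):
        v = strip_odd_parity(b)
        if v is None:
            out.append("\ufffd")  # what you see on screen: a garbled character
        elif 0x20 <= v <= 0x7F:
            out.append(chr(v))    # printable character
        # values below 0x20 are control codes, skipped in this sketch
    return "".join(out)


# 'H' (0x48) and 'i' (0x69) with their odd-parity bits set become 0xC8 and 0xE9
print(decode_pair(0xC8, 0xE9))  # Hi
print(decode_pair(0x48, 0xE9))  # a single flipped bit garbles the 'H'
```

One corrupted bit is enough to turn a clean word into on-screen noise, which is why weak signals can mangle captions while the video still looks fine.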

Prerecorded captions are not immune to tech issues, but they are usually exported as broadcast-specific caption files that bring more consistency to the display. Unfortunately, even stable exports can be corrupted, or uploaded incorrectly, which may result in missing captions, nonsensically decoded characters, and more.

#2 – Standards and Practices 

This audio was bleeped in the version the captioner used, but the audio was dubbed with “Oh, no!” in another version of the video, resulting in an incorrect bleep. 

Standards and Practices (S&P) is a department and/or set of guidelines designed to monitor the moral, ethical, and legal issues in all broadcast and streaming programs. S&P affects captioning when it comes to profanity and music choices, with networks and streaming platforms providing guidance on how these should be transcribed by captioners. 

Captioners create files based on the video and audio they are provided, but these versions may change after the fact due to S&P, causing captions to appear different from the content or audio. These changes can account for why you may see a [bleep] where something isn’t bleeped, or why you see a descriptor like [upbeat music], instead of the name of the actual song. 

Multiple versions of a program for different air times can also complicate this if they’re not provided to the captioner for reformatting, particularly for broadcast shows airing after 10 p.m., which are subject to different S&P guidelines regarding profanity.

#3 – Content Prep & Whirlwind Changes 

In this caption, Eric’s name was incorrectly spelled as Erik. The captioner did not receive prep materials confirming the name, and the ID graphic was not present in the version they were given for captioning. 

Even in final cuts and live airings of programs that are locked into their audio, it’s still possible to encounter captions that may not match what you’re seeing. 

Keeping up with audio in real time means live captioners must sometimes paraphrase to stay within viewers’ reading-rate capabilities. This pacing also affects the spelling accuracy of proper nouns when prep materials aren’t provided ahead of time. Live captioners can correct a mistranscribed word, but that is hard to do with no context or materials available in advance. They must make split-second judgment calls and are ultimately driven to transcribe audio as quickly as possible in a live environment.

Prerecorded captions seldom include paraphrasing, but they may omit dialogue when multiple people are talking at once. Reading rate also comes into play here: there are only so many characters allowed per caption within a limited amount of time. To keep captions synchronized and accurate, captioners prioritize the dialogue most important to understanding the situation. It’s a technical judgment call, and not one captioners take lightly.

#4 – Timing & Reading Rate 

This caption is slightly delayed due to the reading rate and the number of characters that needed to fit.

A common complaint about both live and prerecorded closed captions is their timing. Why don’t they appear on screen at the exact moment speech occurs, and why don’t they always leave the screen when speech is done?

Because a live captioner hears audio and transcribes it in real time, there will always be a slight delay: live captioners cannot begin writing until they first hear the audio. People commonly attribute this to a tape delay or latency built in to allow for editing, but in fact, the only technical delay is to allow for streaming and buffering.

Prerecorded captioners have more freedom to edit and time captions almost exactly to the audio. But not all captions will be displayed right as the words are spoken, due to the reading rate threshold: the number of words a person can reasonably read per minute. This rate varies by program and is set depending on each program’s needs. Captioners carefully time captions as close to the audio as possible without breaking this threshold, but that may result in a caption appearing several frames after the audio starts.
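The threshold math behind this is simple to sketch. In the toy example below, the 180 words-per-minute rate and the “word = whitespace-separated token” rule are illustrative assumptions for the sake of the example; actual thresholds are set per program and often measured differently.

```python
def min_caption_duration(text, wpm=180):
    """Shortest time (in seconds) a caption must stay on screen at a given rate.

    180 wpm is an illustrative default, not a universal standard; real
    reading-rate thresholds vary by program. Words are approximated as
    whitespace-separated tokens.
    """
    words = len(text.split())
    return words / (wpm / 60.0)  # convert words per minute to words per second


caption = "Captioners time captions as close to the audio as they can."
print(round(min_caption_duration(caption), 2))  # 3.67 seconds on screen
```

If the speaker finishes that sentence in under 3.67 seconds, the caption must outlive the speech, or the next caption must start late, which is exactly the trade-off captioners are balancing.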

#5 – Audio Quality

The audio on this recorded meeting was muddled on the captioner’s end and could not be clarified in time to make a correction, so the captioner used an [indistinct] tag. 

One of the last but most important factors in closed caption accuracy comes down to the audio itself. 

Remember that professional closed captioners are humans just like you. While they’re trained to discern audio more closely, if a hearing viewer cannot understand difficult audio, it’s likely that the captioner is also struggling. Captioners avoid using [indistinct] descriptors when possible, but sometimes there is no other option. 

It’s also very important to note that even when the audio may be clear to the end user, the captioner may have only been provided with a rough cut in which the audio mix was incomplete. Mumbled dialogue under blaring music can be entirely inaudible to a closed captioner, but a sound mixer might re-record or adjust the audio levels in a final mix, making it audible to a hearing viewer. 


Highly trained closed captioners are craftspeople at heart. They take the art of closed captioning seriously and have often worked for years to perfect their transcription speed, technical expertise, and editorial precision. Viewers do not see the people behind the closed captions, so it can be easy to focus on a single captioner’s “mistake” and put them at fault for inaccuracies. But factors such as data-stream corruption and the ever-changing nature of the post-production world mean the blame isn’t always so easy to lay on any single person or entity; it’s just part of how captions work.

Interested in learning more?