Provide captions for video content

Prerecorded video with audio must have synchronized captions. Live video must have real-time captions. These are required by WCAG 2.1 SC 1.2.2 (Level A) and SC 1.2.4 (Level AA), respectively.

Quick take
Typical fix time 30 min
  • Prerecorded video with audio: synchronized captions required (WCAG 2.1 SC 1.2.2, Level A)
  • Live video with audio: real-time captions required (WCAG 2.1 SC 1.2.4, Level AA)
  • Use `<track kind='captions'>` with a `.vtt` (WebVTT) file for HTML5 `<video>` elements
  • Captions must include all spoken dialogue, speaker identification, and relevant non-speech audio (music, sound effects)
  • Subtitles and captions are different: captions include non-speech audio; subtitles translate dialogue only
Why it matters: Approximately 15% of adults have some degree of hearing loss. Captions are essential for deaf and hard-of-hearing users who cannot access audio content. They also benefit users in sound-sensitive environments (libraries, open offices), users watching without headphones in public, non-native speakers, and users with auditory processing disorders. WCAG SC 1.2.2 is a Level A requirement and SC 1.2.4 is Level AA, so missing captions can be a legal compliance failure under the ADA, EN 301 549, and similar regulations worldwide.

Rule Details

Captions provide a synchronized, text-based representation of all audio content in a video. WCAG 1.2.2 Captions (Prerecorded), WCAG 1.2.4 Captions (Live), and the `<track>` element reference all treat captions as a first-class part of the video experience.

Code Examples

<!-- ✅ Correct: <track kind="captions"> with WebVTT file -->
<video controls width="800">
  <source src="presentation.mp4" type="video/mp4">
  <source src="presentation.webm" type="video/webm">
 
  <!-- Captions for deaf/hard-of-hearing (includes non-speech audio) -->
  <track
    kind="captions"
    srclang="en"
    label="English captions"
    src="captions-en.vtt"
    default>
 
  <!-- Subtitles are for translation only — do NOT substitute for captions -->
  <track
    kind="subtitles"
    srclang="fr"
    label="Français"
    src="subtitles-fr.vtt">
 
  <p>Your browser does not support HTML video. <a href="presentation.mp4">Download the video</a>.</p>
</video>

WebVTT File Format

WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to the Frontend Checklist workshop.

00:00:04.500 --> 00:00:08.000
Today we'll cover accessibility fundamentals.

00:00:08.500 --> 00:00:10.000
[upbeat music playing]

00:00:10.500 --> 00:00:14.000
[Speaker 2] Let's start with color contrast requirements.
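
A file in the format above can be sanity-checked before it ships. The following is a minimal sketch in Python, not a full WebVTT parser: the function name `check_vtt` and the specific checks (header present, at least one cue, end time after start time, cue text present, some actual words) are assumptions chosen for illustration. It ignores cue settings, `NOTE` blocks, and styling.

```python
import re

# Cue timing line: optional hours, then mm:ss.ttt on both sides of "-->".
TIMESTAMP = r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
TIMING_LINE = re.compile(rf"^{TIMESTAMP}\s+-->\s+{TIMESTAMP}")


def to_seconds(hours, minutes, seconds, millis):
    """Convert captured timestamp parts to seconds."""
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def check_vtt(text):
    """Return a list of problems found in a WebVTT document (empty if none found)."""
    lines = [line.strip() for line in text.lstrip("\ufeff").splitlines()]
    if not lines or not lines[0].startswith("WEBVTT"):
        return ["missing WEBVTT header"]

    problems = []
    cue_count = 0
    has_words = False
    i = 1
    while i < len(lines):
        match = TIMING_LINE.match(lines[i])
        if not match:
            i += 1
            continue
        cue_count += 1
        parts = match.groups()
        if to_seconds(*parts[4:]) <= to_seconds(*parts[:4]):
            problems.append(f"cue {cue_count}: end time is not after start time")
        # Cue text runs from the line after the timing line to the next blank line.
        i += 1
        payload = []
        while i < len(lines) and lines[i]:
            payload.append(lines[i])
            i += 1
        if not payload:
            problems.append(f"cue {cue_count}: no caption text")
        if any(re.search(r"\w", line) for line in payload):
            has_words = True

    if cue_count == 0:
        problems.append("no cues found")
    elif not has_words:
        # For example, a file whose cues contain only music-note symbols.
        problems.append("captions contain no words")
    return problems
```

A check like this only catches structural problems. It cannot tell whether the captions are accurate or complete, so a human review of the text against the audio is still needed.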

Why It Matters

The distinction between captions, subtitles, and transcripts matters because WebVTT and WebAIM's media guidance expect captions to include meaningful non-speech audio, not just dialogue.

  • Hearing Loss: 15% of adults have some hearing loss; deaf users cannot access audio content without captions.
  • Situational Limitations: Users in noisy environments, offices, or public transit often watch with sound off.
  • Non-native Speakers: Reading captions simultaneously improves comprehension for second-language viewers.
  • Cognitive and Learning Disabilities: Captions can help users with attention or auditory processing difficulties, who may follow content more easily when text reinforces the audio.
  • SEO and Searchability: Caption text published with the video (for example, as an on-page transcript) can be indexed by search engines, improving video discoverability.

Captions vs Subtitles vs Transcripts

| Type | Purpose | Non-speech audio | Synchronized | WCAG SC |
| --- | --- | --- | --- | --- |
| Captions | Deaf/hard-of-hearing | Yes (required) | Yes | 1.2.2, 1.2.4 |
| Subtitles | Translation | No | Yes | Not required |
| Transcript | All users, search | Yes (recommended) | No | 1.2.1 (audio-only) |

Auto-Generated Captions

Auto-generated captions (YouTube, Whisper, Amazon Transcribe) must be reviewed before publishing:

  • Accuracy varies with audio quality, vocabulary, and speaker; the roughly 80% average cited here is insufficient for formal or technical content
  • Proper nouns, technical terms, and accented speech are most error-prone
  • Review and correct all auto-captions before the video goes live
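
One way to set up that review is to turn the tool's raw output into a draft `.vtt` file that a person then edits. The sketch below is in Python and assumes the speech-to-text tool returns a list of segments, each a dict with `start` and `end` in seconds and a `text` string. That shape, and the names `format_timestamp` and `segments_to_vtt`, are assumptions for illustration; adapt them to whatever the tool actually emits.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def segments_to_vtt(segments):
    """Build a draft WebVTT document from segments (assumed keys: start, end, text)."""
    blocks = ["WEBVTT"]
    for segment in segments:
        text = segment["text"].strip()
        if not text:
            continue  # skip empty segments rather than emit empty cues
        timing = f"{format_timestamp(segment['start'])} --> {format_timestamp(segment['end'])}"
        blocks.append(f"{timing}\n{text}")
    # Cues are separated by a blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"


draft = segments_to_vtt([
    {"start": 1.0, "end": 4.0, "text": " Welcome to the Frontend Checklist workshop."},
    {"start": 4.5, "end": 8.0, "text": " Today we'll cover accessibility fundamentals."},
])
print(draft)
```

The result is only a starting point. Speaker identification and non-speech sounds such as `[upbeat music playing]` are usually missing from automatic output and have to be added by the reviewer.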

Exceptions

  • Video with no audio track does not need captions. SC 1.2.2 also exempts prerecorded media that is a media alternative for text and is clearly labeled as such.
  • This rule should not force redundant captions or transcripts when another nearby mechanism already provides the equivalent information clearly.
  • If the media asset fails more than one rule, prioritize the issue that most directly blocks understanding for assistive technology users.

Verification

Automated Checks

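  • Inspect each `<video>` element in the DOM and confirm it has a `<track kind='captions'>` child whose `src` loads a `.vtt` file.
  • Run an automated accessibility checker such as axe or Lighthouse where applicable. These tools can detect a missing captions track but cannot judge whether the captions are accurate.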

Manual Checks

  • Play each video with the sound muted and confirm that captions appear, stay in sync, and cover dialogue, speaker changes, and meaningful sounds.
  • Confirm the caption toggle in the player can be reached and operated with keyboard-only navigation, and re-test one representative video with a screen reader.

Use with AI

Copy these prompts to use with your AI assistant, or install the MCP server to use directly from Claude, Cursor, or Windsurf.

Check

Verify implementation

Find all `<video>` elements and video embeds (`<iframe>` from YouTube, Vimeo, etc.). For each `<video>` with audio: check for a `<track>` child element with `kind='captions'` and a valid `src` pointing to a `.vtt` file. Verify the `default` attribute is present on at least one track so captions are on by default (or document the UX reason they are off by default). For YouTube/Vimeo embeds: check that the platform's caption toggle is accessible. Also check that the `.vtt` file exists and is valid (not empty, not just music notes).
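
The markup part of this check can be approximated with a small script. The following is a rough sketch using Python's standard `html.parser`; the class and function names are hypothetical. It reports `<video>` elements that have no `<track kind="captions">` child with a `src`. It does not know whether a video has audio, does not cover `<iframe>` embeds, and does not report a `<video>` that is never closed, so treat its output as a list of candidates to inspect by hand.

```python
from html.parser import HTMLParser


class VideoCaptionAudit(HTMLParser):
    """Collect <video> elements that lack a <track kind="captions" src=...> child."""

    def __init__(self):
        super().__init__()
        self.missing = []         # (line, column) of each <video> without captions
        self._open_video = None   # position of the <video> currently being parsed
        self._has_captions = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "video":
            self._open_video = self.getpos()
            self._has_captions = False
        elif tag == "track" and self._open_video is not None:
            kind = (attributes.get("kind") or "").lower()
            # kind="subtitles" is deliberately not accepted as a substitute.
            if kind == "captions" and attributes.get("src"):
                self._has_captions = True

    def handle_endtag(self, tag):
        if tag == "video" and self._open_video is not None:
            if not self._has_captions:
                self.missing.append(self._open_video)
            self._open_video = None


def videos_missing_captions(html):
    """Return the positions of <video> elements with no captions track."""
    audit = VideoCaptionAudit()
    audit.feed(html)
    audit.close()
    return audit.missing


page = """<video controls>
  <source src="presentation.mp4" type="video/mp4">
  <track kind="subtitles" srclang="fr" label="Français" src="subtitles-fr.vtt">
</video>"""
for line, column in videos_missing_captions(page):
    print(f"line {line}, column {column}: <video> has no captions track")
```

Because the script reads static HTML, it will miss tracks added at runtime by JavaScript. For those pages, run the same check against the rendered DOM instead.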

Fix

Auto-fix issues

For `<video>` elements without captions: (1) Create a WebVTT (`.vtt`) file containing synchronized caption text. Include all spoken words, speaker IDs for multi-speaker content, and descriptions of relevant sounds (e.g., '[applause]', '[upbeat music]'). (2) Add `<track kind='captions' srclang='en' label='English' src='captions-en.vtt' default>` inside the `<video>` element. (3) For auto-generated captions (YouTube, AI tools): review and correct errors, because auto-captions are often cited at around 80% accuracy and frequently fail on proper nouns, technical terms, and accented speech. (4) For live streams: implement real-time captioning via a third-party captioning service or CART (Communication Access Realtime Translation).

Explain

Learn more

WCAG 2.1 SC 1.2.2 (Captions (Prerecorded), Level A) requires synchronized captions for all prerecorded audio content in video. Captions differ from subtitles: captions are intended for deaf and hard-of-hearing viewers and must include non-speech information (sound effects, music), while subtitles translate dialogue for viewers who can hear but do not understand the language. The HTML `<track>` element with `kind='captions'` delivers WebVTT files that browsers render as synchronized on-screen text. The `kind='subtitles'` value is meant for translation; a subtitles track generally does not satisfy SC 1.2.2 because it assumes the viewer can hear the audio and typically leaves out non-speech information.

Review

Code review

Review the rendered markup and player behavior covered by the "Provide captions for video content" rule. Flag the exact `<video>` elements or embeds that lack a captions track, have an empty or invalid `.vtt` file, or use subtitles in place of captions, and note how to verify the fix with browser accessibility tooling or assistive tech.

Further Reading

Tools and supplementary material for exploring the topic in more depth.

  • axe DevTools (deque.com, tool)

Rules that often go hand-in-hand with this one.

Provide audio descriptions for video

Videos with important visual content include audio descriptions that narrate visual information for blind users.

Accessibility
Avoid autoplaying media

Audio and video content does not autoplay, or provides immediate controls to pause or stop playback.

Accessibility
Make videos accessible with captions

Videos have captions, audio descriptions, transcripts, pause controls, and avoid autoplay for users with hearing, vision, or cognitive impairments.

HTML
