Note: This content about live captions was created by humans, with the assistance of artificial intelligence.
Explanation of the success criterion
1.2.4 Captions (Live) is a Level AA Success Criterion. It requires that captions be provided for all live audio content in synchronized media.
Unlike the earlier Web Content Accessibility Guidelines (WCAG) Success Criterion 1.2.2 Captions (Prerecorded) (Level A), this Success Criterion applies to media that combines synchronized audio and visual content and is presented live rather than prerecorded. Examples include:
- Live-Streamed Events: Company town halls, product launches or demos, and conferences or panel discussions streamed on platforms like YouTube Live, Facebook Live, or Vimeo.
- Live News Broadcasts: Television or online news, live reporting, breaking news feeds.
- Webinars and Online Courses: Educational sessions where the speaker is live on camera, lectures, live virtual classrooms on platforms like Zoom, Teams, or Webex (note: one-to-one or small meetings are generally not covered by SC 1.2.4).
- Performances or Cultural Events: Livestreamed theater, concerts, or festivals, live podcasts with visual elements (like video feeds of hosts), virtual museum tours with guided narration.
- Government or Civic Broadcasts: Legislative hearings or public forums streamed online, live press briefings from government agencies, courtroom broadcasts.
- Tech & Gaming Streams: Developer livestreams (e.g., launch events, coding live), game streaming platforms (Twitch, YouTube Gaming) where players speak while playing.
Who does this benefit?
Live captions benefit a wide range of people — not just those with hearing disabilities.
- People who are Deaf or Hard of Hearing: Captions provide essential real-time access to spoken content. They help ensure equal participation in events like live webinars, town halls, and livestreamed announcements.
- Non-native Speakers: Captions reinforce understanding when spoken language is fast, heavily accented, or uses unfamiliar terms. They are useful for people still learning the language of the broadcast.
- Viewers in Noisy or Quiet Environments: Captions enable viewers to follow live content without sound, whether the setting is too loud to hear the audio or too quiet to play it.
- People with Cognitive or Learning Disabilities: Seeing spoken words reinforced visually helps some users process and retain information more effectively.
- Event Organizers & Businesses: Captions demonstrate a commitment to accessibility, boost engagement, and may be legally required (e.g., under the ADA, WCAG, or EN 301 549).
- Mobile Users: Many users consume content on the go without headphones; captions ensure they still understand what's happening.
- Students and Attendees: Captions make it easier to take notes, follow along, or review content, especially during fast-paced lectures or discussions.
Live captions not only improve usability for all users, they also provide essential accessibility for those who rely on them.
How to test for live captions
- Identify Applicable Live Content: check whether the website or platform hosts live synchronized media, such as webinars, live-streamed events, live news broadcasts, or virtual conferences.
- Verify the Presence of Live Captions: during the live event, check whether real-time captions are provided (see the code sketch after this list). If they are, confirm that:
  - captions are synchronized with the audio
  - captions include all spoken dialogue
  - captions identify speakers
  - captions describe relevant non-speech sounds (e.g., [laughter], [music])
- Assess Caption Accuracy and Quality: evaluate the accuracy of the captions. Are they free from significant errors? Do they accurately represent the spoken content? Captions should also be timely and synchronized with the audio.
- Review the Captioning Method: identify whether captions are human-generated (e.g., CART, Communication Access Real-time Translation, which is preferred for accuracy) or automated (which may not meet WCAG requirements because of error rates).
- Check Platform Compatibility: ensure the platform supports live captions and that they’re accessible on all devices and browsers.
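As a practical aid for the verification step above, a tester can confirm that a live player is actually receiving caption cues rather than merely declaring a track. Below is a minimal sketch using the browser's standard TextTrack API, assuming an in-page HTML5 player; the `#player` selector is a hypothetical ID, not any specific platform's markup:

```ts
// A minimal sketch: log caption cues from a live <video> element as they
// arrive, so a tester can compare them against the audio in real time.
// The '#player' ID is hypothetical; adjust it to the page under test.
const video = document.querySelector<HTMLVideoElement>('#player');

if (video) {
  const captionTracks = Array.from(video.textTracks).filter(
    (track) => track.kind === 'captions' || track.kind === 'subtitles'
  );

  if (captionTracks.length === 0) {
    console.warn('No caption track exposed on this video element.');
  }

  for (const track of captionTracks) {
    track.mode = 'showing'; // activate the track so cue events fire
    track.addEventListener('cuechange', () => {
      const active = track.activeCues;
      if (!active) return;
      for (const cue of Array.from(active)) {
        // VTTCue exposes the caption text and its timing window.
        const vtt = cue as VTTCue;
        console.log(`[${vtt.startTime.toFixed(1)}s] ${vtt.text}`);
      }
    });
  }
}
```

Note that many live platforms render captions into their own DOM rather than through standard text tracks, so this check only applies where text tracks are actually used.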
Testing live captions via automated testing
Automated testing can detect captioning technologies (e.g., WebVTT) and offers speed and scalability. However, that is, sadly, where the benefits end.
Automated testing has limitations when evaluating live captions: it cannot assess real-time accuracy, measure caption delay, or verify synchronization with speech. It also doesn’t evaluate language quality and is prone to false positives or negatives, often assuming captions are present based solely on their container.
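To illustrate the false-positive point, the following sketch shows the kind of container-level check an automated tool typically performs; it reports success when a captions track is merely declared, even if no text ever flows through it (the function name is illustrative, not from any specific tool):

```ts
// A sketch of a container-level check typical of automated tools.
// It answers "is a caption track declared?", not "are captions accurate?".
function hasDeclaredCaptions(video: HTMLVideoElement): boolean {
  const declared = video.querySelectorAll('track[kind="captions"]');
  const exposed = Array.from(video.textTracks).filter(
    (t) => t.kind === 'captions'
  );
  return declared.length > 0 || exposed.length > 0;
}
```

An empty or stalled caption feed still passes this check, which is exactly the false-positive pattern described above.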
Testing live captions via Artificial Intelligence (AI)
AI-based testing can detect live caption feeds, verify active transcription, and monitor technical setup and feed status. It scales well, supporting simultaneous monitoring of multiple streams with near real-time analysis.
AI can compare captions to speech in near real-time, though it’s not flawless. Some tools estimate lag using speech-to-text alignment to gauge sync. While grammar can be assessed, contextual understanding remains limited. False positives and negatives are moderate; AI may miss or misflag captions if the stream is unstable.
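As a rough illustration of the lag-estimation idea, one could timestamp words from a speech-to-text transcript and from the caption feed, then average the delay over matched words. This is a hypothetical sketch of the technique, not any particular tool's API; the TimedWord shape and both input feeds are assumptions:

```ts
// A word with a timestamp, from either a speech-to-text engine or a
// caption cue. This shape is an assumption made for illustration.
interface TimedWord {
  word: string;
  time: number; // seconds from stream start
}

// Estimate average caption lag by pairing each spoken word with the next
// later occurrence of the same word in the caption feed.
function estimateCaptionLag(
  speech: TimedWord[],
  captions: TimedWord[]
): number | null {
  const delays: number[] = [];
  let searchFrom = 0;

  for (const spoken of speech) {
    for (let i = searchFrom; i < captions.length; i++) {
      const cap = captions[i];
      if (
        cap.word.toLowerCase() === spoken.word.toLowerCase() &&
        cap.time >= spoken.time
      ) {
        delays.push(cap.time - spoken.time);
        searchFrom = i + 1; // keep the alignment monotonic
        break;
      }
    }
  }

  return delays.length > 0
    ? delays.reduce((sum, d) => sum + d, 0) / delays.length
    : null;
}
```

Real alignment tools handle recognition errors, homophones, and rephrasings far more carefully, which is part of why AI-based lag estimates remain approximate.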
Testing live captions via Manual testing
Manual accessibility testing provides a comprehensive evaluation of live captions. It assesses caption accuracy by comparing real-time text with spoken words and observes any delay between audio and captions to gauge latency. Testers also check synchronization with speech to ensure captions align properly. Technical verification includes reviewing system configuration and caption settings. Because it’s directly observed, the risk of false positives or negatives is minimal. Additionally, testers can evaluate language quality, including spelling, grammar, and appropriate terminology.
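One way to make the accuracy comparison concrete is to capture the caption text, produce a reference transcript of what was actually said, and compute a word error rate (WER) between the two. Below is a minimal sketch of the standard WER calculation, offered as one possible scoring aid rather than a WCAG-mandated metric:

```ts
// Word error rate: word-level edit distance (substitutions + insertions +
// deletions) divided by the length of the reference transcript.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = reference.toLowerCase().split(/\s+/).filter(Boolean);
  const hyp = hypothesis.toLowerCase().split(/\s+/).filter(Boolean);
  if (ref.length === 0) return hyp.length === 0 ? 0 : 1;

  // dp[i][j] = edit distance between ref[0..i) and hyp[0..j)
  const dp: number[][] = Array.from({ length: ref.length + 1 }, () =>
    new Array<number>(hyp.length + 1).fill(0)
  );
  for (let i = 0; i <= ref.length; i++) dp[i][0] = i;
  for (let j = 0; j <= hyp.length; j++) dp[0][j] = j;

  for (let i = 1; i <= ref.length; i++) {
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + cost // substitution or exact match
      );
    }
  }
  return dp[ref.length][hyp.length] / ref.length;
}

// Example: a caption feed that drops one word out of five scores 0.2 (20%).
console.log(wordErrorRate('the meeting starts at noon', 'the meeting starts noon'));
```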
The downside of manual testing is that it does not scale when there are many simultaneous streams, as it requires a human observer for each one. Testing live captions during events is also labor-intensive, and brief, fast-moving errors can be missed.
Which approach is best?
No single approach guarantees adequate testing of live captions. Combining the strengths of each approach, however, produces the most reliable results.
Accessibility testing for live captions involves a mix of methods. Automated testing is effective for detecting caption setup but doesn’t assess the actual content. AI-based testing offers promising real-time monitoring and basic quality checks, though it’s not yet fully reliable for catching nuanced errors. Manual testing remains the most accurate approach, providing thorough quality evaluation during live events.