Note: This extended audio description article was created by humans, with the assistance of artificial intelligence.
Explanation of the success criterion
WCAG 1.2.7 Extended Audio Description (Prerecorded) is a Level AAA Success Criterion. It requires that, where pauses in the foreground audio are insufficient to convey the sense of the video, extended audio description is provided for all prerecorded video content in synchronized media. In other words, videos include extended spoken descriptions that narrate the visual content to make it accessible.
This Success Criterion differs from 1.2.5 Audio Description (Prerecorded), a Level AA criterion that fits audio descriptions into existing pauses in the audio track. With extended audio description, the media is paused to create additional gaps, a more detailed description is narrated during each gap, and playback resumes once the description has completed. Since this can interrupt the viewing experience for others, it's common to offer the option to toggle the feature on or off, or to provide two versions of the media, one with and one without extended descriptions.
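To make the pause-and-resume technique concrete, here is a minimal sketch of a web player that pauses the main video at predefined cue points, plays an extended description clip, and resumes playback when it ends. The element ID, cue times, and audio file paths are hypothetical, and a production player would also need to handle seeking, preloading, and errors.

```typescript
// Minimal sketch of extended audio description playback.
// The element ID, cue times, and audio URLs below are illustrative assumptions.

interface ExtendedDescriptionCue {
  time: number;      // when to pause the main video (seconds)
  audioSrc: string;  // extended description clip to play during the pause
}

const cues: ExtendedDescriptionCue[] = [
  { time: 12.0, audioSrc: "descriptions/scene-1-extended.mp3" },
  { time: 47.5, audioSrc: "descriptions/scene-2-extended.mp3" },
];

const video = document.querySelector<HTMLVideoElement>("#main-video")!;
let extendedDescriptionsEnabled = true; // wired to a user-facing toggle
let nextCue = 0;

video.addEventListener("timeupdate", () => {
  if (!extendedDescriptionsEnabled || nextCue >= cues.length) return;
  if (video.currentTime >= cues[nextCue].time) {
    const cue = cues[nextCue++];
    video.pause(); // create the gap in the main audio
    const description = new Audio(cue.audioSrc);
    description.addEventListener("ended", () => void video.play()); // resume afterwards
    void description.play();
  }
});
```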
Note that this Success Criterion sits at conformance level AAA, which is generally considered aspirational: it goes beyond the standard A and AA levels, addresses more specific accessibility needs, and is not mandatory for all websites or content. However, achieving Level AAA can provide additional benefits in terms of inclusivity.
Who does this benefit?
- People who are blind or have low vision and cannot see the screen often rely on audio descriptions to access visual information.
- People with cognitive limitations who struggle to interpret visual content may also benefit from audio descriptions.
Here’s a video that presents standard audio description, followed by an extended audio description example for comparison.
Testing via automated testing
Automated testing rapidly identifies the absence of audio description support by analyzing metadata or content structure, providing high scalability and fast results.
However, automated testing cannot assess whether essential visual elements are properly described, nor can it evaluate timing, synchronization with the audio, or the quality of the narration. It also misses descriptions that are integrated into the main soundtrack, and its reliance on metadata alone makes it prone to false positives and false negatives.
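As a minimal sketch of what such a metadata-level check might look like, assuming an HTML5 player that exposes descriptions as a separate track element, the following scan flags videos with no descriptions track. It demonstrates both the scalability and the blind spots: it says nothing about quality or timing, and it misses descriptions mixed into the main audio.

```typescript
// Minimal metadata-level check: flag <video> elements that expose no
// <track kind="descriptions">. This is a heuristic only; it cannot see
// descriptions integrated into the main audio track.

function lacksDescriptionTrack(video: HTMLVideoElement): boolean {
  const tracks = Array.from(video.querySelectorAll("track"));
  return !tracks.some((track) => track.kind === "descriptions");
}

const flagged = Array.from(document.querySelectorAll("video")).filter(
  lacksDescriptionTrack
);

flagged.forEach((video) => {
  console.warn("Possible 1.2.7 issue: no descriptions track found for", video);
});
```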
Testing via Artificial Intelligence (AI)
Similar to automated testing, AI-based testing can detect the presence of audio descriptions by analyzing audio patterns, metadata, or speech gaps. It provides strong scalability through cloud-based processing and enables both near real-time and batch analysis for flexible, efficient evaluation.
AI-based testing can detect objects and events in video but can't confirm that all key visuals are described. It can estimate synchronization between narration and visuals, though precise alignment is difficult. While it may gauge speech clarity, it falls short on emotional tone and narrative quality. It might spot integrated audio descriptions from patterns but can't verify intent. Accuracy varies, with a moderate risk of false positives and false negatives depending on training data and content structure.
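As one hypothetical building block of such an analysis, a speech-gap detector might scan decoded audio for stretches of low energy where a description could fit. The window size, threshold, and minimum gap length below are assumed values, and finding a gap does not prove a description belongs there.

```typescript
// Hypothetical building block for AI-assisted analysis: locate stretches of
// low audio energy ("speech gaps") in decoded PCM samples. The threshold and
// minimum gap length are assumptions; real systems would combine this with
// speech recognition and scene analysis.

interface Gap {
  start: number; // seconds
  end: number;   // seconds
}

function findSpeechGaps(
  samples: Float32Array,
  sampleRate: number,
  minGapSeconds = 2.0,   // assumed shortest pause worth reporting
  energyThreshold = 0.01 // assumed RMS level counted as "silence"
): Gap[] {
  const windowSize = Math.floor(sampleRate * 0.05); // 50 ms analysis windows
  const gaps: Gap[] = [];
  let gapStart: number | null = null;

  for (let i = 0; i + windowSize <= samples.length; i += windowSize) {
    // Root-mean-square energy of the current window.
    let sumSquares = 0;
    for (let j = i; j < i + windowSize; j++) {
      sumSquares += samples[j] * samples[j];
    }
    const rms = Math.sqrt(sumSquares / windowSize);
    const time = i / sampleRate;

    if (rms < energyThreshold) {
      gapStart ??= time; // a quiet stretch begins (or continues)
    } else if (gapStart !== null) {
      if (time - gapStart >= minGapSeconds) {
        gaps.push({ start: gapStart, end: time });
      }
      gapStart = null;
    }
  }

  // Close a gap that runs to the end of the audio.
  if (gapStart !== null) {
    const end = samples.length / sampleRate;
    if (end - gapStart >= minGapSeconds) {
      gaps.push({ start: gapStart, end });
    }
  }
  return gaps;
}
```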
Testing via manual testing
Manual testing of extended audio description involves verifying the existence of an extended AD track or related functionality, and confirming that the video properly pauses to allow for these descriptions. Testers assess whether all essential visual content is accurately described during these pauses and evaluate if the pauses are appropriately timed and placed. This process relies heavily on full media context and direct observation, making the risk of misjudgment low. Manual testing is essential for evaluating user experience and the overall coherence of the narrative flow.
Manual testing of extended audio descriptions is not practical for large media libraries, as it is time-consuming and requires a full review of each video.
Which approach is best?
No single approach to testing extended audio description is perfect. However, combining the strengths of each approach yields better coverage than any one method on its own.
Extended audio description testing varies by method:
- Automated testing is limited to detecting metadata-level indicators and cannot assess actual usability.
- AI-based testing offers valuable support by identifying missing or out-of-sync extended AD, but lacks the ability to fully grasp storytelling or visual nuance.
- Manual testing remains essential for evaluating narrative flow, timing, and overall user experience.