Note: This content about audio description was created by humans, with the assistance of artificial intelligence.

Explanation of the success criterion

1.2.3 Audio Description or Media Alternative (Prerecorded) is a Level A Success Criterion. It requires that an alternative for time-based media, or an audio description of the prerecorded video content, be provided for synchronized media, unless the media is itself an alternative for text and is clearly labeled as such.
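To make the markup side concrete, one recognized web technique (WCAG technique H96) attaches a WebVTT descriptions track to the video element so that assistive technologies can voice the description text. The TypeScript sketch below builds such an element; the file names are hypothetical placeholders.

```typescript
// Minimal sketch of WCAG technique H96: a video element with a
// WebVTT "descriptions" track. File names are hypothetical.
const video = document.createElement("video");
video.src = "product-tour.mp4"; // hypothetical video file
video.controls = true;

const track = document.createElement("track");
track.kind = "descriptions"; // identifies the track as audio description text
track.src = "product-tour.descriptions.vtt"; // hypothetical WebVTT cue file
track.srclang = "en";
track.label = "Audio descriptions";

video.appendChild(track);
document.body.appendChild(video);
```

Note that a text-based descriptions track is only one way to meet the criterion; a separate narrated audio track or a full media alternative can serve the same purpose.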

Who does this benefit?

Prerecorded audio descriptions and media alternatives primarily benefit people who are blind or have low vision, as they provide spoken narration describing key visual elements of video content that are not conveyed through dialogue or sound alone, such as actions, scene changes, gestures, facial expressions, and on-screen text.

However, they can also be helpful for:

  • People with cognitive or learning disabilities, who may benefit from additional spoken context
  • Individuals in situations where they can’t look at the screen (e.g., multitasking or on the go)
  • Language learners, who might gain a better understanding of what’s happening visually through narrated context

In short, this media alternative enhances access and understanding for anyone who may not be able to fully process the visual information in a video.

Here is an example of a video with appropriate audio description.

Testing prerecorded audio description or media alternative via automated testing

Automated tools can detect the presence of an audio description track or related metadata, indicating whether an audio description is included. This kind of testing is fast and scales well. Automated tools can also identify the existence of a media alternative transcript, though they cannot evaluate its completeness or accuracy.
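As an illustration, an automated check of this kind might scan the page for a descriptions track or a nearby transcript link. The sketch below is presence-only, matching the limitations discussed next; the transcript selector is an assumption, since real pages link transcripts in many different ways.

```typescript
// Sketch of a presence-only automated check: it can confirm that a
// descriptions track or a transcript link exists, but says nothing
// about accuracy, completeness, or synchronization.
interface PresenceResult {
  hasDescriptionTrack: boolean;
  hasTranscriptLink: boolean;
}

function checkAudioDescriptionPresence(doc: Document): PresenceResult[] {
  return Array.from(doc.querySelectorAll("video")).map((video) => ({
    // <track kind="descriptions"> signals a text-based audio description
    hasDescriptionTrack:
      video.querySelector('track[kind="descriptions"]') !== null,
    // Assumption: a transcript is linked near the video via a
    // hypothetical data attribute; real pages vary widely.
    hasTranscriptLink: Boolean(
      video.parentElement?.querySelector("a[data-transcript]"),
    ),
  }));
}
```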

Automated testing does have significant limitations. It cannot assess the quality or relevance of the descriptions provided, nor does it evaluate synchronization between the audio description and the video content. Additionally, important factors like narration tone and clarity are not reviewed. These limitations contribute to a high rate of false positives and negatives, as automated tools may miss integrated audio descriptions or incorrectly flag content that does include them.

Testing prerecorded audio description or media alternative via Artificial Intelligence (AI)

AI-based accessibility testing can detect the presence of audio descriptions through audio track analysis and metadata, offering a more advanced approach than basic automated testing. While it is faster than manual testing, it remains slightly slower than basic automation due to the complexity of analysis. However, with cloud-based processing, AI solutions offer good scalability, making them suitable for larger volumes of content.

Some AI systems are capable of assessing object and event coverage in video content, though they may miss subtle nuances. They can also estimate synchronization by analyzing object timing and audio cues. While AI may detect the presence of media alternatives, it does not fully evaluate their accuracy or completeness. Additionally, it may assess speech clarity, but struggles to capture more nuanced elements like tone. Overall, the rate of false positives and negatives is moderate, as AI performs better than basic automation but is still not flawless. With continued advancements, its reliability is expected to improve over time.
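To make that workflow concrete, the sketch below shows one way an AI pipeline might estimate coverage and synchronization once a speech-to-text service has transcribed the description track and a computer-vision service has detected visual events. The inputs here are plain data so the logic is runnable; the five-second matching window is an arbitrary assumption, not an established threshold.

```typescript
// Sketch: estimate how well narration cues cover detected visual events.
// In a real pipeline the inputs would come from speech-to-text and
// computer-vision services; here they are supplied as plain data.
interface TimedItem {
  text: string;
  startSec: number;
}

function estimateCoverage(
  events: TimedItem[], // visual events detected in the video
  narration: TimedItem[], // transcribed audio description cues
  windowSec = 5, // how close a cue must be to "cover" an event (assumption)
) {
  const covered = events.filter((ev) =>
    narration.some((cue) => Math.abs(cue.startSec - ev.startSec) <= windowSec),
  );
  return {
    coverageRatio: events.length ? covered.length / events.length : 1,
    uncovered: events.filter((ev) => !covered.includes(ev)),
  };
}

// Example: one of two visual events lacks a nearby description cue.
const result = estimateCoverage(
  [
    { text: "door opens", startSec: 12 },
    { text: "chart appears", startSec: 40 },
  ],
  [{ text: "A door opens slowly", startSec: 13 }],
);
console.log(result.coverageRatio); // 0.5
```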

Testing prerecorded audio description or media alternative via manual testing

Manual accessibility testing involves a human evaluator checking for the presence of a proper audio description track or integrated narration within the video. The tester carefully evaluates whether the descriptions accurately reflect key visual information and ensures that the audio description is properly synchronized with the visuals. Additionally, they verify the completeness of the descriptive transcript and assess the tone, emotion, and clarity of the narration. When performed by trained testers, this approach results in low rates of false positives and negatives, making it a highly reliable method for assessing audio description quality.
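The checks a manual tester works through can be captured as a simple checklist. The sketch below is one hypothetical way to record the review described above; it is not a standard form, just a structured mirror of those criteria.

```typescript
// Hypothetical record of a manual review, mirroring the checks above.
interface ManualReview {
  descriptionPresent: boolean; // track or integrated narration exists
  reflectsKeyVisuals: boolean; // descriptions match on-screen information
  synchronized: boolean; // narration aligns with the visuals
  transcriptComplete: boolean; // descriptive transcript covers the media
  narrationClear: boolean; // tone, emotion, and clarity are adequate
  notes: string;
}

function passes(review: ManualReview): boolean {
  // In this sketch, every check must hold for the media to pass review.
  return (
    review.descriptionPresent &&
    review.reflectsKeyVisuals &&
    review.synchronized &&
    review.transcriptComplete &&
    review.narrationClear
  );
}
```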

The biggest downside to manual accessibility testing is its limited speed and scalability. Manual testing is time-consuming and requires significant human effort, making it difficult to scale effectively. Because of its resource-intensive nature, it is not well suited to large volumes of content or high-frequency testing needs.

Which approach is best?

It depends on your situation. Automated testing can identify metadata or the presence of prerecorded audio description or media alternative tracks but lacks the ability to evaluate the quality or relevance of the content. AI-based testing is an emerging solution that offers improved detection capabilities and can begin to assess content quality, though it still falls short of human-level understanding, especially when it comes to nuanced media interpretation. Manual testing remains the most effective approach for evaluating tone, relevance, and timing, but it is also the most time-intensive and resource-demanding method.

Related Resources