Artificial intelligence (AI) is developing rapidly and is increasingly being used in audiovisual productions. But can AI also play a role in creating audio description (AD) for film and television? A recent study by Scribit.Pro, commissioned by Bartiméus Fonds and supported by CFAP and NPO Innovatie, offers new insights. The results show that AI has promising applications but still falls short for narrative content.
What is audio description and why is it important?
Audio description makes films and television accessible to people with visual impairments by adding descriptions of what is happening on screen. Good audio description not only conveys visual elements, but also translates atmosphere, emotion and nuances that are essential for the experience of a story.
AI and audio description: the possibilities and limitations
The study used ChatGPT-4o, at the time of the study the most advanced AI model for image description. Other models, such as Google Gemini and Anthropic Claude, did not prove suitable for generating usable audio description.
Significant improvements in AI output were achieved through advanced prompt engineering. The use of a film script and subtitles provided more context, so that the descriptions better matched the images. Yet AI continued to struggle with narrative nuances and recognizing subtle visual details. This often resulted in descriptions that were too superficial or too detailed, without the right focus on what is really important to the viewer.
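To make the idea of adding context concrete, here is a minimal sketch of how a script excerpt and the subtitles for one scene could be combined into a single text prompt. It does not reproduce the prompts used in the study: the `Subtitle` class, the functions `subtitles_in_window` and `build_prompt`, the word limit and the prompt wording are all assumptions made for illustration. The sketch only assembles the text; sending it to a model together with the video frames is left out.

```python
from dataclasses import dataclass


@dataclass
class Subtitle:
    """One subtitle cue (hypothetical structure, times in seconds)."""
    start: float
    end: float
    text: str


def subtitles_in_window(subs, start, end):
    """Return the subtitles that overlap the interval between start and end."""
    return [s for s in subs if s.start < end and s.end > start]


def build_prompt(script_excerpt, subs, start, end, max_words=25):
    """Assemble a text prompt for one scene (illustrative wording only)."""
    dialogue = "\n".join(
        f"[{s.start:.1f}-{s.end:.1f}] {s.text}"
        for s in subtitles_in_window(subs, start, end)
    )
    return (
        "You are writing audio description for viewers with a visual impairment.\n"
        f"Describe what is visible between {start:.1f}s and {end:.1f}s "
        f"in at most {max_words} words.\n"
        "Focus on what matters for the story; do not repeat the dialogue.\n\n"
        f"Script excerpt:\n{script_excerpt}\n\n"
        f"Subtitles in this scene:\n{dialogue or '(no dialogue)'}\n"
    )


# Made-up example data: only the cue that overlaps 4.0s-10.0s is included.
subs = [
    Subtitle(0.0, 2.0, "Good morning."),
    Subtitle(5.0, 7.5, "Where is he?"),
    Subtitle(12.0, 14.0, "Gone."),
]
prompt = build_prompt(
    "INT. KITCHEN - DAY. A man waits at the table.", subs, 4.0, 10.0
)
print(prompt)
```

Limiting the subtitles to the scene's time window keeps the prompt short and tells the model which dialogue the viewer already hears, so the description can concentrate on what is only visible.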
How do users and professionals rate AI audio description?
A crucial part of the research was the assessment of AI-generated audio description by end users and professionals. People with a visual impairment rated the short film Der Kaiser (NTR Kort) 6.5 with AI audio description. Without audio description the film received a 3.5, while the human audio description scored a 9.1. So while AI is a clear improvement over no audio description at all, users often found the AI descriptions too busy, too repetitive, and less well tailored to the film experience.
Professional audio describers at Scribit.Pro see opportunities to use AI as a tool, but point out that full automation is not yet possible. For animated films and drama productions in particular, AI produces hardly any usable audio description, and a lot of post-processing is required.
Documentaries proved more suitable for AI audio description: 60-80% of the generated text was usable after human review and editing.
The future of AI in audio description
For the time being, AI is not suitable for audio description of drama films and other narrative content, due to the lack of context and understanding of narrative structures. However, in documentary productions and informative content, AI can play a supporting role, as long as a human editor remains involved.
That said, developments in AI are moving quickly. Emerging models, such as the open-source DeepSeek, show that great progress is being made. The researchers therefore advise continuing to monitor technological developments closely and to keep experimenting with new AI tools.
Do you want to stay informed about developments in AI and AD? Subscribe to our newsletter.