Date: 2023-08-04

The use of artificial intelligence (AI) in audio transcription, captioning, and automatic speech recognition (ASR) services has gained popularity in recent years. However, there is growing recognition of the need for human oversight in these AI-powered applications.

Captions and subtitles are essential for individuals who are deaf or hard of hearing, providing them with access to media and information. With the rise of on-demand streaming services, the demand for better captioning options has increased. While AI has been integrated into platforms like YouTube and TikTok for video-based captioning, it is important to consider the limitations and accuracy of these AI tools.

A recent report by 3Play Media, a video accessibility and captioning services company, analyzed the impact of generative AI tools on captions for individuals with hearing disabilities. The report found that even the best AI engines reached only about 90% accuracy when measured by word error rate and about 80% when measured by formatted error rate. Both figures fall short of the 99% accuracy that the industry treats as the standard for accessibility compliance.
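To make the word error rate figure concrete, here is a minimal Python sketch of how such a metric is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the machine output, divided by the number of reference words. The function name `word_error_rate`, the sample sentences, and the reading of "accuracy" as one minus the error rate are assumptions made for illustration, not 3Play Media's methodology. The sketch ignores punctuation and capitalization, so it does not attempt a formatted error rate.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of ten.
reference = "the quick brown fox jumps over the lazy dog today"
hypothesis = "the quick brown fox jumped over the lazy dog today"

wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.0%}, accuracy: {1 - wer:.0%}")  # WER: 10%, accuracy: 90%
```

Under this reading, 90% accuracy means roughly one word in ten is wrong, while the 99% standard allows about one in a hundred.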

Laws and regulations such as the Americans with Disabilities Act (ADA) and Federal Communications Commission (FCC) rules mandate accurate captions for individuals with communication disabilities. The report found, however, that caption accuracy varied across markets and use cases, with news, networks, cinematic, and sports content being the most challenging for AI to transcribe accurately.

Although AI performance has improved, error rates remain high enough to require collaboration with human editors in all tested markets. “Human-in-the-loop” systems, in which AI tools and human editors work together, have been promoted as a way to minimize bias and ensure accuracy.

The importance of human oversight is also emphasized by organizations such as the World Wide Web Consortium (W3C), through its Web Accessibility Initiative, and 3Play Media. While automatic captions can serve as a starting point, significant editing is often required to meet user needs and accessibility requirements.

Additionally, there are concerns about AI “hallucinations” in captioning services, where AI-generated text may contain factual inaccuracies or completely fabricated sentences. Misinformation and defamation issues have arisen in AI chatbot systems as well.

As AI continues to be integrated into various technologies, it is evident that human oversight in AI captioning services is crucial. Balancing the capabilities of AI with human expertise is necessary to ensure accurate and accessible captioning for individuals with communication disabilities.