Even when individuals are aware that they may be listening to AI-generated speech, it remains challenging for both English and Mandarin speakers to accurately identify a deepfake voice. This poses a potential risk for the billions of people who speak these widely used languages, as they may unknowingly be exposed to deepfake scams and misinformation.

Researchers at University College London conducted a study involving over 500 participants to assess their ability to detect speech deepfakes among various audio clips. Some of the clips contained the genuine voice of a female speaker reading standard sentences in English or Mandarin, while others were deepfakes produced by generative AIs trained on female voices.

The participants were divided into two groups. The first group listened to 20 voice samples in their native language and had to determine whether each clip was real or fake. Participants correctly classified the deepfakes and authentic voices around 70% of the time for both English and Mandarin samples. However, detection accuracy is likely to be lower in real-life situations, where individuals may not be aware that they are listening to AI-generated speech.

The second group was presented with 20 randomly selected pairs of audio clips, each pair consisting of the same sentence spoken once by a human and once by a deepfake. Participants had to identify which clip in each pair was fake. In this setup, detection accuracy improved to over 85%. However, the researchers acknowledged that this scenario gave participants an advantage they would not have in real life.

The study did not assess whether participants could tell if a deepfake resembled the voice of the specific individual being mimicked. This aspect is crucial in real-life situations: scammers have used cloned voices of business leaders to deceive employees into transferring money, and deepfakes of well-known politicians have been spread through social media in misinformation campaigns.

The research offers valuable insight into how far AI-generated deepfakes have progressed, but attempts to train participants to improve their detection skills generally failed. This makes the development of AI-powered deepfake detectors essential. The researchers are exploring the use of large language models capable of processing speech data to enhance deepfake detection.