CityLife

The Power of AI Models

AI Generates Famous Artists’ Voices in Viral Music Videos

By Robert Andrew

Aug 4, 2023
A YouTube channel called There I Ruined It has gained popularity for its viral music videos that feature AI-generated voices of famous musical artists singing unexpected songs. One example includes a rendition of Sir Mix-a-Lot’s “Baby Got Back” sung by an AI-generated voice that emulates Elvis Presley. Another video showcases a faux Johnny Cash singing the lyrics to Aqua’s “Barbie Girl.”

To achieve this effect, There I Ruined It relies on generative AI. The channel’s creator, musician Dustin Ballard, uses an AI model called so-vits-svc to transform his own vocal recordings into the voices of other artists. The process is not user-friendly and the model requires rigorous training, but once it is trained, a vocal track can be uploaded and its voice replaced with the modeled one. The rest of the song is then built around this AI-generated voice.

The name so-vits-svc combines three parts. The “so” comes from “SoftVC,” which breaks down the singer’s voice into key parts to be learned by a neural network. “VITS” refers to “Variational Inference with adversarial learning for end-to-end Text-to-Speech,” and “SVC” signifies “singing voice conversion.” In There I Ruined It songs, the model’s primary function is to change the timbre of Ballard’s voice to resemble that of the chosen artist.

In the case of the Elvis rendition, Michael van Voorst, the creator of the Elvis voice AI model, gathered clean vocal audio samples from Elvis Presley’s Aloha From Hawaii concert in 1973. After carefully selecting the samples and removing any interference or noise, van Voorst extracted 10-second chunks of high-quality audio for processing.
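The article does not say which tools van Voorst used to cut his clips, so the following is only a minimal sketch of the chunking step, written in Python with the standard-library wave module. The function names, the output file naming, and the choice to drop a short final chunk are all assumptions made for illustration.

```python
import wave
from pathlib import Path


def chunk_bounds(total_frames, sample_rate, chunk_seconds=10.0, keep_partial=False):
    """Return (start, end) frame indices for consecutive fixed-length chunks."""
    chunk_len = int(round(sample_rate * chunk_seconds))
    if chunk_len < 1:
        raise ValueError("sample_rate and chunk_seconds must give at least one frame")
    bounds = []
    for start in range(0, total_frames, chunk_len):
        end = min(start + chunk_len, total_frames)
        # Assumption: a trailing chunk shorter than chunk_seconds is dropped
        # unless the caller explicitly asks to keep it.
        if end - start < chunk_len and not keep_partial:
            break
        bounds.append((start, end))
    return bounds


def split_wav(src, out_dir, chunk_seconds=10.0):
    """Write fixed-length chunks of a WAV file into out_dir and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src), "rb") as wav:
        params = wav.getparams()
        bounds = chunk_bounds(wav.getnframes(), wav.getframerate(), chunk_seconds)
        for index, (start, end) in enumerate(bounds):
            wav.setpos(start)
            frames = wav.readframes(end - start)
            # Hypothetical naming scheme: chunk_0000.wav, chunk_0001.wav, ...
            target = out_dir / f"chunk_{index:04d}.wav"
            with wave.open(str(target), "wb") as out:
                out.setparams(params)  # frame count in the header is fixed on close
                out.writeframes(frames)
            written.append(target)
    return written
```

Calling split_wav("cleaned_vocals.wav", "dataset_chunks") on a hypothetical cleaned recording would produce one file per full 10 seconds of audio. Noise removal and sample selection, which the article describes as done with care, are outside the scope of this sketch.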

Training a model with so-vits-svc-fork, a fork of so-vits-svc, involves several steps, including pre-resampling the audio, downloading configuration files, and running speech model pre-training. Training is then started with the command “svc train -t” and progress can be monitored through TensorBoard.
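As a rough illustration of that sequence, the sketch below lists the steps as commands and prints them in order. Only “svc train -t” is quoted in this article. The three preceding subcommand names (pre-resample, pre-config, pre-hubert) are assumptions based on the so-vits-svc-fork README and should be checked against the installed version. The run_pipeline helper is hypothetical and defaults to a dry run that executes nothing.

```python
import shutil
import subprocess

# Each entry pairs a description with a command. Only the last command is
# confirmed by the article; the first three names are assumptions.
TRAINING_STEPS = [
    ("pre-resample the dataset audio", ["svc", "pre-resample"]),
    ("set up the configuration files", ["svc", "pre-config"]),
    ("run the speech-model preprocessing step", ["svc", "pre-hubert"]),
    ("train, with TensorBoard for monitoring", ["svc", "train", "-t"]),
]


def run_pipeline(steps=TRAINING_STEPS, dry_run=True):
    """Print each step in order and, if dry_run is False, execute it."""
    executed = []
    for description, command in steps:
        line = " ".join(command)
        print(f"{description}: {line}")
        if not dry_run:
            if shutil.which(command[0]) is None:
                raise FileNotFoundError(f"'{command[0]}' is not on PATH")
            # check=True stops the sequence as soon as one step fails.
            subprocess.run(command, check=True)
        executed.append(line)
    return executed


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Keeping the steps in a list makes the order explicit and lets a failed step halt the run before training starts on incomplete data.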

These AI-generated music videos demonstrate the creative possibilities that can be achieved through the fusion of human talent and AI technology. As AI continues to advance, we can expect more innovative applications in the realm of music and entertainment.

