Date: 2023-08-05

A team of researchers from British universities has trained a deep learning model to recover typed data from keystroke sounds recorded through a microphone. The model achieved 95% accuracy when trained on keystrokes recorded by a nearby smartphone. When trained on keystrokes recorded over the video conferencing platform Zoom, accuracy dropped only slightly, to 93%.

Acoustic attacks like this one pose a significant threat to data security because they can expose sensitive information such as passwords, conversations, and messages to malicious third parties. Unlike many other side-channel attacks, they have become easier to mount, because microphone-equipped devices capable of high-quality audio capture are now widespread. Advances in machine learning have further increased the feasibility and danger of sound-based side-channel attacks.

In this attack, the first step is to record the keystrokes made on the target’s keyboard. This can be done using a nearby microphone, exploiting malware with microphone access on the target’s phone, or recording keystrokes during a Zoom call.

To train their prediction algorithm, the researchers collected training data by pressing 36 keys on a modern MacBook Pro multiple times and recording the sound produced by each keypress. They processed the recordings to extract a waveform and a spectrogram for each keypress, which show identifiable differences between keys. These spectrogram images were used to train an image classifier called ‘CoAtNet’.
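
To make the spectrogram step concrete, here is a minimal sketch in Python using only NumPy. It is not the researchers' pipeline: the synthetic "keystroke" (a decaying 3 kHz tone plus noise), the function names, and the frame and hop sizes are all assumptions chosen for illustration. It computes a plain short-time Fourier magnitude spectrogram, which may differ from the exact representation used in the study.

```python
import numpy as np

def synthetic_keystroke(sample_rate=44100, duration=0.2, freq=3000.0, seed=0):
    """Hypothetical stand-in for a recorded keypress: a decaying tone plus faint noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(sample_rate * duration)) / sample_rate
    click = np.sin(2 * np.pi * freq * t) * np.exp(-40.0 * t)
    return click + 0.01 * rng.standard_normal(t.size)

def spectrogram(signal, frame_len=1024, hop=256):
    """Magnitude spectrogram: rows are frequency bins, columns are time frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One real-input FFT per frame, then transpose to (frequency, time).
    return np.abs(np.fft.rfft(frames, axis=1)).T

sample_rate = 44100
audio = synthetic_keystroke(sample_rate=sample_rate)
spec = spectrogram(audio)

# Log scaling makes quieter components visible, as in a typical spectrogram image.
log_spec = 20 * np.log10(spec + 1e-10)

# Frequency (in Hz) of the strongest bin in the first frame.
peak_hz = np.argmax(spec[:, 0]) * sample_rate / 1024
print(spec.shape, round(peak_hz))
```

The resulting two-dimensional array can be treated as a grayscale image, which is what makes an image classifier applicable to audio in the first place.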

In their experiments, the researchers used a MacBook Pro keyboard, an iPhone 13 mini placed 17 cm away from the target, and the Zoom video conferencing platform. The CoAtNet classifier achieved 95% accuracy when trained on smartphone recordings and 93% accuracy when trained on recordings made through Zoom. Skype produced a lower but still usable accuracy of 91.7%.

Possible mitigations against acoustic side-channel attacks include altering typing styles, using randomized passwords, reproducing keystroke sounds with software, applying white noise or software-based keystroke audio filters, employing biometric authentication when possible, and utilizing password managers to reduce manual input of sensitive information.
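
As a rough illustration of the white-noise mitigation, the sketch below mixes Gaussian white noise into a signal at a chosen signal-to-noise ratio. The function name `add_white_noise`, the test tone, and the 0 dB target are assumptions for illustration only; the source does not say how much noise is needed to defeat the classifier.

```python
import numpy as np

def add_white_noise(signal, snr_db, seed=0):
    """Return the signal mixed with white noise at the requested SNR in decibels."""
    rng = np.random.default_rng(seed)
    signal_power = np.mean(signal ** 2)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.standard_normal(signal.shape) * np.sqrt(noise_power)
    return signal + noise

# Hypothetical test tone standing in for keystroke audio.
t = np.arange(8000) / 8000.0
tone = np.sin(2 * np.pi * 440.0 * t)

masked = add_white_noise(tone, snr_db=0.0)

# Measure the SNR actually achieved by comparing the tone to the added noise.
residual = masked - tone
achieved_snr_db = 10 * np.log10(np.mean(tone ** 2) / np.mean(residual ** 2))
print(round(achieved_snr_db, 1))
```

A lower `snr_db` buries the keystroke sound more deeply, at the cost of more audible noise for everyone on the call.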