
A Passion Avenue For Science

Introduction

Our current project develops data collection methods and deep-learning keystroke prediction to tackle acoustic attacks on keyboards. The project aims to discuss the dangers of side-channel attacks. Side-channel attacks are dangerous because they exploit unintentionally leaked information, such as the sound of typing, to target specific data such as passwords, messages, or other personal data. This threat is growing with the rapid development of Artificial Intelligence. The project uses audio spectrograms and image classification to discuss the importance of audio cybersecurity. To demonstrate the impact of these side-channel attacks, keystroke audio recordings are converted into spectrogram images. The images are then fed into an AI model that predicts the keystrokes, simulating a side-channel attack.


Fourier Transforms.

The Fourier transform is a mathematical operation that converts signals between domains, such as from the time domain to the frequency domain. This makes it possible to identify the dominant frequencies within a signal. Spectrograms, which visually represent frequencies over time, apply Fourier transforms to short, successive windows of the signal to display frequency information as a function of time.
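As a small illustration of moving from the time domain to the frequency domain, the sketch below computes a naive discrete Fourier transform and reads off the strongest frequency in a signal. The function names, the test tone, and the sample rate are assumptions made for this example and are not taken from the project code. A real pipeline would normally use an optimized FFT routine instead of this O(n^2) loop.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a sequence of samples."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def dominant_frequency(samples, sample_rate):
    """Return the frequency in Hz of the strongest non-DC bin below Nyquist."""
    spectrum = dft(samples)
    half = len(samples) // 2
    peak_bin = max(range(1, half), key=lambda k: abs(spectrum[k]))
    return peak_bin * sample_rate / len(samples)


# Hypothetical test signal: a 50 Hz sine sampled at 400 Hz for 0.5 seconds.
SAMPLE_RATE = 400
signal = [math.sin(2 * math.pi * 50 * t / SAMPLE_RATE) for t in range(200)]
print(dominant_frequency(signal, SAMPLE_RATE))
```

Because the tone completes a whole number of cycles inside the 200 samples, its energy lands in a single frequency bin, which is why the peak is easy to pick out.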


Data Collecting.

This project was developed using Python. Initially, a program was created to simultaneously record audio from the laptop microphone and capture keystrokes. During the data collection process, audio recordings of keyboard typing were converted into spectrograms and subsequently sampled. The recorded keystrokes were stored in a JSON file. 


Three separate files were created during the data collection process: main.py, audio_recorder.py, and keylogger.py. In audio_recorder.py, a callback function takes input data, frames, time, and status. The program prompts the user to press ‘Enter’ to start recording and waits for input. When the ‘Enter’ key is pressed again, it stops recording and saves the data as a NumPy array. The start_record function also retrieves the current time and date, saving the WAV file output with a timestamp. In keylogger.py, the code includes a constructor to initialize the counter and start_time. The start_recording method starts the keylogger by creating a listener that activates the on_press and on_release methods when a key is pressed or released. Finally, both components were imported into the main.py file.
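The keylogger described above could look roughly like the following sketch. It is not the project's keylogger.py: the event format, the `save` method, and the injectable `clock` parameter are hypothetical, and the use of the pynput library for the listener is an assumption, since the original text does not name the library.

```python
import json
import time


class KeyLogger:
    """Collects timestamped key press and release events."""

    def __init__(self, clock=time.time):
        self.clock = clock      # injectable clock (hypothetical, eases testing)
        self.counter = 0        # number of key presses seen so far
        self.start_time = None  # set when recording starts
        self.events = []

    def start_recording(self):
        """Record the start time and attach a keyboard listener."""
        # Assumption: pynput provides the listener. It is imported lazily so
        # the rest of the class works without pynput installed.
        from pynput import keyboard

        self.start_time = self.clock()
        listener = keyboard.Listener(
            on_press=self.on_press, on_release=self.on_release
        )
        listener.start()
        return listener

    def _log(self, key, action):
        # Store times relative to the start so they line up with the audio.
        elapsed = self.clock() - self.start_time
        self.events.append({"key": str(key), "action": action, "time": elapsed})

    def on_press(self, key):
        self.counter += 1
        self._log(key, "press")

    def on_release(self, key):
        self._log(key, "release")

    def save(self, path):
        """Write the recorded events to a JSON file."""
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(self.events, handle, indent=2)
```

Storing times relative to `start_time` is one possible design choice: if the audio recording starts at the same moment, each keystroke can later be matched to a position in the WAV file.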


Generating Spectrograms.

A spectrogram is a visual representation of the frequencies of a signal over time. It is a widely used tool for analyzing real-world signals. In this research, the WAV files were converted into spectrograms. In a spectrogram, color represents the signal strength at each frequency and point in time.
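To make the idea concrete, here is a minimal sketch of how a spectrogram can be computed: the signal is cut into overlapping frames, each frame is windowed, and a Fourier transform gives the magnitude per frequency bin. The frame size, hop length, and test signal are assumptions chosen for the example. The project itself may have used a library routine with different settings.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_size//2) per frame."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_size - 1))
        for i in range(frame_size)
    ]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * i / frame_size)
                for i, x in enumerate(frame)
            )
            magnitudes.append(abs(coeff))
        frames.append(magnitudes)
    return frames


# Hypothetical signal: silence followed by a 100 Hz tone, sampled at 800 Hz.
SAMPLE_RATE = 800
signal = [0.0] * 256 + [
    math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(256)
]
spec = spectrogram(signal)
loudest_bin = max(range(len(spec[-1])), key=lambda k: spec[-1][k])
print(len(spec), "frames,", len(spec[0]), "frequency bins")
print("peak in last frame:", loudest_bin * SAMPLE_RATE / 64, "Hz")
```

Plotting `spec` as an image, with time on one axis, frequency on the other, and magnitude mapped to color, gives the kind of picture that an image model can consume.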


Data Preparation.

Here is the flow of the model preparation and training process:

Start > Set PyTorch Lightning seed > Add arguments with parser.add_argument > Set number of samples > Define UNet model > Create dataset loader > Perform split (train, test, val loader) > Define pipeline > Set class KeyStrokesPredictor > Define Lightning trainer module > End.


In this stage, the program begins by calling the get_config function to retrieve the configuration of the layer settings and optimizer. The argparse module is imported to define the model’s arguments, which are then specified using the parser.add_argument method. The UNet model is also defined, with its structure outlined in the model.py file, which includes input, convolutional, hidden, and output layers. The stage concludes with the definition of the Lightning trainer module to initialize the maximum number of epochs, precision, learning rate, and strategy. At this point, the model is ready to be trained.
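The argument and split steps of this stage might be sketched as follows. Every flag name, default value, and split fraction here is hypothetical and chosen only for illustration. The project's actual arguments are not listed in this write-up.

```python
import argparse
import random


def build_parser():
    """Define training arguments (all names and defaults are assumptions)."""
    parser = argparse.ArgumentParser(description="Keystroke model training sketch")
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--num-samples", type=int, default=1000)
    parser.add_argument("--max-epochs", type=int, default=10)
    parser.add_argument("--precision", type=str, default="32")
    parser.add_argument("--learning-rate", type=float, default=1e-3)
    parser.add_argument("--strategy", type=str, default="auto")
    return parser


def split_indices(num_samples, seed, val_fraction=0.1, test_fraction=0.1):
    """Shuffle sample indices reproducibly and split into train/val/test."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    n_val = int(num_samples * val_fraction)
    n_test = int(num_samples * test_fraction)
    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]
    return train, val, test


# An empty list makes argparse use the defaults instead of reading sys.argv.
args = build_parser().parse_args([])
train, val, test = split_indices(args.num_samples, args.seed)
print(len(train), len(val), len(test))
```

Fixing the seed before shuffling keeps the three subsets identical between runs, which makes results from different experiments comparable.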

The UNet model has a relatively simple structure that does not compromise its performance. It consists of convolutional layers arranged as an encoder, which extracts features from the input, and a decoder, which up-samples the feature maps. Together, these layers enable the model to segment images accurately.


Model Training.


The training process begins with a feedforward step, where inputs are fed through the network to produce predictions. The same step is used to predict keystrokes after training is complete. The predictions are then compared with the expected outcomes using a loss function. Backpropagation computes the gradient of this loss with respect to the model's parameters, and the weights in each layer are updated accordingly.
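The cycle of forward pass, loss, gradient, and update can be shown on a deliberately tiny model. The sketch below trains a single sigmoid neuron with plain gradient descent. It is a stand-in for illustration, not the UNet used in the project, and the toy features and labels are invented for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, learning_rate=0.5, epochs=200):
    """Train one sigmoid neuron; return (weights, bias, loss per epoch)."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    losses = []
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        total_loss = 0.0
        for x, y in zip(features, labels):
            # Forward pass: compute the predicted probability.
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            # Binary cross-entropy compares the prediction with the label.
            total_loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            # Backward pass: for sigmoid plus cross-entropy, dLoss/dz = p - y.
            error = p - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        # Update step: move the parameters against the average gradient.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
        losses.append(total_loss / n)
    return weights, bias, losses


# Hypothetical toy data: two features per sample, binary labels.
features = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.3]]
labels = [1, 1, 0, 0]
weights, bias, losses = train(features, labels)
print(f"loss went from {losses[0]:.3f} to {losses[-1]:.3f}")
```

In a deep network the same gradient is propagated backwards layer by layer through the chain rule, and frameworks such as PyTorch compute it automatically.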

In this part, the epoch is retrieved to capture the fixed timestamps. Compared to writing the training boilerplate by hand in plain PyTorch, PyTorch Lightning offers flexibility that facilitates experimentation and prototyping, making it particularly useful for training complex models efficiently.


Result and Analysis.

A web application was created in which audio is recorded and sent to a backend server. The data is then processed by a trained AI model, which uses the new data to generate predictions that are sent back to the web application. This process allows the model to predict and determine the probability of a key being pressed at certain times, providing the likelihood as a percentage.


The output is a tensor with three dimensions: number of files, number of keys, and timestamps. Each value is a probability from 0 to 1, with 0 being unlikely and 1 being very likely. A higher value for a key at a given timestamp therefore suggests a higher probability that the key was pressed at that time. Conversely, a lower value indicates a low likelihood of the key being pressed.
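One way to read such an output is sketched below, using nested lists in place of a tensor. The key names, the probability values, and the 0.5 threshold are all invented for the example. The project's decoding step may differ.

```python
def decode_predictions(probabilities, key_names, threshold=0.5):
    """Report the most likely key per time step if it clears the threshold.

    probabilities is indexed as [file][key][time step], values in [0, 1].
    """
    results = []
    for file_probs in probabilities:
        num_steps = len(file_probs[0])
        detected = []
        for step in range(num_steps):
            best_key = max(
                range(len(key_names)), key=lambda k: file_probs[k][step]
            )
            best_prob = file_probs[best_key][step]
            if best_prob >= threshold:
                detected.append((step, key_names[best_key], best_prob))
        results.append(detected)
    return results


# Hypothetical model output: 1 file, 3 keys, 4 time steps.
keys = ["a", "b", "space"]
output = [[
    [0.05, 0.91, 0.10, 0.02],  # "a"
    [0.03, 0.04, 0.08, 0.77],  # "b"
    [0.02, 0.01, 0.30, 0.05],  # "space"
]]
for step, key, prob in decode_predictions(output, keys)[0]:
    print(f"step {step}: {key!r} with probability {prob:.0%}")
```

Time steps where no key clears the threshold are treated as silence, which avoids reporting a keystroke when the model is unsure.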


Conclusion, Future Outlook, Solutions.

Through this machine learning model, we have been able to predict keypresses and their probability. We recorded WAV files as input and transformed them into spectrograms using the Fast Fourier Transform, and the spectrograms were then processed by the model. This project serves as a simulation of side-channel attacks, aimed at educating society about cybersecurity.


While the future remains unpredictable due to ongoing technological advancements, we hope this project raises awareness about the dangers of side-channel attacks. To further develop the model, it would be beneficial to implement this program on devices beyond laptops and computers. Another idea is to create a similar program that enables audio-wireless keyboards, enhancing user accessibility and cybersecurity. As a future solution to cyber attacks, incorporating laser or visual keyboards could prevent security threats or hardware-based keyloggers.

In this work, Nabila and her mentor aim to develop data collection methods using acoustic and audio-based keyboards to mitigate the risk of cyber attacks and threats.

Developing Data Collection Methods through Acoustic Keyboard and Audio-based Keyboard Attack

2023
