Swish and Flick — Gesture Detection using Deep Learning on Edge Devices

Vinay Lanka
4 min read · Dec 2, 2022


Using an ESP-32 and Edge Impulse to recognize magic spells

Magic!

“Wingardium Leviosa”. One of the most iconic spells in the Harry Potter universe got us thinking about how they managed to detect the words and gestures required for the spell to work.

It then hit us: Natural Language Processing and Gesture Recognition, of course! Does this mean Harry Potter pioneered Deep Learning research? That’s food for thought.

In this blog, let’s look at the Gesture Recognition aspect of wizardry. How do we get a wand equipped with the necessary intelligence to detect gestures? Read on to find out!

Gesture Recognition

Gesture recognition is one of the simpler demonstrations of machine learning and its capabilities. It’s also an active research field in Human-Computer Interaction with many applications.

For our wand, this requires building a gesture classifier using the input data from an accelerometer.

The machine learning model takes in the ‘messy’ raw accelerometer data, makes sense of it, and classifies it into gestures.

Parts and Software Used

  • Espressif ESP-32 (We used the ESP32-Devkit-V1)
  • GY-521 MPU6050 breakout board
  • Jumper Wires
  • Breadboard (Optional)
  • Edge Impulse Studio
  • Arduino IDE (set up for ESP-32 boards)

Hardware Setup

The ESP-32 board and the MPU6050 breakout board are connected over I2C, and the whole system is powered via micro-USB since it stays connected to a PC for the next steps.

Connections
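As a quick sanity check before moving on, a minimal sketch like the one below can confirm the sensor is visible on the bus. This is just a sketch under assumptions: the ESP-32’s default I2C pins (SDA on GPIO 21, SCL on GPIO 22) and the MPU6050’s default address of 0x68 (AD0 tied low).

```cpp
#include <Wire.h>

void setup() {
  Serial.begin(115200);
  Wire.begin(21, 22);            // SDA, SCL (ESP-32 defaults)
  Wire.beginTransmission(0x68);  // MPU6050 default I2C address (AD0 low)
  if (Wire.endTransmission() == 0) {
    Serial.println("MPU6050 found on the I2C bus");
  } else {
    Serial.println("MPU6050 not found - check the wiring");
  }
}

void loop() {}
```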

Data Acquisition

We first need to start a project on Edge Impulse Studio for this application.

Edge Impulse Studio

As we’re using the Edge Impulse Suite for training and designing a deployable model for edge devices, we use their edge-impulse-data-forwarder CLI tool. This involves downloading the Edge Impulse CLI.

Their documentation is really straightforward: the data just needs to be comma-separated and sent over Serial at a fixed frequency.
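For reference, the serial stream the forwarder picks up looks something like this: three comma-separated acceleration values per line, one line per sample (the numbers below are made up for illustration).

```
-0.12,9.76,0.33
-0.10,9.81,0.29
-0.08,9.79,0.31
```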

We design an Arduino sketch that sends the acceleration data (aX, aY, aZ) over the Serial interface, along the lines of the sketch below.
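Here is a minimal version of such a sketch. It assumes the Adafruit MPU6050 library and a 50 Hz sampling rate; the forwarder detects the frequency from the stream itself, so the exact rate is a free choice.

```cpp
#include <Adafruit_MPU6050.h>
#include <Adafruit_Sensor.h>
#include <Wire.h>

// Sample at ~50 Hz; the data forwarder infers the frequency from the stream.
#define FREQUENCY_HZ 50
#define INTERVAL_MS  (1000 / FREQUENCY_HZ)

Adafruit_MPU6050 mpu;
unsigned long last_interval_ms = 0;

void setup() {
  Serial.begin(115200);
  if (!mpu.begin()) {  // default I2C address 0x68
    Serial.println("Failed to find MPU6050 chip");
    while (true) delay(10);
  }
  mpu.setAccelerometerRange(MPU6050_RANGE_8_G);
}

void loop() {
  if (millis() > last_interval_ms + INTERVAL_MS) {
    last_interval_ms = millis();
    sensors_event_t a, g, temp;
    mpu.getEvent(&a, &g, &temp);
    // One comma-separated sample per line, as the forwarder expects.
    Serial.print(a.acceleration.x);
    Serial.print(',');
    Serial.print(a.acceleration.y);
    Serial.print(',');
    Serial.println(a.acceleration.z);
  }
}
```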

Now that that’s set up, we need to acquire data for the various gestures we need to classify. We go to their data acquisition tab to record some gestures. The general rule of thumb is, more data = better. We try to maintain a nice ratio of 80–20 between training and test data.

Creating the Impulse

We designed the impulse with a spectral analysis block, which filters the data and generates spectral features (FFT and PSD), followed by a Keras classification block whose neural network maps those features to the output gestures.

We also add a K-means anomaly detection block so that random movements aren’t incorrectly classified as known gestures.

Deployment

We can deploy this impulse to almost any device. It runs entirely on the edge: no internet connection required, minimal latency, and minimal power.

We export the model as an Arduino library and include it in our sketch as a header file. A few modifications were needed to make it work with the ESP-32, but we got it running!
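For illustration, the inference step ends up looking roughly like this. The header name below is hypothetical (it depends on your Edge Impulse project name), and the buffer is assumed to already hold one window of interleaved aX, aY, aZ samples.

```cpp
// Header name is hypothetical - it depends on your Edge Impulse project name.
#include <wand_gestures_inferencing.h>

// One window of raw samples, interleaved as aX, aY, aZ (filled elsewhere).
static float features[EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE];

void classify_window() {
  // Wrap the raw buffer so the classifier can read from it.
  signal_t signal;
  numpy::signal_from_buffer(features, EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE, &signal);

  ei_impulse_result_t result = { 0 };
  if (run_classifier(&signal, &result, false) != EI_IMPULSE_OK) {
    return;
  }

  // Print the model's confidence for each gesture, plus the anomaly score.
  for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; i++) {
    Serial.print(result.classification[i].label);
    Serial.print(": ");
    Serial.println(result.classification[i].value);
  }
  Serial.print("anomaly: ");
  Serial.println(result.anomaly);
}
```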

There we go: we built a machine-learning model capable of detecting gestures and deployed it to an edge device!

Magical, isn’t it?

Written by Vinay Lanka

Robotics Graduate Student @ UMD