This thesis explores and compares various gesture recognition methods within the context of Human-Computer Interaction (HCI), supported by demonstration applications that implement selected techniques.
The first section introduces key concepts of HCI and gesture recognition, along with core principles for successful implementation.
Gesture recognition methods are categorized in two ways: by the type of gesture they recognize (static versus dynamic) and by the sensing technology they rely on (vision-based versus sensor-based).
Each method is described in detail and evaluated based on its characteristics and use cases.
Additionally, current and emerging applications of gesture recognition are discussed, including usage in mobile devices and augmented reality headsets.
The second part of the thesis focuses on two prototype applications. Both simulate a milkshake-ordering kiosk, a scenario chosen to illustrate the potential of gesture-based interfaces in public settings, especially for reducing germ transmission and enhancing accessibility.
One application uses static gesture recognition to navigate a menu, while the other employs dynamic gesture recognition for the same task.
Custom vector graphics and user interfaces were designed specifically for these applications, with attention to UI and UX best practices. Both were developed using Pygame and OpenCV libraries.
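For illustration, the sketch below shows one minimal way to bridge the two libraries: OpenCV captures webcam frames and Pygame renders them in a window. The camera index, window size, and overall structure are assumptions for this example, not the code used in the thesis.

```python
import cv2
import pygame

# Assumed setup: the default webcam (index 0) and a 640x480 window.
pygame.init()
screen = pygame.display.set_mode((640, 480))
pygame.display.set_caption("Gesture kiosk prototype")
capture = cv2.VideoCapture(0)

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False

    ok, frame = capture.read()
    if not ok:
        continue

    # OpenCV delivers BGR images; Pygame expects RGB.
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frame = cv2.resize(frame, (640, 480))
    # Pygame surfaces are (width, height); transpose the NumPy array to match.
    surface = pygame.surfarray.make_surface(frame.swapaxes(0, 1))
    screen.blit(surface, (0, 0))
    pygame.display.flip()

capture.release()
pygame.quit()
```

In a full application, gesture classification and menu rendering would run inside the same loop, drawing UI elements on top of (or instead of) the camera feed.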
The model achieved an accuracy of nearly 95% during training, dropping to 85% in real-world testing.
The results were satisfactory under controlled conditions: good lighting and a plain background significantly improved recognition accuracy. User navigation was smooth, with minimal gesture misinterpretation. Future improvements could involve enhancing environmental robustness and deploying the system on alternative hardware platforms.
Despite the limited scope, the thesis successfully met all its objectives, demonstrating both the feasibility and potential of gesture recognition in practical HCI scenarios.
Future Improvements and Considerations
Several enhancements could be implemented to increase the accuracy and usability of the gesture recognition systems developed in this thesis.
For static recognition, more advanced algorithms could be employed to better detect and isolate the hand from the background, reducing the impact of varying environmental conditions. Incorporating precise hand-tracking and segmentation techniques, such as identifying individual finger parts and tracking their movement, could further enhance recognition accuracy.
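As one possible approach to such hand isolation (not necessarily the method used in this thesis), the sketch below uses OpenCV HSV skin-colour thresholding followed by contour extraction; the threshold values are illustrative assumptions and would need per-environment calibration.

```python
import cv2
import numpy as np

def segment_hand(frame_bgr):
    """Isolate a hand via HSV skin-colour thresholding (illustrative values)."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Assumed skin-tone range; real systems calibrate this per user and lighting.
    lower = np.array([0, 30, 60], dtype=np.uint8)
    upper = np.array([20, 150, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)
    # Morphological opening and closing remove speckle noise from the mask.
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Keep only the largest contour, assumed here to be the hand.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    return max(contours, key=cv2.contourArea)
```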
To improve usability, the system should dynamically identify the region of interest (ROI), eliminating the need for users to constantly adjust their hand position based on the camera view. This would make the interaction more intuitive and user-friendly.
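One simple way to realise such a dynamic ROI, sketched here under the assumption that a hand contour is already available (for example, from the segmentation sketch above), is to follow the contour's padded bounding box rather than a fixed screen region; the padding value is an arbitrary illustrative choice.

```python
import cv2

def dynamic_roi(frame_bgr, contour, padding=20):
    """Return an ROI that follows the detected hand instead of a fixed box."""
    x, y, w, h = cv2.boundingRect(contour)
    h_img, w_img = frame_bgr.shape[:2]
    # Expand the bounding box by `padding` pixels, clamped to the frame edges.
    x0 = max(x - padding, 0)
    y0 = max(y - padding, 0)
    x1 = min(x + w + padding, w_img)
    y1 = min(y + h + padding, h_img)
    return frame_bgr[y0:y1, x0:x1]
```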
For dynamic gesture recognition, performance could be improved by experimenting with different neural network configurations: tuning hyperparameters, increasing model depth, using alternative activation functions, or applying different training techniques. Expanding the dataset and optimizing post-processing methods could also lead to higher recognition accuracy.
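By way of illustration only, a Keras-style sketch of such a configurable classifier is shown below. The framework choice, input shape (30 frames of 42 hand-landmark values), and layer sizes are assumptions for the example, not the architecture used in the thesis.

```python
from tensorflow import keras

def build_classifier(n_classes, depth=2, units=64, activation="relu"):
    """Hypothetical gesture classifier with tunable depth and activation."""
    # Assumed input: 30 frames x 42 landmark coordinates per gesture sequence.
    model = keras.Sequential([keras.Input(shape=(30, 42)),
                              keras.layers.Flatten()])
    for _ in range(depth):
        model.add(keras.layers.Dense(units, activation=activation))
    model.add(keras.layers.Dense(n_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Example sweep over configurations, as suggested above (5 classes is illustrative):
for depth in (2, 3):
    for act in ("relu", "tanh"):
        model = build_classifier(n_classes=5, depth=depth, activation=act)
```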
Both applications could be tested on alternative hardware platforms, such as a Raspberry Pi with a high-definition camera and external display, to evaluate performance in more compact, cost-effective setups. Additionally, exploring sensor-based recognition methods (e.g., gloves, infrared sensors, or ultrasonic devices) is a promising direction for future development.
It is also worth noting that the most effective gesture recognition systems often combine multiple techniques, blending vision-based and sensor-based methods, to achieve greater robustness and reliability in diverse environments.