This thesis explores and compares various gesture recognition methods within the context of Human-Computer Interaction (HCI), supported by demonstration applications that implement selected techniques.
The first section introduces key concepts of HCI and gesture recognition, along with core principles for successful implementation.
Gesture recognition methods are categorized in two ways: by the type of gesture they recognize (static versus dynamic) and by the sensing technology they rely on (vision-based versus sensor-based).
Each method is described in detail and evaluated based on its characteristics and use cases.
Additionally, current and emerging applications of gesture recognition are discussed, including usage in mobile devices and augmented reality headsets.
The second part of the thesis focuses on two prototype applications. Both simulate a milkshake-ordering kiosk, a scenario chosen to illustrate the potential of gesture-based interfaces in public settings, especially for reducing germ transmission and enhancing accessibility.
One application uses static gesture recognition to navigate a menu, while the other employs dynamic gesture recognition for the same task.
Custom vector graphics and user interfaces were designed specifically for these applications, with attention to UI and UX best practices. Both were developed using Pygame and OpenCV libraries.
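For illustration, the sketch below shows one minimal way to bridge the two libraries: OpenCV captures webcam frames and Pygame renders them in a window. The camera index, window size, and overall structure are assumptions for this example, not the code used in the thesis.

```python
import cv2
import pygame

# Assumed setup: the default webcam (index 0) and a 640x480 window.
pygame.init()
screen = pygame.display.set_mode((640, 480))
pygame.display.set_caption("Gesture kiosk prototype")
capture = cv2.VideoCapture(0)

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False

    ok, frame = capture.read()
    if not ok:
        continue

    # OpenCV delivers BGR images; Pygame expects RGB.
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frame = cv2.resize(frame, (640, 480))
    # Pygame surfaces are (width, height); transpose the NumPy array to match.
    surface = pygame.surfarray.make_surface(frame.swapaxes(0, 1))
    screen.blit(surface, (0, 0))
    pygame.display.flip()

capture.release()
pygame.quit()
```

In a full application, gesture classification and menu rendering would run inside the same loop, drawing UI elements on top of (or instead of) the camera feed.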
The model achieved an accuracy of nearly 95% during training, dropping to 85% in real-world testing.
The results were satisfactory under controlled conditions: good lighting and a plain background significantly improved recognition accuracy. User navigation was smooth, with minimal gesture misinterpretation. Future improvements could involve enhancing environmental robustness and deploying the system on alternative hardware platforms.
Despite the limited scope, the thesis successfully met all its objectives, demonstrating both the feasibility and potential of gesture recognition in practical HCI scenarios.
Future Improvements and Considerations
Several enhancements could be implemented to increase the accuracy and usability of the gesture recognition systems developed in this thesis.
For static recognition, more advanced algorithms could be employed to better detect and isolate the hand from the background, reducing the impact of varying environmental conditions. Incorporating precise hand-tracking and segmentation techniques, such as identifying individual finger parts and tracking their movement, could further enhance recognition accuracy.
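As one possible approach to such hand isolation (not necessarily the method used in this thesis), the sketch below uses OpenCV HSV skin-colour thresholding followed by contour extraction; the threshold values are illustrative assumptions and would need per-environment calibration.

```python
import cv2
import numpy as np

def segment_hand(frame_bgr):
    """Isolate a hand via HSV skin-colour thresholding (illustrative values)."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Assumed skin-tone range; real systems calibrate this per user and lighting.
    lower = np.array([0, 30, 60], dtype=np.uint8)
    upper = np.array([20, 150, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)
    # Morphological opening and closing remove speckle noise from the mask.
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Keep only the largest contour, assumed here to be the hand.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    return max(contours, key=cv2.contourArea)
```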
To improve usability, the system should dynamically identify the region of interest (ROI), eliminating the need for users to constantly adjust their hand position based on the camera view. This would make the interaction more intuitive and user-friendly.
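One simple way to realise such a dynamic ROI, sketched here under the assumption that a hand contour is already available (for example, from the segmentation sketch above), is to follow the contour's padded bounding box rather than a fixed screen region; the padding value is an arbitrary illustrative choice.

```python
import cv2

def dynamic_roi(frame_bgr, contour, padding=20):
    """Return an ROI that follows the detected hand instead of a fixed box."""
    x, y, w, h = cv2.boundingRect(contour)
    h_img, w_img = frame_bgr.shape[:2]
    # Expand the bounding box by `padding` pixels, clamped to the frame edges.
    x0 = max(x - padding, 0)
    y0 = max(y - padding, 0)
    x1 = min(x + w + padding, w_img)
    y1 = min(y + h + padding, h_img)
    return frame_bgr[y0:y1, x0:x1]
```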
For dynamic gesture recognition, performance could be improved by experimenting with different neural network configurations: tuning hyperparameters, increasing model depth, using alternative activation functions, or applying different training techniques. Expanding the dataset and optimizing post-processing methods could also lead to higher recognition accuracy.
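By way of illustration only, a Keras-style sketch of such a configurable classifier is shown below. The framework choice, input shape (30 frames of 42 hand-landmark values), and layer sizes are assumptions for the example, not the architecture used in the thesis.

```python
from tensorflow import keras

def build_classifier(n_classes, depth=2, units=64, activation="relu"):
    """Hypothetical gesture classifier with tunable depth and activation."""
    # Assumed input: 30 frames x 42 landmark coordinates per gesture sequence.
    model = keras.Sequential([keras.Input(shape=(30, 42)),
                              keras.layers.Flatten()])
    for _ in range(depth):
        model.add(keras.layers.Dense(units, activation=activation))
    model.add(keras.layers.Dense(n_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Example sweep over configurations, as suggested above (5 classes is illustrative):
for depth in (2, 3):
    for act in ("relu", "tanh"):
        model = build_classifier(n_classes=5, depth=depth, activation=act)
```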
Both applications could be tested on alternative hardware platforms, such as a Raspberry Pi with a high-definition camera and external display, to evaluate performance in more compact, cost-effective setups. Additionally, exploring sensor-based recognition methods (e.g., gloves, infrared sensors, or ultrasonic devices) is a promising direction for future development.
It is also worth noting that the most effective gesture recognition systems often combine multiple techniques, blending vision-based and sensor-based methods, to achieve greater robustness and reliability in diverse environments.