Abstract:
Hand gestures are a form of spatiotemporal body language conveyed through various aspects of the hand, such as the palm, the hand's shape, and finger positions, with the aim of communicating a particular message to the recipient. Computer vision systems can accept different input modalities, such as depth images, skeletal joint points, or RGB images. Raw depth images tend to have poor contrast in the region of interest, which makes it difficult for a model to learn important information. Recently, in deep learning-based dynamic hand gesture recognition, researchers have attempted to combine different input modalities to improve recognition accuracy. In this paper, we use depth-quantized image features and point clouds to recognize dynamic hand gestures (DHG). We examine the impact of fusing depth-quantized features, processed by Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), with point clouds in LSTM-based multi-modal fusion networks.
Description:
Supervised by
Dr. Hasan Mahmud,
Assistant Professor,
Department of Computer Science and Engineering (CSE),
Islamic University of Technology (IUT)
Board Bazar, Gazipur-1704, Bangladesh.
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022.