CUI Lahore Repository

Human Activity Recognition Based on Multimodal Sensor Data Fusion using Deep Neural Networks


dc.contributor.author Ahmed, Saad
dc.date.accessioned 2022-08-22T07:08:56Z
dc.date.available 2022-08-22T07:08:56Z
dc.date.issued 2022-08-22
dc.identifier.uri http://repository.cuilahore.edu.pk/xmlui/handle/123456789/3428
dc.description.abstract In recent years, Human Activity Recognition (HAR) has been one of the core research areas due to its wide range of applications and has been attracting growing attention in the computer vision field. In HAR, activities are normally represented using numerous sensor modalities, such as vision, inertial, skeleton, and audio. However, these sensors have limitations, such as location barriers, image-related barriers, sensor unreliability, and consumer concerns. Multimodal Human Activity Recognition (MMHAR) addresses these problems by using more than one sensor modality to exploit the complementary information of different domains in the recognition task. Recently, various deep learning-based approaches have been proposed for MMHAR and have achieved state-of-the-art results. Although great efforts have been made in this area using various modalities, little attention has been paid to analysing the dominance and relevance of one modality over another. This research work demonstrates the importance of multimodal sensor fusion using deep neural networks for HAR and identifies which modality is more important for recognizing activities. It proposes a novel deep multimodal fusion network based on a two-stream architecture. One stream uses a Three-dimensional Convolutional Neural Network (3D-CONV) to handle the depth sensor data, while the second stream uses a Two-dimensional Convolutional Neural Network combined with Long Short-Term Memory (2D-CONVLSTM) to handle the inertial sensor data. Each stream captures features from the data generated by its respective depth or inertial sensor. Decision-level fusion combines the results of both streams to produce the final prediction. The proposed model has been evaluated on the publicly available benchmark dataset Berkeley MHAD (Multimodal Human Action Database) and has produced a state-of-the-art accuracy of 99.73%, outperforming previous methods. In a Single-modality Human Activity Recognition (SHAR) setting, the depth sensor data and inertial sensor data are passed to the proposed model streams separately. It is observed that the depth camera sensor achieves a higher accuracy of 98.89% than the inertial sensor's 89.34%. Hence, it is concluded that the depth camera sensor is more important in the recognition task than the inertial sensor data. en_US
dc.publisher Department of Computer Sciences, COMSATS University Lahore. en_US
dc.relation.ispartofseries /FA18-RCS-027;7588
dc.subject Human Activity, Sensor Data Fusion, Neural Networks en_US
dc.title Human Activity Recognition Based on Multimodal Sensor Data Fusion using Deep Neural Networks en_US
dc.type Thesis en_US
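
The abstract above outlines the two-stream fusion approach: a 3D-CONV stream for depth clips, a 2D-CONVLSTM stream for inertial signals, and decision-level fusion of the two predictions. Below is a minimal PyTorch sketch of that idea, not the thesis implementation; the layer sizes, the 18 inertial channels, the input shapes, and the 11-class output (the Berkeley MHAD action count) are illustrative assumptions.

# Minimal two-stream fusion sketch (assumed shapes and layer sizes, not the thesis code).
import torch
import torch.nn as nn

NUM_CLASSES = 11  # Berkeley MHAD defines 11 action classes; adjust for other datasets.

class DepthStream3D(nn.Module):
    """3D-CNN stream for depth clips of shape (B, 1, frames, H, W)."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),  # global pooling keeps the classifier input size fixed
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)

class InertialStreamConvLSTM(nn.Module):
    """2D-CNN + LSTM stream for inertial windows of shape (B, 1, time steps, sensor axes)."""
    def __init__(self, sensor_channels=18, num_classes=NUM_CLASSES):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(5, 3), padding=(2, 1)), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=(5, 3), padding=(2, 1)), nn.ReLU(),
        )
        self.lstm = nn.LSTM(input_size=32 * sensor_channels, hidden_size=64, batch_first=True)
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.conv(x)                                  # (B, 32, T, sensor_channels)
        b, f, t, c = x.shape
        x = x.permute(0, 2, 1, 3).reshape(b, t, f * c)    # one feature vector per time step
        _, (h, _) = self.lstm(x)
        return self.classifier(h[-1])                     # last hidden state -> class logits

def fuse_decisions(depth_logits, inertial_logits):
    """Decision-level fusion: average the per-stream class probabilities, then pick a class."""
    probs = (torch.softmax(depth_logits, dim=1) + torch.softmax(inertial_logits, dim=1)) / 2
    return probs.argmax(dim=1)

if __name__ == "__main__":
    depth_clip = torch.randn(2, 1, 16, 64, 64)    # (batch, channel, frames, height, width)
    inertial_window = torch.randn(2, 1, 100, 18)  # (batch, channel, time steps, sensor axes)
    pred = fuse_decisions(DepthStream3D()(depth_clip),
                          InertialStreamConvLSTM()(inertial_window))
    print(pred.shape)  # torch.Size([2])

Decision-level fusion here simply averages the per-stream softmax scores; the thesis may use a different combination rule, and the single-modality (SHAR) comparison in the abstract corresponds to evaluating each stream's classifier on its own.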



This item appears in the following Collection(s)

  • Thesis - MS / PhD
    This collection contains the MS/PhD theses of the students of the Department of Computer Science.
