Efficient Two-Stream Network for Violence Detection using Separable Convolutional LSTM

dc.contributor.author Islam, Md. Zahidul
dc.contributor.author Rukonuzzaman, Mohammad
dc.contributor.author Ahmed, Raiyan
dc.date.accessioned 2022-04-17T16:53:18Z
dc.date.available 2022-04-17T16:53:18Z
dc.date.issued 2021-03-30
dc.identifier.uri http://hdl.handle.net/123456789/1346
dc.description Supervised by Md. Hasanul Kabir, PhD, Professor, Department of Computer Science and Engineering (CSE), Islamic University of Technology, Board Bazar, Gazipur-1704, Bangladesh. en_US
dc.description.abstract Automatic detection of violence in surveillance footage holds special significance among general activity recognition tasks because of its broad applicability in autonomous security monitoring systems, web video censoring, etc. In this paper, we propose a two-stream deep learning architecture based on Separable Convolutional LSTM (SepConvLSTM) and a pre-trained truncated MobileNet, in which one stream processes the differences between adjacent frames and the other takes background-suppressed frames as input. Fast and efficient input pre-processing techniques highlight moving objects by suppressing non-moving backgrounds and capturing motion between frames. These inputs help produce discriminative features, since violent activities are predominantly characterized by rapid movements. SepConvLSTM is built by replacing the convolution operation at each ConvLSTM gate with a depthwise separable convolution, which yields robust long-range spatio-temporal features with significantly fewer parameters. We experimented with three fusion strategies for merging the output feature maps of the two streams. Three standard public datasets were used to evaluate the proposed methods. On the larger and more difficult RWF-2000 dataset, our model outperforms the previous best accuracy by more than 2%, while matching state-of-the-art results on the smaller datasets. Our experiments demonstrate that the proposed models excel in both computational efficiency and detection accuracy. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh en_US
dc.title Efficient Two-Stream Network for Violence Detection using Separable Convolutional LSTM en_US
dc.type Thesis en_US
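
The abstract's claim that a depthwise separable convolution gives "significantly fewer parameters" than a standard convolution at each ConvLSTM gate can be illustrated with a rough parameter count. The sketch below is not the thesis code: the kernel size, channel counts, and bias placement are hypothetical values chosen for illustration, and peephole terms are ignored.

```python
def conv_params(k, c_in, c_out, bias=True):
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def separable_conv_params(k, c_in, c_out, bias=True):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise + (c_out if bias else 0)


def lstm_cell_params(per_conv, k, c_in, hidden):
    """Four gates, each with an input-to-state and a state-to-state convolution.

    Assumption: the bias is attached to the input-to-state convolution only.
    """
    input_to_state = per_conv(k, c_in, hidden, bias=True)
    state_to_state = per_conv(k, hidden, hidden, bias=False)
    return 4 * (input_to_state + state_to_state)


# Hypothetical sizes, not taken from the thesis.
K, C_IN, HIDDEN = 3, 64, 64

standard = lstm_cell_params(conv_params, K, C_IN, HIDDEN)
separable = lstm_cell_params(separable_conv_params, K, C_IN, HIDDEN)

print(f"ConvLSTM cell:    {standard} parameters")
print(f"SepConvLSTM cell: {separable} parameters")
print(f"Reduction factor: {standard / separable:.1f}x")
```

With these assumed sizes the separable cell needs 37,632 parameters against 295,168 for the standard cell, roughly a 7.8x reduction. The actual saving in the proposed network depends on its real kernel and channel sizes.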

