2022 22nd International Symposium on Cluster, Cloud and Internet Computing (CCGrid)

Abstract

Research on detecting violent behavior in videos has made considerable progress, supporting the monitoring of abnormal videos spread over the network and thereby helping to clean up the online environment. Many current violence detection models perform well in experimental settings, but their generalization ability is insufficient. Because violent behavior occurs in a wide variety of scenarios, detecting it automatically requires a model that generalizes well. In this paper, we design a crowd violence detection model with good generalization ability, based on human contour and dynamic features. Generalization is improved by focusing on the human features in the video and by using human dynamic features obtained from adjacent frames. The model, which we call HD-Net, uses a 3D-CNN framework to extract spatial features from the input feature map and an LSTM to fuse the temporal features. Through multiple comparative experiments, we test the generalization ability of HD-Net on three datasets: RLVS, Hockey, and Violent Flow. Comparison with other classical violence detection models verifies the model's good generalization ability.
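The abstract does not say how the human dynamic features are computed from adjacent frames. As an illustration only, the sketch below assumes the simplest possible reading: an absolute per-pixel difference between consecutive grayscale frames, optionally restricted to a binary human mask. The function names (`frame_difference`, `dynamic_features`, `apply_human_mask`) and the mask itself are hypothetical and are not taken from the paper; HD-Net's actual feature extraction may differ substantially.

```python
# Illustrative sketch only: frame differencing is an ASSUMED stand-in for the
# paper's "dynamic features"; the human mask is likewise hypothetical.

def frame_difference(prev_frame, next_frame):
    """Absolute per-pixel difference between two grayscale frames.

    Each frame is a list of rows, and each row is a list of pixel intensities.
    """
    if len(prev_frame) != len(next_frame):
        raise ValueError("frames must have the same height")
    diff = []
    for row_prev, row_next in zip(prev_frame, next_frame):
        if len(row_prev) != len(row_next):
            raise ValueError("frames must have the same width")
        diff.append([abs(b - a) for a, b in zip(row_prev, row_next)])
    return diff


def dynamic_features(frames):
    """Difference maps for every pair of adjacent frames (len(frames) - 1 maps)."""
    return [frame_difference(a, b) for a, b in zip(frames, frames[1:])]


def apply_human_mask(feature_map, mask):
    """Keep values only where the (hypothetical) binary human mask is 1."""
    return [
        [value if keep else 0 for value, keep in zip(row, mask_row)]
        for row, mask_row in zip(feature_map, mask)
    ]


if __name__ == "__main__":
    # Three tiny 2x2 "frames" with motion appearing in different pixels.
    clip = [
        [[0, 0], [0, 0]],
        [[0, 5], [0, 0]],
        [[0, 5], [3, 0]],
    ]
    for diff_map in dynamic_features(clip):
        print(diff_map)
```

In a pipeline of the kind the abstract outlines, maps like these would be stacked along the time axis and passed to the 3D-CNN, with the LSTM fusing the resulting per-segment features. That wiring is not shown here because the abstract gives no layer-level details.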