Multi frame obscene video detection with ViT : an effective for detecting inappropriate content

Zhu, D; Shan, X; Wu, C; Yung, K; Ip, AWH

doi:10.4018/IJSWIS.359768

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/112909

Title:	Multi frame obscene video detection with ViT : an effective for detecting inappropriate content
Authors:	Zhu, D; Shan, X; Wu, C; Yung, K; Ip, AWH
Issue Date:	Dec-2024
Source:	International journal on semantic web and information systems, Jan.-Dec. 2024, v. 20, no. 1, p. 1-18
Abstract:	With the development of the Internet, people are surrounded daily by various types of information, including obscene videos. The quantity of such videos keeps increasing, making the detection and filtering of this content a crucial step in preventing its spread. However, a significant challenge remains in detecting obscene content in obscure scenarios, such as indecent behavior performed while wearing normal clothing, which can cause significant negative impacts, such as harmful influence on children. To address this issue, this manuscript proposes an innovative multi-frame obscene video detection method based on ViT, aiming to automatically detect and filter obscene content in videos. Extensive experiments conducted on the public NPDI dataset demonstrate that this method outperforms existing state-of-the-art methods, achieving 96.2%. Additionally, it achieves satisfactory classification accuracy on a dataset of obscure obscene videos. This provides a powerful tool for future video censorship and protects minors and the general public.
Keywords:	Computer Vision; Deep Learning; Obscene Video Detection; Pornography classification; Self Attention Mechanism; Video Analysis; Video Classification; Vision Transformer; ViT-based Models
Publisher:	IGI Global
Journal:	International journal on semantic web and information systems
ISSN:	1552-6283
EISSN:	1552-6291
DOI:	10.4018/IJSWIS.359768
Rights:	This article is published as an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the author of the original work and original publication source are properly credited. The following publication Zhu, D., Shan, X., Wu, C., Yung, K., & Ip, A. W. (2024). Multi Frame Obscene Video Detection With ViT: An Effective for Detecting Inappropriate Content. International Journal on Semantic Web and Information Systems (IJSWIS), 20(1), 1-18 is available at https://dx.doi.org/10.4018/IJSWIS.359768.
Appears in Collections:	Journal/Magazine Article

Files in This Item:

File	Description	Size	Format
Zhu_Multi_Frame_Obscene.pdf		1.28 MB	Adobe PDF

Open Access Information

Status	open access
File Version	Version of Record

SCOPUS™ Citations:	1 (as of Dec 19, 2025)