VidHarm: A Clip Based Dataset for Harmful Content Detection
2022 (English). In: 2022 26th International Conference on Pattern Recognition (ICPR), Institute of Electrical and Electronics Engineers (IEEE), 2022, p. 1543-1549. Conference paper, Published paper (Refereed)
Abstract [en]
Automatically identifying harmful content in video is an important task with a wide range of applications. However, there is a lack of professionally labeled open datasets. This work presents VidHarm, an open dataset of 3589 video clips from film trailers annotated by professionals. An analysis of the dataset is performed, revealing, among other things, the relation between clip-level and trailer-level annotations. Audiovisual models are trained on the dataset, and an in-depth study of modeling choices is conducted. The results show that performance is greatly improved by combining the visual and audio modalities, pre-training on large-scale video recognition datasets, and class-balanced sampling. Lastly, biases of the trained models are investigated using discrimination probing. VidHarm is openly available, and further details can be found at the webpage https://vidharm.github.io/
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022. p. 1543-1549
Keywords [en]
Visualization, Annotations, Pattern recognition, Task analysis, Age Rating, Video, Audio
National Category
Computer Vision and Robotics (Autonomous Systems); Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-94540
DOI: 10.1109/ICPR56361.2022.9956148
ISI: 000897707601077
Scopus ID: 2-s2.0-85143613815
OAI: oai:DiVA.org:ltu-94540
DiVA, id: diva2:1716045
Conference
26th International Conference on Pattern Recognition (ICPR 2022), Montreal, QC, Canada, August 21-25, 2022
Funder
Vinnova, 2020-04057; ELLIIT; Knut and Alice Wallenberg Foundation
Note
ISBN for host publication: 978-1-6654-9062-7
2022-12-05; 2022-12-05; 2023-02-27; Bibliographically approved