VidHarm: A Clip Based Dataset for Harmful Content Detection
2022 (English) In: 2022 26th International Conference on Pattern Recognition (ICPR), Institute of Electrical and Electronics Engineers (IEEE), 2022, p. 1543-1549. Conference paper, Published paper (Refereed)
Abstract [en]
Automatically identifying harmful content in video is an important task with a wide range of applications. However, there is a lack of professionally labeled open datasets available. In this work, VidHarm, an open dataset of 3589 video clips from film trailers annotated by professionals, is presented. An analysis of the dataset is performed, revealing among other things the relation between clip and trailer level annotations. Audiovisual models are trained on the dataset and an in-depth study of modeling choices is conducted. The results show that performance is greatly improved by combining the visual and audio modality, pre-training on large-scale video recognition datasets, and class balanced sampling. Lastly, biases of the trained models are investigated using discrimination probing. VidHarm is openly available, and further details are available at the webpage https://vidharm.github.io/
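As an illustration of the class balanced sampling named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the common formulation in which each example is drawn with probability inversely proportional to the frequency of its class. The function names, the toy clip ids, and the labels "A" and "B" are hypothetical and do not come from the dataset.

```python
import random
from collections import Counter


def class_balanced_weights(labels):
    """Return one sampling weight per example, inversely
    proportional to the frequency of that example's class."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def sample_balanced(items, labels, k, seed=0):
    """Draw k items with replacement so that every class is
    equally likely to be picked on each draw."""
    rng = random.Random(seed)
    weights = class_balanced_weights(labels)
    return rng.choices(items, weights=weights, k=k)


# Hypothetical toy data: 10 clip ids with imbalanced, made-up labels.
clips = list(range(10))
labels = ["A"] * 8 + ["B"] * 2

weights = class_balanced_weights(labels)
batch = sample_balanced(clips, labels, k=1000)

# Share of draws that came from the minority class "B" (clip ids 8 and 9).
minority_share = sum(1 for clip in batch if labels[clip] == "B") / len(batch)
```

With uniform sampling the minority class would make up about 20% of the draws in this toy setup; with the weights above, each class carries the same total weight, so the minority share moves toward one half.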
Place, publisher, year, edition, pages Institute of Electrical and Electronics Engineers (IEEE), 2022. p. 1543-1549
Keywords [en]
Visualization, Annotations, Pattern recognition, Task analysis, Age Rating, Video, Audio
National Category
Computer graphics and computer vision; Computer Sciences
Research subject Machine Learning
Identifiers URN: urn:nbn:se:ltu:diva-94540; DOI: 10.1109/ICPR56361.2022.9956148; ISI: 000897707601077; Scopus ID: 2-s2.0-85143613815; OAI: oai:DiVA.org:ltu-94540; DiVA, id: diva2:1716045
Conference 26th International Conference on Pattern Recognition (ICPR 2022), Montreal, QC, Canada, August 21-25, 2022
Funder Vinnova, 2020-04057; ELLIIT; Knut and Alice Wallenberg Foundation
Note ISBN for host publication: 978-1-6654-9062-7
2022-12-05 2022-12-05 2025-02-01 Bibliographically approved