Pretraining Image Encoders without Reconstruction via Feature Prediction Loss
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0003-0100-4030
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0001-5662-825X
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0003-4029-6574
2021 (English). In: Proceedings of ICPR 2020: 25th International Conference on Pattern Recognition, IEEE, 2021, p. 4105-4111. Conference paper, Published paper (Refereed)
Abstract [en]

This work investigates three methods for calculating loss for autoencoder-based pretraining of image encoders: the commonly used reconstruction loss, the more recently introduced deep perceptual similarity loss, and a feature prediction loss proposed here; the latter turns out to be the most efficient choice. Standard autoencoder pretraining for deep learning tasks is done by comparing the input image and the reconstructed image. Recent work shows that predictions based on embeddings generated by image autoencoders can be improved by training with perceptual loss, i.e., by adding a loss network after the decoding step. So far, autoencoders trained with loss networks have implemented an explicit comparison of the original and reconstructed images using the loss network. However, given such a loss network, we show that there is no need for the time-consuming task of decoding the entire image. Instead, we propose to decode the features of the loss network, hence the name “feature prediction loss”. To evaluate this method we perform experiments on three standard publicly available datasets (LunarLander-v2, STL-10, and SVHN) and compare six different procedures for training image encoders (pixel-wise, perceptual similarity, and feature prediction losses; combined with two variations of image and feature encoding/decoding). The embedding-based prediction results show that encoders trained with feature prediction loss are as good as or better than those trained with the other two losses. Additionally, the encoder is significantly faster to train using feature prediction loss than using the other losses. The method implementation used in this work is available online. https://github.com/guspih/Perceptual-Autoencoders
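The three losses compared in the abstract differ only in where the comparison happens. A minimal sketch of the wiring, using small random linear maps as hypothetical stand-ins for the encoder, decoder, feature predictor, and frozen loss network (the real implementation in the linked repository uses convolutional networks; all shapes and names here are illustrative assumptions, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: linear maps instead of real conv nets,
# purely to illustrate how the three losses are wired.
D_IMG, D_EMB, D_FEAT = 64, 8, 16
W_enc = rng.normal(size=(D_EMB, D_IMG)) * 0.1    # image encoder
W_dec = rng.normal(size=(D_IMG, D_EMB)) * 0.1    # image decoder
W_pred = rng.normal(size=(D_FEAT, D_EMB)) * 0.1  # feature predictor
W_loss = rng.normal(size=(D_FEAT, D_IMG)) * 0.1  # frozen loss network

def mse(a, b):
    """Mean squared error between two feature/image vectors."""
    return float(np.mean((a - b) ** 2))

x = rng.normal(size=D_IMG)  # an "image"
z = W_enc @ x               # its embedding

# 1) Reconstruction (pixel-wise) loss: compare images directly.
recon_loss = mse(W_dec @ z, x)

# 2) Deep perceptual similarity loss: decode a full image, then
#    compare loss-network features of reconstruction and original.
perc_loss = mse(W_loss @ (W_dec @ z), W_loss @ x)

# 3) Feature prediction loss (the paper's proposal): skip image
#    decoding and predict the loss-network features directly.
feat_pred_loss = mse(W_pred @ z, W_loss @ x)
```

The speed-up claimed in the abstract comes from case 3: the decode target shrinks from a full image to the loss network's feature vector, so the expensive image decoder is never run during pretraining.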

Place, publisher, year, edition, pages
IEEE, 2021. p. 4105-4111
Series
International Conference on Pattern Recognition
Keywords [en]
Autoencoder, Perceptual, Knowledge Distillation, Image Classification, Object Positioning, Embeddings
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-84180
DOI: 10.1109/ICPR48806.2021.9412239
ISI: 000678409204030
Scopus ID: 2-s2.0-85110513234
OAI: oai:DiVA.org:ltu-84180
DiVA, id: diva2:1553157
Conference
25th International Conference on Pattern Recognition (ICPR 2020), Milan, Italy (Virtual), January 10-15, 2021
Note

ISBN for host publication: 978-1-7281-8808-9

Available from: 2021-05-07. Created: 2021-05-07. Last updated: 2023-09-04. Bibliographically approved.
In thesis
1. Deep Perceptual Loss for Improved Downstream Prediction
2021 (English). Licentiate thesis, comprehensive summary (Other academic)
Place, publisher, year, edition, pages
Luleå: Luleå tekniska universitet, 2021
Series
Licentiate thesis / Luleå University of Technology, ISSN 1402-1757
National Category
Computer graphics and computer vision; Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-86440 (URN)
978-91-7790-904-0 (ISBN)
978-91-7790-905-7 (ISBN)
Presentation
2021-10-12, C305, Luleå Tekniska Universitet, Luleå, 14:00 (English)
Available from: 2021-08-16. Created: 2021-08-13. Last updated: 2025-02-01. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authority records

Grund Pihlgren, Gustav; Sandin, Fredrik; Liwicki, Marcus
