Machine vision for automation of earth-moving machines: Transfer learning experiments with YOLOv3
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering.
2019 (English)
Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits
Student thesis
Abstract [en]

This master's thesis investigates whether a machine vision solution can be created for the automation of earth-moving machines. The work is motivated by the fact that, without some type of vision system, it will not be possible to create a fully autonomous earth-moving machine that can safely be used around humans or other machines. Cameras were used as the primary sensors because they are cheap, provide high resolution, and are the type of sensor that most closely mimics the human visual system.

The purpose of this thesis was to combine existing real-time object detectors with transfer learning and examine whether they can successfully be used to extract information in environments such as construction, forestry and mining. The amount of data needed to successfully train a real-time object detector was also investigated. Furthermore, the thesis examines whether there are situations that are particularly difficult for the chosen object detector, how reliable the object detector is, and finally how service-oriented architecture principles can be used to create deep learning systems.

To investigate the questions formulated above, three data sets were created in which different properties were varied. These properties were light conditions, ground material and dump truck orientation. The data sets were created using a toy dump truck together with a similarly sized wheel loader with a camera mounted on the roof of its cab. The first data set contained only indoor images in which the dump truck was placed in different orientations, but neither the light nor the ground material changed. The second data set contained images where the light source was kept constant, but the dump truck orientation and ground material changed. The last data set contained images where all three properties were varied.

The real-time object detector YOLOv3 was used to examine how such a detector would perform depending on which of the three data sets it was trained on.

Regardless of the data set, it was possible to train a model to perform real-time object detection. On an Nvidia 980 Ti the inference time of the model was around 22 ms, which is fast enough to process video running at 30 fps (a budget of roughly 33 ms per frame). The models trained on all three data sets converged to a training loss of around 0.10.
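
The timing claim above can be checked with simple arithmetic. The following is a minimal sketch, not code from the thesis: the function names are hypothetical, and it assumes the reported 22 ms covers the whole per-frame cost. The abstract does not say whether frame capture, pre-processing or post-processing are included in that figure.

```python
# Illustrative arithmetic only; helper names are hypothetical.

def frame_budget_ms(fps: float) -> float:
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / fps


def max_fps(inference_ms: float) -> float:
    """Highest frame rate sustainable if each frame costs inference_ms."""
    return 1000.0 / inference_ms


INFERENCE_MS = 22.0  # inference time reported in the abstract
VIDEO_FPS = 30.0     # video frame rate reported in the abstract

budget = frame_budget_ms(VIDEO_FPS)   # per-frame budget at 30 fps
headroom = budget - INFERENCE_MS      # time left over per frame
throughput = max_fps(INFERENCE_MS)    # frame rate the detector could sustain

print(f"budget per frame: {budget:.1f} ms")
print(f"headroom:         {headroom:.1f} ms")
print(f"max throughput:   {throughput:.1f} fps")
```

Under these assumptions the detector leaves about 11 ms of slack per frame, which corresponds to a sustainable rate of about 45 fps.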

The model trained on the most varied data set, in which all properties were changed, performed considerably better, reaching a validation loss of 0.164, whereas the model trained on the indoor data set, which contained the least varied data, only reached a validation loss of 0.257. The size of the data set was also a factor in performance, but it was not as important as having varied data. The results also showed that the models trained on all three data sets could reach a mAP score of around 0.98 using transfer learning.
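
To put the reported losses in proportion, the sketch below computes the relative reduction in validation loss and the gap between training and validation loss for the two data sets named in the abstract. It is illustrative only: the helper name is hypothetical, and the comparison assumes both validation losses were measured on comparable validation data, which the abstract does not state.

```python
# Illustrative arithmetic on the figures quoted in the abstract.

def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


TRAIN_LOSS = 0.10        # approximate training loss for all three data sets
INDOOR_VAL_LOSS = 0.257  # least varied data set (indoor only)
VARIED_VAL_LOSS = 0.164  # most varied data set (all properties changed)

reduction = relative_reduction(INDOOR_VAL_LOSS, VARIED_VAL_LOSS)
indoor_gap = INDOOR_VAL_LOSS - TRAIN_LOSS  # train/validation gap, indoor
varied_gap = VARIED_VAL_LOSS - TRAIN_LOSS  # train/validation gap, varied

print(f"validation loss reduction: {reduction:.1%}")
print(f"train/val gap (indoor):    {indoor_gap:.3f}")
print(f"train/val gap (varied):    {varied_gap:.3f}")
```

With these figures the varied data set gives a validation loss roughly 36% lower than the indoor data set, and a smaller gap between training and validation loss.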

Place, publisher, year, edition, pages
2019. p. 54
Keywords [en]
Machine learning, Machine vision, YOLOv3, You only look once, Computer vision, Real time object detection, Object detection
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:ltu:diva-75169
OAI: oai:DiVA.org:ltu-75169
DiVA, id: diva2:1333494
Educational program
Computer Science and Engineering, master's level
Supervisors
Examiners
Available from: 2019-07-08 Created: 2019-07-01 Last updated: 2019-07-08
Bibliographically approved

Open Access in DiVA

fulltext (12575 kB), 0 downloads
File information
File name: FULLTEXT01.pdf
File size: 12575 kB
Checksum (SHA-512):
3b95f951037f25878d9b09871837348471209dca68de15a329f6b4e7a5627b54a2b6dbd01d4d73b4cd6a3b555fa1072f4e52775ffe2cda58984cfd30ae011d85
Type: fulltext
Mimetype: application/pdf
