Machine Vision for Construction Equipment by Transfer Learning with Scale Models
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0002-4716-9765
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0001-5408-0008
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0001-5662-825X
2020 (English). In: 2020 International Joint Conference on Neural Networks (IJCNN), IEEE, 2020, article id 21108. Conference paper, Published paper (Refereed)
Abstract [en]

Machine vision is required by autonomous heavy construction equipment to navigate and interact with the environment. Wheel loaders need the ability to identify different objects and other equipment to perform the task of automatically loading and dumping material on dump trucks, which can be achieved using deep neural networks. Training such networks from scratch requires the iterative collection of potentially large amounts of video data, which is challenging at construction sites because of the complexity of safely operating heavy equipment in realistic environments. Transfer learning, in which pretrained neural networks are retrained for use at construction sites, is thus attractive, especially if data can be acquired without full-scale experiments. We investigate the possibility of using scale-model data for training and validating two different pretrained networks and use real-world test data to examine their generalization capability. A dataset containing 268 images of a 1:16 scale model of a Volvo A60H dump truck is provided, as well as 64 test images of a full-size Volvo A25G dump truck. The code and dataset are publicly available. The networks, both pretrained on the MS-COCO dataset, were fine-tuned to the created dataset, and the results indicate that both networks can learn the features of the scale-model dump truck (validation mAP of 0.82 for YOLOv3 and 0.95 for RetinaNet). Both networks can transfer these learned features to detect objects on a full-size dump truck with no additional training (test mAP of 0.70 for YOLOv3 and 0.79 for RetinaNet).
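To make the gap between the scale-model validation scores and the full-size test scores easier to compare, the reported mAP values can be turned into a relative drop. The sketch below is illustrative arithmetic only: the four mAP figures come from the abstract, while the function name `relative_drop` and the choice of a relative (rather than absolute) measure are assumptions made for this example, not something the paper specifies.

```python
# Reported mAP values from the abstract: (validation on scale model, test on full-size truck).
REPORTED_MAP = {
    "YOLOv3": (0.82, 0.70),
    "RetinaNet": (0.95, 0.79),
}


def relative_drop(validation_map: float, test_map: float) -> float:
    """Fraction of the validation mAP that is lost on the test set (hypothetical helper)."""
    return (validation_map - test_map) / validation_map


drops = {name: relative_drop(val, test) for name, (val, test) in REPORTED_MAP.items()}

for name, drop in drops.items():
    print(f"{name}: relative mAP drop of {drop:.1%}")
```

Under this measure RetinaNet keeps the higher absolute score on the full-size truck, while YOLOv3 loses a slightly smaller share of its validation score.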

Place, publisher, year, edition, pages
IEEE, 2020. article id 21108
Series
International Joint Conference on Neural Networks (IJCNN), ISSN 2161-4393, E-ISSN 2161-4407
Keywords [en]
construction equipment, automation, computer vision, deep learning, machine learning
National Category
Embedded Systems
Research subject
Cyber-Physical Systems
Identifiers
URN: urn:nbn:se:ltu:diva-81008; DOI: 10.1109/IJCNN48605.2020.9207577; ISI: 000626021407089; Scopus ID: 2-s2.0-85093872275; OAI: oai:DiVA.org:ltu-81008; DiVA, id: diva2:1472598
Conference
2020 International Joint Conference on Neural Networks (IJCNN), 19-24 July, 2020, Glasgow, United Kingdom
Note

ISBN for host publication: 978-1-7281-6926-2, 978-1-7281-6927-9

Available from: 2020-10-02. Created: 2020-10-02. Last updated: 2023-11-17. Bibliographically approved
In thesis
1. Automation of Navigation During the Short-loading Cycle Using Machine Vision
2022 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Earth-moving machines are used in a wide range of industries, such as the construction industry, to perform tasks related to earthworks. Currently, the vast majority of earth-moving machines are human-operated, with expert operators performing these industry-vital tasks. One such task is the short-loading cycle, a repetitive work cycle performed in high quantities within the construction industry. This work cycle aims to use a wheel-loader to move material from a pile or from the ground to the tipping body of a dump truck. Not only is this task repetitive and performed in high quantities, but it is also representative of the knowledge required to perform a wide set of other work cycles, making it a good candidate for automation.

Skilled operators use their senses, such as touch, hearing and sight, to perform the required tasks. One of the most important senses during normal operations is sight, as it is used to locate dynamic objects and detect dangers. Thus, to replace the driver of an earth-moving machine with an autonomous system, the system requires similar vision capabilities. Machine vision is a field in which the goal is to use some type of vision sensor, such as a camera, to extract relevant high-level information from images or video streams. This thesis examines how machine vision can be used within the short-loading cycle to facilitate performing the work cycle autonomously.

The main findings in this thesis are threefold. Firstly, two knowledge gaps are identified in the domain of automation during the short-loading cycle. These relate to the loading of heterogeneous material and navigation during loading and unloading. Secondly, we show that it is possible to train a deep learning model to detect the cab, wheels and tipping body of a scale-model dump truck while mimicking the approach towards the load carrier during the short-loading cycle. This model can then be applied to real vehicles to detect the same objects, with no additional training. Lastly, we show that linear interpolation can be used to perform semi-automatic labelling of camera-based video data of the approach of a wheel-loader towards a dump truck during the short-loading cycle. This technique decreases the annotation workload by around 95% while retaining comparable performance.
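The idea of semi-automatic labelling by linear interpolation can be sketched as follows: a human annotates bounding boxes only on sparse keyframes, and the boxes for the frames in between are interpolated. This is a minimal illustration, not the thesis's implementation. The box format `(x_min, y_min, x_max, y_max)`, the function name `interpolate_boxes`, the example coordinates, and the keyframe spacing of 20 frames are all assumptions chosen for the example.

```python
from typing import Dict, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def interpolate_boxes(keyframes: Dict[int, Box]) -> Dict[int, Box]:
    """Linearly interpolate boxes for all frames between consecutive keyframes."""
    frames = sorted(keyframes)
    labels: Dict[int, Box] = dict(keyframes)  # manually annotated frames are kept as-is
    for start, end in zip(frames, frames[1:]):
        box_a, box_b = keyframes[start], keyframes[end]
        for frame in range(start + 1, end):
            t = (frame - start) / (end - start)  # fraction of the way from start to end
            labels[frame] = tuple(a + t * (b - a) for a, b in zip(box_a, box_b))
    return labels


# Hypothetical annotations: one object labelled by hand on frames 0 and 20 only.
manual = {0: (100.0, 200.0, 300.0, 400.0), 20: (140.0, 180.0, 380.0, 420.0)}
labels = interpolate_boxes(manual)
print(labels[10])
```

With one manual annotation every 20 frames, as assumed here, roughly 1 frame in 20 of a long sequence needs human input, which is of the same order as the workload reduction reported above. The approach presumes that objects move smoothly between keyframes, so the keyframe spacing would need to be tuned to the motion in the video.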

The future direction of this work includes using techniques such as reinforcement learning to teach a model to perform the navigation required during the short-loading cycle. Future work also includes using world models to learn representations of underlying structures in the environment, using open-ended learning to transfer the learned knowledge to adjacent work cycles, and using machine vision to find the point of attack for scooping heterogeneous material.

Place, publisher, year, edition, pages
Luleå University of Technology, 2022
Series
Licentiate thesis / Luleå University of Technology, ISSN 1402-1757
Keywords
Automation, Short-loading cycle, Construction equipment, Deep Learning, Computer Vision
National Category
Computer Sciences
Research subject
Cyber-Physical Systems
Identifiers
urn:nbn:se:ltu:diva-88009 (URN); 978-91-7790-987-3 (ISBN); 978-91-7790-988-0 (ISBN)
Presentation
2022-02-09, E632, Laboratorievägen 14, Luleå, 14:00 (English)
Available from: 2021-11-25. Created: 2021-11-24. Last updated: 2023-09-04. Bibliographically approved
2. Towards Deep-learning-based Autonomous Navigation in the Short-loading Cycle
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Earth-moving machines, such as wheel loaders, are a type of heavy-duty machinery used within the construction industry to perform vital tasks, such as digging, transporting, and mining. One of these tasks is the short-loading cycle, where an operator manoeuvres the wheel loader to move material from a pile to the tipping body of a dump truck, through navigation, scooping, and dumping. The short-loading cycle is a repetitive task performed in high quantities, often as part of a larger refinement process, making it interesting for automation.

The main objective of this thesis work is to investigate challenges facing the automation of the short-loading cycle, focusing in particular on subtasks that can be efficiently addressed with deep learning methods. A secondary objective is to examine how alternative development paths, such as scale models, or simulations, can be used to enable data-driven automation of the short-loading cycle, as directly experimenting on real vehicles has a high associated cost when large numbers of timesteps are needed to gather enough data.

To investigate the two objectives, the literature is systematically reviewed to identify research gaps, challenges, and the usage of deep learning techniques. Secondly, a set of deep learning techniques is investigated to address perception and actuation problems identified as challenging and important for the automation of the short-loading cycle.

The investigation of deep learning techniques involves training and validating a real-time object detector neural network to identify key components (wheels, tipping body, and cab) on a scale-model dump truck while testing on a real vehicle. This resulted in a localisation and classification degradation of only 14% between the scale model and the real dump truck, with no additional training. In addition, an examination of how to minimize the human annotation workload found that it is possible to decrease the workload by 95% while still retaining similar detection performance by leveraging linear interpolation.

Lastly, this thesis presents an investigation regarding the usage of reinforcement learning for navigation during the short-loading cycle. The results indicate that training the agent in simulation is currently required, as the agent needs on the order of millions of timesteps to obtain the maximum reward and become capable of performing the task. The results suggest that the trained agent is capable of bridging the gap between simulation and reality to complete a simplified version of the navigation task during the short-loading cycle.

The experiments presented in this thesis provide proof of concept that indicates deep learning techniques can aid in the realisation of an autonomous solution. Moreover, the results show that development paths allowing for experiments providing large numbers of timesteps can facilitate the practical use of such techniques.

Place, publisher, year, edition, pages
Luleå: Luleå University of Technology, 2023
Series
Doctoral thesis / Luleå University of Technology 1 jan 1997 → …, ISSN 1402-1544
National Category
Computer Sciences; Robotics
Research subject
Cyber-Physical Systems
Identifiers
urn:nbn:se:ltu:diva-102486 (URN); 978-91-8048-442-8 (ISBN); 978-91-8048-443-5 (ISBN)
Public defence
2024-01-30, E632, Luleå tekniska universitet, Luleå, 09:00 (English)
Available from: 2023-11-17. Created: 2023-11-17. Last updated: 2023-12-20. Bibliographically approved

Authority records

Borngrund, Carl; Bodin, Ulf; Sandin, Fredrik
