Semi-Automatic Video Frame Annotation for Construction Equipment Automation Using Scale-Models
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0002-4716-9765
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering.
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0001-5408-0008
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0001-5662-825X
2021 (English). In: IECON 2021 – 47th Annual Conference of the IEEE Industrial Electronics Society, IEEE, 2021. Conference paper, Published paper (Refereed)
Abstract [en]

Data collection and annotation are time-consuming and costly, yet necessary for machine vision. Automation of construction equipment relies on seeing and detecting different objects in the vehicle’s surroundings. Construction equipment is commonly used to perform frequent, repetitive tasks, which makes these tasks interesting to automate. An example of such a task is the short-loading cycle, where material is moved from a pile into the tipping body of a dump truck for transport. To complete this task, the wheel loader must be able to locate the tipping body of the dump truck. The machine vision system also allows the vehicle to detect unforeseen dangers such as other vehicles and, more importantly, human workers. In this work, we investigate the viability of semi-automatic annotation of video data using linear interpolation. The data is collected using scale models mimicking a wheel loader’s approach towards a dump truck during the short-loading cycle. To measure the viability of this type of solution, the annotation workload is compared to the accuracy of the model, YOLOv3. The results indicate that it is possible to maintain performance while decreasing the annotation workload by about 95%. This is an interesting result for this application domain, as safety is critical and retaining the vision system’s performance is more important than decreasing the annotation workload. That performance appears to be retained despite a large decrease in workload is an encouraging sign.
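The abstract does not spell out how the interpolation is applied, so the following is only a minimal sketch of one plausible reading: a human annotates bounding boxes on a sparse set of keyframes, and the boxes for the frames in between are filled in by linearly interpolating each corner coordinate. The `Box` layout, the function names and the example keyframe spacing are all assumptions made for illustration, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def interpolate_boxes(keyframes: Dict[int, Box]) -> Dict[int, Box]:
    """Fill in boxes for all frames between manually annotated keyframes.

    `keyframes` maps a frame index to the box a human drew on that frame.
    Frames outside the first and last keyframe are not extrapolated.
    """
    if not keyframes:
        return {}
    frames = sorted(keyframes)
    result = {frames[0]: keyframes[frames[0]]}
    for start, end in zip(frames, frames[1:]):
        a, b = keyframes[start], keyframes[end]
        for frame in range(start + 1, end):
            t = (frame - start) / (end - start)
            result[frame] = Box(
                lerp(a.x_min, b.x_min, t),
                lerp(a.y_min, b.y_min, t),
                lerp(a.x_max, b.x_max, t),
                lerp(a.y_max, b.y_max, t),
            )
        # Keep the human annotation untouched on the keyframe itself.
        result[end] = b
    return result


# Two manually labelled keyframes, 20 frames apart (assumed spacing).
keyframes = {0: Box(0, 0, 10, 10), 20: Box(20, 40, 30, 50)}
boxes = interpolate_boxes(keyframes)
print(boxes[10])  # Box(x_min=10.0, y_min=20.0, x_max=20.0, y_max=30.0)
```

Linear interpolation assumes the object moves and scales roughly uniformly between keyframes; denser keyframes would be needed wherever the apparent motion changes quickly.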

Place, publisher, year, edition, pages
IEEE, 2021.
Keywords [en]
Autonomous construction equipment, semi-automatic annotation, video-stream data, object detection, linear interpolation
National Category
Computer graphics and computer vision
Research subject
Cyber-Physical Systems; Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-87866
DOI: 10.1109/iecon48115.2021.9589255
ISI: 000767230601030
Scopus ID: 2-s2.0-85119470296
OAI: oai:DiVA.org:ltu-87866
DiVA, id: diva2:1610804
Conference
47th Annual Conference of the IEEE Industrial Electronics Society (IECON 2021), 13-16 Oct. 2021, Toronto, ON, Canada
Note

ISBN for host publication: 978-1-6654-3554-3, 978-1-6654-0256-9

Available from: 2021-11-11 Created: 2021-11-11 Last updated: 2025-02-07 Bibliographically approved
In thesis
1. Automation of Navigation During the Short-loading Cycle Using Machine Vision
2022 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Earth-moving machines are used in a wide range of industries, such as the construction industry, to perform tasks related to earthworks. Currently, the vast majority of earth-moving machines are human-operated, with expert operators performing these industry-vital tasks. One such task is the short-loading cycle, a repetitive work cycle performed in high quantities within the construction industry. This work cycle aims to use a wheel loader to move material from a pile or from the ground to the tipping body of a dump truck. Not only is this task repetitive and performed in high quantities, but it is also representative of the knowledge required to perform a wide set of other work cycles, making it a good candidate for automation.

Skilled operators use sensory input, such as touch, sound and sight, to perform the required tasks. One of the most important senses leveraged during normal operations is sight, as it is used to locate dynamic objects and detect dangers. Thus, to be able to replace the driver of an earth-moving machine with an autonomous system, the system requires similar vision capabilities. Machine vision is a field where the goal is to use some type of vision sensor, such as cameras, to extract relevant high-level information from images or video streams. This thesis aims to examine how machine vision can be used within the short-loading cycle to facilitate performing said work cycle autonomously.

The main findings in this thesis are threefold. Firstly, two knowledge gaps are identified in the domain of automation during the short-loading cycle. These relate to the loading of heterogeneous material and navigation during loading and unloading. Secondly, we show that it is possible to train a deep learning model to detect the cab, wheels and tipping body of a scale-model dump truck while mimicking the approach towards the load carrier during the short-loading cycle. This model can then be applied to real vehicles to detect the same objects, with no additional training. Lastly, we show that linear interpolation can be used to perform semi-automatic labelling of camera-based video data of the approach of a wheel loader towards a dump truck during the short-loading cycle. This technique decreases the annotation workload by around 95% while retaining comparable performance.

The future direction of this work includes using techniques such as reinforcement learning to teach a model to perform the navigation required during the short-loading cycle. Future work also includes using world models to learn representations of underlying structures in the environment, open-ended learning to transfer the learned knowledge to adjacent work cycles, and using machine vision to find the point of attack for scooping heterogeneous material.

Place, publisher, year, edition, pages
Luleå University of Technology, 2022
Series
Licentiate thesis / Luleå University of Technology, ISSN 1402-1757
Keywords
Automation, Short-loading cycle, Construction equipment, Deep Learning, Computer Vision
National Category
Computer Sciences
Research subject
Cyber-Physical Systems
Identifiers
urn:nbn:se:ltu:diva-88009 (URN)
978-91-7790-987-3 (ISBN)
978-91-7790-988-0 (ISBN)
Presentation
2022-02-09, E632, Laboratorievägen 14, Luleå, 14:00 (English)
Available from: 2021-11-25 Created: 2021-11-24 Last updated: 2023-09-04 Bibliographically approved
2. Towards Deep-learning-based Autonomous Navigation in the Short-loading Cycle
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Earth-moving machines, such as wheel loaders, are a type of heavy-duty machinery used within the construction industry to perform vital tasks such as digging, transporting, and mining. One of these tasks is the short-loading cycle, where an operator manoeuvres the wheel loader to move material from a pile to the tipping body of a dump truck, through navigation, scooping, and dumping. The short-loading cycle is a repetitive task performed in high quantities, often as part of a larger refinement process, making it interesting for automation.

The main objective of this thesis work is to investigate challenges facing the automation of the short-loading cycle, focusing in particular on subtasks that can be efficiently addressed with deep learning methods. A secondary objective is to examine how alternative development paths, such as scale models, or simulations, can be used to enable data-driven automation of the short-loading cycle, as directly experimenting on real vehicles has a high associated cost when large numbers of timesteps are needed to gather enough data.

To investigate the two objectives, the literature is systematically reviewed to identify research gaps, challenges, and the usage of deep learning techniques. Secondly, a set of deep learning techniques is investigated to address perception and actuation problems identified as challenging and important for the automation of the short-loading cycle.

The investigation of deep learning techniques involves training and validating a real-time object detector neural network to identify key components (wheels, tipping body, and cab) on a scale-model dump truck while testing on a real vehicle. This resulted in a localisation and classification degradation of only 14% between the scale model and the real dump truck, with no additional training. In addition, an examination of how to minimize the human annotation workload found that it is possible to decrease the workload by 95% while still retaining similar detection performance by leveraging linear interpolation.
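To give a feel for what a 95% reduction could mean in practice, the sketch below relates the spacing between manually annotated keyframes to the share of frames a human still has to label. It rests on two assumptions that are not stated in the abstract: that workload is proportional to the number of hand-labelled frames, and that keyframes are evenly spaced with the final frame always labelled. The function names and the example clip length are hypothetical.

```python
def manual_fraction(total_frames: int, keyframe_interval: int) -> float:
    """Share of frames labelled by hand when every `keyframe_interval`-th
    frame, plus the final frame, is a manually annotated keyframe."""
    if total_frames <= 0 or keyframe_interval <= 0:
        raise ValueError("total_frames and keyframe_interval must be positive")
    keyframes = set(range(0, total_frames, keyframe_interval))
    keyframes.add(total_frames - 1)  # close the last interpolation segment
    return len(keyframes) / total_frames


def workload_reduction(total_frames: int, keyframe_interval: int) -> float:
    """Fraction of manual labelling avoided relative to labelling every frame."""
    return 1.0 - manual_fraction(total_frames, keyframe_interval)


# Assumed example: a 1000-frame clip with a keyframe every 20 frames.
print(f"{workload_reduction(1000, 20):.1%}")  # 94.9%
```

Under these assumptions, labelling roughly one frame in twenty corresponds to a reduction of about 95%; the spacing actually used in the thesis is not given here.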

Lastly, this thesis presents an investigation regarding the usage of reinforcement learning for navigation during the short-loading cycle. The results indicate that training the agent in simulation is currently required, as the agent needs on the order of millions of timesteps to reach the maximum reward and become capable of performing the task. The results suggest that the trained agent is capable of bridging the gap between simulation and reality to complete a simplified version of the navigation task during the short-loading cycle.

The experiments presented in this thesis provide proof of concept that indicates deep learning techniques can aid in the realisation of an autonomous solution. Moreover, the results show that development paths allowing for experiments providing large numbers of timesteps can facilitate the practical use of such techniques.

Place, publisher, year, edition, pages
Luleå: Luleå University of Technology, 2023
Series
Doctoral thesis / Luleå University of Technology 1 jan 1997 → …, ISSN 1402-1544
National Category
Computer Sciences; Robotics and automation
Research subject
Cyber-Physical Systems
Identifiers
urn:nbn:se:ltu:diva-102486 (URN)
978-91-8048-442-8 (ISBN)
978-91-8048-443-5 (ISBN)
Public defence
2024-01-30, E632, Luleå tekniska universitet, Luleå, 09:00 (English)
Available from: 2023-11-17 Created: 2023-11-17 Last updated: 2025-02-05 Bibliographically approved

By author/editor
Borngrund, Carl; Hammarkvist, Tom; Bodin, Ulf; Sandin, Fredrik