1 - 8 of 8
  • 1.
    Hashmi, Khurram Azeem
    et al.
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Pagani, Alain
    German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Stricker, Didier
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Afzal, Muhammad Zeshan
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Exploiting Concepts of Instance Segmentation to Boost Detection in Challenging Environments (2022). In: Sensors, E-ISSN 1424-8220, Vol. 22, no. 10, article id 3703. Article in journal (Refereed)
    Abstract [en]

    In recent years, due to advancements in machine learning, object detection has become a mainstream task in the computer vision domain. The first phase of object detection is to find the regions where objects can exist. With the improvements in deep learning, traditional approaches, such as sliding windows and manual feature selection techniques, have been replaced with deep learning techniques. However, like most vision tasks, object detection suffers in low light, adverse weather, and crowded scenes; such settings are termed challenging environments. This paper exploits pixel-level information to improve detection in these situations. To this end, we build on the recently proposed hybrid task cascade network, in which detection and segmentation heads work collaboratively at different cascade levels. We evaluate the proposed method on three complex datasets, ExDark, CURE-TSD, and RESIDE, and achieve a mAP of 0.71, 0.52, and 0.43, respectively. Our experimental results assert the efficacy of the proposed approach.

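    The hybrid task cascade idea described above, in which box and mask heads cooperate across successive refinement stages, can be sketched roughly as follows. This is a toy PyTorch illustration only, not the authors' implementation (which builds on the published Hybrid Task Cascade detector); the box refinement and RoI re-pooling between stages are omitted, and all layer sizes, class counts, and names are invented for the example.

        import torch
        import torch.nn as nn

        class CascadeStage(nn.Module):
            """One cascade stage: a box head and a mask head share the same pooled RoI features."""
            def __init__(self, feat_dim=256, num_classes=12):
                super().__init__()
                self.box_head = nn.Sequential(
                    nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                    nn.Linear(feat_dim, 4 + num_classes))          # box deltas + class logits
                self.mask_head = nn.Sequential(
                    nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(feat_dim, num_classes, 1))           # per-class mask logits

            def forward(self, roi_vec, roi_map):
                return self.box_head(roi_vec), self.mask_head(roi_map)

        class TinyCascade(nn.Module):
            """Several stages applied in sequence; a real cascade would also refine and
            re-pool the boxes between stages."""
            def __init__(self, num_stages=3):
                super().__init__()
                self.stages = nn.ModuleList([CascadeStage() for _ in range(num_stages)])

            def forward(self, roi_vec, roi_map):
                return [stage(roi_vec, roi_map) for stage in self.stages]

        # toy usage: 8 RoIs with 256-d pooled vectors and 14x14 pooled feature maps
        outs = TinyCascade()(torch.randn(8, 256), torch.randn(8, 256, 14, 14))
        print(len(outs), outs[0][0].shape, outs[0][1].shape)
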
  • 2.
    Kallempudi, Goutham
    et al.
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany.
    Hashmi, Khurram Azeem
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Pagani, Alain
    German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Stricker, Didier
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Afzal, Muhammad Zeshan
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Toward Semi-Supervised Graphical Object Detection in Document Images (2022). In: Future Internet, E-ISSN 1999-5903, Vol. 14, no. 6, article id 176. Article in journal (Refereed)
    Abstract [en]

    Graphical page object detection classifies and localizes objects such as tables and figures in a document. As deep learning techniques for object detection become increasingly successful, many supervised deep neural network-based methods have been introduced to recognize graphical objects in documents. However, these models require a substantial amount of labeled data for training. This paper presents an end-to-end semi-supervised framework for graphical object detection in scanned document images to address this limitation. Our method is based on the recently proposed Soft Teacher mechanism and examines the effect of training with small fractions of labeled data on the classification and localization of graphical objects. On both the PubLayNet and the IIIT-AR-13K datasets, the proposed approach outperforms the supervised models by a significant margin at all labeling ratios (1%, 5%, and 10%). Furthermore, the 10% PubLayNet Soft Teacher model improves the average precision of Table, Figure, and List by +5.4, +1.2, and +3.2 points, respectively, with a total mAP similar to the Faster R-CNN baseline. Moreover, our model trained on 10% of the IIIT-AR-13K labeled data beats the previous fully supervised method by +4.5 points.

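    A central ingredient of such Soft Teacher-style semi-supervision is that the teacher's detections on unlabeled images are only kept as pseudo-ground-truth when their confidence is high. The snippet below is a generic sketch of that filtering step, assuming a score threshold of 0.9; the actual threshold, weighting, and box-jittering used in the paper may differ.

        import torch

        def filter_pseudo_boxes(boxes, scores, score_thresh=0.9):
            """Keep only high-confidence teacher predictions as pseudo-labels for the student.

            boxes:  (N, 4) teacher-predicted boxes (x1, y1, x2, y2)
            scores: (N,)   teacher confidence per box
            """
            keep = scores >= score_thresh
            return boxes[keep], scores[keep]

        # toy usage: three teacher predictions, two survive the threshold
        boxes = torch.tensor([[10., 10., 50., 60.], [5., 5., 20., 20.], [30., 40., 90., 120.]])
        scores = torch.tensor([0.97, 0.42, 0.93])
        pseudo_boxes, _ = filter_pseudo_boxes(boxes, scores)
        print(pseudo_boxes.shape)  # torch.Size([2, 4])
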
  • 3.
    Kanchi, Shrinidhi
    et al.
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany.
    Pagani, Alain
    German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Stricker, Didier
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Afzal, Muhammad Zeshan
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany; Mindgarage, Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany.
    EmmDocClassifier: Efficient Multimodal Document Image Classifier for Scarce Data (2022). In: Applied Sciences, E-ISSN 2076-3417, Vol. 12, no. 3, article id 1457. Article in journal (Refereed)
    Abstract [en]

    Document classification is one of the most critical steps in the document analysis pipeline. There are two types of approaches for document classification, known as image-based and multimodal approaches. Image-based document classification approaches rely solely on the inherent visual cues of the document images. In contrast, the multimodal approach co-learns the visual and textual features, and it has proved to be more effective. Nonetheless, these approaches require a huge amount of data. This paper presents a novel approach for document classification that works with a small amount of data and outperforms other approaches. The proposed approach incorporates a hierarchical attention network (HAN) for the textual stream and EfficientNet-B0 for the image stream. The hierarchical attention network in the textual stream uses dynamic word embeddings through a fine-tuned BERT and incorporates both word-level and sentence-level features. While earlier approaches rely on training on a large corpus (RVL-CDIP), we show that our approach works with a small amount of data (Tobacco-3482). To this end, we trained the network on Tobacco-3482 from scratch and outperform the state-of-the-art with an accuracy of 90.3%, which corresponds to a relative error reduction of 7.9%.

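    The two-stream design above (a textual stream and an image stream whose features are combined for classification) can be illustrated with the late-fusion sketch below. For brevity the HAN/BERT textual stream is replaced by a precomputed 768-d document embedding, and the fusion layers, dropout rate, and class count are assumptions rather than the paper's configuration; only the use of EfficientNet-B0 as the image encoder follows the abstract.

        import torch
        import torch.nn as nn
        from torchvision.models import efficientnet_b0

        class TwoStreamDocClassifier(nn.Module):
            """Late-fusion document classifier: EfficientNet-B0 image features are
            concatenated with a precomputed text embedding and classified jointly."""
            def __init__(self, text_dim=768, num_classes=10):
                super().__init__()
                self.image_encoder = efficientnet_b0()            # randomly initialised here
                self.image_encoder.classifier = nn.Identity()     # expose the 1280-d pooled features
                self.fusion = nn.Sequential(
                    nn.Linear(1280 + text_dim, 512), nn.ReLU(), nn.Dropout(0.3),
                    nn.Linear(512, num_classes))

            def forward(self, image, text_embedding):
                img_feat = self.image_encoder(image)              # (B, 1280)
                return self.fusion(torch.cat([img_feat, text_embedding], dim=1))

        # toy usage: two document images plus their 768-d text embeddings
        model = TwoStreamDocClassifier()
        logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 768))
        print(logits.shape)  # torch.Size([2, 10])
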
  • 4.
    Khan, Muhammad Ahmed Ullah
    et al.
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Nazir, Danish
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Pagani, Alain
    German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Stricker, Didier
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Afzal, Muhammad Zeshan
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    A Comprehensive Survey of Depth Completion Approaches (2022). In: Sensors, E-ISSN 1424-8220, Vol. 22, no. 18, article id 6969. Article, review/survey (Refereed)
    Abstract [en]

    Depth maps produced by LiDAR-based approaches are sparse. Even high-end LiDAR sensors produce highly sparse depth maps, which are also noisy around object boundaries. Depth completion is the task of generating a dense depth map from a sparse one. While earlier approaches focused on completing the sparse depth maps directly, modern techniques either use RGB images as guidance or rely on affinity matrices for depth completion. Based on these approaches, we divide the literature into two major categories: unguided methods and image-guided methods. The latter is further subdivided into multi-branch and spatial propagation networks, and the multi-branch networks have an additional sub-category, image-guided filtering. In this paper, we present the first comprehensive survey of depth completion methods. We propose a novel taxonomy of depth completion approaches, review in detail the state-of-the-art techniques within each category for depth completion of LiDAR data, and provide quantitative results for the approaches on the KITTI and NYUv2 depth completion benchmark datasets.

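    To make the taxonomy concrete, the image-guided, multi-branch idea can be reduced to the sketch below: one branch encodes the RGB image, another encodes the sparse depth map, and a decoder regresses a dense depth map from the fused features. Layer sizes and depths are invented for illustration; the surveyed methods use far deeper networks and additional refinement such as spatial propagation.

        import torch
        import torch.nn as nn

        class GuidedDepthCompletion(nn.Module):
            """Minimal image-guided, multi-branch depth completion: encode RGB and sparse
            depth separately, fuse the feature maps, and regress a dense depth map."""
            def __init__(self, feat=32):
                super().__init__()
                self.rgb_branch = nn.Sequential(
                    nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU())
                self.depth_branch = nn.Sequential(
                    nn.Conv2d(1, feat, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU())
                self.decoder = nn.Sequential(
                    nn.Conv2d(2 * feat, feat, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(feat, 1, 1))

            def forward(self, rgb, sparse_depth):
                fused = torch.cat([self.rgb_branch(rgb), self.depth_branch(sparse_depth)], dim=1)
                return self.decoder(fused)     # dense depth at the input resolution

        # toy usage on a 64x64 crop
        dense = GuidedDepthCompletion()(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 64, 64))
        print(dense.shape)  # torch.Size([1, 1, 64, 64])
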
  • 5.
    Khan, Muhammad Saif Ullah
    et al.
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Pagani, Alain
    German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Stricker, Didier
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Afzal, Muhammad Zeshan
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany.
    Three-Dimensional Reconstruction from a Single RGB Image Using Deep Learning: A Review (2022). In: Journal of Imaging, E-ISSN 2313-433X, Vol. 8, no. 9, article id 225. Article, review/survey (Refereed)
    Abstract [en]

    Performing 3D reconstruction from a single 2D input is a challenging problem that is trending in the literature. Until recently, it was an ill-posed optimization problem, but with the advent of learning-based methods, performance has improved significantly. Infinitely many different 3D objects can project onto the same 2D plane, which makes the reconstruction task very difficult; it is even harder for objects with complex deformations or no texture. This paper reviews recent literature on 3D reconstruction from a single view, with a focus on deep learning methods from 2018 to 2021. Because of the lack of standard datasets and 3D shape representations, it is hard to compare all reviewed methods directly. Instead, the paper surveys approaches for reconstructing 3D shapes as depth maps, surface normals, point clouds, and meshes, along with the various loss functions and metrics used to train and evaluate these methods.

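    Among the loss functions and metrics such reviews discuss, the Chamfer distance is a common choice for point-cloud reconstruction; the snippet below is a generic definition, not tied to any particular reviewed method.

        import torch

        def chamfer_distance(p1, p2):
            """Symmetric Chamfer distance between point clouds p1 (N, 3) and p2 (M, 3):
            the mean squared distance from each point to its nearest neighbour in the other cloud."""
            d = torch.cdist(p1, p2) ** 2                     # (N, M) pairwise squared distances
            return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

        # toy usage: two random point clouds of different sizes
        print(chamfer_distance(torch.randn(1024, 3), torch.randn(2048, 3)).item())
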
  • 6.
    Mishra, Shashank
    et al.
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany.
    Hashmi, Khurram Azeem
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; Mindgarage, Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Pagani, Alain
    German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Stricker, Didier
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Afzal, Muhammad Zeshan
    Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; Mindgarage, Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
    Towards Robust Object Detection in Floor Plan Images: A Data Augmentation Approach (2021). In: Applied Sciences, E-ISSN 2076-3417, Vol. 11, no. 23, article id 11174. Article in journal (Refereed)
    Abstract [en]

    Object detection is one of the most critical tasks in computer vision; it comprises identifying and localizing objects in an image. Architectural floor plans represent the layout of buildings and apartments and consist of walls, windows, stairs, and furniture objects. While recognizing floor plan objects is straightforward for humans, automatically processing floor plans and recognizing their objects is challenging. In this work, we investigate the performance of the recently introduced Cascade Mask R-CNN network for object detection in floor plan images. Furthermore, we experimentally establish that deformable convolution works better than conventional convolution in the proposed framework. Prior datasets for object detection in floor plan images are either publicly unavailable or contain few samples. To address this issue, we introduce SFPI, a novel synthetic floor plan dataset of 10,000 images. Our proposed method comfortably exceeds the previous state-of-the-art on the SESYD dataset with an mAP of 98.1% and sets a strong baseline on our novel SFPI dataset with an mAP of 99.8%. We believe that this modern dataset will enable researchers to advance work in this domain.

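    The abstract reports that deformable convolution outperforms conventional convolution in the proposed framework. As a generic illustration of what swapping in a deformable layer looks like (using torchvision's DeformConv2d; the authors' exact configuration is not specified here), consider:

        import torch
        import torch.nn as nn
        from torchvision.ops import DeformConv2d

        class DeformableBlock(nn.Module):
            """Deformable 3x3 convolution: a small conv predicts per-location sampling
            offsets, letting the kernel adapt its receptive field to thin structures
            such as walls and door arcs in floor plans."""
            def __init__(self, in_ch, out_ch):
                super().__init__()
                # 2 offsets (dx, dy) per kernel position -> 2 * 3 * 3 = 18 channels
                self.offset_pred = nn.Conv2d(in_ch, 18, kernel_size=3, padding=1)
                self.deform_conv = DeformConv2d(in_ch, out_ch, kernel_size=3, padding=1)

            def forward(self, x):
                return self.deform_conv(x, self.offset_pred(x))

        # toy usage on a 64-channel feature map
        out = DeformableBlock(64, 64)(torch.randn(1, 64, 32, 32))
        print(out.shape)  # torch.Size([1, 64, 32, 32])
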
  • 7.
    Muralidhara, Shishir
    et al.
    Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, Kaiserslautern, Germany.
    Hashmi, Khurram Azeem
    Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), Kaiserslautern, Germany.
    Pagani, Alain
    German Research Institute for Artificial Intelligence (DFKI), Kaiserslautern, Germany.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Stricker, Didier
    Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), Kaiserslautern, Germany.
    Afzal, Muhammad Zeshan
    Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), Kaiserslautern, Germany.
    Attention-Guided Disentangled Feature Aggregation for Video Object Detection (2022). In: Sensors, E-ISSN 1424-8220, Vol. 22, no. 21, article id 8583. Article in journal (Refereed)
    Abstract [en]

    Object detection is a computer vision task that involves localisation and classification of objects in an image. Video data implicitly introduces several challenges, such as blur, occlusion and defocus, making video object detection more challenging in comparison to still image object detection, which is performed on individual and independent images. This paper tackles these challenges by proposing an attention-heavy framework for video object detection that aggregates the disentangled features extracted from individual frames. The proposed framework is a two-stage object detector based on the Faster R-CNN architecture. The disentanglement head integrates scale, spatial and task-aware attention and applies it to the features extracted by the backbone network across all the frames. Subsequently, the aggregation head incorporates temporal attention and improves detection in the target frame by aggregating the features of the support frames. These include the features extracted from the disentanglement network along with the temporal features. We evaluate the proposed framework using the ImageNet VID dataset and achieve a mean Average Precision (mAP) of 49.8 and 52.5 using the backbones of ResNet-50 and ResNet-101, respectively. The improvement in performance over the individual baseline methods validates the efficacy of the proposed approach.

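    The temporal aggregation step, in which support-frame features are weighted by their relevance to the target frame and blended in, can be reduced to the small sketch below. The real framework operates on full feature maps with scale-, spatial-, and task-aware attention; here each frame is collapsed to a single feature vector and a fixed 50/50 blend is assumed purely for illustration.

        import torch
        import torch.nn.functional as F

        def aggregate_support_frames(target_feat, support_feats):
            """Weight support-frame features by similarity to the target frame and blend
            them into the target representation (a minimal temporal-attention step).

            target_feat:   (C,)    feature vector of the frame being detected
            support_feats: (T, C)  feature vectors of neighbouring frames
            """
            scale = target_feat.numel() ** 0.5
            weights = F.softmax(support_feats @ target_feat / scale, dim=0)   # (T,)
            aggregated = (weights.unsqueeze(1) * support_feats).sum(dim=0)    # (C,)
            return 0.5 * target_feat + 0.5 * aggregated

        # toy usage: one 256-d target feature and four support frames
        print(aggregate_support_frames(torch.randn(256), torch.randn(4, 256)).shape)
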
  • 8.
    Shehzadi, Tahira
    et al.
    Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, 67663, Germany; Department of Computer Science, Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), Kaiserslautern, 67663, Germany.
    Hashmi, Khurram Azeem
    Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, 67663, Germany; Department of Computer Science, Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), Kaiserslautern, 67663, Germany.
    Pagani, Alain
    German Research Institute for Artificial Intelligence (DFKI), Kaiserslautern, 67663, Germany.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Stricker, Didier
    Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, 67663, Germany; German Research Institute for Artificial Intelligence (DFKI), Kaiserslautern, 67663, Germany.
    Afzal, Muhammad Zeshan
    Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, 67663, Germany; Department of Computer Science, Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), Kaiserslautern, 67663, Germany.
    Mask-Aware Semi-Supervised Object Detection in Floor Plans (2022). In: Applied Sciences, E-ISSN 2076-3417, Vol. 12, no. 19, article id 9398. Article in journal (Refereed)
    Abstract [en]

    Research on object detection using semi-supervised methods has been growing in the past few years. We examine the intersection of these two areas for floor-plan objects, with the goal of detecting objects more accurately from less labeled data. Floor-plan objects include different furniture items with multiple types of the same class, and this high inter-class similarity impacts the performance of prior methods. In this paper, we present a Mask R-CNN-based semi-supervised approach that provides pixel-to-pixel alignment and generates individual annotation masks for each class to mine the inter-class similarity. The approach uses a student-teacher network in which information flows from the teacher network to the student network. The teacher network uses unlabeled data to form pseudo-boxes, and the student network is trained on both the unlabeled data with these pseudo-boxes and the labeled data with its ground-truth annotations, thereby learning representations of furniture items from labeled and unlabeled data combined. With a Mask R-CNN detector and a ResNet-101 backbone, the proposed approach achieves a mAP of 98.8%, 99.7%, and 99.8% with only 1%, 5%, and 10% labeled data, respectively. Our experiments affirm the efficacy of the proposed approach, as it outperforms previous semi-supervised approaches using only 1% of the labels.

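    In student-teacher detectors of this kind, the teacher is commonly maintained as an exponential moving average (EMA) of the student's weights, which keeps its pseudo-boxes stable while the student learns. The update below is that standard EMA step with an illustrative decay value; whether this paper uses exactly this schedule is not stated in the abstract.

        import torch

        @torch.no_grad()
        def ema_update(teacher, student, decay=0.999):
            """Move each teacher parameter a small step towards the corresponding
            student parameter (exponential moving average of the student)."""
            for t_param, s_param in zip(teacher.parameters(), student.parameters()):
                t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)

        # toy usage with two identically shaped small networks
        teacher, student = torch.nn.Linear(8, 2), torch.nn.Linear(8, 2)
        ema_update(teacher, student)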