Publications (10 of 14)
Belay, B., Habtegebrial, T., Meshesha, M., Liwicki, M., Belay, G. & Stricker, D. (2020). Amharic OCR: An End-to-End Learning. Applied Sciences, 10(3), Article ID 1117.
2020 (English). In: Applied Sciences, E-ISSN 2076-3417, Vol. 10, no. 3, article id 1117. Article in journal (Refereed). Published
Abstract [en]

In this paper, we introduce an end-to-end Amharic text-line image recognition approach based on recurrent neural networks. Amharic is an indigenous Ethiopic script which follows a unique syllabic writing system adopted from an ancient Geez script. This script uses 34 consonant characters with the seven vowel variants of each (called basic characters) and other labialized characters derived by adding diacritical marks and/or removing parts of the basic characters. These associated diacritics on basic characters are relatively smaller in size, visually similar, and challenging to distinguish from the derived characters. Motivated by the recent success of end-to-end learning in pattern recognition, we propose a model which integrates a feature extractor, sequence learner, and transcriber in a unified module that is then trained in an end-to-end fashion. The experimental results, on a printed and synthetic benchmark Amharic Optical Character Recognition (OCR) database called ADOCR, demonstrated that the proposed model outperforms state-of-the-art methods by 6.98% and 1.05%, respectively.

Place, publisher, year, edition, pages
MDPI, 2020
Keywords
Amharic script, CNN, CTC, end-to-end learning, LSTM, OCR, pattern recognition, text-line image
National subject category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-78211 (URN); 10.3390/app10031117 (DOI); 2-s2.0-85081261258 (Scopus ID)
Note

Validated; 2020; Level 2; 2020-03-25 (alebob)

Available from: 2020-03-25. Created: 2020-03-25. Last updated: 2020-03-25. Bibliographically approved.
Adewumi, O. & Liwicki, M. (2020). Inner For-Loop for Speeding Up Blockchain Mining. Open Computer Science, 10(1), 42-47
2020 (English). In: Open Computer Science, ISSN 2299-1093, Vol. 10, no. 1, pp. 42-47. Article in journal (Refereed). Published
Abstract [en]

In this paper, the authors propose to increase the efficiency of blockchain mining by using a population-based approach. Blockchain relies on solving difficult mathematical problems as proof-of-work within a network before blocks are added to the chain. The brute force approach, advocated by some as the fastest algorithm for solving partial hash collisions and implemented in the Bitcoin blockchain, implies exhaustive, sequential search. It involves incrementing the nonce (number) of the header by one, then taking a double SHA-256 hash at each instance and comparing it with a target value to ascertain whether it is lower than that target. It excessively consumes both time and power. In this paper, the authors, therefore, suggest using an inner for-loop for the population-based approach. Comparison shows that it is a slightly faster approach than brute force, with an average speed advantage of about 1.67% or 3,420 iterations per second, performing better 73% of the time. Also, we observed that the more particles deployed in total, the better the performance, up to a pivotal point. Furthermore, a recommendation on taming the excessive use of power by networks, like Bitcoin's, by using penalty by consensus is suggested.
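
The brute-force baseline described in this abstract (increment the nonce, take a double SHA-256 hash, compare against a target) can be sketched as follows. This is a minimal illustration only, not the authors' code and not their population-based variant: the names `double_sha256` and `brute_force_nonce`, the toy header bytes, and the easy target are all assumptions made for the example, and a real Bitcoin block header has a fixed 80-byte layout that is not reproduced here.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as in the proof-of-work described above."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def brute_force_nonce(header_prefix: bytes, target: int, max_nonce: int = 2**32):
    """Sequentially try nonces until the double hash is below `target`.

    `header_prefix` is a hypothetical stand-in for the header fields that
    precede the nonce. Returns (nonce, digest) or None if no nonce is found.
    """
    for nonce in range(max_nonce):
        # Append the nonce as a 4-byte little-endian unsigned integer.
        candidate = header_prefix + struct.pack("<I", nonce)
        digest = double_sha256(candidate)
        # Interpret the digest as a 256-bit little-endian integer.
        if int.from_bytes(digest, "little") < target:
            return nonce, digest
    return None


if __name__ == "__main__":
    # A deliberately easy target: roughly one hash in 256 qualifies.
    found = brute_force_nonce(b"toy-header", 1 << 248)
    print(found[0] if found else "no nonce found")
```

Lowering the target makes qualifying hashes rarer, which is why the exhaustive search becomes expensive in time and power.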

Place, publisher, year, edition, pages
Poland: Walter de Gruyter, 2020
Keywords
Blockchain, Network, Inner For-Loop, SHA-256, Brute force
National subject category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-76859 (URN); 10.1515/comp-2020-0004 (DOI); 000521941100001 (ISI); 2-s2.0-85081908099 (Scopus ID)
Note

Validated; 2020; Level 2; 2020-04-09 (alebob)

Available from: 2019-11-28. Created: 2019-11-28. Last updated: 2020-04-09. Bibliographically approved.
Alberti, M., Pondenkandath, V., Würsch, M., Bouillon, M., Seuret, M., Ingold, R. & Liwicki, M. (2019). Are You Tampering with My Data?. In: Laura Leal-Taixé & Stefan Roth (Ed.), Computer Vision – ECCV 2018 Workshops: Proceedings, Part II. Paper presented at 15th European Conference on Computer Vision (ECCV), September 8-14, Munich, Germany (pp. 296-312). Springer
2019 (English). In: Computer Vision – ECCV 2018 Workshops: Proceedings, Part II / [ed] Laura Leal-Taixé & Stefan Roth, Springer, 2019, pp. 296-312. Conference paper, Published paper (Refereed)
Abstract [en]

We propose a novel approach towards adversarial attacks on neural networks (NN), focusing on tampering with the data used for training instead of generating attacks on trained models. Our network-agnostic method creates a backdoor during training which can be exploited at test time to force a neural network to exhibit abnormal behaviour. We demonstrate on two widely used datasets (CIFAR-10 and SVHN) that a universal modification of just one pixel per image for all the images of a class in the training set is enough to corrupt the training procedure of several state-of-the-art deep neural networks, causing the networks to misclassify any images to which the modification is applied. Our aim is to bring to the attention of the machine learning community the possibility that even learning-based methods that are personally trained on public datasets can be subject to attacks by a skillful adversary.
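
The kind of training-set tampering this abstract describes (the same single pixel overwritten in every image of one class) can be illustrated with a small sketch. The abstract does not give the pixel position, the pixel value, or any implementation details, so the function name `tamper_class`, the nested-list image layout, and the chosen position and colour below are assumptions for illustration only.

```python
import copy


def tamper_class(images, labels, target_class, row, col, value):
    """Return a copy of `images` in which every image labelled `target_class`
    has the pixel at (row, col) overwritten with `value`.

    Images are assumed to be nested lists indexed as image[row][col] -> [r, g, b];
    the position and value are hypothetical, not taken from the paper.
    """
    poisoned = copy.deepcopy(images)  # leave the original dataset untouched
    for image, label in zip(poisoned, labels):
        if label == target_class:
            image[row][col] = list(value)
    return poisoned


if __name__ == "__main__":
    # Three tiny 4x4 black RGB images with labels 0, 1, 0.
    images = [[[[0, 0, 0] for _ in range(4)] for _ in range(4)] for _ in range(3)]
    labels = [0, 1, 0]
    poisoned = tamper_class(images, labels, target_class=1, row=2, col=3, value=(255, 0, 0))
    print(poisoned[1][2][3], poisoned[0][2][3])
```

A model trained on such data may learn to associate the marked pixel with the class, which is the backdoor the abstract refers to; applying the same mark to a test image is then what triggers the misclassification.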

Place, publisher, year, edition, pages
Springer, 2019
Series
Lecture Notes in Computer Science ; 11130
Keywords
Adversarial attack, Machine learning, Deep neural networks, Data poisoning
National subject category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-73147 (URN); 10.1007/978-3-030-11012-3_25 (DOI); 2-s2.0-85061797135 (Scopus ID); 978-3-030-11011-6 (ISBN)
Conference
15th European Conference on Computer Vision (ECCV), September 8-14, Munich, Germany
Available from: 2019-03-11. Created: 2019-03-11. Last updated: 2019-03-11. Bibliographically approved.
Kovács, G., Balogh, V., Mehta, P., Shridhar, K., Alonso, P. & Liwicki, M. (2019). Author Profiling Using Semantic and Syntactic Features: Notebook for PAN at CLEF 2019. In: Linda Cappellato, Nicola Ferro, David E. Losada, Henning Müller (Ed.), CLEF 2019 Working Notes: Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum. Paper presented at CLEF 2019.
2019 (English). In: CLEF 2019 Working Notes: Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum / [ed] Linda Cappellato, Nicola Ferro, David E. Losada, Henning Müller, 2019. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we present an approach for the PAN 2019 Author Profiling challenge. The task here is to detect Twitter bots and also to classify the gender of human Twitter users as male or female, based on a hundred select tweets from their profile. Focusing on feature engineering, we explore the semantic categories present in tweets. We combine these semantic features with part-of-speech tags and other stylistic features – e.g. character floodings and the use of capital letters – for our eventual feature set. We have experimented with different machine learning techniques, including ensemble techniques, and found AdaBoost to be the most successful (attaining an F1-score of 0.99 on the development set). Using this technique, we achieved an accuracy score of 89.17% for English language tweets in the bot detection subtask.

National subject category
Language Technology (Computational Linguistics)
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-76936 (URN)
Conference
CLEF 2019
Available from: 2019-11-28. Created: 2019-11-28. Last updated: 2019-11-28.
Maergner, P., Pondenkandath, V., Alberti, M., Liwicki, M., Riesen, K., Ingold, R. & Fischer, A. (2019). Combining graph edit distance and triplet networks for offline signature verification. Pattern Recognition Letters, 125, 527-533
2019 (English). In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 125, pp. 527-533. Article in journal (Refereed). Published
Abstract [en]

Offline signature verification is a challenging pattern recognition task where a writer model is inferred using only a small number of genuine signatures. A combination of complementary writer models can make it more difficult for an attacker to deceive the verification system. In this work, we propose to combine a recent structural approach based on graph edit distance with a statistical approach based on deep triplet networks. The combination of the structural and statistical models achieves significant improvements in performance on four publicly available benchmark datasets, highlighting their complementary perspectives.

Place, publisher, year, edition, pages
Elsevier, 2019
Keywords
Offline signature verification, Graph edit distance, Metric learning, Deep convolutional neural network, Triplet network
National subject category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-75255 (URN); 10.1016/j.patrec.2019.06.024 (DOI); 000482374500072 (ISI); 2-s2.0-85067868377 (Scopus ID)
Note

Validated; 2019; Level 2; 2019-08-20 (johcin)

Available from: 2019-07-08. Created: 2019-07-08. Last updated: 2019-09-13. Bibliographically approved.
Adewumi, O., Liwicki, F. & Liwicki, M. (2019). Conversational Systems in Machine Learning from the Point of View of the Philosophy of Science—Using Alime Chat and Related Studies. Philosophies, 4(3), Article ID 41.
2019 (English). In: Philosophies, ISSN 2409-9287, Vol. 4, no. 3, article id 41. Article in journal (Refereed). Published
Abstract [en]

This essay discusses current research efforts in conversational systems from the philosophy of science point of view and evaluates some conversational systems research activities from the standpoint of naturalism philosophical theory. Conversational systems or chatbots have advanced over the decades and now have become mainstream applications. They are software that users can communicate with, using natural language. Particular attention is given to the Alime Chat conversational system, already in industrial use, and the related research. The competitive nature of systems in production is a result of different researchers and developers trying to produce new conversational systems that can outperform previous or state-of-the-art systems. Different factors affect the quality of the conversational systems produced, and how one system is assessed as being better than another is a function of objectivity and of the relevant experimental results. This essay examines the research practices from, among others, Longino’s view on objectivity and Popper’s stand on falsification. Furthermore, the need for qualitative and large datasets is emphasized. This is in addition to the importance of the peer-review process in scientific publishing, as a means of developing, validating, or rejecting theories, claims, or methodologies in the research community. In conclusion, open data and open scientific discussion fora should become more prominent over the mere publication-focused trend.

Place, publisher, year, edition, pages
Switzerland: MDPI, 2019
Keywords
conversational systems, chatbots, philosophy of science, objectivity, verification, falsification
National subject category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-75430 (URN); 10.3390/philosophies4030041 (DOI)
Note

Validated; 2019; Level 1; 2019-09-18 (marisr)

Available from: 2019-08-08. Created: 2019-08-08. Last updated: 2019-09-18. Bibliographically approved.
Kovács, G., Tóth, L., Van Compernolle, D. & Liwicki, M. (2019). Examining the combination of multi-band processing and channel dropout for robust speech recognition. In: Proceedings of the Annual Conference of the International Speech Communication Association, 2019. Paper presented at Interspeech 2019.
2019 (English). In: Proceedings of the Annual Conference of the International Speech Communication Association, 2019, 2019. Conference paper, Published paper (Refereed)
Abstract [en]

A pivotal question in Automatic Speech Recognition (ASR) is the robustness of the trained models. In this study, we investigate the combination of two methods commonly applied to increase the robustness of ASR systems. On the one hand, inspired by auditory experiments and signal processing considerations, multi-band processing has been used for decades to improve the noise robustness of speech recognition. On the other hand, dropout is a commonly used regularization technique to prevent overfitting by keeping the model from becoming over-reliant on a small set of neurons. We hypothesize that the careful combination of the two approaches would lead to increased robustness, by preventing the resulting model from over-relying on any given band. To verify our hypothesis, we investigate various approaches for the combination of the two methods using the Aurora-4 corpus. The results obtained corroborate our initial assumption, and show that the proper combination of the two techniques leads to increased robustness, and to significantly lower word error rates (WERs). Furthermore, we find that the accuracy scores attained here compare favourably to those reported recently on the clean training scenario of the Aurora-4 corpus.
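
One way to picture the idea of applying dropout to whole frequency bands, so that a model cannot over-rely on any single band, is the sketch below. The abstract does not specify how the combination is implemented, so everything here is an assumption for illustration: the function name `band_dropout`, the per-band drop probability, the choice to zero a dropped band, and the rule that at least one band is always kept.

```python
import random


def band_dropout(bands, drop_prob, rng):
    """Randomly zero out entire bands of a feature frame during training.

    `bands` is a list of per-band feature lists for one frame. Each band is
    dropped independently with probability `drop_prob`; if every band would
    be dropped, one randomly chosen band is kept (an assumed safeguard).
    """
    keep = [rng.random() >= drop_prob for _ in bands]
    if not any(keep):
        keep[rng.randrange(len(bands))] = True
    return [band if kept else [0.0] * len(band) for band, kept in zip(bands, keep)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    frame = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three hypothetical bands
    print(band_dropout(frame, drop_prob=0.5, rng=rng))
```

At test time such a scheme would normally be switched off so that all bands reach the model, as with ordinary dropout.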

Keywords
multi-band processing, band-dropout, robust speech recognition, Aurora-4
National subject category
Human Computer Interaction
Identifiers
urn:nbn:se:ltu:diva-76905 (URN); 10.21437/Interspeech.2019-3215 (DOI)
Conference
Interspeech 2019
Available from: 2019-11-28. Created: 2019-11-28. Last updated: 2019-11-28.
Pondenkandath, V., Alberti, M., Diatta, M., Ingold, R. & Liwicki, M. (2019). Historical Document Synthesis with Generative Adversarial Networks. In: 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW). Paper presented at The Second International Workshop on Computational Document Forensics (ICDAR 2019 Workshop), 20-25 September, 2019, Sydney, Australia (pp. 146-151). IEEE, Vol. 5
2019 (English). In: 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), IEEE, 2019, Vol. 5, pp. 146-151. Conference paper, Published paper (Other academic)
Abstract [en]

This work tackles a particular image-to-image translation problem, where the goal is to transform an image from a source domain (modern printed electronic document) to a target domain (historical handwritten document). The main motivation of this task is to generate massive synthetic datasets of "historic" documents which can be used for the training of document analysis systems. By completing this task, it becomes possible to consider the generation of a tremendous amount of synthetic training data using only one single deep learning algorithm. Existing approaches for synthetic document generation rely on heuristics, or 2D and 3D geometric transformation-functions and are typically targeted at degrading the document. We tackle the problem of document synthesis and propose to train a particular form of Generative Adversarial Neural Networks, to learn a mapping function from an input image to an output image. With several experiments, we show that our algorithm generates an artificial historical document image that looks like a real historical document - for expert and non-expert eyes - by transferring the "historical style" to the classical electronic document.

Place, publisher, year, edition, pages
IEEE, 2019
Series
International Conference on Document Analysis and Recognition Workshops (ICDARW)
Keywords
historical document, deep learning, document synthesis
National subject category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-78265 (URN); 10.1109/ICDARW.2019.40096 (DOI); 000518786800025 (ISI); 978-1-7281-5054-3 (ISBN)
Conference
The Second International Workshop on Computational Document Forensics (ICDAR 2019 Workshop), 20-25 September, 2019, Sydney, Australia
Available from: 2020-04-01. Created: 2020-04-01. Last updated: 2020-04-01. Bibliographically approved.
Saini, R., Dobson, D., Morrey, J., Liwicki, M. & Liwicki, F. (2019). ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records. In: ICDAR 2019: ICDAR 2019 HDRC Chinese. Paper presented at ICDAR 2019.
2019 (English). In: ICDAR 2019: ICDAR 2019 HDRC Chinese, 2019. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we present a large historical database of Chinese family records with the aim of developing robust systems for historical document analysis. In this direction, we propose a Historical Document Reading Challenge on Large Chinese Structured Family Records (ICDAR 2019 HDRC CHINESE). The objective of the competition is to recognize and analyze the layout, and finally detect and recognize the textlines and characters of the large historical document image dataset containing more than 10000 pages. Cascade R-CNN, CRNN, and U-Net based architectures were trained to evaluate the performances in these tasks. An error rate of 0.01 has been recorded for textline recognition (Task1) whereas a Jaccard Index of 99.54% has been recorded for layout analysis (Task2). The graph edit distance based total error ratio of 1.5% has been recorded for complete integrated textline detection and recognition (Task3).

National subject category
Other Electrical Engineering, Electronic Engineering, Information Engineering; Computer Systems
Identifiers
urn:nbn:se:ltu:diva-77258 (URN)
Conference
ICDAR 2019
Available from: 2019-12-27. Created: 2019-12-27. Last updated: 2020-01-23.
Saini, R., Kumar, P., Patidar, S., Roy, P. & Liwicki, M. (2019). Trilingual 3D Script Identification and Recognition using Leap Motion Sensor. In: 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW). Paper presented at 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), 20-25 September, 2019, Sydney, Australia (pp. 24-28). IEEE, Vol. 5
2019 (English). In: 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), IEEE, 2019, Vol. 5, pp. 24-28. Conference paper, Published paper (Other academic)
Abstract [en]

Recently, the development of depth sensing technologies such as Leap Motion and Microsoft Kinect sensors has facilitated a touch-less environment for interacting with computers and mobile devices. Several studies have been carried out on air-written text recognition with the help of these devices. However, there are several countries (like India) where multiple scripts are used to write official languages. Therefore, for the development of an effective text recognition system, the script of the text has to be identified first. The task becomes more challenging when it comes to 3D handwriting, since 3D text written in the air consists of a single stroke only. This paper presents a 3D script identification and recognition system for text written in three languages, namely Hindi, English and Punjabi, using a Leap Motion sensor. In the first stage, the script was identified as one of the three languages. Next, a Hidden Markov Model (HMM) was used to recognize the words. An accuracy of 96.4% was recorded in script identification whereas accuracies of 72.99%, 73.25% and 60.5% were recorded in word recognition for the Hindi, English and Punjabi scripts, respectively.

Place, publisher, year, edition, pages
IEEE, 2019
Series
International Conference on Document Analysis and Recognition Workshops (ICDARW)
Keywords
Air-writing, Leap motion, Word recognition, Script Identification, HMM
National subject category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-77257 (URN); 10.1109/ICDARW.2019.40076 (DOI); 000518786800005 (ISI); 978-1-7281-5054-3 (ISBN)
Conference
2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), 20-25 September, 2019, Sydney, Australia
Available from: 2019-12-27. Created: 2019-12-27. Last updated: 2020-04-01. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0003-4029-6574