1 - 34 of 34
  • 1.
    Alonso, Pedro
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Kovács, György
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. MTA-SZTE Research Group on Artificial Intelligence, Szeged, Hungary.
    Hate Speech Detection using Transformer Ensembles on the HASOC Dataset. 2020. In: Speech and Computer: 22nd International Conference, SPECOM 2020, St. Petersburg, Russia, October 7–9, 2020, Proceedings / [ed] Alexey Karpov, Rodmonga Potapova, Springer, 2020, p. 13-21. Conference paper (Refereed)
    Abstract [en]

    With the ubiquity and anonymity of the Internet, the spread of hate speech has been a growing concern for many years now. Language used to dehumanize, defame, or threaten individuals and marginalized groups threatens not only the mental health of its targets and their democratic access to the Internet, but also the fabric of our society. Because of this, much effort has been devoted to manual moderation. The amount of data generated each day, particularly on social media platforms such as Facebook and Twitter, however, makes this a Sisyphean task. This has led to an increased demand for automatic methods of hate speech detection.

    Here, to contribute towards solving the task of hate speech detection, we worked with a simple ensemble of transformer models on a Twitter-based hate speech benchmark. Using this method, we attained a weighted F1-score of 0.8426, which we further improved by leveraging more training data, achieving a weighted F1-score of 0.8504 and thus markedly outperforming the best-performing system in the literature.

  • 2.
    Alonso, Pedro
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Kovács, György
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    TheNorth at HASOC 2019: Hate Speech Detection in Social Media Data. 2019. In: Working Notes of FIRE 2019 - Forum for Information Retrieval Evaluation / [ed] Parth Mehta, Paolo Rosso, Prasenjit Majumder, Mandar Mitra, RWTH Aachen University, 2019, p. 293-299. Conference paper (Refereed)
    Abstract [en]

    The detection of hate speech in social media is a crucial task. The uncontrolled spread of hate speech can be detrimental to maintaining peace and harmony in society, particularly when it is spread with the intention to defame people, or to spoil the image of a person, a community, or a nation. A major ground for spreading hate speech is social media. This significantly contributes to the difficulty of the task, as social media posts not only include paralinguistic tools (e.g. emoticons and hashtags), but their linguistic content also contains plenty of poorly written text that does not adhere to grammar rules. With recent developments in Natural Language Processing (NLP), particularly with deep architectures, it is now possible to analyze unstructured composite natural language text. For this reason, we propose a deep NLP model for the automatic detection of hate speech in social media data. We have applied our model to the HASOC2019 hate speech corpus, and attained a macro F1 score of 0.63 in the detection of hate speech.

  • 3.
    Alonso, Pedro
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Kovács, György
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    TheNorth at SemEval-2020 Task 12: Hate Speech Detection using RoBERTa. 2020. In: The International Workshop on Semantic Evaluation: Proceedings of the Fourteenth Workshop, International Committee for Computational Linguistics, 2020, p. 2197-2202. Conference paper (Refereed)
    Abstract [en]

    Hate speech detection on social media platforms is crucial as it helps to avoid severe harm to marginalized people and groups. The application of Natural Language Processing (NLP) and Deep Learning has garnered encouraging results in the task of hate speech detection. The expression of hate, however, is varied and ever-evolving; thus, better detection systems need to adapt to this variance. Because of this, researchers keep collecting data and regularly organize hate speech detection competitions. In this paper, we discuss our entry to one such competition, namely the English version of sub-task A of the OffensEval competition. Our contribution can be perceived through our results: first an F1-score of 0.9087, which, with the further refinements described here, climbed to 0.9166. This lends more support to our hypothesis that one of the variants of BERT, namely RoBERTa, can successfully differentiate between offensive and non-offensive tweets, given the proper preprocessing steps.

  • 4.
    Chhipa, Prakash Chandra
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chopra, Muskaan
    CCET, Punjab University, Chandigarh, India.
    Mengi, Gopal
    CCET, Punjab University, Chandigarh, India.
    Gupta, Varun
    CCET, Punjab University, Chandigarh, India.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chippa, Meenakshi Subhash
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    De, Kanjar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Uchida, Seiichi
    Human Interface Laboratory, Kyushu University, Fukuoka, Japan.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Functional Knowledge Transfer with Self-supervised Representation Learning. 2023. In: 2023 IEEE International Conference on Image Processing: Proceedings, IEEE, 2023, p. 3339-3343. Conference paper (Refereed)
  • 5.
    Chhipa, Prakash Chandra
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rodahl Holmgren, Johan
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    De, Kanjar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. Video Coding Systems, Fraunhofer Heinrich-Hertz-Institut, Berlin, Germany.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Can Self-Supervised Representation Learning Methods Withstand Distribution Shifts and Corruptions? 2023. In: 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023), Institute of Electrical and Electronics Engineers Inc., 2023, p. 4469-4478. Conference paper (Refereed)
  • 6.
    Chhipa, Prakash Chandra
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Grund Pihlgren, Gustav
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Uchida, Seiichi
    Human Interface Laboratory, Kyushu University, Fukuoka, Japan.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Magnification Prior: A Self-Supervised Method for Learning Representations on Breast Cancer Histopathological Images. 2023. In: Proceedings: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2023), IEEE, 2023, p. 2716-2726. Conference paper (Refereed)
    Abstract [en]

    This work presents a novel self-supervised pre-training method to learn efficient representations without labels on histopathology medical images by utilizing magnification factors. Other state-of-the-art works mainly focus on fully supervised learning approaches that rely heavily on human annotations. However, the scarcity of labeled and unlabeled data is a long-standing challenge in histopathology. Currently, representation learning without labels remains unexplored in the histopathology domain. The proposed method, Magnification Prior Contrastive Similarity (MPCS), enables self-supervised learning of representations without labels on the small-scale breast cancer dataset BreakHis by exploiting the magnification factor, inductive transfer, and reducing human prior. The proposed method matches fully supervised state-of-the-art performance in malignancy classification when only 20% of labels are used in fine-tuning, and outperforms previous works in fully supervised learning settings for three public breast cancer datasets, including BreakHis. Further, it provides initial support for the hypothesis that reducing human prior leads to efficient representation learning in self-supervision, which will need further investigation. The implementation of this work is available online on GitHub.

  • 7.
    Chhipa, Prakash Chandra
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Lindqvist, Lars
    Optimation Advanced Measurements AB, Luleå, Sweden.
    Nordenskjold, Richard
    Optimation Advanced Measurements AB, Luleå, Sweden.
    Uchida, Seiichi
    Human Interface Laboratory, Kyushu University, Fukuoka, Japan.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Depth Contrast: Self-Supervised Pretraining on 3DPM Images for Mining Material Classification. Manuscript (preprint) (Other academic)
    Abstract [en]

    This work presents a novel self-supervised representation learning method to learn efficient representations without labels on images from a 3DPM sensor (3-Dimensional Particle Measurement; estimates the particle size distribution of material), utilizing RGB images and depth maps of mining material on the conveyor belt. Human annotations for material categories on sensor-generated data are scarce and cost-intensive. Currently, representation learning without human annotations remains unexplored for mining materials and does not leverage sensor-generated data. The proposed method, Depth Contrast, enables self-supervised learning of representations without labels on the 3DPM dataset by exploiting depth maps and inductive transfer. The proposed method outperforms ImageNet transfer learning on material classification in fully supervised learning settings, achieving an F1 score of 0.73. Further, the proposed method yields an F1 score of 0.65, an 11% improvement over ImageNet transfer learning, in a semi-supervised setting when only 20% of labels are used in fine-tuning. Finally, the proposed method showcases improved generalization in linear evaluation. The implementation of the proposed method is available on GitHub.

  • 8.
    Chhipa, Prakash Chandra
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Lindqvist, Lars
    Optimation Advanced Measurements AB, Luleå, Sweden.
    Nordenskjold, Richard
    Optimation Advanced Measurements AB, Luleå, Sweden.
    Uchida, Seiichi
    Human Interface Laboratory, Kyushu University, Fukuoka, Japan.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Depth Contrast: Self-supervised Pretraining on 3DPM Images for Mining Material Classification. 2022. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part VI / [ed] Avidan, S.; Brostow, B.; Cissé, M.; Farinella, G.M.; Hassner, H., Springer Nature, 2022, Vol. VI, p. 212-227. Conference paper (Refereed)
  • 9.
    Keserwani, Prateek
    et al.
    Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
    Dhankhar, Ankit
    Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Roy, Partha Pratim
    Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
    Quadbox: Quadrilateral bounding box based scene text detection using vector regression. 2021. In: IEEE Access, E-ISSN 2169-3536, Vol. 9, p. 36802-36818. Article in journal (Refereed)
    Abstract [en]

    Scene text appears in a wide range of sizes and arbitrary orientations. For detecting such text in scene images, quadrilateral bounding boxes provide a much tighter fit than rotated rectangles. In this work, a vector regression method is proposed for text detection in the wild that generates a quadrilateral bounding box. Bounding box prediction using direct regression requires predicting vectors from each position inside the quadrilateral: four vectors, each varying drastically in length and orientation, which makes vector prediction a difficult problem. To overcome this, we propose a centroid-centric vector regression that utilizes the geometry of the quadrilateral. In this work, we add the philosophy of indirect regression to direct regression by shifting all points within the quadrilateral to the centroid and then performing vector regression from the shifted points. The experimental results show the improvement of the quadrilateral approach over the existing direct regression approach. The proposed method shows good performance on many existing public datasets. It also demonstrates good results on an unseen dataset without being trained on it, which validates the approach's generalization ability.

  • 10.
    Keserwani, Prateek
    et al.
    Indian Institute of Technology, Roorkee, India.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Roy, Partha Pratim
    Indian Institute of Technology, Roorkee, India.
    Robust Scene Text Detection for Partially Annotated Training Data. 2022. In: IEEE Transactions on Circuits and Systems for Video Technology, ISSN 1051-8215, E-ISSN 1558-2205, Vol. 32, no 12, p. 8635-8645. Article in journal (Refereed)
    Abstract [en]

    This article analyzes the impact of training data containing un-annotated text instances, i.e., partial annotation, in scene text detection, and proposes a text region refinement approach to address it. Scene text detection is a problem that has attracted the attention of the research community for decades. Impressive results have been obtained for fully supervised scene text detection with recent deep learning approaches. These approaches, however, need a vast amount of completely labeled data, and the creation of such datasets is a challenging and time-consuming task. The research literature lacks an analysis of partial annotation of training data for scene text detection. We have found that the performance of a generic scene text detection method drops significantly due to partial annotation of the training data. We propose a text region refinement method that provides robustness against partially annotated training data in scene text detection. The proposed method works as a two-tier scheme. Text-probable regions are obtained in the first tier by applying a hybrid loss that generates pseudo-labels, which refine text regions in the second tier during training. Extensive experiments have been conducted on a dataset generated from ICDAR 2015 by dropping annotations at various drop rates, and on the publicly available SVT dataset. The proposed method exhibits a significant improvement over the baseline and existing approaches for partially annotated training data.

  • 11.
    Kovács, Gyorgy
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Alonso, Pedro
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Challenges of Hate Speech Detection in Social Media: Data Scarcity, and Leveraging External Resources. 2021. In: SN Computer Science, ISSN 2662-995X, Vol. 2, no 2, article id 95. Article in journal (Refereed)
    Abstract [en]

    The detection of hate speech in social media is a crucial task. The uncontrolled spread of hate has the potential to gravely damage our society, and severely harm marginalized people or groups. A major arena for spreading hate speech online is social media. This significantly contributes to the difficulty of automatic detection, as social media posts include paralinguistic signals (e.g. emoticons and hashtags), and their linguistic content contains plenty of poorly written text. Another difficulty is presented by the context-dependent nature of the task, and the lack of consensus on what constitutes hate speech, which makes the task difficult even for humans. This makes the task of creating large labeled corpora difficult and resource-consuming. The problem posed by ungrammatical text has been largely mitigated by the recent emergence of deep neural network (DNN) architectures that have the capacity to efficiently learn various features. For this reason, we proposed a deep natural language processing (NLP) model, combining convolutional and recurrent layers, for the automatic detection of hate speech in social media data. We applied our model to the HASOC2019 corpus, and attained a macro F1 score of 0.63 in hate speech detection on the test set of HASOC. The capacity of DNNs for efficient learning, however, also means an increased risk of overfitting, particularly with limited training data available (as was the case for HASOC). For this reason, we investigated different methods for expanding the resources used. We explored various opportunities, such as leveraging unlabeled data and similarly labeled corpora, as well as the use of novel models. Our results showed that by doing so, it was possible to significantly increase the classification score attained.

  • 12.
    Kovács, György
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Alonso, Pedro
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Leveraging external resources for offensive content detection in social media. 2022. In: AI Communications, ISSN 0921-7126, E-ISSN 1875-8452, Vol. 35, no 2, p. 87-109. Article in journal (Refereed)
    Abstract [en]

    Hate speech is a burning issue of today’s society that cuts across numerous strategic areas, including human rights protection, refugee protection, and the fight against racism and discrimination. The gravity of the subject is further demonstrated by António Guterres, the United Nations Secretary-General, calling it “a menace to democratic values, social stability, and peace”. One central platform for the spread of hate speech is the Internet and social media in particular. Thus, automatic detection of hateful and offensive content on these platforms is a crucial challenge that would strongly contribute to an equal and sustainable society when overcome. One significant difficulty in meeting this challenge is collecting sufficient labeled data. In our work, we examine how various resources can be leveraged to circumvent this difficulty. We carry out extensive experiments to exploit various data sources using different machine learning models, including state-of-the-art transformers. We have found that using our proposed methods, one can attain state-of-the-art performance detecting hate speech on Twitter (outperforming the winner of both the HASOC 2019 and HASOC 2020 competitions). It is observed that in general, adding more data improves the performance or does not decrease it. Even when using good language models and knowledge transfer mechanisms, the best results were attained using data from one or two additional data sets.

  • 13.
    Kovács, György
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Faridghasemnia, Mohamadreza
    Örebro Universitet / Örebro, Sweden-70182.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Adewumi, Tosin
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Alonso, Pedro
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Pedagogical Principles in the Online Teaching of NLP: A Retrospection. 2021. In: Teaching NLP: Proceedings of the Fifth Workshop / [ed] David Jurgens; Varada Kolhatkar; Lucy Li; Margot Mieskes; Ted Pedersen, Association for Computational Linguistics (ACL), 2021, p. 1-12. Conference paper (Refereed)
    Abstract [en]

    The ongoing COVID-19 pandemic has brought online education to the forefront of pedagogical discussions. To make this increased interest sustainable in a post-pandemic era, online courses must be built on strong pedagogical foundations. With a long history of pedagogic research, there are many principles, frameworks, and models available to help teachers in doing so. These models cover different teaching perspectives, such as constructive alignment, feedback, and the learning environment. In this paper, we discuss how we designed and implemented our online Natural Language Processing (NLP) course following constructive alignment and adhering to the pedagogical principles of LTU. By examining our course and analyzing student evaluation forms, we show that we have met our goal and successfully delivered the course. Furthermore, we discuss the additional benefits resulting from the current mode of delivery, including the increased reusability of course content and increased potential for collaboration between universities. Lastly, we also discuss where we can and will further improve the current course design.

  • 14.
    Kumar, Pradeep
    et al.
    Deptt of CSE, IIT, Roorkee, India.
    Saini, Rajkumar
    Deptt of CSE, IIT, Roorkee, India.
    Behera, Santosh Kumar
    Indian Institute of Technology Bhubaneswar, Bhubaneswar, Orissa, IN.
    Dogra, Debi Prosad
    Indian Institute of Technology Bhubaneswar, Bhubaneswar, Orissa, IN.
    Roy, Partha Pratim
    Indian Institute of Technology Roorkee, Roorkee, Uttar Pradesh, IN.
    Real-time recognition of sign language gestures and air-writing using leap motion. 2017. In: Proceedings of the 15th IAPR International Conference on Machine Vision Applications, MVA 2017, IEEE, 2017, p. 157-160, article id 7986825. Conference paper (Refereed)
  • 15.
    Lavergne, Eric
    et al.
    Luleå University of Technology.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Kovács, György
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Murphy, Killian
    Luleå University of Technology.
    TheNorth @ HaSpeeDe 2: BERT-based Language Model Fine-tuning for Italian Hate Speech Detection. 2020. In: Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), online event, December 17th, 2020 / [ed] Valerio Basile; Danilo Croce; Maria Di Maro; Lucia C. Passaro, RWTH Aachen University, 2020. Conference paper (Refereed)
    Abstract [en]

    This report describes the systems submitted by the team "TheNorth" for the HaSpeeDe 2 shared task organised within EVALITA 2020. To address the main task, hate speech detection, we fine-tuned BERT-based models. We evaluated both multilingual and Italian language models trained with the data provided and with additional data. We also studied the contribution of multitask learning, considering both the hate speech detection and stereotype detection tasks.

  • 16.
    Liwicki, Foteini
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Gupta, Vibha
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    De, Kanjar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rethinking the Methods and Algorithms for Inner Speech Decoding and Making Them Reproducible. 2022. In: NeuroSci, ISSN 2673-4087, Vol. 3, no 2, p. 226-244. Article in journal (Refereed)
    Abstract [en]

    This study focuses on the automatic decoding of inner speech using noninvasive methods, such as Electroencephalography (EEG). While inner speech has been a research topic in philosophy and psychology for half a century, recent attempts have been made to decode nonvoiced spoken words by using various brain–computer interfaces. The main shortcomings of existing work are reproducibility and the availability of data and code. In this work, we investigate various methods (Convolutional Neural Networks (CNN), Gated Recurrent Units (GRU), and Long Short-Term Memory networks (LSTM)) for the detection of five vowels and six words on a publicly available EEG dataset. The main contributions of this work are (1) subject-dependent vs. subject-independent approaches, (2) the effect of different preprocessing steps (Independent Component Analysis (ICA), down-sampling, and filtering), and (3) word classification (where we achieve state-of-the-art performance on a publicly available dataset). Overall, we achieve accuracies of 35.20% and 29.21% when classifying five vowels and six words, respectively, on a publicly available dataset, using our tuned iSpeech-CNN architecture. All of our code and processed data are publicly available to ensure reproducibility. As such, this work contributes to a deeper understanding and reproducibility of experiments in the area of inner speech detection.

  • 17.
    Mishra, Ashish Ranjan
    et al.
    Rajkiya Engineering College Sonbhadra, UP, India; Madan Mohan Malaviya University of Technology, Gorakhpur, UP, India.
    Kumar, Rakesh
    Madan Mohan Malaviya University of Technology, Gorakhpur, UP, India.
    Gupta, Vibha
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Prabhu, Sameer
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Operation, Maintenance and Acoustics.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chhipa, Prakash Chandra
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    SignEEG v1.0 : Multimodal Electroencephalography and Signature Database for Biometric Systems2023Manuscript (preprint) (Other academic)
  • 18.
    Mokayed, Hamam
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Shivakumara, Palaiahnakote
    Department of System and Technology, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur 50603, Malaysia.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chee Hin, Loo
    Department of Computer Science, Asia Pacific University, Kuala Lumpur 57000, Malaysia.
    Pal, Umapada
    Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata 700108, India.
    Anomaly Detection in Natural Scene Images based on enhanced Fine-Grained Saliency and Fuzzy Logic2021In: IEEE Access, E-ISSN 2169-3536, Vol. 9, p. 129102-129109Article in journal (Refereed)
    Abstract [en]

    This paper proposes a simple yet effective method for anomaly detection in natural scene images, improving natural scene text detection and recognition. In the last decade, there has been significant progress towards text detection and recognition in natural scene images. However, in cases where logos, company symbols, or other decorative elements accompany the text, existing methods do not perform well. This work treats such misclassified components, which are part of the text, as anomalies, and presents a new idea for detecting these anomalies to improve text detection and recognition in natural scene images. The proposed method takes the result of an existing text detection method as input and segments characters or components based on a saliency map and rough set theory. For each segmented component, the proposed method extracts features from the saliency map based on density, pixel distribution, and phase congruency to classify text and non-text components using a fuzzy-based classifier. To verify the effectiveness of the method, we have performed experiments on several benchmark datasets for natural scene text detection, namely, MSRA-TD500 and SVT. Experimental results show the efficacy of the proposed method over existing ones for text detection and recognition on these datasets.
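    The fuzzy text/non-text decision described above can be sketched with triangular membership functions combined by a minimum t-norm; the feature ranges and threshold below are invented for demonstration and are not the paper's actual rules:

    ```python
    # Toy fuzzy classifier over three saliency-map features
    # (density, pixel distribution, phase congruency). All ranges hypothetical.

    def tri_membership(x, a, b, c):
        """Triangular fuzzy membership: rises over a->b, falls over b->c."""
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)

    def is_text_component(density, distribution, phase_congruency):
        # Degree to which each feature "looks like text" (made-up ranges).
        m_density = tri_membership(density, 0.1, 0.4, 0.8)
        m_dist = tri_membership(distribution, 0.2, 0.5, 0.9)
        m_phase = tri_membership(phase_congruency, 0.3, 0.6, 1.0)
        # Fuzzy AND (minimum t-norm), then a crisp decision threshold.
        score = min(m_density, m_dist, m_phase)
        return score > 0.5, score

    ok, score = is_text_component(0.4, 0.5, 0.6)
    print(ok, score)   # True 1.0
    ```

    The minimum t-norm makes the decision conservative: a component is kept as text only if every feature supports it.
    
    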

  • 19.
    Pandey, Sachi
    et al.
    Department of Computer Science and Engineering, SRM Institute of Science and Technology, Delhi-NCR Campus, India.
    Chouhan, Vikas
    Department of Computer Science and Engineering, Indian Institute of Technology, Roorkee, India.
    Verma, Devanshi
    Department of Computer Science and Engineering, SRM Institute of Science and Technology, Delhi-NCR Campus, India.
    Rajrah, Shubham
    Department of Computer Science and Engineering, SRM Institute of Science and Technology, Delhi-NCR Campus, India.
    Alenezi, Fayadh
    Dept. of Electrical Engineering. College of Engineering, Jouf University, SA.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. Department of Computer Science and Engineering, Indian Institute of Technology, Roorkee, India.
    Santosh, KC
    Applied Artificial Intelligence (2AI) Research Lab - Computer Science, The University of South Dakota, USA.
    Do-It-Yourself Recommender System: Reusing and Recycling with Blockchain and Deep Learning2022In: IEEE Access, E-ISSN 2169-3536, Vol. 10, p. 90056-90067Article in journal (Refereed)
    Abstract [en]

    Due to aggressive urbanization (with growing population size), waste increases exponentially, resulting in environmental damage. Although it looks challenging, this issue can be controlled if we can reuse waste. To handle this, in our work, we design a machine learning and blockchain-oriented system that identifies waste objects/products and recommends to the user multiple ‘Do-It-Yourself’ (DIY) ideas for reuse or recycling. Blockchain records every transaction in the shared ledger to enable transaction verifiability and supports better decision-making. In this study, a Deep Neural Network (DNN) trained on about 11,700 images is developed using the ResNet50 architecture for object recognition (training accuracy of 94%). We deploy several smart contracts on the Hyperledger Fabric (HF) blockchain platform so that recommended DIY ideas can be validated by blockchain network members. HF is a decentralized ledger technology platform that executes the deployed smart contracts in a secured Docker container to initialize and manage the ledger state. The complete model is delivered on a web platform using Flask, where our recommendation system relies on a web-scraping script written in Python. Fetching DIY ideas using web scraping takes nearly 1 second on a desktop machine with an 8-core Intel Core-i7 processor, 16 GB RAM, Ubuntu 18.04 64-bit, and Python 3.6. Further, we evaluate the latencies and throughput of the blockchain-based smart contracts using the Hyperledger Caliper benchmark. To the best of our knowledge, this is the first work that integrates blockchain technology and deep learning for a DIY recommender system.

  • 20.
    Rakesh, Sumit
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Javed, Saleha
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Sign Gesture Recognition from Raw Skeleton Information in 3D Using Deep Learning2021In: Computer Vision and Image Processing: 5th International Conference, CVIP 2020, Prayagraj, India, December 4-6, 2020, Revised Selected Papers, Part II / [ed] Satish Kumar Singh; Partha Roy; Balasubramanian Raman; P. Nagabhushan, Springer Nature, 2021, p. 184-195Conference paper (Refereed)
    Abstract [en]

    Sign Language Recognition (SLR) minimizes the communication gap when interacting with hearing-impaired people, i.e., it connects hearing-impaired persons with those who need to communicate with them but do not understand sign language. This paper focuses on an end-to-end deep learning approach for the recognition of sign gestures recorded with a 3D sensor (e.g., Microsoft Kinect). Typical machine learning based SLR systems require feature extraction before applying machine learning models. These features need to be chosen carefully, as the recognition performance heavily relies on them. Our proposed end-to-end approach eradicates this problem by eliminating the need to extract handmade features. Deep learning models can work directly on raw data and learn higher-level representations (features) by themselves. To test our hypothesis, we have used two recent and promising deep learning models, Gated Recurrent Unit (GRU) and Bidirectional Long Short-Term Memory (BiLSTM), and trained them using only raw data. We have performed a comparative analysis between both models and against the results of the base paper. The conducted experiments show that the proposed method outperforms the existing work, with GRU achieving 70.78% average accuracy with front-view training.

  • 21.
    Rakesh, Sumit
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Computer Science.
    Kovács, György
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Pal, Umapada
    ISI Kolkata, India.
    Static Palm Sign Gesture Recognition with Leap Motion and Genetic Algorithm2021In: 2021 Swedish Artificial Intelligence Society Workshop (SAIS), IEEE, 2021, p. 54-58Conference paper (Refereed)
    Abstract [en]

    Sign gesture recognition is the field that models sign gestures in order to facilitate communication with hearing- and speech-impaired people. Sign gestures are recorded with devices like a video camera or a depth camera. Palm gestures can also be recorded with the Leap motion sensor. In this paper, we address palm sign gesture recognition using the Leap motion sensor. We extract geometric features from Leap motion recordings. Next, we employ a Genetic Algorithm (GA) for feature selection. The genetically selected features are fed to different classifiers for gesture recognition. Here we have used Support Vector Machine (SVM), Random Forest (RF), and Naive Bayes (NB) classifiers for a comparative evaluation. A gesture recognition accuracy of 74.00% is recorded with the RF classifier on the Leap motion sign gesture dataset.
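    GA-based feature selection of the kind described above can be sketched as a bit-mask chromosome evolved against a fitness score; the fitness function below is a stand-in for classifier accuracy, and all parameters are illustrative rather than the paper's:

    ```python
    import random

    # Toy GA for feature selection: each chromosome is a bit mask over
    # N_FEATURES; fitness rewards "useful" features (hypothetical stand-in
    # for cross-validated classifier accuracy) and penalizes extras.

    random.seed(0)
    N_FEATURES = 10
    USEFUL = {1, 3, 5, 7}   # pretend only these features help the classifier

    def fitness(mask):
        hits = sum(1 for i in USEFUL if mask[i])
        extras = sum(mask) - hits
        return hits - 0.1 * extras

    def crossover(a, b):
        cut = random.randrange(1, N_FEATURES)
        return a[:cut] + b[cut:]

    def mutate(mask, rate=0.1):
        return [bit ^ (random.random() < rate) for bit in mask]

    population = [[random.randint(0, 1) for _ in range(N_FEATURES)]
                  for _ in range(20)]
    for _ in range(40):  # generations, with elitism (top half survives)
        population.sort(key=fitness, reverse=True)
        parents = population[:10]
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(10)]
        population = parents + children

    best = max(population, key=fitness)
    selected = sorted(i for i, bit in enumerate(best) if bit)
    print(selected)
    ```

    The selected subset would then be fed to the downstream classifiers (SVM, RF, NB in the paper's setup) for the actual comparison.
    
    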

  • 22.
    Rakesh, Sumit
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chhipa, Prakash Chandra
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Gupta, Vibha
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    De, Kanjar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Kovács, György
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Singh, Dinesh
    Computer Science & Engineering, DCRUST, Murthal, Sonepat, India.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. Department of CSE, IIT Roorkee, Roorkee, India.
    Emotions Classification Using EEG in Health Care2023In: Computer Vision and Machine Intelligence: Proceedings of CVMI 2022 / [ed] Tistarelli, Massimo; Dubey, Shiv Ram; Singh, Satish Kumar; Jiang, Xiaoyi, Springer Nature, 2023, p. 37-49Conference paper (Refereed)
  • 23.
    Roy, Partha Pratim
    et al.
    Department of Computer Science and Engineering, IIT Roorkee, Roorkee, India.
    Kumar, Pradeep
    Department of Computer Science and Engineering, IIT Roorkee, Roorkee, India.
    Patidar, Shweta
    Department of Computer Science and Engineering, IIT Roorkee, Roorkee, India.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    3D word spotting using leap motion sensor2021In: Multimedia tools and applications, ISSN 1380-7501, E-ISSN 1573-7721, Vol. 80, no 8, p. 11671-11689Article in journal (Refereed)
    Abstract [en]

    The Leap motion sensor provides a new way of interacting with computers and mobile devices. With this sensor, users can write in the air by moving the palm or a finger, thus avoiding traditional pen and paper. The strokes of air-writing, or 3D writing, differ from the conventional way of writing: in 3D writing, the words are connected by continuous lines instead of being separated by spaces. Also, the arbitrary size of characters and the frequent jitters in strokes make recognizing such words and sentences difficult. To understand the semantics of a word without recognizing each of its characters, an alternative process called “word spotting” is used. Word spotting is often more useful than conventional recognition systems for understanding complex handwriting. Hence, we propose a novel word-spotting methodology for 3D text using Leap motion sensor data. Spotting/detection of a keyword in 3D sentences is carried out using a Hidden Markov Model (HMM) framework. In our experimental study, an average Mean-Average-Precision (MAP) of 41.7 is recorded. The efficiency of the system is demonstrated by comparison with a traditional segmentation-based system. The improved performance shows that the system could be used to develop novel applications in the Human-Computer-Interaction (HCI) domain.
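    The HMM machinery behind such keyword spotting can be sketched with a minimal Viterbi decoder over a two-state (keyword vs. filler) model; the states, probabilities, and observation symbols below are toy values, not the paper's trained models:

    ```python
    import math

    # Minimal log-space Viterbi decoder: find the most likely keyword/filler
    # state sequence for an observation string. All model values are invented.

    states = ["keyword", "filler"]
    start = {"keyword": 0.5, "filler": 0.5}
    trans = {"keyword": {"keyword": 0.7, "filler": 0.3},
             "filler": {"keyword": 0.2, "filler": 0.8}}
    emit = {"keyword": {"a": 0.6, "b": 0.4},
            "filler": {"a": 0.1, "b": 0.9}}

    def viterbi(observations):
        v = [{s: math.log(start[s]) + math.log(emit[s][observations[0]])
              for s in states}]
        back = []
        for obs in observations[1:]:
            col, ptr = {}, {}
            for s in states:
                prev, score = max(
                    ((p, v[-1][p] + math.log(trans[p][s])) for p in states),
                    key=lambda kv: kv[1])
                col[s] = score + math.log(emit[s][obs])
                ptr[s] = prev
            v.append(col)
            back.append(ptr)
        last = max(states, key=lambda s: v[-1][s])
        path = [last]
        for ptr in reversed(back):   # backtrack through the pointers
            path.append(ptr[path[-1]])
        return list(reversed(path))

    print(viterbi(["a", "a", "b", "b"]))
    # ['keyword', 'keyword', 'filler', 'filler']
    ```

    In a real spotter, the "keyword" state would itself be a left-to-right HMM over the keyword's stroke features, and segments decoded into it would be reported as detections.
    
    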

  • 24.
    Saini, Rajkumar
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Dobson, Derek
    FamilySearch, USA.
    Morrey, Jon
    FamilySearch, USA.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records2019In: The 15th IAPR International Conference on Document Analysis and Recognition: ICDAR 2019, Piscataway, New Jersey, USA: IEEE, 2019, p. 1499-1504Conference paper (Refereed)
    Abstract [en]

    In this paper, we present a large historical database of Chinese family records with the aim to develop robust systems for historical document analysis. In this direction, we propose a Historical Document Reading Challenge on Large Chinese Structured Family Records (ICDAR 2019 HDRC-CHINESE). The objective of the competition is to recognize and analyze the layout, and finally to detect and recognize the textlines and characters of the large historical document image dataset containing more than 10000 pages. Cascade R-CNN, CRNN, and U-Net based architectures were trained to evaluate the performance on these tasks. An error rate of 0.01 has been recorded for textline recognition (Task 1), whereas a Jaccard index of 99.54% has been recorded for layout analysis (Task 2). A graph-edit-distance-based total error ratio of 1.5% has been recorded for complete integrated textline detection and recognition (Task 3).

  • 25.
    Saini, Rajkumar
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Kumar, Pradeep
    IIT Roorkee, India.
    Patidar, Shweta
    IIT Roorkee, India.
    Roy, Partha
    IIT Roorkee, India.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Trilingual 3D Script Identification and Recognition using Leap Motion Sensor2019In: 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), IEEE, 2019, Vol. 5, p. 24-28Conference paper (Other academic)
    Abstract [en]

    Recently, the development of depth sensing technologies such as the Leap motion and Microsoft Kinect sensors has facilitated a touch-less environment for interacting with computers and mobile devices. Several studies have been carried out on air-written text recognition with the help of these devices. However, there are several countries (like India) where multiple scripts are used to write official languages. Therefore, for the development of an effective text recognition system, the script of the text has to be identified first. The task becomes more challenging when it comes to 3D handwriting, since 3D text written in air consists of a single stroke only. This paper presents a 3D script identification and recognition system for text written in three languages, namely, Hindi, English, and Punjabi, using the Leap motion sensor. In the first stage, script identification was carried out for one of the three languages. Next, a Hidden Markov Model (HMM) was used to recognize the words. An accuracy of 96.4% was recorded in script identification, whereas accuracies of 72.99%, 73.25%, and 60.5% were recorded in word recognition for the Hindi, English, and Punjabi scripts, respectively.

  • 26.
    Saini, Rajkumar
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Prabhu, Sameer
    Data Ductus AB, Luleå, Sweden.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chhipa, Prakash Chandra
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Imagined Object Recognition Using EEG-Based Neurological Brain Signals2022In: Recent Trends in Image Processing and Pattern Recognition (RTIP2R 2021) / [ed] KC Santosh, Ravindra Hegadi, Umapada Pal, Springer, 2022, p. 305-319Conference paper (Refereed)
    Abstract [en]

    Researchers have been using Electroencephalography (EEG) to build Brain-Computer Interface (BCI) systems. They have had a lot of success modeling brain signals for applications including emotion detection, user identification, authentication, and control. The goal of this study is to employ EEG-based neurological brain signals to recognize imagined objects. The user imagines an object after looking at it on the monitor screen, and the EEG signal is recorded while the user thinks about the object. These EEG signals were processed using signal processing methods, and machine learning algorithms were trained to classify them. The study involves coarse- and fine-level EEG signal classification. The coarse-level classification categorizes the signals into three classes (Char, Digit, Object), whereas the fine-level classification categorizes the EEG signals into 30 classes. Recognition rates of 97.30% and 93.64% were recorded for coarse- and fine-level classification, respectively. Experiments indicate that the proposed work outperforms previous methods.

  • 27.
    Shirkhani, Shaghayegh
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Digital Services and Systems.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chai, Hum Yan
    Department of Mechatronics and Biomedical Engineering, Universiti Tunku Abdul Rahman, Selangor, Malaysia.
    Study of AI-Driven Fashion Recommender Systems2023In: SN Computer Science, ISSN 2662-995X, Vol. 4, no 5, article id 514Article in journal (Refereed)
    Abstract [en]

    The rising diversity, volume, and pace of fashion manufacturing pose a considerable challenge in the fashion industry, making it difficult for customers to pick which product to purchase. In addition, fashion is an inherently subjective, cultural notion and an ensemble of clothing items that maintains a coherent style. In most of the domains in which recommender systems are developed (e.g., movies, e-commerce, etc.), similarity evaluation is considered for recommendation. In the fashion domain, instead, compatibility is a critical factor. In addition, the raw visual features of product representations, which contribute most to algorithm performance in the fashion domain, are distinguishable from the product metadata used in other domains. This literature review summarizes various Artificial Intelligence (AI) techniques that have lately been used in recommender systems for the fashion industry. AI enables higher-quality recommendations than earlier approaches. This has ushered in a new age for recommender systems, allowing for deeper insights into user-item relationships and representations and the discovery of patterns in demographical, textual, virtual, and contextual data. This work seeks to give a deeper understanding of the fashion recommender system domain by performing a comprehensive literature study of research on this topic in the past 10 years, focusing on image-based fashion recommender systems and taking AI improvements into account. The nuanced conceptions of this domain and their relevance have been developed to justify fashion domain-specific characteristics.

  • 28.
    Simistira Liwicki, Foteini
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Gupta, Vibha
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    De, Kanjar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Abid, Nosheen
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Wellington, Scott
    University of Bath, Department of Computer Science, Bath, UK.
    Wilson, Holly
    University of Bath, Department of Computer Science, Bath, UK.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Eriksson, Johan
    Umeå University, Department of Integrative Medical Biology (IMB) and Umeå Center for Functional Brain Imaging (UFBI), Umeå, Sweden.
    Bimodal electroencephalography-functional magnetic resonance imaging dataset for inner-speech recognition2023In: Scientific Data, E-ISSN 2052-4463, Vol. 10, article id 378Article in journal (Refereed)
    Abstract [en]

    The recognition of inner speech, which could give a ‘voice’ to patients who have no ability to speak or move, is a challenge for brain-computer interfaces (BCIs). A shortcoming of the available datasets is that they do not combine modalities to increase the performance of inner speech recognition. Multimodal datasets of brain data enable the fusion of neuroimaging modalities with complementary properties, such as the high spatial resolution of functional magnetic resonance imaging (fMRI) and the high temporal resolution of electroencephalography (EEG), and are therefore promising for decoding inner speech. This paper presents the first publicly available bimodal dataset containing EEG and fMRI data acquired nonsimultaneously during inner-speech production. Data were obtained from four healthy, right-handed participants during an inner-speech task with words in either a social or numerical category. Each of the 8 word stimuli was assessed in 40 trials, resulting in 320 trials in each modality for each participant. The aim of this work is to provide a publicly available bimodal dataset on inner speech, contributing towards speech prostheses.

  • 29.
    Simistira Liwicki, Foteini
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Gupta, Vibha
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    De, Kanjar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Abid, Nosheen
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Wellington, Scott
    Department of Computer Science, University of Bath, United Kingdom.
    Wilson, Holly
    Department of Computer Science, University of Bath, United Kingdom.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Eriksson, Johan
    Department of Integrative Medical Biology (IMB), Umeå University, Sweden.
    Bimodal pilot study on inner speech decoding reveals the potential of combining EEG and fMRIManuscript (preprint) (Other academic)
    Abstract [en]

    This paper presents the first publicly available bimodal electroencephalography (EEG) / functional magnetic resonance imaging (fMRI) dataset and an open source benchmark for inner speech decoding. Decoding inner speech or thought (expressed through an inner voice without actual speaking) is a challenge, with typical results close to chance level. The dataset comprises 1280 trials (4 subjects, 8 stimuli = 2 categories * 4 words, and 40 trials per stimulus) in each modality. For binary classification, the pilot study reports a mean accuracy of 71.72% when combining the two modalities (EEG and fMRI), compared to 62.81% and 56.17% when using EEG or fMRI alone, respectively. The same improvement in performance can be observed for word classification (8 classes): 30.29% with the combination, versus 22.19% and 17.50% without. As such, this paper demonstrates that combining EEG with fMRI is a promising direction for inner speech decoding.
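    Combining two modalities as reported above can, in its simplest form, be a late fusion of per-class probabilities; the sketch below uses made-up probability vectors and an equal-weight average, which is only one of several possible fusion schemes and not necessarily the paper's:

    ```python
    # Toy late fusion: weighted average of class-probability vectors from
    # an EEG classifier and an fMRI classifier. All numbers are illustrative.

    def fuse(p_eeg, p_fmri, w_eeg=0.5):
        """Weighted average of two probability distributions over classes."""
        return [w_eeg * a + (1 - w_eeg) * b for a, b in zip(p_eeg, p_fmri)]

    p_eeg = [0.60, 0.40]    # EEG alone: leans towards class 0
    p_fmri = [0.45, 0.55]   # fMRI alone: weakly leans towards class 1
    p = fuse(p_eeg, p_fmri)
    predicted = max(range(len(p)), key=p.__getitem__)
    print(predicted)        # the more confident modality wins here
    ```

    The intuition for why fusion can beat either modality alone is that EEG and fMRI errors are only weakly correlated, so averaging suppresses modality-specific noise.
    
    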

  • 30.
    Upadhyay, Richa
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chhipa, Prakash Chandra
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Phlypo, Ronald
    GIPSA-lab, Université Grenoble Alpes, CNRS, Grenoble INP, Grenoble, 38000, France.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Multi-Task Meta Learning: learn how to adapt to unseen tasksManuscript (preprint) (Other academic)
    Abstract [en]

    This work proposes Multi-task Meta Learning (MTML), integrating two learning paradigms, Multi-Task Learning (MTL) and meta learning, to bring together the best of both worlds. In particular, it focuses on the simultaneous learning of multiple tasks, an element of MTL, and on promptly adapting to new tasks with fewer data, a quality of meta learning. It is important to highlight that we focus on heterogeneous tasks, which are of distinct kinds, in contrast to the typically considered homogeneous tasks (e.g., all tasks being classification, or all being regression). The fundamental idea is to train a multi-task model such that, when an unseen task is introduced, it can learn in fewer steps while offering a performance at least as good as conventional single-task learning on the new task, or its inclusion within the MTL. By conducting various experiments, we demonstrate this paradigm on two datasets and four tasks: NYU-v2 and the taskonomy dataset, for which we perform semantic segmentation, depth estimation, surface normal estimation, and edge detection. MTML achieves state-of-the-art results for most of the tasks. Although semantic segmentation suffers quantitatively, our MTML method learns to identify segmentation classes absent from the pseudo-labelled ground truth of the taskonomy dataset.

  • 31.
    Upadhyay, Richa
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chhipa, Prakash Chandra
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Phlypo, Ronald
    Université Grenoble Alpes.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Multi-Task Meta Learning: learn how to adapt to unseen tasks2023In: IJCNN 2023 - International Joint Conference on Neural Networks, Conference Proceedings, Institute of Electrical and Electronics Engineers Inc. , 2023Conference paper (Refereed)
    Abstract [en]

    This work proposes Multi-task Meta Learning (MTML), integrating two learning paradigms, Multi-Task Learning (MTL) and meta learning, to bring together the best of both worlds. In particular, it focuses on the simultaneous learning of multiple tasks, an element of MTL, and on promptly adapting to new tasks, a quality of meta learning. It is important to highlight that we focus on heterogeneous tasks, which are of distinct kinds, in contrast to the typically considered homogeneous tasks (e.g., all tasks being classification, or all being regression). The fundamental idea is to train a multi-task model such that, when an unseen task is introduced, it can learn in fewer steps while offering a performance at least as good as conventional single-task learning on the new task, or its inclusion within the MTL. By conducting various experiments, we demonstrate this paradigm on two datasets and four tasks: NYU-v2 and the taskonomy dataset, for which we perform semantic segmentation, depth estimation, surface normal estimation, and edge detection. MTML achieves state-of-the-art results for three out of four tasks on the NYU-v2 dataset and two out of four on the taskonomy dataset. On the taskonomy dataset, it was discovered that many pseudo-labeled segmentation masks lacked classes that were expected to be present in the ground truth; our MTML approach was found to be effective in detecting these missing classes, delivering good qualitative results, although its quantitative performance was affected by the incorrect ground-truth labels. The source code for reproducibility can be found at https://github.com/ricupa/MTML-learn-how-to-adapt-to-unseen-tasks.

  • 32.
    Upadhyay, Richa
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Phlypo, Ronald
    University of Grenoble Alpes, France.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Less is More: Towards parsimonious multi-task models using structured sparsity, 2024. In: Proceedings of Machine Learning Research, PMLR / [ed] Yuejie Chi, Gintare Karolina Dziugaite, Qing Qu, Atlas Wang, Zhihui Zhu, Proceedings of Machine Learning Research, 2024, Vol. 234, p. 590-601. Conference paper (Refereed)
    Abstract [en]

    Model sparsification in deep learning promotes simpler, more interpretable models with fewer parameters. This not only reduces the model’s memory footprint and computational needs but also shortens inference time. This work focuses on creating sparse models optimized for multiple tasks with fewer parameters. These parsimonious models also have the potential to match or outperform dense models in performance. In this work, we introduce channel-wise l1/l2 group sparsity in the shared convolutional layers’ parameters (or weights) of the multi-task learning model. This approach facilitates the removal of extraneous groups, i.e., channels (due to the l1 regularization), and also imposes a penalty on the weights, further enhancing the learning efficiency of all tasks (due to the l2 regularization). We analyzed the results of group sparsity in both single-task and multi-task settings on two widely used multi-task learning datasets: NYU-v2 and CelebAMask-HQ. On both datasets, each of which comprises three different computer vision tasks, multi-task models with approximately 70% sparsity outperform their dense equivalents. We also investigate how changing the degree of sparsification influences the model’s performance, the overall sparsity percentage, the patterns of sparsity, and the inference time.
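    The channel-wise l1/l2 group sparsity described in this abstract is a group-lasso-style penalty: an l1 sum over the l2 norms of each output channel's weights, which drives whole channels to exactly zero so they can be pruned. A minimal pure-Python sketch of such a penalty follows; the function name, the toy weight tensor, and the value of lam are illustrative assumptions, not taken from the paper's code.

    ```python
    import math

    def group_sparsity_penalty(weights, lam=0.1):
        """Channel-wise l1/l2 (group-lasso) penalty for one conv layer.

        `weights` is a nested list of shape (out_channels, in_channels,
        k, k); each output channel forms one group. The penalty is the
        l1 sum of the per-group l2 norms, so entire channels are pushed
        to exactly zero and can be removed.
        """
        penalty = 0.0
        for channel in weights:  # one group per output channel
            sq = sum(w * w
                     for in_ch in channel
                     for row in in_ch
                     for w in row)
            penalty += math.sqrt(sq)  # l2 norm of the group
        return lam * penalty

    # Toy layer: 2 output channels, 1 input channel, 1x2 kernels.
    W = [
        [[[3.0, 4.0]]],  # active channel, l2 norm 5
        [[[0.0, 0.0]]],  # zeroed (prunable) channel, l2 norm 0
    ]
    print(group_sparsity_penalty(W, lam=0.1))  # 0.5
    ```

    In practice such a term would be added to the multi-task training loss, with the regularization strength lam controlling the degree of sparsification that the abstract reports varying.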

  • 33.
    Upadhyay, Richa
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Phlypo, Ronald
    GIPSA-lab, Université Grenoble Alpes, CNRS, Grenoble INP, Grenoble, 38000, France.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Sharing to learn and learning to share: Fitting together Meta-Learning, Multi-Task Learning, and Transfer Learning: A meta review. Manuscript (preprint) (Other academic)
    Abstract [en]

    Integrating knowledge across different domains is an essential feature of human learning. Learning paradigms such as transfer learning, meta learning, and multi-task learning reflect this process by exploiting prior knowledge for new tasks, encouraging faster learning and better generalization. This article gives a detailed view of these learning paradigms and a comparative analysis of them. The weakness of one learning algorithm often turns out to be a strength of another, so merging them is a prevalent trait in the literature. Numerous research papers focus on each of these learning paradigms separately and provide comprehensive overviews of them; this article, in contrast, reviews studies that combine (two of) these learning algorithms. The survey describes how these techniques are combined to solve problems in many different fields of study, including computer vision, natural language processing, hyperspectral imaging, and many more, in supervised settings only. As a result, a global generic learning network, an amalgamation of meta learning, transfer learning, and multi-task learning, is introduced here, along with some open research questions and future research directions in the multi-task setting.

  • 34.
    Xie, Yejing
    et al.
    Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, 44000, Nantes, France.
    Mouchère, Harold
    Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, 44000, Nantes, France.
    Simistira Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Nakagawa, Masaki
    Tokyo University of Agriculture and Technology, Fuchu, Japan.
    Nguyen, Cuong Tuan
    FPT University, Hanoi, Vietnam.
    Truong, Thanh-Nghia
    Tokyo University of Agriculture and Technology, Fuchu, Japan.
    ICDAR 2023 CROHME: Competition on Recognition of Handwritten Mathematical Expressions, 2023. In: Document Analysis and Recognition - ICDAR 2023, Part II / [ed] Gernot A. Fink, Rajiv Jain, Koichi Kise & Richard Zanibbi, Springer, 2023, p. 553-565. Conference paper (Refereed)
    Abstract [en]

    This paper overviews the 7th edition of the Competition on Recognition of Handwritten Mathematical Expressions. ICDAR 2023 CROHME proposes three tasks with three different modalities: on-line, off-line, and bimodal. 3905 newly collected handwritten equations form new training, validation, and test sets for the two modalities. The complete training set extends the previous CROHME training set with complementary off-line samples (from the OffRaSHME competition) and generated on-line samples. The evaluation follows the same protocol as previous CROHME editions, allowing a fair comparison with earlier results. This competition allows, for the first time, the comparison of on-line and off-line systems on the same test set. Six participating teams were evaluated. The same team won all three tasks, with an expression recognition rate of more than 80%.
