Subword Semantic Hashing for Intent Classification on Small Datasets
MindGarage, Technical University Kaiserslautern, Germany.
MindGarage, Technical University Kaiserslautern, Germany.
MindGarage, Technical University Kaiserslautern, Germany.
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0003-0100-4030
2019 (English). In: 2019 International Joint Conference on Neural Networks (IJCNN), IEEE, 2019, article id N-19329. Conference paper, Published paper (Other academic)
Abstract [en]

In this paper, we introduce the use of Semantic Hashing as an embedding for the task of Intent Classification and achieve state-of-the-art performance on three frequently used benchmarks. Intent Classification on a small dataset is a challenging task for data-hungry state-of-the-art Deep Learning based systems. Semantic Hashing is an attempt to overcome this challenge and learn robust text classification. Current word embedding based methods [11], [13], [14] depend on vocabularies. One of their major drawbacks is out-of-vocabulary terms, especially when the training dataset is small and the vocabulary in use is wide. This is the case in Intent Classification for chatbots, where typically small datasets are extracted from internet communication. Two problems arise with the use of internet communication. First, such datasets lack many of the vocabulary terms needed to use word embeddings efficiently. Second, users frequently make spelling errors. Typically, models for intent classification are not trained with spelling errors, and it is difficult to anticipate the ways in which users will make mistakes. Models that depend on a word vocabulary will always face such issues. An ideal classifier should handle spelling errors inherently. With Semantic Hashing, we overcome these challenges and achieve state-of-the-art results on three datasets: Chatbot, Ask Ubuntu, and Web Applications [3]. Our benchmarks are available online.
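
The abstract does not spell out the hashing step, so the following is only a minimal sketch of why subword features tolerate spelling errors. It assumes the common letter-trigram formulation, in which each word is wrapped in `#` boundary markers and split into overlapping three-character pieces. The function names `subword_hash` and `overlap` are hypothetical and are not taken from the paper.

```python
def subword_hash(text: str, n: int = 3) -> list[str]:
    """Wrap each word in '#' markers and return its character n-grams.

    Assumption: letter trigrams with '#' boundaries; the paper's exact
    pre-processing may differ.
    """
    tokens = []
    for word in text.lower().split():
        marked = f"#{word}#"
        # Slide a window of width n across the marked word.
        tokens.extend(marked[i:i + n] for i in range(len(marked) - n + 1))
    return tokens


def overlap(a: str, b: str) -> float:
    """Jaccard overlap between the subword sets of two strings."""
    sa, sb = set(subword_hash(a)), set(subword_hash(b))
    union = sa | sb
    # Two empty inputs share nothing; avoid dividing by zero.
    return len(sa & sb) / len(union) if union else 0.0


if __name__ == "__main__":
    print(subword_hash("hello"))
    # A misspelling still shares several trigrams with the intended word,
    # whereas a word-level vocabulary would treat it as unknown.
    print(overlap("calendar", "calender"))
```

A word-level vocabulary sees "calender" as an out-of-vocabulary token with no relation to "calendar", while the trigram sets of the two spellings still intersect. The resulting tokens could then be fed to any bag-of-features vectorizer and classifier; that choice is left open here.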

Place, publisher, year, edition, pages
IEEE, 2019. article id N-19329
Series
International Joint Conference on Neural Networks (IJCNN), ISSN 2161-4407, E-ISSN 2161-4393
Keywords [en]
Natural Language Processing, Intent Classification, Chatbots, Semantic Hashing, Machine Learning, State-of-the-art
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-76841
DOI: 10.1109/IJCNN.2019.8852420
Scopus ID: 2-s2.0-85073258046
OAI: oai:DiVA.org:ltu-76841
DiVA, id: diva2:1372648
Conference
2019 International Joint Conference on Neural Networks (IJCNN), 14-19 July, 2019, Budapest, Hungary
Note

ISBN for host publication: 978-1-7281-1985-4, 978-1-7281-1986-1

Available from: 2019-11-25 Created: 2019-11-25 Last updated: 2022-10-31 Bibliographically approved
In thesis
1. Faster and More Resource-Efficient Intent Classification
2020 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Intent classification is known to be a complex problem in Natural Language Processing (NLP) research. It is one of the stepping stones toward machines that can understand our language. Several models have recently appeared to tackle the problem, and deep learning models have brought a solution within reach, but they have not yet achieved the goal. Moreover, the energy and computational resource requirements of these modern models (especially deep learning ones) are very high. Energy and resource use should be kept to a minimum so that the models can be deployed efficiently on resource-constrained devices. Furthermore, these resource savings will help to minimize the environmental impact of NLP.

This thesis considers two main questions. First, which deep learning model is optimal for intent classification? That is, which model can most accurately infer the intent of a written piece of text (here, the intent to be inferred is hate speech) in a short-text environment? Second, can we make intent classification models simpler and more resource-efficient than deep learning?

Concerning the first question, the work here shows that intent classification in written language is still a complex problem for modern models, even though deep learning has shown successful results in every area to which it has been applied. The work identifies the model that performed best on short texts. Concerning the second question, the work shows that results similar to those of deep learning models can be achieved with more straightforward solutions, namely by combining classical machine learning models, pre-processing techniques, and a hyperdimensional computing approach.

This thesis presents research on a more resource-efficient machine learning approach to intent classification. It first establishes a high baseline using tweets containing hate speech and one of the best deep learning models currently available (RoBERTa, as an example). It then shows the steps taken to arrive at the final model, which uses hyperdimensional computing to minimize the required resources. This model can help make intent classification faster and more resource-efficient by trading a few performance points for the resource savings. The proposed model, called "hyperembed", is inspired by hyperdimensional computing and demonstrates the capabilities of that paradigm. With resource efficiency in mind, the proposed models were tested on intent classification of short texts: tweets (for hate speech, where the intent is either to offend or not) and questions posed to chatbots.

In summary, the work proposed here covers two aspects. First, deep learning models have an advantage in performance when sufficient data are available, but they tend to fail when the amount of available data is insufficient. In contrast, the proposed models work well even on small datasets. Second, deep learning models require substantial resources to train and run, whereas the models proposed here aim to trade off the computational resources spent on obtaining and running the model against its classification performance.

Place, publisher, year, edition, pages
Luleå, Sweden: Luleå University of Technology, 2020. p. 86
Series
Licentiate thesis / Luleå University of Technology, ISSN 1402-1757
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-81178 (URN)
978-91-7790-689-6 (ISBN)
978-91-7790-690-2 (ISBN)
Presentation
2020-12-18, A3580, Luleå, 09:00 (English)
Available from: 2020-10-19 Created: 2020-10-19 Last updated: 2020-11-27 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Grund Pihlgren, Gustav; Alonso, Pedro; Kovács, György; Simistira, Foteini; Liwicki, Marcus
