Optimizing Deep Neural Networks for Classification of Short Texts
Pettersson, Fredrik
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Computer Science.
2019 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

This master's thesis investigates how a state-of-the-art (SOTA) deep neural network (NN) model can be created for a specific natural language processing (NLP) dataset, how different dimensionality reduction techniques affect common pre-trained word embeddings, and how well such a model generalizes to a secondary dataset. The research is motivated by two factors. First, a machine learning (ML) text classification (TC) model is typically constructed around a specific dataset and often requires a lot of manual intervention, so it is hard to know in advance which procedures to implement for a given dataset and how they will affect the result. Second, if the dimensionality of pre-trained embedding vectors can be lowered without losing accuracy, the execution time saved can be spent on other techniques to achieve even higher accuracy.

A handful of deep neural network architectures are used, namely a convolutional neural network (CNN), a long short-term memory network (LSTM) and a bidirectional LSTM (Bi-LSTM). These architectures are combined with four different pre-trained word embeddings: GoogleNews-vectors-negative300, glove.840B.300d, paragram_300_sl999 and wiki-news-300d-1M.

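The thesis itself includes no code here, but as a rough sketch only, a Bi-LSTM classifier over frozen pre-trained embeddings of the kind described above could be set up in Keras along the following lines. The vocabulary size, sequence length, layer width and the zero-filled embedding matrix are placeholder assumptions, not values from the thesis:

import numpy as np
from tensorflow.keras import layers, models, initializers

# Hypothetical sizes; the thesis uses 300-dimensional pre-trained embeddings.
vocab_size, seq_len, embed_dim = 50_000, 70, 300
# In practice this matrix would be filled from GloVe, paragram, etc.
embedding_matrix = np.zeros((vocab_size, embed_dim), dtype="float32")

model = models.Sequential([
    layers.Input(shape=(seq_len,)),
    layers.Embedding(vocab_size, embed_dim,
                     embeddings_initializer=initializers.Constant(embedding_matrix),
                     trainable=False),      # keep the pre-trained vectors frozen
    layers.Bidirectional(layers.LSTM(64)),  # read each sequence in both directions
    layers.Dense(1, activation="sigmoid"),  # binary decision for short-text classification
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
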
Three main experiments are conducted in this thesis. In the first experiment, a top-performing TC model is created for a recent NLP competition held at Kaggle.com. Each implemented procedure is benchmarked on how the accuracy and execution time of the model is affected. In the second experiment, principal component analysis (PCA) and random projection (RP) are applied to the pre-trained word embeddings used in the top-performing model to investigate how the accuracy and execution time is affected when creating lower-dimensional embedding vectors. In the third experiment, the same model is benchmarked on a separate dataset (Sentiment140) to investigate how well it generalizes on other data and how each implemented procedure affects the accuracy compared to on the original dataset.

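To make the second experiment concrete: reducing a 300-dimensional embedding matrix to 100 dimensions (a 66% reduction, matching the figure reported in the results) with PCA and with random projection might look roughly like this in scikit-learn. The random matrix below merely stands in for real embedding vectors:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.random_projection import GaussianRandomProjection

# Stand-in for a real embedding matrix: one 300-dim vector per vocabulary word.
embeddings = np.random.rand(50_000, 300).astype("float32")

# PCA: project onto the directions of highest variance (300 -> 100 dims).
pca_vectors = PCA(n_components=100).fit_transform(embeddings)

# Random projection: multiply by a random Gaussian matrix instead.
rp_vectors = GaussianRandomProjection(n_components=100).fit_transform(embeddings)

print(pca_vectors.shape, rp_vectors.shape)  # (50000, 100) (50000, 100)
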
The first experiment results in a bidirectional LSTM model and a combination of the three embeddings: glove, paragram and wiki-news concatenated together. The model is able to give predictions with an F1 score of 71% which is good enough to reach 9th place out of 1,401 participating teams in the competition. In the second experiment, the execution time is improved by 13%, by using PCA, while lowering the dimensionality of the embeddings by 66% and only losing half a percent of F1 accuracy. RP gave a constant accuracy of 66-67% regardless of the projected dimensions compared to over 70% when using PCA. In the third experiment, the model gained around 12% accuracy from the initial to the final benchmarks, compared to 19% on the competition dataset. The best-achieved accuracy on the Sentiment140 dataset is 86% and thus higher than the 71% achieved on the Quora dataset.

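The embedding combination in the winning configuration is plain vector concatenation: since all three embeddings are 300-dimensional, each word ends up with one 900-dimensional vector. A minimal illustration with made-up vectors:

import numpy as np

rng = np.random.default_rng(0)

# Made-up stand-ins for one word's vectors in glove, paragram and wiki-news.
glove_vec = rng.standard_normal(300)
paragram_vec = rng.standard_normal(300)
wiki_vec = rng.standard_normal(300)

# Concatenation along the feature axis gives one 900-dimensional vector per word.
combined = np.concatenate([glove_vec, paragram_vec, wiki_vec])
print(combined.shape)  # (900,)
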
Place, publisher, year, edition, pages
2019, p. 60
Keywords [en]
Machine learning, Text classification, state-of-the-art, SOTA, Deep neural network, Dimensionality reduction, Word embeddings
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:ltu:diva-76811
OAI: oai:DiVA.org:ltu-76811
DiVA, id: diva2:1372086
Subject / course
Student thesis, at least 30 credits
Educational program
Computer Science and Engineering, master's level
Available from: 2019-11-22. Created: 2019-11-21. Last updated: 2019-11-22. Bibliographically approved.

Open Access in DiVA

fulltext (1482 kB), 14 downloads
File information
File name: FULLTEXT01.pdf
File size: 1482 kB
Checksum (SHA-512): b3775533544658a7ec94d76c4f97085f8a6647dc7d836ac7af9a64402c14e63f60a5550212217f37a5f5819ecb78f72221b14a1c604a18372cfc83700e0593a1
Type: fulltext
Mimetype: application/pdf

