Random indexing of multi-dimensional data
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0001-5662-825X
SICS Swedish ICT, SE-722 13 Västerås.
SICS Swedish ICT, SE-164 29 Kista.
Number of Authors: 3
2017 (English). In: Knowledge and Information Systems, ISSN 0219-1377, E-ISSN 0219-3116, Vol. 52, no 1, 267-290 p. Article in journal (Refereed), Published
Abstract [en]

Random indexing (RI) is a lightweight dimension reduction method, used for example to approximate vector-semantic relationships in online natural language processing systems. Here we generalise RI to multi-dimensional arrays and thereby enable approximation of higher-order statistical relationships in data. The generalised method is a sparse implementation of random projections, which is also the theoretical basis for ordinary RI and other randomisation approaches to dimensionality reduction and data representation. We present numerical experiments which demonstrate that a multi-dimensional generalisation of RI is feasible, including comparisons with ordinary RI and principal component analysis (PCA). The RI method is well suited for online processing of data streams because relationship weights can be updated incrementally in a fixed-size distributed representation, and inner products can be approximated on the fly at low computational cost. An open source implementation of generalised RI is provided.
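
To make the incremental update and the inner-product approximation concrete, the following is a minimal sketch of ordinary (one-dimensional) RI only. It is not the generalised method of the article and not the authors' open source implementation. The class name `RandomIndex`, the helper `index_vector`, the dimensionality (1000), the number of non-zero entries (8), and the normalisation by the number of non-zero entries are all assumptions chosen for illustration.

```python
import numpy as np


def index_vector(rng, dim, nonzeros):
    """Sparse ternary index vector: half of the non-zero entries are +1, half -1."""
    v = np.zeros(dim)
    positions = rng.choice(dim, size=nonzeros, replace=False)
    v[positions[: nonzeros // 2]] = 1.0
    v[positions[nonzeros // 2:]] = -1.0
    return v


class RandomIndex:
    """Hypothetical fixed-size store of (item, feature) relationship weights."""

    def __init__(self, dim=1000, nonzeros=8, seed=0):
        self.dim = dim
        self.nonzeros = nonzeros
        self.rng = np.random.default_rng(seed)
        self.index = {}  # feature -> fixed random index vector
        self.state = {}  # item -> accumulated vector of length dim

    def _index(self, feature):
        # Each feature gets one random index vector, created on first use.
        if feature not in self.index:
            self.index[feature] = index_vector(self.rng, self.dim, self.nonzeros)
        return self.index[feature]

    def update(self, item, feature, weight=1.0):
        # Incremental update: add the weighted index vector to the item's state.
        vec = self.state.setdefault(item, np.zeros(self.dim))
        vec += weight * self._index(feature)

    def weight(self, item, feature):
        # Approximate the accumulated weight with a normalised inner product.
        return float(self.state[item] @ self._index(feature)) / self.nonzeros


ri = RandomIndex()
ri.update("cat", "purrs", 3.0)
ri.update("cat", "meows", 2.0)
ri.update("dog", "barks", 5.0)
print(ri.weight("cat", "purrs"))  # close to 3, up to noise from overlapping index vectors
print(ri.weight("cat", "barks"))  # close to 0
```

In this sketch the state kept per item has a fixed length regardless of how many distinct features are seen, and the recovered weights are approximate because index vectors of different features may overlap by chance.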

Place, publisher, year, edition, pages
Springer, 2017. Vol. 52, no 1, 267-290 p.
Keyword [en]
Data mining, random embeddings, dimensionality reduction, sparse coding, semantic similarity, streaming algorithm, natural language processing
National Category
Computer Science
Research subject
Industrial Electronics
Identifiers
URN: urn:nbn:se:ltu:diva-60658
DOI: 10.1007/s10115-016-1012-2
ISI: 000405223500009
OAI: oai:DiVA.org:ltu-60658
DiVA: diva2:1049308
Funder
The Kempe Foundations, GÖF
Note

Validated; 2017; Level 2; 2017-06-15 (andbra)

Available from: 2016-11-24 Created: 2016-11-24 Last updated: 2017-11-24 Bibliographically approved

Open Access in DiVA

fulltext (1872 kB), 151 downloads
File information
File name: FULLTEXT01.pdf
File size: 1872 kB
Checksum (SHA-512): 326eac0dd7e060d4ebb8e69c8d94d2637f14ee7a7eaf74af07cb0ab31e66f30bec4991a0098c3a57377659a4d6c40f6c8bb9949bed3169054b688ba2ef818c81
Type: fulltext
Mimetype: application/pdf

By author/editor
Sandin, Fredrik; Emruli, Blerim