Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Efficiency of data structures for detecting overlaps in digital documents
Monash University, Melbourne, VIC.
Monash University, Melbourne, VIC.
2001 (English)In: Australian Computer Science Communications, ISSN 0157-3055, Vol. 23, no 1, p. 140-147Article in journal (Refereed) Published
Abstract [en]

This paper analyses the efficiency of different data structures for detecting overlap in digital documents. Most existing approaches use some hash function to reduce the space requirements for their indices of chunks. Since a hash function can produce the same value for different chunks, false matches are possible. In this paper we propose an algorithm that can be used for eliminating those false matches. This algorithm uses a suffix tree structure, which is space consuming. We define a modified suffix tree that only considers chunks starting at the beginning of words and we show how the algorithm can work on this structure. We can alternatively reduce space requirements of a suffix tree by converting it to a directed acyclic graph. We show that suffix link information can be preserved in this new structure and the matching statistics algorithm still works with those modifications that we propose.

Place, publisher, year, edition, pages
2001. Vol. 23, no 1, p. 140-147
Identifiers
URN: urn:nbn:se:ltu:diva-6885DOI: 10.1109/ACSC.2001.906635Local ID: 535daaa0-cea7-11dc-91eb-000ea68e967bOAI: oai:DiVA.org:ltu-6885DiVA, id: diva2:979771
Note
Upprättat; 2001; Bibliografisk uppgift: This Article has also been published in: ACM International Conference Proceeding Series Proceedings of the 24th Australasian conference on Computer science 2001 , Gold Coast, Queensland, Australia; 20080129 (ysko)Available from: 2016-09-29 Created: 2016-09-29 Last updated: 2018-06-11Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Authority records BETA

Zaslavsky, Arkady

Search in DiVA

By author/editor
Zaslavsky, Arkady

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 16 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf