This chapter gives an overview of the state of the art and recent methods in historical document analysis. Historical documents differ from ordinary documents in the artifacts they contain. Issues such as poor physical condition, texture, noise and degradation, large variability in page layout, page skew, random alignment, a variety of fonts, embellishments, variations in spacing between characters, words, lines, paragraphs and margins, overlapping object boundaries, and superimposed layers of information make them difficult to analyze. Most current methods rely on deep learning, including Convolutional Neural Networks and Long Short-Term Memory networks. In addition to the overview of the state of the art, this chapter describes a recently introduced idea for the detection of graphical elements in historical documents and an ongoing effort towards the creation of a large database.
ISBN of host publication: 978-981-121-106-5, 978-981-121-107-2, 978-981-121-108-9