Designing for Automated Digital Preservation: Model, Pre-Ingest, and Error Handling
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering. ORCID iD: 0000-0001-5137-3390
2020 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

With the rapid increase in the amount and complexity of data that needs to be preserved, manual preservation activities become complex, lengthy, and costly. Automating preservation processes, together with modeling and streamlining workflows, can therefore help reduce costs and enhance the focus on preservation processes. Accordingly, the research question is defined as: “How to establish an automated many-to-many interaction between Information Systems and digital preservation systems?”

This research proposes a model and an instantiation of middleware as a standalone system, which could be hosted in the cloud, for bridging between information systems (ISs) and digital preservation systems (DPSs). The middleware includes three sub-parts that make both many-to-many capacity and automation of interactions possible: a pre-ingest workflow, a Context-aware Preservation Manager (CaPM), and an error-handling workflow. A Design Science Research (DSR) approach was taken, consisting of three design cycles, one to design and develop each of the three sub-parts of the solution artifact, i.e. the middleware. The middleware consists of several action-based components and an administrative component (CaPM), which carries out the automation of the tasks in the middleware. The action-based components are designed to complete a pre-ingest workflow that prepares digital content sent from an information system for transfer into a digital preservation system. The path for the pre-ingest workflow, i.e. which components process the digital content and in what order, is defined automatically by CaPM according to the information system’s preservation policies. Standard interfaces are used for the middleware’s internal and external communications to promote its scalability in the long run, as well as its capability to embed additional workflows or processes developed in the future, e.g. a post-access workflow.
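The rule-based definition of the pre-ingest path described above can be illustrated with a minimal sketch. It is not taken from the thesis: the policy fields, the component names, and the rule format are all hypothetical, chosen only to show how a manager component might derive an ordered path from an information system's preservation policy.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class PreservationPolicy:
    """Hypothetical preservation policy of one information system."""
    extract_metadata: bool
    normalize_formats: bool


# A rule pairs a condition on the policy with a (hypothetical) component name.
Rule = Tuple[Callable[[PreservationPolicy], bool], str]

# The order of the rules determines the order of components in the path.
RULES: List[Rule] = [
    (lambda p: True, "fetch"),
    (lambda p: p.extract_metadata, "metadata_extraction"),
    (lambda p: p.normalize_formats, "format_normalization"),
    (lambda p: True, "package_sip"),
    (lambda p: True, "transfer"),
]


def build_path(policy: PreservationPolicy) -> List[str]:
    """Return the ordered component names whose rule matches the policy."""
    return [name for condition, name in RULES if condition(policy)]
```

Under these assumptions, a policy that requests metadata extraction but no format normalization would yield a path that skips the normalization component while keeping the other steps in order.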

An additional outcome of this research is a set of five design principles that aim to contribute to the knowledge for future design practices:

  • DP1. Provide rule-based definition of workflow execution path so that the middleware affords IS to implement their preservation policy and metadata extraction requirements.
  • DP2. Provide capability of executing alternative workflow routes so that the middleware affords IS to ensure a successful encapsulation and submission of SIP.
  • DP3. Provide features for gathering preservation data in the middleware so that the middleware affords preservation planning support.
  • DP4. Provide an automated error-handling workflow with compensating action so that the middleware affords to minimize manual intervention in case of errors in a workflow.
  • DP5. Provide capability of executing concurrent workflows so that the middleware affords IS and DPS many-to-many interactions via the middleware.

The results of this thesis contribute to the state-of-the-art in a few aspects:

  • Compared to existing solutions that need to be installed on a user’s system, such as the pre-ingest tool developed for the Finnish National Archives and UAM for Estonia, integration with the middleware is carried out with less complexity. This is achieved by designing the middleware as a standalone system that could be hosted in the cloud and by using standard communication interfaces, which further make the middleware adaptable to changes or upgrades in the environment it operates in. Such capability of the middleware in handling many-to-many interactions goes beyond what was introduced in previous middleware architectures for integrating Digital Preservation Systems with Information Systems.
  • The middleware solution for pre-ingest in this thesis, in comparison with similar recent solutions, promotes automation capabilities, especially for preserving complex digital content (e.g. databases, workflows), for automatic execution of the pre-ingest workflow, or when multiple external digital preservation solutions or services need to be used.
  • CaPM monitors the execution of workflows and can update or abort a workflow path if needed. A workflow aborted because of an error or failure is automatically replaced by an error-handling workflow with compensation action, hence increasing the level of automation. Automation of such functionalities, as well as the approach for handling errors, has not been applied in previous tools.
  • CaPM can also contribute to the current stream of research on decision making regarding preservation planning and strategies by providing logged data about the digital objects passing through the middleware.

While the solution artifact of this research provides middleware that acts as a bridge for automated many-to-many interactions between information systems and digital preservation systems, the resulting design and implementation of the middleware components cover only one direction of such interaction, from information system to digital preservation system (pre-ingest).

Place, publisher, year, edition, pages
Luleå: Luleå University of Technology, 2020, p. 149
Series
Doctoral thesis / Luleå University of Technology 1 jan 1997 → …, ISSN 1402-1544
Keywords [en]
Long-term Digital Preservation, Design Science Research, Information System
National Category
Electrical Engineering, Electronic Engineering, Information Engineering; Information Systems, Social aspects
Research subject
Information systems
Identifiers
URN: urn:nbn:se:ltu:diva-78243; ISBN: 978-91-7790-565-3 (print); ISBN: 978-91-7790-566-0 (electronic); OAI: oai:DiVA.org:ltu-78243; DiVA, id: diva2:1417669
Public defence
2020-06-03, A109, Luleå University of Technology, A building, 13:00 (English)
Opponent
Supervisors
Projects
ForgetIT - European Commission FP7
Funder
EU, FP7, Seventh Framework Programme
Available from: 2020-03-31 Created: 2020-03-30 Last updated: 2020-05-12 Bibliographically approved
List of papers
1. Integrating Contemporary Content Management and Long-Term Digital Preservation: A Design Problem
2015 (English). In: Nordic Contributions in IS Research: 6th Scandinavian Conference on Information Systems, SCIS 2015, Oulu, Finland, August 9-12, 2015, Proceedings / [ed] Harri Oinas-Kukkonen; Netta Iivari; Kari Kuutti; Anssi Öörni; Mikko Rajanen, Cham: Springer International Publishing, 2015, p. 92-107. Conference paper, Published paper (Refereed)
Abstract [en]

The fields of long-term digital preservation (DP) and enterprise content management (ECM) have remained, until recently, rather separated. Along with increasing amounts of digital content and evolving DP services, there is a need for maximal automation of preservation processes from ECM systems instead of continuing current resource-consuming practices. This paper aims at a design problem definition on the integration of ECM and DP solutions. In order to motivate and to define the problem in more detail, we conducted a review of the ECM and DP literature touching on the issue. The review reveals a research gap: a need for designing new middleware solutions for interactive processes between ECM and DP. We suggest a general-level model of three such processes between ECM and DP: preservation administration, pre-ingest, and access. The article concludes with avenues for future research on novel solutions for integrating DP with contemporary ECM and other information systems in organizations.

Place, publisher, year, edition, pages
Cham: Springer International Publishing, 2015
Series
Lecture Notes in Business Information Processing, ISSN 1865-1348 ; 223
Keywords
Long-term digital preservation, Enterprise content management, System integration
National Category
Information Systems, Social aspects
Research subject
Information systems
Identifiers
urn:nbn:se:ltu:diva-27730 (URN); 10.1007/978-3-319-21783-3_7 (DOI); 000376485800007 (); 2-s2.0-84946781354 (Scopus ID); 143a73cc-4f2f-4e49-97ba-a4ad9f8e26f3 (Local ID); 978-3-319-21782-6 (ISBN); 978-3-319-21783-3 (ISBN); 143a73cc-4f2f-4e49-97ba-a4ad9f8e26f3 (Archive number); 143a73cc-4f2f-4e49-97ba-a4ad9f8e26f3 (OAI)
Conference
Scandinavian Conference on Information Systems : 09/08/2015 - 12/08/2015
Projects
Concise Preservation by combining Managed Forgetting and Contextualized Remembering
Note

Validated; 2015; Level 1; 20150814 (jorgenn)

Available from: 2016-09-30 Created: 2016-09-30 Last updated: 2022-09-23 Bibliographically approved
2. Administration of Digital Preservation Services in the Cloud Over Time: Design Issues and Challenges for Organizations
2014 (English). In: The Proceedings of the 2nd International Conference on Cloud Security Management / [ed] Barbara Endicott-Popovsky, Reading, UK: Academic Conferences and Publishing International Ltd, 2014. Conference paper, Published paper (Refereed)
Abstract [en]

In organizations, information systems produce increasing amounts of digital information that needs to be preserved. Simultaneously, human intervention in preservation activities needs to be minimized, due to the scarce resources available for preservation. Thus, digital preservation services need to be integrated with other organizational information systems as seamlessly as possible, and preservation activities need to be maximally automated. Selecting adequate cloud-based preservation services poses yet another set of challenges for reaching seamless integration and automation. Two sources of challenges relate to administering integrated digital preservation and information systems. Firstly, information systems change over time. Secondly, digital preservation services change over time. This paper contributes by identifying and analyzing scenarios and design issues of administering digital preservation in connection with the changes in information systems and preservation services over time. We also suggest seven generic tasks of the “preservation administration” component, which are needed between digital preservation services and information systems, and discuss the choices of where to locate the operative responsibilities and the middleware.

Place, publisher, year, edition, pages
Reading, UK: Academic Conferences and Publishing International Ltd, 2014
Series
Proceedings of the international conference on cloud security management, ISSN 2051-7920
National Category
Information Systems, Social aspects
Research subject
Information systems; Enabling ICT (AERI)
Identifiers
urn:nbn:se:ltu:diva-29805 (URN); 368ce177-a223-4256-81b0-24600a90833c (Local ID); 978-1-910309-63-6 (ISBN); 368ce177-a223-4256-81b0-24600a90833c (Archive number); 368ce177-a223-4256-81b0-24600a90833c (OAI)
Conference
International Conference on Cloud Security Management : 23/10/2014 - 24/10/2014
Projects
Concise Preservation by combining Managed Forgetting and Contextualized Remembering
Note
Approved; 2014; 20141015 (parafr)
Available from: 2016-09-30 Created: 2016-09-30 Last updated: 2023-09-06 Bibliographically approved
3. Towards automated pre-ingest workflow for bridging information systems and digital preservation services
2019 (English). In: Records Management Journal, ISSN 0956-5698, E-ISSN 1758-7689, Vol. 29, no 3, p. 289-304. Article in journal (Refereed) Published
Abstract [en]

Purpose

This paper aims to automate pre-ingest workflow for preserving digital content, such as records, through middleware that integrates potentially many information systems with potentially several alternative digital preservation services.

Design/methodology/approach

This design research approach resulted in a design for model- and component-based software for such workflow. A proof-of-concept prototype was implemented and demonstrated in the context of a European research project, ForgetIT.

Findings

The study identifies design issues of automated pre-ingest for digital preservation while using middleware as a design choice for this purpose. The resulting model and solution suggest functionalities and interaction patterns based on open interface protocols between the source systems of digital content, middleware and digital preservation services. The resulting workflow automates the tasks of fetching digital objects from the source system with metadata extraction, preservation preparation and transfer to a selected preservation service. The proof-of-concept verified that the suggested model for pre-ingest workflow and the suggested component architecture were technologically implementable. Future research and development needs to include new solutions to support context-aware preservation management, with increased support for configuring submission agreements as a basis for dynamic automation of pre-ingest, and more automated error handling.

Originality/value

The paper addresses design issues for middleware as a design choice to support automated pre-ingest in digital preservation. The suggested middleware architecture supports many-to-many relationships between the source information systems and digital preservation services through open interface protocols, thus enabling dynamic digital preservation solutions for records management.

Place, publisher, year, edition, pages
Emerald Group Publishing Limited, 2019
Keywords
Workflow, Middleware, Long-term digital preservation
National Category
Information Systems, Social aspects
Research subject
Information Systems
Identifiers
urn:nbn:se:ltu:diva-73317 (URN)10.1108/RMJ-05-2018-0011 (DOI)000491603900001 ()2-s2.0-85062035980 (Scopus ID)
Note

Validated; 2019; Level 2; 2019-12-06 (johcin)

Available from: 2019-03-26 Created: 2019-03-26 Last updated: 2023-09-14 Bibliographically approved
4. Towards Automated, Context-Aware Management of Preservation Submissions
2018 (English). Conference paper, Oral presentation only (Refereed)
Abstract [en]

Research in the digital preservation field has realized the need for automation of digital preservation activities. Without automation, the preservation of digital entities will be a complex and labor-intensive task. A middleware concept has been introduced by scholars to support automation of interactions between content management systems and digital preservation systems. To boost the automation of workflows in the middleware, we introduce a component within the middleware, namely the Context-aware Preservation Manager (CaPM), which is in charge of administering the inner components of the middleware and the workflows for bi-directional interactions between content management systems and digital preservation systems. We describe the specifications of the Context-aware Preservation Manager and depict its inner components. Further, we explain the processes that are improved, supported, or can run automatically as a result of the functionalities of the Context-aware Preservation Manager.

Keywords
Long-term Digital Preservation, Automation, Middleware, Context-aware Preservation Manager
National Category
Information Systems
Research subject
Information systems
Identifiers
urn:nbn:se:ltu:diva-76035 (URN)
Conference
41st Information Systems Research Seminar in Scandinavia (IRIS 41), Odder, Denmark, August 5-8, 2018
Funder
EU, FP7, Seventh Framework Programme, 600826
Available from: 2019-09-17 Created: 2019-09-17 Last updated: 2021-09-28 Bibliographically approved
5. Error-Handling Workflow with Compensation for Pre-Ingest in Digital Preservation
(English). In: Records Management Journal, ISSN 0956-5698, E-ISSN 1758-7689. Article in journal (Refereed) Submitted
Abstract [en]

Purpose – The purpose of this paper is the automation of an error-handling workflow that, in case of failures, will take over a pre-ingest workflow which is originally designed for preserving digital content through a middleware for many-to-many interactions between Information Systems and Digital Preservation Systems.

Design/methodology/approach – A Design Science Research (DSR) approach was taken, resulting in a model and proof-of-concept for an automatic error-handling workflow with compensation action. The proof-of-concept was demonstrated and evaluated through several simulation scenarios.

Findings – This study pinpoints the elements of consideration in designing a workflow for maximally automating handling of failures in a pre-ingest workflow. The error-handling workflow automatically takes over the original workflow and executes a backward path to compensate for the side-effects of components that have processed digital content before the failure occurred. The proof-of-concept, demonstrated and evaluated, verifies the technological feasibility of implementing the model. The evaluation of the model shows a considerable reduction in manual work and its complexity for handling errors. Three design principles are proposed as a contribution to the knowledge of design to afford a similar purpose.  
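The backward compensation path described in the findings can be sketched as follows. This is a minimal illustration in the style of a saga pattern, not the implementation from the paper: the step names, the dictionary used as stand-in digital content, and the log format are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# (name, action, compensating action); all names here are illustrative only.
Step = Tuple[str, Callable[[Dict], None], Callable[[Dict], None]]


def run_with_compensation(steps: List[Step], content: Dict) -> List[str]:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    completed: List[Step] = []
    log: List[str] = []
    for step in steps:
        name, action, _ = step
        try:
            action(content)
        except Exception:
            log.append(f"failed:{name}")
            # Backward path: compensate side effects, most recent step first.
            for done_name, _, compensate in reversed(completed):
                compensate(content)
                log.append(f"compensated:{done_name}")
            return log
        completed.append(step)
        log.append(f"done:{name}")
    return log


def _fail(content: Dict) -> None:
    """Hypothetical component that always fails, to simulate an error."""
    raise RuntimeError("simulated packaging failure")


# Hypothetical pre-ingest steps acting on a plain dict as the digital content.
steps: List[Step] = [
    ("fetch", lambda c: c.update(raw=b"data"), lambda c: c.pop("raw", None)),
    ("extract", lambda c: c.update(meta={"size": 4}), lambda c: c.pop("meta", None)),
    ("package", _fail, lambda c: None),
]
content: Dict = {}
log = run_with_compensation(steps, content)
```

In this sketch the failing third step triggers compensation of the second and then the first step, so the content ends up in its initial state and the log records the order in which side effects were undone.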

Originality/value – The approach introduced in this study for automatic error-handling in a pre-ingest workflow has not yet been applied in existing solutions.

Keywords
Long-Term Digital Preservation, Error-handling, Automation, Workflow, Pre-ingest
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Information systems
Identifiers
urn:nbn:se:ltu:diva-78215 (URN)
Funder
EU, FP7, Seventh Framework Programme
Available from: 2020-03-25 Created: 2020-03-25 Last updated: 2020-03-30
6. Handling of Errors in Context-Aware Preservation Manager using Saga Patterns
2018 (English). Conference paper, Oral presentation only (Refereed)
Abstract [en]

In the field of Digital Preservation, rapid growth of digitized data has made the need for an automatic process to perform digital preservation tasks undeniable. One of the solutions to maximally automate interaction between Information Systems and Digital Preservation Systems is involvement of a middleware between the two systems. However, the middleware calls for human interaction in case an error occurs during a workflow within the middleware. This article suggests a workflow for automatically handling errors in the pre-ingest workflow in the middleware. Moreover, in this article, information on the implementation of such workflow will be provided and contributions of this workflow to the existing model of middleware will be pointed out.

Place, publisher, year, edition, pages
Chennai, India, 2018
Keywords
Digital Preservation, Error Handling, Middleware, Integration
National Category
Social Sciences; Information Systems, Social aspects
Research subject
Information systems
Identifiers
urn:nbn:se:ltu:diva-69165 (URN)
Conference
13th International Conference on Design Science Research in Information Systems and Technology, DESRIST 2018, Chennai, India, 3-6 June 2018
Projects
ForgetIT
Funder
EU, FP7, Seventh Framework Programme
Available from: 2018-06-07 Created: 2018-06-07 Last updated: 2020-09-18 Bibliographically approved

Open Access in DiVA

fulltext (21768 kB), 1404 downloads
File information
File name: FULLTEXT01.pdf
File size: 21768 kB
Checksum (SHA-512): 8967d4a4fd25abb495f4f5a111b9461039e05ff9bc8bcf891190bf88108e04fcc4eb2c7dfad235ca61e2edebd9a109f1cf9ec9fe744f6a5e41f5fbd87f63e3bb
Type: fulltext
Mimetype: application/pdf

Authority records

Westerlund, Parvaneh

Total: 1405 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 2170 hits