Today, digital information is often scattered across various systems, stored in different formats, and of varying importance. Ensuring the usability of this data is crucial in both the public and private sectors. Although backup copies are routinely made, managing this data can be a daunting task. For example, request codes need to be generated to access data, and in some cases, the data becomes unusable over time.
Furthermore, transitioning to new software can present challenges for institutions. How can we ensure the seamless transfer of data from one system to another while maintaining its authenticity, reliability, and usability?
This is not just an internal concern for institutions; many public sector organisations must transfer records with long-term value to the National Archives for preservation. Unfortunately, there is often no direct digital bridge between the institution’s Electronic Records Management System (ERMS) and the digital archive held by the National Archives.
In such situations, Oneclick tools may help, as they did in EKA’s archiving case: the content was extracted from two information systems, ERMS Webdesktop and Digiteek, and the SIP (Submission Information Package) was created using the Oneclick SIP creator.
Three extracted datasets:
- ERMS Webdesktop PDR (Public Document Registry) export package – corrected public data set: Webdesktop PDR houses corporate management records, which are made accessible to the public according to Estonian legislation and the Public Information Act. Exporting this data was a challenge because a private company, Webware, hosted it, and there were unique export criteria involved.
- ERMS Webdesktop AV export (records with archival value) – permanent disposal schedule + to be transferred to NAE: The Archives Act of Estonia states that records with archival value at EKA must be transferred to the National Archives of Estonia for long-term preservation. The export prepared these ERMS Webdesktop records for transfer to the National Archives.
- CMS Digiteek export package – EKA has full access to the source code of the application, database, and media files: Digiteek is a web-based collection of graduation works and theses dating back to 1914. It stores image files and links each object to essential information, maintaining a comprehensive “object lifecycle map”, where information related to the objects is held, and the history of the condition of the object is preserved. All data is stored on a central server, from the processing of status reports and the periodic inventory of artefacts to the graphical mapping of highly detailed research and conservation work. This export prepared the collection for transfer to new software, ensuring the preservation of vital data.
Data extraction
Extracting the content from the described systems presented its own set of challenges, mainly stemming from the proprietary nature of these systems. Difficulty in extracting data often leads to the “vendor lock-in” problem. Our experience in tackling vendor lock-in, and how it contributed to the development of a groundbreaking interface, is described in the article “Archiving: Vendor lock-in or complicated conformance?” (Jääskeläinen, Oolu, Uueni, 2022).
In the ERMS Webdesktop case, the extracted AV package was transferred as a test to the National Archives of Estonia, which was a pioneering move in the Estonian context. This step not only addressed the extraction challenge but also paved the way for a more efficient and direct connection between agencies and digital archives. Moreover, the system provider, Webware, later took the initiative to develop a specialised extraction functionality for the ERMS. The Digiteek extraction was executed as a payload, which in this form is likely to be difficult to transfer to the new software, but it was still a valuable experience.
To learn more about the data extraction process, please refer to the project deliverable M2.16 EKA collections migration plan, which includes the following:
- AS-IS and TO-BE status of data
- Identification of the data format, location, and sensitivity
- The size and scope of the migration
- Backup
- Team roles, responsibilities, and migration tool
- Execution of the data migration/migration process
- Testing/validation
- Calendar
- Risks
SIP creation
The exported EKA packages, even the large one, were successfully migrated into E-ARK SIP packages with the Oneclick SIP creator. These packages will not be shared because of their sensitive and personal content; for more information, please contact karin.oolu@artun.ee.
Preservation plan
One of the milestones was the creation of a preservation plan for EKA collections. Since digital preservation had not been dealt with systematically before, it was necessary to start from scratch. First, an in-house survey was conducted to get an overview of capacities, formats, skills, etc. Researching and drafting the preservation plan revealed the following main shortcomings:
- Getting data out of the source systems is problematic
- Data is not properly handled in systems, and fixing is time-consuming
- It is common knowledge that the old Digiteek needs new software
As a result, an action plan listing concrete preservation activities was drawn up. To learn more about EKA’s preservation activities, please refer to the project deliverable M2.19 EKA Digital Preservation Plan, which includes the following:
- General principles/preservation policy
- Key factors analysed: file integrity, storage, access
- Institutional commitment settled
- EKA’s roles and responsibilities listed
- Preservation challenges of both collections analysed
- Four possible action plans created
- Action plan with short and long-term goals
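
File integrity, one of the key factors listed above, is commonly monitored with fixity checks: a checksum is recorded for every file, and the checksums are recomputed later (for example after an export or a migration) to detect altered, missing, or new files. The following is a minimal sketch of that idea and is not taken from the EKA plan; the choice of SHA-256 and the function names `sha256_of`, `build_manifest`, and `changed_files` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List files that were added, removed, or altered between two manifests."""
    return sorted(
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    )
```

In use, a manifest built with `build_manifest` before an export would be stored alongside the data, and a second manifest built afterwards would be compared with `changed_files`; an empty list suggests that the content survived unchanged.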