Wednesday 23 January 2013

Archiving and Recovering Database-driven Websites

an article by Michael Rumianek (Universität Duisburg-Essen, Duisburg, Germany & Global Village GmbH, Voerde, Germany) published in D-Lib Magazine, Volume 19, Number 1/2 (January/February 2013)

Abstract

An ever increasing amount of information is provided by database-driven websites.

Many of these are based on Content Management Systems (CMS).

CMS typically separate the textual content from file content and store the textual content within a database while files are stored in a directory structure of a file system. For archiving and preservation of such websites, in many cases several tools are needed to archive the file data and the database data separately in different container formats.

The database data may be especially difficult to archive since vendor specific implementations of datatypes constrict restoring the archive on different systems.

The author developed and implemented a procedure that enables storing both file and database data in a single XML document based on an XML Schema, where the data in the database are mapped into a standardised form to facilitate recovery on different systems. The mapping of the complete content into only printable characters allows preservation of the archive in multiple ways. Setting up a highly automated cycle of archiving and restoring website content by using a Version Control System (VCS) is also suggested.
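To make the idea in the abstract more concrete, here is a minimal sketch of how file data and database data might be combined in one XML document made up only of printable characters. It is not the author's procedure or schema: the element names (`archive`, `database`, `table`, `row`, `col`, `files`, `file`), the function names, the use of SQLite and the choice of Base64 for binary content are all assumptions made for illustration.

```python
import base64
import sqlite3
import xml.etree.ElementTree as ET


def archive_to_xml(conn, files):
    """Return one XML string holding all table rows and all file contents.

    Element and attribute names are hypothetical, not the article's schema.
    """
    root = ET.Element("archive")

    # Database part: every value is written as plain text so that it does
    # not depend on one vendor's datatype implementation.
    db_el = ET.SubElement(root, "database")
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        table_el = ET.SubElement(db_el, "table", name=table)
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        columns = [d[0] for d in cursor.description]
        for row in cursor:
            row_el = ET.SubElement(table_el, "row")
            for col, value in zip(columns, row):
                col_el = ET.SubElement(row_el, "col", name=col)
                if value is None:
                    col_el.set("null", "true")
                elif isinstance(value, bytes):
                    # Binary columns are Base64-encoded (an assumption).
                    col_el.set("encoding", "base64")
                    col_el.text = base64.b64encode(value).decode("ascii")
                else:
                    col_el.text = str(value)

    # File part: raw bytes become Base64 text, i.e. printable characters.
    files_el = ET.SubElement(root, "files")
    for path, data in files.items():
        file_el = ET.SubElement(files_el, "file", path=path, encoding="base64")
        file_el.text = base64.b64encode(data).decode("ascii")

    return ET.tostring(root, encoding="unicode")


def restore_files(xml_text):
    """Recover the {path: bytes} mapping from an archive string."""
    root = ET.fromstring(xml_text)
    return {
        f.get("path"): base64.b64decode(f.text or "")
        for f in root.iter("file")
    }
```

Because the result is ordinary text, it could in principle be committed to a Version Control System like any other text file, which is the kind of automated archive-and-restore cycle the abstract suggests. A real implementation would also need to validate against an XML Schema and handle characters that XML does not allow, which this sketch does not attempt.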

Full text (HTML) will lead you to a printer-friendly version should you want to keep a hard copy.

No, this isn’t careers information, and I’m not even sure how far you would consider it information management, but …
I would make that a very big but …
Much of the information that careers practitioners use on a regular basis is held in database systems where the format and the data are kept in separate files.
It is very useful to understand this for when things go wrong (and they will).
Whether or not you are concerned about archiving, you will certainly be concerned about back-up and retrieval.


