Patterns of Exemplary Practice in Electronic Access to Information
Migration & Preservation
The electronic information resources that were the focus of this research exist in a very wide range of formats and storage media. The format and storage medium of any information set result from technology decisions made by those who created the information in the first place. Because information technology changes rapidly, new formats and storage media become available at a rapid pace, and older methods and materials become obsolete just as quickly. This process presents the repository with the problem of choosing a format and storage medium for its content that will preserve access for as long as necessary. All of the repositories in this study had developed methods for dealing with this problem. The methods differed, however, due to the nature of the information they store, the technologies and needs of their users, and the time frames for maintaining such storage and access.
The research revealed two basic approaches to solving the problem of maintaining long-term access to electronic information in multiple formats and storage media. First, the repositories developed policies for the kinds of formats and storage media that they would accept, in an effort to reduce the variety they must maintain or migrate over time. Second, the repositories developed strategies for migrating information from older to newer formats, according to the nature of the information and the needs of the users. The problem of receiving data in multiple formats is more severe for repositories that accept data from a wide government or research community. To reduce the variety in formats, the ICPSR accepts data only in a limited set of very common formats. A similar strategy is followed by the Federal Justice Statistics Research Center. However, these repositories do make exceptions for data sets that represent substantial value, even if they are in an unusual or obsolete format. The ICPSR staff reported recently accepting data sets on punch cards, although they did not have equipment on-site that could read the cards. They had to go to an equipment warehouse to find punch card readers in order to create an electronic version of the data set. They noted that the same problem could occur for equipment used to create current formats. It may be necessary to maintain some obsolete equipment in working order for the purpose of processing or migrating old data sets that come to light.
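The acceptance policy described above can be illustrated with a minimal sketch. It is not drawn from any repository's actual system: the format list, the Submission record, and the high_value flag are all hypothetical, and in practice the judgment of "substantial value" would be made by repository staff rather than by a stored field.

```python
from dataclasses import dataclass

# Hypothetical set of commonly used formats; the study does not list
# the formats any repository actually accepts.
ACCEPTED_FORMATS = {"ascii", "csv", "spss"}


@dataclass
class Submission:
    name: str
    data_format: str
    high_value: bool  # assumption: staff have already judged the data set's value


def review(submission: Submission) -> str:
    """Return an intake decision for a submitted data set."""
    fmt = submission.data_format.lower()
    if fmt in ACCEPTED_FORMATS:
        # Common format: accept under the normal policy.
        return "accept"
    if submission.high_value:
        # Unusual or obsolete format, but valuable enough to justify
        # a special conversion effort (e.g. locating old equipment).
        return "accept_as_exception"
    return "decline"
```

Under these assumptions, a valuable data set on punch cards would be routed to the exception path, while an ordinary data set in an uncommon format would be declined.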
Once a data set is accepted in a particular format, it will still be necessary to refresh or migrate the information as technology changes. The repositories reported systematic conversion projects and schedules for migrating to new formats. They engaged in risk analysis to better understand the consequences of alternative conversion strategies and formats. One principle from the risk analysis, which several repositories mentioned, was to convert to the most frequently used formats, since these were likely to persist in use over longer periods. The large number of users for these common formats would provide an incentive for developers to create migration technologies and methods for them.