Subtitle: Unlike Paper and Microfilm, Digital Documents Can Last Forever
“Conventional wisdom” amongst genealogists, historians, and archivists states that digital media is a poor method of storing data for decades or for centuries. This “conventional wisdom” claims that the only practical method of storing information for many years is to do so on paper or on microfilm/microfiche. There’s only one problem: “conventional wisdom” is wrong!
To understand the challenges involved, let’s first review the processes used by those who espouse “conventional wisdom:”
In the past few hundred years, paper has been the storage method of choice. Indeed, paper has worked well. We do have documents still readable today that were written in the 1700s and some in the 1600s. Even some books written in the Middle Ages still survive and are readable, assuming you are trained in the handwriting and language used. On a trip to England, I saw a contemporary copy of the Magna Carta, written in 1215, that was still readable. However, these oldest documents were not written on paper. Instead, they were written on parchment, which is made of sheepskin, or the finer-quality vellum parchment, made of calf or goatskin. Its complicated manufacture means that parchment was usually reserved for important documents. In fact, paper was rare in the Middle Ages. Most surviving written documents of those times were written on parchment, not paper.
By the 1700s, paper was generally made from cotton, linen, or hemp. Production became more common, and large paper factories appeared in the 1800s. The manufacturing processes of those times did not use acids. As a result, the high-quality paper lasted for a long time. Newspapers are usually printed on the most inexpensive paper available at the time. Many copies of Thomas Paine’s newspapers from the American Revolution have survived because even the cheapest paper of 1776 was made from cotton, linen, or hemp, without the use of acids.
Inks also varied but were generally of high quality. The Chinese invented ink 5000 years ago, using a mixture of soot from pine smoke and lamp oil, thickened with gelatin from animal skins and musk. By the 1700s, varnish-like ink made of soot, turpentine, and walnut oil was created specifically for the printing press.
The paper and inks of 300, 200, or even only 100 years ago will last a long time if stored in locations where they have not been subject to major temperature and humidity changes. Indeed, museums of today have thousands of printed documents that are several centuries old.
Paper and ink manufacturing have both changed dramatically in the past century or so. Most of today’s paper is made from wood. The wood is converted to pulp, a concentrated mixture of fibers suspended in water. Most chemical pulp is made using the Kraft process, which is performed by pressure-cooking the material in a mixture of sodium hydroxide and sodium sulfide. The heat and chemicals break apart the three main chemical components of the wood (cellulose, hemicellulose, and lignin), dissolving most of the lignin and freeing the cellulose fibers that eventually become sheets of paper.
A more detailed explanation of modern paper making methodologies may be found at http://en.wikipedia.org/wiki/Paper.
Today’s process uses a lot of acid to make paper. As the years go by, the acids will eventually cause the paper to self-destruct. The number of years varies, depending on the quality of the original paper and the storage conditions involved. The paper you use in your laser printer probably will not last 100 years. In fact, if not stored properly, it may not even last 25 years.
As short a lifespan as that may be, paper is really not the biggest problem. Ink, or what we use in place of ink, will probably not even last as long as the paper it is printed on!
All paper acts as an “ink blotter.” When ink is pressed onto the paper, much of the ink is absorbed into the paper. The result is a more-or-less permanent combining of ink and paper. This “blotting” process is critical to long-term storage. While the paper might last for centuries, the document will be useless if the ink fades. The absorption of ink into the paper is critical to the readability of archived documents.
Let’s examine the laser printer that you use. In fact, it doesn’t use ink at all! It uses toner: tiny bits of plastic. Toner isn’t applied by pressure but is “fused” onto the surface of the paper, first by electrostatic charges and then by heat. The information to be printed is translated into bit-mapped charges of the opposite polarity on a special drum in the printer. The toner is attracted to the charged areas of the drum, from which it is transferred to paper. The toner is then “set,” usually by heat.
Toner is not absorbed into the paper; it is “stuck” on the surface instead. Over time, bits of toner will lose the electrostatic “stickiness” and will fall off. The plastic bits that do remain will also fade. The life expectancy of printed documents made by laser printers is unknown, mainly because laser printers have only been around for a relatively short time. No manufacturer of laser printers is willing to make predictions about the life expectancy of printed documents. However, almost all will agree that it will be less than 100 years. In fact, it might be less than 25 years. The problem is not with the paper but with the toner. The toner is not absorbed but is only attached or “stuck” to the paper. The toner absolutely will fade and/or fall off. The only question is, “When?”
The same is true for office copiers and the high-speed printers used to produce small numbers of printed books. All use toner.
“Wait a minute,” you say. “I don’t use a laser and toner, I use an inkjet printer. Surely there is ink there!” True, inkjet printers work a bit more like traditional printing, and the sprayed-on ink is absorbed into the paper. There are almost as many printer inks as there are types of printers. Most inkjet inks are either water-based or solvent-based. Water and solvents don’t last long in storage. If you had an inkjet printer ten years ago, look at a document you printed then. Note that it is already fading. It will be unreadable in another decade or two. If it was printed in color, it might not even last another decade or two. (The reds are the first to fade away.)
The more expensive pigment ink can be applied to a wider variety of surfaces and will stand up better to different temperatures, outdoor conditions, and the passage of time. Pigment inks will probably last 50 years or more. There is but one problem: the more expensive pigment inks are rare and are never found in inkjet printers designed for use in the home.
So how can you produce paper documents on your computer for long-term storage? The quick answer is that you cannot. Ten to perhaps fifty years is all that you can expect, and even fifty is questionable. We can, however, make recommendations with regards to prolonging the life of computer-generated documents for as long as possible:
• Use acid-free papers (pH 7.5-8.5), which are better suited for documents intended to be stored for long periods (e.g., wills), rather than normal laser paper and recycled papers (pH 4.0-5.5). You probably will not find such paper at your local office supply store, however. When you do find it, acid-free paper is expensive.
• The documents should be stored in folders made of polypropylene or polyethylene rather than PVC.
• Store the document in a climate-controlled facility with cool temperatures and low humidity.
• Even better, print your document on an offset press. These have long been the standard printing presses used by commercial printers everywhere. An offset press is a sophisticated printing machine designed to produce fine quality reproductions. It uses almost any kind of paper but requires proper inks for its operation. Offset presses are used almost exclusively in larger print shops and are not found in homes or in “overnight printing” services.
In summation, if you really want to preserve your paper documents for more than one hundred years, be prepared to spend a lot of money. Also, please realize that nothing you print on your own computer printer will last that long.
NOTE: For the remainder of this article, I will use the terms “microfilm” and “microfiche” interchangeably. They are simply minor variations of the same technology.
Microfilm first became popular in the 1930s and was seen as a method of reducing the amount of space required for the storage of documents. However, early microfilms used cellulose nitrate and were not particularly suited for long-term storage. Over time, cellulose nitrate gives off a flammable gas that creates a high fire risk. You would hate to see a major genealogy archive going up in flames because the microfilms caught fire!
By the early 1950s, commercial production of all formats of cellulose nitrate film had permanently ceased as cellulose acetate film became the storage medium of choice. Cellulose acetate does not produce any flammable gases. There is but one problem: cellulose acetate will still naturally degrade over time. Such microfilms will last less than fifty years. This degradation process is accelerated when acetate film is not properly stored. Although a great deal of acetate microfilm still exists, acetate film is not acceptable as a preservation medium.
Polyester is the only film base currently recommended for preservation microfilming. Both stable and durable, black-and-white polyester film has a life expectancy of 500+ years under ideal storage conditions. Indeed, microfilm would seem ideal for long-term archiving of valuable information. However, life is never that simple.
First, creating microfilm, exposing it (taking the picture), and then developing the film is an involved process that requires expensive equipment. Next, the demand for microfilm is dropping every year as computer digitization is replacing microfilm. Twenty years ago there were many manufacturers of microfilm equipment. Today there are only a handful of manufacturers in the business, and a few more drop out of the business every year. Within another decade or two, you probably will not be able to purchase a microfilm reader, a microfilm camera, or even an unexposed reel of film. The equipment will probably be found only in a few museums, gathering dust. Even the manufacturing of spare parts for obsolete microfilm equipment will cease.
This will be similar to the history of 78-rpm records. While the records themselves are still viable and useful, you can no longer purchase new 78-rpm records today, and it is almost impossible to purchase the equipment to play the older records you still own. I own about fifty 78-rpm records, but the only player I own that will play them back was manufactured in 1926!
Technology marched past 78-rpm records and, indeed, the other record formats. Manufacturers ceased making equipment as their profits disappeared.
Microfilm equipment and manufacturers will do the same within a few years. While museums in the twenty-second century may own cabinets full of ancient microfilms, it is doubtful if they will possess any equipment to view the information contained therein.
The largest user of microfilm equipment within the genealogy community has been the FamilySearch Department of the Church of Jesus Christ of Latter-day Saints. For years, this organization has sent crews and microfilm cameras to archives all over the world to make microfilm images of stored documents. They produced more than two million reels of microfilm containing hundreds of millions of records. Now the FamilySearch Department is abandoning microfilm. In fact, only a very few microfilm cameras are still in operation, and the number is becoming smaller each year with digital equipment replacing microfilm in the majority of crews.
Why? The primary reason is that the FamilySearch Department can no longer purchase microfilm cameras! In fact, suitable cameras for their work have not been available for quite a few years. As the present cameras suffered wear and tear, repairs have been made. Now even spare parts are no longer available.
The FamilySearch Department did manufacture their own spare parts for a while, contracting the work out to various machine shops. However, repair procedures encountered difficulties, and the expenses were prohibitive. The FamilySearch Department is now in the process of abandoning microfilm completely for most new projects. Within a very few years, all new data acquisition by the FamilySearch Department of the Church of Jesus Christ of Latter-day Saints will be done in digital format, not on microfilm. Even today, most data acquisition is being done as digital images, not on microfilm.
Reproduction of Printed Materials
For centuries, historians and archivists have solved the problem of decaying original documents by making copies. I suspect that trend will continue for several more centuries.
In medieval times, monks copied books by hand. In more modern times, entire books were reprinted with modern printing processes. In the same vein, several companies in business today provide a valuable service by republishing old genealogy books. In most cases, these are photo reproductions: the new books are simply photographic images of the originals. Such mass production allows valuable information to become available to many interested buyers.
Reproduction does have some disadvantages, however. Namely, the second copy is never as clear as the first. Some amount of “fuzziness” is introduced each time a copy is made. If an original is photographed, the next generation’s image will have only a bit of fuzziness, usually an acceptable amount. However, if a copy is made of a copy, the total amount of fuzziness becomes noticeable and distracting.
A “copy of a copy of a copy of a copy” is probably unusable. Each new “generation” adds additional degradation with the final product containing the sum of all the fuzziness added by each previous generation. It’s easy to see why all of today’s producers of photo reproduction books struggle to obtain original books from which to make their master copies.
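The way fuzziness piles up from one generation to the next can be illustrated with a toy model. The Python sketch below is only an illustration under stated assumptions: the "page" is a single row of sixteen samples with one crisp black-to-white edge, and each analog copy is modeled as a slight blur in which every sample bleeds into its neighbors. The function names and the blur weights are made up for this example; real photographic or photocopier degradation is more complicated.

```python
def blur_once(signal):
    """One analog 'generation': each sample bleeds into its neighbors (toy model)."""
    blurred = []
    for i in range(len(signal)):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, len(signal) - 1)]
        blurred.append(0.25 * left + 0.5 * signal[i] + 0.25 * right)
    return blurred


def sharpness(signal):
    """Largest jump between adjacent samples; 1.0 means a perfectly crisp edge."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))


# Hypothetical original: eight black samples followed by eight white samples.
original = [0.0] * 8 + [1.0] * 8

copy = original
history = [sharpness(copy)]
for generation in range(1, 5):
    copy = blur_once(copy)  # a copy made from the previous copy
    history.append(sharpness(copy))
    print(f"generation {generation}: sharpness {history[-1]:.3f}")
```

In this model the edge gets softer with every pass and never recovers, which is the same effect you see in a photocopy of a photocopy.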
Now, think about the documents you produce on your computer. Perhaps you will give copies to your children or grandchildren or other relatives. As the years go by, they may want to give copies to newer generations. Assuming no change in technology (which is doubtful), they will have to create copies. After all, we already know that your originals are deteriorating. The new generations of genealogists will need to make reproductions, which will add some fuzziness. Then, what will even later generations do? They will make reproductions of the reproductions, which will add even more fuzziness.
NOTE: Actually, I think the later generations will convert your documents to digitized documents, but I am getting ahead of myself…
Reproduction of Microfilms
As stated previously, microfilms will last perhaps 500 years if stored properly. However, that assumes that the films are never run through a microfilm reader! What good is that?
Each time a microfilm moves through a microfilm reader or a microfilm copier, it is subject to additional wear and tear. To see examples of usage, visit any popular library or FamilySearch Center. You see all the scratches on the film? That’s wear and tear.
Most microfilm archives of today make one or more master copies of each reel of microfilm and then store the masters under optimum archival storage conditions. The archive’s employees work diligently to minimize the handling of the original, master microfilms. A few copies are made of the master films, then further copies are made from the copies. The masters are kept locked up for as long as possible and are used as little as possible.
Of course, some fuzziness is introduced each time a copy is made. The problem is similar to that of photocopied books with one major exception: microfilm images are tiny, and any induced fuzziness of a copy is much more noticeable. Thus, a copy of a copy can be very difficult to read. A “copy of a copy of a copy of a copy” is generally unusable. Each new “generation” adds additional degradation with the final product containing the sum of all the fuzziness added by each previous generation.
In short, microfilm is a great storage mechanism but a lousy retrieval method. Films are easily scratched. There are good reasons why the originals should not be taken out every few years to make new copies; do that often enough, and even the originals will become badly scratched. Most active archival facilities make copies of copies, even with the inherent image degradation.
As time goes by and present microfilm equipment wears out and new microfilm equipment becomes unavailable, the problem will become magnified. Future genealogists will not be able to obtain microfilm readers. Indeed, microfilm has already become an impractical storage solution.
Digital Records: Is It a Solution or Is It an Even Bigger Problem?
Today’s headlong rush into digitizing everything in the world provides many solutions but also more than a few new problems. Converting old documents to digital images is a rather simple process; however, the long-term storage requirements produce even more complex problems than the storage requirements of paper and microfilm.
“Conventional wisdom” amongst genealogists, historians, and archivists says that digital media is a poor method of storing data for decades or centuries. There are many examples of this. Years ago, digital data was stored on 80-column punch cards. Try to read those documents today! Later “improvements” shifted to various formats of magnetic tape, to 8-inch floppy disks, followed by 5-1/4 inch floppies and then by 3-1/2 inch floppies. Still later, CD-ROM disks became the storage media of choice, followed by DVD-ROM disks. A newer optical disk technology called Blu-ray is now popular. Each new generation of digital storage offers greater and greater storage capacities. However, none of the newer technologies solve the biggest problem of all: obsolescence.
Magnetic storage will eventually lose its magnetism. Tapes and floppy disks will be unreadable in 25 or 50 years as there will not be enough magnetism left to be read. Optical disks have their own problems. CD-ROM and DVD-ROM disks have problems with chemical changes inside the disks. Acids eat into the substrate that stores the information. Blu-ray is so new that we do not yet know what the long-term storage properties will be.
All this will probably be a moot point anyway. The actual devices to read those stored items will become unavailable long before magnetism ceases and acids devour. When was the last time you saw a 5-1/4 inch floppy disk drive?
OK, so I still have a working 5-1/4 inch floppy disk drive in one of my older computers, and I can still make copies. There may be a few others available also. However, I do not have a working 8-inch floppy disk drive or a card reader capable of reading 80-column punch cards. Data stored on punch cards or on floppy disks in the late 1970s is unreadable today.
Here is where “conventional wisdom” goes astray:
Those who think about archival paper and microfilm assume that all information is created once and then is stored, never to be touched again, if possible. The assumption is that information is stored on its original medium forever. Likewise, those responsible for storage of original microfilms go to great lengths to keep the originals in exactly their original format. They plan to keep the originals forever.
In a digital archive, that would be DUMB!
A well-run digital archive is operated in almost the exact OPPOSITE manner as a paper archive or a microfilm archive. Digital archivists go to great lengths to ensure that the original documents are NEVER left in their original state! The goal is to ensure that digital archives are always maintained in whatever format is dictated by current technology. As that technology changes, all archived digital documents absolutely must be copied to the new storage medium of choice.
Let’s take the case of 80-column punch cards of the early 1970s. If left untouched in the manner of traditional archival procedures, those records would be inaccessible today. However, if deposited at that time with a well-run digital archive, the 80-column punch cards would have been copied to 9-track digital tapes in the late 1970s. In the 1980s, the same records would have been copied from tape to floppy disks. In the 1990s, the floppy disks would have been copied to CD-ROM disks. By the year 2000, the CD-ROM disks would have been copied to DVD-ROM disks, and perhaps next year they would be copied to Blu-ray disks. You could then expect them to be copied every five to ten years into the future to whatever storage format makes sense at that time.
The format also would be converted. Data was stored in Hollerith Code on 80-column punch cards. (I can still hold a punch card up to the light and read the data directly with my eyes because I memorized the Hollerith Code back when it was in common use. However, I don’t find much use for that talent nowadays.) The Hollerith Code might have been converted to EBCDIC when placed on 9-track tapes, then converted to ASCII when it was placed on floppies. Perhaps next year’s copy will convert it to Unicode.
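Here is a minimal sketch of that kind of format migration, written in Python. It is an illustration only, not a description of how any particular archive did the work. The sample record is invented, and I am assuming that the tape-era data used the EBCDIC code page that Python calls "cp037"; real tapes may have used a different EBCDIC variant.

```python
# Hypothetical record, as it might once have been keyed onto a punch card.
record = "SMITH, JOHN  B. 1842 VERMONT"

# Tape era: the text held as EBCDIC bytes (assumption: code page cp037).
ebcdic_bytes = record.encode("cp037")

# Floppy era: decode the EBCDIC bytes and re-encode the same text as ASCII.
ascii_bytes = ebcdic_bytes.decode("cp037").encode("ascii")

# Today: re-encode as UTF-8, one common encoding of Unicode.
utf8_bytes = ascii_bytes.decode("ascii").encode("utf-8")

# The stored byte values change between EBCDIC and ASCII,
# but the text that comes back out is the same at every step.
print("bytes changed between EBCDIC and ASCII:", ebcdic_bytes != ascii_bytes)
print("text survived every conversion:", utf8_bytes.decode("utf-8") == record)
```

The point of the sketch is that each conversion changes only how the characters are stored, not the information itself, as long as every character in the old encoding has an equivalent in the new one.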
So far, I have only mentioned textual information, but the same would be true of digital images. In the mid-1980s, many scanned images were in BMP format. Those images would later be converted to TIFF, JPEG, GIF, PNG, or whatever format was appropriate each time they were copied to modern storage media.
Remember the discussion of induced fuzziness made by each new generation of copies? That is true for “analog” media such as paper or microfilm. However, digital records can be faithfully reproduced time and time again with no degradation or fuzziness. When copying digital text or images, a copy of a copy of a copy of a copy of a copy of a copy of a copy is exactly the same as the original: there is absolutely no degradation involved. The year 2006 DVD-ROM copy of the data originally entered on punch cards in 1976 is perfect and can be copied many more times with no degradation.
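You do not have to take the "no degradation" claim on faith; it can be checked. The Python sketch below is a small demonstration under my own assumptions: it writes an invented sample file to a temporary folder, makes a copy, then a copy of that copy, and so on for seven generations, and finally compares SHA-256 checksums of the original and the last copy. The file names and the helper functions are hypothetical, chosen only for this example.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 digest of a file's contents as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def copy_generations(original, generations):
    """Copy the original, then copy that copy, and so on; return the last copy."""
    current = Path(original)
    for n in range(1, generations + 1):
        next_copy = current.with_name(f"copy_{n}{current.suffix}")
        shutil.copyfile(current, next_copy)  # each copy is made from the previous copy
        current = next_copy
    return current


with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "original.txt"
    original.write_bytes(b"Born 1842 in Vermont; married 1866.\n")

    last_copy = copy_generations(original, generations=7)
    digests_match = sha256_of(original) == sha256_of(last_copy)
    print("seventh-generation copy identical to original:", digests_match)
```

Matching checksums mean the seventh-generation copy is identical to the original, byte for byte. If a copy ever did go wrong, the checksums would differ, and the bad copy could be discarded and made again.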
In short, digital data can last forever. Of course, this does assume that the digital data is stored in an active digital archive which guarantees that multiple backup copies are made of the original and that all copies are periodically transferred to new storage media and new file formats.
In fact, there are several major efforts underway today to make digital copies of all paper and microfilms of genealogical and historical interest. The FamilySearch Department of the Church of Jesus Christ of Latter-day Saints has already announced a massive, multi-year undertaking to convert all of their millions of reels of microfilms to digital images. Of course, the FamilySearch Department will never throw away the microfilms; both microfilm and digital formats will be maintained, and each format will be used wherever appropriate.
NewsBank, ProQuest, Newspapers.com, and other companies are feverishly scanning old newspapers, converting them to digital formats. Most major repositories are converting paper to digital formats. For instance, the American Antiquarian Society has perhaps the largest collection of historical newspapers in the United States. The American Antiquarian Society has announced that many of those newspapers will soon be available online.
You will read about another company in future newsletters that is scanning millions of historical and genealogical documents. They are scanning newspapers, books, and millions of pages of loose-leaf paper. Brigham Young University, the University of Virginia, and many other academic institutions are digitizing millions of documents of interest to historians and genealogists. Almost all major collections of historical documents are working hard to digitize their collections.
Yet not one of these organizations ever expects to create a digital image or textual record and place it on a shelf to be untouched for decades. Instead, each one plans to store multiple backup copies and to convert those copies to more modern storage media as technology improvements become available. Each of these organizations expects to make the information available to historians, genealogists, and others for many years, perhaps centuries, in whatever format is available at that time. Indeed, they know that the only practical method of storing and distributing information forever is by use of digital storage and periodic conversion to new formats.
What does this mean to you and to your “personal archive” of genealogy information?
Like the pros, you cannot count on storing your data on paper or on microfilm. Like the pros, you cannot plan to make a floppy disk or a CD-ROM disk and place it on a shelf or in a safe deposit box to remain untouched for the next fifty years.
Just like the pros, you need a plan to store your data and to nurture it and grow it. You need to make sure that you or someone in a later generation will maintain the information and will copy it to newer storage methods and file formats as they become available.
If you have a younger person in your family with both technical expertise and an interest in your genealogy projects, get that person involved now!
I would also love to see genealogy societies become involved in data preservation for members. I have not yet heard of such a project anywhere, and yet it seems like a “natural” for local, regional, and national genealogy societies. Perhaps the data would be submitted to the society on a CD-ROM or DVD-ROM disk or via online file transfer. The data might be in ASCII text or GEDCOM or Word DOC or in Adobe’s PDF format. Images could be submitted in TIFF or JPEG format. Even better, submit your information in all those formats at once on one disk or in one file transfer! That’s easy to do.
The society would then save the information in a proper archival facility. Backup copies would also be stored off-site, preferably at a second facility a long distance away. Even more important, the society would also periodically convert this information to newer formats. The process should be simple as conversion software always exists for a few years while an older technology is still available side-by-side with a newer one.
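One routine chore for such an archive would be confirming that the off-site backup still matches the primary copy. The Python sketch below shows one possible way to do that; it is my own illustration, not a description of any society's actual procedure. The folder names, file names, file contents, and helper functions are all hypothetical, and the "corruption" is simulated by overwriting one file.

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(folder):
    """Map each file's relative path to the SHA-256 digest of its contents."""
    folder = Path(folder)
    return {
        str(path.relative_to(folder)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def audit(manifest, replica_folder):
    """List files in a replica that are missing or no longer match the manifest."""
    current = build_manifest(replica_folder)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)


with tempfile.TemporaryDirectory() as root:
    primary = Path(root) / "primary"
    offsite = Path(root) / "offsite"
    for site in (primary, offsite):
        site.mkdir()
        (site / "family.ged").write_bytes(b"0 HEAD\n0 TRLR\n")
        (site / "census_1850.txt").write_bytes(b"Household 12: Smith\n")

    # Record checksums when the donation is first accepted.
    manifest = build_manifest(primary)

    # Simulate silent damage to one file in the off-site copy.
    (offsite / "census_1850.txt").write_bytes(b"Household 12: Sm?th\n")

    damaged = audit(manifest, offsite)
    print("files to restore from a good copy:", damaged)
```

A check like this could be run on a schedule and again after every migration to new media, so that a damaged file is replaced from a good copy long before anyone needs it.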
The society undoubtedly would charge a fee for this. (Do I see a fund raising opportunity here?) The society would guarantee to preserve and to periodically copy your data. In accordance with the wishes of the donor, the information within the files might be freely distributed or else could be tightly controlled.
I know that I would quickly donate my information to a society that agrees to periodically copy and convert it for me and to also distribute to others under the conditions that I specify. Yes, I would pay money for this “archival service.”
I bet that hundreds of thousands of other genealogists would do the same.
You might want to discuss this fund-raising opportunity at your next society meeting.