A National Scan Center: A Public Works Project

In the course of doing research for some recent testimony before Congress on the National Archives and Records Administration, I was struck by several facts about how our first National Archivist, Robert D.W. Connor, met some seemingly insurmountable challenges when he took office in the mid-1930s.

The biggest challenge was the deluge of paperwork, a situation not very different from what our national institutions face today. Instead of simply bemoaning the impossibility of swallowing all the records he would need to establish the National Archives, Connor thought nonlinearly. The result was the invention of several key technologies: the airbrush to clean paper, the laminator to protect it, and of course, the microphotograph (now known as microfilm or microfiche), a technology so successful it reduced incoming paper needs by 95%.

The other challenge that Connor faced with the National Archives, a situation again not very different from what our national institutions face today, was a paucity of skilled labor. Lucky for Connor, though, the National Archives was born in the middle of the last great depression. Connor went to Harry Hopkins, and together they went to President Roosevelt, and the result was a Works Progress Administration program that ran until 1942 to survey federal archives. The work program put 3,171 people to work in 1,057 communities and created two important reference aids still in use today, the Historical Records Survey and the Inventory of Federal Archives.

Just before I testified, I read in the New York Times that the President of France had just announced a stimulus package of $50 billion. President Sarkozy pledged 2% of that stimulus package, a full $1.1 billion, towards scanning and digitizing a national archive. I didn't use the term Freedom Scans in my testimony, but the fact that the French were far ahead of the U.S. in putting paperwork into cyberspace seemed a political opportunity.

In the U.S., we face a deluge of paperwork similar to the one we faced in the 1930s. A huge backlog of paper, microfiche, audio, video, and other materials is located throughout the federal government. Little money has come from Congress for digitization, and bureaucracies have resorted to a series of questionable private-public partnerships as a way of digitizing their materials. For example, the Government Accountability Office shipped 60 million pages of our Federal Legislative Histories (the record of each law from the initial bill through the hearings and conference reports) off to Thomson West, but didn't even get digital copies back. Another example is the recent failed effort by the Government Printing Office to digitize 60 million pages of the Federal Depository Library Program, an effort they tried to get through as a "zero dollar cost to the government" effort with the private sector.

There are no free lunches and there are no "no cost to the government" deals. The costs involve the government effort to supervise the contract, prepare the materials, and ship them, and in both the GAO and GPO cases, the government wasn't getting much back for its effort. What the government and the people usually get is a lien on the public domain, preventing the public from accessing these vital materials. Similar efforts are sprinkled throughout the government. I testified to Congress that I had learned that the National Archives was contemplating a scan of congressional hearings with LexisNexis under similar circumstances, and many may be aware of the questionable deal the Archives cut with Amazon where my favorite online superstore got de facto exclusive rights to 1,899 wonderful pieces of video.

We can learn much from the French leadership on this issue. After my testimony, I went and visited senior officials at the Library of Congress and the Smithsonian. They all said that while they had tried to get more congressional interest in digitization, and had tried to go after stimulus money, so far nobody had much success. I asked if they had gone hand-in-hand with their sister institutions to ask for this money, and it was pretty clear that they had not. Each institution went in one at a time pleading their own special case to congressional staffers and to officials at the Office of Management and Budget.

There was one more thing I learned about our first National Archivist, which was that he had backing where he needed it and the political skills to use that backing. One of the big challenges Archivist Connor faced was getting the agencies to cooperate with him in giving the National Archives their records. His solution was leadership: President Roosevelt agreed to host a meeting of a newly-formed National Archives Council in the Cabinet Room. That, needless to say, got the department secretaries and agency chiefs to show up, and they elected the Secretary of State as head of the Council. The Council only met a few times, but that was all it took, and the result was new federal policies about how agencies should dispose of their records.

There are several agencies in the government that face huge digitization and scanning backlogs, including the Library of Congress, the Smithsonian Institution, the Government Printing Office, the National Archives and Records Administration, and the National Technical Information Service. In addition, there are agencies such as the Government Accountability Office and the Defense Visual Information Directorate that have valuable archives.

Chairman Wm. Lacy Clay of the Information Policy, Census and National Archives Subcommittee asked many very informed questions of the panelists, and one that came my way was about costs for digitization. Today, the widely accepted cost for scanning a piece of paper and running it through OCR is about 10 cents per page. These are the numbers that you hear from places like the Internet Archive and Google Book Search, and that's what I told the Chairman. But I also told the Chairman that it was my belief that if the government started scanning at volume, those costs could go down by half. I also testified about the vastly reduced costs of digitizing video, a task I perform under a joint venture with the National Technical Information Service using less than $10,000 in hardware.

If the government invested a mere $100 million of our stimulus package (we've already spent over $72.6 billion), that means 2 billion pages of paper or microfiche would get scanned at a volume rate of 5 cents per page. For $500 million, we're talking a huge chunk of our national backlog being digitized, a task that would result in an enduring digital public work for our modern era, something that would prove of immense use to future generations, and would also save the government tremendous amounts of money in storage costs and other facilities expenses.
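
For anyone who wants to check the arithmetic, here is a minimal sketch of the division behind those page counts. It assumes the roughly 10 cents per page quoted above for scanning plus OCR, and my guess that volume would cut that in half to 5 cents; the function name `pages_scanned` is made up for illustration, and real costs would of course vary by material.

```python
def pages_scanned(budget_dollars: int, cents_per_page: int) -> int:
    """Return how many whole pages a budget covers at a flat per-page cost.

    Both inputs are illustrative assumptions: a flat rate ignores
    preparation, shipping, and supervision costs.
    """
    if cents_per_page <= 0:
        raise ValueError("cents_per_page must be positive")
    # Work in whole cents to avoid floating-point rounding.
    return (budget_dollars * 100) // cents_per_page


if __name__ == "__main__":
    for budget in (100_000_000, 500_000_000):
        for cents in (10, 5):
            pages = pages_scanned(budget, cents)
            print(f"${budget:,} at {cents} cents/page: {pages:,} pages")
```

At the assumed 5-cent rate, $100 million works out to 2 billion pages; at today's 10 cents it is half that.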

What would it take to get the Library of Congress, the Smithsonian Institution, the Government Printing Office, the National Archives and Records Administration, and the National Technical Information Service all singing off the same page and working together? There is a tremendous opportunity for White House leadership here, bringing the parties together and creating a compelling case on why we should launch and fund a 5-year $500 million effort to create a National Scan Center. Both the CIO and the CTO in the Executive Office of the President have talked about the tremendous "moral authority and convening power" of the White House, and I believe that this issue is of sufficient importance that it would be worthwhile to pursue.
