The human element
But as Deb Logan, an analyst at Gartner Inc., points out, system readiness is one thing; human limitation is another. According to Logan, the onerous task will be sorting through the unclassified and unprocessed data that the Bush administration will leave behind. The fact is, she says, the federal government itself has insufficient records management practices and systems in place, which means it will basically be dumping raw data on NARA.
"It would be one thing if the stuff had to be moved seamlessly to a records repository, but it's just eight years of stuff," she says. "It will be nearly impossible to get it under control without a massive expenditure of human resources because the technology is not there."
According to NARA, it took about 400 days to process just the 2TB of data it received from the Clinton administration. Since it had no system at the time, it archived this data by recreating the Clinton administration's computer systems that originally held the records -- 17 in all -- and developed simple search interfaces that NARA personnel could use to access requested information.
Logan says part of the blame lies with federal agencies themselves, pointing to a GAO survey that concluded federal agencies have failed across the board to fulfill their records management obligations, "not out of malice or neglect but out of the nature of the volume of electronic communications and the time frame in which they have to do it," she says. "Anyone who's putting an optimistic face on the job is not being realistic."
Optimism may be relevant from a technology point of view, she acknowledges, but not from an information management point of view. "From my side of NARA, I don't deal with what's in the records, just whether we can get them into the system," she notes. "We allow the library staff to deal with the content."
An unprecedented effort
The system itself had its challenges, which Thibodeau says are a natural outcome of creating a system the scope and scale of the ERA. After all, the system is not just intended to preserve presidential records.
Under the Federal Records Act, it also works with federal agencies to preserve all of their relevant records, which amounts to about 2% of all the records they create. These records are submitted, appraised and archived continuously, not in batch modes at the end of each term, as presidential records are.
The system is charged with the following:
- Ingesting electronic records from federal agencies.
- Managing records storage in a way that guarantees their integrity and availability.
- Enabling users to search descriptions and business data about all types of records and to search and retrieve their contents.
- Supporting records management functions such as scheduling, appraisal, description and requests to transfer custody.
- Preserving records in the formats in which they were received, as well as creating backup copies for off-site storage.
To that end, the system is a mix of off-the-shelf and custom-built components, based on a service-oriented architecture and incorporating Oracle Corp.'s database technology, EMC Corp.'s Documentum for records management, search technology and a Web-based front end. It also incorporates a hierarchical storage system from Hitachi Ltd. that blends servers from EMC, Hitachi and Sun, as well as the Hitachi Content Archive Platform, which automatically indexes records as they enter the system, enabling immediate search capability.
The first glitch with the system was a missed deadline in September 2007 by Lockheed Martin Corp., which NARA had contracted to build the system (see timeline at the end of this story). Thibodeau says this occurred in part because shortly after awarding the contract to Lockheed, NARA discovered it needed to cut the budget in half, which resulted in rescoping the system's initial capabilities. This effort took the better part of a year, according to Thibodeau, as well as the time and attention of Lockheed engineering management.
To speed things up, NARA and Lockheed also decided to use a two-pronged approach to developing the system. In this approach, the first prong -- or the base system, which was completed in June -- manages record schedules, requests record transfers and stores records. NARA plans to beta-test this system for a year, working with just four agencies from which it accepts records. So far, Thibodeau says, there have been 16 records transfers. Other functions, such as the ability to automatically inspect and appraise records, were deferred to later increments.
The second prong is the system dedicated to the presidential records, originally called the Executive Office of the President (EOP) system, and now referred to as Search and Access ERA. This system is being developed in parallel with the base system, and the two will be merged as originally envisioned by 2011.
Testing was completed in early November, Thibodeau says, although security testing is still ongoing. "There were no show-stoppers, so we're optimistic that we'll turn it on in December before the onslaught in January," he says. Other functionality will continue to be built through 2011. If NARA is awarded its next appropriation of money, it expects to build the public-access capability within a year, Thibodeau says.
Other slowdown factors
Adam Jansen, president of Dkives Consulting in Spokane, Wash., agrees that the phased approach is the way to go. Formerly the digital archivist for the state of Washington, Jansen built an electronic records archiving system for the state that serves 750 users and stores 75 million records -- from 150-year-old census books to e-mails accumulated over the most recent governor's eight-year term. The system stores a million Web pages from 400-plus agencies, and the state is about to release several hundred hours' worth of searchable full text, digital audio and tape of legislative committee hearings.
NARA's effort is "a hugely ambitious project, and it's very difficult to bite off that big of a chunk all at once," he says. While with Washington's state government, Jansen says, his team started with a few types of records and expanded from there. In four years, he says, the system went through three distinct iterations, with tweaking and reinventing along the way, especially when it came to ingesting records.
As for the ERA's bumpy history, Jansen also faults government bureaucracy and NARA's failure to seek out advice from others who had implemented such systems. "There were people who'd done research, and I'm not sure the lessons learned were researched and taken to heart," he says. "Having run a program similar to this for five years in Washington, I got almost no interaction with them despite efforts to do so."
But Thibodeau says when the project began in 1998, there was little information available. At that time, it took the agency two years just to research the feasibility of developing such a system, and it created a program management office to support it. "The biggest system we'd acquired before this was under $10 million," he says. "When you're doing something over $100 million, it's much more complicated, so we wanted to make sure we were competent to do it."
NARA also dedicated three years to eliciting and validating requirements, which culminated in inviting both the IT industry and the general public to comment on the requirements, he says. During this phase, NARA also organized two conferences, one for prospective users and another for industry, to discuss its plans and get feedback. "We wanted there to be no question of what we were building," Thibodeau says, claiming there have been no changes to the requirements over the course of the system build.
Countdown to January
Logan says the problem of managing electronic records won't be resolved until the government agencies themselves do a better job of electronic records management, including classifying, de-duplicating and purging data through the use of systems such as archiving, records and policy management, content monitoring/filtering, and content analytics tools.
Right now, she says, it's too easy to just keep buying more storage and keeping everything, and what's important to keep is intertwined with what's trivial. Not to mention that with no clear guidance or policy on data handling, she says, there's the risk of political appointees in outgoing administrations shredding data rather than turning it over.
"We've created a huge volume of stuff, and it's going to be impossible to sort it with any level of precision," she says. "The longer it sits around, the more you lose context and run the chance that the data formats will become extinct. I think the result will be a great loss of information for the future." Logan apologizes for seeming so pessimistic, "but I've been covering this for nine years, and the progress has been minimal."