Data.gov gets an open-source revamp
The U.S. government’s portal for the data it creates, Data.gov, is getting a revamp that should make it easier to view and reuse government data.
The update should also help federal agencies comply with a White House executive order issued in May to make government data machine-readable by default.
The beta version of the site, now available for user testing at the subdomain Next.Data.gov, features more visualization of government data, an expanded section for communities of interest, and a stream of examples of government data usage by third parties.
“It looks different, and it is exciting that they are pulling in more information about how data is used and how people are talking about” government data, said John Wonderlich, policy director for the Sunlight Foundation, a nonprofit organization that seeks to foster greater government openness and transparency through the use of the Internet. “The first look is encouraging.”
An early initiative from the Obama administration, Data.gov was launched in 2009 as a way to collect and provide a portal for data sets created by U.S. federal agencies, so they can be viewed and reused by the public.
In much the same way that the Defense Department’s GPS (Global Positioning System) data has fueled the growth of geolocation-based businesses, so too should these additional government data sets generate new businesses, President Barack Obama has argued.
The site’s popularity has been growing steadily. In May, it received 213,000 visitors, more than twice the number it received in May 2012.
Data.gov’s challenge is to ensure that “as much data as possible ends up there and that agencies take seriously the requirement that they are open with their information,” Wonderlich said.
The White House charged its Office of Management and Budget (OMB) with developing the site. The OMB then had the White House’s Office of Science and Technology Policy (OSTP) oversee the project. The General Services Administration (GSA) manages the operations and development of Data.gov.
For the update, “The team studied the usage patterns on Data.gov and found that visitors were hungry for examples of how data are used,” wrote Nick Sinai, U.S. deputy chief technology officer, and Ryan Panchadsaram, senior adviser to the U.S. chief technology officer, in a co-bylined blog post announcing the update.
The site will include a stream of blog posts, tweets, quotes and other sources showing how people and organizations are using government data feeds. “It certainly helps the Data.gov brand to have people understand how data is being used in the world,” Wonderlich said.
For the site, the Data.gov development team will rely heavily on open-source software. The redesigned site will use the Apache Solr search server software to improve the site’s search capabilities. Agencies that post their metadata in the Common Core Metadata Schema will have their data sets indexed by Data.gov.
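As a rough illustration of what a metadata check before indexing could look like, the following Python sketch tests a record for a handful of fields. The field list, the sample record, and the function name missing_fields are assumptions made for this example; they are not taken from Data.gov or from the schema's official documentation, which should be consulted for the actual requirements.

```python
import json

# Assumed list of required fields, chosen for illustration only.
# The authoritative list lives in the schema's own documentation.
REQUIRED_FIELDS = (
    "title",
    "description",
    "keyword",
    "modified",
    "publisher",
    "identifier",
    "accessLevel",
)


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Hypothetical agency record; every value here is invented for the example.
sample = json.loads("""
{
  "title": "Example Air Quality Readings",
  "description": "Hourly readings from hypothetical monitoring stations.",
  "keyword": ["air", "environment"],
  "modified": "2013-07-01",
  "publisher": "Example Agency",
  "identifier": "example-agency-0001"
}
""")

# Reports which of the assumed required fields the sample lacks.
print(missing_fields(sample))
```

A record that passes such a check would still need to be published where an indexer can find it; that step is outside the scope of this sketch.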
For the data catalogue, the site will use the CKAN (Comprehensive Knowledge Archive Network) data management platform. For content and the community sections, Data.gov will use the WordPress content management system.
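For readers who want to reuse catalogue data programmatically, CKAN exposes a JSON "action" API that includes a package_search call. The Python sketch below only builds a search URL and reads dataset titles out of a response of the usual CKAN shape; it makes no network request. The host name catalog.example.gov, the helper names, and the sample response are placeholders for illustration, not details confirmed by the Data.gov team.

```python
from urllib.parse import urlencode


def package_search_url(base_url, query, rows=10):
    """Build a CKAN package_search URL for a free-text query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response):
    """Extract dataset titles from a decoded package_search response."""
    if not response.get("success"):
        return []
    return [pkg.get("title", "") for pkg in response["result"]["results"]]


# Placeholder host; substitute the address of a real CKAN instance.
url = package_search_url("https://catalog.example.gov/", "air quality", rows=5)
print(url)

# Invented response in the shape CKAN returns for package_search.
fake_response = {
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"title": "Example Air Quality Readings"},
            {"title": "Example Emissions Inventory"},
        ],
    },
}
print(dataset_titles(fake_response))
```

Keeping URL construction separate from response parsing lets either half be tested without contacting a live server.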
The use of open source is a “reassuring sign” that Data.gov is moving further in line with the White House’s preference to maximize the use of open-source software, Wonderlich said.