Sunlight Foundation

 

Making Government Transparent and Accountable

The Sunlight Foundation uses cutting-edge technology and ideas to make government transparent and accountable. Underlying all of our efforts is a fundamental belief that increased transparency will improve the public's confidence in government.

 

The Sunlight Foundation Blog

  • Improvements Needed For High Value Datasets On Data.gov

    This morning Sunlight and a number of other organizations (POGO, OMB Watch, CREW, National Security Archive, the Center for Democracy and Technology and the Open The Government coalition) sent a letter to Vivek Kundra, Federal CIO, about improvements needed to the release of High Value Datasets on Data.gov. Here are the core recommendations included. Please tell us what you think in the comments below.

    As advocates for government openness, we support the Administration’s efforts to provide the public with access to information through Data.gov. We are eager to work with you to ensure the success of Data.gov and, in that spirit, write to raise our concerns with the datasets submitted by agencies to fulfill their requirement under the Open Government Directive to post three high value datasets by January 22, and to offer constructive suggestions for improving their usefulness.

    As an overall recommendation, we urge you to add public representatives to the Open Government Initiative interagency working committee and ask the committee to address the problems and recommendations identified below.

    Release Format and Usability by the Public

    We understand one of the primary purposes of Data.gov is to enable the technology community and transparency advocates to most effectively use the data to make a direct impact on the daily lives of the American people. The format of the data plays a key role in its usability; many within the community of advocates who re-use and repackage government data would prefer data in CSV format, rather than the XML format in which many of the posted databases are provided. Accordingly, we recommend that you strike an appropriate balance between formats (such as XML) that serve the coding community and web-based presentations by agencies that can be used and understood by the general public.

    In addition, some of the currently posted files are quite large, ranging upward to several hundred megabytes. Their large size undermines their usefulness for most people or organizations. The large number of currently posted datasets also makes it difficult to find a particular database of interest. We therefore recommend that if a Data.gov dataset is available from an agency through a web-based interface, Data.gov link to that interface on the dataset’s Data.gov landing page. For a consumer looking for information on a car seat, for example, it would be far easier to search the Department of Transportation’s online database rather than scrolling through screen after screen of raw data in XML format. Additionally, as agencies continue to post datasets to Data.gov, efforts should be made to identify those of greatest public interest that lack such interfaces and develop web interfaces that allow the data to be explored online.

    Further, while we agree there is value in aggregating government data in a single site, it is questionable how much the collocation of the currently posted information on Data.gov actually benefits the public. The site is not searchable by topic and does not provide any way to bring together data from different sources on similar topics.

    As an enhancement to the organization of the site, we recommend that you use tagging or metadata to enable the public to bring together information on a topic. The thesaurus that USA.gov uses provides a useful example of the needed vocabulary.

    Value of Data

    The release of the datasets also has prompted discussions about the value and the quality of the released data, and the additional value provided by access to existing data in a new format. We believe repackaging old information is of marginal value, yet that is what many agencies have done with their recent postings on Data.gov. According to the Sunlight Foundation, of 58 datasets posted by major agencies, only 16 were previously unavailable in some format online. This leaves the impression that agencies posted easily available data, the proverbial low-hanging fruit, rather than seriously considering which of their datasets truly are of high value. While these initial postings can be considered a test run, more attention needs to be directed toward ensuring the overall quality and usefulness of the data.

    In addition, sustained attention should be paid to the possibility of making some of the datasets available as feeds that are constantly up to date, rather than as static datasets that are pulled down and then reposted on an occasional basis. We recommend that agencies be required to explain why the data is high value by having them designate which of the “high value criteria” the data meets: information that can be used to increase agency accountability and responsiveness; improve public knowledge of the agency and its operations; further the core mission of the agency; create economic opportunity; or respond to need and demand as identified through public consultation. Similarly, we recommend requiring agencies to indicate whether a high value dataset was previously unavailable, available only with a FOIA request, available only for purchase, or available, but in a less user-friendly format. Going forward, this will make it much easier to track how agencies are complying with the other requirements of the Open Government Directive. While we appreciate the value of data that furthers the mission of an agency, we believe it is equally important to make available to the public data that holds an agency accountable for its policy and spending decisions. We hope to see more datasets of this type available in the near future.

    Quality

    As is to be expected in efforts of this type, there were a number of glitches: datasets that could not be downloaded or, once downloaded, could not be opened (the Central Contractor Registration FOIA extract from the General Services Administration seems to have caused several users problems). Additionally, some datasets were incomplete (the Hazard Mitigation Grant Program data released by FEMA is missing 23 years of data between 1966 and 1989). Even more troubling, some did not have header rows, and for those that did, their Data.gov pages did not always link to code sheets explaining what those header rows meant. Without this information, the data cannot be used.

    We therefore urge the implementation of a responsive feedback mechanism that allows the public to alert an agency that a specific dataset is not working, lacks information, or is missing explanatory material and provides a response to the concerns within a specified time. One way to address this may be to include an agency contact with the ability to resolve any database problems or provide information about the database. The interagency working group could sample the quality of these agency-specific dialogues to ensure that they are having an impact and to develop recommendations on best practices to improve the responsiveness. Additionally, we strongly recommend that all datasets on Data.gov be directly associated with their code sheets.

    Finally, we are concerned with the current lack of public notice when data is removed from the site. We respectfully urge you to note all raw tools and data that are removed from Data.gov, and to provide an explanation for their removal.

    Many of the concerns outlined above apply across all or many of the agencies’ datasets. Accordingly, we think that standards for handling these types of problems can easily be developed by the interagency working group and then disseminated amongst the agencies.

  • This Week in Transparency – July 31, 2009

    Here are some of the more interesting media mentions of Sunlight and our friends and allies over the past week:

    National Journal’s Eliza Newlin Carney wrote about how the health care industry is unleashing big money as the health care debate in Congress intensifies. She notes the blog post from Paul Blumenthal, Sunlight’s senior writer, about how five of Sen. Max Baucus’ (Mont.) former staff members now work for a total of 27 different organizations that either represent the health care or insurance industries, or have a vested interest in the debate. She also quotes Paul: “We thought it was important to show the public that the senators aren’t crafting the policy by themselves. They have all these other connections, through relationships, that have a huge stake in this legislation.” Trudy Lieberman with the Columbia Journalism Review also highlighted and linked to Paul’s post and the graphic he and Kerry Mitchell, Sunlight’s creative director, produced. The “study shows exactly what advocates of real and substantive health reform are up against,” Lieberman wrote, adding that Sunlight provides clarity on just who has the senator’s ear.

    Speaking of Kerry’s graphic art skills, The New York Times’ “First Look” blog includes one of his illustrations in a post highlighting great visualizations created by designers using the Times APIs that “both beautify and clarify information.” Kerry’s graphic illustrates the Times’ usage of the word “transparency” since 1990.

    David Talbot at MIT’s Technology Review, in an article on how volunteers are using the Web to help make the U.S. government more accountable, highlighted Transparency Corps. Talbot quoted Clay Johnson, director of Sunlight Labs: “Government puts out a ton of data that is really interesting about what it does, but people can’t understand it.” The launch of Transparency Corps roughly coincided with the launch earlier this month of the White House’s IT Dashboard, the administration’s effort to chart the progress of information-technology projects in various federal agencies. The article quotes Andrew Rasiej, Sunlight’s senior technology advisor and co-founder of Personal Democracy Forum, saying the dashboard may be just the tip of the iceberg heralding a new age of transparency regarding federal spending. “Once people get used to this type of information being so readily accessible, they will demand to see (it) for all other federal spending too, and then the genie will be completely out of the bottle.”

    Dan Eggen at The Washington Post wrote how the debate about health-care reform has been a boon to the political fortunes of the 52 members of the Blue Dog Coalition, who have become key brokers in shaping legislation in the House. Eggen used Party Time data to show how U.S. Rep. Mike Ross (Ark.), a leader of the Blue Dogs, has had a steady schedule of fundraising events sponsored by the health industry or lobbying firms that represent health-care companies. Eggen also used data from the Center for Responsive Politics that showed Ross had received nearly $1 million in contributions from the health-care sector and insurance industry during his five terms in Congress. On the topic of Party Time, be sure not to miss National Journal’s interview with Party Time’s director Nancy Watzman.

    (Continue reading…)

  • This Week in Transparency – July 17, 2009

    Here are a few of the more interesting media mentions of Sunlight and our friends and allies from the week:

    Jeff Jacoby, columnist for The Boston Globe, mentioned ReadTheBill.org in a piece he wrote calling on congressional lawmakers to read legislation before they vote on it. Glenn Reynolds, at his Instapundit blog, linked to Jacoby’s column. Andrew Sullivan’s blog, The Daily Dish, followed by linking to Reynolds.

    In Washington Monthly’s July/August edition, Charles Homans wrote about the Obama administration’s “experiments with data-driven democracy.” The article centers on the work of Vivek Kundra, the White House’s chief information officer, and mentions both the District of Columbia’s Apps for Democracy contest and Sunlight’s Apps for America contest. Homans quotes Clay Johnson, Sunlight Labs’ director, saying Kundra has his work cut out for him. “I have nothing but respect for what he’s trying to do. But it’s a hard job, and it’s going to take some time for this to actually happen right. I mean years.” While discussing Kundra’s launch of Data.gov, Homans again quotes Clay, “The top data source is on the world’s copper smelters, which isn’t going to tell us very much about what’s going on inside of our government.”

    As Ellen Miller, Sunlight’s director, wrote earlier this week, “When it comes to following the money that’s flowing to power on Capitol Hill, no one does it better than the Center for Responsive Politics.” For instance, MAPLight.org used CRP data to show how money watered down the energy bill, the American Clean Energy and Security Act of 2009 (HR 2454). With Congress debating health care reform, Forbes used CRP data to show how America’s Health Insurance Plans, the political advocacy and trade group for the health insurance industry, has spent nearly $10 million on lobbying Congress in the past two years. Robert J. S. Ross, writing at The Huffington Post, quotes CRP about how the insurance industry has contributed $568 million to political campaigns since 1998. CNN’s Jonathan Mann used CRP data in noting how doctors have spent roughly two-thirds of a billion dollars lobbying lawmakers in the last 10 years.

    (Continue reading…)

  • This Week In Transparency – June 26, 2009

    Here are a few of the more interesting media mentions of Sunlight and our friends and allies from the week:

    CNN interviewed Ellen Miller, Sunlight’s executive director, in an article on lobbyists and the need for disclosure of their interactions with congressional lawmakers and other federal officials.

    Katharine Q. Seelye at The New York Times reported on the fact that, five months into his administration, President Obama has signed two dozen bills, but he has almost never waited the five days, as he promised during his election campaign. She noted how open government and other watchdog groups have criticized the president for not living up to his pledge. Seelye quotes Ellen as saying it’s less important for the president to wait before signing a bill than it is for the Congress to wait 72 hours before voting on it. “There isn’t anybody in this town who doesn’t know that commenting after a bill has been passed is meaningless.” The article also has an accompanying video.

    Politico’s Victoria McGrane reported on how the Senate is considering putting all of its office expenses, including staff salaries, online, as well as requiring campaign fundraising reports to be published on the Web. The mere fact that the Senate leadership has conducted a whip count is an encouraging sign for the reforms’ passage, McGrane writes. And she quotes Sunlight’s Lisa Rosenberg: “They wouldn’t be talking about bringing it up for a vote if it wasn’t pretty solid.”

    The Washington Examiner reports on Citizens for Responsibility and Ethics in Washington calling on the Obama administration to release the names of health care executives who have visited the White House. “If you are going to criticize other people for secrecy, you better have an open door,” said Melanie Sloan, CREW’s executive director. “They talk about transparency more than they exhibit it.”

    (Continue reading…)

  • This Week In Transparency – June 12, 2009

    Here are a few of the more interesting media mentions of Sunlight and our friends and grantees from this past week:

    Federal law prohibits lobbyists and those that hire them from giving gifts or campaign contributions to congressional lawmakers. No such law exists prohibiting them from spending unlimited amounts to honor lawmakers or contributing to non-profits connected to them. Quite a fine distinction, if you ask me. However, Congress passed ethics rules in 2007 requiring lobbyists, for the first time, to report all such payments. On Monday, USA Today’s Fredreka Schouten and Paul Overberg reported on the paper’s comprehensive analysis of lobbying reports that found 2,759 payments, totaling $35.8 million, were made in 2008. They quote Ellen Miller, Sunlight’s executive director: “It’s another example of the many pockets of a politician’s coat.” The spending amounts to an “end-run” around campaign-finance laws “that are designed to limit the appearance of undue influence.”

    (Continue reading…)

  • Weekly Media Roundup – May 15, 2009

    Here are a few of the more interesting media mentions of Sunlight and our friends and grantees from this week:

    Saturday evening, Ellen Miller, Sunlight’s executive director, appeared on CNN talking about Recovery.gov. She made the point that Recovery.gov needs to be updated in real time so people can keep government accountable as it happens, instead of after the fact.

    The New York Times published an editorial calling for Congress to provide Congressional Research Service reports online for all Americans to access for free. The Times ran the editorial a week after Ellen met with an editorial writer at the paper. Last week, The Times published an article about the campaign being waged by Open CRS, a project of the Center for Democracy and Technology, OpenTheGovernment.org and Sunlight to get Congress to agree to release all CRS reports to the public.

    (Continue reading…)

  • Weekly Media Roundup – May 8, 2009

    Today, May 8th, marks the 125th birthday of Harry S Truman, our 33rd president. He once said, “Secrecy and a free, democratic government don’t mix.” Amen, Mr. President.

    Here are a few of the more interesting media mentions of Sunlight and our friends and grantees from this week:

    Monday morning, Tom Lee, a technology director at Sunlight, appeared on C-SPAN’s “Washington Journal” taking questions about Recovery.gov, the Web site set up to track spending under the federal government’s economic stimulus program. Tom is working on SubsidyScope, a project of The Pew Charitable Trusts, which looks at the role of federal subsidies in the economy.

    Speaking of Recovery.gov, Matt Kelley with USA Today reported that the Web site won’t have details on contracts and grants until October and may not be complete until next spring — halfway through the program. Kelley quotes Greg Elin, Sunlight’s chief evangelist, saying people accustomed to getting easily searchable information quickly could be frustrated. “If we have to wait until October to get the information or to the end of the year to get a powerful recovery.gov site, the Obama administration will have missed an important opportunity.”

    (Continue reading…)

  • S. Res. 118 – Free CRS Reports

    Once again, Sen. Joe Lieberman (Conn.) has introduced a resolution in the Senate to put non-confidential Congressional Research Service (CRS) reports online. Heather West at the Center for Democracy & Technology’s “PolicyBeta” blog writes that a solid bi-partisan group of senators has joined Lieberman as co-sponsors. S. Res. 118 is a Senate resolution, which means the Senate Rules Committee and an overall Senate vote are all that’s needed to open the reports to the public, who paid for them to be produced in the first place.

    CRS is a $100 million “think tank” housed in the Library of Congress that researches and writes reports for Congressional lawmakers and their staff on current topics. The reports include serious and smart analysis and are well worth reading if you are interested in the hot issues of the day. These reports exist on an internal server on the Hill, but the public is denied access to them. The only way you can get them is by calling a lawmaker’s office and requesting a copy. (Of course, how do you know to ask about a report if its existence isn’t publicly listed someplace? A classic Washington Catch-22.)

    (Continue reading…)

  • New Bill to Make CRS Reports Widely Available

    Yesterday, Sen. Joe Lieberman introduced a resolution (S. Res. 118), with a bipartisan cast of cosponsors, to allow for the public release of Congressional Research Service (CRS) reports. CRS reports are some of the best research documents in the nation and are currently used by lawmakers and their staff to inform their decisions and help in crafting legislation. Currently, CRS reports are not supposed to be released to the public; however, some web sites collect them from lawmakers’ offices that distribute them anonymously. Many of these sites are pay sites, save for Open CRS, which is operated by the Center for Democracy and Technology (CDT).

    CDT lauds the new Lieberman resolution in a blog post:

    The public can also purchase copies of the reports from CRS report resellers, but obtaining copies of all the reports that are relevant would cost a great deal of money for reports that are entirely taxpayer funded in the first place.

    Senate Resolution 118 would change that by allowing lawmakers to provide access to CRS services to the public on official websites. Rather than creating a new tool for public access, the resolution would let Members and Committees share reports with the public using the same online services that are available on Congress’ internal CRS website.

    Critically, the new resolution also requires that an index of CRS issue briefs and reports be made public. Currently, Open CRS receives updates on reports as they are published from an anonymous lawmaker, but a public index of reports would simplify this process. It would be simple to provide this index, and to let the public know what their lawmakers are reading, and for them to read it too. It is high time for an officially sanctioned, free way to distribute the reports to the people.

    This is a resolution that deserves strong support. The free release of CRS reports has always been a top priority of The Open House Project.

  • Show Us The Data: Most Wanted Federal Documents

    On the occasion of Sunlight Week, our colleagues (and grantees) at the Center for Democracy and Technology (CDT) and OpenTheGovernment.org are releasing “Show Us The Data: Most Wanted Federal Documents” (PDF), a report based on the results of a survey funded by Sunlight and a Web-based collaborative tool created by Sunlight Labs. It cites documents and data that the federal government should make easier to find and to use, and recommends policy changes to make government more open.

    Similar reports OTG and CDT have compiled in the past have shown national security concerns lead to too much secrecy. But not any more. Here’s a list of the top 10 most wanted government documents, according to the survey:

    1. Public Access to All Congressional Research Service Reports
    2. Information About the Use of TARP and Bailout Funds
    3. Open and Accessible Federal Court Documents Through the PACER System
    4. Current Contractor Projects
    5. Court Settlements Involving Federal Agencies
    6. Access to Comprehensive Information About Legislation and Congressional Actions via THOMAS or Public Access to Legislative Information Service
    7. Online Access to Electronic Campaign Disclosures
    8. Daily Schedules of the President and Cabinet Officials
    9. Personal Financial Disclosures from Policymakers Across Government
    10. State Medicaid Plans and Waivers

    Those involved in writing the report include Patrice McDermott and Amy Fuller of OpenTheGovernment.org and Ari Schwartz, Jud Watkins, and Heather West of CDT. Sunlight’s own Bill Allison, Ali Felski, James Turk and Clay Johnson lent their hands in making it all happen, as well.

    Check it out!