December 06 2012

The United States (Code) is on Github

When Congress launched Congress.gov in beta, they didn’t open the data. This fall, a trio of open government developers took it upon themselves to do what custodians of the U.S. Code and laws in the Library of Congress could have done years ago: publish data and scrapers for legislation in Congress from THOMAS.gov in the public domain. The data at github.com/unitedstates is published using an “unlicense” and updated nightly. Credit for releasing this data to the public goes to Sunlight Foundation developer Eric Mill, GovTrack.us founder Josh Tauberer and New York Times developer Derek Willis.

“It would be fantastic if the relevant bodies published this data themselves and made these datasets and scrapers unnecessary,” said Mill, in an email interview. “It would increase the information’s accuracy and timeliness, and probably its breadth. It would certainly save us a lot of work! Until that time, I hope that our approach to this data, based on the joint experience of developers who have each worked with it for years, can model to government what developers who aim to serve the public are actually looking for online.”

If the People’s House is going to become a platform for the people, it will need to release its data to the people. If Congressional leaders want THOMAS.gov to be a platform for members of Congress, legislative staff, civic developers and media, the Library of Congress will need to release structured legislative data. THOMAS is also not updated in real-time, which means that there will continue to be a lag between a bill’s introduction and the nation’s ability to read the bill before a vote.

Until that happens, however, this combination of scraping and open source data publishing offers a way forward for releasing Congressional data to the public, wrote Willis, on his personal blog:

Two years ago, there was a round of blog posts touched off by Clay Johnson that asked, “Why shouldn’t there be a GitHub for data?” My own view at the time was that availability of the data wasn’t as much an issue as smart usage and documentation of it: ‘We need to import, prune, massage, convert. It’s how we learn.’

Turns out that GitHub actually makes this easier, and I’ve had a conversion of sorts to the idea of putting data in version control systems that make it easier to view, download and report issues with data … I’m excited to see this repository grow to include not only other congressional information from THOMAS and the new Congress.gov site, but also related data from other sources. That this is already happening only shows me that for common government data this is a great way to go.

In the future, legislation data could be used to show iterations of laws and improve the ability of communities at OpenCongress, POPVOX or CrunchGov to discover and discuss proposals. As Congress incorporates more tablets on the floor during debates, such data could also be used to update legislative dashboards.

The choice to use Github as a platform for government data and scraper code is another significant milestone in a breakout year for Github’s use in government. In January, the British government committed GOV.UK code to Github. NASA, after contributing its first code in January, added 11 code repositories this year. In August, the White House committed code to Github. In September, the Open Gov Foundation open sourced the MADISON crowdsourced legislation platform.

The choice to use Github for this scraper and legislative data, however, presents a new and interesting iteration in the site’s open source story.

“Github is a great fit for this because it’s neutral ground and it’s a welcoming environment for other potential contributors,” wrote Sunlight Labs director Tom Lee, in an email. “Sunlight expects to invest substantial resources in maintaining and improving this codebase, but it’s not ours: we think the data made available by this code belongs to every American. Consequently the project needed to embrace a form that ensures that it will continue to exist, and be free of encumbrances, in a way that’s not dependent on any one organization’s fortunes.”

Mill, an open government developer at Sunlight Labs, shared more perspective in the rest of our email interview, below.

Is this based on the GovTrack.us scraper?

Eric Mill: All three of us have contributed at least one code change to our new THOMAS scraper; the majority of the code was written by me. Some of the code has been taken or adapted from Josh’s work.

The scraper that currently actively populates the information on GovTrack is an older Perl-based scraper. None of that code was used directly in this project. Josh had undertaken an incomplete, experimental rewrite of these scrapers in Python about a year ago (code), but my understanding is it never got to the point of replacing GovTrack’s original Perl scripts.

We used the code from this rewrite in our new scraper, and it was extremely helpful in two ways: providing a roadmap of how THOMAS’ URLs and sitemap work, and parsing meaning out of the text of official actions.

Parsing the meaning out of action text is, I would say, about half the value and work of the project. When you look at a page on GovTrack or OpenCongress and see the timeline of a bill’s life — “Passed House,” “Signed by the President,” etc. — that information is only obtainable by analyzing the order and nature of the sentences of the official actions that THOMAS lists. Sentences are finicky, inconsistent things, and extracting meaning from them is tricky work. Just scraping them out of THOMAS.gov’s HTML is only half the battle. Josh has experience at doing this for GovTrack. The code in which this experience was encapsulated drastically reduced how long it took to create this.
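
To make the idea of pulling a timeline out of action sentences concrete, here is a minimal sketch in Python. It is not the project's code: the regular expressions, the status labels and the sample action sentences are all invented for illustration, and a real rule set would need to cover far more phrasings.

```python
import re

# Illustrative rules only: each pairs a regular expression with a status label.
# Both the wording matched and the labels are assumptions, not the project's own.
RULES = [
    (re.compile(r"\bsigned by (the )?president\b", re.I), "ENACTED:SIGNED"),
    (re.compile(r"\bpassed (the )?house\b|\bpassed/agreed to in house\b", re.I), "PASSED:HOUSE"),
    (re.compile(r"\bpassed (the )?senate\b|\bpassed/agreed to in senate\b", re.I), "PASSED:SENATE"),
    (re.compile(r"\breferred to\b", re.I), "REFERRED"),
]


def classify_action(text):
    """Return the first matching status label, or None if no rule applies."""
    for pattern, label in RULES:
        if pattern.search(text):
            return label
    return None


def bill_timeline(actions):
    """Reduce (date, text) action pairs to the dated milestones we recognize."""
    timeline = []
    for day, text in actions:
        label = classify_action(text)
        if label is not None:
            timeline.append((day, label))
    return timeline


# Invented sample actions, in the order a bill page might list them.
actions = [
    ("2003-01-07", "Referred to the House Committee on Ways and Means."),
    ("2003-03-12", "Motion to reconsider laid on the table."),
    ("2003-03-13", "Passed/agreed to in House: On passage Passed by voice vote."),
    ("2003-05-01", "Signed by President."),
]
timeline = bill_timeline(actions)
```

Procedural sentences that match no rule are simply skipped, which is why the order of the rules and the completeness of the patterns matter so much in practice.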

How long did this take to build?

Eric Mill: Creating the whole scraper, and the accompanying dataset, was about 4 weeks of work on my part. About half of that time was spent actually scraping — reverse engineering THOMAS’ HTML — and the other half was spent creating the necessary framework, documentation, and general level of rigor for this to be a project that the community can invest in and rely on.

There will certainly be more work to come. THOMAS is shutting down in a year, to be replaced by Congress.gov. As Congress.gov grows to have the same level of data as THOMAS, we’ll gradually transition the scraper to use Congress.gov as its data source.

Was this data online before? What’s new?

Eric Mill: All of the data in this project has existed in an open way at GovTrack.us, which has provided bulk data downloads for years. The Sunlight Foundation and OpenCongress have both created applications based on this data, as have many other people and organizations.

This project was undertaken as a collaboration because Josh and I believed that the data was fundamental enough that it should exist in a public, owner-less commons, and that the code to generate it should be in the same place.

There are other benefits, too. Although the source code to GovTrack’s scrapers has been available, it depends on being embedded in GovTrack’s system, and the use of a database server. It was also written in Perl, a language less widely used today, and produced only XML. This new Python scraper has no other dependencies, runs without a database, and generates both JSON and XML. It can be easily extended to output other data formats.
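
As a rough illustration of what producing both formats without a database or third-party packages can look like, the sketch below serializes one record to JSON and to XML using only the Python standard library. The field names and values are hypothetical and do not reflect the project's actual schema.

```python
import json
import xml.etree.ElementTree as ET

# A hypothetical bill record; the keys are assumptions made for this example.
bill = {
    "bill_id": "hr1234-112",
    "bill_type": "hr",
    "number": "1234",
    "congress": "112",
    "status": "REFERRED",
}


def to_json(record):
    """Serialize the record as indented JSON with stable key order."""
    return json.dumps(record, indent=2, sort_keys=True)


def to_xml(record, root_tag="bill"):
    """Serialize the same record as flat XML, one child element per field."""
    root = ET.Element(root_tag)
    for key, value in sorted(record.items()):
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


json_text = to_json(bill)
xml_text = to_xml(bill)
```

Because both functions start from the same in-memory record, adding another output format would mean adding one more function rather than touching the scraping logic.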

Finally, everyone who worked on the project has had experience in dealing with legislative information. We were able to use that to make various improvements to how the data is structured and presented that make it easier for developers to use the data quickly and connect it to other data sources.

Searches for bills in Scout use data collected directly from this scraper. What else are people doing with the data?

Eric Mill: Right now, I only know for a fact that the Sunlight Foundation is using the data. GovTrack recently sent an email to its developer list announcing that in the near future, its existing dataset would be deprecated in favor of this new one, so the data should be used in GovTrack before long.

Pleasantly, I’ve found nearly nothing new by switching from GovTrack’s original dataset to this one. GovTrack’s data has always had a high level of quality. So far, the new dataset looks to be as good.

Is it common to host open data on Github?

Eric Mill: Not really. Github’s not designed for large-scale data hosting. This is an experiment to see whether this is a useful place to host it. The primary benefit is that no single person or organization (besides Github) is paying for download bandwidth.

The data is published as a convenience, for people to quickly download for analysis or curiosity. I expect that any person or project that intends to integrate the data into their work on an ongoing basis will do so by using the scraper, not downloading the data repeatedly from Github. It’s not our intent that anyone make their project dependent on the Github download links.

Laudably, Josh Tauberer donated his legislator dataset and converted it to YAML. What’s YAML?

Eric Mill: YAML is a lightweight data format intended to be easy for humans to both read and write. This dataset, unlike the one scraped from THOMAS, is maintained mostly through manual effort. Therefore, the data itself needs to be in source control, it needs to not be scary to look at and it needs to be obvious how to fix or improve it.

What’s in this legislator dataset? What can be done with it?

Eric Mill: The legislator dataset contains information about members of Congress from 1789 to the present day. It is a wealth of vital data for anyone doing any sort of application or analysis of members of Congress. This includes a breakdown of their name, a crosswalk of identifiers on other services, and social media accounts. Crucially, it also includes a member of Congress’ change in party, chamber, and name over time.

For example, it’s a pretty necessary companion to the dataset that our scraper gathers from THOMAS. THOMAS tells you the name of the person who sponsored this bill in 2003, and gives you a THOMAS-specific ID number. But it doesn’t tell you what that person’s party was at the time, or if the person is still a member of the same chamber now as they were in 2003 (or whether they’re in office at all). So if you want to say “how many Republicans sponsored bills in 2003,” or if you’d like to draw in information from outside sources, such as campaign finance information, you will need a dataset like the one that’s been publicly donated here.
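
The question of how many Republicans sponsored bills in 2003 can be sketched as a join between bill records and dated legislator terms. The Python below uses made-up people, IDs and bills, and its record layout is only an assumption about what such a dataset might look like, not the published format.

```python
from datetime import date

# Hypothetical legislator records: an ID crosswalk plus a list of dated terms.
LEGISLATORS = [
    {
        "id": {"thomas": "00001"},
        "name": "Example Member A",
        "terms": [
            {"type": "rep", "start": date(2001, 1, 3), "end": date(2005, 1, 3), "party": "Republican"},
            {"type": "sen", "start": date(2005, 1, 4), "end": date(2011, 1, 3), "party": "Independent"},
        ],
    },
    {
        "id": {"thomas": "00002"},
        "name": "Example Member B",
        "terms": [
            {"type": "sen", "start": date(2001, 1, 3), "end": date(2007, 1, 3), "party": "Democrat"},
        ],
    },
]

# Hypothetical bill records carrying only a THOMAS-style sponsor ID.
BILLS = [
    {"bill_id": "hr1", "sponsor_thomas_id": "00001", "introduced": date(2003, 2, 1)},
    {"bill_id": "hr2", "sponsor_thomas_id": "00001", "introduced": date(2003, 6, 1)},
    {"bill_id": "s1", "sponsor_thomas_id": "00002", "introduced": date(2003, 3, 1)},
    {"bill_id": "s2", "sponsor_thomas_id": "00001", "introduced": date(2006, 2, 1)},
]

BY_THOMAS_ID = {person["id"]["thomas"]: person for person in LEGISLATORS}


def term_on(person, day):
    """Return the term the person was serving on the given day, if any."""
    for term in person["terms"]:
        if term["start"] <= day <= term["end"]:
            return term
    return None


def sponsors_by_party(bills, year):
    """Count distinct sponsors per party, using each sponsor's party at introduction."""
    sponsors = {}
    for bill in bills:
        if bill["introduced"].year != year:
            continue
        person = BY_THOMAS_ID.get(bill["sponsor_thomas_id"])
        term = term_on(person, bill["introduced"]) if person else None
        party = term["party"] if term else "Unknown"
        sponsors.setdefault(party, set()).add(bill["sponsor_thomas_id"])
    return {party: len(ids) for party, ids in sponsors.items()}


counts_2003 = sponsors_by_party(BILLS, 2003)
```

Looking up the term by date rather than taking a person's latest party is the point of the exercise: Example Member A counts as a Republican for 2003 even though the same record shows a later term under a different party and chamber.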

Sunlight’s API on members of Congress is easily the most prominent API, widely used by people and organizations to build systems that involve legislators. That API’s data is a tiny subset of this new one.

You moved a legal citation and extractor into this code. What do they do here?

Eric Mill: The legal citation extractor, called “Citation,” plucks references to the US Code (and other things) out of text. Just about any system that deals with legal documents benefits from discovering links between those documents. For example, I use this project to power US Code searches on Scout, so that the site returns results that cite some piece of the law, regardless of how that citation is formatted. There’s no text-based search, simple or advanced, that would bring back results matching a variety of formats or matching subsections — something dedicated to the arcane craft of citation formats is required.
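
To show why a dedicated extractor beats plain text search, here is a small Python sketch that normalizes two common ways of citing the US Code into one canonical form. It is written for this article and is not the code of the Citation tool; the two patterns and the canonical output string are assumptions, and real citation formats are far more varied.

```python
import re

SUBSECTIONS = r"(?P<subs>(?:\([0-9a-zA-Z]+\))*)"

# Two illustrative citation shapes (assumed for this example).
PATTERNS = [
    # "5 U.S.C. 552(a)(1)" or "5 USC § 552"
    re.compile(r"(?P<title>\d+)\s+U\.?S\.?C\.?\s+(?:§+\s*)?(?P<section>\d+[a-z]?)" + SUBSECTIONS),
    # "section 552(a)(1) of title 5"
    re.compile(r"[Ss]ection\s+(?P<section>\d+[a-z]?)" + SUBSECTIONS + r"\s+of\s+title\s+(?P<title>\d+)"),
]


def extract_citations(text):
    """Return canonical 'TITLE USC SECTION(subs)' strings in order of appearance."""
    found = []
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            canonical = "{} USC {}{}".format(
                match.group("title"), match.group("section"), match.group("subs")
            )
            found.append((match.start(), canonical))
    found.sort()  # order by position in the text
    return [canonical for _, canonical in found]


sample = (
    "As required by 5 U.S.C. 552(a)(1) and section 1983 of title 42, "
    "and also 5 USC § 552."
)
citations = extract_citations(sample)
```

Once every format maps to the same canonical string, a search for one section of the law can return documents regardless of how each one happened to write the citation.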

The citation extractor is built to be easy for others to invest in. It’s a stand-alone tool that can be used through the command line, HTTP, or directly through JavaScript. This makes it suitable for the front-end or back-end, and easy to integrate into a project written in any language. It’s very far from complete, but even now it’s already proven extremely useful at creating powerful features for us that weren’t possible before.

The parser for the U.S. Code itself is a dataset, written by my colleague Thom Neale. The U.S. Code is published by the government in various formats, but none of them are suitable for easy reuse. The Office of Law Revision Counsel, which publishes the U.S. Code, is planning on producing a dedicated XML version of the US Code, but they only began the procurement process recently. It could be quite some time before it appears.

Thom’s work parses the “locator code” form of the data, which is a binary format designed for telling GPO’s typesetting machines how to print documents. It is very specialized and very complicated. This parser is still in an early stage and not in use in production anywhere yet. When it’s ready, it’ll produce reliable JSON files containing the law of the United States in a sensible, reusable form.

Does Github’s organization structure make a data commons possible?

Eric Mill: Github deliberately aligns its interests with the open source community, so it is possible to host all of our code and data there for free. Github offers unlimited public repositories, collaborators, bandwidth, and disk space to organizations and users at no charge. They do this while being an extremely successful, profitable business.

On Github, there are two types of accounts: users and organizations. Organizations are independent entities, but no one has to log in as an organization or share a password. Instead, at least one user will be marked as the “owner” of an organization. Ownership can easily change hands or be distributed amongst various users. This means that Josh, Derek, and I can all have equal ownership of the “unitedstates” repositories and data. Any of us can extend that ownership to anyone we want in a simple, secure way, without password sharing.

Github as a company has established both a space and a culture that values the commons. All software development work, from hobbyist to non-profit to corporation, from web to mobile to enterprise, benefits from a foundation of open source code. Github is the best living example of this truth, so it’s not surprising to me that it was the best fit for our work.

Why is this important to the public?

Eric Mill: The work and artifacts of our government should be available in bulk, for easy download, in accessible formats, and without license restrictions. This is a principle that may sound important and obvious to every technologist out there, but it’s rarely the case in practice. When it is, the bag is usually mixed. Not every member of the public will be able or want to interact directly with our data or scrapers. That’s fine. Developers are the force multipliers of public information. Every citizen can benefit somehow from what a developer can build with government information.

September 20 2012

Congress launches Congress.gov in beta, doesn’t open the data

The Library of Congress is now more responsive — at least when it comes to web design. Today, the nation’s repository for its laws launched a new beta website at Congress.gov and announced that it would eventually replace Thomas.gov, the 17-year-old website that represented one of the first significant forays online for Congress. The new website will educate the public looking for information on their mobile devices about the lawmaking process, but it falls short of the full promise of embracing the power of the Internet. (More on that later).

Tapping into a growing trend in government new media, the new Congress.gov features responsive design, adapting to desktop, tablet or smartphone screens. It’s also search-centric, with Boolean search and, in an acknowledgement that most of its visitors show up looking for information, puts a search field front and center in the interface. The site includes member profiles for U.S. Senators and Representatives, with associated legislative work. In a nod to a mainstay of social media and media websites, the new Congress.gov also has a “most viewed bills” list that lets visitors see at a glance what laws or proposals are gathering interest online. (You can download a fact sheet on all the changes as a PDF).

On the one hand, the new Congress.gov is a dramatic update to a site that desperately needed one, particularly in a historic moment where citizens are increasingly connecting to the Internet (and one another) through their mobile devices.

On the other hand, the new Congress.gov beta has yet to realize the potential of Congress publishing bulk open legislative data. There is no application programming interface (API) for open government developers to build upon. In many ways, the new Congress.gov replicates what was already available to the public at sites like Govtrack.us and OpenCongress.org.

In response to my tweets about the site, former law librarian Meg Lulofs Kuhagan (@librarylulu) noted on Twitter that there’s “no data whatsoever, just window dressing” in the new site — but that “it looks good on my phone. More #opengov if you have a smartphone.”

Aaron E. Myers, the director of new media for Senate Majority Leader Harry Reid, commented on Twitter that legislative data is a “tough nut to crack,” with the text of amendments, SCOTUS votes and treaties missing from new Congress.gov. In reply, Chris Carlson, the creative director for the Library of Congress, tweeted that that information is coming soon and that all the data that is currently in Thomas.gov will be available on Congress.gov.

Emi Kolawole, who reviewed the new Congress.gov for the Washington Post, reported that more information, including the categories Myers cited, will be coming to the site soon, during its beta, including the Congressional Record and Index. Here’s hoping that Congress decides to publish all of its valuable Congressional Research Reports, too. Currently, the public has to turn to OpenCRS.com to access that research.

Carlson was justifiably proud of the beta of Congress.gov: “The new site has clean URLs, powerful search, member pages, clean design,” he tweeted. “This will provide access to so many more people who only have a phone for internet.”

While the new Congress.gov is well designed and has the potential to lead to more informed citizens, the choice to build a new website rather than release the data disappointed some open government advocates.

“Another hilarious/clueless misallocation of resources,” commented David Moore, co-founder of OpenCongress. “First liberate bulk open gov data; then open API; then website.”

“What’s noticeable about this evolving beta website, besides the major improvements in how people can search and understand legislative developments, is what’s still missing: public comment on the design process and computer-friendly bulk access to the underlying data,” wrote Daniel Schuman, legislative counsel for the Sunlight Foundation. “We hope that Congress will now deeply engage with the public on the design and specifications process and make sure that legislative information is available in ways that most encourage analysis and reuse.”

Kolawole asked Congressional officials about bulk data access and an API and heard that the capacity is there but the approval is not. “They said the system could handle it, but they haven’t received congressional auth. to do it yet,” she tweeted.

Vision and bipartisan support for open government on this issue does exist among Congressional leadership. There has been progress on this front in the 112th Congress: the U.S. House started publishing machine-readable legislative data at docs.house.gov this past January.

“Making legislative data easily available in machine-readable formats is a big victory for open government, and another example of the new majority keeping its pledge to make Congress more open and accountable,” said Speaker of the House John Boehner.

Last December, House Minority Whip Steny Hoyer commented on how technology is affecting Congress, his caucus and open government in the executive branch:

For Congress, there is still a lot of work to be done, and we have a duty to make the legislative process as open and accessible as possible. One thing we could do is make THOMAS.gov — where people go to research legislation from current and previous Congresses — easier to use, and accessible by social media. Imagine if a bill in Congress could tweet its own status.

The data available on THOMAS.gov should be expanded and made easily accessible by third-party systems. Once this happens, developers, like many of you here today, could use legislative data in innovative ways. This will usher in new public-private partnerships that will empower new entrepreneurs who will, in turn, yield benefits to the public sector.

For any of that vision of civic engagement and entrepreneurship to happen around the Web, the Library of Congress will need to fully open up the data. Why hasn’t it happened yet, given bipartisan support and a letter from the Speaker of the House?

techPresident managing editor Nick Judd asked the Library of Congress about Congress.gov. The director of communications for the Library of Congress, Gayle Osterberg, suggested in an email in response that Congress hasn’t been clear about the manner for data release.

“Congress has said what to do on bulk access,” commented Schuman. “See the joint explanatory statement. There is support for bulk access.”

In June 2012, the House’s leadership issued a bipartisan statement that adopted the goal of “provid[ing] bulk access to legislative information to the American people without further delay,” putting releasing bulk data among its “top priorities in the 112th Congress” and directing a task force “to begin its important work immediately.”

The 112th Congress will come to a close soon. The Republicans swept into the House in 2010 promising a new era of innovation and transparency. If Speaker Boehner, Rep. Hoyer and their colleagues want to end these two divisive years on a high note, fully opening legislative data to the People would be an enduring legacy. Congressional leaders will need to work with the Library of Congress to make that happen.

All that being said, the new Congress.gov is in beta and looks dramatically improved. The digital infrastructure of the federal legislative system got a bit better today, moving towards a more adaptive government. Stay tuned, and give the Library of Congress (@LibraryCongress) some feedback: there’s a new button for it on every page.

This post has been updated with comments from Facebook, a link and reporting from techPresident, and a clarification from Daniel Schuman regarding the position of the House of Representatives.

July 25 2012

Mr. Issa logs on from Washington

To update an old proverb for the Information Age, digital politics makes strange bedfellows. In the current polarized atmosphere of Washington, certain issues create more interesting combinations than others.

In that context, it would be an understatement to say that it’s been interesting to watch how Representative Darrell Issa (R-CA) has added his voice to the open government and Internet policy community over the last several years.

Rep. Issa was a key member of the coalition of open government advocates, digital rights advocates, electronic privacy wonks, Internet entrepreneurs, nonprofits, media organizations and congressmen that formed to oppose the passage of the Stop Online Piracy Act (SOPA) and PROTECT IP Act (PIPA) this winter. Rep. Issa strongly opposed SOPA after its introduction last fall and, working with key allies on the U.S. House Judiciary Committee, effectively filibustered its advance by introducing dozens of amendments during the bill’s markup.

The delay created time over Congress’ holiday recess for opposition to SOPA and its companion bill in the Senate (The PROTECT IP Act) to build, culminating in a historic “black out day” on January 18, 2012. Both bills were halted.

While he worked across the aisle on SOPA and PIPA, Rep. Issa has been fiercely partisan in other respects, using his powerful position as the chairman of the U.S. House Oversight and Government Reform Committee to investigate various policy choices and actions of the Obama administration and federal agencies. During the same time period, he’s also become one of the most vocal proponents of open government data and Internet freedom in Congress, from drafting legislation to standardize federal finance data to opposing bills that stood to create uncertainty in the domain name system. He also sponsored the ill-conceived Research Works Act, which expired after receiving fierce criticism from open access advocates.

In recent years, Rep. Issa and his office have used the Web and social media to advance his legislative agenda, demonstrating in the process a willingness to directly engage with citizens and public officials alike on Twitter as @DarrellIssa, even to the extent of going onto Reddit to personally do an “Ask Me Anything.” Regardless of where one stands on his politics, the extent to which he and his staff have embraced using the Web to experiment with more participatory democracy has set an example that perhaps no other member of Congress has matched.

In June 2012, I interviewed Rep. Issa over the phone, covering a broader range of his legislative and oversight work, including the purpose of his new foundation and his views on regulation, open data, and technology policy in general. More context on other political issues, his personal life, business background and political career can be found at his Wikipedia entry and in Ryan Lizza’s New Yorker feature.

Our interview, lightly edited for content and clarity, is broken out into a series of posts that each explore different aspects of the conversation. Below, we talk about open government data and his new “Open Gov Foundation.”

What is the Open Gov Foundation?

In June, Representative Darrell Issa (R-CA) launched an “Open Gov Foundation” at the 2012 Personal Democracy Forum. Rep. Issa said then the foundation would institutionalize the work he’s done while in office, in particular “Project MADISON,” the online legislative markup software that his technology staff and contractors developed and launched after the first Congressional hackathon last December. If you visit the Open Gov Foundation website, you’ll read language about creating “platforms” for government data, from regulatory data to legislative data.

Congressman Issa’s office stated that this Open Gov Foundation will be registered as a non-partisan 501(c)(3) by mid-fall 2012. A year from now, he would like to have made “major headway” on the MADISON project working in a number of different places, not just in the federal House but elsewhere.

For that to happen, MADISON code will almost certainly need to be open sourced, a prospect that the Congressman indicated in our interview is highly likely, and integrated into other open government projects. On that count, Congressman Issa listed a series of organizations that he admired in the context of open government work, including the Sunlight Foundation, Govtrack, public.resource.org, the New York State Senate, OpenCongress and the Open Knowledge Foundation.

The general thrust of his admiration, said the Congressman, comes from the fact that these people are not just working hard to get government data out there, to deliver raw data, but to build things that are useful and that use that government data, helping to build tools that help bridge the gap for citizens.

What do you hope to achieve with the Open Government Foundation?

Rep. Issa: I’ve observed over 12 years that this expression that people use in Congress is actually a truism. And the expression they use is you’re entitled to your opinion but not your facts.

Well, the problem in government is that, in fact, facts seem to be very fungible. People will have their research, somebody will have theirs. Their ability to get raw data in a format where everybody can see it and then reach, if you will, opinions as to what it means, tends to be limited.

The whole goal that I’d like to have, whether it’s rooting out waste and fraud — or honestly knowing what somebody’s proposal is, let’s just say SOPA and PIPA — is [to] get transparency in real-time. Get it directly to any and all consumers, knowing that in some cases, it can be as simple as a Google search by the public. In other cases, there would need to be digesting and analysis, but at least the raw data would be equally available to everyone.

What that does is it eliminates one of the steps that people like Ron Wyden and myself find ourselves in. Ron and I probably reach different conclusions if we’re given the same facts. He will see the part of the cup that is empty and needs government to fill it. And I will see the part that exists only because government isn’t providing all of the answers. But first, we have to have the same set of facts. That’s one of the reasons that a lot of our initiatives absolutely are equally desired by the left and the right, even though once we have the facts, we may reach different conclusions on policy.

Does that mean more bulk data from Congress, which you supported with an amendment to a recent appropriations bill?

Rep. Issa: Let’s say it’s not about the quantity of data; it’s about whether or not there’s meaningful metadata attached to it. If you want to find every penny being spent on breast cancer research, there’s no way to compare different programs, different dollars in different agencies today. And yet, you may want to find that.

What we learned with the control board — or the oversight board that went with the stimulus — was that you’ve got to bring together all of the data if you’re going to find, if you will, people who are doing the same things in different parts of government and not have to find out only forensically after you’ve had rip-off artists rip-off the government.

The other example is on the granting of grants and other programs. That’s what we’re really going for in the DATA Act: to get that level of information that can, in fact, be used across platforms to find like data that becomes meaningful information.

Do you think more open government data will remove some of the asymmetries of information around D.C.?

Rep. Issa: A lot of people have monetized the compiling of data in addition to monetizing the consulting as to what its meaning is. What we would like to do is take the monetization of data and take it down to a number that is effectively zero. Analysis by people who really have value-added will always be part of the equation.

Do you envision putting the MADISON Code onto GitHub, for open source developers in this country and around the world to use and deploy in their own legislatures if they wish?

Rep. Issa: Actually, the reason that we’ve formed a public nonprofit is for just that reason. I don’t want to own it or control it or to produce it for any one purpose, but rather, a purpose of open government. So if it spawns hundreds of other not-for-profits, that’s great. If people are able to monetize some of the value provided by that service, then I can also live with that.

I think once you create government information and, for that matter, appropriate private sector information, in easier and easier to use formats, people will monetize it. Oddly enough, they’ll monetize it for a fairly low price, because with that which is easy, you have to create value at a low cost. That which is hard, you can charge a fortune to provide that information to those who need it.

Will you be posting the budget of the Open Gov Foundation in an open format so people know where the funding is coming from and what it’s being spent on?

Rep. Issa: Absolutely. Although, at this point, we’re not inviting any other contributions of cash, we will take in-kind contributions. But at least for the short run, I’ll fund it out of my own private foundation. Once we have a board established and a set of policies to determine the relationships that would occur in the way of people who might contribute, then we’ll open it up. And at that point, the posting would become complex. Right now, it’s fairly easy: whatever money it needs, the Issa Family Foundation will provide to get this thing up and going.

Uncertain prospects for the DATA Act in the Senate

The old adage that “you can’t manage what you can’t measure” is often applied to organizations in today’s data-drenched world. Given the enormity of the United States federal government, breaking down the estimated $3.7 trillion in the 2012 budget into its individual allocations, much less drilling down to individual outlays to specific programs and subsequent performance, is no easy task. There are several sources for policy wonks to turn to when applying open data to journalism, but the flagship database of federal government spending at USASpending.gov simply isn’t anywhere near as accurate as it needs to be to source stories. The issues with USASpending.gov have been extensively chronicled by the Sunlight Foundation in its ClearSpending project, which found that nearly $1.3 trillion of federal spending as reported on the open data website was inaccurate.

If the people are to gain more insight into how their taxes are being spent, Congress will need to send President Obama a bill to sign to improve the quality of federal spending data. In the spring of 2012, the U.S. House passed by unanimous voice vote the DATA Act, a signature piece of legislation from Representative Darrell Issa (R-CA). H.R. 2146 requires every United States federal government agency to report its spending data in a standardized way and establish uniform reporting standards for recipients of federal funds.

“The DATA Act will transform how we are able to monitor government spending online,” said Ellen Miller, co-founder and executive director of the Sunlight Foundation, in a prepared statement. “We’ve said time and time again that transparency is not a partisan issue, and we are proud to see there was broad support across the aisle for the bill. The DATA Act will increase transparency for federal spending data and expand when, where and how it is available online.” The DATA Act also received support from a broad range of other open government stalwarts, from OMB Watch to Citizens for Responsibility and Ethics in Washington (CREW):

Orgs in Support of DATA Act

Discussing DATA

I spoke with Rep. Issa, who serves as the chairman of the U.S. House Committee on Oversight and Government Reform, about the DATA Act and the broader issues around open government data at the Strata Conference in New York City.

Daniel Schuman, the Sunlight Foundation’s legislative counsel, summarized our conversation on open government data over at the Sunlight Foundation’s blog. Video of our discussion is embedded below.

Rep. Issa: …when I work with [Inspector Generals], they would love to have access to predictive [data analytics tools]. Today, they only have forensic. And in many cases, they have like stove pipe forensic. They only know after the fact, a portion of the data, and it frustrates them. We’re going to change that.

The DATA Act is bipartisan, which here in Washington is very unusual. One of the reasons is that people who want to know from the left and the right want to be in the know. We believe that by mandating standard reporting and a process of greater transparency and, of course, the tools created to make this easy and inexpensive for the private sector to participate in will give us an opportunity which will at some time be used by the left or the right or often used by simply people who have a vested interested in advising the private sector accurately on what is, has and will become events in government or for that matter, events in the private sector that are being aggregated through the government.

Your industry is going to be essential because if we give you more accurate, more easily compiled data, unless you turn it into information that’s valuable, we haven’t really accomplished what we want to. The same is true, though, unless you do it, my IGs won’t have private sector solutions that allow them to pick up COTS or near COTS solutions that are affordable and valuable and use them in evaluating government to drive out waste and fraud in government.

What’s next for the DATA Act?

The Senate version of the DATA Act, which is sponsored by Senator Mark Warner (D-VA), remains “pending” in the Homeland Security and Governmental Affairs Committee after a hearing last week, despite the considerable efforts of a new Data Transparency Coalition to move the bill. The hearing came one week after the coalition held a public DATA Demo Day that featured technology companies demonstrating different uses of standardized federal spending data, including claims that it could have prevented the scandal over excessive conference spending in the General Services Administration.

At the hearing, Senator Warner proposed an amended version of the DATA Act that would drop the independent board modeled on the Recovery Accountability and Transparency Board that oversaw spending from the American Recovery and Reinvestment Act of 2009, as Joseph Marks reported for Nextgov.

The DATA Act, however, received a hearing but not a markup, as Daniel Schuman, the legislative counsel of the Sunlight Foundation, wrote at the transparency advocate’s blog. (For those who aren’t well versed in the legislative process, “markups” are when amendments are considered. The DATA Act will have to pass through HSGAC to get to the floor of the Senate.) In his summary of the hearing, Schuman highlighted the opposition of the Office of Management and Budget and U.S. Treasury Department to the Act’s provisions:

Gene Dodaro, the Comptroller General, testified about a newly-issued GAO report on federal spending transparency, which alternatively praised and criticized OMB’s efforts to comply with legislation to improve information availability. During the Q&A, Dodaro explained that it may be helpful for Congress to enact legislation declaring what spending information it wants to have available to the public, as a way of establishing priorities and direction.

OMB Controller Daniel Werfel’s testimony [PDF] focused on OMB’s efforts to improve the accuracy and availability of spending information, largely defending the administration’s record. During the Q&A, Werfel emphasized that new legislation is not necessary to implement spending transparency as the administration already has the necessary authority. While his testimony highlighted the administration’s claims of what it has accomplished, it did not engage the concerns that OMB has dragged its feet over the last 4 years, or that OMB — as an arm of the president — may have mixed incentives about releasing potentially politically damaging information. He did explain that OMB has not released a statement of administration policy on the DATA Act, but that OMB (unsurprisingly) is less than enthusiastic about shifting responsibility over standard-setting and implementation to an independent body.

Treasury Department Assistant Fiscal Secretary Richard Gregg testified [PDF] about ongoing internal efforts at Treasury to improve data quality and projects that will yield results in the future. During the Q&A, Gregg explained that legislation isn’t needed for financial transparency, leadership in the executive branch would be sufficient. This raises the question of whether sufficient leadership is being exercised.

The question of leadership that Schuman raised is a good one, as is one regarding incentives. During July’s International Open Government Data Conference in DC, Kaitlin Bline, the senior developer working on the Sunlight Foundation’s Clearspending project, said that the problems with USASpending.gov government spending data come from oversight, not technology.

Bline was blunter in her post on a […] General Accounting Office, Congressional committees performing oversight of federal agencies, or special commission, notably the Truman committee during World War II. In the decades since, the work of inspectors general and Congressional staffers has been augmented by fraud detection technology, a critical innovation given the estimated $70 billion in improper payments made by the federal government within the Medicare and Medicaid programs alone. (The fraud detection technology that was developed at PayPal and spun out into Palantir Technologies, in fact, has been deployed to that end.)

The promise of standardizing federal spending data, grant data — or performance data — is that those entrusted with oversight could be empowered with predictive data analytics tools and teams to discover patterns and shift policy to address them.

While the huge budget deficit in the United States is highly unlikely to be closed by cutting fraud and waste alone, making federal spending machine-readable and putting it online clearly holds promise to save taxpayer dollars. First, however, the quality of government spending data must be improved.

Important questions about the DATA Act remain, from the cost of its implementation for cities and states, which would have to report federal grants, to the overall cost of the bill to the federal government. The Congressional Budget Office estimated that the DATA Act would cost the government $575 million to implement over 5 years. In response to the CBO, House Oversight staff have estimated annual savings from standards and a centralized spending database that would more than offset that outlay, including:

  • $41 million in funds recovered from questionable recipients
  • $63 million in funds withheld from questionable recipients
  • $5 billion in savings recommended by inspectors general
  • unknown savings resulting from better internal spending control and better oversight by Congressional appropriators.
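
The comparison behind that claim is simple arithmetic. A minimal sketch in Python, using only the figures quoted above and assuming (the source does not say this) that the CBO cost is spread evenly across the five years and that the savings recommended by inspectors general would be fully realized:

```python
# All amounts are in millions of dollars, taken from the figures quoted above.
CBO_COST_TOTAL = 575      # CBO estimate for the whole implementation period
CBO_PERIOD_YEARS = 5

# Annual savings estimated by House Oversight staff.
annual_savings = {
    "recovered from questionable recipients": 41,
    "withheld from questionable recipients": 63,
    "recommended by inspectors general": 5_000,
}


def annual_cost(total: float, years: int) -> float:
    """Assumption: the implementation cost is spread evenly over the period."""
    return total / years


def net_annual_benefit(savings: dict, cost: float) -> float:
    """Estimated yearly savings minus the yearly share of the cost."""
    return sum(savings.values()) - cost


cost = annual_cost(CBO_COST_TOTAL, CBO_PERIOD_YEARS)

# Stricter view: count only money actually recovered or withheld,
# leaving out savings that are merely recommended.
realized_only = {
    label: amount
    for label, amount in annual_savings.items()
    if not label.startswith("recommended")
}

print(f"annual cost:                  ${cost:,.0f}M")
print(f"net benefit (all estimates):  ${net_annual_benefit(annual_savings, cost):,.0f}M")
print(f"net benefit (realized only):  ${net_annual_benefit(realized_only, cost):,.0f}M")
```

Under these assumptions the offset rests almost entirely on the inspectors general figure, and the "unknown savings" item is left out because the text gives no number for it.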

No formal subsequent action on the DATA Act has been scheduled in the Senate and, with the August recess looming and many eyes turning to cybersecurity legislation, there are uncertain prospects for its passage in this election year’s legislative calendar.

The need for the federal government, watchdogs and the people to be able to accurately track the spending of taxpayer dollars through high quality open government data, however, remains acute.

April 27 2012

Passage of CISPA in the U.S. House highlights need for viable cybersecurity legislation

To paraphrase Ben Franklin, he who sacrifices online freedom for the sake of cybersecurity deserves neither. Last night, the Cyber Intelligence Sharing and Protection Act (CISPA) (H.R. 3523) was sent to a vote in the United States House of Representatives a day earlier than scheduled. CISPA passed the House by a vote of 250-180, defying a threatened veto from the White House. The passage of CISPA now sets up a fierce debate in the Senate, where Senate Majority Leader Harry Reid (D-NV) has indicated that he wishes to bring cybersecurity legislation forward for a vote in May.

The votes on H.R. 3523 broke down along largely partisan lines, although dozens of both Democrats and Republicans voted for or against it in the final tally. CISPA was introduced last November and approved by the House Intelligence Committee by a 17-1 vote before the end of 2011, which meant that the public has had months to view and comment upon the bill. The bill has 112 cosponsors and received no significant opposition from major U.S. corporations, including the social networking giants and telecommunications companies who would be subject to its contents.

In fact, as an analysis of campaign donations by Maplight showed, over the past two years interest groups that support CISPA have outspent those that oppose it by 12 to 1, ranging from defense contractors, cable and satellite TV providers and software makers to cellular companies and online computer services.

While the version of CISPA that passed shifted before the final vote, ProPublica's explainer on CISPA remains a useful resource for people who wish to understand its contents. Declan McCullagh, CNET's tech policy reporter, has also been following the bill closely since it was introduced and he has published an excellent FAQ explaining how CISPA would affect you.

As TechDirt observed last night, the final version of CISPA (available as a PDF from docs.house.gov) contained more scope on the information types collected in the name of security. Specifically, CISPA now would allow the federal government to use information for the purpose of investigation and prosecution of cybersecurity crimes, protection of individuals, and the protection of children. In this context, a "cybersecurity crime" would be defined as any crime that involves network disruption or "hacking."

Civil libertarians, from the Electronic Frontier Foundation (EFF) to the American Civil Liberties Union, have been fiercely resisting CISPA for months. "CISPA goes too far for little reason," said Michelle Richardson, the ACLU legislative counsel, in a statement on Thursday. "Cybersecurity does not have to mean abdication of Americans' online privacy. As we've seen repeatedly, once the government gets expansive national security authorities, there's no going back. We encourage the Senate to let this horrible bill fade into obscurity."

Today, there is widespread alarm online over the passage of CISPA, from David Gewirtz calling it heinous at ZDNet to Alexander Furnas exploring its troubling aspects to it being called a direct threat to Internet privacy over at WebProNews.

The Center for Democracy and Technology issued a statement that it was:

"... disappointed that House leadership chose to block amendments on two core issues we had long identified — the flow of information from the private sector directly to NSA and the use of that information for national security purposes unrelated to cybersecurity. Reps. Thompson, Schakowsky, and Lofgren wrote amendments to address those issues, but the leadership did not allow votes on those amendments. Such momentous issues deserved a vote of the full House. We intend to press these issues when the Senate takes up its cybersecurity legislation."

Alexander Furnas included a warning in his nuanced exploration of the bill at The Atlantic:

"CISPA supporters — a list that surprisingly includes SOPA opponent Congressman Darrell Issa — are quick to point out that the bill does not obligate disclosure of any kind. Participation is 'totally voluntary.' They are right, of course, there is no obligation for a private company to participate in CISPA information sharing. However, this misses the point. The cost of this information sharing — in terms of privacy lost and civil liberties violated — is borne by individual customers and Internet users. For them, nothing about CISPA is voluntary and for them there is no recourse. CISPA leaves the protection of peoples' privacy in the hands of companies who don't have a strong incentive to care. Sure, transparency might lead to market pressure on these companies to act in good conscience; but CISPA ensures that no such transparency exists. Without correctly aligned incentives, where control over the data being gathered and shared (or at least knowledge of that sharing) is subject to public accountability and respectful of individual right to privacy, CISPA will inevitably lead to an eco-system that tends towards disclosure and abuse."

The context that already exists around digital technology, civil rights and national security must also be acknowledged for the purposes of public debate. As the EFF's Trevor Timm emphasized earlier this week, once national security is invoked, both civilian and law enforcement agencies wield enormous powers to track and log information about citizens' lives without their knowledge or any practical ability to gain access to the records involved.

On that count, CISPA provoked significant concerns from the open government community, with the Sunlight Foundation's John Wonderlich calling the bill terrible for transparency because it proposes to limit public oversight of the work of information collection and sharing within the federal government.

"The FOIA is, in many ways, the fundamental safeguard for public oversight of government's activities," wrote Wonderlich. "CISPA dismisses it entirely, for the core activities of the newly proposed powers under the bill. If this level of disregard for public accountability exists throughout the other provisions, then CISPA is a mess. Even if it isn't, creating a whole new FOIA exemption for information that is poorly defined and doesn't even exist yet is irresponsible, and should be opposed."

What's the way forward?

The good news, for those concerned about what passage of the bill will mean for the Internet and online privacy, is that now the legislative process turns to the Senate. The open government community's triumphalism around the passage of the DATA Act and the gathering gloom and doom around CISPA all meet the same reality in this respect: checks and balances in the other chamber of Congress and a threatened veto from the White House.

Well done, founding fathers.

On the latter count, the White House has made it clear that the administration views CISPA as a huge overreach on privacy, driving a truck through existing privacy protections. The Obama administration has stated (PDF) that CISPA:

"... effectively treats domestic cybersecurity as an intelligence activity and thus, significantly departs from longstanding efforts to treat the Internet and cyberspace as civilian spheres. The Administration believes that a civilian agency — the Department of Homeland Security — must have a central role in domestic cybersecurity, including for conducting and overseeing the exchange of cybersecurity information with the private sector and with sector-specific Federal agencies."

At a news conference yesterday in Washington, the Republican leadership of the House characterized the administration's position differently. "The White House believes the government ought to control the Internet, government ought to set standards, and government ought to take care of everything that's needed for cybersecurity," said Speaker of the House John Boehner (R-Ohio), who voted for CISPA. "They're in a camp all by themselves."

Representative Mike Rogers (R-Michigan) -- the primary sponsor of the bill, along with Representative Dutch Ruppersberger (D-Maryland) -- accused opponents of "obfuscation" on the House floor yesterday.

While there are people who are not comfortable with the Department of Homeland Security (DHS) holding the keys to the nation's "cyberdefense" — particularly given the expertise and capabilities that rest in the military and intelligence communities — the prospect of military surveillance of citizens within the domestic United States is not likely to be one that the founding fathers would support, particularly without significant oversight from the Congress.

CISPA does not, however, formally grant either the National Security Agency or DHS any more powers than they already hold under existing legislation, such as the Patriot Act. It would, however, enable more information sharing between private companies and government agencies, including threat information pertinent to legitimate national security concerns.

It's crucial to recognize that cybersecurity legislation has been percolating in the Senate for years now without passage. The question of civilian oversight is a key issue in the Senate wrangling over major bills, from proposals from Senator Lieberman's office on cybersecurity to the ICE Act from Senator Carper to Senator McCain's proposals.

If the fight over CISPA is "just beginning", as Andy Greenberg wrote in Forbes today, it's important that everyone who is getting involved because of concerns over civil liberties or privacy recognizes that CISPA is not like SOPA, as Brian Fung wrote in the American Prospect, particularly after provisions regarding intellectual property were dropped:

"At some point, privacy groups will have to come to an agreement with Congress over Internet legislation or risk being tarred as obstructionists. That, combined with the fact that most ordinary Americans lack the means to distinguish among the vagaries of different bills, suggests that Congress is likely to win out over the objections of EFF and the ACLU sooner rather than later. Thinking of CISPA as just another SOPA not only prolongs the inevitable — it's a poor analogy that obscures more than it reveals."

That doesn't mean that those objections aren't important or necessary. It does mean, however, that anyone who wishes to join the debate must recognize that genuine security threats do exist, even though massive hype about a potential "Cyber 9/11" perpetuated by contractors that stand to benefit from spending continues to pervade the media. There are legitimate concerns regarding the theft of industrial secrets, "crimesourcing" by organized crime and the reality of digital agents from the Chinese, Iranian and Russian governments — along with non-state actors — exploring the IT infrastructure of the United States.

The simple reality is that in Washington, national security trumps everything. It's not like intellectual property or energy or education or healthcare. What anyone who wishes to get involved in this debate will need to do is to support an affirmative vision for what roles the federal government and the private sector should play in securing the nation's critical infrastructure against electronic attacks. And the relationship of business and government complicates cybersecurity quite a bit, as "Inside Cyber Warfare" author Jeffrey Carr explained here at Radar in February:

"Due to the dependence of the U.S. government upon private contractors, the insecurity of one impacts the security of the other. The fact is that there are an unlimited number of ways that an attacker can compromise a person, organization or government agency due to the interdependencies and connectedness that exist between both."

The good news today is that increased awareness of the issue will drive more public debate about what's to be done. During the week the Web changed Washington in January, the world saw how the Internet can act as a platform for collective action against a bill.

Civil liberties groups have vowed to continue advocating against the passage of any vaguely drafted bill in the Senate.

On Monday, more than 60 distinguished IT security professionals, academics and engineers published an open letter to Congress urging opposition to any "'cybersecurity' initiative that does not explicitly include appropriate methods to ensure the protection of users’ civil liberties."

The open question now, as with intellectual property, is whether major broadcast and print media outlets in the United States will take their role of educating citizens seriously enough for the nation to meaningfully participate in legislative action.

This is a debate that will balance the freedoms that the nation has fought hard to achieve and defend throughout its history against the dangers we collectively face in a century when digital technologies have become interwoven into the everyday lives of citizens. We live in a networked age, with new attendant risks and rewards.

Citizens should hold their legislators accountable for supporting bills that balance civil liberties, public oversight and privacy protections with improvements to how the public and private sectors monitor, mitigate and share information about network security threats in the 21st century.

January 20 2012

December 21 2011

December 12 2011

Can the People's House become a social platform for the people?

Congressional hackathon
InSourceCode developers work on "Madison" with volunteers.

There wasn't a great deal of hacking, at least in the traditional sense, at the "first congressional hackathon." Given the general shiver that the word still evokes in many a Washingtonian in 2011, that might be for the best. The attendees gathered together in the halls of the United States House of Representatives didn't create a more interactive visualization of how laws are made or a mobile health app. As open government advocate Carl Malamud observed, the "hack" felt like something even rarer in the "Age of the App for That:"

In a time when partisanship and legislative gridlock have defined Congress for many citizens, seeing the leadership of the United States House of Representatives agree on the importance of using the power of data and social networking to open government was an early Christmas present.

"Increased access, increased connection with our constituents, transparency, openness is not a partisan issue," said House Majority Leader Eric Cantor.

"The Republican leader and I may debate vigorously on many issues, but one area where we strongly agree is on making Congress more transparent and accessible," said House Democratic Whip Steny Hoyer in his remarks. "First, Congress took steps to open up the Capitol building so citizens can meet with their representatives and see the home of their legislature. In the same way, Congress is now taking steps to update how it connects with the American people online."

An open House

While the event was branded as a "Congressional Facebook Developer Hackathon," what emerged more closely resembled a loosely organized conference or camp.

Facebook executives and developers shared the stage with members of Congress to give keynotes to the 200 or so attendees before everyone broke into discussion groups to talk about constituent communications, press relations and legislative data. The event might be more aptly described as a "wonk-a-thon," as Sunlight Foundation's Daniel Schuman put it last week.

This "hackathon" was organized to have some of the feel of an unconference, in the view of Matt Lira, digital director for the House Majority Leader. Lira sat down for a follow-up interview last Thursday.

"There's a real model to CityCamp," he said. "We had 'curators' for the breakout. Next time, depending on how we structure it, we might break out events that are designed specifically for programming, with others clustered around topics. We want to keep it experimental."

Why? "When Aneesh Chopra and I did that session at SXSW, that personally for me was what tripped my thinking here," said Lira. "We came down from the stage and formed a circle. I was thinking the whole time that it would have been a waste of intellectual talent to have Tim O'Reilly and Clay Shirky in the audience instead of engaging in the conversation. I was thinking I never want to do a panel again. I want it to be like this."

Part of the challenge, so to speak, of Congress hosting a hackathon in the traditional sense, with judging and prizes, lies in procurement rules, said Lira. "There are legal issues around challenges or prizes for Congress," he explained. "They're allowed in the executive branch, under DARPA, and now every agency under the COMPETES Act. We can't choose winners or losers, or give out prizes under procurement rules."

Whatever you call it, at the end of the event, discussion leaders from the groups came back and presented on the ideas and concepts that had been hashed out. You can watch a short video that EngageDC produced for the House Majority Leader's office below:

What came out of this unprecedented event, in other words, won't necessarily be measured in lines of code. It's that Congress got geekier. It's that the House is opening its doors to transparency through technology.

Given the focus on Facebook, it's not surprising that social media took center stage in many of the discussions. The idea for the event came from a trip to Silicon Valley, where Representative Cantor said he met with Facebook founder Mark Zuckerberg and COO Sheryl Sandberg, and discussed how to make the House more social. After that conversation, Lira and Steve Dwyer, director of online communications and technology for the House Democratic Whip, organized the event.

For a sense of the ideas shared by the working groups, read the story of the first congressional "hackathon" on Storify.

"For government, I don't think we could have done anything more purposeful than this as a first meeting," said Lira in our interview. "Next, we'll focus on building this group of people, strengthening the trust, which will prove instrumental when we get into the pure coding space. I have 100% confidence that we could do a programming-only event now and would have attendance."

A Likeocracy in alpha

As the Sunlight Foundation's John Wonderlich observed earlier this year, access to legislative data brings citizens closer to their representatives.

"When developers and programmers have better access to the data of Congress, they can better build the databases and tools that let the rest of us connect with the legislature," he wrote.

If more open legislative data goes online, when we talk about what's trending in Congress, those conversations will be based upon insight into how the nation is reacting to them on social networks, including Facebook, Twitter, and Google+.

Facebook developers Roddy Lindsay, Tyler Brock, Eric Chaves, Porter Bayne, and Blaise DiPersia coded up a simple proof of concept of what making legislative data social might look like. "LikeOcracy" pulls legislation from a House XML feed and makes it more social. The first version added Facebook's ubiquitous "Like" buttons to bill elements. A second version of the app adds more opportunities for reaction by integrating ReadrBoard, which enables users to rate sections or individual lines as "Unnecessary, Problematic, Great Idea or Confusing." You can try it out on three sample bills, including the Stop Online Piracy Act.
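
To make the idea concrete, here is a minimal sketch in Python of the general pattern described: parse bill sections out of XML, then attach reader reactions to each section. This is not LikeOcracy's code. The XML element names, the `load_sections` function, the `ReactionBoard` class and the placeholder bill number are all assumptions for illustration; only the four rating labels come from the description above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical bill XML. The element and attribute names are assumptions,
# not the schema of the actual House feed; "H.R. 0000" is a placeholder.
SAMPLE_BILL_XML = """
<bill number="H.R. 0000">
  <section id="s1"><text>Short title.</text></section>
  <section id="s2"><text>Definitions.</text></section>
</bill>
"""

# Rating labels as described for the second version of the app.
RATINGS = ("Unnecessary", "Problematic", "Great Idea", "Confusing")


def load_sections(xml_text: str) -> dict:
    """Return a mapping of section id to section text."""
    root = ET.fromstring(xml_text.strip())
    return {
        section.get("id"): section.findtext("text", default="")
        for section in root.iter("section")
    }


class ReactionBoard:
    """Keeps a tally of reader ratings for each bill section."""

    def __init__(self, section_ids):
        self._counts = {section_id: Counter() for section_id in section_ids}

    def react(self, section_id: str, rating: str) -> None:
        if rating not in RATINGS:
            raise ValueError(f"unknown rating: {rating!r}")
        if section_id not in self._counts:
            raise KeyError(f"unknown section: {section_id!r}")
        self._counts[section_id][rating] += 1

    def summary(self, section_id: str) -> dict:
        return dict(self._counts[section_id])


sections = load_sections(SAMPLE_BILL_XML)
board = ReactionBoard(sections)
board.react("s2", "Confusing")
board.react("s2", "Confusing")
board.react("s2", "Great Idea")
print(sections["s2"], board.summary("s2"))
```

Keying reactions to section ids rather than to the bill as a whole is what allows feedback on individual provisions, which is the difference between the first and second versions described above.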

Would "social legislation" in a Facebook app catch on? The growth of civic startups like PopVox, OpenCongress and Votizen suggests that the idea has legs. [Disclosure: Tim O'Reilly was an early angel investor in PopVox.]

Likeocracy doesn't tap into Facebook's Open Graph, but it does hint at what integration might look like in the future. Justin Osofsky, Facebook's director of platform partnerships, described how the interests of constituents could be integrated with congressional data under Facebook's new Timeline. Citizens might potentially be able to simply "subscribe" to a bill, much like they can now for any web page, if Facebook's "Subscribe" plug-in were applied to the legislative process.

Opening bill markup online

The other app presented at the hackathon came not from the attendees but from the efforts of InSourceCode, a software development firm that's also coded for Congressman Mike Pence and the Republican National Committee.

Rep. Darrell Issa, chairman of the House Committee on Oversight and Government Reform, introduced the beta version of MADISON on Wednesday, a new online tool to crowdsource legislative markup. The vision is that MADISON will work as a real-time markup engine to let the public comment on bills as they move through the legislative process. "The assumption is that legislation should be open in Congress," said Issa. "It should be posted, interoperable and commented upon."

As Nick Judd reported at techPresident, the first use of MADISON is to host Issa and Sen. Ron Wyden's "OPEN bill," which debuted on the app. Last week, the two lawmakers released the Online Protection and Enforcement of Digital Trade Act (OPEN) at Keepthewebopen.com. The OPEN legislation removes one of the most controversial aspects of SOPA, using the domain name system for enforcement, and instead places authority with the International Trade Commission to address enforcement of IP rights on websites that are primarily infringing upon copyright.

Issa said that his team had looked at the use of wikis by Rep. John Culberson, who put the healthcare reform bill online in a wiki. "There are some problems with editors who are not transparent to all of us," said Issa. "That's one of the challenges. We want to make sure that if you're an editor, you're a known editor."

MADISON includes two levels of authentication: email for simple commenting and a more thorough vetting process for organizations or advocacy groups that wish to comment. "Like most things that are a 1.0 or beta, our assumption is that we'll learn from this," said Issa. "Some members may choose to have an active dialog. Others may choose to have it be part of pre-markup record."

Issa fielded a number of questions on Wednesday, including one from web developer Brett Stubbs: "Will there be open access or an API? What we really want is just data." Issa indicated that future versions might include that.

Jayson Manship, the "chief nerd" at InSourceCode, said that MADISON was built in four days. According to Manship, the idea came from conversations with Issa and Seamus Kraft, director of digital strategy for the House Committee on Oversight and Government Reform. MADISON is built with PHP and MySQL, and hosted in RackSpace's cloud so it can scale with demand, said Manship.

"It's important to be entrepreneurial," said Lira in our interview. "There are partners throughout institutions that would be willing to do projects of different sizes and scopes. MADISON is something that Issa and Seamus wanted to do. They took it upon themselves to get the ball rolling. That's the attitude we need."

"We're working to hold the executive accountable to taxpayers," said Kraft last week. "Opening up what we do here in these two halls of Congress is equally important. MADISON is our first shot at it. We're going to need a lot of help to make it better."

Kraft invited the remaining developers present to come to the Rayburn Office Building, where Manship and his team had brought in half a dozen machines, to help get MADISON ready for launch. While I was there, there were conversations about decisions, plug-ins and ideas about improving the interface or functionality, representing a bona fide collaboration to make the app better.

There's a larger philosophical issue relating to open government that Nick Judd touched upon over at techPresident in a follow-up post on MADISON:

The terms for the site warn the user that anything they write on it will become public domain — but the code itself is proprietary. Meanwhile, OpenCongress' David Moore points out that the code that powers his organization's website, which also allows users to comment on individual provisions of bill text, is open source and has been available for some time. In theory, this means the Oversight staff could have started from that code and built on it instead of beginning from scratch. The code being proprietary means that while people like Moore might be able to make suggestions, they can't just download it, make their own changes and submit them for community review — which they'd happily do at little or no cost for a project released under an open-source license.

As Moore put it, "Get that code on GitHub, we'll do OpenID, fix the design."

When asked about whether the team had considered making MADISON code open source, Manship said that "he didn't know, although they weren't opposed to it."

While Moore welcomed MADISON, he also observed that Open Congress has had open-source code for bill text commenting for years.

The decision by Issa's office to fund the creation of an app that was already available as open-source software is one that's worth noting, so I asked Kraft why they didn't fork OpenCongress' code, as Judd suggests. "While there was no specific budget expense for MADISON, it was developed by the Oversight Committee," said Kraft.

"While we like and support OpenCongress' code, it didn't fit the needs for MADISON," Kraft wrote in an emailed statement.

What's next is, so to speak, an "OPEN" question, both in terms of the proposed SOPA alternative and the planned markup of SOPA itself on December 15. The designers of OPEN are actively looking for feedback from the civic software development community, both in terms of what functionality exists now and what could be built in future iterations.

THOMAS.gov as a platform

What Moore and long-time open-government advocates like Carl Malamud want to see from Congress is more structural change.

They're not alone. Dan Schuman listed many other ways the House has yet to catch up with 21st century technology:

We have yet to see bulk access to THOMAS or public access to CRS reports, important legislative and ethics documents are still unavailable in digital format, many committee hearings still are not online, and so on.

As Schuman highlighted, the Sunlight Foundation has been focused on opening up Congress through technology since the organization was founded. To wit: "There have been several previous collaborative efforts by members of the transparency community to outline how the House of Representatives can be more open and accountable, of which an enduring touchstone is the Open House Project Report, issued in May 2007," wrote Schuman.

The notion of making THOMAS.gov into a platform received high-level endorsement from a congressional leader when House Minority Whip Steny Hoyer remarked on how technology is affecting Congress, his caucus and open government in the executive branch:

For Congress, there is still a lot of work to be done, and we have a duty to make the legislative process as open and accessible as possible. One thing we could do is make THOMAS.gov — where people go to research legislation from current and previous Congresses — easier to use, and accessible by social media. Imagine if a bill in Congress could tweet its own status.

The data available on THOMAS.gov should be expanded and made easily accessible by third-party systems. Once this happens, developers, like many of you here today, could use legislative data in innovative ways. This will usher in new public-private partnerships that will empower new entrepreneurs who will, in turn, yield benefits to the public sector.

One successful example is how cities have made public transit data accessible so developers can use it in apps and websites. The end result has been commuters saving time every day and seeing more punctual trains and buses as a result of the transparency. Legislative data is far more complex, but the same principles apply. If we make the information available, I am confident that smart people like you will use it in inventive ways.

Hoyer's specific citation of the growth of open data in cities and an ecosystem of civic applications based upon it is further confirmation that the Gov 2.0 meme is moving into the mainstream.

Making THOMAS.gov into a platform for bulk data would change what's possible for all civic developers. What I really want is "data on everything," Stubbs told me last week. "THOMAS is just a visual viewer of the internal stuff. If we could have all of this, we could do something with it. What I would like is a data broker. I'd like a RESTful API with all of the data that I could just query. That's what the government could learn from Facebook. From my point of view, I just want to pull information and compile it."

If Hoyer and the House leadership would like to see THOMAS.gov act as a platform, several attendees at the hackathon suggested to me that Congress could take a specific action: collaborate with the Senate and send the Library of Congress a letter instructing it to provide bulk legislative data access to THOMAS.gov in structured formats so that developers, designers and citizens around the nation can co-create a better civic experience for everyone.

"The House administration is working on standards called for by the rule and the letter sent earlier this year," said Lira. "We think they will be satisfactory to people. The institutions of the House have been following through since the day they were issued. The first step was issuing an XML feed daily. Next year, there will be a steady series of incremental process improvements. When the House Administrative Committee issues standards, the House Clerk will work on them."

Despite the abysmal public perception of Congress, genuine institutional changes in the House of Representatives driven by the GOP embracing innovation and transparency are incrementally happening. As Tim O'Reilly observed earlier this year, the current leadership of the House on transparency is doing a better job than their predecessors.

In April, Speaker Boehner and Majority Leader Cantor sent a letter to the House Clerk regarding legislative data release. Then, in September, a live XML feed for the House floor went online. Yes, there's a long way to go on open legislative data quality in Congress. That said, there's support for open-government data from both the White House and the House.

"My personal view is that what's important right now is that the House create the right precedents," said Lira. "If we create or adopt a data standard, it's important that it be the right standard."

Even if open government is in beta, there needs to be more tolerance for experiments and risks, said Lira. "I made a mistake in attacking We the People as insufficient. I still believe it is, but it's important to realize that the precedent is as important as the product in government. In technology in general, you'll never reach an end. We The People is a really good precedent, and I look forward to seeing what they do. They've shown a real commitment, and it's steadily improving."

A social Congress

While Sean Parker may predict that social media will determine the outcome of the 2012 election, governance is another story entirely. Meaningful use of social media by Congress remains challenged by a number of factors, not least an online identity ecosystem that has not provided Congress with ideal means to identify constituents online. The reality remains that when it comes to which channels influence Congress, in-person visits and individual emails or phone calls are far more influential with congressional staffers.

As with any set of tools, success shouldn't be measured solely by media reports or press releases but by the outcomes from their use. The hard work of bipartisan compromise between the White House and Congress, to the extent it occurs, might seem unlikely to be publicly visible in 140 characters or less.

"People think it's always an argument in Washington," said Lira in our interview. "Social media can change that. We're seeing a decentralization of audiences that is built around their interests rather than the interests of editors. Imagine when you start streaming every hearing and making information more digestible. All of a sudden, you get these niche audiences. They're not enough to sustain a network, but you'll get enough of an audience to sustain the topic. I believe we will have a more engaged citizenry as a result."

Lira is optimistic. "Technology enables our republic to function better. In ancient Greece, you could only sustain a democracy in the size of city. Transportation technology limited that scope. In the U.S., new technologies enabled global democracy. As we entered the age of mass communication, we lost mass participation. Now with the Internet, we can have people more engaged again."

There may be a 30-year cycle at play here. Lira suggested looking back to radio in the 1920s, television in the 1950s, and cable in the 1980s. "It hasn't changed much since; we're essentially using the same rulebook since the '80s. The changes made in those periods of modernization were unique."

Thirty years on from the introduction of cable news, will the Internet help reinvigorate the founders' vision of a nation of, by and with the people? "I do think that this is a transformational moment," said Lira. "It will be for the next couple of years. When you talk to people, both Republicans and Democrats, you sense we're on the cusp of some kind of change, where it's not just communicating about projects but making projects better. Hearings, legislative government and executive government will all be much more participatory a decade from now."

In that sweep of history, the "People's House" may prove to be a fulcrum of change. "If any place in government is going to do it, it's the House," said Lira. "It's our job to be close to the public in a way that no other part of government is. In the Federalist Papers, that's the role of the House. We have an obligation to lead the way in terms of incorporating technology into real processes. We're not replacing our system of representative government. We're augmenting it with what's now possible, like when the telegraph let people know what the votes were faster."

November 22 2011

Congress considers anti-piracy bills that could cripple Internet industries




Imagine a world where YouTube, Flickr, Facebook or Twitter had never been created due to the cost of regulatory compliance. Imagine an Internet where any website where users can upload text, pictures or video is liable for copyrighted material uploaded to it. Imagine a world where the addresses to those websites could not be found using search engines like Google and Bing, even if you typed them in directly.

Imagine an Internet split into many sections, depending upon where you lived, where a user's request to visit another website was routed through an addressing system that could not be securely authenticated. Imagine a world where a government could require that a website hosting videos of a bloody revolution be taken down because it also hosted clips from a Hollywood movie.

Imagine that it's 2012, and much of that world has come to pass after President Obama has signed into law an anti-online piracy bill that Congress enacted in a rare show of bipartisan support. In an election year, after all, would Congress and the President risk being seen as "soft on cybercrime?"

Yes, the examples above represent worst-case scenarios, but unfortunately, they're grounded in reality. In a time when the American economy needs to catalyze innovation to compete in a global marketplace, members of the United States Congress have advanced legislation that could lead to precisely that landscape.

The Stop Online Piracy Act "is a bill that would eviscerate the predictable legal environment created by the DMCA [Digital Millennium Copyright Act], subjecting online innovators to a new era of uncertainty and risk," said David Sohn, senior policy counsel at the Center for Democracy and Technology (CDT) in Washington, D.C., in a statement. "It would force pervasive scrutiny and surveillance of Internet users' online activities. It would chill the growth of social media and conscript every online platform into a new role as content police. And it would lay the groundwork for an increasingly balkanized Internet, directly undercutting U.S. foreign policy advocacy in support of a single, global, open network."

The names of the "Stop Online Piracy Act" (H.R. 3261) and "PROTECT IP Act" (S. 968) make it clear what they're meant to do: protect the intellectual property of content creators against online piracy. What they would do, if enacted and signed into law, is more contentious. SOPA is "really a Trojan horse that might be better named the Social Media Surveillance Act," said Leslie Harris, CEO of CDT, in a press conference. "Expect it to have a devastating effect on social media content and expression."

To ground the potential issue in familiar examples, the Electronic Frontier Foundation (EFF) explained how SOPA could affect Etsy, Flickr and Vimeo. Don't use those sites? OK. Substitute eBay, Instagram and YouTube. Or the next generation of online innovation.

Let's be clear: online piracy and the theft of intellectual property are serious problems for the global media. Nor is piracy something that legislators, regulators, publishers or members of the media should condone. Given that context, this legislation has strong support from an industry coalition of content creators, including labor unions, artists guilds, movie studios and television networks.

Those pro-legislation constituencies do have their supporters. Andrew Keen wrote at TechCrunch that the "death of the Internet was exaggerated," disparaging the claims of the organizations, individuals and experts who have come out against the bills. Scott Cleland argued at Forbes that this "anti-piracy legislation will become law," citing the scope of IP theft and the need to address it by some means.

Neither of these commentators, however, addressed the significant technical, legal and security concerns that persist around the provisions in SOPA and the PROTECT IP Act. The drafters of SOPA apply several enforcement mechanisms to combat online piracy. There's broad support for measures to restrict revenues that support sites that distribute copyrighted material or child pornography. The most controversial provision of the bills centers on the use of the domain name system as a means to prevent people from accessing sites hosting infringing content.

The Stop Online Piracy Act goes further than the PROTECT IP Act in a number of important ways, and it mirrors provisions in other acts. Nate Anderson wrote at Ars Technica that the House takes the Senate's bad Internet censorship bill and makes it worse.

The CDT recommends a more focused "follow-the-money" approach. Such an approach, "narrowly targeting clear bad actors and drying up their financial lifeblood, could reduce online infringement without risking so much damage to Internet openness, innovation, and security," said Sohn. "Fighting large-scale infringement is an important goal. But SOPA would do far too much collateral damage to innovation, online expression, and privacy. Congress needs to listen to the full range of stakeholders and seriously rethink how it should address the problem of online infringement."

Significant legal and technical concerns persist about SOPA and the PROTECT IP Act. CDT has a useful SOPA summary that clearly explains these issues. Sohn joined with Andrew McDiarmid to write an editorial in the Atlantic that says SOPA is a "dangerous bill that would threaten legitimate websites."

In a widely cited editorial for the New York Times last week, Rebecca MacKinnon, author and co-founder of Global Voices, argued that the U.S. should not support the creation of a "Great Firewall of America":

The potential for abuse of power through digital networks — upon which we as citizens now depend for nearly everything, including our politics — is one of the most insidious threats to democracy in the Internet age. We live in a time of tremendous political polarization. Public trust in both government and corporations is low, and deservedly so. This is no time for politicians and industry lobbyists in Washington to be devising new Internet censorship mechanisms, adding new opportunities for abuse of corporate and government power over online speech. While American intellectual property deserves protection, that protection must be won and defended in a manner that does not stifle innovation, erode due process under the law, and weaken the protection of political and civil rights on the Internet.

Tim O'Reilly said the following about SOPA and PROTECT IP Act:

We're in one of the greatest periods of social and business transformation since the Industrial Revolution, a transformation driven by the open architecture of the Internet. We're still in early stages of that revolution. New technologies, new companies, and new business models appear every day, creating new benefits to society and the economy. But now, fundamental elements of that Internet architecture are under attack. These legislative attacks are not motivated by clear thinking about the future of the Internet or the global economy, but instead are motivated by the desire to protect large, entrenched companies with outdated business models that are threatened by the Internet. Rather than adapting, and competing with new and better services, they are going to Congress asking for protection. If they succeed, they will vitiate the Internet economy.

As a publisher, I have experience from the front lines of the copyright wars. O'Reilly first began putting our books online in 1987. Now, in 2011, ebooks are the fastest growing part of our business. We are proud that we have never used DRM on our books, and that sales have never suffered as a result. Instead, we are selling books in markets around the world that we were never able to reach in print. Existing copyright laws, and the goodwill of our customers, who constantly report pirated editions to us, are more than sufficient to protect our intellectual property and to enable a rich market for paid content. By making our content more accessible to readers around the world, we've expanded our business and our impact.

Opposition from the legal, technical and VC community

How we choose to address the issue of online piracy matters a great deal to both the American people and to the rest of the world. It says something about who we are as a free society, how we value due process and whether Congress listens to the people who understand how something that is to be regulated works.

SOPA "really stands for the proposition that online communication tools, and the DNS, can and should be used for enforcement," said Cynthia Wong, head of the Project on Global Internet Freedom at the CDT, in a press conference. Once these enforcement tools are put in place, she said, they provide a model for governments to restrict hate speech or online criticism of public officials in the name of copyright protection. "What we're really risking is further balkanization of the Internet."

Dozens of law professors say that the PROTECT IP Act is unconstitutional. More than a hundred technology entrepreneurs blasted the PROTECT IP Act in a letter to Congress.

The concerns of the technology industry regarding the effect of SOPA and the PROTECT IP Act on innovation were confirmed by a new study on the impact of Internet copyright regulations by consulting firm Booz & Company. The study found that 70% of angel investors said an increase in anti-piracy regulations would deter them from investing in websites that feature user-generated content.

"The debate over digital content is a vast landscape peppered with many opinions and very little real data," says Matthew Le Merle, a partner at Booz & Company, in a prepared statement. "We decided to conduct this empirical study to shed light on one important issue. Would angel investors really take their money elsewhere if the regulatory landscape fundamentally changed with regard to copyright regulation and the internet? The answer was definitive."

"The 'content industries' like to make claims about their economic losses," wrote Tim O'Reilly on Google+. "At last, VCs and startups are starting to point out how much they have to lose from overreaching IP laws." O'Reilly was among those interviewed for the Booz & Company study.

Fundamental cybersecurity concerns about PROTECT IP

An eminent group of Internet engineers and security professionals published a PROTECT IP whitepaper that demonstrates why the technical provisions would be ineffective and would also damage online security.

"As a group, we're usually not involved in policy," said David Dagon, a postdoctoral fellow in computer science at Georgia Tech, at a press briefing this fall. This group of technicians, researchers and operational specialists was brought together by the implications of protecting the integrity and security of DNS infrastructure, he said. "The part that most alarms us are the extent to which requirements of the Act would affect DNSSEC," the Domain Name System Security Extensions. DNSSEC is a joint effort between ICANN and VeriSign, with support from the U.S. Department of Commerce, to make the domain name system more secure when used on IP networks.

The old version of DNS is not secure enough, explained Paul Vixie, founder of Internet Systems Consortium, at the press briefing. Bad guys could insert code into it. The basic solution was to add cryptographic signatures to every answer sent back by a name server. That way there's assurance to the requester of a domain name that the information being received is the same as the information sent by the source. That technology, after a lot of design and development, is in mid-stride for being deployed as DNSSEC around the world.

The value of DNSSEC was recently demonstrated when the FBI revealed "Operation Ghost Click," argued Ernesto Falcon, director of government affairs at Public Knowledge. Operation Ghost Click dismantled a global cybercriminal network that stole $14 million using well-documented security holes in the DNS.

An "absolutely central aspect" of the design of DNSSEC is to detect any change in the answer along the way, said Vixie. "The provisions of this bill try to do the same thing in the spirit of telling the user that the site has been taken down by court order. Unfortunately, the bill tampers at a very low level of architecture of the Internet. The effect will be that it will look like it's been tampered with."
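To make Vixie's point concrete, here is a minimal Python sketch of why a court-ordered redirect is indistinguishable from an attack once answers are signed. It is an illustration under loud assumptions, not how DNSSEC is actually implemented: real DNSSEC uses public-key signatures, so only the zone's owner can sign, whereas this sketch substitutes a shared-secret HMAC from the standard library to stay self-contained. The key, the function names and the addresses are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical zone key. Real DNSSEC uses public-key signatures rather than
# a shared secret; HMAC is a stand-in so the sketch needs only the stdlib.
ZONE_KEY = b"example-zone-signing-key"


def sign_answer(name: str, address: str, key: bytes = ZONE_KEY) -> str:
    """Return a hex signature over a (name, address) answer."""
    message = f"{name}|{address}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_answer(name: str, address: str, signature: str,
                  key: bytes = ZONE_KEY) -> bool:
    """Check that the answer received is the one the source signed."""
    expected = sign_answer(name, address, key)
    return hmac.compare_digest(expected, signature)


# The source of the record signs its answer.
name, address = "example.com", "192.0.2.10"
signature = sign_answer(name, address)

# An answer that arrives unchanged verifies.
untouched_ok = verify_answer(name, address, signature)

# A resolver ordered to redirect the name swaps in another address (say, a
# notice page) but has no way to produce a matching signature for it.
redirected_ok = verify_answer(name, "203.0.113.99", signature)

print(untouched_ok, redirected_ok)
```

The requester cannot tell why the second check failed. A redirect made under court order and one made by a criminal both surface as the same verification failure, which is the conflict Vixie describes.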

Vixie emphasized that the researchers have "no issue with protecting intellectual property. Many of us have patents and copyrights and are not empathetic at all with piracy." Other provisions will be effective, he said. "This particular one will not be."

Security researcher Dan Kaminsky (@dakami), who found and fixed a fundamental flaw in DNS, strongly argued that the security challenges that the world faces can't be ignored.

"America is getting hacked," he said. "We're seeing widespread level of our assets getting broken into. It's untenable. It's something we have to fix or our economy can't work."

The problem with the DNS provisions in the PROTECT IP Act is the impact they would have outside of pirate sites, said Kaminsky, who observed that such sites draw some 53 billion page views every year.

Kaminsky said these provisions would both be largely ineffectual and increase the security risks for financial services companies, among others. "The amount of tech work someone needs to do is approximately 30 seconds," he said. "With a click, resolution can be exported overseas. It's not just Pirate Bay. There are lookups to Bank of America and Citibank overseas. We'd be handing over American Internet access to entities we do not trust, entities that are unambiguously bad guys. The best case technology for finally protecting assets on the Internet is significantly impacted" by these provisions, he said.

That assessment is backed up by components of the U.S. government's own research community. Sandia Labs told CNET that SOPA will negatively impact U.S. cybersecurity.

"If we as a country put this around our DNS servers, we'll see an exodus from around the United States," said Dagon. "We believe the volume of users is so large that it will provide opportunities for mischief. Every behavior will be communicated to some server overseas. The volume is such that it's something policy makers should reflect on."

Allan A. Friedman, a fellow in governance studies at the Brookings Institution, detailed significant cybersecurity risks posed by SOPA and the PROTECT IP Act that policy makers should consider. There are "very real threats to cybersecurity in a small section of both bills in their attempts to execute policy through the Internet architecture," he wrote. "While these bills will not 'break the Internet,' they further burden cyberspace with three new risks. First, the added complexity makes the goals of stability and security more difficult. Second, the expected reaction of Internet users will lead to demonstrably less secure behavior, exposing many American Internet users, their computers and even their employers to known risks. Finally, and most importantly, these bills will set back other efforts to secure cyberspace, both domestically and internationally."

It would be "quite a burden on United States companies to follow these rules," said Vixie. "In order to solve these problems on a global basis without affecting our economy to prevent bypass, they would have to work on an international level with countries and have them do takedowns locally. I don't believe that there is a unilaterally imposable technical solution that Congress can mandate to address this issue."

Danny McPherson, the chief security officer for Verisign, agreed. "Were there such a tech solution, I wouldn't have waited for Congress," he said at the briefing. "I would have used it 15 years ago versus malware."

More lawmakers come out against SOPA

As Declan McCullagh reported for CNET, the "SOPA copyright bill's backers include the Republican or Democratic heads of all the relevant House and Senate committees, and groups as varied as the Teamsters and the AFL-CIO." As the week began, PROTECT IP had 39 co-sponsors in the Senate and SOPA had 24 co-sponsors in the House.

Why is SOPA on the Congressional agenda now, given the huge challenges that the country faces with employment, education, healthcare, energy and the long war abroad? As always, follow the money. An analysis by MapLight.org showed that supporters of SOPA have given 12 times as much money to members of Congress as those opposing it.

Despite that notable imbalance, a growing number of U.S. Representatives and Senators in Congress have expressed principled opposition to the PROTECT IP Act and its companion bill in the House.

In the House, Representatives Issa and Lofgren sent a 'Dear Colleague' letter opposing SOPA to Congressional leaders. Last week, Representatives Eshoo, Lofgren, Paul, Doggett, Honda, Miller, Thompson, Matsui, Doyle and Polis sent a letter opposing SOPA to the leaders of the House Judiciary Committee. House Minority Leader Nancy Pelosi tweeted that Congressional leaders "need to find a better solution than #SOPA #DontBreakTheInternet."

Oregon Senator Ron Wyden has been an important voice against the PROTECT IP Act and took action to put a hold on it earlier this year. Wyden told the audience at Web 2.0 Summit that the PROTECT IP Act is about letting the content sector attack the innovation sector. In a video interview, I spoke with Senator Wyden more specifically about the issues raised in the PROTECT IP Act.

In a press conference on Tuesday, Representative Darrell Issa and Representative Zoe Lofgren addressed concerns with SOPA.

Last week, Rep. Issa told The Hill that SOPA has no chance of passing the House and that "Congress was using Google as a piñata." Issa said to Gautham Nagesh that "I don't believe this bill has any chance on the House floor. I think it's way too extreme, it infringes on too many areas that our leadership will know is simply too dangerous to do in its current form."

When asked for further comment on Monday, Congressman Mike Honda made the following statement via email:

"The internet censorship bills currently moving through Congress, the Stop Online Piracy Act (SOPA) and the PROTECT IP Act, set a dangerous precedent and represent a big step backwards in Washington's efforts to foster growth in the digital sector. These bills would have a profound effect on how the internet functions on a basic level, undermining the legal process and overturning long-standing practices like ‘safe harbors’ that were established in the Digital Millennium Copyright Act.

I have serious concerns about the overly broad definitions of theft included in SOPA that could be used to shut down dozens of lawful exchange sites that are valuable outlets for small-scale buying and selling. I am also uneasy about the use of DNS blocking as a viable solution, especially within the lens of consumer security standards like DNSSEC. Finally, the complete immunity from federal and state laws granted in SOPA to several industries could set off an anti-consumer and anti-competitive wave that will strike at the very core of the internet.

The fact that Congress is considering these haphazard bills is a cause for alarm. I agree with the goal of combating online piracy and am committed to coming up with bi-partisan solutions. I am extremely intrigued by the ideas currently proposed as alternatives, particularly the idea of implementing an International Trade Commission complaint process, but these pieces of legislation as currently drafted will cause substantial harm to innovation and the economic opportunities created by the Internet in my Silicon Valley District and to the fundamental openness of the internet."

A Congressional hearing stacked against the Internet

It's not hard to see the House Judiciary Committee has not been equally representing both sides of the debate in either the resources it provides online or in the witnesses it called to testify at a recent hearing.

If you read the U.S. House Judiciary website "resource pages" on "rogue websites," for instance, you'd never know that there's any opposition to the Stop Online Piracy Act at all, even from within the committee.

Similarly, if you visited the hearing for H.R. 3261 on the House Judiciary Committee website, you would not see any of the documents that Representatives Lofgren or Issa read into the record during last week's hearing.

If you want a more balanced picture of the hearing, turn to InfoDocket.com, which has collected many more SOPA resources.

For media reports on the SOPA hearing, read The Hill, Politico, The Atlantic Wire, Wired, Washington Post, or, most frank of all, ArsTechnica, which captured a truth that became clear to many observers who sat through all of it: "The hearing was designed to shove the legislation forward and to brand companies who object as siding with 'the pirates'."

As Carl Franzen put it at TPM's Idealab, this hearing provided an official venue for the bill's supporters to explain why SOPA should pass. The problem with that approach is that the lopsided witness list (five for SOPA, one against) left the committee wide open to accusations of anti-Internet bias.

Opponents of SOPA were dismayed to hear full support for the bill as drafted by the U.S. Register of Copyrights, Maria Pallante, who said that without SOPA, copyright will ultimately fail.

Despite the stacked deck, several representatives raised concerns about freedom of expression and innovation, including Reps. Lofgren, Issa and Maxine Waters.

Rep. Dan Lungren raised a key issue in his questioning, when he asked the representative of the Motion Picture Association of America (MPAA) to respond to the concerns of Stewart Baker, a former Homeland Security assistant secretary and former NSA general counsel, and of Internet engineers that SOPA would hurt cybersecurity because of its effect on DNSSEC. (The MPAA disagreed, for the record.) Nobody testifying at the hearing said they had the technical expertise to comment on SOPA and DNSSEC, which raised the question: Why weren't any Internet engineers invited?

Based upon the witness list and the resources offered online, it does not appear that the Congressmen who sponsored the bill were being entirely forthright when they said that the Internet industry was welcome to comment. Rep. Issa said at the hearing that the Consumer Electronics Association (CEA) had been denied a request to testify.

There's also an important point about open government to make: this hearing was of great interest to the American people, most of whom could not attend in person. The halls of the Rayburn Congressional office building were full for the hearing, with many people diverted to a spillover room. That public interest makes broadcasting the hearing online even more important, and yet the livestream was choppy or simply inaccessible to many citizens. That effectively shut the public out of the SOPA hearings.

Archived video of the three-and-a-half-hour hearing is available, but it's far from user-friendly. The House Oversight and Government Reform Committee was able to post Rep. Issa's remarks on YouTube the day after the hearing.

Matt Lira, director of new media for the House Majority Leader, says that they "are working on that, structurally; it won't be a problem in 2nd session."

Wikileaks, DNS and the Internet commons

What's happening in Congress now needs to be put in context with a longer continuum of proposed legislation, Internet policy choices and government actions.

Over the past year, a spirited debate about what Wikileaks means for the future of journalism, whistleblowing and Internet freedom has revealed a couple of important realities. One of the best outcomes of the Wikileaks saga is that it has catalyzed discussion about how the technical infrastructure of the Internet relates to freedom of expression online. It remains critically important to heighten general awareness of some of the laws relevant to the Internet that are being discussed in Washington, particularly for media organizations and the audiences affected by them.

As someone who has covered the space for a while, I know that the alphabet soup that surrounds Internet policy is hard enough to swallow for reporters immersed in it. For most people, it's too much to navigate. This article, for example, was originally envisioned as a primer on DNS, COICA, ICE, DHS, ACTA and the issues associated with them. For serious geeks, some of these definitions may be old hat, but the government policy surrounding them is worth tracking. Below, you'll find explanations of the terms.

The domain name system is one of the hallmark technologies that makes the Internet work. (If you already know how DNS works, skip down to the next section.) DNS stands for Domain Name System, a globally interoperable way for people to easily access websites. This is how an online user is taken to his or her desired website after entering a URL (Uniform Resource Locator) into the address field of a Web browser. Without the DNS, users would have to know the string of numbers that make up an Internet Protocol (IP) address for a given website. While a few geeks might be able to pull that off, the vast majority of people wouldn't be able to find the website as easily. Historically, the DNS has been coordinated by ICANN and a system of regional organizations that help coordinate the global IP addressing system.
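For readers who think in code, the lookup described above can be pictured as a table from names to numbers. The following is only a toy sketch: the hostnames, the addresses (taken from a range reserved for documentation) and the `resolve` function are all made up for illustration, and the real DNS is a distributed, hierarchical system rather than a single table.

```python
from typing import Dict, Optional

# Hypothetical name-to-address table. The 192.0.2.x addresses come from a
# block reserved for documentation, so they do not point at real sites.
TOY_ZONE: Dict[str, str] = {
    "www.example.org": "192.0.2.10",
    "news.example.org": "192.0.2.20",
}


def resolve(hostname: str, zone: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the IP address recorded for hostname, or None if there is no entry."""
    table = TOY_ZONE if zone is None else zone
    # Hostnames are case-insensitive, and a trailing dot is allowed.
    key = hostname.lower().rstrip(".")
    return table.get(key)


if __name__ == "__main__":
    print(resolve("www.example.org"))      # an address the browser can connect to
    print(resolve("missing.example.org"))  # no entry, so the site cannot be found by name
```

Without an entry in the table, a user would need to already know the numeric address to reach the site, which is the situation the paragraph above describes.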

Why does DNS matter to the media and citizens they inform? Think of it like this: what would it mean to the ability of broadcast news to reach citizens if it became much more difficult to tune into the station? Nancy Scola explained why DNS matters for Wikileaks over at TechPresident last winter.

Scola examined how Congress seeks to tame the Internet in a recent feature at Salon.com. "For all the rhetoric," she writes, "this isn't even really about copyright. This is about the Internet — and more to the point, the infrastructure and operations of the Internet that make the Internet the Internet. SOPA targets search engines, Internet service providers, ad networks and payment networks precisely because those components are so central to the functioning of the Internet. Those are digital forces that should be messed with only with the greatest of care."

The Internet is a remarkably robust decentralized network, designed to hold up in the event of a nuclear attack. That said, it does have a centralized choke point: the domain name system. That makes tampering with the DNS attractive to some as a means to limit access to online content. As the aftermath of the delisting of wikileaks.org showed, however, the organization was able to get another domain (wikileaks.ch) and mirror its content to over a thousand other servers.

At present, responsibility for addressing illegal activity on the Internet, particularly copyright infringement, is spread across multiple parties. Copyright issues in the United States are addressed by a Digital Millennium Copyright Act (DMCA) takedown.

Let's take a walk back through some important history on copyright legislation. Last year, the "Combating Online Infringement and Counterfeits Act" (COICA) (S. 3804), introduced by Sen. Patrick Leahy on September 20, 2010, would have changed that dynamic. The bill, which passed the Senate Judiciary Committee, was meant to "fight online copyright infringement."

The mechanisms in the bill for enforcement would have forced domain name registries to prevent resolution of domains that online users try to visit. (Sound familiar?) These registries are the companies and organizations that administer top-level domains like .com, .org, .net, and so on, not a second-level provider like GoDaddy, or a DNS provider like the one that delisted Wikileaks.

It's worth considering how Wikileaks first lost its DNS registration for Wikileaks.org and then the ability to receive donations through PayPal. Another DNS provider put Wikileaks.org back online, but the precedent of how DNS could be used as a mechanism for censorship online was set.

Such precedents are important, both for networks within the borders of the United States and beyond. As more generic top-level domains are rolled out by ICANN in the years ahead, more governments will receive control over commercial top-level domains.

If the United States Congress follows through in creating legislation, other governments will have cover to do the same with domains linked to websites that they declare are in violation of their own laws. Domestic actions on Internet policy, in other words, have global impact in a networked age.

COICA enforcement would have required financial transaction providers to prevent transactions for "customers located within the United States based on purchases associated with the domain name." What was particularly notable about Wikileaks as a case study, then, was not the use of DNS, which the organization quickly routed around. It was how cutting off electronic payment mechanisms starved Wikileaks' operation of funding. (The effectiveness of which was noted by Google's Katherine Oyama last week during the hearing before the U.S. House.)

COICA received widespread criticism from civil liberties organizations, although not to the level that COICA's descendants have more recently. Senator Leahy said that the "Chamber of Commerce, organized labor, content owners and a tremendous cross-section of industry groups all support this legislation." On that count, COICA was supported by the Motion Picture Association of America, the U.S. Chamber of Commerce, the Screen Actors Guild, Viacom, and the International Alliance of Theatrical Stage Employees, Moving Picture Technicians, Artists and Allied Crafts of the United States. (These supporters should look familiar as well.)

COICA was opposed by organizations and individuals such as the CDT, the EFF, the Distributed Computing Industry Association, Tim Berners-Lee, the American Civil Liberties Union and Human Rights Watch.

As Mike Masnick wrote at TechDirt, the creator of the World Wide Web, Tim Berners-Lee, came out against COICA:

"We all use the web now for all kinds of parts of our lives, some trivial, some critical to our life as part of a social world," says Tim Berners-Lee, creator of the Web. "In the spirit going back to Magna Carta, we require a principle that: No person or organization shall be deprived of their ability to connect to others at will without due process of law, with the presumption of innocence until found guilty. Neither governments nor corporations should be allowed to use disconnection from the Internet as a way of arbitrarily furthering their own aims."

Berners-Lee reiterated his opposition to the U.S. government censoring the Web last week.

ICE and the Internet

The concerns of the Internet community over the use of DNS for enforcement have played out over the last year. Those concerns emerged when the White House's new intellectual property enforcement office indicated that the Immigration and Customs Enforcement (ICE) division of the Department of Homeland Security (DHS) would expand website takedowns to online pharmacies. The issues around the seizure of domain names became more complex over the 2010 Thanksgiving holiday, when online piracy enforcement moved to music blogs. The operator of one of those music blogs, Joe Hoffman, went on the record to the New York Times to state that he had been given no information about what his site was being charged with. This highlighted the due process and transparency issues around enforcement.

"A fundamental problem with the ICE seizures is insufficient regard for due process--the right of people to defend themselves before adverse actions are taken against them," said John Bergmayer, staff attorney at Public Knowledge, a Washington, D.C.-based public interest group. "Various kinds of property seizures have been abused in other areas of the law for years. I think there's an important distinction to be drawn on the domestic domain name vs. international domain name issue. In the one case it's domain seizure, in the other it's blocking."

Bergmayer argues that when it comes to domestic domain names, the government has all the tools it needs to redirect domains. "This might be bad policy and law but it doesn't really 'break' DNS per se — it uses extraordinary means to change what the canonical DNS entry is for a domain."

When it comes to international domains, there are different considerations. "End-user ISPs and anyone who operates a DNS server in the U.S. can be directed to actually break DNS for particular sites," said Bergmayer. "They are directed to not follow the canonical DNS entry. [There are] dumb technological problems with that — it both breaks the functioning of the Internet, could fragment it into various incompatible nets, raises security problems, and ultimately could be routed around with a simple Firefox plug-in."
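Bergmayer's distinction between the canonical entry and what a directed resolver returns can be sketched in a few lines. This is a toy model under stated assumptions: the domain names, the addresses (from a documentation-only range), the blocklist and both functions are hypothetical, and no real resolver or statute works exactly this way.

```python
from typing import Optional

# Hypothetical canonical records; 198.51.100.x is reserved for documentation.
CANONICAL = {
    "blocked.example": "198.51.100.7",
    "news.example": "198.51.100.8",
}

# Hypothetical list of names a U.S. resolver has been directed not to answer.
BLOCKLIST = {"blocked.example"}


def canonical_resolve(name: str) -> Optional[str]:
    """Answer from the canonical records, as an unfiltered resolver would."""
    return CANONICAL.get(name)


def filtering_resolve(name: str) -> Optional[str]:
    """Refuse to answer for blocked names; otherwise follow the canonical records."""
    if name in BLOCKLIST:
        return None
    return canonical_resolve(name)


if __name__ == "__main__":
    for name in ("blocked.example", "news.example"):
        print(name, canonical_resolve(name), filtering_resolve(name))
```

Two users asking the same question get different answers depending on which resolver they use, which is the fragmentation he warns about. Pointing software at the unfiltered resolver restores the canonical answer, which is why he expects such blocks to be routed around.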

SOPA and Internet freedom

When asked about anti-piracy legislation by Rep. Howard Berman, Secretary of State Hillary Clinton said that there "is no contradiction between intellectual property rights protection and enforcement and ensuring freedom of expression on the Internet."

Sohn disagrees. SOPA "undermines cybersecurity and encourages, country by country, balkanization of the Internet," she said. "It's a blunt instrument. It certainly will affect free speech."

Wong similarly disputes that position. SOPA "is really hard to square with the United States' current foreign policy goal of one Internet," she said. "If adopted, it could have a real effect on human rights defenders. We've seen the ability of tools like Tor to help. If government creates obligations on these services to moderate the behavior of users, it will be hard."

Should Secretary Clinton continue to offer full support for these bills, she could be presented with an additional diplomatic headache: the European Parliament warned of global dangers from U.S. domain revocation proposals on Thursday.

Intermediary liability and ACTA

Another important element of the future of copyright at a global level surrounds the Anti-Counterfeiting Trade Agreement (ACTA), an agreement that would establish international standards for intellectual property.

Why does ACTA matter to the media and citizens? Consider the phrase "intermediary liability." That's the principle that websites on the Internet, like YouTube, Internet service providers, web hosting companies or social networks, should not be held liable for the content created or uploaded by their users.

White House deputy chief technology officer for Internet policy Danny Weitzner explained what intermediary liability is and why it matters in the context of copyright and the PROTECT IP Act at this year's Web 2.0 Summit. Fast forward to minute 12:42 for his remarks.

"Requiring intermediary action is always troubling, but if you're going to do it, it's better to go after direct business relationships than more indirect or technical connections," said Bergmayer. "Thus, one of the big problems with the ICE takedowns is the involvement of registries, who have no direct business relationship with the sites in question (and thus no incentive to try to protect their customers). Online service providers have sometimes been heroes in protecting their customers, and it's good to try to preserve that dynamic."

The CDT has published a comprehensive white paper on intermediary liability (PDF) and strongly advocates for the protection of intermediaries online.

The Google Public Policy blog wrote about intermediary liability back in 2007, when India considered changes to its technology laws. As Doc Searls described it in his post on the Internet in China, holding content carriers accountable for copyright through intermediary liability can be thought of as a kind of "encirclement."

That's because, as MacKinnon wrote in her article on Internet "self-discipline" in China, "all Internet and mobile companies are held responsible for everything their users post, transmit, or search for."

In theory, as the ACTA FAQ sheets from Canada state, the treaty is meant to focus on copyright issues, not free speech. In practice, as the EFF makes clear in its ACTA brief, the potential for intermediary liability to be an issue in other countries is a legitimate concern.

"When it comes to intermediary-directed actions, there's a fundamental disagreement about whether those count as 'enforcement' provisions," said Bergmayer. "I would argue that it goes beyond 'enforcement' when you take a new party X and tell him he now is legally obligated to do something new. That's the creation of a new obligation, forced deputization. To me, increased 'enforcement' means, basically, that cops start doing their job better, that courts process cases more quickly — not restructuring the balance of legal responsibilities."

The EFF submitted its concerns about ACTA after the official request for comments last December. ACTA is "likely to cause harm to investment and innovation in the U.S. technology sector and to American citizens' ability to engage in currently lawful conduct," said Bergmayer.

On ACTA, Bergmayer made an important point that is often lost when the U.S. Trade Representative (USTR) frames the agreement as non-binding.

"The agreement is binding from an international law perspective, and the USTR can't do anything about that. If the U.S. is out of compliance, other countries would have all the usual remedies against the U.S. in international bodies. In U.S. courts, it would not be binding, but only persuasive," he said.

What's at stake for the open Internet with ACTA, SOPA and the PROTECT IP Act? It's "what's been at stake for more than 15 years: the possibility that a coalition of forces who are afraid of the Internet will shut it down," Harvard Law professor Yochai Benkler told me at the eG8 forum this spring. I saw Benkler again at the Club de Madrid annual conference this past week and talked more with him about the challenges to the Internet as we know it today, including SOPA. He mentioned that his paper on the PROTECT IP Act had been receiving more attention and had become more relevant with the introduction of SOPA. The paper compares the attack on Wikileaks to key elements of PROTECT IP on a deep level.

"There is still a very powerful counter argument, one that says both for innovation and for freedom, we need an open Net," he said. "Both for growth and welfare, and for democracy and participation, we need to make sure that the Internet remains an open Internet, remains a commons we all share, remains neutral at all layers, the physical layer, at the logical layer, at the data layer, at the content layer — at all of these layers, we must have an open Internet. That's still very strong, but it seems more threatened today than it has been for five or six years. We seem to be closer to the risk we were at in the late '90s, than the risk we were at five years ago."

The sleeping Internet giant awakes

In the weeks since SOPA was introduced, the technology media has done a creditable job raising awareness about the bill, albeit with some rhetoric that might mask the genuinely substantive concerns about its provisions. A coalition of organizations that oppose the bill created a website, FightForTheFuture.org, and a simple video that asserts that "PROTECT IP Breaks the Internet."

Mike Masnick has been blogging non-stop at TechDirt. The EFF has documented an explosion of opposition to the bill, including venture capitalists like Albert Wenger, Brad Burnham, Fred Wilson and many others who opposed the PROTECT IP Act in a letter to Congress. In a rare journey to the nation's capital, Wilson and others visited Capitol Hill this fall in an effort to explain to lawmakers why this approach is problematic.

They're far from alone. AOL, Yahoo, Google, Facebook, Twitter, eBay, LinkedIn, Mozilla and Zynga wrote a letter to Congress opposing SOPA.

I joined "The Alyona Show" to talk about growing opposition from the tech industry to SOPA last week. The show's producers read my article in the Huffington Post, "Internet Companies and Lawmakers Speak Out Against the Stop Online Piracy Act," and asked me to come in to talk about it.

As the CDT has cataloged, there's a growing chorus of opposition to SOPA.

That rising tide raises a question: If Congress declared war on the Internet, as GigaOm's Mathew Ingram put it, what happens if the Internet fights back?

Last week, websites across the Internet joined in "American Censorship Day" and encouraged citizens to contact their representatives. A White House epetition to "Stop the E-PARASITE Act" gained 18,000 signatures, which means that the White House will respond to it. An Avaaz petition to "Save the Internet" now has more than 517,000 signatures.

In what looks like a new horizon for Internet activism, Tumblr said that its users were averaging 3.6 calls every second to Congress at one point using an innovative Internet "click to call" tool the blogging platform created to "protect the Net." Tumblr's historic day resulted in 87,834 calls to representatives, averaging 53 seconds per call.

Tumblr is not alone. SendWrite received more than 3,000 letters telling Congress to stop SOPA. Votizen collected hundreds of supporters for letters opposing PIPA.

Reddit galvanized its substantial community around censorship. The Reddit community also crowdsourced discovery and aggregation of potentially infringing content on the congressional websites of the representatives that sponsored the legislation. As Ars Technica reported, however, SOPA sponsors are probably not about to make themselves felons.

OpenCongress logged 53,000 site visits and 65,000 page views to its SOPA and PROTECT IP Act information, with 810 emails sent from the public to Congress. OpenCongress has been hosting an important experiment in open government over the past few days: a public markup of SOPA, where citizens comment on provisions in the bill text.

Will any of these online efforts have an offline impact? Over at Forbes, Kashmir Hill took an in-depth look at whether Internet lobbying can be as effective as money spent in Washington, but the question about impact is left unanswered. We'll see. To know whether all of that effort made a difference, we'd need to know whether the bill was subsequently amended (not yet), withdrawn (unlikely), voted down (no opportunity yet), or if more lawmakers come out against it. Pelosi did so last week, although it's not clear if online activism prompted the move.

The bottom line here is that, as currently drafted, SOPA and the PROTECT IP Act have the potential to negatively affect innovation and Internet security, and enshrine into law the principle that a website hosting user-generated content is liable for any infringing content posted to it. The bills would increase the regulatory burden upon both startups and huge Internet companies, including new requirements to track users in ways that might conflict with the Federal Trade Commission's (FTC) instructions to create "privacy by design."

"It's not possible to predict the future, but it is possible to shape it for good or ill," Tim O'Reilly said in our interview. "There is a clear and present danger to the future. The threat isn't online piracy. It is ill-considered laws driven by the narrow interests of companies that are unable to compete in a changing marketplace."

As the importance of the Internet as a platform for collective action, commerce, open government and media grows, so too does the need for citizens, officials and journalists to understand how it works.

"Washington is sometimes rightfully criticized for harboring some crazy ideas when it comes to the Internet," wrote Nancy Scola in Salon. "But the federal government has gotten some basic things very right, from funding the Internet in its early stages to having the wisdom to enshrine Section 230 and the safe harbors. It would be a shame to see Congress trash that legacy with a single bill."

Digital literacy involves much more than knowing how to manage Facebook privacy settings, download software updates or choose strong passwords. Similarly, civic literacy involves more than knowing where to vote once every two or four years. If you use social media, watch online video, work on open source software, run an online business or believe in the Internet, it's long past time to become more literate on both counts.

Most American citizens oppose government involvement in blocking access to content online, particularly when the word "censor" is accurately applied. When asked if ISPs, social media sites and search engines should block access — as they would under SOPA — only a third of Americans agree.

Negative attention and significant questions about innovation and cybersecurity appear to have harmed the prospects for SOPA in the U.S. House. While the SOPA debate is far from over, Congressman Issa said on Friday that efforts to grease the skids of SOPA had failed.

Whatever your opinion of SOPA, the PROTECT IP Act or other proposed legislation, there are now a growing number of digital tools to become better informed and to let your legislators know where you stand. Learn more about SOPA. It's never been easier to do so. It is, after all, your Internet.

July 23 2010

Web 2.0 risks and rewards for federal agencies

The nature of record keeping and government transparency in the information age is rapidly changing. Officials can text, tweet, direct message, send "Facemail," IM or Skype, all from a personal smartphone. That's why yesterday's testimony of David Ferriero, Archivist of the United States, at a hearing on "Government 2.0: Federal Agency Use Of Web 2.0 Technologies" was both critically relevant and useful. (It's embedded below, after the jump.)

Officials are "free to use external accounts as long as emails are captured into records management systems," he said. "Every new technology provides new challenges to what is a record." Ferriero said that new guidance on government use of social media will be released this fall, updating the 2009 guidance issued by the National Archives and Records Administration (NARA).

The biggest challenge, said Ferriero, is whether the record is the whole site or just a portion. "Web 2.0 offers opportunities unimagined a decade ago," he said.

David McClure, associate administrator for citizen services and innovative technologies at the General Services Administration, echoed that sentiment in his testimony. "Web 2.0 isn't fundamentally about the technology itself but how people are coming together to achieve extraordinary results," he said, pointing to uses for idea management, ranking ideas, communication and more. "From an efficiency perspective, a lot of software meets those needs without the need for the agency to build tools, when the market is as robust as it is today."

More on the House subcommittee hearing on Government 2.0 after the jump, including a United States Government Accountability Office (GAO) report on Web 2.0 and security in government and videos.

The potential of Web 2.0 technology, as illustrated by the many examples McClure provided in his testimony, is balanced by both risk and privacy concerns. "The expectation is that any tool for government use adheres to the Privacy Act and a Privacy Impact Assessment," said McClure. Applications must be compliant with relevant regulations to be on Apps.gov, said McClure.

Gregory Wilshusen, director of information security issues for the GAO, testified at the hearing about government security challenges posed by the use of Web 2.0 platforms by federal agencies. He delivered a new report, embedded below, and said that the GAO will be looking at the preparation of agencies to retain records from social media platforms. "We've found a number of agencies using technology to interact well," he said. "Several are using technology in an effective manner using videos and blogs." Wilshusen said that the GAO will examine whether information maintained by third-party providers is subject to Freedom of Information requests, which is, as he put it, "rather strenuous."

In reviewing federal activity, Wilshusen said the GAO found most agencies are using social media platforms. He highlighted three effective examples:

The challenge throughout all of these applications lies in privacy, security and records management, said Wilshusen. "Are these federal records?"

Testimony of United States Archivist

"If we're going to be advising other agencies on how to use these tools, we need to use them ourselves," said Ferriero. For instance, Ferriero said that given the severe budget conditions expected for the year ahead, they're using IdeaScale to crowdsource ideas. Similarly, the National Archives crowdsourced the redesign of its website.

"We need to rethink the definition of a record," said Ferriero. "What part of technology is permanent that we need to keep in perpetuity?" When asked about the areas the committee should focus oversight upon, Ferriero said that it's clear agencies have identified a "moderate to high risk" regarding archiving electronic records and that NARA needs to provide more guidance. His written testimony is embedded below:

Testimony of David McClure on Web 2.0 in government

McClure offered one direction for archiving: distinguishing between official government business and personal use. Some officials have chosen to use multiple accounts for that reason, though gray areas are a real risk given the closeness of Washington society.

McClure was crystal clear on one point: the rapid changes to technology have changed American society. "Web 2.0 tools are essential for responding to shifting expectations of government," he said, citing the hundreds of billions of pieces of content shared on social networks and viewership of YouTube. McClure said that government is "expected to engage" on these new platforms. Using them, however, "should be aligned with core government principles," much as social media is used for business purposes in the private sector. McClure pointed to the Library of Congress, State Department and the TSA's IdeaFactory as examples of agencies using social media to deliver on their missions.

Consumer watchdog: More scrutiny of Web 2.0

Simpson urged the subcommittee not only to look at abstract technologies but also to compare those providing services, their approaches to privacy. In cloud computing and security, there's a "tendency of tech companies to overpraise." Simpson said Google missed the deadline for its Los Angeles cloud implementation, as the city had to come up with more money after Google did not meet security requirements set by the Los Angeles police department. (Google and partner Computer Sciences Corp. agreed to reimburse the city for the cost of the delay, which, according to MarketWatch, should only reach about $135,000).

Simpson reflected that his personal experience of Web 2.0 on the Obama campaign showed him that such technologies can be powerful tools to "improve government transparency, responsiveness and citizen involvement." He balanced that potential with challenges to consumer privacy, including Google's "Wi-Spy" and Buzz missteps and Facebook's changing privacy policy. Simpson also raised concerns about federal agencies implicitly endorsing individual Web 2.0 technology companies, and advocated that Congress enact robust privacy laws with meaningful warnings.

His testimony is embedded below:

Government 2.0, meet politics

Due to continued friction between Republicans and the Obama White House, the oversight hearing got off to a bit of a bumpy start. Politics overshadowed the technology. At issue was the Google Buzz imbroglio that involved White House deputy CTO Andrew McLaughlin and the absence of White House deputy CTO Beth Noveck, who had originally been slated to testify in front of the subcommittee on June 24th. Contrary to earlier reports, Consumer Watchdog's John Simpson did testify.

The substance of the Republican concern goes to whether communications by members of the executive branch are subject to the Presidential Records Act and thus must be disclosed. The position taken by Representative Lacy Clay (D-MO) was that the White House Office of Science and Technology Policy (OSTP), where McLaughlin works, is subject to the Federal Records Act, wherein an individual using a third-party electronic messaging system, such as Gmail, needs to ensure that a record gets into a proper records management system. Given that McLaughlin produced the email records in response to a FOIA request and was reprimanded by the White House, Clay considered the matter closed.

Rep. Patrick McHenry (R-NC) did not, nor did Rep. Darrell Issa (R-CA). Issa, the ranking member on the subcommittee, has continued to voice concerns about the administration's Google ties. McHenry also raised concerns about reports that White House staff met lobbyists at Caribou Coffee or used personal email accounts to communicate, thereby avoiding visitor logs or records management systems.

The issue of officials and staff in the executive branch using non-federal email systems is far from new, however, as Rep. Eleanor Holmes Norton noted in her response to McHenry. Among the thousands of emails from the Bush Administration were those sent by Karl Rove, which now appear lost to history.

It is unfortunate that, due to the politics surrounding the hearing, Noveck did not testify, a choice perhaps driven by media reports of a "showdown." Her knowledge of the use of social software by the federal government would have offered insight to both the subcommittee and the American people, to whom she and the Representatives are ultimately responsible. After votes to subpoena Noveck and to adjourn the hearing were defeated on party lines, 5-4, the subcommittee heard from the four witnesses. Hillicon Valley reported further on the political wrangling yesterday.

On video, transparency and Government 2.0

The Oversight Committee posted videos of the testimony and the back and forth between legislators. Additionally, House IT staff said the government hearing on Government 2.0 would be streamed at oversight.house.gov. It was impossible to miss, however, that the committee Twitter account, @OversightDems, didn't issue a single tweet about the hearing. An ironic committee policy came to light as well: only credentialed press were allowed to use laptops at the hearing, hampering the ability of government staff, bloggers or citizens to participate in documenting, discussing or posting status updates about the event online. Emily Long, a reporter from NextGov, didn't find the hearing to be social at all. Given the 2010 focus on the elements of open government (participation, transparency and collaboration), or Government 2.0 as described by McClure, wherein lightweight tools are used to share information about government activities, that policy deserves to be revisited.

Part 1

Part 2

Part 3

Part 4



April 19 2010

The Library of Congress archives tweets

The Library of Congress announced today that it has acquired Twitter's archive. The billions of tweets posted publicly since the microblogging site launched in 2006 will be accessible to posterity. The Library's aim is to preserve "traces" of a shared lived experience, so that the context of a given time and place can be better understood in the future. The Library says it currently holds more than 167 TB of digital data.

@librarycongress


February 11 2010

Tuesday February 9, 2010 GRITtv

President Obama promised change in Washington, but one year in we’ve got nothing but gridlock. Professor Lawrence Lessig has known Obama for years, and in this video from our friends at The Nation, Lessig calls on Obama–and all of us–to push for real change: change in Congress. We’ll be discussing this issue with Lessig and others on the show soon!

April 24 2009

Simon Johnson and Michael Perino


March 24 2009

February 25 2009

February 02 2009
