March 06 2012

Profile of the Data Journalist: The Daily Visualizer

Around the globe, the bond between data and journalism is growing stronger. In an age of big data, the growing importance of data journalism lies in the ability of its practitioners to provide context and clarity and, perhaps most important, to find truth in the expanding amount of digital content in the world. In that context, data journalism has profound importance for society.

To learn more about the people who are doing this work and, in some cases, building the newsroom stack for the 21st century, I conducted a series of email interviews during the 2012 NICAR Conference.

Matt Stiles (@stiles), a data journalist based in Washington, D.C., maintains a popular Daily Visualization blog. Our interview follows.

Where do you work now? What is a day in your life like?

I work at NPR, where I oversee data journalism on the State Impact project, a local-national partnership between us and member stations. My typical day always begins with a morning "scrum" meeting among the D.C. team as part of our agile development process. I spend time acquiring and analyzing data throughout each day, and I typically work directly with reporters, training them on software and data visualization techniques. I also spend time planning news apps and interactives, a process that requires close consultation with reporters, designers and developers.

How did you get started in data journalism? Did you get any special degrees or certificates?

No special training or certificates, though I did attend three NICAR boot camps (databases, mapping, statistics) over the years.

Did you have any mentors? Who? What were the most important resources they shared with you?

I have several mentors, both on the reporting side and the data side. For data, I wouldn't be where I am today without the help of two people: Chase Davis and Jennifer LaFleur. Jen got me interested early, and has helped me with formal and informal training over the years. Chase helped me with day-to-day questions when we worked together at the Houston Chronicle.

What does your personal data journalism "stack" look like? What tools could you not live without?

I have a MacBook that runs Windows 7. I have the basic CAR suite (Excel/Access, ArcGIS, SPSS, etc.) but also plenty of open-source tools, such as R for visualization or MySQL/Postgres for databases. I use Coda and TextMate for coding. I use BBEdit and Python for text manipulation. I also couldn't live without Photoshop and Illustrator for cleaning up graphics.

What data journalism project are you the most proud of working on or creating?

I'm most proud of the online data library I created (and others have since expanded) at The Texas Tribune, but we're building some sweet apps at NPR. That's only going to expand now that we've created a national news apps team, which I'm joining soon.

Where do you turn to keep your skills updated or learn new things?

I read blogs, subscribe to email lists and attend lots of conferences for inspiration. There's no silver bullet. If you love this stuff, you'll keep up.

Why are data journalism and "news apps" important, in the context of the contemporary digital environment for information?

More and more information is coming at us every day. The deluge is so vast. Data journalism at its core is important because it's about facts, not anecdotes.

Apps are important because Americans are already savvy data consumers, even if they don't know it. We must get them thinking -- or, even better, not thinking -- about news consumption in the same way they think about syncing their iPads or booking flights on Priceline or purchasing items on eBay. These are all "apps" that are familiar to many people. Interactive news should be, too.

This interview has been edited and condensed for clarity.

March 05 2012

OpenCorporates opens up new database of corporate directors and officers

In an age of technology-fueled transparency, corporations are subject to the same powerful disruption as governments. In that context, data journalism has profound importance for society. If a researcher needs data for business journalism, OpenCorporates is a bona fide resource.

Today, OpenCorporates is making a new open database of corporate officers and directors available to the world.

"It's pretty cool, and useful for journalists, to be able to search not just all the companies with directors for a given name in a given state, but across multiple states," said Chris Taggart, founder of OpenCorporates, in an email interview. "Not surprisingly, loads of people, from journalists to corruption investigators, are very interested in this."

OpenCorporates is the largest open database of companies and corporate data in the world. The service now contains public data from around the world, from health and safety violations in the United Kingdom to official public notices in Spain to a register of federal contractors. The database has been built by the open data community, under a bounty scheme in conjunction with ScraperWiki. The site also has a useful Google Refine reconciliation function that matches legal entities to company names. Taggart's presentation on OpenCorporates from the 2012 NICAR conference, which provides an overview, is embedded below:

The OpenCorporates open application programming interface can be used with or without a key, although an API key does increase usage limits. The open data site's business model comes with an interesting hook: while OpenCorporates makes its data both free and open under a Share-Alike Attribution Open Database License, users who wish to import the data into a proprietary database or use it without attribution must pay to do so.

"The critical thing about our Directors import, and *all* the other data in OpenCorporates, is that we give the provenance, both where and when we got the information," said Taggart. "This is in contrast to the proprietary databases who never give this, because they don't want you to go straight to the source, which also means it's problematic in tracing the source of errors. We've had several instances of the data being wrong at the source, like U.K. health and safety violations."

Taggart offered more perspective on the source of OpenCorporates director data, corporate data availability and the landscape around a universal business ID in the rest of our interview:

Where does the officer and director data come from? How is it validated and cleaned?

It's all from the official company registers. Most are scraped (we've scraped millions of pages); a couple (e.g. Vermont) are from downloads that the registries provide. We just need to make sure we're scraping and importing properly. We do some cleaning up (e.g. removing some of the '**NO DIRECTOR**' entries), but to a degree this has to be done post-import, as you often don't know about these until they're imported (which is why there are still a few in there).

By the way, in case you were wondering, the reason there are so many more directors than in the filters to the right is that there are about 3 million and counting Florida directors.
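(Note: The post-import cleanup Taggart describes can be pictured with a short sketch. This is not OpenCorporates' code: the record layout, the field names and the placeholder pattern below are assumptions made for illustration, and a real register would need its own list of placeholder strings.)

```python
import re
from typing import Dict, List, Tuple

# Hypothetical placeholder pattern: matches strings such as "**NO DIRECTOR**"
# or "no directors", ignoring case and any surrounding punctuation.
PLACEHOLDER = re.compile(r"^\W*no\s+directors?\W*$", re.IGNORECASE)


def is_placeholder(name: str) -> bool:
    """Return True when a name field holds a placeholder rather than a person."""
    return bool(PLACEHOLDER.match(name.strip()))


def split_officers(
    records: List[Dict[str, str]],
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Separate already-imported records into (kept, flagged) lists.

    Flagged records are returned rather than silently dropped, so they can
    be reviewed against the source register before removal.
    """
    kept: List[Dict[str, str]] = []
    flagged: List[Dict[str, str]] = []
    for record in records:
        target = flagged if is_placeholder(record.get("name", "")) else kept
        target.append(record)
    return kept, flagged
```

Returning the flagged rows instead of deleting them reflects the point made in the answer: placeholder entries are often only recognizable once the data has been imported, so the check runs over imported records and leaves the final decision to a reviewer.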

Was this data available anywhere before? If no, why not?

As far as I'm aware, only in proprietary databases. Proprietary databases have dominated company data. The result is massive duplication of effort, databases that have opaque errors in them, because they don't have many eyes on them, and lack of access to the public, small businesses, and as you will have heard from NICAR, journalists. I'm tempted to offer a bottle of champagne to the first journalist who finds a story in the directors data.

Who else is working on the universal business ID issue? I heard Beth Noveck propose something along these lines, for instance.

Several organizations have been working on this, mostly from a semi-proprietary point of view, or at least trying to generate a monopoly ID. In other words, it might be open, but in order to get anything on the company, you have to use their site as a lookup table.

OpenCorporates is different in that if you know the URI you know the jurisdiction and identity issued by the company register and vice versa. This means you don't need to ask OpenCorporates what the company ID is, as it's there in the ID. It also works with the EU/W3C's Business Vocabulary, which has just been published.
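(Note: A minimal sketch of the idea that the identifier carries its own meaning, so no lookup table is needed. It assumes company URIs of the form https://opencorporates.com/companies/<jurisdiction>/<company number>; that pattern, the function names and the example values are illustrative assumptions, not something specified in the interview.)

```python
from typing import Tuple
from urllib.parse import urlparse

# Assumed base of the URI pattern; treat as illustrative.
BASE = "https://opencorporates.com/companies"


def company_uri(jurisdiction: str, company_number: str) -> str:
    """Build a URI from the jurisdiction code and the register's own number."""
    return f"{BASE}/{jurisdiction.lower()}/{company_number}"


def parse_company_uri(uri: str) -> Tuple[str, str]:
    """Recover (jurisdiction, company_number) from a URI, with no lookup call."""
    parts = urlparse(uri).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "companies":
        raise ValueError(f"not a company URI of the assumed form: {uri}")
    return parts[1], parts[2]
```

Because the mapping runs in both directions, anyone who holds the register's jurisdiction and company number can construct the identifier, and anyone who holds the identifier can read those two values back out of it.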

ISO has been working on one, but it's got exactly this problem. Also, their database won't contain the company number, meaning it doesn't link to the legal entity. Bloomberg have been working on one, as have Thomson Reuters, as they need an alternative to the DUNS number, but from the conversations I had in D.C., nobody's terribly interested in this.

I don't really know the status of Beth's project. They were intending to create a new ID too. From speaking to Jim Hendler, it didn't seem to be connected to the legal entity but instead to represent a search of the name (actually a hash of a SPARQL query). You can see a demo site at http://tw.rpi.edu/orgpedia/companies. I have severe doubts regarding this.

Finally, there's the Financial Stability Board's (part of the G20) work on a global legal entity identifier -- we're on the advisory board for this. This also would be a new number, and be voluntary, but on the other hand will be openly licensed.

I don't think it's a solution to the problem, as it won't be complete and for other reasons, but it may surface more information. We'd definitely provide an entity resolution service to it.

February 21 2012

Building the health information infrastructure for the modern epatient

To learn more about what levers the government is pulling to catalyze innovation in the healthcare system, I turned to Dr. Farzad Mostashari (@Farzad_ONC). As the National Coordinator for Health IT, Mostashari is one of the most important public officials entrusted with improving the nation's healthcare system through smarter use of technology.

Mostashari, a public-health informatics specialist, was named ONC chief in April 2011, replacing Dr. David Blumenthal. Mostashari's full biography, available at HHS.gov, notes that he "was one of the lead investigators in the outbreaks of West Nile Virus and anthrax in New York City, and was among the first developers of real-time electronic disease surveillance systems nationwide."

I talked to Mostashari on the same day that he published a look back over 2011, which he hailed as a year of momentous progress in health information technology. Our interview follows.

What excites you about your work? What trends matter here?

Farzad Mostashari: Well, it's a really fun job. It feels like this is the ideal time for this health IT revolution to tie into other massive megatrends that are happening around consumer and patient empowerment, payment and delivery reform, as I talked about in my TED Med Talk with Aneesh Chopra.

These three streams [how patients are cared for, how care is paid for, and how people take care of their own health] coming together feels great. And it really feels like we're making amazing progress.

How does what's happening today grow out of the passage of the Health Information Technology for Economic and Clinical Health (HITECH) Act in 2009?

Farzad Mostashari: HITECH was a key part of ARRA, the American Recovery and Reinvestment Act. This is the reinvestment part. People think of roadways and runways and railways. This is the information infrastructure for healthcare.

In the past two years, we made as much progress on adoption as we had made in the past 20 years before that. We doubled the adoption of electronic health records in physician offices between the time the stimulus passed and now. What that says is that a large number of barriers have been addressed, including the financial barriers that are addressed by the health IT incentive payments.

It also, I think, points to the innovation that's happening in the health IT marketplace, with more products that people want to buy and want to use, and an explosion in the number of options people have.

The programs we put in place, like the Regional Health IT Extension Centers modeled after the Agriculture Extension program, give a helping hand. There are local nonprofits throughout the country that are working with one-third of all primary care providers in this country to help them adopt electronic health records, particularly smaller practices and maybe health centers, critical access hospitals and so forth.

This is obviously a big lift and a big change for medicine. It moves at what Jay Walker called "med speed," not tech speed. The pace of transformation in medicine that's happening right now may be unparalleled. It's a good thing.

Healthcare providers have a number of options as they adopt electronic health records. How do you think about the choice between open source versus proprietary options?

Farzad Mostashari: We're pretty agnostic in terms of the technology and the business model. What matters are the outcomes. We've really left the decisions about what technology to use to the people who have to live with it, like the doctors and hospitals who make the purchases.

There are definitely some very successful models, not only on the EHR side, but also on the health information exchange side.

(Note: For more on this subject, read Brian Ahier's Radar post on the Health Internet.)

What role do open standards play in the future of healthcare?

Farzad Mostashari: We are passionate believers in open standards. We think that everybody should be using them. We've gotten really great participation by vendors of open source and proprietary software, in terms of participating in an open standards development process.

I think what we've enabled, through things like modular certification, is a lot more innovation. Different pieces of the entire ecosystem could be done through reducing the barrier to entry, enabling a variety of different innovative startups to come to the field. What we're seeing is, a lot of the time, this is migrating from installed software to web services.

If we're setting up a reference implementation of the standards, like the Connect software or popHealth, we do it through a process where the result is open source. I think the government as a platform approach at the Veterans Affairs department, DoD, and so forth is tremendously important.

How is the mobile revolution changing healthcare?

We had Jay Walker talking about big change [at a recent ONC Grantee Meeting]. I just have this indelible image of him waving in his left hand a clay cone with cuneiform on it that is from 2,000 B.C. — 4,000 years ago — and in his right hand he held his iPhone.

He was saying both of them represented the cutting edge of technology that evolved to meet consumer need. His strong assertion was that this is absolutely going to revolutionize what happens in medicine at tech speed. Again, not "med speed."

I had the experience of being at my clinic, where I get care, and the pharmacist sitting in the starched, white coat behind the counter telling me that I should take this medicine at night.

And I said, "Well, it's easier for me to take it in the morning." And he said, "Well, it works better at night."

And I asked, acting as an empowered patient, "Well, what's the half life?" And he answered, "Okay. Let me look it up."

He started clacking away at his pharmacy information system; clickity clack, clickity clack. I can't see what he's doing. And then he says, "Ah hell," and he pulls out his smartphone and Googles it.

There's now a democratization of information and information tools, where we're pushing the analytics to the cloud. Being able to put that in the hand of not just every doctor or every healthcare provider but every patient is absolutely going to be that third strand of the DNA, putting us on the right path for getting healthcare that results in health.

We're making sure that people know they have a right to get their own data, making sure that the policies are aligned with that. We're making sure that we make it easy for doctors to give patients their own information through things like the Direct Project, the Blue Button, meaningful use requirements, or the Consumer E-Health Pledge.

We have more than 250 organizations that collectively hold data for 100 million Americans that pledge to make it easy for people to get electronic copies of their own data.

Do you think people will take ownership of their personal health data and engage in what Susannah Fox has described as "peer-to-peer healthcare"?

Farzad Mostashari: I think that it will be not just possible, not even just okay, but actually encouraged for patients to be engaged in their care as partners. Let the epatient help. I think we're going to see that emerging as there's more access and more tools for people to do stuff with their data once they get it through things like the health data initiative. We're also beginning to work with stakeholder groups, like Consumers Union, the American Nurses Association and some of the disease groups, to change attitudes around it being okay to ask for your own records.

This interview was edited and condensed. Photo from The Office of the National Coordinator for Health Information Technology.

Strata 2012 — The 2012 Strata Conference, being held Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work.

Save 20% on registration with the code RADAR20

February 16 2012

Four short links: 16 February 2012

  1. The Undue Weight of Truth (Chronicle of Higher Education) -- Wikipedia has become fossilized fiction because the mechanism of self-improvement is broken.
  2. Playfic -- Andy Baio's new site that lets you write text adventures in the browser. Great introduction to programming for language-loving kids and adults.
  3. Review of Alone Together (Chris McDowall) -- I loved this review, its sentiments, and its presentation. Work on stuff that matters.
  4. Why ESRI As-Is Can't Be Part of the Open Government Movement -- data formats without broad support in open source tools are an unnecessary barrier to entry. You're effectively letting the vendor charge for your data, which is just stupid.

February 14 2012

The bond between data and journalism grows stronger

While reporters and editors have been the traditional vectors for information gathering and dissemination, the flattened information environment of 2012 now has news breaking first online, not on the newsdesk.

That doesn't mean that the integrated media organizations of today don't play a crucial role. Far from it. In the information age, journalists are needed more than ever to curate, verify, analyze and synthesize the wash of data.

To learn more about the shifting world of data journalism, I interviewed Liliana Bounegru (@bb_liliana), project coordinator of SYNC3 and Data Driven Journalism at the European Journalism Centre.

What's the difference between the data journalism of today and the computer-assisted reporting (CAR) of the past?

Liliana Bounegru: There is a "continuity and change" debate going on around the label "data journalism" and its relationship with previous journalistic practices that employ computational techniques to analyze datasets.

Some argue [PDF] that there is a difference between CAR and data journalism. They say that CAR is a technique for gathering and analyzing data as a way of enhancing (usually investigative) reportage, whereas data journalism pays attention to the way that data sits within the whole journalistic workflow. In this sense, data journalism pays equal attention to finding stories and to the data itself. Hence, we find the Guardian Datablog or the Texas Tribune publishing datasets alongside stories, or even just datasets by themselves for people to analyze and explore.

Another difference is that in the past, investigative reporters would suffer from a poverty of information relating to a question they were trying to answer or an issue that they were trying to address. While this is, of course, still the case, there is also an overwhelming abundance of information that journalists don't necessarily know what to do with. They don't know how to get value out of data. As Philip Meyer recently wrote to me: "When information was scarce, most of our efforts were devoted to hunting and gathering. Now that information is abundant, processing is more important."

On the other hand, some argue that there is no difference between data journalism and computer-assisted reporting. It is by now common sense that even the most recent media practices have histories as well as something new in them. Rather than debating whether or not data journalism is completely novel, a more fruitful position would be to consider it as part of a longer tradition but responding to new circumstances and conditions. Even if there might not be a difference in goals and techniques, the emergence of the label "data journalism" at the beginning of the century indicates a new phase wherein the sheer volume of data that is freely available online combined with sophisticated user-centric tools enables more people to work with more data more easily than ever before. Data journalism is about mass data literacy.

What does data journalism mean for the future of journalism? Are there new business models here?

Liliana Bounegru: There are all kinds of interesting new business models emerging with data journalism. Media companies are becoming increasingly innovative with the way they produce revenues, moving away from subscription-based models and advertising to offering consultancy services, as in the case of the German award-winning OpenDataCity.

Digital technologies and the web are fundamentally changing the way we do journalism. Data journalism is one part in the ecosystem of tools and practices that have sprung up around data sites and services. Quoting and sharing source materials (structured data) is in the nature of the hyperlink structure of the web and in the way we are accustomed to navigating information today. By enabling anyone to drill down into data sources and find information that is relevant to them as individuals or to their community, as well as to do fact checking, data journalism provides a much needed service coming from a trustworthy source. Quoting and linking to data sources is specific to data journalism at the moment, but seamless integration of data in the fabric of media is increasingly the direction journalism is going in the future. As Tim Berners-Lee says, "data-driven journalism is the future".

What data-driven journalism initiatives have caught your attention?

Liliana Bounegru: The data journalism project FarmSubsidy.org is one of my favorites. It addresses a real problem: The European Union (EU) is spending 48% of its budget on agriculture subsidies, yet the money doesn't reach those who need it.

Tracking payments and recipients of agriculture subsidies from the European Union to all member states is a difficult task. The data is scattered in different places in different formats, with some missing and some scanned in from paper records. It is hard to piece it together to form a comprehensive picture of how funds are distributed. The project not only made the data available to anyone in an easy to understand way, but it also advocated for policy changes and better transparency laws.

Another of my favorite examples is the LRA Crisis Tracker, a real-time crisis mapping platform and data collection system. The tracker makes information about the attacks and movements of the Lord's Resistance Army (LRA) in Africa publicly available. It helps to inform local communities, as well as the organizations that support the affected communities, about the activities of the LRA through an early-warning radio network in order to reduce their response time to incidents.

I am also a big fan of much of the work done by the Guardian Datablog. You can find lots of other examples featured on datadrivenjournalism.net, along with interviews, case studies and tutorials.

I've talked to people like Chicago Tribune news app developer Brian Boyer about the emerging "newsroom stack." What do you feel are the key tools of the data journalist?

Liliana Bounegru: Experienced data journalists list spreadsheets as a top data journalism tool. Open source tools and web-based applications for data cleaning, analysis and visualization play very important roles in finding and presenting data stories. I have been involved in organizing several workshops on ScraperWiki and Google Refine for data collection and analysis. We found that participants were quite able to quickly ask and answer new kinds of questions with these tools.

How does data journalism relate to open data and open government?

Liliana Bounegru: Open government data means that more people can access and reuse official information published by government bodies. This in itself is not enough. It is increasingly important that journalists can keep up and are equipped with skills and resources to understand open government data. Journalists need to know what official data means, what it says and what it leaves out. They need to know what kind of picture is being presented of an issue.

Public bodies are very experienced in presenting data to the public in support of official policies and practices. Journalists, however, will often not have this level of literacy. Only by equipping journalists with the skills to use data more effectively can we break the current asymmetry, where our understanding of the information that matters is mediated by governments, companies and other experts. In a nutshell, open data advocates push for more data, and data journalists help the public to use, explore and evaluate it.

This interview has been edited and condensed for clarity.

Photo on associated home and category pages: NYTimes: 365/360 - 1984 (in color) by blprnt_van, on Flickr.

February 13 2012

Open innovation works in the public sector, say federal CTOs

President Barack Obama named Aneesh Chopra as the nation’s first chief technology officer in April 2009. In the nearly three years since, he was a tireless, passionate advocate for applying technology to make government and society work better. If you're not familiar with the work of the nation's first CTO, make sure to read Nancy Scola's extended "exit interview" with Aneesh Chopra at the Atlantic, where he was clear about his role: "As an advisor to the president, I have three main responsibilities," he said: "To make sure he has the best information to make the right policy calls for the country, which is a question of my judgment."

On his last day at the White House, Chopra released an "open innovator's toolkit" that highlights twenty different case studies in how he, his staff and his fellow chief technology officers at federal agencies have been trying to stimulate innovation in government.

Chopra announced the toolkit last week at a forum on open innovation at the Center for American Progress in Washington. The forum was moderated by former Virginia congressman Tom Perriello, who currently serves as counselor for policy to the Center for American Progress, and featured Todd Park, U.S. Department of Health and Human Services CTO; Peter Levin, senior advisor to the Veterans Affairs Secretary and U.S. Department of Veterans Affairs CTO; and Chris Vein, deputy U.S. CTO for government innovation at the White House Office of Science and Technology Policy. Video of the event is embedded below:

An open innovator's toolkit

"Today, we are unveiling 20 specific techniques that are in and of themselves interesting and useful -- but they speak to this broader movement of how we are shifting, in many ways, or expanding upon the traditional policy levers of government," said Chopra in his remarks on Wednesday. In the interview with the Atlantic and in last week's forum, Chopra laid out four pillars in the administration's approach to open innovation:

  • Moving beyond providing public sector data by request to publishing machine-readable open data by default
  • Engaging with the public not simply as a regulator but as "impatient convener"
  • Using prizes and competitions to achieve outcomes, not just procurements
  • Focusing on attracting talented people to government by allowing them to serve as “entrepreneurs-in-residence”

"We are clearly moving to a world where you don't just get data by requesting it but it's the default setting to publish it," said Chopra. "We're moving to a world where we're acting beyond the role of regulator to one of 'impatient convening.' We are clearly moving to a world where we're not just investing through mechanisms like procurement and RFPs to one where we're tapping into the expertise of the American people through challenges, prizes and competition. And we are changing the face of government, recruiting individuals who have more of an entrepreneur-in-residence feel than a traditional careerist position that has in it the expectation of a lifetime of service."

"Entrepreneurs and innovators around the country are contributing to our greater good. In some cases, they're coming in for a tour of duty, as you'll hear from Todd and Peter. But in many others, they're coming in where they can and how they can because if we tap into the collective expertise of the American people we can actually overcome some of the most vexing challenges that today, when you read the newspaper and you watch Washington, you say, 'Gosh, do we have it in us' to get beyond the divisions and these challenges, not just at the federal government but across all levels of the public sector."

Open innovation, applied

Applying open innovation "is a task we’ve seen deployed effectively across our nation’s most innovative companies," writes Chopra in the memorandum on open innovation that the White House released this week. "Procter & Gamble’s “Connect+Develop” strategy to source 50% of its innovations from the outside; Amazon’s “Just Do It” awards to celebrate innovative ideas from within; and Facebook’s “Development Platform” that generated an estimated 180,000 jobs in 2011 focused on growing the economy while returning benefits to Facebook in the process."

The examples that Chopra cited are "bonafide," said MIT principal research professor Andrew McAfee, via email. "Open innovation or crowdsourcing or whatever you want to call it is real, and is (slowly) making inroads into mainstream (i.e. non high-tech) corporate America. P&G is real. Innocentive is real. Kickstarter is real. Idea solicitations like the ones from Starbucks are real, and lead-user innovation is really real."

McAfee also shared the insight of Eric Von Hippel on innovation:

“What is changing,” is that it is getting easier for consumers to innovate, with the Internet and such tools, and it is becoming more visible for the same reason. Historically though the only person who had the incentive to publicize innovation was the producer. People build institutions around how a process works and the mass production era products were built by mass production companies, but they weren’t invented by them. When you create institutions like mass production companies you create the infrastructure to help and protect them such as heavy patent protection. Now though we see that innovation is distributed, open collaborative.”

In his remarks, Chopra hailed a crowdsourced approach to the design of DARPA's next-generation combat vehicle, where an idea from a U.S. immigrant led to a better outcome. "The techniques we’ve deployed along the way have empowered innovators, consumers, and policymakers at all levels to better use technology, data, and innovation," wrote Chopra in the memo.

"We’ve demonstrated that “open innovation,” the crowdsourcing of citizen expertise to enhance government innovation, delivers real results. Fundamentally, we believe that the American people, when equipped with the right tools, can solve many problems." To be fair, the "toolkit" in question amounts more to a list of links and case studies than a detailed manual or textbook, but people interested in innovating in government at the local, state and national level should find it useful.

The question now is whether the country and its citizens will be the "winners in the productivity revolutions of the future," posed Chopra, looking to the markets for mobile technology, healthcare and clean energy. In that context, Chopra said that "open data is an active ingredient" in job creation and economic development, citing existing examples. Six million Californians can now download their energy data through the Green Button, said Chopra, with new Web apps like Watt Quiz providing better interfaces for citizens to make more informed consumption decisions.

More than 76,000 Americans found places to get treatment or health services using iTriage, said Chopra, with open data spurring better healthcare decisions by a more informed mobile citizenry. He hailed the role of collaborative innovation in open government, citing mobile healthcare app ginger.io.

Open government platforms

During his tenure as US CTO, Chopra was a proponent of open data and participatory platforms, and one of the Obama administration's most prominent evangelists for the use of technology to make government more open and collaborative. Our September 2010 interview on his work is embedded below:

In his talk last Wednesday, Chopra highlighted two notable examples of open government. First, he described the "startup culture" at the Consumer Financial Protection Bureau, highlighting the process by which the new .gov agency designed a better mortgage disclosure form.

Second, Chopra cited two e-petitions to veto the Stop Online Piracy Act and Protect IP Act on the White House e-petition platform, We The People, as an important example of open government in action. The e-petitions, which gathered more than 103,000 signatures, are proof that when citizens are given the opportunity to participate, they will, said Chopra. The White House response came at a historic moment in the week the Web changed Washington. "SOPA/PIPA is exactly what We the People was meant to do," Chopra told Nancy Scola.

Traditionally, Congress formally requests a Statement of Administration Policy, called a "SAP." Requests for SAPs come in all the time from Congress. We respond based on the dynamics of Washington, priorities and timelines. One would argue that a Washington-centric approach would have been to await the request for a SAP and publish it, oftentimes when a major vote is happening. If you contrast that with where SOPA/PIPA was, still in committee or just getting out of committee, and not yet on the floor, traditionally a White House would not issue a SAP that early. So the train we were on, the routine Washington line of business, we would have awaited the right time to issue a SAP, and done it at congressional request. It just wasn't time yet. The We the People process flipped upside-down to whom we are responsible for providing input. In gathering over a hundred thousand signatures, on SOPA/PIPA, the American people effectively demanded a SAP.

Innovation for healthcare and veterans

"I think people will embrace the open innovation approach because it works," said Todd Park at last week's forum, citing examples at Novartis, Aventis and Walgreens, amongst others. Park cited "Joy's Law," by Sun Microsystems computer science pioneer Bill Joy: "no matter who you are, you have to remember that most of the smart people don't work for you."

Part of making that work is opening up systems in a way that enables citizens, developers and industry to collaborate in creating solutions. "We're moving the culture away from proprietary, closed systems … into something that is modular, standards-based & open," said Peter Levin.

If you went to the Veterans Affairs website in 2009, you couldn't see where you were in the process, said Levin. One of the ways to solve that problem is to create a platform for people to talk to each other, he explained, which the VA was able to do through its Facebook page.

That may be a "colossal policy change," in his view, but it had an important result: "the whole patronizing fear that if we open up dialogue, open up channels, you'll create a problem you can't undo - that's not true for us," he said.

If you want to rock and roll, emphasized Park, don't just have your own smart people work on a challenge. That's an approach that Aventis executives found success using in a data diabetes challenge. Walgreens will be installing "Health Guides" at its stores to act as a free "health concierge," said Park, as opposed to what they would have done normally. They launched a challenge and, in under three months, got 50 credible prototypes. Now, said Park, mHealthCoach is building Health Guides for Walgreens.

One of the most important observations Park made, however, may have been that there has been too much of a focus on apps created from open data, as opposed to data informing policy makers and care givers. If you want to revolutionize the healthcare industry, open data needs to be at the fingertips of the people who need it most, where they need it most, when they need it most.

For instance, at a recent conference, he said, "Aetna rolled out this innovation called a nurse." If you want to have data help people, build a better IT cockpit for that nurse that helps that person become more omniscient. Have the nurse talk over the telephone with a human who can be helped by the power of the open data in front of the healthcare worker.

Who will pick up the first federal CTO's baton?

Tim O'Reilly made a case for Chopra in April 2009, when the news of his selection leaked. Tim put the role of a federal CTO in the context of someone who provides "visionary leadership, to help a company (or in this case, a government) explore the transformative potential of new technology." In many respects, Chopra delivered upon that goal during his tenure. The person who fills the role will need to provide similar leadership, and to do so in a difficult context, given economic and political headwinds that confront the White House.

As he turns the page towards the next chapter of his career -- one which sources cited by the Washington Post say might lead him into politics in Virginia -- the open question now will be who President Obama will choose to be the next "T" in the White House Office of Science and Technology Policy, a role that remains undefined, in terms of Congressional action.

The administration made a strong choice in federal CIO Steven VanRoekel. Inside of government, Park and Levin are both strong candidates for the role, along with Andrew Blumenthal, CTO at the Bureau of Alcohol, Tobacco and Firearms. In the interim, Chris Vein, deputy chief technology officer for public sector innovation, is carrying the open government innovation banner in the White House.

In this election year, who the administration chooses to pick up the baton from Chopra will be an important symbol of its commitment to harnessing technology on behalf of the American people. Given the need for open innovation to address the nation's grand challenges, from healthcare to energy to education, the person tapped to run this next leg will play an important role in the country's future.

February 01 2012

With GOV.UK, British government redefines the online government platform

The British Government has launched a beta of its GOV.UK platform, testing a single domain that could be used throughout government. The new single government domain will eventually replace Directgov, the UK government portal which launched back in 2004. GOV.UK is aimed squarely at delivering faster digital services to citizens through a much improved user interface at decreased cost.

Unfortunately, far too often .gov websites cost millions and don't deliver as needed. GOV.UK is mobile-friendly, platform agnostic, scalable, open source, built with HTML5, hosted in the cloud and open for feedback. Those criteria collectively embody the default for how governments should approach their online efforts in the 21st century.

gov.uk screenshot

“Digital public services should be easy to find and simple to use - they must also be cost effective and SME-friendly," said Francis Maude, the British Minister for the Cabinet Office, in a prepared statement. "The beta release of a single domain takes us one step closer to this goal."

Tom Loosemore, deputy director of government digital service at UK Government, introduced the beta of GOV.UK at the Government Digital Service blog, including a great deal of context on its development and history. Over at the Financial Times Tech blog, Tim Bradshaw published an excellent review of the GOV.UK beta.

As Bradshaw highlights, what's notable about the new beta is not just the site itself but the team and culture behind it: that of a large startup, not the more ponderous bureaucracy of Whitehall, the traditional "analogue" institution.

GOV.UK is a watershed in how government approaches Web design, both in terms of what you see online and how it was developed. The British team of developers, designers and managers behind the platform collaboratively built GOV.UK in-house using agile development and the kind of iterative processes one generally only sees in modern Web design shops. Given that this platform is designed to serve as a common online architecture for the government of the United Kingdom, that's meaningful.

“Our approach is changing," said Maude. "IT needs to be commissioned or rented, rather than procured in huge, expensive contracts of long duration. We are embracing new, cloud-based start-ups and enterprise companies and this will bring benefits for small and medium sized enterprises here in the UK and so contribute to growth.”

The designers of GOV.UK, in fact, specifically describe it as "government as a platform," in terms of something that others can build upon. It was open from the start, given that the new site was built entirely using open source tools. The code behind GOV.UK has been released as open source code on GitHub.

"For me, this platform is all about putting the user needs first in the delivery of public services online in the UK," said Mike Bracken, executive director of government digital services. Bracken is the former director of digital development at the Guardian News and Media and was involved in setting up MySociety. "For too long, user need has been trumped by internal demands, existing technology choices and restrictive procurement practices. Gov.uk puts user need firmly in charge of all our digital thinking, and about time too."

The Gov.UK stack

Reached via email, Bracken explained more about the technology choices that have gone into GOV.UK, starting with the platform diagram below.

gov.uk screenshot

Why create an open source stack? "Why not?" asked Bracken. "It's a government platform, and as such it belongs to us all and we want people to contribute and share in its development."

While many local, state and federal sites in the United States have chosen to adapt and use Wordpress or Drupal as open government platforms, the UK team started afresh.

"Much of the code is based on our earlier alpha, which we launched in May last year as an early prototype for a single platform," said Bracken. "We learnt from the journey, and rewrote some key components recently, one key element of the prototype in scale."

According to Bracken, the budget for the beta is £1.7 million, which they are running under at present. (By way of contrast, the open government reboot of FCC.gov was estimated to cost 1.35 million dollars.) There are about 40 developers coding on GOV.UK, said Bracken, but the entire Government Digital Service has around 120 staff, with up to 1800 external testers. They also used several external development houses to complement their team, some for only two weeks at a time.

Why build an entirely new open government platform? "It works," said Bracken. "It's inherently flexible, best of breed and completely modular. And it doesn't require any software licenses."

Bracken believes that GOV.UK will give the British government agility, flexibility and freedom to change as they go, which are, as he noted, not characteristics aligned with the usual technology build in the UK -- or elsewhere, for that matter.

Given the British government's ambitious plans for open data, the GOV.UK platform also will need to act as, well, a platform. On that count, they're still planning, not implementing.

"With regard to API's, our long term plan is to 'go wholesale,' by which we mean expose data and services via API's," said Bracken. "We are at the early stages of mapping out key attributes, particularly around identity services, so to be fair it's early days yet. The inherent flexibility does allow for us to accommodate future changes, but it would be premature to make substantial claims to back up API delivery at this point."

The GOV.UK platform will be adaptable for the purposes of city government as well, over time. "We aim to migrate key department sites onto it in the first period of migration, and then look at government agencies," said Bracken. "The migration, with over 400 domains to review, will take more than a year. We aim to offer various platform services which meet the needs of all Government service providers."

Making GOV.UK citizen-centric

The GOV.UK platform was also designed to be citizen-centric, keeping the tasks that people come to a government site to accomplish in mind. Its designers, apparently amply supplied with classic British humor, dubbed the engine that tracks them the "Needotron."

"We didn't just identify top needs," said Loosemore, via email. "We built a machine to manage them for us now and in the future. Currently there are 667!" Loosemore said that they've open sourced the Needotron code, for those interested in tracking needs of their own.

"There are some of the Top needs we've not got to properly yet," said Loosemore. "For example, job search is still sub-optimal, as is the stuff to do with losing your passport."

According to Loosemore, some of the top needs that citizens have when they come to a site in the UK are determining the minimum wage, learning when the public and bank holidays are or when the clocks change for British Summer Time. They also come to central government to pay their council tax, which is actually a local function, but GOV.UK is designed to route those users to the correct site using geolocation.

This beta will have the top 1000 things you would need to do with government, said Maude, speaking at the Sunlight Foundation this week. (If that's so, there are over 300 more yet to go.)

"There's massive change needed in our approach to how to digitize what we do," he said. "Instead of locking in with a massive supplier, we need to be thinking of it the other way around. What do people need from government? Work from the outside in and redesign processes."

In his comments, Maude emphasized the importance of citizen-centricity, with respect to interfaces. We don't need to educate people on how to use a service, he said. We need to educate government on how to serve the citizen.

"Like U.S., the U.K. has a huge budget deficit," he said. "The public expects to be able to transact with government in a cheap, easy way. This enables them to do it in a cheaper, easier way, with choices. It's not about cutting 10 or 20% from the cost but how to do it for 10 or 20% of the total cost."

The tech behind Gov.UK

James Stewart, who was the tech lead on the beta of GOV.UK, recently blogged about and browser support. He emailed me the following breakdown of the rest of the technology behind GOV.UK.

Hosting and Infrastructure:

  • DNS hosted by Dyn.com
  • Servers are Amazon EC2 instances running Ubuntu 10.04LTS
  • Email (internal alerts) sending via Amazon SES and Gmail
  • Miscellaneous file storage on Amazon S3
  • Jetty application server
  • Nginx, Apache and mod_passenger
  • Jenkins continuous integration server
  • Caching by Varnish
  • Configuration management using Puppet

Front end

  • Javascript uses jQuery, jQuery UI, Chosen, and a variety of other plugins
  • Gill Sans, provided by fonts.com
  • Google web font loader

Languages, Frameworks and Plugins

"Most of the application code is written in Ruby, running on a mixture of Rails and Sinatra," said Stewart. "Rails and Sinatra gave us the right balance of productivity and clean code, and were well known to the team we've assembled. We've used a range of gems along with these, full details of which can be found in the Gemfiles at Github.com/alphagov."

The router for GOV.UK is written in Scala and uses Scalatra for its internal API, said Stewart. "The router distributes requests to the appropriate backend apps, allowing us to keep individual apps very focused on a particular problem without exposing that to visitors," said Stewart. "We did a bake-off between a ruby implementation and a Scala implementation and were convinced that the Scala version was better able to handle the high level of concurrency this app will require."
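The routing idea Stewart describes, one public entry point handing each request to a focused backend app, can be illustrated with a longest-prefix match. The following is only a minimal sketch in Python, not the Scala router itself; the route table, the backend names and the `match_backend` function are hypothetical and chosen purely for illustration.

```python
from typing import Optional

# Hypothetical route table: URL path prefix -> backend app name.
# None of these names come from the GOV.UK codebase.
ROUTES = {
    "/": "frontend",
    "/search": "search",
    "/council-tax": "local-services",
}


def match_backend(path: str, routes: dict = ROUTES) -> Optional[str]:
    """Return the backend registered for the longest prefix that matches path."""
    best = None
    for prefix, backend in routes.items():
        # A prefix matches the exact path or any path below it, so
        # "/search" matches "/search/results" but not "/searching".
        is_match = (
            prefix == "/"
            or path == prefix
            or path.startswith(prefix.rstrip("/") + "/")
        )
        if is_match and (best is None or len(prefix) > len(best[0])):
            best = (prefix, backend)
    return best[1] if best else None
```

Visitors only ever see the public path; which app answers it is decided by the table, so a backend can be split or replaced without changing any public URL.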

Databases

  • MongoDB. "We started out building everything using MySQL but moved to MongoDB as we realised how much of our content fitted its document-centric approach," said Stewart. "Over time we've been more and more impressed with it and expect to increase our usage of it in the future."
  • MySQL, hosted using Amazon's RDS platform. "Some of the data we need to store is still essentially relational and we use MySQL to store that," said Stewart. "Amazon RDS takes away many of the scaling and resilience concerns we had with that, without requiring changes to our application code."
  • MaPit geocoding and information service from mySociety. "MaPit not only does conventional geocoding," said Stewart, in terms of determining what the longitude and latitude are for a given postcode, "but it also gives us details of all the local government areas a postcode is in, which lets us point visitors to relevant local services."
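The postcode-to-local-service step that Stewart describes can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions: it does not call MaPit, and the lookup tables, the postcode, the council name, the URL and both function names are invented placeholders standing in for whatever a real geocoding service would return.

```python
from typing import Optional

# Hypothetical data standing in for a geocoding service's response.
# The postcode, council and URL below are made up for illustration.
POSTCODE_AREAS = {
    "ZZ1 1ZZ": {"council": "Example Borough Council"},
}
COUNCIL_SITES = {
    "Example Borough Council": "https://council.example.org",
}


def normalise_postcode(raw: str) -> str:
    """Upper-case a postcode and put a single space before its last three characters."""
    compact = "".join(raw.split()).upper()
    return compact[:-3] + " " + compact[-3:]


def local_service_url(raw_postcode: str, service: str) -> Optional[str]:
    """Return the local council URL for a service, or None if the postcode is unknown."""
    area = POSTCODE_AREAS.get(normalise_postcode(raw_postcode))
    if area is None:
        return None
    base = COUNCIL_SITES.get(area["council"])
    if base is None:
        return None
    return f"{base}/{service}"
```

A visitor who arrives at the central site to pay council tax could then be pointed to `local_service_url(postcode, "council-tax")` rather than being left at a dead end.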

Collaboration tools

gov.uk screenshot

  • Campfire for team chat
  • Google Apps
  • MediaWiki
  • Pivotal Tracker
  • Many, many index cards.

January 24 2012

O'Reilly Radar 01/24/12: Info overload vs over-consumption

Below you'll find the script and associated links from the January 24, 2012 episode of O'Reilly Radar. An archive of past shows is available through O'Reilly Media's YouTube channel and you can subscribe to episodes of O'Reilly Radar via iTunes.


Do you suffer from information overload?

If so you may be surprised to learn that Clay Johnson, author of "The Information Diet," believes that consumption — not overload — is the source of our information problems. My interview with Johnson is coming up in just a moment.

Also in this episode of O'Reilly Radar:

We take a look at top stories recently published across O'Reilly's platforms.

And O'Reilly's Alex Howard sits down with San Francisco mayor Ed Lee to discuss open data, open government, and bridging the digital divide.


The Radar interview: Clay Johnson

Coming up next in the Radar interview, "Information Diet" author Clay Johnson explains the difference between information overload and information over-consumption.

Radar top stories

Up next we take a look at some of the top stories recently published across O'Reilly's platforms.

Alistair Croll says the information economy is giving way to something new: the feedback economy. Alistair notes that the efficiencies and optimizations that come from constant feedback will soon become the norm for businesses and governments. Read the post.

In his piece, "Epatients: The hackers of the healthcare world," Fred Trotter explains what an epatient is — the "e" stands for "empowered" — and he offers a collection of epatient resources and first steps. Read the post.

Finally, Strata chair Edd Dumbill looks at the five key themes that will define the data world in the months ahead. Edd expects to see developments in streaming data frameworks and data marketplaces, along with a maturation in the roles and processes of data science. Read the post.

Links to these stories and other resources mentioned during this episode are available at radar.oreilly.com/show.


Radar video spotlight

At his recent swearing-in ceremony, new San Francisco mayor Ed Lee noted:

"We in government should not be afraid of disruption. We should embrace it."

In the following interview, conducted at Web 2.0 Summit last fall, you'll learn how Lee and San Francisco are putting that disruption to use through open data and open government.

Closing

Just a reminder that you can always catch episodes of O'Reilly Radar at youtube.com/oreillymedia and subscribe to episodes through iTunes.

All of the links and resources mentioned during this episode are posted at radar.oreilly.com/show.

That's all we have for this episode. Thanks for joining us and we'll see you again soon.

January 23 2012

January 19 2012

Justin Reich, Berkman Center Fellow

Will Free Benefit the Rich? How Free and Open Education Might Widen Digital Divides (permalink - Berkman Center)

Tuesday, January 17, 2012

The explosion of open education content resources and freely available collaboration and media production platforms represents one of the most exciting emerging trends in education. These tools create unprecedented opportunities for teachers to design and personalize curriculum and to give students opportunities to collaborate, publish, and take responsibility for their own learning.  Many education technology and open education advocates hope that the widespread availability of free resources and platforms will disproportionately benefit disadvantaged students, by making technology resources broadly available that were once only available to affluent students. It is possible, however, that affluent schools and students have a greater capacity to take up new innovations, even free ones, and so new tools and resources that appear in the ecology of education will widen rather than ameliorate digital divides. In this presentation, we will examine evidence for both the "tech as equalizer" and "tech as accelerator of digital divides" hypotheses, and we will examine technology innovations and interventions that specifically target learners with the most needs. A lively discussion will follow to consider how educators, technologists, and policymakers can address issues of educational digital inequalities in their work. An introduction to these issues can be found in this video op-ed.

About Justin

I’m a doctoral student at the Harvard Graduate School of Education and a Fellow at the Berkman Center for the Internet and Society. I’m the project manager for the Distributed Collaborative Learning Community, a Hewlett Foundation funded initiative to study issues of excellence, equity and analytics in the use of social technologies in K-12 settings.

I’m also the co-director of EdTechTeacher, a social venture that provides professional learning services to schools and teachers. Our mission is to help educators leverage technology to create student-centered, inquiry-based learning environments. We also publish the Best of History Web Sites and Teaching History with Technology.

Fundamentally, I’m motivated by the belief that young people are tremendously capable, and we need to develop educational systems that tap their energy, creativity, drive and talent.

Personally, I’m a husband and father and an avid adventurer and traveler. I have a long association with Camp Chewonki.

December 30 2011

2011 Gov 2.0 year in review

By most accounts, the biggest stories of 2011 were the Arab Spring, the historic earthquake and tsunami in Japan, and the death of Osama Bin Laden. In each case, an increasingly networked world experienced those events together through the growing number of screens. At the beginning of the year, a Pew Internet survey emphasized the Internet's importance in civil society. By year's end, more people were connected than ever before.

Time magazine named 2011 the year of the protester, as apt a choice as "You" was in 2006. "No one could have known that when a Tunisian fruit vendor set himself on fire in a public square, it would incite protests that would topple dictators and start a global wave of dissent," noted Time. "In 2011, protesters didn't just voice their complaints; they changed the world."

The Arab Spring extended well through summer, fall and winter, fueled by decades of unemployment, repression, and autocratic rule in Tunisia, Egypt, Libya, Syria, Yemen and Bahrain. This year's timeline of protest, revolution and uprising was not created by connection technologies, but by year's end, it had been accelerated by millions of brave young people connected to one another and the rest of the world through cell phones, social networks and the Internet.  

"We use Facebook to schedule the protests, Twitter to coordinate, and YouTube to tell the world," said an unnamed activist in Cairo in January.

In the months that followed, the Occupy Wall Street movement used the same tools in the parks and streets of the United States to protest economic inequality and call for accountability in the financial industry, albeit without the same revolutionary results.

This was the year where unemployment remained stubbornly high in the United States and around the world, putting job creation and economic growth atop the nation's priority list.

The theme that defined governments in Europe, particularly England, was austerity, as a growing debt crisis and financial contagion spread and persisted throughout the year. In Washington, the theme might be gridlock, symbolized by a threatened government shutdown in April and then brinkmanship over the debt crisis during the summer. As the year came to a close, a dispute between the White House, Senate and House over the extension of payroll tax cuts rounded out a long year of divided government.

We also saw a growing conflict between closed and open. It was a year that included social media adoption by government and a year where governments took measures to censor and block it. It was a year when we learned to think different about hacking, even while the "hacktivism" embodied in groups like Anonymous worried officials and executives in boardrooms around the world.

The United States bid farewell to its first CIO, Vivek Kundra, and welcomed his replacement, Steven VanRoekel, who advanced a "future first" vision for government that focuses on cloud, open standards, modularity and shared services. VanRoekel brought a .com mentality to the FCC, including a perspective that "everything should be an API," which caught the attention of some tech observers. While Kundra may have left government, his legacy remains: cloud computing and open data aren't going away in federal government, according to his replacement and General Services Administration (GSA) officials.

This was the year where the death of Steve Jobs caused more than a few people to wonder what Jobs would do as president. His legacy will resonate for many years to come, including the App Store that informed the vision of government as a platform.

If you look back at a January interview with Clay Johnson on key trends for Gov 2.0 and open government in 2011, some of his predictions bore out. The House of Representatives did indeed compete with the White House on open government, though not in story lines that played out in the national media or Sunday morning talk shows. The House Oversight and Government Reform Committee took a tough look at the executive's progress in a hearing on open government. Other predictions? Not so much. Rural broadband stalled. Transparency as infrastructure is still in the future. We're still waiting on that to be automated, though when the collective intelligence of people in Washington looks at new versions of bills tied to the social web, there's at least a kludge.

Many of the issues and themes in 2011 were extensions of those in the 2010 Gov 2.0 Year in Review: the idea of government as a platform spread around the world; gated governments faced disruption; open government initiatives were stuck in beta; open data went global; and laws and regulations were chasing technology, online privacy, cloud computing, open source and citizen engagement.

"It's tough to choose which issue dominated the year in transparency, but I'd say that the Open Government Partnership, the E-government funding fight, and the Super Committee all loomed large for Sunlight," said John Wonderlich, policy director for the Sunlight Foundation. "On the state level, I'd include Utah's fight over FOI laws, Tennessee's Governor exempting himself from financial disclosure requirements, and the Wisconsin fight as very notable issues.  And the rise of Super PACs and undisclosed money in politics is probably an issue we're only just starting to see."

Three dominant tech policy issues

Privacy, identity and cybersecurity dominated tech policy headlines coming out of D.C. all year. By year's end, however, no major cybersecurity or consumer privacy bill had made it through the U.S. Congress to the president's desk. In the meantime, the Federal Trade Commission (FTC) made its own moves. As a result, Google, Facebook and Twitter are all now subject to "audits" by the FTC every two years.

On the third issue, cybersecurity, there was progress: The U.S. government's National Strategy for Trusted Identities in Cyberspace addressed key issues around creating an "identity ecosystem online." Implementation, however, will require continued effort and innovation from the private sector. By year's end, Verizon became the first identity provider to receive Level of Assurance 3 credentialing from the U.S. government. Look for more identity providers to follow in 2012, with citizens gaining increased access to government services online as a result.

A meme goes mainstream

This was the year when the story of local governments using technology with citizens earned more attention from mainstream media, including outlets like the Associated Press and National Public Radio.

In February, the AP published a story about how cities are using tech to cull ideas from citizens. In the private sector, leveraging collective intelligence is often called crowdsourcing. In open government, it's "citizensourcing." In cities around the country, the approach is gaining traction.

At Yahoo Canada, Carmi Levy wrote that the future of government is citizen focused. In his view, open government is about leveraging technology and citizens to do more with less. It's about doing more than leaving or speaking up: it's making government work better.

In November, NPR listeners learned more about the open government movement around the country when the Kojo Nnamdi Show hosted an hour-long discussion on local Gov 2.0 on WAMU in Washington, D.C. Around the same time, the Associated Press reported that a flood of government data is fueling the rise of city apps:

New York, San Francisco and other cities are now working together to develop data standards that will make it possible for apps to interact with data from any city. The idea, advocates of open data say, is to transform government from a centralized provider of services into a platform on which citizens can build their own tools to make government work better.

Gov 2.0 goes local

All around the country, pockets of innovation and creativity could be found, as "doing more with less" became a familiar mantra in many councils and state houses. New open data platforms or citizen-led initiatives sprouted everywhere.

Here's just a sample of what happened at the local level in 2011:

If you want the full fire hose, including setbacks to open government on the state level, read the archives of the Sunlight Foundation's blog, which aggregated news throughout the year.

Several cities in the United States hopped on the open government and open data bandwagon in 2011. Baltimore empowered its citizens to act as sensors with new mobile apps and Open311. New York City is opening government data and working to create new relationships with citizens and civic developers in the service of smart government. Further afield, Britain earned well-deserved attention for seeking alpha, with its web initiatives and an open architecture that could be relevant to local governments everywhere.

In 2011, a model open government initiative gained traction in Cook County. In 2012, we'll see if other municipalities follow. The good news is that the Pew Internet and Life Project found that open government is tied to higher levels of community satisfaction. That carrot for politicians comes up against the reality that in a time of decreased resources, being more open has to make economic sense and lead to better services or more efficiency, not just be "the right thing to do."

One of the best stories in open government came from Chicago, where sustainability and analytics are guiding Chicago's open data and app contest efforts. The city's approach offers important insights to governments at all levels. Can the Internet help disrupt the power of Chicago lobbyists through transparency? We'll learn more in 2012.

Rise of the civic startups

This year, early entrants like SeeClickFix and Citysourced became relatively old hat with the rise of a new class of civic startups that aspire to interface with the existing architectures of democracy. Some hope to augment what exists, others to replicate democratic institutions in digital form.  [Disclosure: O'Reilly AlphaTech Ventures is an investor in SeeClickFix.]

This year, new players like ElectNext, OpenGovernment.org, Civic Commons, Votizen and POPVOX entered the mix alongside many other examples of social media and government innovation. [Disclosure: Tim O'Reilly was an early angel investor in POPVOX.]

In Canada, BuzzData aspires to be the GitHub of datasets. Simpl launched as a platform to bridge the connection between social innovators and government. Nation Builder went live with its new online activism platform.

Existing civic startups made progress as well. BrightScope unlocked government data on financial advisers and made the information publicly available so it could be indexed by search engines. The Sunlight Foundation put open government programming on TV and a health app in your pocket. Code for America's 2011 annual report offered insight into the startup nonprofit's accomplishments.

Emerging civic media

The 2011 Knight News Challenge winners illustrated data's ascendance in media and government. It's clear that data journalism and data tools will play key roles in the future of media and open government.

It was in that context that the evolution of Safecast offered us a glimpse into the future of networked accountability, as citizen science and open data help to inform our understanding of the world. After a tsunami caused a nuclear disaster in Japan, a radiation detection network started aggregating and publishing data. Open sensor networks look like an important part of journalism's future.

Other parts of the future of news are more nebulous, though there was no shortage of discussion about it. The question of where citizens will get their local news wasn't answered in 2011. A Pew survey of local news sources revealed the influence of social and mobile trends, along with a generation gap. As newsprint fades, what will replace it for communities? We don't know yet.

Some working models are likely to be found in civic media, where new change agents aren't just talking about the future of news; they're building it. Whether it's mobile innovation or the "Freedom Box," there's change afoot.

This was also a deadly year for journalists. The annual report from the Committee to Protect Journalists found 44 journalists were killed in the line of duty, with the deaths of dozens more potentially associated with the process of gathering and sharing information. Only one in six people lives in a country with a free press, according to the 2011 report on world press freedom from Freedom House.

Open source in government

At the federal level, open source continued its quiet revolution in government IT. In April, the new version of FCC.gov incorporated the principles of Web 2.0 into the FCC's online operations. From open data to platform thinking, the reboot elevated FCC.gov from one of the worst federal websites to one of the best. In August, the Energy Department estimated that the new Energy.gov would save $10 million annually through a combination of open source technology and cloud computing.

The White House launched IT Dashboard and released parts of it as open source code. (It remains to be seen whether the code from those platforms is re-used in the market.)

NASA's commitment to open source and its game plan for open government were up for discussion at the recent NASA Open Source Summit. One of NASA's open source projects, Nebula, saw its technology used in an eponymous startup. Nebula, the company, combines open source software and hardware in an appliance. If Nebula succeeds, its "cloud controller" could enable every company to implement cloud computing.

In cities, the adoption of "Change By Us" in Philadelphia and OpenDataPhilly in Chattanooga showed the potential of reusable civic software.

At the end of 2011, Civic Commons opened up its marketplace. The Marketplace is designed to be a resource for open source government apps. As Nick Judd observed at techPresident, both Civic Commons and its Marketplace "propose to make fundamental changes to the way local governments procure IT goods and services."

Open government goes global

As White House tech talent came and went, open government continued to grow globally.

In September, a global Open Government Partnership (OGP) launched in New York City. Video of the launch, beginning with examples of open government innovation from around the world, is embedded below:

Making the Open Government Partnership work won't be easy, but it's an important initiative to watch in 2012. As The Economist's review of the Open Government Partnership highlights, one of the most important elements is the United States' commitment to join the Extractive Industries Transparency Initiative. If this initiative bears fruit, citizens will have a chance to see how much of the payments oil and gas companies send to governments actually end up in the public's coffers.

Even before the official launch of the OGP, there was reason to think that something important was afoot globally in the intersection of governments, technology and society. In Africa, the government of Kenya launched Open Kenya and looked to the country's dynamic development community to make useful applications for its citizens. In Canada, British Columbia joined the ranks of governments embracing open government platforms. Canadian citizens in the province of British Columbia now have three new websites that focus on open government data, making information related to accountability available and providing easier access to services and officials. In India, the seeds of Gov 2.0 started bearing fruit through a growing raft of civil society initiatives. In Russia, Rospil.info aimed to expose state corruption.

For open government advocates, the biggest advance of the year was "the recognition of the need for transparency of government information world wide as a means for holding government and its officials accountable," said Ellen Miller, executive director of the Sunlight Foundation, via email. "The transparency genie is out of the bottle — world wide — and it's not going back into the darkness of that lantern ever again.  Progress will be slow, but it will be progress."

Federal open government initiatives

"Cuts in e-gov funds, Data.gov evolution, Challenge.gov and the launch of many contests were the big stories of the year," commented Steve Ressler, the founder of Govloop. Ressler saw Gov 2.0 go from a shiny thing to people critically asking how it delivers results.

At the beginning of the year, OMB Watch released a report that found progress on open government but a long road ahead. At the end of 2011, the Sunlight Foundation assessed the Open Government Directive two years on and found "mixed results." John Wonderlich put it this way:

Openness without information is emptiness.  If some agencies won't even share the plans they've made for publishing new information, how far can their commitment to openness possibly go? The Open Government Directive has caused a lot of good.  And it has also often failed to live up to its promise, the administration's rhetoric, and agencies' own self-imposed compliance plans. We should remember that Presidential rhetoric and bureaucratic commitments are not the same thing as results, especially as even more administration work happens through broad, plan-making executive actions and plans.

In 2011, reports of the death of open government were greatly exaggerated. That doesn't mean its health in the United States federal government is robust. In popular culture, of course, its image is even worse. In April, Jon Stewart and the Daily Show mocked the Obama administration and the president for a perceived lack of transparency.

Stewart and many other commentators have understandably wondered why the president's meeting with open government advocates to receive a transparency award wasn't on the official schedule or covered by the media. A first-hand account of the meeting from open government advocate Danielle Brian offered a useful perspective on the issues that arose that go beyond a sound bite or one-liner.

Some projects are always going to be judged as more or less effective in delivering on the mission of government than others. An open government approach to creating a "Health Internet" may be the most disruptive of them. For those who expected to see rapid, dynamic changes in Washington fueled by technology, however, the bloom has long since come off of the proverbial rose. Open government is looking a lot more like an ultra-marathon than a 400-yard dash. As a conference at the National Archives reminded the open government community, media access to government information also has a long way to go.

Reports on citizen participation and rulemaking from America Speaks offered open government guidance beyond technology. Overall, the administration received mixed marks. While America Speaks found that government agencies "display an admirable willingness to experiment with new tools and techniques to involve citizens with their decision-making processes," it also found the "Open Government Initiative and most Federal Agency plans have failed to offer standards for what constitutes high-quality public participation."

On the one hand, agencies are increasing the number of people devoted to public engagement and using a range of online and offline forums. On the other, "deliberative processes, in which citizens learn, express points of view, and have a chance to find common ground, are rarely incorporated." Getting to a more social open government is going to take a lot more work.

There were other notable landmarks. After months of preparation, the local .gov startup went live. While ConsumerFinance.gov went online back in February, the Consumer Financial Protection Bureau (CFPB) officially launched on the anniversary of H.R.4173 (the Dodd-Frank Wall Street Reform and Consumer Protection Act), with Richard Cordray nominated to lead it. By year's end, however, he still had not been confirmed. Questions about the future of the agency remain, but to place credit where credit is due: the new consumer bureau has been open to ideas about how it can do its work better. This approach is what led New York Times personal finance columnist Ron Lieber to muse recently that "its openness thus far suggests the tantalizing possibility that it could be the nation's first open-source regulator."

When a regulator asks for help redesigning a mortgage disclosure form, something interesting is afoot.

It's extremely rare that an agency gets built from scratch, particularly in this economic and political context. It's notable, in that context, that the 21st century regulator embraced many of the principles of open government in leveraging technology to stand up the Consumer Financial Protection Bureau.

This fall, I talked with Danny Weitzner, White House deputy chief technology officer for Internet policy, about the administration's open government progress in 2011. Our interview is embedded below:

In our interview, we talked about what the Internet means to government and society, intellectual property, the risks of a balkanized Internet, digital privacy, the Direct Project, a "right to connect," ICE takedowns and open data initiatives. On the last issue, the Blue Button movement, which enables veterans to download a personal health record, now has a website: BlueButtonData.org. In September, Federal CTO Aneesh Chopra challenged the energy industry to collaborate in the design of a "green button" modeled after that Blue Button. All three of California's public utilities have agreed to standardize energy data for that idea.

Tim O'Reilly talked with Chopra and White House deputy CTO for public sector innovation Chris Vein about the White House's action plan for open government innovation at the Strata Summit in September. According to Chopra, the administration is expanding Data.gov communities to agencies, focusing on "smart disclosure" and building out "government as a platform," with an eye to embracing more open innovators.

As part of its commitments to the Open Government Partnership, the White House also launched an e-petitions platform this fall called "We The People."

The White House has now asked for feedback on the U.S. Open Government National Action Plan, focusing on best practices and metrics for public participation. Early responses include focusing on outcomes first and drawing attention to success, not compliance. If you're interested in giving your input, Chopra is asking the country questions on Quora.

Opening the People's House

Despite the abysmal public perception of Congress, genuine institutional changes in the House of Representatives, driven by the GOP embracing innovation and transparency, are incrementally happening. As Tim O'Reilly observed earlier in the year, the current leadership of the House is doing a better job on transparency than their predecessors.

In April, Speaker John Boehner and Majority Leader Eric Cantor sent a letter to the House Clerk about releasing legislative data. Then, in September, a live XML feed for the House floor went online. Yes, there's a long way to go on open legislative data quality in Congress — but at year's end,  following the first "Congressional hackathon," the House approved sweeping open data standards.

The House also made progress in opening up its recorded videos to the nation. In January, Carl Malamud helped make the hearings of the House Committee on Oversight and Government Reform available on the Internet in high-quality video at house.resource.org. Later in the year, HouseLive.gov brought live video to mobile devices.

Despite the adoption of Twitter and Facebook by the majority of senators and representatives, Congress as a whole still faces challenges in identifying constituents on social media.

It's also worth noting that, no matter what efforts have been made to open the People's House through technology, at year's end, this was the least popular Congress in history.

Open data

The open data movement received three significant endorsements on the world stage in 2011.

1. Open government data was featured in the launch of the Open Government Partnership.

That launch, however, offered an opportunity to reflect upon the fundamental conditions for open government to exist. Simply opening up data is not a replacement for a Constitution that enforces a rule of law, free and fair elections, an effective judiciary, decent schools, basic regulatory bodies or civil society, particularly if the data does not relate to meaningful aspects of society. That said, open data is a key pillar of how policy makers are now thinking about open government around the world.

2. The World Bank continued to expand what it calls "open development" with its own open data efforts

The World Bank is building upon the 2010 launch of data.worldbank.org. It's now helping countries prepare and launch open government data platforms, including support for Kenya. In December, the World Bank hosted a webinar about how countries can start and run open government data ecosystems, launched an online open data community, and published a series of research papers on the topic.

Realizing the Vision of Open Government Data (Long Version): Opportunities, Challenges and Pitfalls

3. The European Union's support for open data

The BBC reported that Europe's governments are "sitting on assets that could be worth 40bn euros ($52bn, £33.6bn) a year" in public sector data. In addition, the European Commission has launched an open data strategy for the EU. Here's Neelie Kroes, vice president of the European Commission, on public data for all:

Big data means big opportunities. These opportunities can flow from public and private data — or indeed from mixing the two. But a public sector lead can set an example, allowing the same taxpayers who have paid for the data to be gathered to benefit from its wider use. In my opinion, data should be open and available by default and exceptions should be justified — not the other way around, as is too often the case still today.

Access to public data also has an important and growing economic significance. Open data can be fuel for innovation, growth and job creation. The overall economic impact across the whole EU could be tens of billions of Euros per year. That's amazing, of course! But, big data is not just about big money. It promises a host of socially and environmentally beneficial uses too — for example, in healthcare or through the analysis of pollution patterns. It can help make citizens' lives easier, more informed, more connected.


As Glyn Moody wrote at Computerworld UK, Europe is starting to get it.

Open data is not a partisan issue, in the view of professor Nigel Shadbolt. In 2012, Shadbolt will lead an "Open Data Institute" in England with Tim Berners-Lee.

Shadbolt is not out on a limb on this issue. In Canada and Britain, conservative governments supported new open data initiatives. In 2011, open government data also gathered bipartisan support in Washington when Rep. Darrell Issa introduced the DATA Act to track government financial spending. We talked about that and other open government issues this fall during an interview at the Strata Conference:

There was no shortage of other open data milestones, from Google adding the Public Data Explorer to its suite of free data tools to an International Open Government Data Camp in Poland.

In New York City, social, mapping and mobile data told the story of Hurricane Irene. In the information ecosystem of 2011, media, government and citizens alike played a critical role in sharing information about what's happening in natural disasters, putting open data to work and providing help to one another.

Here at Radar, MySociety founder Tom Steinberg sounded a cautionary note about creating sustainable open data projects with purpose. The next wave of government app contests needs to incorporate sustainability, community, and civic value. Whether developers are asked to participate in app contests, federal challenges, or civic hackathons, in 2012, the architects behind these efforts need to focus on the needs of citizens and sustainability.

Open mapping

One of the biggest challenges government agencies and municipalities have is converting open data to information from which people easily can draw knowledge. One of the most powerful ways humanity has developed to communicate information over time is through maps. If you can take data in an open form and map it out, then you have an opportunity to tell stories in a way that's relevant to a region or personalized to an individual.

There were enough new mapping projects in 2011 that they deserved their own category. In general, the barrier to entry for mapping got lower thanks to new open source platforms like MapBox, which powered the Global Adaptation Index and a map of the humanitarian emergency in the Horn of Africa. And Data.nai.org.afs charted attacks on the media onto an interactive map of Afghanistan.

IssueMap.org, a new project launched by the FCC and FortiusOne, aimed to convert open data into knowledge and insight. The National Broadband Map, one of the largest implementations of open source and open data in government to date, displayed more than 25 million records and incorporated crowdsourced reporting. A new interactive feature posted at WhiteHouse.gov used open data to visualize excess federal property.

"Maps can be a very valuable part of transparency in government," wrote Jack Dangermond, founder of ESRI. "Maps give people a greater understanding of the world around them. They can help tell stories and, many times, be more valuable than the data itself. They provide a context for taxpayers to better understand how spending or decisions are being made in a circumstance of where they work and live. Maps help us describe conditions and situations, and help tell stories, often related to one's own understanding of content."

Social media use grows in government

When there's a holiday, disaster, sporting event, political debate or any other public happening, we now experience it collectively. In 2011, we were reminded that there were a lot of experiences that used to be exclusively private that are now public because of the impact of social media, from breakups to flirting to police brutality. From remembering MLK online to civil disobedience at the #Occupy protests, we now can share what we're seeing with an increasingly networked global citizenry.

Those same updates, however, can be used by autocratic regimes to track down protestors, dissidents and journalists. If the question is whether the Internet and social media are tools of freedom or tools of oppression, the answer may have to be "yes." If online influence is essential to 21st century governance, however, how should government leaders proceed?

Some answers could be found in the lessons learned by the Federal Emergency Management Agency (FEMA), the Red Cross and Crisis Commons that were entered into the Congressional Record when the U.S. Senate heard testimony on the role of social media in crisis response.

If you're a soldier, you should approach social media carefully. The U.S. Army issued a handy social media manual to help soldiers, and the Department of Veterans Affairs issued a progressive social media policy.

A forum on social media at the National Archives featured a preview of a "citizen archivist dashboard" and a lively discussion of the past, present and future of social media — a future which will certainly include the growth of networks in many countries. For instance, in 2011, Chinese social media found its legs.

For a comprehensive discussion of how governments dealt with social media in 2011, check out this piece I wrote for National Journal.

Intellectual property and Internet freedom

In 2011, the United Nations said that disconnecting Internet users is a breach of human rights. That didn't stop governments around the world from considering it under certain conditions. The UN report came at an important time. As Mathew Ingram wrote at GigaOm, reporting on a UNESCO report on freedom of expression online, governments are still trying to kill, replace or undo the Internet.

In 2011, Russia earned special notice when it blocked proposals for freedoms in cyberspace. The Russian blogosphere came under attack in April. This fall, DDoS attacks were used in Russia after the elections in an attempt to squelch free speech. As Russian activists get connected, they'll be risking much to express their discontent.

In May, the eG8 showed that online innovation and freedom of expression still need strong defenders. While the first eG8 Forum in Paris featured hundreds of business and digital luminaries, the policies discussed were of serious concern to entrepreneurs, activists, media and citizens around the world. If the Internet has become the public arena for our time, as the official G8 statement that followed the Forum emphasized, then defending the openness and freedoms that have supported its development is more important than ever.

That need became clearer at year's end when the United States Congress considered anti-piracy bills that could cripple Internet industries. In 2012, the Stop Online Piracy Act (SOPA) and PROTECT IP Act will be before Congress again. Many citizens are hoping that their representatives decide not to break the Internet.

After all, if an open Internet is the basis for democracy flourishing around the world, billions of people will be counting upon our leaders to keep it open and accessible.

What story defined the year for you?

On Govloop, the government social network, the community held its own debate on the issue of the year. There, the threat of a government shutdown led the list. A related issue — "austerity" — was the story that defined government in 2011 in Chris Dorobek's poll. I asked people on Govloop, Quora, Twitter, Facebook and Google+ what the most important Gov 2.0 or open government story of 2011 was and why. Their answers were all about what happened in the U.S., versus the globe, but here's what I heard:

1. The departure of Kundra and White House deputy CTO for open government Beth Noveck mattered

"The biggest story of the year was Vivek Kundra and Beth Noveck leaving the White House," commented Andy Krzmarzick, director of community engagement at Govloop. "Those personnel changes really stalled momentum, generally speaking, on the federal level. I respect their successors immensely, but I think they have an uphill climb as we head into an election year and resisters dig in their heels to wait it out and see if there is a change in administration before they spend a lot of time and energy at this stage of the game. Fortunately, the movement has enough of a ground swell that we'll carry the torch forward regardless of leadership ... but it sure helps to have strong champions."

Terell Jones, director of green IT solutions at EcomNets, agreed. "The departure of Vivek Kundra as CIO of the United States. Under his watch they developed the Cloud Computing Strategy, the 25 Point Plan, and the Federal Data Center Consolidation Initiative (FDCCI). He saved the federal government millions, but they cut his budget so he would be ineffective; so, he escaped to Harvard University," commented Jones. "He may have been frustrated with the speed at which government moves, but he made great strides in the right direction. I hope his replacement will stay the course."

2. Budget cuts to the Office of Management and Budget's E-Government Fund

"I think the biggest story is the Open Government budget cuts," commented Steve Radick, a lead associate with Booz Allen Hamilton, which consults with federal agencies. "After all, these seemed to be the writing on the wall for Vivek's departure, and forced everyone to re-think why open government was so important. It wasn't just for the sake of becoming a more open government — open government needed to be about more than that. It needed to show real mission impact. I think these budget cuts and the subsequent realization of the Gov 2.0 community that Gov 2.0 efforts needed to be deeper than just retweets, friends, and fans was the biggest story of 2011."

3. Insider trading in Congress

"I think the most important story of the year was the 60 Minutes expose on insider trading in Congress," commented Joe Flood, a D.C.-area writer and former web editor at DC.gov and NOAA. "It demonstrated the power of data to illuminate connections that were hidden, showing how members of Congress made stock trades based upon their inside information on pending legislation. It showed what could be done with open data as well as why government transparency is so vital."

4. Hackathons

"I feel like 2011 was kind of the year of the hackathon," commented Karen Suhaka, founder of Legination. "Might just be my perception, but the idea seems to be gaining significant steam."

5. iPads in government

"I think the winner should be iPads on the House Floor and in committee hearings," commented Josh Spayher, a Chicago attorney and creator of GovSM.com. "[It] totally transforms the way members of Congress can access information when they need it."

6. Social media in emergencies, National Archives and Records Administration (NARA), and open government in the European Union

"I think there was significant progress in the use of social media for emergency alerts/warnings and disaster response this year," commented Mollie Walker, editor of FierceGovernmentIT. "It also shows agencies are letting this evolve beyond a broadcast medium and seeing the value of a feedback loop for mission-critical action. Although it hasn't really come to fruition yet (it's technically in the 'operational' phase, though development and migration appear to still be in progress), I think the NARA's electronic record archive has some positive implications for open government going forward. It's something to watch for in 2012, but the fact that NARA tied up a lot of loose ends in 2011 was a big win. The open government efforts in the E.U. are also worth noting. While there have been isolated initiatives in the U.S. and U.K., seeing a governing body such as the E.U. set new standards for openness could have a broader impact on how the rest of the world manages and shares public information."

If you think there's another story that deserves to be listed, please let us know in the comments.

The year ahead

What should we expect in the year ahead? Some predictions are easier than others. The Pew Internet and Life Project found that more than 50% of U.S. adults used the Internet for political purposes during the 2010 midterm elections. Pew's research also showed that a majority of U.S. citizens now turn to the web for news and information about politics. Expect that to grow in 2012.

This year, there was evidence of the maker movement's potential for education, jobs and innovation. That same DIY spirit will matter even more in the year ahead. We also saw the impact of apps that matter, like a mobile geolocation app that connected first responders to heart attack victims. If developers want to make an impact, we need more applications that help us help one another.

In 2011, there were more ways for citizens to provide feedback to their governments than perhaps ever before. In 2012, the open question will be whether "We the People" will use these new participatory platforms to help government work better.

The evolution of these kinds of platforms is neither U.S.-centric nor limited to tech-savvy college students. Citizen engagement matters more now in every sense: crowdfunding, crowdsourcing, crowdmapping, collective intelligence, group translation, and human sensor networks. There's a growth in "do it ourselves (DIO) government," or as the folks at techPresident like to say, "We government." As institutions shift from eGov to WeGov, leaders will be looking more to all of us to help them in the transition.

December 26 2011

The year in big data and data science

Big data and data science have both been with us for a while. According to McKinsey & Company's May 2011 report on big data, back in 2009 "nearly all sectors in the U.S. economy had at least an average of 200 terabytes of stored data ... per company with more than 1,000 employees." And on the data-science front, Amazon's John Rauser used his presentation at Strata New York (below) to trace the profession of data scientist all the way back to 18th-century German astronomer Tobias Mayer.

Of course, novelty and growth are separate things, and in 2011, there were a number of new technologies and companies developed to address big data's issues of storage, transfer, and analysis. Important questions were also raised about how the growing ranks of data scientists should be trained and how data science teams should be constructed.

With that as a backdrop, below I take a look at three evolving data trends that played an important role over the last year.

The ubiquity of Hadoop

It was a big year for investment in Apache Hadoop-based companies. Hortonworks, which was spun out of Yahoo this summer, raised $20 million upon its launch. And when Cloudera announced it had raised $40 million this fall, GigaOm's Derrick Harris calculated that, all told, Hadoop-based startups had raised $104.5 million between May and November of 2011. (Other startups raising investment for their Hadoop software included Platfora, Hadapt and MapR.)

But it wasn't just startups that got in on the Hadoop action this year: IBM announced this fall that it would offer Hadoop in the cloud; Oracle unveiled its own Hadoop distribution running on its new Big Data appliance; EMC signed a licensing agreement with MapR; and Microsoft opted to put its own big data processing system, Dryad, on hold, signing a deal instead with Hortonworks to handle Hadoop on Azure.

The growing number of Hadoop providers and adopters has spurred more solutions for managing and supporting Hadoop. This will become increasingly important in 2012 as Hadoop moves beyond the purview of data scientists to become a tool more businesses and analysts utilize.

More data, more privacy and security concerns

Despite all the promise that better tools for handling and analyzing data hold, there were numerous concerns this year about the privacy and security implications of big data, stemming in part from a series of high-profile data thefts and scandals.

In April, a security breach at Sony led to the theft of the personal data of 77 million users. The intrusion into the PlayStation Network prompted Sony to pull it offline, but Sony failed to notify its users about the issue for a full week (later admitting that it stored usernames and passwords unencrypted). Estimates of the cost of the security breach to Sony: between $170 million and $24 billion.

That's a wide range of estimates for the damage done to the company, but the point is clear nonetheless: not only do these sorts of data breaches cost companies millions, but the value of consumers' personal data is also increasing — for both legitimate and illegitimate purposes.

Sony was hardly the only company with security and privacy concerns on its hands. In April, Alasdair Allan and Pete Warden uncovered a file in Apple iOS software that noted users' latitude-longitude coordinates along with a timestamp. Apple responded, insisting that the company "is not tracking the location of your iPhone. Apple has never done so and has no plans to ever do so." Apple fixed what it said was a "bug."

Late this year, almost all handset makers and carriers were implicated by another mobile concern when Android developer Trevor Eckhart reported that the mobile intelligence company Carrier IQ's rootkit software could record all sorts of user data — texts, web browsing, keystrokes, and even phone calls.

That the data from mobile technology was at the heart of these two controversies reflects in some ways our changing data usage patterns. But whether it's mobile or not, as we do more online — shop, browse, chat, check in, "like" — it's clear that we're leaving behind an immense trail of data about ourselves. This year saw the arrival of several open-source efforts, such as the Locker Project and ThinkUp, that strive to give users better control over their personal social data.

And while better control and safeguards can offer some level of protection, it's clear that technology can always be cracked and the goals of data aggregators can shift. So, if digital data is and always will be a moving target, how does that shape our expectations for privacy? In Privacy and Big Data, published this year, co-authors Terence Craig and Mary Ludloff argued that we might be paying too much attention to concerns about "intrusions of privacy" and that instead we need to be thinking about better transparency with how governments and companies are using our data.

Open data's inflection point

Screenshot from the Open Knowledge Foundation's Open Government Data Map.

When it comes to better transparency, 2011 has been a good year for open data, with strong growth in the number of open data efforts. Canada, the U.K., France, the U.S., and Kenya were a few of the countries unveiling open data initiatives.

There were still plenty of open data challenges: budget cuts, for example, threatened the U.S. Data.gov initiative. And in his "state of open data 2011" talk, open data activist David Eaves pointed to the challenges of having different schemas and few standards, making it difficult for some datasets to be used across systems and jurisdictions.

Even with a number of open data "wins" at the government level, a recent survey of the data science community by EMC named the lack of open data as one of the obstacles that data scientists and business intelligence analysts said they faced. Just 22% of the former and 12% of the latter said that they "strongly believed" that the employees at their companies have the access they need to run experiments on data. Arguably, more open data efforts have spawned more interest and better understanding of what this can mean.

The demand for more open data has also spawned a demand for more tools. Importantly, these tools are beginning to be open to more than just data scientists or programmers. They include things like visualization-creator Visual.ly, the scraping tool ScraperWiki, and data-sharing site BuzzData.

Strata 2012 — The 2012 Strata Conference, being held Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work.

Save 20% on registration with the code RADAR20

December 20 2011

There's a map for that

On November 6, 2012, millions of citizens in the United States will elect or re-elect representatives in Congress. Long before those citizens reach the polls, however, their elected representatives and their political allies in the state legislatures will have selected their voters.

Given powerful new data analysis tools, the practice of "gerrymandering," or creating partisan, incumbent-protected electoral districts through the manipulation of maps, has reached new heights in the 21st century. The drawing of these maps has been one of the least transparent processes in governance. Public participation has been limited or even blocked by the authorities in charge of redistricting.

While gerrymandering has been part of American civic life since the birth of the republic, one of the best policy innovations of 2011 may offer hope for improving the redistricting process. DistrictBuilder, an open-source tool created by the Public Mapping Project, allows anyone to easily create legal districts.

Michael P. McDonald, associate professor at George Mason University and director of the U.S. Elections Project, and Micah Altman, senior research scientist at Harvard University Institute for Quantitative Social Science, collaborated on the creation of DistrictBuilder with Azavea.

"During the last year, thousands of members of the public have participated in online redistricting and have created hundreds of valid public plans," said Altman, via an email. "In substantial part, this is due to the project's effort and software. This year represents a huge increase in participation compared to previous rounds of redistricting — for example, the number of plans produced and shared by members of the public this year is roughly 100 times the number of plans submitted by the public in the last round of redistricting 10 years ago. Furthermore, the extensive news coverage has helped make a whole new set of people aware of the issue and has reframed it as a problem that citizens can actively participate in to solve, rather than simply complain about."

For more on the potential and the challenges present here, watch the C-SPAN video of the Brookings Institution discussion on Congressional redistricting and gerrymandering, including what's happening in states such as California and Maryland. Participants include Norm Ornstein of the American Enterprise Institute and David Wasserman of the Cook Political Report. 

The technology of district building

DistrictBuilder lets users analyze whether a given map complies with federal and advocacy-oriented standards. That means maps created with DistrictBuilder are legal and may be submitted to a given state's authority. The software pulls data from several sources, including the 2010 U.S. Census (race, age, population and ethnicity); election data; and map data, including how the current districts are drawn. Districts can also be divided by county lines, overall competitiveness between parties, and voting age. Each district must have the same total population, though districts are not required to have the same number of eligible voters.
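The equal-population requirement described above comes down to simple arithmetic. Here is a minimal sketch in Python, assuming made-up district names and population figures and an illustrative tolerance; it is not DistrictBuilder's own code, and real plans are judged against legal standards that vary by jurisdiction.

```python
def population_deviation(district_populations):
    """Return each district's percent deviation from the ideal (mean) population."""
    total = sum(district_populations.values())
    ideal = total / len(district_populations)
    return {
        name: (pop - ideal) / ideal * 100
        for name, pop in district_populations.items()
    }


def is_balanced(district_populations, tolerance_pct=1.0):
    """True if every district is within tolerance_pct of the ideal population.

    The 1% default is an assumption for illustration, not a legal threshold.
    """
    deviations = population_deviation(district_populations)
    return all(abs(d) <= tolerance_pct for d in deviations.values())


# Hypothetical three-district plan (figures invented for the example).
plan = {"District 1": 710_000, "District 2": 705_000, "District 3": 715_000}

for name, deviation in population_deviation(plan).items():
    print(f"{name}: {deviation:+.2f}% from ideal")
print("Balanced:", is_balanced(plan))
```

A tighter tolerance makes the same plan fail, which is the kind of feedback a map-drawing tool can give a citizen before a plan is submitted.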

On the tech side, DistrictBuilder is a combination of Django, GeoServer, Celery, jQuery, PostgreSQL, and PostGIS. For more developer-related posts about DistrictBuilder, visit the Azavea website. A webinar that explains how to use DistrictBuilder is available here.

DistrictBuilder is not the first attempt to make software that lets citizens try their hands at redistricting. ESRI launched a web-based application for Los Angeles this year.

"The online app makes redistricting accessible to a wide audience, increasing the transparency of the process and encouraging citizen engagement," said Mark Greninger, geographic information officer for the County of Los Angeles, in a prepared statement. "Citizens feel more confident because they are able to build their own plans online from wherever they are most comfortable. The tool is flexible enough to accommodate a lot of information and does not require specialized technical capabilities."

DistrictBuilder does, however, look like an upgrade to existing options available online. "There are a handful of tools" that enable citizens to participate, said Justin Massa in an email. Massa was the director of project and grant development at the Metro Chicago Information Center (MCIC) and is currently the founder and CEO of Food Genius. "An ESRI plugin and Borderline jump to mind although I know there are more, but all of them are proprietary and quite expensive. There's a few web-based versions, but none of them were usable in my testing."

Redistricting competitions

DistrictBuilder is being used in several state competitions to stimulate more public participation in the redistricting process and improve the maps themselves. "While gerrymandering is unlikely to be the driving force in the trend toward polarization in U.S. politics, it would result in a significant number of seats changing hands, and this could have a substantial effect on what laws get passed," said Altman. "We don't necessarily expect that software alone will change this, or that the legislatures will adopt public plans (even where they are clearly better) but making software and data available, holding competitions, and hosting sites where the public can easily evaluate and create plans that pass legal muster, has increased participation and awareness dramatically."

The New York Redistricting Project (NYRP) is hosting an open competition to redistrict New York congressional and state legislative districts. NYRP is collaborating with the Center for Electoral Politics and Democracy at Fordham University in an effort to see if college students can outclass Albany. The deadline for entering the New York student competition is Jan. 5, and the contest is open to all NY students.

In Philadelphia, FixPhillyDistricts.com included cash prizes when it kicked off in August of this year. By the end of September, citizensourced redistricting efforts reached the finish line, though it's unclear how much impact they had. In Virginia, a similar competition is taking aim at the "rigged redistricting process."

"This [DistrictBuilder] redistricting software is available not only to students, but to the public at large," said Costas Panagopoulos in a phone interview. At Fordham University, Panagopoulos is an assistant professor of political science, the director of the Center for Electoral Politics and Democracy, and the director of the graduate program in Elections and Campaign Management. "It's open source, user friendly and has no costs associated with it. It's a great opportunity for people to get involved and have the tools they need to design maps as alternatives for legislatures to consider."

Panagopoulos says maps created in DistrictBuilder can matter when redistricting disputes end up in the courts. "We have seen evidence from other states where competitions have been held," he said. "Official government entities have looked to maps that have been drawn by students for guidance. In Virginia, students submitted maps that enhanced minority representation. There are elements in the plan that will be officially adopted."

While it might seem unlikely that a map created by a team of students will be adopted, elements created by students in New York could make their way into discussions in Albany, posited Panagopoulos. "Our sense is that the criteria students will use to design maps will be somewhat different than what lawmakers will choose to pursue," he said. "Lawmakers may take concerns about protecting incumbents or partisan interests more to heart than citizens will. At the end of the day, if lawmakers think that a plan is ultimately worse off for both parties, they may adopt something that's more benign. That's what happened in the last round of redistricting. Legislators pushed through a different map rather than the one imposed by a judge."

For a concrete example of how the politics play out in one state, look at Texas. Ross Ramsey, the executive editor of The Texas Tribune, wrote about redistricting in the Texas legislature and courts:

The 2010 elections put overwhelming Republican majorities in both houses of the Legislature just as the time came to draw new political maps for state legislators, the Congressional delegation and members of the State Board of Education. Those Republicans drew maps to give each district an even number of people and to maximize the number of Republican districts that could be created, they thought, under the Voting Rights Act and the federal and state constitutions.

Or look at Illinois, where a Democratic redistricting plan would maximize the number of Democratic districts in that state. Or Pennsylvania, where a new map is drawing condemnation for being "rife with gerrymandering," according to Eric Boehm of the PA Independent.

While redistricting has historically not been the most accessible governance issue to the voting public, historic levels of dissatisfaction with the United States Congress could be channeled into more civic engagement. "The bottom line is that the public never had an opportunity to be as involved in redistricting as they are now," said Panagopoulos. "It's important that the public get involved."

Better redistricting software requires better data

Redistricting is "an emerging open-government issue that, for whatever reason, hasn't gotten a ton of attention yet from our part of the world," wrote Massa. "This scene is filled with proprietary datasets, intentionally confusing legislative proposals, antiquated laws that don't compel the publication of shape files, and election results data that is unbelievably messy."

As is the case with other open-government platforms, DistrictBuilder will only work with the right base data. "About a year ago, MCIC worked on a voting data project just for seven counties around Chicago," said Massa. "We found that none of the data we obtained from county election boards matched what the Census published as part of the '08 boundary files." In other words, a hoary software adage applies: "garbage in, garbage out."

That's where MCIC has played a role. "MCIC has been working with the Midwest Democracy Network to implement DistrictBuilder for six states in the Midwest," wrote Massa. According to Massa, Illinois, Indiana, Wisconsin, Michigan, and Ohio didn't have anything available at a state level; of the six states, only Minnesota publishes clean data. Earlier this year, MCIC launched DistrictBuilder for Minnesota.

"The unfortunate part is that the data to power a truly democratic process exists," said Massa. "We all know that no one is hand-drawing maps and then typing out the lengthy legislative proposals that describe, in text, the boundaries of a district. The fact that the political parties use tech and data to craft their proposals and then, in most cases, refuse to publish the data they used to make their decisions, or electronic versions of the proposals themselves, is particularly infuriating. This is a prime example of data 'empowering the empowered'."

Image Credit: Elkanah Tisdale's illustration of gerrymandering, via Wikipedia.

December 12 2011

Can the People's House become a social platform for the people?

Congressional hackathon
InSourceCode developers work on "Madison" with volunteers.

There wasn't a great deal of hacking, at least in the traditional sense, at the "first congressional hackathon." Given the general shiver that the word still evokes in many a Washingtonian in 2011, that might be for the best. The attendees gathered together in the halls of the United States House of Representatives didn't create a more interactive visualization of how laws are made or a mobile health app. As open government advocate Carl Malamud observed, the "hack" felt like something even rarer in the "Age of the App for That:"

In a time when partisanship and legislative gridlock have defined Congress for many citizens, seeing the leadership of the United States House of Representatives agree on the importance of using the power of data and social networking to open government was an early Christmas present.

"Increased access, increased connection with our constituents, transparency, openness is not a partisan issue," said House Majority Leader Eric Cantor.

"The Republican leader and I may debate vigorously on many issues, but one area where we strongly agree is on making Congress more transparent and accessible," said House Democratic Whip Steny Hoyer in his remarks. "First, Congress took steps to open up the Capitol building so citizens can meet with their representatives and see the home of their legislature. In the same way, Congress is now taking steps to update how it connects with the American people online."

An open House

While the event was branded as a "Congressional Facebook Developer Hackathon," what emerged more closely resembled a loosely organized conference or camp.

Facebook executives and developers shared the stage with members of Congress to give keynotes to the 200 or so attendees before everyone broke into discussion groups to talk about constituent communications, press relations and legislative data. The event might be more aptly described as a "wonk-a-thon," as Sunlight Foundation's Daniel Schuman put it last week.

This "hackathon" was organized to have some of the feel of an unconference, in the view of Matt Lira, digital director for the House Majority Leader. Lira sat down for a follow-up interview last Thursday.

"There's a real model to CityCamp," he said. "We had 'curators' for the breakout. Next time, depending on how we structure it, we might break out events that are designed specifically for programming, with others clustered around topics. We want to keep it experimental."

Why? "When Aneesh Chopra and I did that session at SXSW, that personally for me was what tripped my thinking here," said Lira. "We came down from the stage and formed a circle. I was thinking the whole time that it would have been a waste of intellectual talent to have Tim O'Reilly and Clay Shirky in the audience instead of engaging in the conversation. I was thinking I never want to do a panel again. I want it to be like this."

Part of the challenge, so to speak, of Congress hosting a hackathon in the traditional sense, with judging and prizes, lies in procurement rules, said Lira. "There are legal issues around challenges or prizes for Congress," he explained. "They're allowed in the executive branch, under DARPA, and now every agency under the COMPETES Act. We can't choose winners or losers, or give out prizes under procurement rules."

Whatever you call it, at the end of the event, discussion leaders from the groups came back and presented on the ideas and concepts that had been hashed out. EngageDC produced a short video of the event for the House Majority Leader's office.

What came out of this unprecedented event, in other words, won't necessarily be measured in lines of code. It's that Congress got geekier. It's that the House is opening its doors to transparency through technology.

Given the focus on Facebook, it's not surprising that social media took center stage in many of the discussions. The idea for it came from a trip to Silicon Valley, where Representative Cantor said he met with Facebook founder Mark Zuckerberg and COO Sheryl Sandberg, and discussed how to make the House more social. After that conversation, Lira and Steve Dwyer, director of online communications and technology for the House Democratic Whip, organized the event.

For a sense of the ideas shared by the working groups, read the story of the first congressional "hackathon" on Storify.

"For government, I don't think we could have done anything more purposeful than this as a first meeting," said Lira in our interview. "Next, we'll focus on building this group of people, strengthening the trust, which will prove instrumental when we get into the pure coding space. I have 100% confidence that we could do a programming-only event now and would have attendance."

A Likeocracy in alpha

As the Sunlight Foundation's John Wonderlich observed earlier this year, access to legislative data brings citizens closer to their representatives.

"When developers and programmers have better access to the data of Congress, they can better build the databases and tools that let the rest of us connect with the legislature," he wrote.

If more open legislative data goes online, when we talk about what's trending in Congress, those conversations will be based upon insight into how the nation is reacting to them on social networks, including Facebook, Twitter, and Google+.

Facebook developers Roddy Lindsay, Tyler Brock, Eric Chaves, Porter Bayne, and Blaise DiPersia coded up a simple proof of concept of what making legislative data social might look like. "LikeOcracy" pulls legislation from a House XML feed and makes it more social. The first version added Facebook's ubiquitous "Like" buttons to bill elements. A second version of the app adds more opportunities for reaction by integrating ReadrBoard, which enables users to rate sections or individual lines as "Unnecessary, Problematic, Great Idea or Confusing." You can try it out on three sample bills, including the Stop Online Piracy Act.
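To give a rough sense of what pulling bill elements out of an XML feed involves, here is a minimal Python sketch. The feed layout, element names, and sample bill below are all invented for illustration; they are not the actual House XML schema, and this is not the LikeOcracy code.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: the real House XML uses its own schema.
SAMPLE_FEED = """<bills>
  <bill id="hr-0001">
    <title>Example Act</title>
    <section num="1">Short title.</section>
    <section num="2">Definitions.</section>
  </bill>
</bills>"""


def bill_sections(feed_xml):
    """Flatten a feed into one record per bill section.

    Each record is a unit that an app could attach a reaction
    (a "Like" or a rating) to.
    """
    root = ET.fromstring(feed_xml)
    records = []
    for bill in root.findall("bill"):
        for section in bill.findall("section"):
            records.append({
                "bill_id": bill.get("id"),
                "title": bill.findtext("title"),
                "section": section.get("num"),
                "text": (section.text or "").strip(),
            })
    return records


for record in bill_sections(SAMPLE_FEED):
    print(record["bill_id"], "sec.", record["section"], "-", record["text"])
```

Once legislation is broken into addressable pieces like these, attaching a button or a comment thread to each piece is mostly a front-end problem.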

Would "social legislation" in a Facebook app catch on? The growth of civic startups like PopVox, OpenCongress and Votizen suggests that the idea has legs. [Disclosure: Tim O'Reilly was an early angel investor in PopVox.]

LikeOcracy doesn't tap into Facebook's Open Graph, but it does hint at what integration might look like in the future. Justin Osofsky, Facebook's director of platform partnerships, described how the interests of constituents could be integrated with congressional data under Facebook's new Timeline. Citizens might potentially be able to simply "subscribe" to a bill, much like they can now for any web page, if Facebook's "Subscribe" plug-in were applied to the legislative process.

Opening bill markup online

The other app presented at the hackathon came not from the attendees but from the efforts of InSourceCode, a software development firm that's also coded for Congressman Mike Pence and the Republican National Committee.

Rep. Darrell Issa, chairman of the House Committee on Oversight and Government Reform, introduced the beta version of MADISON on Wednesday, a new online tool to crowdsource legislative markup. The vision is that MADISON will work as a real-time markup engine to let the public comment on bills as they move through the legislative process. "The assumption is that legislation should be open in Congress," said Issa. "It should be posted, interoperable and commented upon."

As Nick Judd reported at techPresident, the first use of MADISON is to host Issa and Sen. Ron Wyden's "OPEN bill," which debuted on the app. Last week, the congressmen released the Online Protection and Enforcement of Digital Trade Act (OPEN) at Keepthewebopen.com. The OPEN legislation removes one of the most controversial aspects of SOPA, using the domain name system for enforcement, and instead places authority with the International Trade Commission to address enforcement of IP rights on websites that are primarily infringing upon copyright.

Issa said that his team had looked at the use of wikis by Rep. John Culberson, who put the healthcare reform bill online in a wiki. "There are some problems with editors who are not transparent to all of us," said Issa. "That's one of the challenges. We want to make sure that if you're an editor, you're a known editor."

MADISON includes two levels of authentication: email for simple commenting and a more thorough vetting process for organizations or advocacy groups that wish to comment. "Like most things that are a 1.0 or beta, our assumption is that we'll learn from this," said Issa. "Some members may choose to have an active dialog. Others may choose to have it be part of pre-markup record."

Issa fielded a number of questions on Wednesday, including one from web developer Brett Stubbs: "Will there be open access or an API? What we really want is just data." Issa indicated that future versions might include that.

Jayson Manship, the "chief nerd" at InSourceCode, said that MADISON was built in four days. According to Manship, the idea came from conversations with Issa and Seamus Kraft, director of digital strategy for the House Committee on Oversight and Government Reform. MADISON is built with PHP and MySQL, and hosted in RackSpace's cloud so it can scale with demand, said Manship.

"It's important to be entrepreneurial," said Lira in our interview. "There are partners throughout institutions that would be willing to do projects of different sizes and scopes. MADISON is something that Issa and Seamus wanted to do. They took it upon themselves to get the ball rolling. That's the attitude we need."

"We're working to hold the executive accountable to taxpayers," said Kraft last week. "Opening up what we do here in these two halls of Congress is equally important. MADISON is our first shot at it. We're going to need a lot of help to make it better."

Kraft invited the remaining developers present to come to the Rayburn Office Building, where Manship and his team had brought in half a dozen machines, to help get MADISON ready for launch. While I was there, there were conversations about decisions, plug-ins and ideas about improving the interface or functionality, representing a bona fide collaboration to make the app better.

There's a larger philosophical issue relating to open government that Nick Judd touched upon over at techPresident in a follow-up post on MADISON:

The terms for the site warn the user that anything they write on it will become public domain — but the code itself is proprietary. Meanwhile, OpenCongress' David Moore points out that the code that powers his organization's website, which also allows users to comment on individual provisions of bill text, is open source and has been available for some time. In theory, this means the Oversight staff could have started from that code and built on it instead of beginning from scratch. The code being proprietary means that while people like Moore might be able to make suggestions, they can't just download it, make their own changes and submit them for community review — which they'd happily do at little or no cost for a project released under an open-source license.

As Moore put it, "Get that code on GitHub, we'll do OpenID, fix the design."

When asked whether the team had considered making the MADISON code open source, Manship said that he didn't know, although they weren't opposed to it.

While Moore welcomed MADISON, he also observed that OpenCongress has had open-source code for bill text commenting for years.

The decision by Issa's office to fund the creation of an app that was already available as open-source software is one that's worth noting, so I asked Kraft why they didn't fork OpenCongress' code, as Judd suggests. "While there was no specific budget expense for MADISON, it was developed by the Oversight Committee," said Kraft.

"While we like and support OpenCongress' code, it didn't fit the needs for MADISON," Kraft wrote in an emailed statement.

What's next is, so to speak, an "OPEN" question, both in terms of the proposed SOPA alternative and the planned markup of SOPA itself on December 15. The designers of OPEN are actively looking for feedback from the civic software development community, both in terms of what functionality exists now and what could be built in future iterations.

THOMAS.gov as a platform

What Moore and long-time open-government advocates like Carl Malamud want to see from Congress is more structural change:

They're not alone. Dan Schuman listed many other ways the House has yet to catch up with 21st century technology:

We have yet to see bulk access to THOMAS or public access to CRS reports, important legislative and ethics documents are still unavailable in digital format, many committee hearings still are not online, and so on.

As Schuman highlighted, the Sunlight Foundation has been focused on opening up Congress through technology since the organization was founded. To wit: "There have been several previous collaborative efforts by members of the transparency community to outline how the House of Representatives can be more open and accountable, of which an enduring touchstone is the Open House Project Report, issued in May 2007," wrote Schuman.

The notion of making THOMAS.gov into a platform received high-level endorsement from a congressional leader when House Minority Whip Steny Hoyer remarked on how technology is affecting Congress, his caucus and open government in the executive branch:

For Congress, there is still a lot of work to be done, and we have a duty to make the legislative process as open and accessible as possible. One thing we could do is make THOMAS.gov — where people go to research legislation from current and previous Congresses — easier to use, and accessible by social media. Imagine if a bill in Congress could tweet its own status.

The data available on THOMAS.gov should be expanded and made easily accessible by third-party systems. Once this happens, developers, like many of you here today, could use legislative data in innovative ways. This will usher in new public-private partnerships that will empower new entrepreneurs who will, in turn, yield benefits to the public sector.

One successful example is how cities have made public transit data accessible so developers can use it in apps and websites. The end result has been commuters saving time every day and seeing more punctual trains and buses as a result of the transparency. Legislative data is far more complex, but the same principles apply. If we make the information available, I am confident that smart people like you will use it in inventive ways.

Hoyer's specific citation of the growth of open data in cities and an ecosystem of civic applications based upon it is further confirmation that the Gov 2.0 meme is moving into the mainstream.

Making THOMAS.gov into a platform for bulk data would change what's possible for all civic developers. "What I really want is data on everything," Stubbs told me last week. "THOMAS is just a visual viewer of the internal stuff. If we could have all of this, we could do something with it. What I would like is a data broker. I'd like a RESTful API with all of the data that I could just query. That's what the government could learn from Facebook. From my point of view, I just want to pull information and compile it."
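To make Stubbs's request concrete, here is a minimal sketch of what "just querying" a RESTful legislative data API might look like from a developer's side. Everything specific in it is an assumption for illustration: the base URL, the parameter names and the shape of the JSON response are hypothetical, and no such THOMAS.gov API is described in this post. The sketch makes no network call; it only builds a query URL and reads a canned response.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only. It is not a real service.
BASE_URL = "https://api.example.gov/v1/bills"


def build_query(congress, status=None, page=1):
    """Build a query URL for the hypothetical bills endpoint."""
    params = {"congress": congress, "page": page}
    if status is not None:
        params["status"] = status
    return BASE_URL + "?" + urlencode(params)


# Made-up response body in an assumed JSON shape.
SAMPLE_RESPONSE = (
    '{"results": [{"bill_id": "hr-0000", "title": "Example Act", '
    '"status": "introduced"}]}'
)


def titles(response_text):
    """Pull the bill titles out of a response in the assumed shape."""
    return [bill["title"] for bill in json.loads(response_text)["results"]]


if __name__ == "__main__":
    print(build_query(112, status="introduced"))
    print(titles(SAMPLE_RESPONSE))
```

The point of the sketch is the division of labor Stubbs describes: the government publishes structured data behind a stable address, and third parties pull and compile it however they like.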

If Hoyer and the House leadership would like to see THOMAS.gov act as a platform, several attendees at the hackathon suggested to me that Congress could take a specific action: collaborate with the Senate and send the Library of Congress a letter instructing it to provide bulk legislative data access to THOMAS.gov in structured formats so that developers, designers and citizens around the nation can co-create a better civic experience for everyone.

"The House administration is working on standards called for by the rule and the letter sent earlier this year," said Lira. "We think they will be satisfactory to people. The institutions of the House have been following through since the day they were issued. The first step was issuing an XML feed daily. Next year, there will be a steady series of incremental process improvements. When the House Administrative Committee issues standards, the House Clerk will work on them."

Despite the abysmal public perception of Congress, genuine institutional changes in the House of Representatives, driven by the GOP embracing innovation and transparency, are happening incrementally. As Tim O'Reilly observed earlier this year, the current House leadership is doing a better job on transparency than its predecessors.

In April, Speaker Boehner and Majority Leader Cantor sent a letter to the House Clerk regarding legislative data release. Then, in September, a live XML feed for the House floor went online. Yes, there's a long way to go on open legislative data quality in Congress. That said, there's support for open-government data from both the White House and the House.

"My personal view is that what's important right now is that the House create the right precedents," said Lira. "If we create or adopt a data standard, it's important that it be the right standard."

Even if open government is in beta, there needs to be more tolerance for experiments and risks, said Lira. "I made a mistake in attacking We the People as insufficient. I still believe it is, but it's important to realize that the precedent is as important as the product in government. In technology in general, you'll never reach an end. We The People is a really good precedent, and I look forward to seeing what they do. They've shown a real commitment, and it's steadily improving."

A social Congress

While Sean Parker may predict that social media will determine the outcome of the 2012 election, governance is another story entirely. Meaningful use of social media by Congress remains challenged by a number of factors, not least an online identity ecosystem that has not provided Congress with ideal means to identify constituents online. The reality remains that when it comes to which channels influence Congress, in-person visits and individual emails or phone calls are far more influential with congressional staffers.

As with any set of tools, success shouldn't be measured solely by media reports or press releases but by the outcomes from their use. The hard work of bipartisan compromise between the White House and Congress, to the extent it occurs, might seem unlikely to be publicly visible in 140 characters or less.

"People think it's always an argument in Washington," said Lira in our interview. "Social media can change that. We're seeing a decentralization of audiences that is built around their interests rather than the interests of editors. Imagine when you start streaming every hearing and making information more digestible. All of a sudden, you get these niche audiences. They're not enough to sustain a network, but you'll get enough of an audience to sustain the topic. I believe we will have a more engaged citizenry as a result."

Lira is optimistic. "Technology enables our republic to function better. In ancient Greece, you could only sustain a democracy in the size of city. Transportation technology limited that scope. In the U.S., new technologies enabled global democracy. As we entered the age of mass communication, we lost mass participation. Now with the Internet, we can have people more engaged again."

There may be a 30-year cycle at play here. Lira suggested looking back to radio in the 1920s, television in the 1950s, and cable in the 1980s. "It hasn't changed much since; we're essentially using the same rulebook since the '80s. The changes made in those periods of modernization were unique."

Thirty years on from the introduction of cable news, will the Internet help reinvigorate the founders' vision of a nation of, by and with the people? "I do think that this is a transformational moment," said Lira. "It will be for the next couple of years. When you talk to people, both Republicans and Democrats, you sense we're on the cusp of some kind of change, where it's not just communicating about projects but making projects better. Hearings, legislative government and executive government will all be much more participatory a decade from now."

In that sweep of history, the "People's House" may prove to be a fulcrum of change. "If any place in government is going to do it, it's the House," said Lira. "It's our job to be close to the public in a way that no other part of government is. In the Federalist Papers, that's the role of the House. We have an obligation to lead the way in terms of incorporating technology into real processes. We're not replacing our system of representative government. We're augmenting it with what's now possible, like when the telegraph let people know what the votes were faster."

Strata 2012 — The 2012 Strata Conference, being held Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work.

Save 20% on registration with the code RADAR20

December 05 2011

White House to open source Data.gov as open government data platform

As 2011 comes to an end, there are 28 international open data platforms in the open government community. By the end of 2012, code from a new "Data.gov-in-a-box" may help many more countries to stand up their own platforms. A partnership between the United States and India on open government has borne fruit: progress on making the open data platform Data.gov open source.

In a post this morning at the WhiteHouse.gov blog, federal CIO Steven VanRoekel (@StevenVDC) and federal CTO Aneesh Chopra (@AneeshChopra) explained more about how Data.gov is going global:

As part of a joint effort by the United States and India to build an open government platform, the U.S. team has deposited open source code — an important benchmark in developing the Open Government Platform that will enable governments around the world to stand up their own open government data sites.

The development is evidence that the U.S. and India are indeed still collaborating on open government, despite India's withdrawal from the historic Open Government Partnership (OGP) that launched in September. Chopra and VanRoekel explicitly connected the move to open source Data.gov to the U.S. involvement in the Open Government Partnership today. While we'll need to see more code and adoption to draw substantive conclusions on the outcomes of this part of the plan, this is clearly progress.

The U.S. National Action Plan on Open Government, which represents the U.S. commitment to the OGP, included some details about this initiative two months ago, building upon a State Department fact sheet that was released in July. Back in August, representatives from India's National Informatics Center visited the United States for a week-long session of knowledge sharing with the U.S. Data.gov team, which is housed within the General Services Administration.

"The secretary of state and president have both spent time in India over the past 18 months," said VanRoekel in an interview today. "There was a lot of dialogue about the power of open data to shine light upon what's happening in the world."

The project, which was described then as "Data.gov-in-a-box," will include components of the Data.gov open data platform and the India.gov.in document portal. Now, the product is being called the "Open Government Platform" — not exactly creative, but quite descriptive and evocative of open government platforms that have been launched to date. The first collection of open source code, which describes a data management system, is now up on GitHub.

During the August meetings, "we agreed upon a set of things we would do around creating excellence around an open data platform," said VanRoekel. "We owned the first deliverable: a dataset management tool. That's the foundation of an open source data platform. It handles workflow, security and the check in of data -- all of the work that goes around getting the state data needs to be in before it goes online. India owns the next phase: the presentation layer."

If the initiative bears fruit in 2012, as planned, the international open government data movement will have a new tool to apply toward open data platforms. That could be particularly relevant to countries in the developing world, given the limited resources available to many governments.

What's next for open government data in the United States has yet to be written. "The evolution of data.gov should be one that does things to connect to web services or an API key manager," said VanRoekel. "We need to track usage. We're going to double down on the things that are proving useful."

Drupal as an open government platform?

This Open Government Data platform looks set to be built upon Drupal 6, a choice that would further solidify the inroads that the open source content management system has made into government IT. As always, code and architecture choices will have consequences down the road.

"While I'm not sure Drupal is a good choice anymore for building data sites, it is key that open source is being used to disseminate open data," said Eric Gunderson, the founder of open source software firm Development Seed. "Using open source means we can all take ownership of the code and tune it to meet our exact needs. Even bad releases give us code to learn from."

Jeff Miccolis, a senior developer at Development Seed, concurred about how open the collaboration around the Data.gov code has been or will be going forward. "Releasing an application like this as open source on an open collaboration platform like Github is a great step," he said. "It still remains to be seen what the ongoing commitment to the project will be, and how collaboration will work. There is no history in the git repository they have on GitHub, no issues in the issue tracker, nor even an explicit license in the repository. These factors don't communicate anything about their future commitment to maintaining this newly minted open source project."

The White House is hoping to hear from more developers like Miccolis. "We're looking forward to getting feedback and improvements from the open source community," said VanRoekel. "How do we evolve the U.S. data.gov as it sits today?"

Open data impact

From where VanRoekel sits, investing in open source, open government and open data remains important to the administration. He told me that the fact that he was hired was a "clear indication of the importance" of these issues in the White House. "It wasn't a coincidence that the launch of the Open Government Partnership coincided with my arrival," he said. "There's a lot of effort to meet the challenge of open government. The president has me and other people involved meeting every week, reporting on progress."

The open questions now, so to speak, are: Will other countries use it? And to what effect? Here in the U.S., there's already code sharing between cities. OpenChattanooga, an open data catalog in Tennessee, is using source code from OpenDataPhilly, an open government data platform built in Philadelphia by GIS software company Azavea. By the time "Data.gov in a box" is ready to be deployed, some cities, states and countries might have decided to use that code in the meantime.

There's good reason to be careful about celebrating the progress here. Open government analysts like Nathaniel Heller have raised concerns about the role of open data in the Open Government Partnership, specifically that:

... open data provides an easy way out for some governments to avoid the much harder, and likely more transformative, open government reforms that should probably be higher up on their lists. Instead of fetishizing open data portals for the sake of having open data portals, I'd rather see governments incorporating open data as a way to address more fundamental structural challenges around extractives (through maps and budget data), the political process (through real-time disclosure of campaign contributions), or budget priorities (through online publication of budget line-items).

Similarly, Greg Michener has made a case for getting the legal and regulatory "plumbing" for open government right in Brazil, not "boutique Gov 2.0" projects that graft technology onto flawed governance systems. Michener warned that emulating the government 2.0 initiatives of advanced countries, including open data initiatives:

... may be a premature strategy for emerging democracies. While advanced democracies are mostly tweaking and improving upon value-systems and infrastructure already in place, most countries within the OGP have only begun the adoption process.

Michener and Heller both raise bedrock issues for open government in Brazil and beyond that no technology solution in and of itself will address. They're both right: Simply opening up data is not a replacement for a Constitution that enforces a rule of law, free and fair elections, an effective judiciary, decent schools, basic regulatory bodies or civil society, particularly if the data does not relate to meaningful aspects of society.

"Right now, the problem we are seeing is not so much the technology around how to open data but more around the culture internally of why people are opening data," agreed Gunderson. "We are just seeing a lot of bad data in-house and thus people wanting to stay closed. At some point a lot of organizations and government agencies need to come clean and say 'we have not been managing our decisions with good data for a long time'. We need more real projects to help make the OGP more concrete."

Heller and Michener speak for an important part of the open government community and surely articulate concerns that exist for many people, particularly for a "good government" constituency whose long-term, quiet work on government transparency and accountability may not be receiving the same attention as some shinier technology initiatives. The White House consultation on open government that I attended included considerable recognition of the complexities here.

It's worth noting that Heller called the products of open data initiatives "websites," including Kenya's new open government platform. He's not alone in doing so. To rehash an old but important principle, Gov 2.0 is not about "websites" or "portals"; it's about web services and the emerging global ecosystem of big data. In this context, Gov 2.0 isn't simply about setting up social media accounts, moving to grid computing or adopting open standards: it's about systems thinking, where open data is used by, for and with the people. If you look at what the Department of Health and Human Services is trying to do to revolutionize healthcare with open government data in the United States, that approach may become a bit clearer. For that to happen, countries, states and cities have to stand up open government data platforms.

The examples of open government data being put to use that excite VanRoekel are, perhaps unsurprisingly, on the healthcare front. If you look at the healthcare community pages on Data.gov, "you see great examples of companies and providers meeting," he said, referencing two startups from a healthcare challenge that were acquired by larger providers as a result of their involvement in the open data event.

I'm cautiously optimistic about what this news means for the world, particularly for the further validation of open source in open government. With this step forward, the prospects for stimulating more economic activity, civic utility and accountability under a global open government partnership are now brighter.

December 01 2011

Gov 2.0 enters the mainstream on NPR and the AP

Regular Radar readers know that "Gov 2.0 has gone local," as local governments look for innovative ways to use technology cooperatively with citizens to deliver smarter government. This week, NPR listeners learned more about the open-government movement around the country when the Kojo Nnamdi Show hosted an hour-long discussion on local Gov 2.0 on WAMU in Washington, D.C.

You can listen to the audio archive of the program and read the transcript at TheKojoNnamdiShow.org.

I was happy to join Bryan Sivak, chief innovation officer of the state of Maryland; Tom Lee, director of Sunlight Labs; and Abhi Nemani, director of strategy and communications at Code For America, as a guest on the show.

Heather Mizeur, a delegate to the Maryland State Assembly, called in to the show to share what her state has been working on with respect to open government, including streaming video, budget transparency and online access. Mizeur had the one-liner of the day: Commenting on the need to improve Maryland.gov, she observed that "our state website is an eight-track tape player in an iPhone universe."

An open government linkology

During the program, the @KojoShow producer shared links to sites and services that were mentioned by the guests. These included:

  • Maryland's solicitation for feedback on helpful or hurtful business regulations at
  • An NPR News feature on the American Legislative Exchange Council and channels of influence in state legislatures.
  • Churnalism, an app to discover PR masquerading as original journalism. Could a churnalism model be used to detect similar subtle influences in state legislatures? Sunlight Labs has an ongoing project at OpenStates. Stay tuned.
  • Civic engagement platform Change By Us launched in Philadelphia and was open sourced into the Civic Commons.
  • Sivak cited Arkansas.gov as a model for well designed government websites. The key is that it's adapted for mobile visitors.
  • Nemani cited Open Data Philly as a local open government platform that uses open standards. It's open sourced, so that other cities, like Chattanooga, Tenn., can use it to stand up their own open-data efforts.
  • The Sportaneous location-aware mobile app uses open-government data to help people find pick-up sports games.
  • The StreetBump app uses a smartphone's accelerometer to automatically report potholes in Boston.
Moving to Big Data: Free Strata Online Conference — In this free online event, being held Dec. 7, 2011, at 9AM Pacific, we'll look at how big data stacks and analytical approaches are gradually finding their way into organizations as well as the roadblocks that can thwart efforts to become more data driven. (This Strata Online Conference is sponsored by Microsoft.)

Register to attend this free Strata Online Conference

Civic applications enter the mainstream

Recently, civic applications and open data pushed further into the national consciousness with a widely syndicated Associated Press story by Marcus Wohlsen. Here's how Wohlsen described what's happening:

Across the country, geeks are using mountains of data that city officials are dumping on the web to create everything from smartphone tree identifiers and street sweeper alarms to neighborhood crime notifiers and apps that sound the alarm when customers enter a restaurant that got low marks on a recent inspection. The emergence of city apps comes as a result of the rise of the open-data movement in U.S. cities, or what advocates like to call "Government 2.0."

The AP covered Gov 2.0 and the open-government data movement in February, when it looked at how cities were crowdsourcing ideas from citizens, or "citizensourcing."

It's great to see what's happening around the country receive more mainstream attention. Over on Google+, Tim O'Reilly commented on the AP story:

Of all the things that made up the "gov 2.0" meme, open data may be one of the most important. It's a key part of government thinking like a platform player rather than an application provider. At Code for America, the work ended up being about liberating data as much as about writing apps. We're just at the beginning of a really interesting new approach to government services.

Wohlsen captured the paradigm behind Gov 2.0 at the end of his article:

New York, San Francisco and other cities are now working together to develop data standards that will make it possible for apps to interact with data from any city. The idea, advocates of open data say, is to transform government from a centralized provider of services into a platform on which citizens can build their own tools to make government work better.

Open311 is a data standard of this sort. So is GTFS. "So much can flow from so little," noted O'Reilly. "Consider how Google Transit began with outreach from the city of Portland to create GTFS, a standard format for transit data, which was subsequently adopted by other cities. Now, you can get transit arrival times from Google as well as from hundreds of smartphone apps, none of which needed to be written by city government."
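A small sketch may help show why a shared format matters. A GTFS feed is a set of plain CSV text files, one of which, stop_times.txt, lists when each trip arrives at each stop. The sample rows below are made up for illustration, and the helper names are hypothetical, but the column names follow the GTFS specification. Because every city publishes the same columns, the same few lines of Python would work on any city's feed.

```python
import csv
import io

# Made-up sample rows in the shape of a GTFS stop_times.txt file.
SAMPLE_STOP_TIMES = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,08:00:00,08:00:30,S1,1
T1,08:10:00,08:10:30,S2,2
T2,08:05:00,08:05:30,S2,1
T2,25:15:00,25:15:30,S1,2
"""


def to_seconds(hms):
    """Convert a GTFS HH:MM:SS time to seconds.

    GTFS allows hours above 24 for trips that run past midnight.
    """
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def arrivals_at(stop_id, feed_text):
    """Return (arrival_time, trip_id) pairs for one stop, earliest first."""
    rows = csv.DictReader(io.StringIO(feed_text))
    hits = [
        (row["arrival_time"], row["trip_id"])
        for row in rows
        if row["stop_id"] == stop_id
    ]
    return sorted(hits, key=lambda hit: to_seconds(hit[0]))


if __name__ == "__main__":
    for arrival, trip in arrivals_at("S2", SAMPLE_STOP_TIMES):
        print(arrival, trip)
```

A real feed would be read from the files a transit agency publishes rather than from an inline string, and a real app would also join against the feed's other files, such as stops.txt and trips.txt.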

What lies ahead for Gov 2.0 in 2012 has the potential to improve civic life in any number of interesting ways. If the Gov 2.0 movement is to have a lasting, transformative effect, however, what's described above needs to be the beginning of the story, not the end. That arc will include the results of HHS CTO Todd Park's efforts to revolutionize the healthcare industry or the work of the Alfred brothers at BrightScope to bring more transparency to financial advisors.

Making Gov 2.0 matter will also mean applying different ways of thinking and new technology to other areas, as FutureGov founder Dominic Campbell commented on Google+:

There aren't enough of us working to transform, challenge and change the inside of government. Not enough taking on the really sticky issues beyond relatively quick and easy wins, such as transit data or street-scene related apps. This needs to change before anything can be said to have gone mainstream. Disclaimer: this is exactly what we're looking to do with apps like PatchWorkHQ and CasseroleHQ, starting to hone in on priority, challenging, socially important and costly areas of government, such as child protection and supporting older people to live better independent lives. The journey is far longer and harder, but (we're hoping) even more rewarding.

More awareness of what's possible and available will lead to more use of civic applications and thereby to more interest in, and demand for, open-government data. For instance, on the AP's Twitter feed, an editor asked more than 634,000 followers this question: "Hundreds of new apps use public data from cities to improve services. Have you tried any?" I'll ask the same of Radar readers: have you used a civic app? If so, what and where? Did it work? Did you keep it? Please let us know in the comments.

Strata Week: New open-data initiatives in Canada and the UK

Here are a few of the data stories that caught my attention this week.

Open data from StatsCan

Embassy Magazine broke the news this week that all of Statistics Canada's online data will be made available to the public for free, released under the Government of Canada's Open Data License Agreement beginning in February 2012. Statistics Canada is the federal agency commissioned with producing statistics to help understand the Canadian economy, culture, resources, and population. (It runs the Canadian census every five years.)

The decision to make the data freely and openly available "has been in the works for years," according to Statistics Canada spokesperson Peter Frayne. The Canadian government did launch an open-data initiative earlier this year, and the move on the part of StatsCan dovetails philosophically with that. Frayne said that the decision to make the data free was not a response to the controversial decision last summer when the agency dropped its mandatory long-form census.

Open government activist David Eaves responds with a long list of "winners" from the decision, including all of the consumers of StatsCan's data:

Indirectly, this includes all of us, since provincial and local governments are big consumers of StatsCan data and so now — assuming it is structured in such a manner — they will have easier (and cheaper) access to it. This is also true of large companies and non-profits which have used StatsCan data to locate stores, target services and generally allocate resources more efficiently. The opportunity now opens for smaller players to also benefit.

Eaves continues, stressing the importance of these smaller players:

Indeed, this is the real hope. That a whole new category of winners emerges. That the barrier to use for software developers, entrepreneurs, students, academics, smaller companies and non-profits will be lowered in a manner that will enable a larger community to make use of the data and therefore create economic or social goods.

Open data from Whitehall

The British government also announced the availability of new open datasets this week. The Guardian reports that personal health records, transportation data, housing prices, and weather data will be included "in what promises to be the most dramatic release of public data since the 2010 election."

The government will also form an Open Data Institute (ODI), led by Sir Tim Berners-Lee. The ODI will involve both businesses and academic institutions, and will focus on helping transform the data for commercial benefit for U.K. companies as well as for the government. The ODI will also work on the development of web standards to support the government's open-data agenda.

The Guardian notes that the health data that's to be released will be the largest of its kind outside of U.S. veterans' medical records. The paper cites the move as something recommended by the Wellcome Trust earlier this year: "Integrated databases ... would make England unique, globally, for such research." Both medical researchers and pharmaceutical companies will be able to access the data for free.

Dell open sources its Hadoop deployment tool

Hadoop adoption and investment has been one of the big data trends of 2011, with stories about Hadoop appearing in almost every edition of Strata Week. GigaOm's Derrick Harris contends that Hadoop's good fortunes will only continue in 2012, listing six reasons why next year may actually go down as "The Year of Hadoop."

This week's Hadoop-related news involves the release of the source code to Crowbar, Dell's Hadoop deployment tool. Silicon Angle's Klint Finley writes that:

Crowbar is an open-source deployment tool developed by Dell originally as part of its Dell OpenStack Cloud service. It started as a tool for installing Open Stack, but can deploy other software through the use of plug-in modules called 'barclamps' ... The goal of the Hadoop barclamp is to reduce Hadoop deployment time from weeks to a single day.

Finley notes that Crowbar isn't competition to Cloudera's line of Hadoop management tools.

What Muncie read

"People don't read anymore," Steve Jobs once told The New York Times. It's a fairly common complaint, one that certainly predates the computer age: television was to blame, then video games. But our knowledge about reading habits of the past is actually quite slight. That's what makes the database based on ledgers from the Muncie, Ind., public library so marvelous.

The ledgers, which were discovered by historian Frank Felsenstein, chronicle every book checked out of the library, along with the name of the patron who checked it out, between November 1891 and December 1902. That information is now available in the What Middletown Read database.

In a New York Times story on the database, Anne Trubek notes that even at the turn of the 20th century, most library patrons were not reading "the classics":

What do these records tell us Americans were reading? Mostly fluff, it's true. Women read romances, kids read pulp and white-collar workers read mass-market titles. Horatio Alger was by far the most popular author: 5 percent of all books checked out were by him, despite librarians who frowned when boys and girls sought his rags-to-riches novels (some libraries refused to circulate Alger's distressingly individualist books). Louisa May Alcott is the only author who remains both popular and literary today (though her popularity is far less). "Little Women" was widely read, but its sequel "Little Men" even more so, perhaps because it was checked out by boys, too.

Got data news?

Feel free to email me.


November 04 2011

Four short links: 4 November 2011

  1. Beethoven's Open Repository of Research (RocketHub) -- open repository funded in a Kickstarter-type way. First crowdfunding project I've given $$$ to.
  2. KeepOff (GitHub) -- open source project built around hacking KeepOn Interactive Dancing Robots. (via Chris Spurgeon)
  3. Steve Jobs One-on-One (ComputerWorld) -- interesting glimpse of the man himself in an oral history project recording made during the NeXT years. "I don't need a computer to get a kid interested in that, to spend a week playing with gravity and trying to understand that and come up with reasons why. But you do need a person. You need a person. Especially with computers the way they are now. Computers are very reactive but they're not proactive; they are not agents, if you will. They are very reactive. What children need is something more proactive. They need a guide. They don't need an assistant."
  4. Bluetooth Violin Bow -- this is awesome in so many directions. Sensors EVERYWHERE! I wonder what hackable uses it has ...

October 28 2011

Four short links: 28 October 2011

  1. Open Access Week -- a global event promoting Open Access as a new norm in scholarship and research.
  2. The Copiale Cipher -- cracking a historical code with computers. Details in the paper: The book describes the initiation of "DER CANDIDAT" into a secret society, some functions of which are encoded with logograms. (via Discover Magazine)
  3. Coordino -- open source Quora-like question-and-answer software. (via Smashing Magazine)
  4. Baroque.me -- visualization of the first prelude from the first Cello Suite by Bach. Music is notoriously difficult to visualize (Disney's Fantasia is the earliest attempt that I know of) as there is so much it's possible to capture. (via Andy Baio)

October 15 2011

International Open Government Data Camp looks to build community

There's a growing international movement afoot to open up government data and make something useful with it. Civic apps based upon open data are emerging that genuinely serve citizens in beneficial ways that officials may not have been able to deliver, particularly without significant time or increased expense.

For every civic app, however, there's a backstory that often involves a broad number of stakeholders. Governments have to commit to opening themselves up but will in many cases need external expertise or even funding to do so. Citizens, industry and developers have to use the data, demonstrating that there's not only demand but skill outside of government to put open data to work in the service of accountability, citizen utility and economic opportunity. Galvanizing the co-creation of civic services, policies or apps isn't easy, but the potential of the civic surplus has attracted the attention of governments around the world.

The approach will not be a silver bullet to all of society's ills, given high unemployment, economic uncertainty or high healthcare or energy costs -- but an increasing number of states are standing up platforms and stimulating an app economy. Given the promise of leaner, smarter government that focuses upon providing open data to fuel economic activity, tough, results-oriented mayors like Rahm Emanuel and Mike Bloomberg are committing to opening Chicago and open government data in NYC.

A key ingredient in successful open government data initiatives is community. It's not enough to simply release data and hope that venture capitalists and developers magically become aware of the opportunity to put it to work. Marketing open government data is what has brought federal CTO Aneesh Chopra and HHS CTO Todd Park repeatedly out to Silicon Valley, New York City and other business and tech hubs. The civic developer and startup community is participating in creating a new distributed ecosystem, from BuzzData to Socrata to new efforts like Max Ogden's DataCouch.

As with other open source movements, people interested in open data are self-organizing and, in many cases, are using the unconference model to do so. Over the past decade, camps have sprung up all around the U.S. and, increasingly, internationally, from Asia to India to Europe to Africa to South America. Whether they're called techcamps, barcamps, citycamps or govcamps, these forums are giving advocates, activists, civic media, citizens and public officials a place to meet and exchange ideas, code and expertise.

Next week, the second International Open Government Data Camp will pull together all of those constituencies in Warsaw, Poland to talk about the future of open data. Attendees will be able to learn from plenary keynotes from open data leaders and tracks full of sessions with advocates, activists and technologists. Satellite events around OGD Camp will also offer unstructured time for people to meet, mix, connect and create. You can watch a short film about open government data from the Open Knowledge Foundation below:

To learn more about what attendees should expect, I conducted an email interview with Jonathan Gray, the community coordinator for the Open Knowledge Foundation. For more on specific details about the camp, consult the FAQ at OGDCamp.org. Gray offered more context on open government data at the Guardian this past week:

It's been over five years since the Guardian launched its influential Free Our Data campaign. Nearly four years ago Rufus Pollock coined the phrase "Raw Data Now" which web inventor Sir Tim Berners-Lee later transformed into the slogan for a global movement. And that same year a group of 30 open government advocates met in Sebastopol, California and drafted a succinct text on open government data which has subsequently been echoed and encoded in official policy and legislative documents around the world.

In under half a decade, open data has found its way into digital policy packages and transparency initiatives all over the place - from city administrations in Berlin, Paris and New York, to the corridors of supranational institutions like the European Commission or the World Bank. In the past few years we've seen a veritable abundance of portals and principles, handbooks and hackdays, promises and prizes.

But despite this enthusiastic and energetic reception, open data has not been without its setbacks and there are still huge challenges ahead. Earlier this year there were reports that Data.gov will have its funding slashed. In the UK there are concerns that the ominously titled "Public Data Corporation" may mean that an increasing amount of data is locked down and sold to those who can afford to pay for it. And in most countries around the world most documents and datasets are still published with ambiguous or restrictive legal conditions, which inhibit reuse. Public sector spending cuts and austerity measures in many countries will make it harder for open data to rise up priority lists.

Participants at this year's camp will swap notes on how to overcome some of these obstacles, as well as learning about how to set up and run an open data initiative (from the people behind data.gov and other national catalogues), how to get the legal and technical details right, how to engage with data users, how to run events, hackdays, competitions, and lots more.

What will this camp change?

We want to build a stronger international community of people interested in open data - so people can swap expertise, anecdotes and bits of code. In particular we want to get public servants talking to each other about how to set up an open data initiative, and to make sure that developers, journalists, NGOs and others are included in the process.

What did the last camp change?

Many of the participants from the 2010 camp came away enthused with ideas, contacts and energy that has catalysed and informed the development of open data around the world. For example, groups of citizens booted up grassroots open data meetups in several places, public servants set up official initiatives on the back of advice and discussions from the camp, developers started local versions of projects they liked, and so on.

Why does this matter to the tech community?

Public data is a fertile soil out of which the next generation of digital services and applications will grow. It may take a while for technologies and processes to get there, but eventually we hope open data will be ubiquitous and routine.

Why does it matter to the art, design, music, business or nonprofit community?

Journalists need to be able to navigate public information sources, from official documents and transcripts to information on the environment or the economy. Rather than relying on press releases and policy reports, they should be able to have some grasp of the raw information sources upon which these things depend - so they can make up their own mind, and do their own analysis and evaluation. There's a dedicated satellite event on data journalism at the camp, focusing on looking at where EU spending goes.

Similarly, NGOs, think tanks, and community groups should be able to utilise public data to improve their research, advocacy or outreach. Being more literate about data sources, and knowing how to use them in combination with existing free tools and services, can be a very powerful way to put arguments into context, or to communicate issues they care about more effectively. This will be a big theme in this year's camp.

Why does it matter to people who have never heard of open data?

Our lives are increasingly governed by data. Having basic literacy about how to use the information around us is important for all sorts of things, from dealing with major global problems to making everyday decisions. In response to things like climate change, the financial crisis, or disease outbreaks, governments must share information with each other and with the public, to respond effectively and to keep citizens informed. We depend on having up-to-date information to plan our journeys, locate public facilities close to us, and see how our taxes are spent.

What are the outcomes that matter from such an event?

We are hoping to build consensus around a set of legal principles for open data so key stakeholders around the world come to a more explicit and formal agreement about under what terms open data should be published (as liberal as possible!). And we'll be working on datacatalogs.org, which aims to be a comprehensive directory of open data catalogues from around the world curated for and by the open data community.

We also hope that some key open data projects will be ported and transplanted to different countries. Perhaps most importantly, we hope that (like last year) the discussions and workshops that take place will give a big boost to open data around the world, and people will continue to collaborate online after the camp.

How is OGD Camp going to be different from other events?

It looks like it will be the biggest open data event to date. We have representation from dozens and dozens of countries around the world. There will be a strong focus on getting things done. We're really excited!
