
December 10 2013

The public front of the free software campaign: part I

At a recent meeting of the MIT Open Source Planning Tools Group, I had the pleasure of hosting Zak Rogoff — campaigns manager at the Free Software Foundation — for an open-ended discussion on the potential for free and open tools for urban planners, community development organizations, and citizen activists. The conversation ranged over broad terrain in an “exploratory mode,” perhaps uncovering more questions than answers, but we did succeed in identifying some of the more common software (and other) tools needed by planners, designers, developers, and advocates, and shared some thoughts on the current state of FOSS options and their relative levels of adoption.

Included were the usual suspects — LibreOffice for documents, spreadsheets, and presentations; QGIS and OpenStreetMap for mapping; and (my favorite) R for statistical analysis — but we began to explore other areas as well, trying to get a sense of what more advanced tools (and data) planners use for, say, regional economic forecasts, climate change modeling, or real-time transportation management. (Since the event took place in the Department of Urban Studies & Planning at MIT, we mostly centered on planning-related tasks, but we also touched on some tangential non-planning needs of public agencies, and the potential for FOSS solutions there: assessor’s databases, 911 systems, library catalogs, educational software, health care exchanges, and so on.)
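
To make the toolchain a little more concrete, here is a minimal sketch of the kind of analysis a planner might run with nothing but free software. It uses Python and pandas (the same task maps directly onto R or QGIS), and the file name and columns are hypothetical placeholders rather than a real dataset.

```python
# A minimal sketch of a planning-style analysis using only free tools.
# "parcels.csv" and its columns are hypothetical placeholders; any open
# parcel or land-use dataset with similar fields would work.
import pandas as pd

# Expected columns: neighborhood, land_use, assessed_value, vacant (0/1)
parcels = pd.read_csv("parcels.csv")

summary = (
    parcels.groupby("neighborhood")
    .agg(
        parcel_count=("land_use", "size"),
        vacancy_rate=("vacant", "mean"),
        median_value=("assessed_value", "median"),
    )
    .sort_values("vacancy_rate", ascending=False)
)

# The ten neighborhoods with the highest share of vacant parcels
print(summary.head(10))
```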

Importantly, we agreed from the start that to deliver on the promise of free software, planners must also secure free and open data — and free, fair, and open standards: without access to data — the raw material of the act of planning — our tools become useless, full of empty promise.

Emerging from the discussion, moreover, was a realization of what seemed to be a natural fit between the philosophy of the free and open source software movement and the overall goals of government and nonprofit planning groups, most notably along the following lines:

  • The ideal (and requirement) of thrift: Despite what you might hear on the street, most government agencies do not exist to waste taxpayer money; in fact, even well-funded agencies generally do not have enough funds to meet all the demands we place on them, and budgets are typically stretched pretty thin. On the “community” side, we see similar budgetary constraints for planners and advocates working in NGOs and community-based organizations, where every dollar that goes into purchasing (or upgrading) proprietary software, subscribing to private datasets, and renewing licenses means one less dollar to spend on program activities on the ground. Added to this, ever since the Progressive Era, governments have been required by law to seek the lowest-cost option when spending the public’s money, and we have created an entire bureaucracy of regulations, procurement procedures, and oversight authorities to enforce these requirements. (Yes, yes, I know: the same people who complain about government waste often want to eliminate “red tape” like this…) When FOSS options meet the specifications of government contracts, it’s hard to see why they wouldn’t in fact be required under these procurement standards; of course, they often fail to meet the one part of the procurement specification that names a particular program — in essence, such practices “rig” bids in favor of proprietary software. (One future avenue worth exploring might be to argue for performance-based bid specifications in government procurement.)
  • The concomitant goal of empowerment: Beyond simply saving money, planning and development organizations often want to actually do something; they exist to protect what we have (breathable air and clean drinking water, historic and cultural resources, property values), fix what is broken (vacant lots and buildings, outmoded and failing infrastructure, unsafe neighborhoods), and develop what we need (affordable housing, healthy food networks, good jobs, effective public services). Importantly, as part of the process, planners generally seek to empower the communities they are working in (at least since the 1970s); to paraphrase (and extend) Marshall McLuhan, “the process is the purpose,” and there is little point in working “in the public interest” while simultaneously robbing that same public of its voice, its community power, and its rights of democratic participation. So, where’s the tie-in to FOSS? The key here is to avoid the problem Marx diagnosed as “alienation of the workers from the means of production.” (Recent world events notwithstanding, Marx was still sometimes correct, and he really put his finger on it with this one.) When software code is provided in a free and open format, users and coders can become partners in the development cycle; better still, “open-source” can also become “open-ended,” as different groups are empowered to modify and enhance the programs they use. Without permanent, reliable, affordable — and, some would argue, customizable — access to tools and data, planners and citizens (the “workers,” in this case) become alienated from the means of producing plans for their future.
  • The value of transparency and openness: A third area of philosophical alignment between free software and public planners relates to the importance both groups place on transparency. To some extent — at least in the context of government planners — this aspect seems to combine elements of the previous two: just as government agencies are required under procurement laws to be cost-conscious, they are required under public records and open meeting laws to be transparent. Similarly, in the same way that community empowerment requires access to the tools of planning, it also requires access to the information of planning: in order for democratic participation to be meaningful, the public must have access to information about what decisions are being made, when, by whom, and why (based on what rationale?). Transparency — not just the privilege of “being informed,” but rather the right to examine and audit all the files — is the only way to ensure this access. In short, even if it is not free, we expect our government to be open source.
  • The virtuous efficiency of cooperation and sharing: With a few misguided exceptions (for example, when engaging in “tragedy of the commons” battles over shared resources, or manipulated into “race-to-the-bottom” regional bidding wars to attract sports teams or industrial development), governments and community-based organizations generally do not exist in the same competitive environment as private companies. If one agency or neighborhood develops a new tool or has a smart idea to solve a persistent problem, there is no harm — and much benefit — to sharing it with other places. In this way, the natural inclination of public and non-profit agencies bears a striking resemblance to the share-and-share-alike ethos of open source software developers. (The crucial difference being that, often, government and community-based agencies are too busy actually working “in the trenches” to develop networks for shared learning and knowledge transfer, but the interest is certainly there.)

Added to all this, recent government software challenges hint at the potential benefit of a FOSS development model. For example, given the botched rollout of the online health care insurance exchanges (which some have blamed on proprietary software models, and/or the difficulty of building the new public system on top of existing locked private code), groups like the FSF have been presented with a “teachable moment” about the virtues of free and open solutions. Of course, given the current track record of adoption (spotty at best), the recognition of these lines of natural alignment raises the question, “Given all this potential and all these shared values, why haven’t more public and non-profit groups embraced free and open software to advance their work?” Our conversation began to address this question in a frank and honest way, enumerating deficiencies in the existing tools and gaps in the adoption pipeline, but quickly pivoted to a more positive framing, suggesting new — and, potentially, quite productive — fronts for the campaign for free and open source software, which I will present in part two. Stay tuned.

April 18 2013

Sprinting toward the future of Jamaica

Creating the conditions for startups to form is now a policy imperative for governments around the world, as Julian Jay Robinson, minister of state in Jamaica’s Ministry of Science, Technology, Energy and Mining, reminded the attendees at the “Developing the Caribbean” conference last week in Kingston, Jamaica.


Robinson said Jamaica is working on deploying wireless broadband access, securing networks and stimulating tech entrepreneurship around the island, a set of priorities that would have sounded of the moment in Washington, Paris, Hong Kong or Bangalore. He also described open access and open data as fundamental parts of democratic governance, explicitly aligning the release of public data with economic development and anti-corruption efforts. Robinson also pledged to help ensure that Jamaica’s open data efforts would be successful, offering a key ally within government to members of civil society.

The interest in adding technical ability and capacity around the Caribbean was sparked by other efforts around the world, particularly Kenya’s open government data efforts. That’s what led the organizers to invite Paul Kukubo to speak about Kenya’s experience, which Robinson noted might be more relevant to Jamaica than that of the global north.

Kukubo, the head of Kenya’s Information, Communication and Technology Board, was a key player in getting the country’s open data initiative off the ground and evangelizing it to developers in Nairobi. At the conference, Kukubo gave Jamaicans two key pieces of advice. First, open data efforts must be aligned with national priorities, from reducing corruption to improving digital services to economic development.

“You can’t do your open data initiative outside of what you’re trying to do for your country,” said Kukubo.

Second, political leadership is essential to success. In Kenya, the president was personally involved in open data, Kukubo said. Now that a new president has been officially elected, however, there are new questions about what happens next, particularly given that pickup in Kenya’s development community hasn’t been as dynamic as officials might have hoped. There’s also a significant issue on the demand-side of open data, with respect to the absence of a Freedom of Information Law in Kenya.

When I asked Kukubo about these issues, he said he expects a Freedom of Information law will be passed this year in Kenya. He also replied that the momentum on open data wasn’t just about the supply side.

“We feel that in the usage side, especially with respect to the developer ecosystem, we haven’t necessarily gotten as much traction from developers using data and interpreting it cleverly as we might have wanted to have,” he said. “We’re putting more into that area.”

With respect to leadership, Kukubo pointed out that newly elected Kenyan President Uhuru Kenyatta drove open data release and policy when he was the minister of finance. Kukubo expects him to be very supportive of open data in office.

The development of open data in Jamaica, by way of contrast, has been driven by academia, said professor Maurice McNaughton, director of the Center of Excellence at the Mona School of Business at the University of the West Indies (UWI). The Caribbean Open Institute, for instance, has been working closely with Jamaica’s Rural Agriculture Development Authority (RADA). There are high hopes that releases of more data from RADA and other Jamaican institutions will improve Jamaica’s economy and the effectiveness of its government.

Open data could add $35 million annually to the Jamaican economy, said Damian Cox, director of the Access to Information Unit in the Office of the Prime Minister, citing a United Nations estimate. Cox also explicitly aligned open data with measuring progress toward Millennium Development Goals, positing that increasing the availability of data will enable the civil society, government agencies and the UN to more accurately assess success.

The development of (open) data-driven journalism

Developing the Caribbean focused on the demand side of open data as well, particularly the role of intermediaries in collecting, cleaning, fact checking, and presenting data, matched with necessary narrative and context. That kind of work is precisely what data-driven journalism does, which is why it was one of the major themes of the conference. I was invited to give an overview of data-driven journalism that connected some trends and highlighted the best work in the field.

I’ve written quite a bit about how data-driven journalism is making sense of the world elsewhere, with a report yet to come. What I found in Jamaica is that media there have long since begun experimenting in the field, from the investigative journalism at Panos Caribbean to the relatively recent launch of diGJamaica by the Gleaner Company.

diGJamaica is modeled upon the Jamaican Handbook and includes more than a million pages from The Gleaner newspaper, going back to 1834. The site publishes directories of public entities and public data, including visualizations. It charges for access to the archives.

Legends and legacies


Olympic champion Usain Bolt, photographed in his (fast) car at the UWI/Usain Bolt Track in Mona, Jamaica.

Normally, meeting the fastest man on earth would be the most memorable part of any trip. The moment that left the deepest impression from my journey to the Caribbean, however, came not from encountering Usain Bolt on a run but from within a seminar room on a university campus.

As a member of a panel of judges, I saw dozens of young people present after working for 30 hours at a hackathon at the University of the West Indies. While even the most mature of the working apps was still a prototype, the best of them were squarely focused on issues that affect real Jamaicans: scoring the risk of farmers who needed bank loans, and collecting and sharing data about produce.

The winning team created a working mobile app that would enable government officials to collect data at farms. While none of the apps are likely to be adopted by the agricultural agency in their current form, or to show up in the Google Play store this week, the experience the teams gained will help them in the future.

As I left the island, the perspective that I’d taken away from trips to Brazil, Moldova and Africa last year was further confirmed: technical talent and creativity can be found everywhere in the world, along with considerable passion to apply design thinking, data and mobile technology to improve the societies people live within. This is innovation that matters, not just clones of popular social networking apps — though the judges saw more than a couple of those ideas flow by as well.

In the years ahead, Jamaican developers will play an important role in media, commerce and government on the island. If attracting young people to engineering and teaching them to code is the long-term legacy of efforts like Developing the Caribbean, it will deserve its own thumbs up from Mr. Bolt. The track to that future looks wide open.


Disclosure: the cost of my travel to Jamaica was paid for by the organizers of the Developing the Caribbean conference.

March 28 2013

Four short links: 28 March 2013

  1. What American Startups Can Learn From the Cutthroat Chinese Software Industry — It follows that the idea of “viral” or “organic” growth doesn’t exist in China. “User acquisition is all about media buys. Platform-to-platform in China is war, and it is fought viciously and bitterly. If you have a Gmail account and send an email to, for example, NetEase163.com, which is the local web dominant player, it will most likely go to spam or junk folders regardless of your settings. Just to get an email to go through to your inbox, the company sending the email needs to have a special partnership.” This entire article is a horror show.
  2. White House Hangout Maker Movement (Whitehouse) — During the Hangout, Tom Kalil will discuss the elements of an “all hands on deck” effort to promote Making, with participants including: Dale Dougherty, Founder and Publisher of MAKE; Tara Tiger Brown, Los Angeles Makerspace; Super Awesome Sylvia, Super Awesome Maker Show; Saul Griffith, Co-Founder, Otherlab; Venkatesh Prasad, Ford.
  3. Municipal Codes of DC Freed (BoingBoing) — more good work by Carl Malamud. He’s specifically providing data for apps.
  4. The Modern Malware Review (PDF) — 90% of fully undetected malware was delivered via web-browsing; It took antivirus vendors 4 times as long to detect malware from web-based applications as opposed to email (20 days for web, 5 days for email); FTP was observed to be exceptionally high-risk.

March 19 2013

The City of Chicago wants you to fork its data on GitHub

GitHub has been gaining new prominence as the use of open source software in government grows.

Earlier this month, I included a few thoughts from Chicago’s chief information officer, Brett Goldstein, about the city’s use of GitHub, in a piece exploring GitHub’s role in government.

While Goldstein says that Chicago’s open data portal will remain the primary means through which Chicago releases public sector data, publishing open data on GitHub is an experiment that will be interesting to watch, in terms of whether it affects reuse or collaboration around it.

In a followup email, Goldstein, who also serves as Chicago’s chief data officer, shared more about why the city is on GitHub and what they’re learning. Our discussion follows.


The City of Chicago is on GitHub.

What has your experience on GitHub been like to date?

Brett Goldstein: It has been a positive experience so far. Our local developer community is very excited by the MIT License on these datasets, and we have received positive reactions from outside of Chicago as well.

This is a new experiment for us, so we are learning along with the community. For instance, GitHub was not built to be a data portal, so it was difficult to upload our buildings dataset, which was over 2GB. We are rethinking how to deploy that data more efficiently.

Why use GitHub, as opposed to some other data repository?

Brett Goldstein: GitHub provides the ability to download, fork, make pull requests, and merge changes back to the original data. This is a new experiment, where we can see if it’s possible to crowdsource better data. GitHub provides the necessary functionality. We already had a presence on GitHub, so it was a natural extension to that as a complement to our existing data portal.

Why does it make sense for the city to use or publish open source code?

Brett Goldstein: Three reasons. First, it solves issues with incorporating data in open source and proprietary projects. The city’s data is available to be used publicly, and this step removes any remaining licensing barriers. These datasets were targeted because they are incredibly useful in the daily life of residents and visitors to Chicago. They are the most likely to be used in outside projects. We hope this data can be incorporated into existing projects. We also hope that developers will feel more comfortable developing applications or services based on an open source license.

Second, it fits within the city’s ethos and vision for data. These datasets are items that are visible in daily life — transportation and buildings. It is not proprietary data and should be open, editable, and usable by the public.

Third, we engage in projects like this because they ultimately benefit the people of Chicago. Not only do our residents get better apps when we do what we can to support a more creative and vibrant developer community, they also will get a smarter and more nimble government using tools that are created by sharing data.

We open source many of our projects because we feel the methodology and data will benefit other municipalities.

Is anyone pulling it or collaborating with you? Have you used that code? Would you, if it happened?

Brett Goldstein: We collaborated with Ian Dees, who is a significant contributor to OpenStreetMap, to launch this idea. We anticipate that buildings data will be integrated into OpenStreetMap now that it’s available with a compatible license.

We have had 21 forks and a handful of pull requests fixing some issues in our README. We have not had a pull request fixing the actual data.

We do intend to merge requests to fix the data and are working on our internal process to review, reject, and merge requests. This is an exciting experiment for us, really at the forefront of what governments are doing, and we are learning along with the community as well.

Is anyone using the open data that wasn’t before, now that it’s JSON?

Brett Goldstein: We seem to be reaching a new audience with posting data on GitHub, working in tandem with our heavily trafficked data portal. A core goal of this administration is to make data open and available. We have one of the most ambitious open data programs in the country. Our portal has over 400 datasets that are machine readable, downloadable and searchable. Since it’s hosted on Socrata, basic analysis of the data is possible as well.
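
For developers, “machine readable” translates into a few lines of code. The sketch below pulls records from a Socrata-hosted portal such as data.cityofchicago.org over its JSON API; the resource identifier is a placeholder rather than a real dataset ID, so substitute one listed on the portal before running it.

```python
# Minimal sketch: fetch rows from a Socrata-hosted open data portal as JSON.
# The resource ID is a placeholder; substitute the identifier of an actual
# dataset listed on data.cityofchicago.org.
import requests

PORTAL = "https://data.cityofchicago.org"
RESOURCE_ID = "xxxx-xxxx"  # hypothetical dataset identifier

response = requests.get(
    f"{PORTAL}/resource/{RESOURCE_ID}.json",
    params={"$limit": 100},  # standard SODA paging parameter
    timeout=30,
)
response.raise_for_status()

rows = response.json()  # a list of dicts, one per record
print(f"Fetched {len(rows)} records")
if rows:
    print("Fields:", sorted(rows[0]))
```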

March 08 2013

GitHub gains new prominence as the use of open source within governments grows

When it comes to government IT in 2013, GitHub may have surpassed Twitter and Facebook as the most interesting social network.

GitHub’s profile has been rising recently, from a Wired article about open source in government, to its high profile use by the White House and within the Consumer Financial Protection Bureau. This March, after the first White House hackathon in February, the administration’s digital team posted its new API standards on GitHub. In addition to the U.S., code from the United Kingdom, Canada, Argentina and Finland is also on the platform.

“We’re reaching a tipping point where we’re seeing more collaboration not only within government agencies, but also between different agencies, and between the government and the public,” said GitHub head of communications Liz Clinkenbeard, when I asked her for comment.

Overall, 2012 was a breakout year for the use of GitHub by government, with more than 350 government code repositories by year’s end.


Total number of government repositories on GitHub.

In January 2012, the British government committed the code for GOV.UK to GitHub.

NASA, after its first commit, added 11 more code repositories over the course of the year.

In September, the new Open Gov Foundation published the code for the MADISON legislative platform. In December, the U.S. Code went on GitHub.

GitHub’s profile was raised further in Washington this week when Ben Balter was announced as the company’s federal liaison. Balter made some open source history last year, when he was part of the federal government’s first agency-to-agency pull request. He also was a big part of giving the White House some much-needed geek cred when he coded the administration’s digital government strategy in HTML5.

Balter will be GitHub’s first government-focused employee. He won’t, however, be saddled with an undecipherable title. In a sly dig at the slow-moving institutions of government, and in keeping with GitHub’s love for octocats, Balter will be the first “Government Bureaucat,” focused on “helping government to do all sorts of governmenty things, well, more awesomely,” wrote GitHub CIO Scott Chacon.

Part of Balter’s job will be to evangelize the use of GitHub’s platform as well as open source in government, in general. The latter will come naturally to him, given how he and the other Presidential Innovation Fellows approached their work.

“Virtually everything the Presidential Innovation Fellows touched was open sourced,” said Balter when I interviewed him earlier this week. “That’s everything from better IT procurement software to internal tools that we used to streamline paperwork. Even more important, much of that development (particularly RFPEZ) happened entirely in the open. We were taking the open source ethos and applying it to how government solutions were developed, regardless whether or not the code was eventually public. That’s a big shift.”

Balter is a proponent of social coding in the open as a means of providing some transparency to interested citizens. “You can go back and see why an agency made a certain decision, especially when tools like these are used to aid formal decision making,” he said. “That can have an empowering effect on the public.”

Forking code in city hall and beyond

There’s notable government activity beyond the Beltway as well.

The City of Chicago is now on GitHub, where chief data officer and city CIO Brett Goldstein is releasing open data as JSON files, along with open source code.

Both Goldstein and Philadelphia chief data officer Mark Headd are also laudably participating in conversations about code and data on Hacker News threads.

“Chicago has released over 400 datasets using our data portal, which is located at data.cityofchicago.org,” Goldstein wrote on Hacker News. While Goldstein says that the city’s portal will remain the primary way they release public sector data, publishing data on GitHub is an experiment that will be interesting to watch, in terms of whether it affects reuse.

“We hope [the datasets on GitHub] will be widely used by open source projects, businesses, or non-profits,” wrote Goldstein. “GitHub also allows an on-going collaboration with editing and improving data, unlike the typical portal technology. Because it’s an open source license, data can be hosted on other services, and we’d also like to see applications that could facilitate easier editing of geographic data by non-technical users.”

Headd is also on GitHub in a professional capacity, where he and his colleagues have been publishing code to a City of Philadelphia repository.

“We use [GitHub] to share some of our official city apps,” commented Headd on the same Hacker News thread. “These are usually simple web apps built with tools like Bootstrap and jQuery. We’ll be open sourcing more of these going forward. Not only are we interested in sharing the code for these apps, we’re actively encouraging people to fork, improve and send pull requests.”

While there’s still a long road ahead for widespread code sharing between the public and government, the economic circumstances of cities and agencies could create the conditions for more code sharing inside government. In a TED Talk last year, Clay Shirky suggested that adopting open source methods for collaboration could even transform government.

A more modest (although still audacious) goal would be to simply change how government IT is done.

“I’ve often said, the hardest part of being a software developer is training yourself to Google the problem first and see if someone else has already solved it,” said Balter during our interview. “I think we’re going to see government begin to learn that lesson, especially as budgets begin to tighten. It’s a relative ‘app store’ of technology solutions just waiting to be used or improved upon. That’s the first step: rather than going out to a contractor and reinventing the wheel each time, it’s training ourselves that we’re part of a larger ecosystem and to look for prior art. On the flip side, it’s about contributing back to that commons once the problem has been solved. It’s about realizing you’re part of a community. We’re quickly approaching a tipping point where it’s going to be easier for government to work together than alone. All this means that a taxpayer’s dollar can go further, do more with less, and ultimately deliver better citizen services.”

Some people may understandably bridle at including open source code and open data under the broader umbrella of “open government,” particularly if such efforts are not balanced by adherence to good government principles around transparency and accountability.

That said, there’s reason to hail collaboration around software and data as bona fide examples of 21st century civic participation, where better platforms for social coding enable improved outcomes. The commits and pulls of staff and residents on GitHub may feel like small steps, but they represent measurable progress toward more government not just of the people, but with the people.

“Open source in government is nothing new,” said Balter. “What’s new is that we’re finally approaching a tipping point at which, for federal employees, it’s going to be easier to work together, than work apart. Whereas before, ‘open source’ often meant compiling, zipping, and uploading, when you fuse the internal development tools with the external publishing tools, and you make those tools incredibly easy to use, participating in the open source community becomes trivial. Often, it can be more painful for an agency to avoid it completely. I think we’re about to see a big uptick in the amount of open source participation, and not just in the traditional sense. Open source can be between business units within an agency. Often the left hand doesn’t know what the right is doing between agencies. The problems agencies face are not unique. Often the taxpayer is paying to solve the same problem multiple times. Ultimately, in a collaborative commons with the public, we’re working together to make our government better.”

February 22 2013

White House moves to increase public access to scientific research online

Today, the White House responded to a We The People e-petition that asked for free online access to taxpayer-funded research.

As part of the response, John Holdren, the director of the White House Office of Science and Technology Policy, released a memorandum today directing agencies with “more than $100 million in research and development expenditures to develop plans to make the results of federally-funded research publicly available free of charge within 12 months after original publication.”

The Obama administration has been considering access to federally funded scientific research for years, including a report to Congress in March 2012. The relevant e-petition, which had gathered more than 65,000 signatures, had gone unanswered since May of last year.

As Hayley Tsukayama notes in the Washington Post, the White House acknowledged the open access policies of the National Institutes of Health as a successful model for sharing research.

“This is a big win for researchers, taxpayers, and everyone who depends on research for new medicines, useful technologies, or effective public policies,” said Peter Suber, Director of the Public Knowledge Open Access Project, in a release. “Assuring public access to non-classified publicly-funded research is a long-standing interest of Public Knowledge, and we thank the Obama Administration for taking this significant step.”

Every federal agency covered by this memorandum will eventually need to “ensure that the public can read, download, and analyze in digital form final peer-reviewed manuscripts or final published documents within a timeframe that is appropriate for each type of research conducted or sponsored by the agency.”

An open government success story?

From the day they were announced, one of the biggest question marks about We The People e-petitions has always been whether the administration would make policy changes or take public stances it had not taken before on a given issue.

While the memorandum and the potential outcomes from its release come with caveats, from the $100 million threshold to national security or economic competition, this answer from the director of the White House Office of Science and Technology Policy, accompanied by a memorandum directing agencies to plan for public access to research, is a substantive outcome.

While there are many reasons to be critical of some open government initiatives, it certainly appears that today, We The People were heard in the halls of government.

An earlier version of this post appears on the Radar Tumblr, including tweets regarding the policy change. Photo Credit: ajc1 on Flickr.


February 13 2013

Personal data ownership drives market transparency and empowers consumers

On Monday morning, the Obama administration launched a new community focused on consumer data at Data.gov. While there was no new data to be found among the 507 datasets listed there, it was the first time that smart disclosure had an official home in the federal government.


Image via Data.gov.

“Smart disclosure means transparent, plain language, comprehensive, synthesis and analysis of data that helps consumers make better-informed decisions,” said Christopher Meyer, the vice president for external affairs and information services at Consumers Union, the nonprofit that publishes “Consumer Reports,” in an interview. “The Obama administration deserves credit for championing agency disclosure of data sets and pulling it together into one web site. The best outcome will be widespread consumer use of the tools — and that remains to be seen.”

You can find the new community at Consumer.Data.gov or data.gov/consumer. Both URLs forward visitors to the same landing page, where they can explore the data, past challenges, external resources on the topic, in addition to a page about smart disclosure, blog posts, forums and feedback.

“Analyzing data and giving plain language understanding of that data to consumers is a critical part of what Consumer Reports does,” said Meyer. “Having hundreds of data sets available on one (hopefully) easy-to-use platform will enable us to provide even more useful information to consumers at a time when family budgets are tight and health care and financial ‘choices’ have never been more plentiful.”

The newest community brings the total number of communities on Data.gov to 16. A survey of the existing communities didn’t turn up much recent activity in the forums or blogs, although the health care community at HealthData.gov has more signs of life than others and there are ongoing challenges at Challenge.gov associated with many different topics.

Another side of open?

Smart disclosure is one of the 17 initiatives that the U.S. committed to as part of the National Action Plan for the Open Government Partnership.

“We’ve developed new tools — called ‘smart disclosures’ — so that the data we make public can help people make health care choices, help small businesses innovate, and help scientists achieve new breakthroughs,” said President Obama, speaking at the launch of the Open Government Partnership in New York City in September 2011. “We’ve been promoting greater disclosure of government information, empowering citizens with new ways to participate in their democracy. We are releasing more data in usable forms on health and safety and the environment, because information is power, and helping people make informed decisions and entrepreneurs turn data into new products, they create new jobs.”

In the months since, the Obama administration has been promoting the use of smart disclosure across federal government through a task force (PDF), working to embed the practice as part of the ways that agencies deliver on consumer policy. The United Kingdom’s “Midata” initiative is an important smart disclosure case study outside of the United States.

In 2012, the U.S. Treasury Department launched a finance data community, joining open data initiatives in health care, energy, education, development and safety.

“I think you have to say that what has been accomplished so far is mostly [that] the release of government data has spawned a new generation of apps,” said Richard Thaler, professor of behavioral science and economics at the University of Chicago, in an interview. “This has been a win-win for business and consumers. New businesses are created to utilize the now available government data, and consumers now know when the next bus will arrive. The next step will be to get the private sector data into the picture — but that is only the bright future at this stage, rather than something that has already been accomplished. It is great that the government has led the way in releasing data, since it will give them more credibility when they ask private companies to do the same.”

Open data as catalyst?

While their business or organizational goals for data usage may diverge, consumer advocates, entrepreneurs and media are all looking for more insight into what’s actually happening in marketplaces for goods and services.

“Data releases are critical,” said Meyer. “First, even raw, less consumer-friendly data can help change government and industry behavior when it is published. Second, sunlight truly is the best disinfectant. We believe government and industry want to do right by consumers. Scrutiny of data makes the next iteration better, whether it’s produced by the government or a hospital.”

What will make these kinds of disclosures “smart?” When they involve timely, regular release of personal data in standardized, machine readable formats. When data is more liquid, it can easily be ingested by entrepreneurs and developers to be used in tools and services to help people to make more informed decisions as they navigate marketplaces for finance, health care, energy, education or other areas.

“We use government datasets a great deal in the health care space,” said Meyer. “We use CMS ‘Hospital Compare’ data to publish ratings on patient experience and re-admissions. To develop ratings of preventive services for heart disease, we rely on the U.S. Preventive Services Task Force.”

The stories of Brightscope and Panjiva are instructive: both startups had to invest significant time, money and engineering talent in acquiring and cleaning up government data before they could put it to work adding transparency to supply chains or financial advisers.

“It’s cliche, but true – knowledge is power,” said Yaron Samid, the CEO of BillGuard, in an interview. “In BillGuard’s case, when we inform consumers about a charge on their credit bill that was disputed by thousands of other consumers or a known grey charge merchant before they shop, it empowers them to make active choices in protecting their money – and spending it, penny for penny, how they choose and explicitly authorize. The release and cross-sector collaboration of billing dispute data will empower consumers and help put an end to deceptive sales and billing practices, the same way crowdsourced “mark as spam” data did for the anti-spam industry.”

What tools exist for smart disclosure today?

If you look through the tools and services at the new alpha.data.gov, quite a few of the examples are tools that use smart disclosure. When they solve knotty problems, such consumer-facing products or services have the potential to scale massively and quickly.

As Meyer pointed out in our interview, however, which ones catch on is still an open question.

“We are still in the nascent stage of identifying many smart disclosure outcomes that have benefited consumers in a practical way,” he said. “Where we can see demonstrable progress is the government’s acknowledgement that freeing the data is the first and most necessary step to giving private sector innovators opportunity to move the marketplace in a pro-consumer direction.”

The difference between open data on a government website and data put to work where consumers are making decisions, however, is significant.

“‘Freeing the data’ is just the first step,” said Meyer. “It has to be organized in a consumer-friendly format. That means a much more intense effort by the government to understand what consumers want and how they can best absorb the data. Consumer Reports and its policy and action arm, Consumers Union, have spent an enormous amount of time trying to get federal and state governments and private health providers to release information about hospital-acquired infections in order to prevent medical harms that kill 100,000 people a year. We’re making progress with government agencies, although we have a long way to go.”

There has already been some movement in sectors where consumers are used to downloading data, like banking. For instance, BillShrink and Hello Wallet use government and private sector data to help people to make better consumer finance decisions. OPower combines energy efficiency data from appliances and government data on energy usage and weather to produce personalized advice on how to save money on energy bills. BillGuard analyzes millions of billing disputes to find “grey charge” patterns on credit cards and debit cards. (Disclosure: Tim O’Reilly is on BillGuard’s Advisory Board and is a shareholder in the startup.)

“To get an idea of the potential here, think about what has happened to the travel agent business,” said Thaler. “That industry has essentially been replaced by websites serving as choice engines. While this has been a loss to those who used to be travel agents, I think most consumers feel they are better served by being able to search the various travel and lodging options via the Internet. When it comes to choosing a calling plan or a credit card, it is very difficult to get the necessary data, either on prices or on one’s own utilization, to make a good choice. The same is true for mortgages. If we can make the underlying data available, we can help consumers make much better choices in these and other domains, and at the same time make these industries more competitive and transparent. There are similar opportunities in education, especially in the post-high school, for-profit sector.”

Recent data releases have the potential to create new insights into previously opaque markets.

“There are also citizen complaint registries that have been created either by statute (Consumer Product Safety Improvement Act of 2008) or by federal agencies, like the Consumer Financial Protection Bureau (CFPB). [These registries] will create rich datasets that industry can use to improve their products and consumer advocates can analyze to point out where the marketplace hasn’t worked,” said Meyer.

In 2012, the CFPB, in fact, began publishing a new database online. As was the case with the Consumer Product Safety Commission in 2011, the consumer complaint database did not go online without industry opposition, as Suzy Khimm reported in her feature story on the CFPB. That said, the CFPB has been making consumer complaints available to the public online since last June.

That data is now being consumed by BillGuard, enabling more consumers to derive benefit that might not have been available otherwise.

“The CFPB has made their consumer complaint database open to the public,” said Samid. “Billing disputes are the No. 1 complaint category for credit cards. We also source consumer complaint data from the web and anonymized billing disputes directly from banks. We are working with other government agencies to share our findings about grey charges, but cannot disclose those relationships just yet.”

“Choice engines” for an open data economy

Many of this emerging class of services use multiple datasets to provide consumers with insight into their choices. For instance, reviews and experiences of prior customers can be mashed up with regulatory data from government agencies, including complaints. Data from patient reviews could power health care startups. The integration of food inspection data into Yelp will give consumers more insights into dining decisions. Trulia and Zillow suggest another direction for government data use, as seen in real estate.
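
As a rough illustration of how such a mashup works, the sketch below joins a private reviews table to a public complaints table on a shared business identifier and flags businesses whose ratings and complaint counts diverge. The file names, columns, and thresholds are hypothetical placeholders, not any particular company’s pipeline.

```python
# Sketch of the "choice engine" idea: join private-sector review data with a
# public complaints dataset on a shared business identifier. File names,
# columns, and thresholds are hypothetical.
import pandas as pd

reviews = pd.read_csv("reviews.csv")        # business_id, avg_rating, review_count
complaints = pd.read_csv("complaints.csv")  # business_id, complaint_count

merged = reviews.merge(complaints, on="business_id", how="left")
merged["complaint_count"] = merged["complaint_count"].fillna(0)

# Businesses that look good in reviews but draw an unusual number of complaints
flagged = merged[(merged["avg_rating"] >= 4.0) & (merged["complaint_count"] >= 10)]
print(flagged.sort_values("complaint_count", ascending=False).head())
```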

If these early examples are any guide, there’s an interesting role for consumer policy makers and regulators to play: open data stewards and suppliers. Given that the release of such data has an effect on the market for products and services, expect more companies in affected industries to resist such initiatives, much in the same way that the CPSC and CFPB databases were opposed by industry. Such resistance may be subtle, where government data collection is portrayed as part of a regulator’s mission but its release into the marketplace is undermined.

Nonetheless, smart disclosure taps into larger trends, in particular “personal data ownership” and consumer empowerment. The growth of an energy usage management sector and participatory health care show how personal data can be used, once acquired. The use of behavioral science in combination with such data is of great interest to business interests and should attract the attention of policy makers, legislators and regulators.

After all, convening and pursuing smart disclosure initiatives puts government in an interesting role. If government agencies or private companies then choose to apply behavioral economics in programs or policies, with an eye on improving health or financial well-being, how should the policies themselves be disclosed? What principles matter?

“The guideline I suggest is that if a firm is keeping track of your usage and purchases, then you should be able to get access to that data in a machine-readable, standardized format that, with one click, you could upload to a search engine website,” said Thaler. “As for the proper balance, I am proposing only that consumers have access to their raw purchase history, not proprietary inferences the firm may have drawn. To give an example, you should have a right to download the list of all the movies you have rented from Netflix, but not the conclusions they have reached about what sort of movies you might also like. Also, any policy like this should begin with larger firms that already have sophisticated information systems keeping track of consumer data. For those firms, the costs of providing the data to their consumers should be minor.”
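
A minimal sketch of what that guideline implies in practice, using invented records: export the customer’s raw purchase history, not the firm’s inferences, in standard machine-readable formats that another service could ingest with one click.

```python
# Sketch of Thaler's guideline: export a customer's raw purchase history
# (not the firm's proprietary inferences) in standard machine-readable forms.
# The records below are invented for illustration.
import csv
import json

purchase_history = [
    {"date": "2013-01-14", "item": "Monthly plan", "amount": 49.99, "currency": "USD"},
    {"date": "2013-01-21", "item": "International calls", "amount": 7.25, "currency": "USD"},
]

# JSON export: easy for a choice engine or comparison site to ingest.
with open("purchase_history.json", "w") as f:
    json.dump(purchase_history, f, indent=2)

# CSV export: the same raw records in a spreadsheet-friendly format.
with open("purchase_history.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["date", "item", "amount", "currency"])
    writer.writeheader()
    writer.writerows(purchase_history)
```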

Given the growth of student loans, more transparency and understanding around higher education choices are needed. For that to happen, prospective students will need more access to their own personal data to build the profile that they can then use to get personalized recommendations about education, along with data from higher education institutions, including outcomes for different kinds of students, from graduation rates to job placement.

Disclosures of data regarding outcomes can have other effects as well.

“I referenced the hospital-acquired infection battle earlier,” said Meyer. “In 1999, the Institute of Medicine released a study, “To err is human,” that showed tens of thousands of consumers were dying because of preventable medical harms. Consumers Union started a campaign in 2003 to reduce the number of deaths due to hospital-acquired infections. Our plan was to get laws passed in states that required disclosure of infections. We have helped get laws passed in 30 states, which is great, but getting the states to comply with useful data has been difficult. We’re starting to see progress in reducing infections but it’s taken a long time.”


This post is part of our ongoing investigation into the open data economy.

February 05 2013

Investing in the open data economy

If you had 10 million pounds to spend on open data research, development and startups, what would you do with it? That’s precisely the opportunity that Gavin Starks (@AgentGav) has been given as the first CEO of the Open Data Institute (ODI) in the United Kingdom.

The ODI, which officially opened last September, was founded by Sir Tim Berners-Lee and Professor Nigel Shadbolt. The independent, non-partisan, “limited by guarantee” nonprofit is a hybrid institution focused on unlocking the value in open data by incubating startups, advising governments, and educating students and media.

Previously, Starks was the founder and chairman of AMEE, a social enterprise that scored environmental costs and risks for businesses. (O’Reilly’s AlphaTech Ventures was one of its funders.) He’s also worked in the arts, science and technology. I spoke to Starks about the work of the ODI and open data earlier this winter as part of our continuing series investigating the open data economy.

What have you accomplished to date?

Gavin Starks: We opened our offices on the first of October last year. Over the first 12 weeks of operation, we’ve had a phenomenal run. The ODI is looking to create value to help everyone address some of the greatest challenges of our time, whether that’s in education, health, in our economy or to benefit our environment.

Since October, we’ve had literally hundreds of people through the door. We’ve secured $750,000 in matched funding from the Amida Network, on top of a 10-million-pound investment from the UK Government’s Technology Strategy Board. We’ve helped identify 200 million pounds a year in savings for the health service in the UK.

200 million pounds? What do you base that estimate upon?

Gavin Starks: Part of our remit is to bring together the main experts from different areas. To illustrate the kind of benefit that I think we can bring here, one part of what we’re doing is to try and unlock data supply.

The Health Service in the UK started to release a lot of its prescription information as open data about nine months ago. We worked with some of the main experts in the health service and with a big data analytics firm, Mastodon C, a startup that we’re incubating at the ODI.

Together, they identified potential areas of investigation. The data science folks drilled into every single prescription. (I think the dataset was something like 47 million rows of data.) What they were looking at there was the difference between proprietary drugs and generics, where there may be a generic equivalent. In many cases, the generic equivalent has no clinical difference from the proprietary drug — and so the cost difference is huge. It might be 81 pence for a generic versus more than 20 pounds for a drug that’s still under license.

Looking at the entire dataset, the analytics revealed different patterns, and from that, cost differences. If we had carried out this research a year ago, for example, we could have saved 200 million pounds over the last year. It really is quite significant. That’s on one class of drugs, in one area. We think this research could be repeated against different classes of drugs and replicated internationally.


Percentage of proprietary statin prescribing by CCG Sept 2011 – May 2012.
Image Credit: PrescribingAnalytics.com
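
To give a sense of the shape of that analysis, here is a heavily simplified sketch in Python. The column names and the generic-equivalent lookup are hypothetical placeholders (the real work at Mastodon C ran over tens of millions of prescription rows), but the core step is the same: price each proprietary prescription as if the generic had been dispensed and sum the difference.

```python
# Heavily simplified sketch of the proprietary-vs-generic savings estimate
# described above. Column names and the generic-equivalent lookup are
# hypothetical; the real analysis covered roughly 47 million rows.
import pandas as pd

# Expected columns: drug_name, items (number dispensed), cost_per_item
prescriptions = pd.read_csv("prescriptions.csv")

# Hypothetical lookup: proprietary drug name -> generic cost per item (in pounds)
generic_cost = {
    "ProprietaryStatinA": 0.81,
    "ProprietaryStatinB": 0.81,
}

branded = prescriptions[prescriptions["drug_name"].isin(generic_cost)].copy()
branded["generic_cost_per_item"] = branded["drug_name"].map(generic_cost)
branded["potential_saving"] = (
    branded["cost_per_item"] - branded["generic_cost_per_item"]
) * branded["items"]

print(f"Estimated saving from generic substitution: £{branded['potential_saving'].sum():,.0f}")
```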

Which open data business models are the most exciting to you?

Gavin Starks: I think there’s lots of different areas to explore here. There are areas where there can be cost savings brought to any organization, whether it’s public sector or private sector organizations. There’s also areas of new innovation. (I think that they’re quite different directions.) Some of the work that we’ve done with the prescription data, that’s where you’re looking at efficiencies.

We’ve got other startups that are based in our offices here in Shoreditch and London that are looking at transportation information. They’re looking at location-based services and other forms of analytics within the commercial sectors: financial information, credit ratings, those kinds of areas. When you start to pull together different levels of open data that have been available but haven’t been that accessible in the past, there’s new services that can be derived from them.

What creates a paid or value-add service? It’s essential that we create a starting point where free and open access to the data itself can be made available for certain use cases for as many people as possible. There, you stimulate innovation if you can gain access to discern new insight from that data.

Having the data aggregated, structured and accessible in an automated way is worth paying for. There could be a service-level-agreement-based model. There could be a carve-out of use cases. You could borrow from the Creative Commons world and say, “If you’re going to have a share alike license on this, then that’s fine, you can use it for free. But if you’re going to start creating closed assets, as a result, there may be a charge for the use of data at that point.”

I think there’s a whole range of different data models, but really, the goal here is to try and discern what insight can be derived from existing datasets and what happens when you start mashing them up with other datasets.

What are the greatest challenges to achieving some of the economic outcomes that the UK Cabinet Office has described?

Gavin Starks: I think there are many challenges. One of the key ones is just understanding. One challenge we’ve heard consistently from pretty much everybody has been, “We believe there’s a huge amount of potential here, but where do we start?”

Part of the ODI’s mission is to provide training, education and assets that enable people to begin on that journey. We’re in the process right now of designing our first dozen or so training programs. We’re working at one level with the World Bank to train the world’s politicians and national leaders, and we’re working at the other end with schools to create programs that fit with existing graduate courses.

Education is one of the biggest challenges. We want to train more than technologists — we also want to train lawyers and journalists about the business case to enable people to understand and move forward at the same pace. There’s very little point in just saying, “There is huge value here,” without trying to demonstrate that return on investment (ROI) and value case at the same time.

What is the ODI’s approach to incubating civic startups?

Gavin Starks: There are two parts to it. One is unlocking supply. We’re working with different government departments and public sector agencies to help them understand what unlocking supply means. Creating structured, addressable, repeatable data creates the supply piece so that you can actually start to build a business. It’s very high-risk to try and build a business when you don’t have a guarantee of supply.

Two, encouraging and incubating the demand side. We’ve got six startups in our space already. They’re all at different stages. Some of them are very early on, just trying to navigate toward the value here that we can discern from the data. Others are more mature, and maybe have some existing revenue streams, but they’re looking at how to really make this scale.

What we’ve found is of benefit so far — and again, we’re only three months in — is our ability to network and convene the different stakeholders. We can take a small startup and get them in front of one of the large corporations and help them bridge that sales process. Helping them communicate their ideas in a clear way, where the value is obvious to the end customer, is important.

What are some of the approaches that have worked to unlock value from open government data?

Gavin Starks: We’re not believers in “If you build it, they will come.” You need to create a guaranteed data supply, but you also need to really engage with people to start to unlock ideas.

We’ve been running our own hackathons, but I think there’s a difference in the way that we’ve structured them and organized them. We include domain experts and frame the hack events around a specific problem or a specific set of problems. For example, we had a weekend-long hackathon in the health space, looking at different datasets, convening domain experts and technical experts.

It involved competitions, where the winner gets a seat at the ODI to take their idea forward. It might be that an idea turns into a business, it might turn into a project, or it might just turn into a research program.

I think that you need to really lead people by the hand through the process of innovation, helping them and supporting them to unlock the value, rather than just having the datasets there and expecting them to be used.

Given the cost the UK’s National Audit Office ascribed to opening data, is the investment worth it?

Gavin Starks: This is like the early days of the web. There are lots of estimates about how much everything is going to be worth and what sort of ROI people are going to see. What we’ve yet to see, I think, is the honest answer.

The reason I’m very excited about this area is that I see the same potential as I saw in the mid-1990s, when I got involved with the web. The same patterns exist today. There are new datasets and ecosystems coming into existence that can be data-mined. They can be joined together in novel ways. They can bridge the virtual and physical worlds. They can bring together people who have not been able to collaborate in different ways.

There’s a huge amount of value to be unlocked. There will be some dead ends, as we had in the web’s development, but there will be some incredible wins. We’re trying to refine our own skills around identifying where those potential hot spots might be.

Health services is an area where it’s really obvious there are a lot of benefits. There are clear benefits from opening up transportation and location-based services. You can see the potential behind energy efficiency, creating efficient supply chains and opening up more information around education.

You can see resonant points. We’re really drilling into those and asking, “What happens when you really put together the right domain experts and the supportive backers?”

Those backers can be financial as well as in industry. The Open Data Institute has been pulling together those experts and providing a neutral space for that innovation to happen.

Which of those areas have the most clear economic value, in terms of creating shorter term returns on investment and quick wins?

Gavin Starks: I don’t think there’s a single answer to that question. If you look at location-based services, corporate data, health data or education, there are examples and use cases in different parts of the world where they will have different weightings.

If you were looking at water sanitation in areas of the world where there is an absence of it, then that data may provide a more immediate return than unlocking huge amounts of transportation information.

In Denmark, look at the release of the equivalent of zip code data and the more detailed addresses. I believe the numbers there went from a four-fold return to a 17-fold return, in terms of the value to the country of its investment in decent address-level data.

This is one area in which we’ve provided a consultation response in the UK. I think it may vary from state to state in the U.S.: there may be areas where a specific focus on health would be very beneficial, and others where a focus on energy efficiency would be most beneficial.

What conditions lead to beneficial outcomes for open data?

Gavin Starks: A lot of the real issues are not really about the technology. When it comes to the technology, we know what a lot of the solutions are. How can we address or improve the data quality? What standards need to exist? What anonymity, privacy or secrecy needs to exist around the data? How do we really measure the outcomes? What are the circumstances where stakeholders need to get involved?

You definitely need political buy-in, but there also needs to be a sense of what the data landscape is. What’s the inventory? What’s the legal situation? Who has access? What kind of access is required? What does success look like against a particular use case?

You could be looking at health in somewhere like Rwanda, you could be looking at a national statistics office in a particular country where they may not have access to the data themselves, and they don’t have very much access to resources. You could be looking at contracting, government procurement and improving simple accountability, where there may be more information flow than there is around energy data, for example.

I think there’s a range of different use cases that we need to really explore here. We’re looking for great use cases where we can say, “This is something that’s simple to achieve, that’s repeatable, that helps lower costs and stimulate innovation.”

We are really at the beginning of a journey here.

Red Hat made headlines for becoming the first billion-dollar open source company. What do you think the first billion-dollar open data company will be?

Gavin Starks: It would not be unlikely for that to be in the health arena.


This interview has been edited and condensed for clarity. This post is part of our ongoing investigation into the open data economy.


January 28 2013

Open data economy: Eight business models for open data and insight from Deloitte UK

When I asked whether the push to free up government data was resulting in economic activity and startup creation, I started to receive emails from people around the United States and Europe. I’ll be publishing more of what I learned in our ongoing series of open data interviews and profiles over the next month, but two responses are worth sharing now.

Open questions about open growth

The first response concerned Deloitte’s ongoing research into open data in the United Kingdom [PDF], conducted in collaboration with the Open Data Institute.

Harvey Lewis, one of the primary investigators for the research project, recently wrote about some of Deloitte’s preliminary findings at the Open Government Partnership’s blog in a post on “open growth.” To date, Deloitte has not found the quantitative evidence the team needs to definitively demonstrate the economic value of open data. That said, the team found much of interest in the space:

“… new businesses and new business models are beginning to emerge: Suppliers, aggregators, developers, enrichers and enablers. Working with the Open Data Institute, Deloitte has been investigating the demand for open data from businesses. Looking at the actual supply of and demand for open data in the UK provides some indication of the breadth of sectors the data is relevant to and the scale of data they could be considering.

The research suggests that the key link in the value chain for open data is the consumer (or the citizen). On balance, consumer-driven sectors of the economy will benefit most from open government data that has direct relevance to the choices individuals make as part of their day-to-day lives.”

I interviewed Lewis last week about Deloitte’s findings — stay tuned for more insight into that research in February.

8 business models for open data

Michele Osella, a researcher and business analyst in the Business Model & Policy Innovation Unit at the Istituto Superiore Mario Boella in Italy, wrote in to share examples of emerging business models based upon the research I cited in my post in December. His email reminded me that in Europe, open data is often discussed in the context of public sector information (PSI). Ongoing case studies of re-use are available at the European Public Sector Information Platform website.

Osella linked to a presentation on business models in PSI reuse and shared a list of eight business models, including case studies for six of them:

  1. Premium Product / Service. HospitalRegisters.com
  2. Freemium Product / Service. None of the 13 enterprises interviewed by us falls into this case, but a slew of instances may be provided: a classic example in this vein is represented by mobile apps related to public transportation in urban areas. [Link added.]
  3. Open Source. OpenCorporates and OpenPolis
  4. Infrastructural Razor & Blades. Public Data Sets on Amazon Web Services
  5. Demand-Oriented Platform. DataMarket and Infochimps
  6. Supply-Oriented Platform. Socrata and Microsoft Open Government Data Initiative
  7. Free, as Branded Advertising. IBM City Forward, IBM Many Eyes or Google Public Data Explorer
  8. White-Label Development. This business model has not consolidated yet, but some embryonic attempts seem to be particularly promising.

Agree? Disagree? Have other examples of these models or other business models? Please let me know in the comments or write in to alex@oreilly.com.


This post is part of our ongoing investigation into the open data economy.

January 18 2013

We’re releasing the files for O’Reilly’s Open Government book

I’ve read many eloquent eulogies from people who knew Aaron Swartz better than I did, but he was also a Foo and contributor to Open Government. So, we’re doing our part at O’Reilly Media to honor Aaron by posting the Open Government book files for free for anyone to download, read and share.

The files are posted on the O’Reilly Media GitHub account as PDF, Mobi, and EPUB files for now. There is a movement on the Internet (#PDFtribute) to memorialize Aaron by posting research and other material for the world to access, and we’re glad to be able to do this.

You can find the book here: github.com/oreillymedia/open_government

Daniel Lathrop, my co-editor on Open Government, says “I think this is an important way to remember Aaron and everything he has done for the world.” We at O’Reilly echo Daniel’s sentiment.

December 26 2012

Big, open and more networked than ever: 10 trends from 2012

In 2012, technology-accelerated change around the world was driven by the wave of social media, data and mobile devices. In this year in review, we look back at some of the stories that mattered here at Radar and look ahead to what’s in store for 2013.

Below, you’ll find 10 trends that held my interest in 2012. This is by no means a comprehensive account of “everything that mattered in the past year” — try The Economist’s account of the world in 2012 or The Atlantic’s 2012 in review or Popular Science’s “year in ideas” if you’re hungry for that perspective — but I hope you’ll find something new to think about as 2013 draws near.

Social media

Social media wasn’t new in 2012, but it was bigger and more mainstream than ever. There were some firsts, from the first Presidential “Ask Me Anything” on Reddit to the first White House Google Hangout on Google Plus to presidential #debates to the first billion-user social network. The election season had an unprecedented social and digital component, from those hyperwired debates to a presidential campaign built like a startup. Expect even more blogging, tweeting, tumbling, streaming, Liking and pinning in 2013, even if it leaves us searching for context.

Open source in government

Open source software made more inroads in the federal government, from a notable policy at the Consumer Financial Protection Bureau to more acceptance in the military.

The White House made its first commits on GitHub, including code for its mobile apps and e-petition platform, where President Obama responded personally to an e-petition for the first time. The House Oversight Committee’s crowdsourced legislative platform also went on GitHub. At year’s end, the United States (code) was on GitHub.

Responsive design

According to deputy technical lead Jeremy Vanderlan, the new AIDS.gov, launched in June, was the first full-site implementation of responsive web design for a federal government domain. They weren’t the first to automatically adapt how a website is displayed for the device a visitor is using — you can see next-generation web design at open.nasa.gov or in the way that fcc.gov/live optimizes to provide video to different mobile devices — but this was a genuine milestone for the feds online. By year’s end, Congress had also become responsive, at least with respect to its website, with a new beta at Congress.gov.

Free speech online

Is there free speech on the Internet? As Rebecca MacKinnon, Ethan Zuckerman and others have been explaining for years, what we think of as the new “public square online” is complicated by the fact that these platforms for free expression are owned and operated by private companies. MacKinnon explored these issues in “Consent of the Networked,” one of the best technology policy books of the year. In 2012, “Twitter censorship” and the Terms of Service for social networking services caused many more people to suggest a digital Bill of Rights, although “Internet freedom” is an idea that varies with the beholder.

Open mapping

On January 9th, I wondered whether 2012 would be “the year of the open map.” I started reporting on digital maps made with powerful new software and open data last winter. The prediction was partially borne out, from Foursquare and StreetEasy moving away from Google Maps to new investments in OpenStreetMap. In response to the shift, Google slashed its price for using the Google Maps API by 88%. In an ideal world, the new competition will result in both better maps and more informed citizens.

Data journalism

Data journalism took on new importance for society. We tracked its growing influence, from the Knight News Challenge to new research initiatives in Africa, and are continuing to investigate data journalism with a series of interviews and a forthcoming report.

Privacy and security

Privacy and security continued to dominate technology policy discussions in the United States, although copyright, spectrum, patents and Internet governance also figured prominently. While the Supreme Court decided that GPS monitoring constitutes a search under the Fourth Amendment, expanded rules for data sharing in the U.S. government raised troubling questions.

In another year that will end without updated baseline privacy legislation from Congress, bills did advance in the U.S. Senate to reform electronic privacy and address location-based technology. After calling for such legislation, the Federal Trade Commission opened an investigation into data brokers.

No “cyber security” bill passed the Senate either, leaving hope that future legislation will balance protections with civil liberties and privacy concerns.

Networked politics

Politics were more wired in Election 2012 than they’d ever been in history, from social media and debates to the growing clout of the Internet. The year started off with the unprecedented wave of networked activism that stopped the progress of the Stop Online Piracy Act (SOPA) and PROTECT-IP Act (PIPA) in Congress.

At year’s end, the jury remains out on whether the Internet will act as a platform for collective action to address societal challenges, from addressing gun violence in the U.S. to a changing climate.

Open data

As open data moves from the information age to the action age, there are significant advances around the globe. As more data becomes available, its practical application has only increased in importance.

After successfully releasing health care data to fuel innovation and startups, US CTO Todd Park sought to scale open data and agile thinking across the federal government.

While it’s important to be aware of the ambiguity of open government and open data, governments are continuing to move forward globally, with the United Kingdom relaunching Data.gov.uk and, at year’s end, India and the European Commission launching open data platforms. Cities around the world also adopted open data, from Buenos Aires to Berlin to Palo Alto.

In the United States, friendly competition to be the nation’s preeminent digital city emerged between San Francisco, Chicago, Philadelphia and New York. Open data releases became a point of pride. Landmark legislation in New York City and Chicago’s executive order on open data made both cities national leaders.

As the year ends, we’re working to make dollars and sense of the open data economy, explicitly making a connection between releases and economic growth. Look for a report on our research in 2013.

Open government

The world’s largest democracy officially launching an open government data platform was historic. That said, it’s worth reiterating a point I’ve made before: Simply opening up data is not a replacement for a Constitution that enforces a rule of law, free and fair elections, an effective judiciary, decent schools, basic regulatory bodies or civil society — particularly if the data does not relate to meaningful aspects of society. Adopting open data and digital government reforms is not quite the same thing as good government. Beware openwashing in government, as well as in other areas.

On that count, at year’s end, The Economist found that global open government efforts are growing in “scope and clout.” The Open Government Partnership grew, with new leadership, added experts and a finalized review mechanism. The year to come will be a test of the international partnership’s political will.

In the United States, an open government reality check at the federal level showed genuine accomplishments but left many promises only partially fulfilled, with a mixed record on meeting goals that many critics found transparently disappointing. While some of the administration’s transparency failures concern national security — notably, the use of drones overseas — science journalists reported restricted access to administration officials at the Environmental Protection Agency, the Food and Drug Administration and the Department of Health and Human Services.

Efforts to check transparency promises also found compliance with the Freedom of Information Act lacking. While a new FOIA portal is promising, only six federal agencies were on it by year’s end. The administration’s record on prosecuting whistleblowers has also sent a warning to others considering coming forward regarding waste or abuse in national security programs.

Despite those challenges, 2012 was a year of continuing progress for open government at the federal level in the United States, with reasons for hope throughout states and cities. Here’s hoping 2013 sees more advances than setbacks in this area.

Coming tomorrow: 14 trends to watch in 2013.


December 06 2012

The United States (Code) is on Github

When Congress launched Congress.gov in beta, they didn’t open the data. This fall, a trio of open government developers took it upon themselves to do what custodians of the U.S. Code and laws in the Library of Congress could have done years ago: published data and scrapers for legislation in Congress from THOMAS.gov in the public domain. The data at github.com/unitedstates is published using an “unlicense” and updated nightly. Credit for releasing this data to the public goes to Sunlight Foundation developer Eric Mill, GovTrack.us founder Josh Tauberer and New York Times developer Derek Willis.

“It would be fantastic if the relevant bodies published this data themselves and made these datasets and scrapers unnecessary,” said Mill, in an email interview. “It would increase the information’s accuracy and timeliness, and probably its breadth. It would certainly save us a lot of work! Until that time, I hope that our approach to this data, based on the joint experience of developers who have each worked with it for years, can model to government what developers who aim to serve the public are actually looking for online.”

If the People’s House is going to become a platform for the people, it will need to release its data to the people. If Congressional leaders want THOMAS.gov to be a platform for members of Congress, legislative staff, civic developers and media, the Library of Congress will need to release structured legislative data. THOMAS is also not updated in real-time, which means that there will continue to be a lag between a bill’s introduction and the nation’s ability to read the bill before a vote.

Until that happens, however, this combination of scraping and open source data publishing offers a way forward for Congressional data to be released to the public, wrote Willis, on his personal blog:

Two years ago, there was a round of blog posts touched off by Clay Johnson that asked, “Why shouldn’t there be a GitHub for data?” My own view at the time was that availability of the data wasn’t as much an issue as smart usage and documentation of it: ‘We need to import, prune, massage, convert. It’s how we learn.’

Turns out that GitHub actually makes this easier, and I’ve had a conversion of sorts to the idea of putting data in version control systems that make it easier to view, download and report issues with data … I’m excited to see this repository grow to include not only other congressional information from THOMAS and the new Congress.gov site, but also related data from other sources. That this is already happening only shows me that for common government data this is a great way to go.

In the future, legislation data could be used to show iterations of laws and improve the ability of communities at OpenCongress, POPVOX or CrunchGov to discover and discuss proposals. As Congress incorporates more tablets on the floor during debates, such data could also be used to update legislative dashboards.

The choice to use Github as a platform for government data and scraper code is another significant milestone in a breakout year for Github’s use in government. In January, the British government committed GOV.UK code to Github. NASA, after contributing its first code in January, added 11 code repositories this year. In August, the White House committed code to Github. In September, the Open Gov Foundation open sourced the MADISON crowdsourced legislation platform.

The choice to use Github for this scraper and legislative data, however, presents a new and interesting iteration in the site’s open source story.

“Github is a great fit for this because it’s neutral ground and it’s a welcoming environment for other potential contributors,” wrote Sunlight Labs director Tom Lee, in an email. “Sunlight expects to invest substantial resources in maintaining and improving this codebase, but it’s not ours: we think the data made available by this code belongs to every American. Consequently the project needed to embrace a form that ensures that it will continue to exist, and be free of encumbrances, in a way that’s not dependent on any one organization’s fortunes.”

Mill, an open government developer at Sunlight Labs, shared more perspective in the rest of our email interview, below.

Is this based on the GovTrack.us scraper?

Eric Mill: All three of us have contributed at least one code change to our new THOMAS scraper; the majority of the code was written by me. Some of the code has been taken or adapted from Josh’s work.

The scraper that currently actively populates the information on GovTrack is an older Perl-based scraper. None of that code was used directly in this project. Josh had undertaken an incomplete, experimental rewrite of these scrapers in Python about a year ago (code), but my understanding is it never got to the point of replacing GovTrack’s original Perl scripts.

We used the code from this rewrite in our new scraper, and it was extremely helpful in two ways: providing a roadmap of how THOMAS’ URLs and sitemap work, and parsing meaning out of the text of official actions.

Parsing the meaning out of action text is, I would say, about half the value and work of the project. When you look at a page on GovTrack or OpenCongress and see the timeline of a bill’s life — “Passed House,” “Signed by the President,” etc. — that information is only obtainable by analyzing the order and nature of the sentences of the official actions that THOMAS lists. Sentences are finicky, inconsistent things, and extracting meaning from them is tricky work. Just scraping them out of THOMAS.gov’s HTML is only half the battle. Josh has experience at doing this for GovTrack. The code in which this experience was encapsulated drastically reduced how long it took to create this.
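To make the scale of that pattern-matching work concrete, here is a minimal, hypothetical Python sketch of the kind of rule such a parser applies. It is not the project's actual code; the patterns, labels and function names are illustrative assumptions only.

    import re

    # Hypothetical patterns for a few common official action sentences.
    # A real scraper's rules are far more extensive and nuanced than this.
    ACTION_PATTERNS = [
        (re.compile(r"passed.*\bhouse\b", re.IGNORECASE), "passed_house"),
        (re.compile(r"passed.*\bsenate\b", re.IGNORECASE), "passed_senate"),
        (re.compile(r"signed by (the )?president", re.IGNORECASE), "signed"),
        (re.compile(r"became public law", re.IGNORECASE), "enacted"),
    ]

    def classify_action(sentence):
        """Return a coarse label for an official action sentence, or None."""
        for pattern, label in ACTION_PATTERNS:
            if pattern.search(sentence):
                return label
        return None

    print(classify_action("Passed/agreed to in House: On passage Passed without objection."))  # passed_house
    print(classify_action("Signed by President."))  # signed

Real action sentences vary far more than these four patterns suggest, which is why the accumulated experience Mill describes matters so much.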

How long did this take to build?

Eric Mill: Creating the whole scraper, and the accompanying dataset, was about 4 weeks of work on my part. About half of that time was spent actually scraping — reverse engineering THOMAS’ HTML — and the other half was spent creating the necessary framework, documentation, and general level of rigor for this to be a project that the community can invest in and rely on.

There will certainly be more work to come. THOMAS is shutting down in a year, to be replaced by Congress.gov. As Congress.gov grows to have the same level of data as THOMAS, we’ll gradually transition the scraper to use Congress.gov as its data source.

Was this data online before? What’s new?

Eric Mill: All of the data in this project has existed in an open way at GovTrack.us, which has provided bulk data downloads for years. The Sunlight Foundation and OpenCongress have both created applications based on this data, as have many other people and organizations.

This project was undertaken as a collaboration because Josh and I believed that the data was fundamental enough that it should exist in a public, owner-less commons, and that the code to generate it should be in the same place.

There are other benefits, too. Although the source code to GovTrack’s scrapers has been available, it depends on being embedded in GovTrack’s system, and the use of a database server. It was also written in Perl, a language less widely used today, and produced only XML. This new Python scraper has no other dependencies, runs without a database, and generates both JSON and XML. It can be easily extended to output other data formats.
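As a rough illustration of that design (one in-memory record, multiple serializations), a sketch along these lines would work; the field names and file names here are hypothetical, not the scraper's actual schema.

    import json
    import xml.etree.ElementTree as ET

    # A hypothetical bill record; the real dataset defines its own fields.
    bill = {
        "bill_id": "hr1-112",
        "official_title": "An example bill title",
        "sponsor": "A. Member",
    }

    # Write the record as JSON.
    with open("hr1-112.json", "w") as f:
        json.dump(bill, f, indent=2)

    # Write the same record as XML.
    root = ET.Element("bill")
    for key, value in bill.items():
        ET.SubElement(root, key).text = value
    ET.ElementTree(root).write("hr1-112.xml", encoding="utf-8", xml_declaration=True)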

Finally, everyone who worked on the project has had experience in dealing with legislative information. We were able to use that to make various improvements to how the data is structured and presented that make it easier for developers to use the data quickly and connect it to other data sources.

Searches for bills in Scout use data collected directly from this scraper. What else are people doing with the data?

Eric Mill: Right now, I only know for a fact that the Sunlight Foundation is using the data. GovTrack recently sent an email to its developer list announcing that in the near future, its existing dataset would be deprecated in favor of this new one, so the data should be used in GovTrack before long.

Pleasantly, I’ve found nearly nothing new by switching from GovTrack’s original dataset to this one. GovTrack’s data has always had a high level of quality. So far, the new dataset looks to be as good.

Is it common to host open data on Github?

Eric Mill: Not really. Github’s not designed for large-scale data hosting. This is an experiment to see whether this is a useful place to host it. The primary benefit is that no single person or organization (besides Github) is paying for download bandwidth.

The data is published as a convenience, for people to quickly download for analysis or curiosity. I expect that any person or project that intends to integrate the data into their work on an ongoing basis will do so by using the scraper, not downloading the data repeatedly from Github. It’s not our intent that anyone make their project dependent on the Github download links.

Laudably, Josh Tauberer donated his legislator dataset and converted it to YAML. What’s YAML?

Eric Mill: YAML is a lightweight data format intended to be easy for humans to both read and write. This dataset, unlike the one scraped from THOMAS, is maintained mostly through manual effort. Therefore, the data itself needs to be in source control, it needs to not be scary to look at and it needs to be obvious how to fix or improve it.

What’s in this legislator dataset? What can be done with it?

Eric Mill: The legislator dataset contains information about members of Congress from 1789 to the present day. It is a wealth of vital data for anyone doing any sort of application or analysis of members of Congress. This includes a breakdown of their name, a crosswalk of identifiers on other services, and social media accounts. Crucially, it also includes a member of Congress’ change in party, chamber, and name over time.

For example, it’s a pretty necessary companion to the dataset that our scraper gathers from THOMAS. THOMAS tells you the name of the person who sponsored this bill in 2003, and gives you a THOMAS-specific ID number. But it doesn’t tell you what that person’s party was at the time, or if the person is still a member of the same chamber now as they were in 2003 (or whether they’re in office at all). So if you want to say “how many Republicans sponsored bills in 2003,” or if you’d like to draw in information from outside sources, such as campaign finance information, you will need a dataset like the one that’s been publicly donated here.
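To see why that companion dataset matters in practice, here is a sketch of the kind of join Mill describes, assuming PyYAML is installed and using hypothetical file names and simplified field names rather than the datasets' actual schemas.

    import json
    import yaml  # PyYAML, an assumed dependency for this sketch

    # Hypothetical files: a YAML legislator dataset and a JSON dump of 2003 bills.
    with open("legislators.yaml") as f:
        legislators = yaml.safe_load(f)

    # Map each member's THOMAS ID to the party they held in mid-2003.
    party_in_2003 = {}
    for person in legislators:
        thomas_id = person.get("id", {}).get("thomas")
        for term in person.get("terms", []):
            if term.get("start", "") <= "2003-06-30" <= term.get("end", ""):
                party_in_2003[thomas_id] = term.get("party")

    with open("bills-2003.json") as f:
        bills = json.load(f)

    # Count bills whose sponsor was a Republican at the time.
    republican_sponsored = sum(
        1 for bill in bills
        if party_in_2003.get(bill.get("sponsor_thomas_id")) == "Republican"
    )
    print(republican_sponsored, "bills sponsored by Republicans in 2003")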

Sunlight’s API on members of Congress is easily the most prominent API, widely used by people and organizations to build systems that involve legislators. That API’s data is a tiny subset of this new one.

You moved a legal citation extractor and a U.S. Code parser into this code. What do they do here?

Eric Mill: The legal citation extractor, called “Citation,” plucks references to the US Code (and other things) out of text. Just about any system that deals with legal documents benefits from discovering links between those documents. For example, I use this project to power US Code searches on Scout, so that the site returns results that cite some piece of the law, regardless of how that citation is formatted. There’s no text-based search, simple or advanced, that would bring back results matching a variety of formats or matching subsections — something dedicated to the arcane craft of citation formats is required.

The citation extractor is built to be easy for others to invest in. It’s a stand-alone tool that can be used through the command line, HTTP, or directly through JavaScript. This makes it suitable for the front-end or back-end, and easy to integrate into a project written in any language. It’s very far from complete, but even now it’s already proven extremely useful at creating powerful features for us that weren’t possible before.
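Purely to illustrate the idea of pulling structured references out of free text (the Citation tool itself is a separate, stand-alone project), here is a toy Python sketch that recognizes one common U.S. Code citation format. The pattern and function names are assumptions for illustration, not the tool's real behavior.

    import re

    # A toy pattern for citations like "5 U.S.C. 552(b)(3)" or "44 USC 3501".
    # A real extractor would need to cover many more formats than this.
    USC_PATTERN = re.compile(
        r"(?P<title>\d+)\s+U\.?S\.?C\.?\s+(?:§+\s*)?(?P<section>\d+[a-z]?(?:\([^\s)]+\))*)",
        re.IGNORECASE,
    )

    def extract_usc_citations(text):
        """Return (title, section) pairs for U.S. Code citations found in text."""
        return [(m.group("title"), m.group("section")) for m in USC_PATTERN.finditer(text)]

    sample = "Exempt under 5 U.S.C. 552(b)(3) and 44 USC 3501."
    print(extract_usc_citations(sample))
    # [('5', '552(b)(3)'), ('44', '3501')]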

The parser for the U.S. Code itself is a dataset, written by my colleague Thom Neale. The U.S. Code is published by the government in various formats, but none of them are suitable for easy reuse. The Office of Law Revision Counsel, which publishes the U.S. Code, is planning on producing a dedicated XML version of the US Code, but they only began the procurement process recently. It could be quite some time before it appears.

Thom’s work parses the “locator code” form of the data, which is a binary format designed for telling GPO’s typesetting machines how to print documents. It is very specialized and very complicated. This parser is still in an early stage and not in use in production anywhere yet. When it’s ready, it’ll produce reliable JSON files containing the law of the United States in a sensible, reusable form.

Does Github’s organization structure make a data commons possible?

Eric Mill: Github deliberately aligns its interests with the open source community, so it is possible to host all of our code and data there for free. Github offers unlimited public repositories, collaborators, bandwidth, and disk space to organizations and users at no charge. They do this while being an extremely successful, profitable business.

On Github, there are two types of accounts: users and organizations. Organizations are independent entities, but no one has to log in as an organization or share a password. Instead, at least one user will be marked as the “owner” of an organization. Ownership can easily change hands or be distributed amongst various users. This means that Josh, Derek, and I can all have equal ownership of the “unitedstates” repositories and data. Any of us can extend that ownership to anyone we want in a simple, secure way, without password sharing.

Github as a company has established both a space and a culture that values the commons. All software development work, from hobbyist to non-profit to corporation, from web to mobile to enterprise, benefits from a foundation of open source code. Github is the best living example of this truth, so it’s not surprising to me that it was the best fit for our work.

Why is this important to the public?

Eric Mill: The work and artifacts of our government should be available in bulk, for easy download, in accessible formats, and without license restrictions. This is a principle that may sound important and obvious to every technologist out there, but it’s rarely the case in practice. When it is, it’s usually a mixed bag. Not every member of the public will be able or want to interact directly with our data or scrapers. That’s fine. Developers are the force multipliers of public information. Every citizen can benefit somehow from what a developer can build with government information.


November 26 2012

Investigating data journalism

Great journalism has always been based on adding context, clarity and compelling storytelling to facts. While the tools have improved, the art is the same: explaining the who, what, where, when and why behind the story. The explosion of data, however, provides new opportunities to think about reporting, analysis and publishing stories.

As you may know, there’s already a Data Journalism Handbook to help journalists get started. (I contributed some commentary to it). Over the next month, I’m going to be investigating the best data journalism tools currently in use and the data-driven business models that are working for news startups. We’ll then publish a report that shares those insights and combines them with our profiles of data journalists.

Why dig deeper? Getting to the heart of what’s hype and what’s actually new and noteworthy is worth doing. I’d like to know, for instance, whether tutorials specifically designed for journalists can be useful, as Joe Brockmeier suggested at ReadWrite. On a broader scale, how many data journalists are working today? How many will be needed? What are the primary tools they rely upon now? What will they need in 2013? Who are the leaders or primary drivers in the area? What are the most notable projects? What organizations are embracing data journalism, and why?

This isn’t a new interest for me, but it’s one I’d like to ground in more research. When I was offered an opportunity to give a talk at the second International Open Government Data Conference at the World Bank this July, I chose to talk about open data journalism and invited practitioners on stage to share what they do. If you watch the talk and the ensuing discussion in the video below, you’ll pick up great insight from the work of the Sunlight Foundation, the experience of Homicide Watch and why the World Bank is focused on open data journalism in developing countries.

The sites and themes that I explored in that talk will be familiar to Radar readers, focusing on the changing dynamic between the people formerly known as the audience and the editors, researchers and reporters who are charged with making sense of the data deluge for the public good. If you’ve watched one of my Ignites or my Berkman Center talk, much of this won’t be new to you, but the short talk should be a good overview of where I think this aspect of data journalism is going and why I think it’s worth paying attention to today.

For instance, at the Open Government Data Conference, Bill Allison talked about how open data creates government accountability and reveals political corruption. We heard from Chris Amico, a data journalist who created a platform to help a court reporter tell the story of every homicide in a city. And we heard from Craig Hammer about how the World Bank is working to build capacity in media organizations around the world to use data to show citizens how and where borrowed development dollars are being spent on their behalf.

The last point, regarding capacity, is a critical one. Just as McKinsey identified a gap between available analytic talent and the demand created by big data, there is a data science skills gap in journalism. Rapidly expanding troves of data are useless without the skills to analyze them, whatever the context. Too narrow a focus on tech skills could exclude the best candidates for these jobs — but there will need to be training to build those skills.

This reality hasn’t gone unnoticed by foundations or the academy. In May, the Knight Foundation gave Columbia University $2 million for research to help close the data science skills gap. (I expect to be talking to Emily Bell, Jonathan Stray and the other instructors and students.)

Media organizations must be able to put data to work, a need that was amply demonstrated during Hurricane Sandy, when public open government data feeds became critical infrastructure.

What I’d like to hear from you is what you see working around the world, from the Guardian to ProPublica, and what you’re working on, and where. To kick things off, I’d like to know which organizations are doing the most innovative work in data journalism.

Please weigh in through the comments or drop me a line at alex@oreilly.com or at @digiphile on Twitter.

November 02 2012

Charging up: Networking resources and recovery after Hurricane Sandy

Even though the direct danger from Hurricane Sandy has passed, lower Manhattan and many parts of Connecticut and New Jersey remain a disaster zone, with millions of people still without power, reduced access to food and gas, and widespread damage from flooding. As of yesterday, according to reports from the Wall Street Journal, thousands of residents remain in high-rise buildings with no water, power or heat.

E-government services are in heavy demand, from registering for disaster aid to finding resources, like those offered by the Office of the New York City Advocate. People who need to find shelter can use the Red Cross shelter app. FEMA has set up a dedicated landing page for Hurricane Sandy and a direct means to apply for disaster assistance.

Public officials have embraced social media during the disaster as never before, sharing information about where to find help.

No power and diminished wireless capacity, however, mean that the Internet is not accessible in many homes. In the post below, learn more about what you can do on the ground to help and how you can contribute online.

For those who have lost power, using Twitter offline to stay connected to those updates is useful — along with using weather radios.

That said, for those that can get connected on mobile devices, there are digital resources emerging, from a crowdsourced Sandy coworking map in NYC to an OpenTrip Planner app for navigating affected transit options. This Google Maps mashup shows where to find food, shelter and charging stations in Hoboken, New Jersey.

In these conditions, mobile devices are even more crucial connectors to friends, family, services, resources and information. With that shift, government websites must be more mobile-friendly and offer ways to get information through text messaging.

Widespread power outages also mean that sharing the means to keep devices charged is now an act of community and charity.

Ways to help with Sandy relief

A decade ago, if there was a disaster, you could donate money and blood. In 2012, you can also donate your time and skills. New York Times blogger Jeremy Zillar has compiled a list of hurricane recovery and disaster recovery resources. The conditions on the ground also mean that finding ways to physically help matter.

WNYC has a list of volunteer options around NYC. The Occupy Wall Street movement has shifted to “Occupy Sandy,” focusing on getting volunteers to help pick up and deliver food in neighborhoods around New York City. As Nick Judd reported for TechPresident, this “people-powered recovery” has volunteers processing incoming offers of help and requests for aid.

They’re working with Recovers.org, a new civic startup, which has now registered some 5,000 volunteers from around the New York City area. Recovers is pooling resources and supplies with community centers and churches to help in affected communities.

If you want to help but are far away from directly volunteering in New York, Connecticut or New Jersey, there are several efforts underway to volunteer online, including hackathons around the world tomorrow. Just as open government data feeds critical infrastructure during disasters, it is also integral to recovery and relief. To make that data matter to affected populations, however, the data must be put to use. That’s where the following efforts come in.

“There are a number of ways tech people can help right now,” commented Gisli Olafsson, Emergency Response Director at NetHope, reached via email. “The digital volunteer communities are coordinating many of those efforts over a Skype chat group that we established a few days before Sandy arrived. I asked them for input and here are their suggestions:

  1. Sign up and participate in the crisis camps that are being organized this weekend at Geeks Without Borders and Sandy Crisis Camp.
  2. Help create visualizations and fill in the map gaps. Here is a link to all the maps we know about so far. Help people find out what map to look at for x,y,z.
  3. View damage photos to help rate damage assessments at Sandy OpenStreetMap. There are over 2000 images to identify and so far over 1000 helpers.”

Currently, there are Crisis Camps scheduled for Boston, Portland, Washington (DC), Galway (Ireland), San Francisco, Seattle, Auckland (NZ) and Denver, at RubyCon.

“If you are in any of those cities, please go the Sandy CrisisCamp blog post and sign up for the EventBrite for the CrisisCamp you want to attend in person or virtually,” writes Chad Catacchio (@chadcat), Crisis Commons communication lead.

“If you want to start a camp in your city this weekend, we are still open to the idea, but time is running short (it might be better to aim for next week),” he wrote.

UPDATE: New York-based nonprofit DataKind tweeted that they’re trying to rally the NY Tech community to pitch in real life on Saturday and linked to a new Facebook group. New York’s tech volunteers have already been at work helping city residents over the last 24 hours, with the New York Tech Meetup organizing hurricane recovery efforts.

People with technical skills in the New York area who want to help can volunteer online here and check out the NY Tech responds blog.

As Hurricane Sandy approached, hackers built tools to understand the storm. Now that it’s passed, “Hurricane Hackers” are working on projects to help with the recovery. The crisis camp in Boston will be hosted at the MIT Media Lab by Hurricane Hackers this weekend.

Sandy Crisis Camps already have several projects in the works. “We have been asked by FEMA to build and maintain a damage assessment map for the entire state of Rhode Island,” writes Catacchio. He continues:

“We will also be assisting in monitoring social media and other channels and directing reports to FEMA there. We’ll be building the map using ArcGIS and will be needing a wide range of skill sets from developers to communications to mapping. Before the weekend, we could certainly use some help from ArcGIS folks in getting the map ready for reporting, so if that is of interest, please email Pascal Schuback at pascal@crisiscommons.org. Secondly, there has been an ask by NYU and the consortium of colleges in NYC to help them determine hotel capacity/vacancy as well as gas stations that are open and serving fuel. If other official requests for aid come in, we will let the community know. Right now, we DO anticipate more official requests, and again, if you are working with the official response/recovery and need tech support assistance, please let us know: email either Pascal or David Black at david@crisiscommons.org. We are looking to have a productive weekend of tackling real needs to help the helpers on the ground serving those affected by this terrible storm.”


October 03 2012

The missing ingredient from hyperwired debates: the feedback loop

What a difference a season makes. A few months after widespread online frustration with a tape-delayed Summer Olympics, the 2012 Presidential debates will feature the most online livestreams and wired, up-to-the-second digital coverage in history.

Given the pace of technological change, it’s inevitable that each election season will bring with it new “firsts,” as candidates and campaigns set precedents by trying new approaches and platforms. This election has been no different: the Romney and Obama campaigns have been experimenting with mobile applications, social media, live online video and big data all year.

Tonight, one of the biggest moments in the presidential campaign to date is upon us and there are several new digital precedents to acknowledge.

The biggest tech news is that YouTube, in a partnership with ABC, will stream the debates online for the first time. The stream will be on YouTube’s politics channel, and it will be embeddable.

With more and more livestreamed sports events, concerts and now debates available online, tuning in to what’s happening no longer means passively “watching TV.” The number of other ways people can tune in online in 2012 has skyrocketed, as you can see in GigaOm’s post listing debate livestreams or Mashable’s ways to watch the debates online.

This year, in fact, the biggest challenge people will have will not be finding an online alternative to broadcast or cable news but deciding which one to watch.

If you’re low on bandwidth or have a mobile device, NPR will stream the audio from the debate online and to its mobile apps. If you’re a Spanish speaker, Univision will stream the debates on YouTube with real-time translation.

The New York Times, Politico and the Wall Street Journal are all livestreaming the debates at their websites or through their apps, further eroding the line between broadcast, print and online media.

While the PBS News Hour and CSPAN’s debate hub are good options, my preference is for the Sunlight Foundation’s award-winning Sunlight Live liveblog.

There are a couple of other notable firsts. The Huffington Post will deploy its HuffPost Live platform for the first time, pulling more viewers directly into participatory coverage online.

For those looking for a more… animated approach, the Guardian and Tumblr will ‘live GIF’ the presidential debates.

Microsoft is livestreaming the debates through the Xbox, giving gamers an opportunity to weigh in on what they see. They’ll be polled through the Xbox console during the debate, which will provide more real-time data from a youthful demographic that, according to StrategyOne, still has many voters who are not firmly committed.

Social politics

The political news cycle has long since moved from the morning papers and the nightly news to real-time coverage of events. In past years, the post-debate spin by campaigns and pundits shaped public opinion. This year, direct access to online video and to the reaction of friends, family, colleagues and media through the social web means that the spin will begin as soon as any quip, policy position or rebuttal is delivered in the debate.

Beyond real-time commentary, social media will provide useful data for the campaigns to analyze. While there won’t be a “do over,” seeing what resonated directly with the public will help the campaigns tune their messages for the next debates.

Tonight, when I go on Al Jazeera’s special debate night coverage at The Stream, I’ll be looking at a number of factors. I expect the #DenverDebate and #debates hashtags to be moving too fast to follow, so I’ll be looking at which tweets are being amplified and what we can see on Twitter’s new #debates page, what images are popping online, which links are popular, how Facebook and Google+ are reacting, and what people are searching for on Google.com.

This is quite likely to be the most social political event ever, surpassing either of the 2012 political conventions or the State of the Union address. When I watch online, I’ll be looking for what resonated with the public, not just what the campaigns are saying — although that will factor into my analysis. The @mittromney account tweets 1-2 times a day. Will they tweet more? Will @barackobama’s 19 million followers be engaged? How much and how often will they update Facebook, and to what effect?

Will they live tweet opening statements with links to policies? Will they link to rebuttals or fact checks in the media? Will they push people to go register or comment or share? Will they echo applause lines or attack lines? In a larger sense, will the campaigns act social, themselves? Will they reshare the people’s posts about them on social platforms or keep broadcasting?

We’ll know answers to all of these questions in a few hours.

Fact-checking in real-time

Continuing a trend from the primary season, real-time fact-checking will play a role in the debate. The difference in this historic moment will be the pace of it and the number of players.

As Nick Judd highlighted at techPresident, the campaign response is going to be all about mobile. Both campaigns will be trying their hands at fact checking, using new adaptive microsites at barackobama.com/debate and debates.mittromney.com, dedicated Twitter accounts at @TruthTeam2012 and @RomneyResponse, and an associated subdomain and Tumblr.

Given the skin that campaigns have in the game, however, undecided or wavering voters are better off going with the Fourth Estate versions. Wired media organizations, like the newspapers streaming the debates I’ve listed above, will be using liveblogs and leveraging their digital readership to help fact check.

Notably, NPR senior social strategist Andy Carvin will be applying the same approach to fact checking during the debate as he has to covering the changes in the Middle East. To participate, follow @acarvin and use the #factcheck hashtag beginning at 8:30 ET.

It’s unclear whether debate moderator Jim Lehrer will tap into the fact-checking efforts online to push back on the candidates during the event. Then again, the wisdom of the crowds may be balanced by one man’s perspective. Given that he’s serving in that capacity for the 12th time, Lehrer possesses substantial experience of his own to draw upon in making his own decisions about when to press, challenge or revisit issues.

The rise of networked polities

In a larger sense, all of this interactivity falls far short of the promise of networked politics in the Internet age, where television debates look antiquated.

When it comes to how much the people are directly involved with the presidential debates of 2012, as Micah Sifry argued earlier this week, little has changed from 2008:

“Google is going to offer some kind of interactive audience dial gadget for YouTube users, which could allow for real-time audience feedback — except it’s already clear none of that feedback is going to get anywhere near the actual debate itself. As best as I can tell, what the CPD [Commission on Presidential Debates] is doing is little more than what they did four years ago, except back then they partnered with Myspace on a site called MyDebates.org that featured video streaming, on-demand playback and archival material. Oh, but this time the partner sites will include a dynamic counter showing how many people have ‘shared their voice’.”

While everyone who has access to the Internet will be able to use multiple screens to watch, read and participate in the conversation around the debates, the public isn’t going to be directly involved in the debate. That’s a missed opportunity that won’t be revisited until the 2016 campaign.

By then, it will be an even more wired political landscape. While many politicians are still delegating the direct use of social media to staffers, in late 2012 it ill behooves any officeholder to be seen as technically backward or to stay off these platforms entirely.

In the years ahead, open government advocates will push politicians to use the Internet to explain their votes, not just broadcast political attacks or campaign events. After all, the United States is a constitutional republic. Executives and Congressmen are obligated to listen to the people they represent. The existing ecosystem of social media platforms may give politicians new tools to interact directly with their constituents but they’re still relatively crude.

Yes, the next generation of social media data analytics will give politicians a dashboard of what their constituents think about their positions. It’s the next generation of polling. In the years to come, however, I’m optimistic that we’re going to see much better use of the Internet to hold politicians accountable for their campaign positions and subsequent votes.

Early experiments in creating an “OKCupid for elections” will evolve. Expect sophisticated choice engines that use social and legislative data to tell voters not only whether candidates share their positions but whether they actually voted or acted upon them. Over time, opposition candidates will be able to use that accumulated data in their campaign platforms and during debates. If a member of Congress or President doesn’t follow through with the wishes of the people, he or she will have to explain why. That will be a debate worth having.

September 20 2012

Congress launches Congress.gov in beta, doesn’t open the data

The Library of Congress is now more responsive — at least when it comes to web design. Today, the nation’s repository for its laws launched a new beta website at Congress.gov and announced that it would eventually replace Thomas.gov, the 17-year-old website that represented one of the first significant forays online for Congress. The new website will educate the public, including people looking for information on their mobile devices, about the lawmaking process, but it falls short of the full promise of embracing the power of the Internet. (More on that later.)

Tapping into a growing trend in government new media, the new Congress.gov features responsive design, adapting to desktop, tablet or smartphone screens. It’s also search-centric: in an acknowledgement that most of its visitors show up looking for information, it offers Boolean search and puts a search field front and center in the interface. The site includes member profiles for U.S. Senators and Representatives, with associated legislative work. In a nod to a mainstay of social media and media websites, the new Congress.gov also has a “most viewed bills” list that lets visitors see at a glance what laws or proposals are gathering interest online. (You can download a fact sheet on all the changes as a PDF.)

On the one hand, the new Congress.gov is a dramatic update to a site that desperately needed one, particularly in a historic moment where citizens are increasingly connecting to the Internet (and one another) through their mobile devices.

On the other hand, the new Congress.gov beta has yet to realize the potential of Congress publishing bulk open legislative data. There is no application programming interface (API) for open government developers to build upon. In many ways, the new Congress.gov replicates what was already available to the public at sites like Govtrack.us and OpenCongress.org.

In response to my tweets about the site, former law librarian Meg Lulofs Kuhagan (@librarylulu) noted on Twitter that there’s “no data whatsoever, just window dressing” in the new site — but that “it looks good on my phone. More #opengov if you have a smartphone.”

Aaron E. Myers, the director of new media for Senate Majority Leader Harry Reid, commented on Twitter that legislative data is a “tough nut to crack,” with the text of amendments, SCOTUS votes and treaties missing from the new Congress.gov. In reply, Chris Carlson, the creative director for the Library of Congress, tweeted that that information is coming soon and that all the data that is currently in Thomas.gov will be available on Congress.gov.

Emi Kolawole, who reviewed the new Congress.gov for the Washington Post, reported that more information, including the categories Myers cited and the Congressional Record and Index, will be coming to the site during its beta. Here’s hoping that Congress decides to publish all of its valuable Congressional Research Service reports, too. Currently, the public has to turn to OpenCRS.com to access that research.

Carlson was justifiably proud of the beta of Congress.gov: “The new site has clean URLs, powerful search, member pages, clean design,” he tweeted. “This will provide access to so many more people who only have a phone for internet.”

While the new Congress.gov is well designed and has the potential to lead to more informed citizens, the choice to build a new website versus release the data disappointed some open government advocates.

“Another hilarious/clueless misallocation of resources,” commented David Moore, co-founder of OpenCongress. “First liberate bulk open gov data; then open API; then website.”

“What’s noticeable about this evolving beta website, besides the major improvements in how people can search and understand legislative developments, is what’s still missing: public comment on the design process and computer-friendly bulk access to the underlying data,” wrote Daniel Schuman, legislative counsel for the Sunlight Foundation. “We hope that Congress will now deeply engage with the public on the design and specifications process and make sure that legislative information is available in ways that most encourage analysis and reuse.”

Kolawole asked Congressional officials about bulk data access and an API and heard that the capacity is there but the approval is not. “They said the system could handle it, but they haven’t received congressional auth. to do it yet,” she tweeted.

Vision and bipartisan support for open government on this issue do exist among Congressional leadership. There has been progress on this front in the 112th Congress: the U.S. House started publishing machine-readable legislative data at docs.house.gov this past January.

“Making legislative data easily available in machine-readable formats is a big victory for open government, and another example of the new majority keeping its pledge to make Congress more open and accountable,” said Speaker of the House John Boehner.
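To illustrate what “machine-readable” buys developers in practice, here is a minimal sketch of pulling and walking one of the XML documents a site like docs.house.gov publishes. The feed URL is a placeholder, and the code deliberately avoids assuming specific element names, since the exact schema isn’t described in this post.

```python
# A minimal sketch of consuming machine-readable legislative XML of the kind
# published at docs.house.gov. The URL below is an illustrative placeholder,
# and the generic tree walk avoids assuming the site's actual schema.
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://docs.house.gov/floor/example-week.xml"  # placeholder

with urllib.request.urlopen(FEED_URL) as response:
    tree = ET.parse(response)

# Print every element that carries text, truncated for readability.
for element in tree.iter():
    if element.text and element.text.strip():
        print(element.tag, ":", element.text.strip()[:80])
```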

Last December, House Minority Whip Steny Hoyer commented on how technology is affecting Congress, his caucus and open government in the executive branch:

For Congress, there is still a lot of work to be done, and we have a duty to make the legislative process as open and accessible as possible. One thing we could do is make THOMAS.gov — where people go to research legislation from current and previous Congresses — easier to use, and accessible by social media. Imagine if a bill in Congress could tweet its own status.

The data available on THOMAS.gov should be expanded and made easily accessible by third-party systems. Once this happens, developers, like many of you here today, could use legislative data in innovative ways. This will usher in new public-private partnerships that will empower new entrepreneurs who will, in turn, yield benefits to the public sector.

For any of that vision of civic engagement and entrepreneurship to happen around the Web, the Library of Congress will need to fully open up the data. Why hasn’t it happened yet, given bipartisan support and a letter from the Speaker of the House?

techPresident managing editor Nick Judd asked the Library of Congress about Congress.gov. Gayle Osterberg, the Library of Congress’s director of communications, suggested in an emailed response that Congress hasn’t been clear about how the data should be released.

“Congress has said what to do on bulk access,” commented Schuman. “See the joint explanatory statement. There is support for bulk access.”

In June 2012, the House’s leadership issued a bipartisan statement that adopted the goal of “provid[ing] bulk access to legislative information to the American people without further delay,” put releasing bulk data among its “top priorities in the 112th Congress,” and directed a task force “to begin its important work immediately.”

The 112th Congress will come to a close soon. The Republicans swept into the House in 2010 promising a new era of innovation and transparency. If Speaker Boehner, Rep. Hoyer and their colleagues want to end these two divisive years on a high note, fully opening legislative data to the People would be an enduring legacy. Congressional leaders will need to work with the Library of Congress to make that happen.

All that being said, the new Congress.gov is in beta and looks dramatically improved. The digital infrastructure of the federal legislative system got a bit better today, moving towards a more adaptive government. Stay tuned, and give the Library of Congress (@LibraryCongress) some feedback: there’s a new button for it on every page.

This post has been updated with comments from Facebook, a link and reporting from techPresident, and a clarification from Daniel Schuman regarding the position of the House of Representatives.

August 29 2012

President Obama participates in first Presidential AMA on Reddit

Starting around 4:30 PM ET today, President Barack Obama made history by going onto Reddit to answer questions about anything for an hour. Reddit, one of the most popular social news sites on the Internet, has been hosting “Ask Me Anything” forums, or AMAs, for years, including sessions with prominent legislators like Representative Darrell Issa (R-CA), but hosting a sitting President of the United States will elevate Reddit’s prominence at the intersection of technology and politics. AllThingsD has the story of how Reddit got the President onto the site. Reddit co-founder Alexis Ohanian told Peter Kafka that “There are quite a few redditors at 1600 Pennsylvania Ave and at the campaign HQ — given the prominence of reddit, it’s an easy sell.”

President Obama made some news in the process, with respect to the Supreme Court decision that allowed super political action committees, or “Super PACs,” to become part of the campaign finance landscape.

“Over the longer term, I think we need to seriously consider mobilizing a constitutional amendment process to overturn Citizens United (assuming the Supreme Court doesn’t revisit it),” commented President Obama. “Even if the amendment process falls short, it can shine a spotlight of the super-PAC phenomenon and help apply pressure for change.”

President Obama announced that he’d be participating in the AMA in a tweet and provided photographic evidence that he was actually answering questions in an image posted to Reddit and in a second tweet during the session.

The timing of the AMA was at least a little political, coming after a speech in Virginia and falling upon the third day of the Republican National Convention, but it is unequivocally a first, in terms of a president directly engaging with the vibrant Reddit community. Many people also tweeted that they were having trouble accessing the page during the AMA, as tens of thousands of users tried to access the forum. According to The Verge, President Obama’s AMA was the most popular post in Reddit’s history, with more than 200,000 visitors on the site concurrently. (Presidential Q&As apparently melt servers almost as much as being Biebered.)

Today’s AMA is only the latest example of presidents experimenting with online platforms, from President Clinton and President Bush posting text on WhiteHouse.gov to President Obama’s team rebooting that platform on Drupal. More recently, President Obama has participated in a series of online ‘town halls’ using social media, including Twitter, Facebook, LinkedIn and the first presidential Hangout on Google+.

His use of all of them deserves to be analyzed critically, in terms of whether the platforms and events were being used to burnish the credentials of a tech-savvy chief executive in an election year or to genuinely answer the questions and concerns of the citizens he serves.

In analyzing the success of such experiments in digital democracy, it’s worth looking at whether the questions answered were the ones most citizens wanted to see asked (on Reddit, counted by upvotes) and whether the answers given were rehashed talking points or specific to the intent of the questions asked. On the first part of that rubric, President Obama scored high: he answered each of the top-voted questions in the AMA, along with a few personal ones.
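That first part of the rubric is easy to check programmatically: any Reddit thread can be read as JSON by appending “.json” to its URL, which makes it simple to rank the top-level questions by score and compare them with the answers given. A minimal sketch follows; the thread URL is a placeholder, and a descriptive User-Agent header is included because Reddit tends to throttle anonymous clients.

```python
# A minimal sketch: rank an AMA thread's top-level questions by score using
# Reddit's public JSON view (append ".json" to the thread URL). The thread
# URL below is a placeholder, not the actual AMA permalink.
import json
import urllib.request

THREAD_URL = "https://www.reddit.com/r/IAmA/comments/EXAMPLE/.json"  # placeholder

request = urllib.request.Request(
    THREAD_URL, headers={"User-Agent": "ama-rubric-sketch/0.1"}
)
with urllib.request.urlopen(request) as response:
    post_listing, comment_listing = json.load(response)

questions = [
    child["data"]
    for child in comment_listing["data"]["children"]
    if child.get("kind") == "t1"  # keep real comments, skip "load more" stubs
]
for q in sorted(questions, key=lambda c: c.get("score", 0), reverse=True)[:10]:
    print(q.get("score"), "-", q.get("body", "")[:80].replace("\n", " "))
```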

 

On the rest of those counts, you can judge for yourself. The president’s answers are below:

“Hey everybody – this is barack. Just finished a great rally in Charlottesville, and am looking forward to your questions. At the top, I do want to say that our thoughts and prayers are with folks who are dealing with Hurricane Isaac in the Gulf, and to let them know that we are going to be coordinating with state and local officials to make sure that we give families everything they need to recover.”

On Internet freedom: “Internet freedom is something I know you all care passionately about; I do too. We will fight hard to make sure that the internet remains the open forum for everybody – from those who are expressing an idea to those to want to start a business. And although their will be occasional disagreements on the details of various legislative proposals, I won’t stray from that principle – and it will be reflected in the platform.”

On space exploration: “Making sure we stay at the forefront of space exploration is a big priority for my administration. The passing of Neil Armstrong this week is a reminder of the inspiration and wonder that our space program has provided in the past; the curiosity probe on mars is a reminder of what remains to be discovered. The key is to make sure that we invest in cutting edge research that can take us to the next level – so even as we continue work with the international space station, we are focused on a potential mission to a asteroid as a prelude to a manned Mars flight.”

On helping small businesses and relevant bills: “We’ve really focused on this since I came into office – 18 tax cuts for small business, easier funding from the SBA. Going forward, I want to keep taxes low for the 98 percent of small businesses that have $250,000 or less in income, make it easier for small business to access financing, and expand their opportunities to export. And we will be implementing the Jobs Act bill that I signed that will make it easier for startups to access crowd-funding and reduce their tax burden at the start-up stage.”

Most difficult decision you had to make this term? “The decision to surge our forces in afghanistan. Any time you send our brave men and women into battle, you know that not everyone will come home safely, and that necessarily weighs heavily on you. The decision did help us blunt the taliban’s momentum, and is allowing us to transition to afghan lead – so we will have recovered that surge at the end of this month, and will end the war at the end of 2014. But knowing of the heroes that have fallen is something you never forget.”

On the influence of money in politics: “Money has always been a factor in politics, but we are seeing something new in the no-holds barred flow of seven and eight figure checks, most undisclosed, into super-PACs; they fundamentally threaten to overwhelm the political process over the long run and drown out the voices of ordinary citizens. We need to start with passing the Disclose Act that is already written and been sponsored in Congress – to at least force disclosure of who is giving to who. We should also pass legislation prohibiting the bundling of campaign contributions from lobbyists. Over the longer term, I think we need to seriously consider mobilizing a constitutional amendment process to overturn Citizens United (assuming the Supreme Court doesn’t revisit it). Even if the amendment process falls short, it can shine a spotlight of the super-PAC phenomenon and help apply pressure for change.”

On prospects for recent college grads – in this case, a law school grad: “I understand how tough it is out there for recent grads. You’re right – your long term prospects are great, but that doesn’t help in the short term. Obviously some of the steps we have taken already help young people at the start of their careers. Because of the health care bill, you can stay on your parent’s plan until you’re twenty six. Because of our student loan bill, we are lowering the debt burdens that young people have to carry. But the key for your future, and all our futures, is an economy that is growing and creating solid middle class jobs – and that’s why the choice in this election is so important. The other party has two ideas for growth – more taxs cuts for the wealthy (paid for by raising tax burdens on the middle class and gutting investments like education) and getting rid of regulations we’ve put in place to control the excesses on wall street and help consumers. These ideas have been tried, they didnt work, and will make the economy worse. I want to keep promoting advanced manufacturing that will bring jobs back to America, promote all-American energy sources (including wind and solar), keep investing in education and make college more affordable, rebuild our infrastructure, invest in science, and reduce our deficit in a balanced way with prudent spending cuts and higher taxes on folks making more than $250,000/year. I don’t promise that this will solve all our immediate economic challenges, but my plans will lay the foundation for long term growth for your generation, and for generations to follow. So don’t be discouraged – we didn’t get into this fix overnight, and we won’t get out overnight, but we are making progress and with your help will make more.”

First thing he’ll do on November 7th: “Win or lose, I’ll be thanking everybody who is working so hard – especially all the volunteers in field offices all across the country, and the amazing young people in our campaign offices.”

How do you balance family life and hobbies with being POTUS? “It’s hard – truthfully the main thing other than work is just making sure that I’m spending enough time with michelle and the girls. The big advantage I have is that I live above the store – so I have no commute! So we make sure that when I’m in DC I never miss dinner with them at 6:30 pm – even if I have to go back down to the Oval for work later in the evening. I do work out every morning as well, and try to get a basketball or golf game in on the weekends just to get out of the bubble. Speaking of balance, though, I need to get going so I’m back in DC in time for dinner. But I want to thank everybody at reddit for participating – this is an example of how technology and the internet can empower the sorts of conversations that strengthen our democracy over the long run. AND REMEMBER TO VOTE IN NOVEMBER – if you need to know how to register, go to Gottaregister.com. By the way, if you want to know what I think about this whole reddit experience – NOT BAD!”

On the White House homebrew recipe: “It will be out soon! I can tell from first hand experience, it is tasty.”

A step forward for digital democracy?

The most interesting aspect of that Presidential Hangout was that it introduced the possibility of unscripted moments, where a citizen could ask an unexpected question, and the opportunity for followups, if an answer wasn’t specific enough.

Reddit doesn’t provide quite the same mechanism for accountability as a live Hangout, in terms of putting an elected official on the spot to answer. Unfortunately, the platform itself falls short here: there’s no way to force a politician to circle back and give a better answer, in the way, say, Mike Wallace might have on “60 Minutes.”

Alexis Madrigal, one of the sharpest observers of technology and society currently gracing the pages of the Atlantic, is clear about the issues with a Reddit AMA: “it’s a terrible format for extracting information from a politician.”

Much as many would like to believe that the medium determines the message, a modern politician is never unmediated. Not in a pie shop in Pennsylvania, not at a basketball game, not while having dinner, not on the phone with NASA, not on TV, not doing a Reddit AMA. Reddit is not a mic accidentally left on during a private moment. The kind of intimacy and honesty that Redditors crave does not scale up to national politics, where no one ever lets down his or her guard. Instead of using the stiffness and formality of the MSM to drive his message home, Obama simply used the looseness and casual banter of Reddit to drive his message home. Here more than in almost anything else: Tech is not the answer to the problems of modern politics.

Today’s exchange, however, does hint at the tantalizing dynamic that makes it alluring: that the Internet is connecting you and your question to the most powerful man in the world, directly, and that your online community can push for him to answer it.

President Obama ended today’s AMA by thanking everyone on Reddit for participating and wrote that “this is an example of how technology and the internet can empower the sorts of conversations that strengthen our democracy over the long run.”

Well, it’s a start. Thank you for logging on today, Mr. President. Please come back online and answer some more follow up questions.


August 13 2012

With new maps and apps, the case for open transit gets stronger

Earlier this year, the news broke that Apple would be dropping default support for transit in iOS 6. For people (like me) who use the iPhone to check transit routes and times when they travel, that would mean losing a key feature. It also has the potential to decrease the demand for open transit data from cities, which has open government advocates like Clay Johnson concerned about public transportation and iOS 6.

This summer, New York City-based non-profit Open Plans launched a Kickstarter campaign to fund a new iPhone transit app to fill in the gap.

“From the public perspective, this campaign is about putting an important feature back on the iPhone,” wrote Kevin Webb, a principal at Open Plans, via email. “But for those of us in the open government community, this is about demonstrating why open data matters. There’s no reason why important civic infrastructure should get bound up in a fight between Apple and Google. And in communities with public GTFS, it won’t.”

Open Plans already had a head start in creating a patch for the problem: they’ve been working with transit agencies over the past few years to build OpenTripPlanner, an open source application that uses open transit data to help citizens make transit decisions.

“We were already working on the back-end to support this application but decided to pursue the app development when we heard about Apple’s plans with iOS,” explained Webb. “We were surprised by the public response around this issue (the tens of thousands who joined Walkscore’s petition and wanted to offer a constructive response).”

Crowdfunding digital city infrastructure?

That’s where Kickstarter and crowdfunding come into the picture. The Kickstarter campaign would help Open Plans make OpenTripPlanner a native iPhone app, followed by Android and HTML5 apps down the road. Open Plans’ developers have decided that given mobile browser limitations in iOS, particularly the speed of JavaScript apps, an HTML5 app isn’t a replacement for a native app.

Kickstarter has emerged as a platform for more than backing ideas for cool iPod watches or services. Increasingly, it’s looking like Kickstarter could be a new way for communities to collectively fund the creation of civic apps or services for their towns that government isn’t agile enough to deliver for them. While that’s sure to make some people in traditional positions of power uneasy, it also might be a way to do an end-around traditional procurement processes — contingent upon cities acting as platforms for civic startups to build upon.

“We get foundation and agency-based contract support for our work already,” wrote Webb. “However, we’ve discovered that foundations aren’t interested in these kinds of rider-facing tools, and most agencies don’t have the discretion or the budget to support the development of something universal. As a result, these kinds of projects require speculative investment. One of the awesome things about open data is that it lets folks respond directly and constructively by building something to solve a need, rather than waiting on others to fix it for them.

“Given our experience with transit and open data, we knew that this was a solvable problem; it just required someone to step up to the challenge. We were well positioned to take on that role. However, as a non-profit, we don’t have unlimited resources, so we’d ask for help. Kickstarter seems like the right fit, given the widespread public interest in the problem, and an interesting way to get the message out about our perspective. Not only do we get to raise a little money, but we’re also sharing the story about why open data and open source matter for public infrastructure with a new audience.”

Civic code in active re-use

Webb, who has previously staked out a position that iOS 6 will promote innovation in public transit, says that OpenTripPlanner is already a thriving open source project, with a recent open transit launch in New Orleans, a refresh in Portland and other betas soon to come.

In a welcome development for DC cyclists (including this writer), a version of OpenTripPlanner went live recently at BikePlanner.org. The web app, which notably uses OpenStreetMap as a base layer, lets users either plot a course for their own bike or tap into the Capital Bikeshare network in DC. BikePlanner is a responsive HTML5 app, which means that it looks good and works well on a laptop, iPad, iPhone or Android device.
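Under the hood, a front end like BikePlanner.org typically just calls an OpenTripPlanner routing endpoint and renders the itineraries it gets back. Here is a minimal sketch of such a request; the host is a placeholder, and the exact API path and parameters vary across OpenTripPlanner versions and deployments, so treat the URL as an assumption.

```python
# A minimal sketch of requesting a bicycle itinerary from an OpenTripPlanner
# instance. The host is a placeholder; the path and parameters differ between
# OTP versions and deployments.
import json
import urllib.parse
import urllib.request

BASE = "http://otp.example.org/otp/routers/default/plan"  # placeholder
params = {
    "fromPlace": "38.8895,-77.0353",  # origin as "lat,lon"
    "toPlace": "38.9072,-77.0369",    # destination as "lat,lon"
    "mode": "BICYCLE",
}

with urllib.request.urlopen(BASE + "?" + urllib.parse.urlencode(params)) as response:
    plan = json.load(response).get("plan", {})

for itinerary in plan.get("itineraries", []):
    minutes = itinerary.get("duration", 0) / 60
    print(f"~{minutes:.0f} min, {len(itinerary.get('legs', []))} leg(s)")
```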

Focusing on just open transit apps, however, would be to miss the larger picture of new opportunities to build improvements to digital city infrastructure.

There’s a lot more at stake than just rider-facing tools, in Webb’s view — from urban accessibility to extending the GTFS data ecosystem.

“There’s a real need to build a national (and eventually international) transit data infrastructure,” said Webb. “Right now, the USDOT has completely fallen down on the job. The GTFS support we see today is entirely organic, and there’s no clear guidance anywhere about making data public or even creating GTFS in the first place. That means building universal apps takes a lot of effort just wrangling data.”
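Part of the reason GTFS support has spread organically is that a feed is nothing more exotic than a zip archive of CSV text files, so any agency’s data can be inspected with a few lines of code. A minimal sketch, assuming a feed has already been unzipped into the working directory:

```python
# A minimal sketch of reading two core GTFS files (stops.txt and routes.txt)
# from an unzipped agency feed in the current directory.
import csv

with open("stops.txt", newline="", encoding="utf-8-sig") as f:
    stops = list(csv.DictReader(f))

with open("routes.txt", newline="", encoding="utf-8-sig") as f:
    routes = list(csv.DictReader(f))

print(f"{len(stops)} stops, {len(routes)} routes")

# Rough bounding box of the service area from stop coordinates.
lats = [float(s["stop_lat"]) for s in stops if s.get("stop_lat")]
lons = [float(s["stop_lon"]) for s in stops if s.get("stop_lon")]
print("lat:", min(lats), "to", max(lats))
print("lon:", min(lons), "to", max(lons))
```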

August 03 2012

Palo Alto looks to use open data to embrace ‘city as a platform’

In the 21st century, one of the strategies cities around the world are embracing to improve services, increase accountability and stimulate economic activity is to publish open data online. The vision for New York City as a data platform earned wider attention last year, when the Big Apple’s first chief digital officer, Rachel Sterne, pitched the idea to the public.

This week, the city of Palo Alto in California joined more than a dozen cities around the United States and the globe when it launched its own open data platform. The platform includes an application programming interface (API) that enables direct access through a RESTful interface to open government data published in JSON format. Datasets can also be embedded elsewhere on the web, much like YouTube videos.
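The Junar endpoints themselves aren’t documented in this post, so the sketch below is a generic illustration of consuming a RESTful open data endpoint that returns JSON; the URL and record structure are placeholders, not Palo Alto’s actual API.

```python
# A minimal, generic sketch of consuming a RESTful open data endpoint that
# returns JSON, as described for Palo Alto's platform. The URL is a
# placeholder, and the endpoint is assumed to return a JSON array of records.
import json
import urllib.request

DATASET_URL = "https://data.example.gov/api/datasets/city-tree-locations.json"  # placeholder

with urllib.request.urlopen(DATASET_URL) as response:
    records = json.load(response)

print(f"Fetched {len(records)} records")
for record in records[:5]:
    print(record)
```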

“We’re excited to bring the value of Open Data to our community. It is a natural complement to our goal of becoming a leading digital city and a connected community,” said James Keene, Palo Alto City Manager, in a prepared statement. “By making valuable datasets easily available to our residents, we’re further removing the barriers to a more inclusive and transparent local government here in Palo Alto.”

The city initially published open datasets that include the 2010 census data, pavement condition, city tree locations, park locations, bicycle paths and hiking trails, creek water level, rainfall and utility data. Open data about Palo Alto budgets, campaign finance, government salaries, regulations, licensing, or performance — which would all offer more insight into traditional metrics for government accountability — were not part of this first release.

“We are delighted to work with a local, innovative Silicon Valley start-up,” said Dr. Jonathan Reichental, Palo Alto’s chief information officer, in a prepared statement. (Junar’s U.S. offices are in Palo Alto.) “Rather than just publishing lists of datasets, the cloud-based Junar platform has enhancement and visualization capabilities that make the data useful even before it is downloaded or consumed by a software application.”

Notably, the city chose to use Junar, a Chilean software company that raised $1.2 million in funding in May 2012. Junar provides data access in the cloud through the software-as-a-service model. There’s now a more competitive marketplace for open data platforms than has existed in years past, with a new venture-backed startup joining the space.

“The City of Palo Alto joins a group of forward-thinking organizations that are using Open Data as a foundation for more efficient delivery of services, information, and enabling innovation,” said Diego May, CEO and co-founder of Junar, in a prepared statement. “By opening data with the Junar Platform, the City of Palo Alto is exposing and sharing valuable data assets and is also empowering citizens to use and create new applications and services.”

The success or failure of Palo Alto’s push to become a more digital city might be more fairly judged in a year, when measuring downstream consumption of its open data in applications and services by citizens — or by government in increasing productivity — will be possible.

In the meantime, Reichental (who may be familiar to Radar readers as O’Reilly Media’s former CIO) provided more perspective via email on what he’s up to in Palo Alto.

What does it mean for a “city to be a platform?”

Reichental: We think of this as both a broad metaphor and a practicality. Not only do our citizens want to be plugged in to our government operations — open data being one way to achieve this among others — but we want our community and other interested parties to build capability on top of our existing data and services. Recognizing the increasing limitations of local government means you have to find creative ways to extend it and engage with those that have the skills and resources to build a rich and seamless public-private partnership.

Why launch an open data initiative now? What success stories convinced you to make the investment?

Reichental: It’s a response to our community’s desire to easily access their data and our want as a City to unleash the data for better community decision-making and solution development.

We also believe that over time an open data portal will become a standard government offering. Palo Alto wants to be ahead of the curve and create a positive model for other communities.

Seldom does a week pass when a software engineer in our community doesn’t ask me for access to a large dataset to build an app. Earlier this year, the City participated in a hackathon at Stanford University that produced a prototype web application in less than 24 hours. We provided the data. They provided the skills. The results were so impressive, we were convinced then that we should scale this model.

How much work did it take to make your data more open? Is it machine-readable? What format? What cost was involved?

Reichental: We’re experimenting with running our IT department like a start-up, so we’re moving fast. We went from vendor selection to live in just a few weeks. The data in our platform can be exported as a CSV or to a Google Spreadsheet. In addition, we provide an API for direct access to the data. The bulk of the cost was internal staff time. The actual software, which is cloud-based, was under $5000 for the first year.
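For the CSV export path Reichental describes, working with a dataset takes only a few lines. A minimal sketch, using a hypothetical exported file and column name:

```python
# A minimal sketch of working with a dataset exported as CSV from the portal.
# The filename and the "species" column are hypothetical.
import csv
from collections import Counter

with open("city_tree_locations.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

# Tally a hypothetical column to get a quick feel for the data.
print(Counter(row.get("species", "unknown") for row in rows).most_common(10))
```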

What are the best examples of open data initiatives delivering sustainable services to citizens?

Reichental: Too many to mention. I really like what they’re doing in San Francisco (http://apps.sfgov.org/showcase/) but there are amazing things happening on data.gov and in New York City. Lots of other cities in the US doing neat things. The UK has done some high-quality budget accountability work.

Are you consuming your own open data?

Reichental: You bet we are.

Why does having an API matter?

Reichental: We believe the main advantage of having an API is for app development. Of course, there will be other use cases that we can’t even think of right now.

Why did you choose Junar instead of Socrata, CKAN or the OGPL from the U.S. federal government?

Reichental: We did review most of the products in the marketplace including some open source solutions. Each had merits. We ultimately decided on Junar for a 1-year commitment, as it seemed to strike the right balance of features, cost, and vision alignment.

Palo Alto has a couple developers in it. How are you engaging them to work with your data?

Reichental: That’s quite the understatement! The buzz already in the developer community is palpable. We’ve been swamped with requests and ideas already. We think one of the first places we’ll see good usage is in the myriad of hackathons/code jams held in the area.

What are the conditions for using your data or making apps?

Reichental: Our terms and conditions are straightforward. The data can be freely used by anyone for almost any purpose, but the condition of use is that the City has no liability or relationship with the use of the data or any derivative.

You told Mashable that you’re trying to act like a “lean startup.” What does that mean, in practice?

Reichental: This initiative is a good example. Rather than spend time making the go-live product perfect, we went for speed-to-market with the minimally viable solution to get community feedback. We’ll use that feedback to quickly improve on the solution.

With the recent go-live of our redesigned public website, we launched it initially as a beta site; warts and all. We received lots of valuable feedback, made many of the suggested changes, and then cutover from the beta to production. We ended up with a better product.

Our intent is to get more useful capability out to our community and City staff in shorter time. We want to function as close as we can with the community that we serve. And that’s a lot of amazing start-ups.
