February 06 2014

Self-directed learning, and O’Reilly’s role in the ConnectED program

I wanted to provide a bit of perspective on the donation, announced on Wednesday by the White House, of a Safari Books Online subscription providing access to O’Reilly Media books, videos, and other educational content to every high school in the country.

First off, this came up very suddenly, with a request from the White House that reached me only on Monday, as the White House and Department of Education were gearing up for Wednesday’s announcement about broadband and iPads in schools. I had a followup conversation with David Edelman, a young staffer who taught himself programming by reading O’Reilly books in middle school and launched a web development firm while in high school. He made the case that connectivity alone, without content, wasn’t all it could be. He thought of his own experience, and he thought of us.

So we began brainstorming whether there was any way we could donate a library of O’Reilly ebooks to every high school in the country. Fortunately, there may be a relatively easy way for us to do that, via Safari Books Online, the subscription service we launched in 2000 in partnership with the Pearson Technology Group. Safari already offers access to corporations and colleges in addition to individuals, so we should be able to work out some kind of special library as part of this offering.

Andrew Savikas, the CEO of Safari, was game. We still haven’t figured out all the details on how we’ll be implementing the program, but in essence, we’ll be providing a custom Safari subscription containing a rich library of content from O’Reilly (and potentially other publishers, if they want to join us) to all high schools in the US.

What’s interesting here is that when we think about education, we often think about investing in teachers. And yes, teachers are incredibly important. But they are only one of the resources we provide to motivated students.

I can’t tell you how often people come up to me and say, “I taught myself everything I know about programming from your books.” In fast-moving fields like software development, people learn from their peers, by looking at source code, and reading books or watching videos to learn more about how things work. They teach themselves.

And if this is true of our adult customers, it is also true of high schoolers and even middle schoolers. I still laugh to remember when it came time to sign the contract for Adam Goldstein’s first book with us, Applescript: The Missing Manual, and he sheepishly confessed that his mother would have to sign for him, because he was only sixteen. His proposal had been flawless – over email, how were we to know how young he was? Adam went on to be an Internet entrepreneur, founder and CEO of the Hipmunk travel search engine.

Other people from O’Reilly’s extended circle of friends who may be well known to you who began their software careers in high school or younger include Eric Ries of Lean Startup fame, Dylan Field of Figma, Alex Rampell of TrialPay, and, sadly, Aaron Swartz.

As David explained the goals of the ConnectED program, he made the point that if only one or two kids in every school get fired up to build and learn on their own, that could make a huge difference to the future of our country.

It’s easy to see how kids get exposed to programming when they live in Silicon Valley or another high-tech hub. It’s a lot harder in many other parts of the country. So we’re glad to be part of the ConnectED program, and hope that one day we’ll all be using powerful new services that got built because some kid, somewhere, got his start programming as a result of our participation in this initiative.

June 14 2013

Four short links: 14 June 2013

  1. How Geeks Opened up the UK Government (Guardian) — excellent video introduction to how the UK is transforming its civil service to digital delivery. Most powerful moment for me was scrolling through various depts’ web sites and seeing consistent visual design.
  2. Tools for Working Remotely — Braid’s set of tools (Trello, Hackpad, Slingshot, etc.) for remote software teams.
  3. Git Push to Deploy on Google App Engine — Enabling this feature will create a remote Git repository for your application’s source code. Pushing your application’s source code to this repository will simultaneously archive the latest version of the code and deploy it to the App Engine platform.
  4. Amazon’s 3D Printer Store — printers and supplies. Deeply underwhelming moment of it arriving in the mainstream.

May 01 2013

Towards a more open world

Last September, I gave a 5 minute Ignite talk at the tenth Ignite DC. The video just became available. My talk, embedded below, focused on what I’ve been writing about here at Radar for the past three years: open government, journalism, media, mobile technology and more.

The 20 slides that I used for the Ignite were a condensed version of a much longer presentation I’d created for a talk on open data and journalism in Moldova, which I’ve also embedded below.

April 30 2013

Linking open data to augmented intelligence and the economy

After years of steady growth, open data is now entering into public discourse, particularly in the public sector. If President Barack Obama decides to put the White House’s long-awaited new open data mandate before the nation this spring, it will finally enter the mainstream.

As more governments, businesses, media organizations and institutions adopt open data initiatives, interest in the evidence behind such releases and the outcomes from them is similarly increasing. High hopes abound in many sectors, from development to energy to health to safety to transportation.

“Today, the digital revolution fueled by open data is starting to do for the modern world of agriculture what the industrial revolution did for agricultural productivity over the past century,” said Secretary of Agriculture Tom Vilsack, speaking at the G-8 Open Data for Agriculture Conference.

As other countries consider releasing their public sector information as open data in machine-readable formats on the Internet, they’ll need to learn from years of effort at data.gov.uk, data.gov in the United States, and Kenya’s open data initiative.

One of the crucial sources of analysis for the success or failure of open data efforts will necessarily be research institutions and academics. That’s precisely why research from the Open Data Institute and Professor Nigel Shadbolt (@Nigel_Shadbolt) will matter in the months and years ahead.

In the following interview, Professor Shadbolt and I discuss what lies ahead. His responses were lightly edited for content and clarity.

How does your research on artificial intelligence (AI) relate to open data?

AI has always fascinated me. The quest for understanding what makes us smart and how we can make computers smart has always engaged me. While we’re trying to understand the principles of human intelligence and build a “brain in a box,” smarter robots or better speech processing algorithms, the world’s gone and done a different kind of AI: augmented intelligence. The web, with billions of human brains, has a new kind of collective and distributed capability that we couldn’t even see coming in AI. A number of us have coined a phrase, “Web science,” to understand the Web at a systems level, much as we do when we think about human biology. We talk about “systems biology” because there are just so many elements: technical, organizational, cultural.

The Web really captured my attention ten years ago as this really new manifestation of collective problem-solving. If you think about the link to earlier work I’d done in what was called “knowledge engineering,” or knowledge-based systems, the problem there was that all of the knowledge resided on systems on people’s desks. What the web has done is furnish us with something that looks a lot like a supremely distributed database. Now, that distributed knowledge base is one version of the Semantic Web. The way I got into open data was the notion of using linked data and semantic Web technologies to integrate data at scale across the web — and one really high value source of data is open government data.

What was the reason behind the founding and funding of the Open Data Institute (ODI)?

The open government data piece originated in work I did in 2003 and 2004. We were looking at this whole idea of putting new data-linking standards on the Web. I had a project in the United Kingdom that was working with government to show the opportunities to use these techniques to link data. As in all of these things, that work was reported to Parliament. There was real interest in it, but not really top-level heavy “political cover” interest. Tim Berners-Lee’s engagement with the previous prime minister led to Gordon Brown appointing Tim and me to look at setting up data.gov.uk and getting data released, and then to the current coalition government taking that forward.

Throughout this time, Tim and I have been arguing that we could really do with a central focus, an institute whose principal motivation was working out how we could find real value in this data. The ODI does exactly that. It’s got about $60 million of public money over five years to incubate companies, build capacity, train people, and ensure that the public sector is supplying high quality data that can be consumed. The fundamental idea is that you ensure high quality supply by generating a strong demand side. The demand side isn’t just the public sector; it’s also the private sector.

What have we learned so far about what works and what doesn’t? What are the strategies or approaches that have some evidence behind them?

I think there are some clear learnings. One that I’ve been banging on about recently has been that yes, it really does matter to turn the dial so that governments have a presumption to publish non-personal public data. If you would publish it anyway, under a Freedom of Information request or whatever your local legislative equivalent is, why aren’t you publishing it as open data? That, as a behavioral change, is a big one for many administrations where either the existing workflow or culture is, “Okay, we collect it. We sit on it. We do some analysis on it, and we might give it away piecemeal if people ask for it.” We should construct the publication process from the outset to presume to publish openly. That’s still something that we are two or three years away from, working hard with the public sector to work out how to do it and how to do it properly.

We’ve also learned that in many jurisdictions, the amount of [open data] expertise within administrations and within departments is slight. There just isn’t really the skillset, in many cases, for people to know what it is to publish using technology platforms. So there’s a capability-building piece, too.

One of the most important things is it’s not enough to just put lots and lots of datasets out there. It would be great if the “presumption to publish” meant they were all out there anyway — but when you haven’t got any datasets out there and you’re thinking about where to start, the tough question is to say, “How can I publish data that matters to people?”

The data that matters is revealed when we look at the download stats on these various UK, US and other [open data] sites: there’s a very, very distinctive power curve. Some datasets are very, very heavily utilized. You suspect they have high utility to many, many people. Many of the others, if they can be found at all, aren’t being used particularly much. That’s not to say that, in that long tail, there aren’t large amounts of use. A particularly arcane open dataset may have exquisite use to a small number of people.

The real truth is that it’s easy to republish your national statistics. It’s much harder to do a serious job on publishing your spending data in detail, publishing police and crime data, publishing educational data, publishing actual overall health performance indicators. These are tough datasets to release. As people are fond of saying, it holds politicians’ feet to the fire. It’s easy to build a site that’s full of stuff — but does the stuff actually matter? And does it have any economic utility?

Page views and traffic aren’t ideal metrics for measuring success for an open data platform. What should people measure, in terms of actual outcomes in citizens’ lives? Improved services or money saved? Performance or corrupt politicians held accountable? Companies started or new markets created?

You’ve enumerated some of them. It’s certainly true that one of the challenges is to instrument the effect or the impact. Actually, it’s the last thing that governments, nation states, regions or cities who are enthused to do this thing do. It’s quite hard.

Datasets, once downloaded, may then be virally reproduced all over the place, so that you don’t notice the use from a government site. Most of the open licensing that is so essential to this effort includes a requirement for attribution. Those licenses should be embedded in the machine-readable datasets themselves. Not enough attention is paid to that piece of process: actually noticing, when you’re looking at other applications and other data and publishing efforts, that the attribution is there. We should be smarter about getting better sense from the attribution data.

The other sources of impact, though: How do you evidence actual internal efficiencies and internal government-wide benefits of open data? I had an interesting discussion recently, where the department of IT had said, “You know, I thought this was all stick and no carrot. I thought this was all in overhead, to get my data out there for other people’s benefits, but we’re now finding it so much easier to re-consume our own data and repurpose it in other contexts that it’s taken a huge amount of friction out of our own publication efforts.”

Quantified measures would really help, if we had standard methods to notice those kinds of impacts. Our economists, people whose impact is around understanding where value is created, really haven’t embraced open markets, particularly open data markets, in a very substantial way. I think we need a good number of capable economists piling into this, trying to understand new forms of value and what the values are that are created.

I think a lot of the traditional models don’t stand up here. Bizarrely, it’s much easier to measure impact when information scarcity exists and you have something that I don’t, and I have to pay you a certain fee for that stuff. I can measure that value. When you’ve taken that asymmetry out, when you’ve made open data available more widely, what are the new things that flourish? In some respects, you’ll take some value out of the market, but you’re going to replace it by wider, more distributed, capable services. This is a key issue.

The ODI will certainly be commissioning and is undertaking work in this area. We published a piece of work jointly with Deloitte in London, looking at evidence-linked methodology.

You mentioned the demand-side of open data. What are you learning in that area — and what’s being done?

There’s an interesting tension here. If we turn the dial in the governmental mindset to the “presumption to publish” — and in the UK, our public data principles actually embrace that as government policy — you are meant to publish unless there’s a personal information or national security reason why you would not. In a sense, you say, “Well, we’ll just publish everything out there. That’s what we’ll do. Some of it will have utility, and some of it won’t.”

When the Web took off, and you offered pages as a business or an individual, you didn’t foresee the link-making that would occur. You didn’t foresee that PageRank would ultimately give you a measure of your importance and relevance in the world and could even be monetized after the fact. You didn’t foresee that those pages have their own essential network effect: the more pages there are that interconnect, the more value is created, and so there is a strong argument [for publishing them].

So, you know, just publish. In truth, the demand side is an absolutely great and essential test of whether actually [publishing data] does matter.

Again, to take the Web as an analogy, large amounts of the Web are unattended to, neglected, and rot. It’s just stuff nobody cares about, actually. What we’re seeing in the open data effort in the UK is that it’s clear that some data is very privileged. It’s at the center of lots of other datasets.

In particular, [data about] location, occurrence, and when things occurred, and stable ways of identifying those things which are occurring. Then, of course, the data space that relates to companies, their identifications, the contracts they call, and the spending they engage in. That is the meat and drink of business intelligence apps all across the planet. If you started to turn off an ability for any business intelligence to access legal identifiers or business identifiers, all sorts of oversight would fall apart, apart from anything else.

The demand side [of open data] can be characterized. It’s not just economic. It will have to do with transparency, accountability and regulatory action. The economic side of open data gives you huge room for maneuver and substantial credibility when you can say, “Look, this dataset of spending data in the UK, published by local authorities, is the subject of detailed analytics from companies who look at all the data about how local authorities and governments are spending their money. They sell procurement analysis insights back to business and on to third parties and other parts of the business world, saying ‘This is the shape of how the UK PLC is buying.’”

What are some of the lessons we can learn from how the World Wide Web grew and the value that it’s delivered around the world?

That’s always a worry, that, in some sense, the empowered get more powerful. What we do see is that, in open data in particular, new sorts of players who couldn’t enter the game at all before are now able to.

My favorite example is in mass transportation. In the UK, we had to fight quite hard to get some of the data from bus, rail and other forms of transportation made openly available. Until that was done, there was a pretty small number of suppliers in this market.

In London, where all of it was made available by the Transport for London Authority, there’s just been an explosion of apps and businesses who are giving you subtly distinct experiences as users of that data. I’ve got about eight or nine apps on my phone that give me interestingly distinctive views of moving about the city of London. I couldn’t have predicted or anticipated that many of those would exist.

I’m sure the companies who held that data could’ve spent large amounts of money and still not given me anything like the experience I now have. The flood of innovation around the data has really been significant, and there are now many, many more players and stakeholders in that space.

The Web taught us that serendipitous reuse, where you can’t anticipate where the bright idea comes from, is what is so empowering. The flipside of that is that it also reveals that, in some cases, the data isn’t necessarily of a quality that you might’ve thought. This effort might allow for civic improvement or indeed, business improvement in some cases, where businesses come and improve the data the state holds.

What’s happening in the UK with the so-called “MiData Initiative,” which posits that people have a right to access and use personal data disclosed to them?

I think this is every bit as potentially disruptive and important as open government data. We’re starting to see the emergence of what we might think of as a new class of important data, “personal assets.”

People have talked about “personal information management systems” for a long time now. Frequently, it’s revolved around managing your calendar or your contact list, but it’s much deeper. Imagine that you, the consumer, or you, the citizen, had a central locus of authority around data that was relevant to you: consumer data from retail, from the banks that you deal with, from the telcos you interact with, from the utilities you get your gas, water and electricity from. Imagine if that data infosphere was something that you could access easily, with a right to reuse and redistribute it as you saw fit.

The canonical example, of course, is health data. It isn’t only data that business holds; it’s also data the state holds: your health records, educational transcript, welfare, tax, or any number of other areas.

In the UK, we’ve been working towards empowering consumers, in particular through this MiData program. We’re trying to get to a place where consumers have a right to data held about their transactions by businesses, [released] back to them in a reusable and flexible way. We’ve been working on a voluntary program in this area for the last year. We have a consultation on taking up a power to require large companies to give that information back. There is a commitment in the UK, for the first time, to get health records back to patients as data they control, but I think it has to go much more widely.

Personal data is a natural complement to open data. Some of the most interesting applications I’m sure we’re going to see in this area are where you take your personal data and enrich it with open data relating to businesses, the services of government, or the actual trading environment you’re in. In the UK, we’ve got six large energy companies that compete to sell energy to you.

Why shouldn’t groups and individuals be able to get together and collectively purchase in the same way that corporations can purchase and get their discounts? Why can’t individuals be in a spot market, effectively, where it’s easy to move from one supplier to another? Along with those efficiencies in the market and improvements in service delivery, it’s about empowering consumers at the end of the day.

This post is part of our ongoing series on the open data economy.

April 18 2013

Sprinting toward the future of Jamaica

Creating the conditions for startups to form is now a policy imperative for governments around the world, as Julian Jay Robinson, minister of state in Jamaica’s Ministry of Science, Technology, Energy and Mining, reminded the attendees at the “Developing the Caribbean” conference last week in Kingston, Jamaica.

Robinson said Jamaica is working on deploying wireless broadband access, securing networks and stimulating tech entrepreneurship around the island, a set of priorities that would have sounded of the moment in Washington, Paris, Hong Kong or Bangalore. He also described open access and open data as fundamental parts of democratic governance, explicitly aligning the release of public data with economic development and anti-corruption efforts. Robinson also pledged to help ensure that Jamaica’s open data efforts would be successful, offering a key ally within government to members of civil society.

The interest in adding technical ability and capacity around the Caribbean was sparked by other efforts around the world, particularly Kenya’s open government data efforts. That’s what led the organizers to invite Paul Kukubo to speak about Kenya’s experience, which Robinson noted might be more relevant to Jamaica than that of the global north.

Kukubo, the head of Kenya’s Information, Communication and Technology Board, was a key player in getting the country’s open data initiative off the ground and evangelizing it to developers in Nairobi. At the conference, Kukubo gave Jamaicans two key pieces of advice. First, open data efforts must be aligned with national priorities, from reducing corruption to improving digital services to economic development.

“You can’t do your open data initiative outside of what you’re trying to do for your country,” said Kukubo.

Second, political leadership is essential to success. In Kenya, the president was personally involved in open data, Kukubo said. Now that a new president has been officially elected, however, there are new questions about what happens next, particularly given that pickup in Kenya’s development community hasn’t been as dynamic as officials might have hoped. There’s also a significant issue on the demand-side of open data, with respect to the absence of a Freedom of Information Law in Kenya.

When I asked Kukubo about these issues, he said he expects a Freedom of Information law will be passed this year in Kenya. He also replied that the momentum on open data wasn’t just about the supply side.

“We feel that on the usage side, especially with respect to the developer ecosystem, we haven’t necessarily gotten as much traction from developers using data and interpreting it cleverly as we might have wanted to have,” he said. “We’re putting more into that area.”

With respect to leadership, Kukubo pointed out that newly elected Kenyan President Uhuru Kenyatta drove open data release and policy when he was the minister of finance. Kukubo expects him to be very supportive of open data in office.

The development of open data in Jamaica, by way of contrast, has been driven by academia, said professor Maurice McNaughton, director of the Center of Excellence at the Mona School of Business at the University of the West Indies (UWI). The Caribbean Open Institute, for instance, has been working closely with Jamaica’s Rural Agriculture Development Authority (RADA). There are high hopes that releases of more data from RADA and other Jamaican institutions will improve Jamaica’s economy and the effectiveness of its government.

Open data could add $35 million annually to the Jamaican economy, said Damian Cox, director of the Access to Information Unit in the Office of the Prime Minister, citing a United Nations estimate. Cox also explicitly aligned open data with measuring progress toward Millennium Development Goals, positing that increasing the availability of data will enable the civil society, government agencies and the UN to more accurately assess success.

The development of (open) data-driven journalism

Developing the Caribbean focused on the demand side of open data as well, particularly the role of intermediaries in collecting, cleaning, fact checking, and presenting data, matched with necessary narrative and context. That kind of work is precisely what data-driven journalism does, which is why it was one of the major themes of the conference. I was invited to give an overview of data-driven journalism that connected some trends and highlighted the best work in the field.

I’ve written quite a bit about how data-driven journalism is making sense of the world elsewhere, with a report yet to come. What I found in Jamaica is that media there have long since begun experimenting in the field, from the investigative journalism at Panos Caribbean to the relatively recent launch of diGJamaica by the Gleaner Company.

diGJamaica is modeled upon the Jamaican Handbook and includes more than a million pages from The Gleaner newspaper, going back to 1834. The site publishes directories of public entities and public data, including visualizations. It charges for access to the archives.

Legends and legacies

Olympic champion Usain Bolt, photographed in his (fast) car at the UWI/Usain Bolt Track in Mona, Jamaica.

Normally, meeting the fastest man on earth would be the most memorable part of any trip. The moment that left the deepest impression from my journey to the Caribbean, however, came not from encountering Usain Bolt on a run but from within a seminar room on a university campus.

As a member of a panel of judges, I saw dozens of young people present after working for 30 hours at a hackathon at the University of the West Indies. While even the most mature of the working apps was still a prototype, the best of them were squarely focused on issues that affect real Jamaicans: scoring the risk of farmers who needed bank loans, and collecting and sharing data about produce.

The winning team created a working mobile app that would enable government officials to collect data at farms. While none of the apps are likely to be adopted by the agricultural agency in its current form, or show up in the Google Play store this week, the experience the teams gained will help them in the future.

As I left the island, the perspective that I’d taken away from trips to Brazil, Moldova and Africa last year was further confirmed: technical talent and creativity can be found everywhere in the world, along with considerable passion to apply design thinking, data and mobile technology to improve the societies people live within. This is innovation that matters, not just clones of popular social networking apps — though the judges saw more than a couple of those ideas flow by as well.

In the years ahead, Jamaican developers will play an important role in media, commerce and government on the island. If attracting young people to engineering and teaching them to code is the long-term legacy of efforts like Developing the Caribbean, it will deserve its own thumbs up from Mr. Bolt. The track to that future looks wide open.

Disclosure: the cost of my travel to Jamaica was paid for by the organizers of the Developing the Caribbean conference.

April 04 2013

Four short links: 4 April 2013

  1. geo-bootstrap — Twitter Bootstrap fork that looks like a classic geocities page. Because. (via Narciso Jaramillo)
  2. Digital Public Library of America — public libraries sharing full text and metadata for scans, coordinating digitisation, maximum reuse. See The Verge piece. (via Dan Cohen)
  3. Snake Robots — I don’t think this is a joke. The snake robot’s versatile abilities make it a useful tool for reaching locations or viewpoints that humans or other equipment cannot. The robots are able to climb to a high vantage point, maneuver through a variety of terrains, and fit through tight spaces like fences or pipes. These abilities can be useful for scouting and reconnaissance applications in either urban or natural environments. Watch the video, the nightmares will haunt you. (via Aaron Straup Cope)
  4. The Power of Data in Aboriginal Hands (PDF) — critique of government statistical data gathering of Aboriginal populations. That ABS [Australian Bureau of Statistics] survey is designed to assist governments, commentators or academics who want to construct policies that shape our lives or encourage a one-sided public discourse about us and our position in the Australian nation. The survey does not provide information that Indigenous people can use to advance our position because the data is aggregated at the national or state level or within the broad ABS categories of very remote, remote, regional or urban Australia. These categories are constructed in the imagination of the Australian nation state. They are not geographic, social or cultural spaces that have relevance to Aboriginal people. [...] The Australian nation’s foundation document of 1901 explicitly excluded Indigenous people from being counted in the national census. That provision in the constitution, combined with Section 51, sub section 26, which empowered the Commonwealth to make special laws for ‘the people of any race, other than the Aboriginal race in any State’ was an unambiguous and defining statement about Australian nation building. The Founding Fathers mandated the federated governments of Australia to oversee the disappearance of Aboriginal people in Australia.

January 31 2013

NASA launches second International Space Apps Challenge

From April 20 to April 21, on Earth Day, the second international Space Apps Challenge will invite developers on all seven continents to the bridge to contribute code to NASA projects.

Given longstanding concerns about the sustainability of apps contests, I was curious about NASA’s thinking behind launching this challenge. When I asked NASA’s open government team about the work, I immediately heard back from Nick Skytland (@Skytland), who heads up NASA’s open innovation team.

“The International Space Apps Challenge was a different approach from other federal government ‘app contests’ held before,” replied Skytland, via email.

“Instead of incentivizing technology development through open data and a prize purse, we sought to create a unique platform for international technological cooperation through a weekend-long event hosted in multiple locations across the world. We didn’t just focus on developing software apps, but actually included open hardware, citizen science, and data visualization as well.”

Aspects of that answer will please many open data advocates, like Clay Johnson or David Eaves. When Eaves recently looked at apps contests, in the context of his work on Open Data Day (coming up on February 23rd), he emphasized the importance of events that build community and applications that meet the needs of citizens or respond to business demand.

The rest of my email interview with Skytland follows.

Why is the International Space Apps Challenge worth doing again?

Nick Skytland: We see the International Space Apps Challenge event as a valuable platform for the Agency because it:

  • Creates new technologies and approaches that can solve some of the key challenges of space exploration, as well as making current efforts more cost-effective.
  • Uses open data and technology to address global needs to improve life on Earth and in space.
  • Demonstrates our commitment to the principles of the Open Government Partnership in a concrete way.

What were the results from the first challenge?

Nick Skytland: More than 100 unique open-source solutions were developed in less than 48 hours.

There were 6 winning apps, but the real “result” of the challenge was a 2,000+ person community engaged in and excited about space exploration, ready to apply that experience to challenges identified by the agency at relatively low cost and on a short timeline.

How does this challenge contribute to NASA’s mission?

Nick Skytland: There were many direct benefits. The first International Space Apps Challenge offered seven challenges specific to satellite hardware and payloads, including submissions from at least two commercial organizations. These challenges received multiple solutions in the areas of satellite tracking, suborbital payloads, command and control systems, and leveraging commercial smartphone technology for orbital remote sensing.

Additionally, a large focus of the Space Apps Challenge is on citizen innovation in the commercial space sector, lowering the cost and barriers to space so that it becomes easier to enter the market. By focusing on citizen entrepreneurship, Space Apps enables NASA to be deeply involved with the quickly emerging space startup culture. The event was extremely helpful in encouraging the collection and dissemination of space-derived data.

As you know, we have amazing open data. Space Apps is a key opportunity for us to continue to open new data sources and invite citizens to use them. Space Apps also encouraged the development of new technologies and new industries, like the space-based 3D printing industry and open-source ROVs (remote submersibles for underwater exploration).

How much of the code from more than 200 “solutions” is still in use?

Nick Skytland: We didn’t track this last time around, but almost all (if not all) of the code is still available online, many of the projects continued on well after the event, and some teams continue to work on their projects today. The best example of this is the Pineapple Project, which participated in numerous other hackathons after the 2012 International Space Apps Challenge and just recently was accepted into the Geeks Without Borders accelerator program.

Of the 71 challenges that were offered last year, a low percentage were NASA challenges — about 13, if I recall correctly. There are many reasons for this, mostly that cultural adoption of open government philosophies within government is just slow. What last year did for us is lay the groundwork. Now we have much more buy-in and interest in what can be done. This year, our challenges from NASA are much more mission-focused and relevant to needs program managers have within the agency.

Additionally, many of the externally submitted challenges we have come from other agencies who are interested in using space apps as a platform to address needs they have. Most notably, we recently worked with the Peace Corps on the Innovation Challenge they offered at RHoK in December 2012, with great results.

The International Space Apps Challenge was a way for us not only to move forward technology development, drawing on the talents and initiative of bright-minded developers, engineers, and technologists, but also a platform to actually engage people who have a passion and desire to make an immediate impact on the world.

What’s new in 2013?

Nick Skytland: Our goal for this year is to improve the platform, create an even better engagement experience, and focus the collective talents of people around the world on developing technological solutions that are relevant and immediately useful.

We have a high level of internal buy-in at NASA and a lot of participation outside NASA, from both other government organizations and local leads in many new locations. Fortunately, this means we can focus our efforts on making this a meaningful event, and we are well ahead of the curve in terms of planning to do so.

To date, 44 locations have confirmed their participation and we have six spots remaining, although four of these are reserved as placeholders for cities we are pursuing. We have 50 challenge ideas already drafted for the event, 25 of which come directly from NASA. We will be releasing the entire list of challenges around March 15th on spaceappschallenge.org.

We have 55 organizations so far that are supporting the event, including seven other U.S. government organizations, and international agencies. Embassies or consulates are either directly leading or hosting the events in Monterrey, Krakow, Sofia, Jakarta, Santa Cruz, Rome, London and Auckland.

 

January 17 2013

Yelp partners with NYC and SF on restaurant inspection data

One of the key notions in my “Government as a Platform” advocacy has been that there are other ways to partner with the private sector besides hiring contractors and buying technology. One of the best of these is to provide data that can be used by the private sector to build or enrich their own citizen-facing services. Yes, the government runs a weather website but it’s more important that data from government weather satellites shows up on the Weather Channel, your local TV and radio stations, Google and Bing weather feeds, and so on. They already have more eyeballs and ears combined than the government could or should possibly acquire for its own website.

That’s why I’m so excited to see a joint effort by New York City, San Francisco, and Yelp to incorporate government health inspection data into Yelp reviews. I was involved in some early discussions and made some introductions, and have been delighted to see the project take shape.

My biggest contribution was to point to GTFS as a model. Bibiana McHugh at Portland’s TriMet transit agency reached out to Google, Bing, and others with the question: “If we came up with a standard format for transit schedules, could you use it?” Google Transit was the result — a service that has spread to many other U.S. cities. When you rejoice in the convenience of getting transit timetables on your phone, remember to thank Portland officials as well as Google.
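
GTFS itself is simple enough to sketch: a feed is just a zip of plain CSV text files (stops.txt, routes.txt, trips.txt, stop_times.txt and so on). The minimal Python sketch below, which assumes an already-unzipped feed sitting in the working directory, reads the stops file using only the standard library; the column names come from the public GTFS reference.

    import csv

    # Minimal sketch: read stop names and coordinates from a GTFS feed.
    # Assumes the feed has been unzipped into the working directory;
    # stop_id, stop_name, stop_lat and stop_lon are standard GTFS columns.
    def load_stops(path="stops.txt"):
        stops = {}
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                stops[row["stop_id"]] = (
                    row["stop_name"],
                    float(row["stop_lat"]),
                    float(row["stop_lon"]),
                )
        return stops

    if __name__ == "__main__":
        stops = load_stops()
        print(len(stops), "stops loaded")

Because the format is nothing more exotic than CSV, any agency that publishes a feed makes its schedule data usable by anyone with a spreadsheet or a scripting language.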

In a similar way, Yelp, New York, and San Francisco came up with a data format for health inspection data. The specification is at http://yelp.com/healthscores. It will reportedly be announced at the US Conference of Mayors with San Francisco Mayor Ed Lee today.

Code for America built a site for other municipalities to pledge support. I’d also love to see support in other local restaurant review services from companies like Foursquare, Google, Microsoft, and Yahoo!  This is, as Chris Anderson of TED likes to say, “an idea worth spreading.”

December 06 2012

The United States (Code) is on Github

When Congress launched Congress.gov in beta, they didn’t open the data. This fall, a trio of open government developers took it upon themselves to do what custodians of the U.S. Code and laws in the Library of Congress could have done years ago: published data and scrapers for legislation in Congress from THOMAS.gov in the public domain. The data at github.com/unitedstates is published using an “unlicense” and updated nightly. Credit for releasing this data to the public goes to Sunlight Foundation developer Eric Mill, GovTrack.us founder Josh Tauberer and New York Times developer Derek Willis.

“It would be fantastic if the relevant bodies published this data themselves and made these datasets and scrapers unnecessary,” said Mill, in an email interview. “It would increase the information’s accuracy and timeliness, and probably its breadth. It would certainly save us a lot of work! Until that time, I hope that our approach to this data, based on the joint experience of developers who have each worked with it for years, can model to government what developers who aim to serve the public are actually looking for online.”

If the People’s House is going to become a platform for the people, it will need to release its data to the people. If Congressional leaders want THOMAS.gov to be a platform for members of Congress, legislative staff, civic developers and media, the Library of Congress will need to release structured legislative data. THOMAS is also not updated in real-time, which means that there will continue to be a lag between a bill’s introduction and the nation’s ability to read the bill before a vote.

Until that happens, however, this combination of scraping and open source data publishing offers a way forward on Congressional data to be released to the public, wrote Willis, on his personal blog:

Two years ago, there was a round of blog posts touched off by Clay Johnson that asked, “Why shouldn’t there be a GitHub for data?” My own view at the time was that availability of the data wasn’t as much an issue as smart usage and documentation of it: ‘We need to import, prune, massage, convert. It’s how we learn.’

Turns out that GitHub actually makes this easier, and I’ve had a conversion of sorts to the idea of putting data in version control systems that make it easier to view, download and report issues with data … I’m excited to see this repository grow to include not only other congressional information from THOMAS and the new Congress.gov site, but also related data from other sources. That this is already happening only shows me that for common government data this is a great way to go.

In the future, legislation data could be used to show iterations of laws and improve the ability of communities at OpenCongress, POPVOX or CrunchGov to discover and discuss proposals. As Congress incorporates more tablets on the floor during debates, such data could also be used to update legislative dashboards.

The choice to use Github as a platform for government data and scraper code is another significant milestone in a breakout year for Github’s use in government. In January, the British government committed GOV.UK code to Github. NASA, after contributing its first code in January, added 11 code repositories this year. In August, the White House committed code to Github. In September, the Open Gov Foundation open sourced the MADISON crowdsourced legislation platform.

The choice to use Github for this scraper and legislative data, however, presents a new and interesting iteration in the site’s open source story.

“Github is a great fit for this because it’s neutral ground and it’s a welcoming environment for other potential contributors,” wrote Sunlight Labs director Tom Lee, in an email. “Sunlight expects to invest substantial resources in maintaining and improving this codebase, but it’s not ours: we think the data made available by this code belongs to every American. Consequently the project needed to embrace a form that ensures that it will continue to exist, and be free of encumbrances, in a way that’s not dependent on any one organization’s fortunes.”

Mill, an open government developer at Sunlight Labs, shared more perspective in the rest of our email interview, below.

Is this based on the GovTrack.us scraper?

Eric Mill: All three of us have contributed at least one code change to our new THOMAS scraper; the majority of the code was written by me. Some of the code has been taken or adapted from Josh’s work.

The scraper that currently actively populates the information on GovTrack is an older Perl-based scraper. None of that code was used directly in this project. Josh had undertaken an incomplete, experimental rewrite of these scrapers in Python about a year ago (code), but my understanding is it never got to the point of replacing GovTrack’s original Perl scripts.

We used the code from this rewrite in our new scraper, and it was extremely helpful in two ways: providing a roadmap of how THOMAS’ URLs and sitemap work, and parsing meaning out of the text of official actions.

Parsing the meaning out of action text is, I would say, about half the value and work of the project. When you look at a page on GovTrack or OpenCongress and see the timeline of a bill’s life — “Passed House,” “Signed by the President,” etc. — that information is only obtainable by analyzing the order and nature of the sentences of the official actions that THOMAS lists. Sentences are finicky, inconsistent things, and extracting meaning from them is tricky work. Just scraping them out of THOMAS.gov’s HTML is only half the battle. Josh has experience at doing this for GovTrack. The code in which this experience was encapsulated drastically reduced how long it took to create this.
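
To make the point about finicky sentences concrete, here is a rough, hypothetical illustration of the kind of pattern matching such a scraper has to do. The patterns and labels below are invented for the example; the actual rules in the unitedstates scraper are far more extensive.

    import re

    # Illustrative only: map free-text official action lines to coarse events.
    ACTION_PATTERNS = [
        (re.compile(r"passed (the )?house", re.I), "house:passed"),
        (re.compile(r"passed (the )?senate", re.I), "senate:passed"),
        (re.compile(r"signed by (the )?president", re.I), "signed"),
        (re.compile(r"became public law", re.I), "enacted"),
        (re.compile(r"vetoed by (the )?president", re.I), "vetoed"),
    ]

    def classify_action(text):
        """Return a coarse event label for an official action sentence, or None."""
        for pattern, label in ACTION_PATTERNS:
            if pattern.search(text):
                return label
        return None

    print(classify_action("Signed by the President on 12/19/2003."))  # -> "signed"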

How long did this take to build?

Eric Mill: Creating the whole scraper, and the accompanying dataset, was about 4 weeks of work on my part. About half of that time was spent actually scraping — reverse engineering THOMAS’ HTML — and the other half was spent creating the necessary framework, documentation, and general level of rigor for this to be a project that the community can invest in and rely on.

There will certainly be more work to come. THOMAS is shutting down in a year, to be replaced by Congress.gov. As Congress.gov grows to have the same level of data as THOMAS, we’ll gradually transition the scraper to use Congress.gov as its data source.

Was this data online before? What’s new?

Eric Mill: All of the data in this project has existed in an open way at GovTrack.us, which has provided bulk data downloads for years. The Sunlight Foundation and OpenCongress have both created applications based on this data, as have many other people and organizations.

This project was undertaken as a collaboration because Josh and I believed that the data was fundamental enough that it should exist in a public, owner-less commons, and that the code to generate it should be in the same place.

There are other benefits, too. Although the source code to GovTrack’s scrapers has been available, it depends on being embedded in GovTrack’s system, and the use of a database server. It was also written in Perl, a language less widely used today, and produced only XML. This new Python scraper has no other dependencies, runs without a database, and generates both JSON and XML. It can be easily extended to output other data formats.
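
As a rough sketch of what a dependency-free, multi-format output layer can look like, the snippet below serializes one record to both JSON and XML with the Python standard library. The field names in the sample record are made up for illustration; they are not the scraper’s actual schema.

    import json
    import xml.etree.ElementTree as ET

    # Sketch of a pluggable output layer: one parsed record, several writers.
    def to_json(record):
        return json.dumps(record, indent=2)

    def to_xml(record, root_tag="bill"):
        root = ET.Element(root_tag)
        for key, value in record.items():
            ET.SubElement(root, key).text = str(value)
        return ET.tostring(root, encoding="unicode")

    WRITERS = {"json": to_json, "xml": to_xml}  # register new formats here

    record = {"bill_id": "hr1234-108", "status": "enacted", "congress": 108}
    for fmt, writer in WRITERS.items():
        print(writer(record))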

Finally, everyone who worked on the project has had experience in dealing with legislative information. We were able to use that to make various improvements to how the data is structured and presented that make it easier for developers to use the data quickly and connect it to other data sources.

Searches for bills in Scout use data collected directly from this scraper. What else are people doing with the data?

Eric Mill: Right now, I only know for a fact that the Sunlight Foundation is using the data. GovTrack recently sent an email to its developer list announcing that in the near future, its existing dataset would be deprecated in favor of this new one, so the data should be used in GovTrack before long.

Pleasantly, I’ve found nearly nothing new by switching from GovTrack’s original dataset to this one. GovTrack’s data has always had a high level of quality. So far, the new dataset looks to be as good.

Is it common to host open data on Github?

Eric Mill: Not really. Github’s not designed for large-scale data hosting. This is an experiment to see whether this is a useful place to host it. The primary benefit is that no single person or organization (besides Github) is paying for download bandwidth.

The data is published as a convenience, for people to quickly download for analysis or curiosity. I expect that any person or project that intends to integrate the data into their work on an ongoing basis will do so by using the scraper, not downloading the data repeatedly from Github. It’s not our intent that anyone make their project dependent on the Github download links.

Laudably, Josh Tauberer donated his legislator dataset and converted it to YAML. What’s YAML?

Eric Mill: YAML is a lightweight data format intended to be easy for humans to both read and write. This dataset, unlike the one scraped from THOMAS, is maintained mostly through manual effort. Therefore, the data itself needs to be in source control, it needs to not be scary to look at and it needs to be obvious how to fix or improve it.
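
For readers who haven’t met the format, here is roughly what a hand-maintained YAML record might look like and how it loads in Python. The fields are invented for illustration, and the snippet assumes the third-party PyYAML package is installed.

    import yaml  # PyYAML: pip install pyyaml

    # A made-up legislator record, showing why YAML suits hand editing:
    # no quoting noise, nesting by indentation, readable diffs in version control.
    RECORD = """
    id:
      thomas: "00123"
    name:
      first: Jane
      last: Doe
    terms:
      - type: rep
        start: "2003-01-07"
        party: Democrat
    """

    legislator = yaml.safe_load(RECORD)
    print(legislator["name"]["last"], legislator["terms"][0]["party"])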

What’s in this legislator dataset? What can be done with it?

Eric Mill: The legislator dataset contains information about members of Congress from 1789 to the present day. It is a wealth of vital data for anyone doing any sort of application or analysis of members of Congress. This includes a breakdown of their name, a crosswalk of identifiers on other services, and social media accounts. Crucially, it also includes a member of Congress’ change in party, chamber, and name over time.

For example, it’s a pretty necessary companion to the dataset that our scraper gathers from THOMAS. THOMAS tells you the name of the person who sponsored this bill in 2003, and gives you a THOMAS-specific ID number. But it doesn’t tell you what that person’s party was at the time, or if the person is still a member of the same chamber now as they were in 2003 (or whether they’re in office at all). So if you want to say “how many Republicans sponsored bills in 2003,” or if you’d like to draw in information from outside sources, such as campaign finance information, you will need a dataset like the one that’s been publicly donated here.
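
A hedged sketch of that kind of join is below, assuming the bill data and the legislator data have already been loaded into lists of dictionaries. The field names (sponsor_thomas_id, id.thomas, terms, party) are illustrative stand-ins, not the datasets’ exact schema.

    # Illustrative join: count bill sponsorships by party for a given year.
    def party_of(thomas_id, year, legislators):
        """Look up a member's party in a given year via their THOMAS id."""
        for leg in legislators:
            if leg["id"].get("thomas") != thomas_id:
                continue
            for term in leg.get("terms", []):
                if term["start"][:4] <= str(year) <= term["end"][:4]:
                    return term.get("party")
        return None

    def sponsors_by_party(bills, legislators, year=2003):
        """Answer questions like 'how many Republicans sponsored bills in 2003?'"""
        counts = {}
        for bill in bills:
            party = party_of(bill["sponsor_thomas_id"], year, legislators)
            if party:
                counts[party] = counts.get(party, 0) + 1
        return counts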

Sunlight’s API on members of Congress is easily the most prominent API, widely used by people and organizations to build systems that involve legislators. That API’s data is a tiny subset of this new one.

You moved a legal citation extractor and a U.S. Code parser into this code. What do they do here?

Eric Mill: The legal citation extractor, called “Citation,” plucks references to the US Code (and other things) out of text. Just about any system that deals with legal documents benefits from discovering links between those documents. For example, I use this project to power US Code searches on Scout, so that the site returns results that cite some piece of the law, regardless of how that citation is formatted. There’s no text-based search, simple or advanced, that would bring back results matching a variety of formats or matching subsections — something dedicated to the arcane craft of citation formats is required.

The citation extractor is built to be easy for others to invest in. It’s a stand-alone tool that can be used through the command line, HTTP, or directly through JavaScript. This makes it suitable for the front-end or back-end, and easy to integrate into a project written in any language. It’s very far from complete, but even now it’s already proven extremely useful at creating powerful features for us that weren’t possible before.
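
To show roughly what “plucking references out of text” means, here is a tiny pattern matcher for one common U.S. Code citation form, written in Python rather than the project’s JavaScript. It is illustrative only; the real Citation tool handles many more formats and subsection variants.

    import re

    # Illustrative only: match citations like "5 U.S.C. 552" or "44 USC § 3506".
    USC_PATTERN = re.compile(
        r"(?P<title>\d+)\s+U\.?S\.?C\.?\s*(?:§+\s*)?"
        r"(?P<section>\d+[a-z]?(?:\([a-z0-9]+\))*)",
        re.IGNORECASE,
    )

    def extract_usc_citations(text):
        """Return (title, section) pairs found in free text."""
        return [(m.group("title"), m.group("section"))
                for m in USC_PATTERN.finditer(text)]

    print(extract_usc_citations("Exempt under 5 U.S.C. 552(b)(3) and 44 USC § 3506."))
    # -> [('5', '552(b)(3)'), ('44', '3506')]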

The parser for the U.S. Code itself is a dataset, written by my colleague Thom Neale. The U.S. Code is published by the government in various formats, but none of them are suitable for easy reuse. The Office of Law Revision Counsel, which publishes the U.S. Code, is planning on producing a dedicated XML version of the US Code, but they only began the procurement process recently. It could be quite some time before it appears.

Thom’s work parses the “locator code” form of the data, which is a binary format designed for telling GPO’s typesetting machines how to print documents. It is very specialized and very complicated. This parser is still in an early stage and not in use in production anywhere yet. When it’s ready, it’ll produce reliable JSON files containing the law of the United States in a sensible, reusable form.

Does Github’s organization structure makes a data commons possible?

Eric Mill: Github deliberately aligns its interests with the open source community, so it is possible to host all of our code and data there for free. Github offers unlimited public repositories, collaborators, bandwidth, and disk space to organizations and users at no charge. They do this while being an extremely successful, profitable business.

On Github, there are two types of accounts: users and organizations. Organizations are independent entities, but no one has to log in as an organization or share a password. Instead, at least one user will be marked as the “owner” of an organization. Ownership can easily change hands or be distributed amongst various users. This means that Josh, Derek, and I can all have equal ownership of the “unitedstates” repositories and data. Any of us can extend that ownership to anyone we want in a simple, secure way, without password sharing.

Github as a company has established both a space and a culture that values the commons. All software development work, from hobbyist to non-profit to corporation, from web to mobile to enterprise, benefits from a foundation of open source code. Github is the best living example of this truth, so it’s not surprising to me that it was the best fit for our work.

Why is this important to the public?

Eric Mill: The work and artifacts of our government should be available in bulk, for easy download, in accessible formats, and without license restrictions. This is a principle that may sound important and obvious to every technologist out there, but it’s rarely the case in practice. When it is, it’s usually a mixed bag. Not every member of the public will be able to or will want to interact directly with our data or scrapers. That’s fine. Developers are the force multipliers of public information. Every citizen can benefit somehow from what a developer can build with government information.

Related:

November 26 2012

Investigating data journalism

Great journalism has always been based on adding context, clarity and compelling storytelling to facts. While the tools have improved, the art is the same: explaining the who, what, where, when and why behind the story. The explosion of data, however, provides new opportunities to think about reporting, analysis and publishing stories.

As you may know, there’s already a Data Journalism Handbook to help journalists get started. (I contributed some commentary to it). Over the next month, I’m going to be investigating the best data journalism tools currently in use and the data-driven business models that are working for news startups. We’ll then publish a report that shares those insights and combines them with our profiles of data journalists.

Why dig deeper? Getting to the heart of what’s hype and what’s actually new and noteworthy is worth doing. I’d like to know, for instance, whether tutorials specifically designed for journalists can be useful, as Joe Brockmeier suggested at ReadWrite. On a broader scale, how many data journalists are working today? How many will be needed? What are the primary tools they rely upon now? What will they need in 2013? Who are the leaders or primary drivers in the area? What are the most notable projects? What organizations are embracing data journalism, and why?

This isn’t a new interest for me, but it’s one I’d like to ground in more research. When I was offered an opportunity to give a talk at the second International Open Government Data Conference at the World Bank this July, I chose to talk about open data journalism and invited practitioners on stage to share what they do. If you watch the talk and the ensuing discussion in the video below, you’ll pick up great insight from the work of the Sunlight Foundation, the experience of Homicide Watch and why the World Bank is focused on open data journalism in developing countries.

The sites and themes that I explored in that talk will be familiar to Radar readers, focusing on the changing dynamic between the people formerly known as the audience and the editors, researchers and reporters who are charged with making sense of the data deluge for the public good. If you’ve watched one of my Ignites or my Berkman Center talk, much of this won’t be new to you, but the short talk should be a good overview of where I think this aspect of data journalism is going and why I think it’s worth paying attention to today.

For instance, at the Open Government Data Conference Bill Allison talked about how open data creates government accountability and reveals political corruption. We heard from Chris Amico, a data journalist who created a platform to help a court reporter tell the story of every homicide in a city. And we heard from Craig Hammer how the World Bank is working to build capacity in media organizations around the world to use data to show citizens how and where borrowed development dollars are being spent on their behalf.

The last point, regarding capacity, is a critical one. Just as McKinsey identified a gap between available analytic talent and the demand created by big data, there is a data science skills gap in journalism. Rapidly expanding troves of data are useless without the skills to analyze them, whatever the context. An overemphasis on technical skills could exclude the best candidates for these jobs — but there will need to be training to build those skills.

This reality hasn’t gone unnoticed by foundations or the academy. In May, the Knight Foundation gave Columbia University $2 million for research to help close the data science skills gap. (I expect to be talking to Emily Bell, Jonathan Stray and the other instructors and students.)

Media organizations must be able to put data to work, a need that was amply demonstrated during Hurricane Sandy, when public open government data feeds became critical infrastructure.

What I’d like to hear from you is what you see working around the world, from the Guardian to ProPublica, and what you’re working on, and where. To kick things off, I’d like to know which organizations are doing the most innovative work in data journalism.

Please weigh in through the comments or drop me a line at alex@oreilly.com or at @digiphile on Twitter.

November 02 2012

Charging up: Networking resources and recovery after Hurricane Sandy

Even though the direct danger from Hurricane Sandy has passed, lower Manhattan and many parts of Connecticut and New Jersey remain disaster zones, with millions of people still without power, reduced access to food and gas, and widespread damage from flooding. As of yesterday, according to reports from the Wall Street Journal, thousands of residents remain in high-rise buildings with no water, power or heat.

E-government services are in heavy demand, from registering for disaster aid to finding resources, like those offered by the Office of the New York City Advocate. People who need to find shelter can use the Red Cross shelter app. FEMA has set up a dedicated landing page for Hurricane Sandy and a direct means to apply for disaster assistance.

Public officials have embraced social media during the disaster as never before, sharing information about where to find help.

No power and diminished wireless capacity, however, mean that the Internet is not accessible in many homes. In the post below, learn more about what you can do on the ground to help and how you can contribute online.

For those who have lost power, using Twitter offline to stay connected to those updates is useful — along with using weather radios.

That said, for those that can get connected on mobile devices, there are digital resources emerging, from a crowdsourced Sandy coworking map in NYC to an OpenTrip Planner app for navigating affected transit options. This Google Maps mashup shows where to find food, shelter and charging stations in Hoboken, New Jersey.

In these conditions, mobile devices are even more crucial connectors to friends, family, services, resources and information. With that shift, government websites must be more mobile-friendly and offer ways to get information through text messaging.

Widespread power outages also mean that sharing the means to keep devices charged is now an act of community and charity.

Ways to help with Sandy relief

A decade ago, if there was a disaster, you could donate money and blood. In 2012, you can also donate your time and skills. New York Times blogger Jeremy Zillar has compiled a list of hurricane recovery and disaster recovery resources. The conditions on the ground also mean that finding ways to physically help matters.

WNYC has a list of volunteer options around NYC. The Occupy Wall Street movement has shifted to “Occupy Sandy,” focusing on getting volunteers to help pick up and deliver food in neighborhoods around New York City. As Nick Judd reported for TechPresident, this “people-powered recovery” relies on volunteers to process incoming offers of help and requests for aid.

They’re working with Recovers.org, a new civic startup, which has now registered some 5,000 volunteers from around the New York City area. Recovers is pooling resources and supplies with community centers and churches to help in affected communities.

If you want to help but are far away from directly volunteering in New York, Connecticut or New Jersey, there are several efforts underway to volunteer online, including hackathons around the world tomorrow. Just as open government data feeds critical infrastructure during disasters, it is also integral to recovery and relief. To make that data matter to affected populations, however, the data must be put to use. That’s where the following efforts come in.

“There are a number of ways tech people can help right now,” commented Gisli Olafsson, Emergency Response Director at NetHope, reached via email. “The digital volunteer communities are coordinating many of those efforts over a Skype chat group that we established a few days before Sandy arrived. I asked them for input and here are their suggestions:

  1. Sign up and participate in the crisis camps that are being organized this weekend at Geeks Without Borders and Sandy Crisis Camp.
  2. Help create visualizations and fill in the map gaps. Here is a link to all the maps we know about so far. Help people find out what map to look at for x,y,z.
  3. View damage photos to help rate damage assessments at Sandy OpenStreetMap. There are over 2000 images to identify and so far over 1000 helpers.”

Currently, there are Crisis Camps scheduled for Boston, Portland, Washington (DC), Galway (Ireland), San Francisco, Seattle, Auckland (NZ) and Denver, at RubyCon.

“If you are in any of those cities, please go to the Sandy CrisisCamp blog post and sign up for the EventBrite for the CrisisCamp you want to attend in person or virtually,” writes Chad Catacchio (@chadcat), Crisis Commons communication lead.

“If you want to start a camp in your city this weekend, we are still open to the idea, but time is running short (it might be better to aim for next week),” he wrote.

UPDATE: New York-based nonprofit DataKind tweeted that they’re trying to rally the NY Tech community to pitch in, in real life, on Saturday and linked to a new Facebook group. New York’s tech volunteers have already been at work helping city residents over the last 24 hours, with the New York Tech Meetup organizing hurricane recovery efforts.

People with technical skills in the New York area who want to help can volunteer online here and check out the NY Tech responds blog.

As Hurricane Sandy approached, hackers built tools to understand the storm. Now that it’s passed, “Hurricane Hackers” are working on projects to help with the recovery. The crisis camp in Boston will be hosted at the MIT Media Lab by Hurricane Hackers this weekend.

Sandy Crisis Camps already have several projects in the works. “We have been asked by FEMA to build and maintain a damage assessment map for the entire state of Rhode Island,” writes Catacchio. He continues:

“We will also be assisting in monitoring social media and other channels and directing reports to FEMA there. We’ll be building the map using ArcGIS and will be needing a wide range of skill sets from developers to communications to mapping. Before the weekend, we could certainly use some help from ArcGIS folks in getting the map ready for reporting, so if that is of interest, please email Pascal Schuback at pascal@crisiscommons.org. Secondly, there has been an ask by NYU and the consortium of colleges in NYC to help them determine hotel capacity/vacancy as well as gas stations that are open and serving fuel. If other official requests for aid come in, we will let the community know. Right now, we DO anticipate more official requests, and again, if you are working with the official response/recovery and need tech support assistance, please let us know: email either Pascal or David Black at david@crisiscommons.org. We are looking to have a productive weekend of tackling real needs to help the helpers on the ground serving those affected by this terrible storm.”


October 31 2012

NYC’s PLAN to alert citizens to danger during Hurricane Sandy

Starting at around 8:36 PM ET last night, as Hurricane Sandy began to flood the streets of lower Manhattan, many New Yorkers began to receive an unexpected message: a text alert on their mobile phones strongly urging them to seek shelter. It showed up on iPhones and Android devices alike.

While the message was clear enough, the way that these messages ended up on the screens may not have been clear to recipients or observers. And still other New Yorkers were left wondering why emergency alerts weren’t on their phones.

Here’s the explanation: the emergency alerts that went out last night came from New York’s Personal Localized Alerting Network, the “PLAN” the Big Apple launched in late 2011.

NYC chief digital officer Rachel Haot confirmed that the messages New Yorkers received last night were the result of a public-private partnership between the Federal Communications Commission, the Federal Emergency Management Agency, the New York City Office of Emergency Management (OEM), the CTIA and wireless carriers.

While the alerts may look quite similar to text messages, they are delivered over a parallel channel, which lets them get through even when text traffic is congested. NYC’s PLAN is the local version of the Commercial Mobile Alert System (CMAS) that has been rolling out nationwide over the last year.

“This new technology could make a tremendous difference during disasters like the recent tornadoes in Alabama where minutes – or even seconds – of extra warning could make the difference between life and death,” said FCC chairman Julius Genachowski, speaking last May in New York City. “And we saw the difference alerting systems can make in Japan, where they have an earthquake early warning system that issued alerts that saved lives.”

NYC was the first city to have it up and running, last December, and less than a year later, the alerts showed up where and when they mattered.

The first such message I saw shared by a New Yorker actually came on October 28th, when the chief digital officer of the Columbia Journalism School, Sree Sreenivasan, tweeted about receiving the alert.

He tweeted out the second alert he received, on the night of the 29th, as well.

These PLAN alerts go out to everyone with an enabled mobile device in a targeted geographic area, letting emergency management officials at the state and local level get an alert to the right people at the right time. And in an emergency like a hurricane, earthquake or fire, connecting affected residents to critical information at the right time and place is essential.

While the government texting him gave national security writer Marc Ambinder some qualms about privacy, the way the data is handled looks much less disconcerting than, say, needing to opt out of sharing location data or wireless wiretapping.

PLAN alerts are free and automatic, unlike opt-in messages from Notify NYC or signing up for email alerts from OEM.

Not all New Yorkers received an emergency alert during Sandy because not all mobile devices have the necessary hardware installed or the relevant software updates. In May 2011, new iPhones and Android devices already had the chip. (Most older phones, not so much.)

These alerts don’t go out for minor issues, either: the system is only used by authorized state, local or national officials during public safety emergencies. Officials send the alert to CMAS, where it is authenticated, and the system then pushes it out to all enabled devices in the targeted geographic area.
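
To illustrate that flow in code, here is a toy model: an authorized sender submits an alert, the system authenticates the sender, and the message is pushed to every enabled device registered in the targeted area. This is purely illustrative Python, not CMAS software; the sender IDs, device records, and area names are invented.

```python
from dataclasses import dataclass

# Toy model of geo-targeted alerting: authorized officials submit an alert,
# the system authenticates the sender, then pushes the message to every
# enabled device registered in the targeted area. Purely illustrative.

AUTHORIZED_SENDERS = {"nyc-oem"}  # hypothetical sender IDs

@dataclass
class Device:
    owner: str
    area: str          # e.g. a county or cell-broadcast zone
    cmas_enabled: bool

def send_alert(sender: str, area: str, message: str, devices: list[Device]) -> int:
    if sender not in AUTHORIZED_SENDERS:
        raise PermissionError("sender is not an authorized alert originator")
    delivered = 0
    for device in devices:
        if device.cmas_enabled and device.area == area:
            print(f"[ALERT to {device.owner}] {message}")
            delivered += 1
    return delivered

devices = [
    Device("resident-a", "lower-manhattan", True),
    Device("resident-b", "lower-manhattan", False),  # older phone, no chip
    Device("resident-c", "queens", True),
]
send_alert("nyc-oem", "lower-manhattan", "Flooding imminent. Seek shelter.", devices)
```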

Consumers receive only three types of messages: alerts issued by the President, Amber Alerts, and alerts involving “imminent threats to safety or life.” The last category covers the ones that went out about Hurricane Sandy in NYC last night.

According to the FCC, participating mobile carriers can allow their subscribers to block all but Presidential alerts, although it may be a little complicated to navigate a website or call center to do so. By 2014, every mobile phone sold in the United States must be CMAS-capable. (You can learn more about CMAS in this PDF). Whether such mobile phones should be subsidized for the poor is a larger question that will be left to the next administration.

As more consumers replace their devices in the years ahead, more people around the United States will also be able to receive these messages, benefiting from a public-private partnership that actually worked to deliver on improved public safety.

At least one New Yorker got the message and listened to it:

“If ‘act’ means stay put, then why yes I did,” tweeted Noreen Whysel, operations manager at the Information Architecture Institute. “It was enough to convince my husband from going out….”

Here’s hoping New York City doesn’t have to use this PLAN to tell her and others about an impending disaster again soon.

October 19 2012

San Francisco looks to tap into the open data economy

As interest in open data continues to grow around the world, cities have become laboratories for participatory democracy. They’re also ground zero for new experiments in spawning civic startups that deliver city services or enable new relationships between the people and city government. San Francisco was one of the first municipalities in the United States to embrace the city as a platform paradigm in 2009, with the launch of an open data platform.

Years later, the city government is pushing to use its open data to accelerate economic development. On Monday, San Francisco announced revised open data legislation to enable that change and highlighted civic entrepreneurs who are putting the city’s data to work in new mobile apps.

City staff have already published the revised open data legislation on GitHub. (If other cities want to “fork” it, clone away.) David Chiu, the chairman of the San Francisco Board of Supervisors, the city’s legislative body, introduced the new version on Monday and submitted it on Tuesday. A vote is expected before the end of the year.

Speaking at the offices of the Hatchery in San Francisco, Chiu observed that, by and large, the data that San Francisco has put out showed the city in a positive light. In the future, he suggested, that should change. Chiu challenged the city and the smartest citizens of San Francisco to release more data, figure out where the city could take risks, be more entrepreneurial and use data to hold the city accountable. In his remarks, he said that San Francisco is working on open budgeting but is still months away from getting the data that they need.

Rise of the CDO

This new version of the open data legislation will create a chief data officer (CDO) position, assign coordinators for open data in each city department, and make it clear in procurement language that the city owns data and retains access to it.

“Timelines, mandates and especially the part about getting them to inventory what data they collect are all really good,” said Luke Fretwell, founder of Govfresh, which covers open government in San Francisco. “It’s important that’s in place. Otherwise, there’s no way to be accountable. Previous directives didn’t do it.”

The city’s new CDO will “be responsible for sharing city data with the public, facilitating the sharing of information between City departments, and analyzing how data sets can be used to improve city decision making,” according to the revised legislation.

In creating a CDO, San Francisco is running a play from the open data playbooks of Chicago and Philadelphia. (San Francisco’s new CDO will be a member of the mayor’s staff in the budget office.) Moreover, the growth of CDOs around the country confirms the newfound importance of civic data in cities. If open government data is to be a strategic asset that can be developed for the public good, civic utility and economic value, it follows that it needs better stewards.

Assigning a coordinator in each department is also an acknowledgement that open data consumers need a point of contact and accountability. In theory, this could help create better feedback loops between the city and the cohort of civic entrepreneurs that this policy is aimed at stimulating.

Who owns the data?

San Francisco’s experience with NextBus and a conflict over NextMuni real-time data is a notable case study for other cities and states that are considering similar policies.

The revised legislation directs the Committee on Information Technology (COIT) to, within 60 days from the passage of the legislation, enact “rules for including open data requirements in applicable City contracts and standard contract provisions that promote the City’s open data policies, including, where appropriate, provisions to ensure that the City retains ownership of City data and the ability to post the data on data.sfgov.org or make it available through other means.”

That language makes it clear that it’s the city that owns city data, not a private company. That’s in line with a principle that open government data is a public good that should be available to the public, not locked up in a proprietary format or a for-pay database. There’s some nuance to the issue, in terms of thinking through what rights a private company that invests in acquiring and cleaning up government data holds, but the basic principle that the public should have access to public data is sound. The procurement practices in place will mean that any newly purchased system that captures structured data must have a public API.

Putting open data to work

Speaking at the Hatchery on Monday, Mayor Ed Lee highlighted three projects that each showcase open data put to use. The new Rec & Park app (iOS download), built by San Francisco-based startup Appallicious, enables citizens to find trails, dog parks, playgrounds and other recreational resources on a mobile device. “Outside” (iOS download), from San Francisco-based 100plus, encourages users to complete “healthy missions” in their neighborhoods. The third project, from mapping giant Esri, is a beautiful web-based visualization of San Francisco’s urban growth based upon open data from San Francisco’s planning departments.

The power of prediction

Over the past three years, transparency, accountability, cost savings and mobile apps have constituted much of the rationale for open data in cities. Now, San Francisco is renewing its pitch for the role of open data in job creation, increased efficiency and improved services.

Jon Walton, San Francisco’s chief information officer (CIO), identified two next steps for San Francisco in an interview earlier this year: working with other cities to create a federated model (now online at cities.data.gov) and using its own data internally to identify and solve issues. (San Francisco and cities everywhere will benefit from looking to New York City’s work with predictive data analytics.)

“We’re thinking about using data behind the firewalls,” said Walton. “We want to give people a graduated approach, in terms of whether they want to share data for themselves, to a department, to the city, or worldwide.”

On that count, it’s notable that Mayor Lee is now publicly encouraging more data sharing between private companies that are collecting data in San Francisco. As TechCrunch reported, the San Francisco government quietly passed a new milestone when it added to its open data platform private-sector datasets on pedestrian and traffic movement collected by Motionloft.

“This gives the city a new metric on when and where congestion happens, and how many pedestrians and vehicles indicate a slowdown will occur,” said Motionloft CEO Jon Mills, in an interview.

Mills sees opportunities ahead to apply predictive data analytics to life and death situations by providing geospatial intelligence for first responders in the city.

“We go even further when police and fire data are brought in to show the relation between emergency situations and our data,” he said. “What patterns cause emergencies in different neighborhoods or blocks? We’ll know, and the city will be able to avoid many horrible situations.”

Such data-sharing could have a real impact on department bottom lines: while “Twitter311” created a lot of buzz in the social media world, access to real-time transit data is estimated to have saved San Francisco more than $1 million a year by reducing the volume of 311 calls by 21.7%.

Open data visualization can also enable public servants to understand how city residents are interacting and living in an urban area. For instance, a map of San Francisco pedestrian injuries shows high-injury corridors that merit more attention.

Open data and crowdsourcing will not solve all IT ills

While San Francisco was an early adopter of open data, that investment hasn’t changed an underlying reality: the city government remains burdened by a legacy of dysfunctional tech infrastructure, as detailed in a report issued in August 2012 by the City and County of San Francisco.

“San Francisco’s city-wide technology governing structure is ineffective and poorly organized, hampered by a hands-off Mayor, a weak Committee on Information Technology, an unreliable Department of Technology, and a departmentalized culture that only reinforces the City’s technological ineffectiveness,” state the report’s authors.

San Francisco government has embraced technologically progressive laws and rhetoric, but hasn’t always followed through on them, from setting deadlines to reforming human resources, code sharing or procurement.

“Departments with budgets in the tens of millions of dollars — including the very agency tasked with policing government ethics — still have miles to go,” commented Adriel Hampton, a Gov 2.0 advocate and former San Francisco government staffer, in an interview earlier this year.

Hampton, who has turned his advocacy to legal standards for open data in California and to working at Nationbuilder, a campaign software startup, says that San Francisco has used technology “very poorly” over the past decade. While he credited the city’s efforts in mobile government and recent progress on open data, the larger system is plagued with problems that are endemic in government IT.

Hampton said the city’s e-government efforts largely remain in silos. “Lots of departments have e-services, but there has been no significant progress in integrating processes across departments, and some agencies are doing great while others are a mess,” commented Hampton. “Want to do business in SF? Here’s a sea of PDFs.”

The long-standing issues here go beyond policy, in his view. “San Francisco has a very fragmented IT structure, where the CIO doesn’t have real authority, and proven inability to deliver on multi-departmental IT projects,” he said. As an example, Hampton pointed to San Francisco’s Justice Information Tracking System, a $25 million, 10-year project that has made some progress, but still has not been delivered.

“The City is very good at creating feel-good requirements for its vendors that simply result in compliant companies marking up and reselling everything from hardware to IT software and services,” he commented. “This makes for not only higher costs and bureaucratic waste, but huge openings for fraud. Contracting reform was the number one issue identified in the ImproveSF employee ideation exercise in 2010, but it sure didn’t make the press release.”

Hampton sees the need for two major reforms to keep San Francisco on a path to progress: empowering the CIO position with more direct authority over departmental IT projects, and reforming how San Francisco procures technology, an issue he says affects all other parts of the IT landscape. The reason city IT is so bad, he says, is that it’s run by a 13-member council. “[The] poor CIO’s hardly got a shot.”

All that said, Hampton gives David Chiu and San Francisco city government high marks for their recent actions. “Bringing in Socrata to power the open data portal is a solid move and shows commitment to executing on the open data principle,” he said.

While catalyzing more civic entrepreneurship is important, creating enduring structural change in how San Francisco uses technology will require improving how the city government collects, stores, consumes and releases data, along with how it procures, governs and builds upon technology.

On that count, Chicago’s experience may be relevant. Efforts to open government data there have led to both progress and direction, as Chicago CTO John Tolva blogged in January:

“Open data and its analysis are the basis of our permission to interject the following questions into policy debate: How can we quantify the subject-matter underlying a given decision? How can we parse the vital signs of our city to guide our policymaking? … It isn’t just app competitions and civic altruism that prompts developers to create applications from government data. 2011 was the year when it became clear that there’s a new kind of startup ecosystem taking root on the edges of government. Open data is increasingly seen as a foundation for new businesses built using open source technologies, agile development methods, and competitive pricing. High-profile failures of enterprise technology initiatives and the acute budget and resource constraints inside government only make this more appealing.”

Open data and job creation?

While realizing internal efficiencies and cost savings is a key requirement for city CIOs, it doesn’t hold the political cachet of new jobs and startups, particularly in an election year. San Francisco is now explicitly connecting its release of open data to jobs.

“San Francisco’s open data policies are creating jobs, improving our city and making it easier for residents and visitors to communicate with government,” commented Mayor Lee, via email.

Lee is optimistic about the future, too: “I know that, at the heart of this data, there will be a lot more jobs created,” he said on Monday at the Hatchery.

Open data’s potential for job creation is also complemented by its role as a raw material for existing businesses. “This legislation creates more opportunities for the Esri community to create data-driven decision products,” said Bronwyn Agrios, a project manager at Esri, in an interview.

Esri, however, as an established cloud mapping giant, is in a different position than startups enabled by open data. Communications strategist Brian Purchia, the former new media director for former San Francisco Mayor Gavin Newsom, points to Appallicious.

Appallicious “would not have been possible without [San Francisco’s] open data efforts,” said Purchia. “They have hired about 10 folks and are looking to expand to other cities.”

The startup’s software drives the city’s new Rec & Park app, including the potential to enable mobile transactions in the next iteration.

“Motionloft will absolutely grow from our involvement in San Francisco open data,” said Motionloft CEO Mills. “By providing some great data and tools to the city of San Francisco, it enables Motionloft to develop solutions for other cities and government agencies. We’ll be hiring developers, sales people, and data experts to keep up with our plans to grow this nationwide, and internationally.”

The next big question for these startups, as with so many others in nearby Silicon Valley, is whether their initial successes can scale. For that to happen for startups that depend upon government data, other cities will not only need to open up more data, they’ll need to standardize it.

Motionloft, at least, has already moved beyond the Bay Area, although other cities haven’t incorporated its data yet. Esri, as a major enterprise provider of proprietary software to local governments, has some skin in this game.

“City governments are typically using Esri software in some capacity,” said Agrios. “It will certainly be interesting to see how geo data standards emerge given the rapid involvement of civic startups eagerly consuming city data. Location-aware technologists on both sides of the fence, private and public, will need to work together to figure this out.”

If the marketplace for civic applications based upon open data develops further, it could help with a key issue that has dogged the results of city app contests: sustainability. It could also help with a huge problem for city governments: the cost of providing e-services to more mobile residents as budgets continue to tighten.

San Francisco CIO Walton sees an even bigger opportunity for the growth of civic apps that go far beyond the Bay Area, if cities can coordinate their efforts.

“There’s lots of potential here,” Walton said. “The challenge is replicating successes like Open311 in other verticals. If you look at the grand scale of time, we’re just getting started. For instance, I use Nextbus, an open source app that uses San Francisco’s open data … If I have Nextbus on my phone, when I get off a plane in Chicago or New York City, I want to be able to use it there, too. I think we can achieve that by working together.”
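
Open311’s GeoReport v2 specification is a concrete example of the kind of standard Walton describes: a city publishes a small web API, and the same client code works in any city that implements it. Below is a minimal Python sketch against a placeholder base URL; the endpoint shown is an assumption, so substitute the URL a given city actually publishes.

```python
import json
import urllib.request

# Query an Open311 GeoReport v2 endpoint for the service types a city
# accepts (e.g. pothole, graffiti). The base URL below is a placeholder;
# substitute the endpoint published by the city you're targeting.
BASE_URL = "https://open311.example-city.gov/v2"  # assumption, not a real endpoint

def list_services(base_url: str) -> list:
    """Return the service catalog defined by the GeoReport v2 spec."""
    with urllib.request.urlopen(f"{base_url}/services.json") as resp:
        return json.load(resp)

if __name__ == "__main__":
    for service in list_services(BASE_URL):
        print(service["service_code"], "-", service["service_name"])
```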

If a national movement toward open data and civic apps gathers more momentum, perhaps we’ll solve a perplexing problem, mused Walton.

“In a sense, we have transferred the intellectual property for apps to the public,” he said. “On one hand, that’s great, but I’m always concerned about what happens when an app stops working. By creating data standards and making apps portable, we will create enough users so that there’s enough community to support an application.”


October 17 2012

Data from health care reviews could power “Yelp for health care” startups

Given where my work and health have taken me this year, I’ve been thinking much more about the relationship of the Internet and health data to accountability and patient-driven health care.

When I was looking for a place in Maine to go for care this summer, I went online to look at my options. I consulted hospital data from the government at HospitalCompare.HHS.gov and patient feedback data on Yelp, and then made a decision based upon proximity and those ratings. If I had been closer to where I live in Washington D.C., I would also have consulted friends, peers or neighbors for their recommendations of local medical establishments.

My brush with needing to find health care when I was far from home reminded me of the prism that collective intelligence can now provide for the treatment choices we make, if we have access to the Internet.

Patients today are sharing more of their health data and experiences online voluntarily, which in turn means that the Internet is shaping health care. There’s a growing phenomenon of “e-patients” and caregivers going online to find communities and information about illness and disability.

Aided by search engines and social media, newly empowered patients are discussing health conditions with others suffering from disease and sickness — and they’re taking that peer-to-peer health care knowledge into their doctors’ offices with them, frequently on mobile devices. E-patients are sharing their health data of their own volition because they have a serious health condition, want to get healthy, and are willing to do so.

From the perspective of practicing physicians and hospitals, the trend of patients contributing to and consulting on online forums adds the potential for errors, fraud, or misunderstanding. And yet, I don’t think there’s any going back from a networked future of peer-to-peer health care, any more than we can turn back the dial on networked politics or disaster response.

What’s needed in all three of these areas is better data that informs better data-driven decisions. Some of that data will come from industry, some from government, and some from citizens.

This fall, the Obama administration proposed a system for patients to report medical mistakes. The system would create a new “consumer reporting system for patient safety” that would enable patients to tell the federal government about unsafe practices or errors. This kind of review data, if validated by government, could be baked into the next generation of consumer “choice engines,” adding another layer for people, like me, searching for care online.

There are precedents for the collection and publishing of consumer data, including the Consumer Product Safety Commission’s public complaint database at SaferProducts.gov and the Consumer Financial Protection Bureau’s complaint database. Each met with initial resistance from industry but has successfully gone online without massive abuse or misuse, at least to date.

It will be interesting to see how medical associations, hospitals and doctors react. Given that such data could amount to the government collecting data relevant to thousands of “Yelps for health care,” there’s both potential and reason for caution. Health care is a bit different than product safety or consumer finance, particularly with respect to how a patient experiences or understands his or her treatment or outcomes for a given injury or illness. For those who support or oppose this approach, there is an opportunity for public comment on the proposed data collection in the Federal Register.

The power of performance data

Combining patient review data with government-collected performance data could be quite powerful, helping to drive better decisions and adding more transparency to health care.

In the United Kingdom, officials are keen to find the right balance between open data, transparency and prosperity.

“David Cameron, the Prime Minister, has made open data a top priority because of the evidence that this public asset can transform outcomes and effectiveness, as well as accountability,” said Tim Kelsey, in an interview this year. He used to head up the United Kingdom’s transparency and open data efforts and now works at its National Health Service.

“There is a good evidence base to support this,” said Kelsey. “Probably the most famous example is how, in cardiac surgery, surgeons on both sides of the Atlantic have reduced the number of patient deaths through comparative analysis of their outcomes.”

More data collected by patients, advocates, governments and industry could help to shed light on the performance of more physicians and clinics engaged in other expensive and lifesaving surgeries and associated outcomes.

Should that be extrapolated across the medical industry, it’s a safe bet that some medical practices or physicians will use whatever tools or legislative influence they have to fight or discredit websites, services or data that puts them in a poor light. This might parallel the reception that BrightScope’s profiles of financial advisors have received in industry.

When I talked recently with Dr. Atul Gawande about health data and care givers, he said more transparency in these areas is crucial:

“As long as we are not willing to open up data to let people see what the results are, we will never actually learn. The experience of what happens in fields where the data is open is that it’s the practitioners themselves that use it.”

In that context, health data will be the backbone of the disruption in health care ahead. Part of that change will necessarily have to come from health care entrepreneurs and watchdogs connecting code to research. In the future, a move to open science and perhaps establish a health data commons could accelerate that change.

The ability of caregivers and patients alike to make better data-driven decisions is limited by access to data. To make a difference, that data will also need to be meaningful to both the patient and the clinician, said Dr. Gawande. He continued:

“[Health data] needs to be able to connect the abstract world of data to the physical world of what really happens, which means it has to be timely data. A six-month turnaround on data is not great. Part of what has made Wal-Mart powerful, for example, is they took retail operations from checking their inventory once a month to checking it once a week and then once a day and then in real-time, knowing exactly what’s on the shelves and what’s not. That equivalent is what we’ll have to arrive at if we’re to make our systems work. Timeliness, I think, is one of the under-recognized but fundamentally powerful aspects because we sometimes over prioritize the comprehensiveness of data and then it’s a year old, which doesn’t make it all that useful. Having data that tells you something that happened this week, that’s transformative.”

Health data, in other words, will need to be open, interoperable, timely, higher quality, baked into the services that people use, and put at the fingertips of caregivers, as US CTO Todd Park has explained.

There is more that needs to be done than simply putting “how to live better” information online or into an app. To borrow a phrase from Robert Kirkpatrick, for data to change health care, we’ll need to apply the wisdom of the crowds, the power of algorithms and the intuition of experts to find meaning in health data and help patients and caregivers alike make better decisions.

That isn’t to say that health data, once published, can’t be removed or filtered. Witness the furor over the removal of a malpractice database from the Internet last year, along with its restoration.
But as more data about doctors, services, drugs, hospitals and insurance companies goes online, the ability of those institutions to control public perception will shift, just as it has for government and media. Given flaws in devices or poor outcomes, patients deserve such access, accountability and insight.

Enabling better health-data-driven decisions to happen across the world will be far from easy. It is, however, a future worth building toward.


October 03 2012

The missing ingredient from hyperwired debates: the feedback loop

What a difference a season makes. A few months after widespread online frustration with a tape-delayed Summer Olympics, the 2012 Presidential debates will feature the most online livestreams and wired, up-to-the-second digital coverage in history.

Given the pace of technological change, it’s inevitable that each election season will bring with it new “firsts,” as candidates and campaigns set precedents by trying new approaches and platforms. This election has been no different: the Romney and Obama campaigns have been experimenting with mobile applications, social media, live online video and big data all year.

Tonight, one of the biggest moments in the presidential campaign to date is upon us and there are several new digital precedents to acknowledge.

The biggest tech news is that YouTube, in a partnership with ABC, will stream the debates online for the first time. The stream will be on YouTube’s politics channel, and it will be embeddable.

With more and more livestreamed sports events, concerts and now debates available online, tuning in to what’s happening no longer means passively “watching TV.” The number of other ways people can tune in online in 2012 has skyrocketed, as you can see in GigaOm’s post listing debate livestreams or Mashable’s ways to watch the debates online.

This year, in fact, the biggest challenge people will have will not be finding an online alternative to broadcast or cable news but deciding which one to watch.

If you’re low on bandwidth or have a mobile device, NPR will stream the audio from the debate online and to its mobile apps. If you’re a Spanish speaker, Univision will stream the debates on YouTube with real-time translation.

The New York Times, Politico and the Wall Street Journal are all livestreaming the debates at their websites or through their apps, further eroding the line between broadcast, print and online media.

While the PBS News Hour and CSPAN’s debate hub are good options, my preference is for the Sunlight Foundation’s award-winning Sunlight Live liveblog.

There are a couple of other notable firsts. The Huffington Post will deploy its HuffPost Live platform for the first time, pulling more viewers directly into participatory coverage online.

For those looking for a more… animated approach, the Guardian and Tumblr will ‘live GIF’ the presidential debates.

Microsoft is livestreaming the debates through the Xbox, giving gamers an opportunity to weigh in on what they see. They’ll be polled through the Xbox console during the debate, which will provide more real-time data from a youthful demographic that, according to StrategyOne, still includes many voters who are not firmly committed.

Social politics

The political news cycle has long since moved from the morning papers and the nightly news to real-time coverage of events. In past years, the post-debate spin by campaigns and pundits shaped public opinion. This year, direct access to online video and to the reaction of friends, family, colleagues and media through the social web means that the spin will begin as soon as any quip, policy position or rebuttal is delivered in the debate.

Beyond real-time commentary, social media will provide useful data for the campaigns to analyze. While there won’t be a “do over,” seeing what resonated directly with the public will help the campaigns tune their messages for the next debates.

Tonight, when I go on Al Jazeera’s special debate night coverage at The Stream, I’ll be looking at a number of factors. I expect the #DenverDebate and #debates hashtags to be moving too fast to follow, so I’ll be looking at which tweets are being amplified and what we can see on Twitter’s new #debates page, what images are popping online, which links are popular, how Facebook and Google+ are reacting, and what people are searching for on Google.com.

This is quite likely to be the most social political event ever, surpassing either of the 2012 political conventions or the State of the Union address. When I watch online, I’ll be looking for what resonated with the public, not just what the campaigns are saying — although that will factor into my analysis. The @mittromney account tweets 1-2 times a day. Will they tweet more? Will @barackobama’s 19 million followers be engaged? How much and how often will they update Facebook, and to what effect?

Will they live tweet opening statements with links to policies? Will they link to rebuttals or fact checks in the media? Will they push people to go register or comment or share? Will they echo applause lines or attack lines? In a larger sense, will the campaigns themselves act social? Will they reshare the people’s posts about them on social platforms or keep broadcasting?

We’ll know answers to all of these questions in a few hours.

Fact-checking in real-time

Continuing a trend from the primary season, real-time fact-checking will play a role in the debate. The difference this time will be the pace and the number of players involved.

As Nick Judd highlighted at techPresident, the campaign response is going to be all about mobile. Both campaigns will be trying their hands at fact checking, using new adaptive microsites at barackobama.com/debate and debates.mittromney.com, dedicated Twitter accounts at @TruthTeam2012 and @RomneyResponse, and an associated subdomain and Tumblr.

Given the skin that campaigns have in the game, however, undecided or wavering voters are better off going with the Fourth Estate versions. Wired media organizations, like the newspapers streaming the debates I’ve listed above, will be using liveblogs and leveraging their digital readership to help fact check.

Notably, NPR senior social strategist Andy Carvin will be applying the same approach to fact checking during the debate as he has to covering the changes in the Middle East. To participate, follow @acarvin and use the #factcheck hashtag beginning at 8:30 ET.

It’s unclear whether debate moderator Jim Lehrer will tap into the fact-checking efforts online to push back on the candidates during the event. Then again, the wisdom of the crowds may be balanced by one man’s perspective. Given that he’s serving in that capacity for the 12th time, Lehrer possesses substantial experience of his own to draw upon in making his own decisions about when to press, challenge or revisit issues.

The rise of networked polities

In a larger sense, all of this interactivity falls far short of the promise of networked politics. In the age of the Internet, television debates look antiquated.

When it comes to how much the people are directly involved with the presidential debates of 2012, as Micah Sifry argued earlier this week, little has changed from 2008:

“Google is going to offer some kind of interactive audience dial gadget for YouTube users, which could allow for real-time audience feedback — except it’s already clear none of that feedback is going to get anywhere near the actual debate itself. As best as I can tell, what the CPD [Commission on Presidential Debates] is doing is little more than what they did four years ago, except back then they partnered with Myspace on a site called MyDebates.org that featured video streaming, on-demand playback and archival material. Oh, but this time the partner sites will include a dynamic counter showing how many people have ‘shared their voice’.”

While everyone who has access to the Internet will be able to use multiple screens to watch, read and participate in the conversation around the debates, the public isn’t going to be directly involved in the debate. That’s a missed opportunity that won’t be revisited until the 2016 campaign.

By then, the political landscape will be even more wired. While many politicians still delegate the direct use of social media to staffers, in late 2012 it ill behooves any office to be seen as technically backward by staying off social media entirely.

In the years ahead, open government advocates will push politicians to use the Internet to explain their votes, not just broadcast political attacks or campaign events. After all, the United States is a constitutional republic. Executives and Congressmen are obligated to listen to the people they represent. The existing ecosystem of social media platforms may give politicians new tools to interact directly with their constituents but they’re still relatively crude.

Yes, the next generation of social media data analytics will give politicians a dashboard of what their constituents think about their positions. It’s the next generation of polling. In the years to come, however, I’m optimistic that we’re going to see much better use of the Internet to hold politicians accountable for their campaign positions and subsequent votes.

Early experiments in creating an “OKCupid for elections” will evolve. Expect sophisticated choice engines that use social and legislative data to tell voters not only whether candidates share their positions but whether they actually voted or acted upon them. Over time, opposition candidates will be able to use that accumulated data in their campaign platforms and during debates. If a member of Congress or President doesn’t follow through with the wishes of the people, he or she will have to explain why. That will be a debate worth having.
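
As a rough sketch of what such a choice engine might compute, the Python below scores how often a legislator’s recorded votes line up with a voter’s stated positions. The issues, positions, and votes are invented for illustration; a real system would draw them from legislative and social data.

```python
# Toy "choice engine": score how often a legislator's recorded votes match
# a voter's stated positions. Issues and votes below are invented examples.

voter_positions = {"open-data-act": "yes", "surveillance-reform": "yes", "budget-cap": "no"}

legislator_votes = {
    "Rep. A": {"open-data-act": "yes", "surveillance-reform": "no", "budget-cap": "no"},
    "Rep. B": {"open-data-act": "no", "surveillance-reform": "no", "budget-cap": "yes"},
}

def agreement(positions: dict, votes: dict) -> float:
    shared = [issue for issue in positions if issue in votes]
    if not shared:
        return 0.0
    matches = sum(1 for issue in shared if positions[issue] == votes[issue])
    return matches / len(shared)

for name, votes in legislator_votes.items():
    print(f"{name}: {agreement(voter_positions, votes):.0%} agreement on recorded votes")
```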

September 28 2012

Four key trends changing digital journalism and society

It’s not just a focus on data that connects the most recent class of Knight News Challenge winners. They all are part of a distributed civic media community that works on open source code, collects and improves data, and collaborates across media organizations.

These projects are “part of an infrastructure that helps journalists better understand and serve their communities through data,” commented Chris Sopher, Knight Foundation Journalism Program Associate, in an interview last week. To apply a coding metaphor, the Knight Foundation is funding the creation of patches for the source code of society. This isn’t a new focus: in 2011, Knight chose to help build the newsroom stack, from editorial search engines to data cleaning tools.

Following are four themes that jumped out when I looked across the winners of the latest Knight News Challenge round.

Networked accountability

An intercontinental project that bridged citizen science, open data, open source hardware, civic hacking and the Internet of things to monitor, share and map radiation data? Safecast is in its own category. Adapting the system to focus on air quality in Los Angeles — a city that’s known for its smog — will be an excellent stress test for seeing if this distributed approach to networked accountability can scale.

If it does — and hacked Chumbys, LED signs, Twitter bots, smartphone apps and local media reports start featuring the results — open data is going to be baked into how residents of Los Angeles understand their own atmosphere. If this project delivers on some of its promise, the value of this approach will be clearer.

If this project delivers on all of its potential, the air itself might improve. For that to happen, the people who are looking at the realities of air pollution will need to advocate for policy makers to improve it. In the future, the success or failure of this project will inform similar efforts that seek to enlist communities in data collection, including whether governments embrace “citizensourcing” beyond natural disasters and crises. The idea of citizens as sensors continues to have legs.

Peer-to-peer collaboration, across newsrooms

As long as I’ve been reading newspapers, watching television news and following the industry, competition has always been part of the dynamic: be first to the scene, first to get the scoop, first to call the election. As the Internet has taken on a larger role in delivering the news, there have been new opportunities for competition in digital journalism: first to tweet, post or upload video, often followed by rewards from online traffic.

One (welcome) reality that jumps out in this series of Knight grants is that there are journalists from newsrooms that compete for stories who are collaborating on these projects independently. New York Times and Washington Post developers are teaming up to create an open election database. Data journalists from WNYC, the Chicago Tribune and the Spokesman-Review are collaborating on building a better interface for Census data. The same peer networks that helped build the Internet are forming around building out civic infrastructure. It’s an inspiring trend to watch.

The value of an open geo commons

The amount of consternation regarding Apple’s new mapping app for iOS 6 doesn’t seem to be dying down. It shouldn’t: David Pogue called the Apple Map app “an appalling first release,” and maybe “the most embarrassing, least usable piece of software Apple has ever unleashed.” It’s going to take a while for Apple Maps to improve — maybe even years, based upon how long it took for Google to improve maps. In the meantime, iPhone users can go to maps.google.com on Safari, along with the other third-party alternatives that Apple CEO Tim Cook recommended in his letter of apology.

In the wake of “#MAppleGate,” there’s suddenly a lot more attention being paid to the importance and value of mapping data, including how difficult it is to do maps right. And that’s where OpenStreetMap comes in. That’s also why the Knight Foundation is putting more than $500,000 behind tools from Development Seed: it will help to sustain and improve an open geo data commons that media organizations large and small can tap into to inform communities using maps.

“There are two ways the geo data space is going to evolve: 1) in closed silos of proprietary owned data or 2) in the open,” said Eric Gundersen, co-founder and CEO of Development Seed in a recent interview. “Our community does not need a fleet of cars driving millions of miles. We need good infrastructure to make it easy for people to map their surroundings and good community tools to help us garden the data and improve quality. As geo data becomes core to mobile, maps are a canvas to visualizing the ‘where’.”
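
One measure of how low that barrier already is on the open side: a few lines of Python with the folium library (which renders Leaflet maps on OpenStreetMap tiles) produce an embeddable, interactive map. The coordinates and output filename below are arbitrary choices for the sketch.

```python
import folium

# Render an interactive Leaflet map backed by OpenStreetMap tiles.
# The coordinates (downtown Los Angeles) and output filename are arbitrary.
m = folium.Map(location=[34.0522, -118.2437], zoom_start=13, tiles="OpenStreetMap")

# Drop a marker the way a newsroom might annotate a story location.
folium.Marker(
    [34.0522, -118.2437],
    popup="Example story location",
).add_to(m)

m.save("map.html")  # open in a browser; the page can be embedded in an article
```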

As with Wikipedia, there will be people who doubt whether an open source digital map revolution, enabled by MapBox, Development Seed’s open source mapping suite, will come to pass. Then again, how many people believed a decade ago that Wikipedia would grow into the knowledge repository it is today?

“We are trying to radically lower the barrier of entry to map making for organizations and activists,” Gundersen told me last April. Given that they’re up against Google in mapmaking, the relatively tiny DC startup is banking on OpenStreetMap looking more like Wikipedia than Google Knol in a few years.

“Open” is in

Open data is a common thread that connects the winners — but the openness doesn’t stop there. Open maps. Open source. Open government. Open journalism. That this theme has emerged as a strong pulse isn’t a tremendous surprise, given a global movement to apply technology to open government. Moreover, no one should take this to mean that immense amounts of business, society, technology, media and government aren’t still closed. Clearly, that’s not the situation. But there’s a strong case to be made that open is the way of the day.

Data won’t save the world, on its own. However, when data is applied for the public good and put to work, there are a growing number of examples that raise optimism about data’s role in the future of journalism.

Photo Credit: Eric Fisher

September 20 2012

Congress launches Congress.gov in beta, doesn’t open the data

The Library of Congress is now more responsive — at least when it comes to web design. Today, the nation’s repository for its laws launched a new beta website at Congress.gov and announced that it would eventually replace Thomas.gov, the 17-year-old website that represented one of the first significant forays online for Congress. The new website will educate a public increasingly looking for information on mobile devices about the lawmaking process, but it falls short of fully embracing the power of the Internet. (More on that later.)

Tapping into a growing trend in government new media, the new Congress.gov features responsive design, adapting to desktop, tablet or smartphone screens. It’s also search-centric: in an acknowledgement that most visitors show up looking for information, it offers Boolean search and puts a search field front and center in the interface. The site includes member profiles for U.S. Senators and Representatives, with associated legislative work. In a nod to a mainstay of social media and media websites, the new Congress.gov also has a “most viewed bills” list that lets visitors see at a glance which laws or proposals are gathering interest online. (You can download a fact sheet on all the changes as a PDF.)

On the one hand, the new Congress.gov is a dramatic update to a site that desperately needed one, particularly in a historic moment where citizens are increasingly connecting to the Internet (and one another) through their mobile devices.

On the other hand, the new Congress.gov beta has yet to realize the potential of Congress publishing bulk open legislative data. There is no application programming interface (API) for open government developers to build upon. In many ways, the new Congress.gov replicates what was already available to the public at sites like Govtrack.us and OpenCongress.org.

In response to my tweets about the site, former law librarian Meg Lulofs Kuhagan (@librarylulu) noted on Twitter that there’s “no data whatsoever, just window dressing” in the new site — but that “it looks good on my phone. More #opengov if you have a smartphone.”

Aaron E. Myers, the director of new media for Senate Majority Leader Harry Reid, commented on Twitter that legislative data is a “tough nut to crack,” with the text of amendments, SCOTUS votes and treaties missing from the new Congress.gov. In reply, Chris Carlson, the creative director for the Library of Congress, tweeted that that information is coming soon and that all the data currently in Thomas.gov will be available on Congress.gov.

Emi Kolawole, who reviewed the new Congress.gov for the Washington Post, reported that more information, including the categories Myers cited, will be coming to the site during its beta, including the Congressional Record and Index. Here’s hoping that Congress decides to publish all of its valuable Congressional Research Service reports, too. Currently, the public has to turn to OpenCRS.com to access that research.

Carlson was justifiably proud of the beta of Congress.gov: “The new site has clean URLs, powerful search, member pages, clean design,” he tweeted. “This will provide access to so many more people who only have a phone for internet.”

While the new Congress.gov is well designed and has the potential to lead to more informed citizens, the choice to build a new website versus release the data disappointed some open government advocates.

“Another hilarious/clueless misallocation of resources,” commented David Moore, co-founder of OpenCongress. “First liberate bulk open gov data; then open API; then website.”

“What’s noticeable about this evolving beta website, besides the major improvements in how people can search and understand legislative developments, is what’s still missing: public comment on the design process and computer-friendly bulk access to the underlying data,” wrote Daniel Schuman, legislative counsel for the Sunlight Foundation. “We hope that Congress will now deeply engage with the public on the design and specifications process and make sure that legislative information is available in ways that most encourage analysis and reuse.”

Kolawole asked Congressional officials about bulk data access and an API and heard that the capacity is there but the approval is not. “They said the system could handle it, but they haven’t received congressional auth. to do it yet,” she tweeted.

Vision and bipartisan support for open government on this issue do exist among Congressional leadership, and there has been progress on this front in the 112th Congress: the U.S. House started publishing machine-readable legislative data at docs.house.gov this past January.

“Making legislative data easily available in machine-readable formats is a big victory for open government, and another example of the new majority keeping its pledge to make Congress more open and accountable,” said Speaker of the House John Boehner.

Last December, House Minority Whip Steny Hoyer commented on how technology is affecting Congress, his caucus and open government in the executive branch:

For Congress, there is still a lot of work to be done, and we have a duty to make the legislative process as open and accessible as possible. One thing we could do is make THOMAS.gov — where people go to research legislation from current and previous Congresses — easier to use, and accessible by social media. Imagine if a bill in Congress could tweet its own status.

The data available on THOMAS.gov should be expanded and made easily accessible by third-party systems. Once this happens, developers, like many of you here today, could use legislative data in innovative ways. This will usher in new public-private partnerships that will empower new entrepreneurs who will, in turn, yield benefits to the public sector.

For any of that vision of civic engagement and entrepreneurship to happen around the Web, the Library of Congress will need to fully open up the data. Why hasn’t it happened yet, given bipartisan support and a letter from the Speaker of the House?

techPresident managing editor Nick Judd asked the Library of Congress about Congress.gov. The Library’s director of communications, Gayle Osterberg, suggested in an emailed response that Congress hasn’t been clear about how the data should be released.

“Congress has said what to do on bulk access,” commented Schuman. “See the joint explanatory statement. There is support for bulk access.”

In June 2012, the House’s leadership issued a bipartisan statement adopting the goal of “provid[ing] bulk access to legislative information to the American people without further delay,” placing the release of bulk data among its “top priorities in the 112th Congress” and directing a task force “to begin its important work immediately.”

The 112th Congress will come to a close soon. The Republicans swept into the House in 2010 promising a new era of innovation and transparency. If Speaker Boehner, Rep. Hoyer and their colleagues want to end these two divisive years on a high note, fully opening legislative data to the People would be an enduring legacy. Congressional leaders will need to work with the Library of Congress to make that happen.

All that being said, the new Congress.gov is in beta and looks dramatically improved. The digital infrastructure of the federal legislative system got a bit better today, moving towards a more adaptive government. Stay tuned, and give the Library of Congress (@LibraryCongress) some feedback: there’s a new button for it on every page.

This post has been updated with comments from Facebook, a link and reporting from techPresident, and a clarification from Daniel Schuman regarding the position of the House of Representatives.

September 13 2012

Growth of SMART health care apps may be slow, but inevitable

This week has been teeming with health care conferences, particularly in Boston, and was declared by President Obama to be National Health IT Week as well. I chose to spend my time at the second ITdotHealth conference, where I enjoyed many intense conversations with some of the leaders in the health care field, along with news about the SMART Platform at the center of the conference, the excitement of a Clayton Christensen talk, and the general panache of hanging out at the Harvard Medical School.

SMART, funded by the Office of the National Coordinator in Health and Human Services, is an attempt to slice through the Babel of EHR formats that prevent useful applications from being developed for patient data. Imagine if something like the wealth of mash-ups built on Google Maps (crime sites, disaster markers, restaurant locations) existed for your own health data. This is what SMART hopes to do. They can already showcase some working apps, such as overviews of patient data for doctors, and a real-life implementation of the heart disease user interface proposed by David McCandless in WIRED magazine.

The premise and promise of SMART

At this conference, the presentation that gave me the most far-reaching sense of what SMART can do was by Nich Wattanasin, project manager for i2b2 at Partners. His implementation showed SMART not just as an enabler of individual apps, but as an environment where a user could choose the proper app for his immediate needs. For instance, a doctor could use an app to search the database for patients matching certain characteristics, then select a particular patient and choose an app that exposes certain clinical information on that patient. In this way, SMART can combine the power of many different apps that were developed in an uncoordinated fashion and make a comprehensive data analysis platform from them.

Another illustration of the value of SMART came from lead architect Josh Mandel. He pointed out that knowing a child’s blood pressure means little until one runs it through a formula based on the child’s height and age. Current EHRs can show you the blood pressure reading, but none does the calculation that shows you whether it’s normal or dangerous. A SMART app has been developed to do that. (Another speaker claimed that current EHRs in general neglect the special requirements of child patients.)
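To make the point concrete, here is a minimal sketch in Python of the kind of normalization such an app performs. It is not the actual SMART app; the threshold table is a tiny, hypothetical stand-in for the published pediatric blood pressure reference charts, and the values are illustrative only.

    # Classify a child's systolic blood pressure against age- and
    # height-specific thresholds. THRESHOLDS is a hypothetical stand-in
    # for the full pediatric reference tables a real app would load.
    THRESHOLDS = {
        # (age in years, height percentile bucket) -> illustrative 95th-percentile systolic BP (mmHg)
        (8, "50th"): 112,
        (8, "95th"): 116,
        (12, "50th"): 119,
        (12, "95th"): 123,
    }

    def classify_systolic_bp(age_years, height_bucket, systolic_mmhg):
        """Return a rough label for a pediatric systolic reading."""
        limit = THRESHOLDS.get((age_years, height_bucket))
        if limit is None:
            return "no reference data for this age/height"
        if systolic_mmhg >= limit:
            return "at or above the 95th percentile -- flag for review"
        return "below the 95th percentile"

    # The same raw number means different things for different children:
    print(classify_systolic_bp(8, "50th", 118))   # flagged
    print(classify_systolic_bp(12, "95th", 118))  # unremarkable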

SMART is a close companion to the Indivo patient health record. Both of these, along with the i2b2 data exchange system, were covered in an article from an earlier conference at the medical school. Let’s see where platforms for health apps are headed.

How far we’ve come

As I mentioned, this ITdotHealth conference was the second to be held. The first took place in September 2009, and people following health care closely can be encouraged by reading the notes from that earlier instantiation of the discussion.

In September 2009, the HITECH act (part of the American Recovery and Reinvestment Act) had defined the concept of “meaningful use,” but nobody really knew what was expected of health care providers, because the ONC and the Centers for Medicare & Medicaid Services did not release their final Stage 1 rules until more than a year after this conference. Aneesh Chopra, then the Federal CTO, and Todd Park, then the CTO of Health and Human Services, spoke at the conference, but their discussion of health care reform was a “vision.” A surprisingly strong statement for patient access to health records was made, but speakers expected it to be accomplished through the CONNECT Gateway, because there was no Direct. (The first message I could find on the Direct Project forum dated back to November 25, 2009.) Participants had a sophisticated view of EHRs as platforms for applications, but SMART was just a “conceptual framework.”

So in some ways, the ONC, Harvard, and many other contributors to modern health care have accomplished an admirable amount over three short years. But in other ways we are frustratingly stuck. For instance, few EHR vendors offer API access to patient records, and existing APIs are proprietary. The only SMART implementation for a commercial EHR mentioned at this week’s conference was one created on top of the Cerner API by outsiders (although Cerner was cooperative). Jim Hansen of Dossia told me that there is little point in encouraging programmers to create SMART apps while the records are still behind firewalls.

Keynotes

I couldn’t call a report on ITdotHealth complete without an account of the two keynotes by Christensen and Eric Horvitz, although these took off in different directions from the rest of the conference and served as hints of future developments.

Christensen is still adding new twists to the theories laid out in The Innovator’s Dilemma and his other books. He has been a backer of the SMART project from the start and spoke at the first ITdotHealth conference. Consistent with his famous theory of disruption, he dismisses hopes that we can reduce costs by reforming the current system of hospitals and clinics. Instead, he projects the way forward through technologies that will enable less highly trained practitioners to successively take over tasks that used to be performed in higher-cost settings. Thus, nurse practitioners will be able to do more and more of what doctors do, primary care physicians will do more of what we currently delegate to specialists, and ultimately the patients and their families will treat themselves.

He also has a theory about the progression toward openness. Radically new technologies start out tightly integrated, and because they benefit from this integration they tend to be created by proprietary companies with high profit margins. As the industry comes to understand the products better, they move toward modular, open standards and become commoditized. Although one might conclude that EHRs, which have been around for some forty years, are overripe for open solutions, I’m not sure we’re ready for that yet. That’s because the problems the health care field needs to solve are quite different from the ones current EHRs solve. SMART is an open solution all around, but it could serve a marketplace of proprietary solutions and reward some of the venture capitalists pushing health care apps.

While Christensen laid out the broad environment for change in health care, Horvitz gave us a glimpse of what he hopes the practice of medicine will be in a few years. A distinguished scientist at Microsoft, Horvitz has been using machine learning to extract patterns from sets of patient data. For instance, in a collection of data about equipment use, ICD codes, vital signs, and so on from 300,000 emergency room visits, his team found variables that predicted a readmission within 14 days. Out of 10,000 variables, they found 500 that were relevant, but because the relational database was strained by retrieving so much data, they reduced the set to 23 variables to roll out as a product.
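As a rough illustration of that workflow (thousands of candidate variables winnowed down to a couple dozen for a deployable model), here is a hedged sketch using scikit-learn on synthetic data. It is not Horvitz’s pipeline; the dataset is invented, and the variable counts are scaled down so the example runs quickly.

    # Winnow a wide, mostly uninformative feature set down to 23 variables,
    # then fit a simple readmission-style classifier on the reduced set.
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for emergency-room visit records.
    X, y = make_classification(n_samples=5000, n_features=1000,
                               n_informative=25, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Keep only the 23 strongest variables, as a lightweight production model might.
    selector = SelectKBest(score_func=f_classif, k=23)
    X_train_small = selector.fit_transform(X_train, y_train)
    X_test_small = selector.transform(X_test)

    model = LogisticRegression(max_iter=1000).fit(X_train_small, y_train)
    print("held-out accuracy with 23 variables:", model.score(X_test_small, y_test))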

Another project predicted the likelihood of medical errors from patient states and management actions. This was meant to address a study claiming that most medical errors go unreported.

A study that would make the privacy-conscious squirm was based on the willingness of individuals to provide location data to researchers. The researchers tracked searches on Bing along with visits to hospitals and found out how long it took between searching for information on a health condition and actually going to do something about it. (Horvitz assured us that personally identifiable information was stripped out.)
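The underlying computation is straightforward to sketch, even if the real study’s scale and privacy safeguards were not: join de-identified search logs to visit records and measure the lag. The column names and sample data below are hypothetical; nothing here reflects the actual Bing dataset.

    import pandas as pd

    # Hypothetical de-identified logs: health-related searches and hospital visits.
    searches = pd.DataFrame({
        "user_id": [1, 1, 2],
        "search_time": pd.to_datetime(["2012-03-01", "2012-03-04", "2012-03-02"]),
    })
    visits = pd.DataFrame({
        "user_id": [1, 2],
        "visit_time": pd.to_datetime(["2012-03-10", "2012-03-20"]),
    })

    # Days between the earliest search and the subsequent visit, per user.
    first_search = searches.groupby("user_id", as_index=False)["search_time"].min()
    lag = first_search.merge(visits, on="user_id")
    lag["days_to_visit"] = (lag["visit_time"] - lag["search_time"]).dt.days
    print(lag[["user_id", "days_to_visit"]])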

His goal is to go beyond measuring known variables and to find new ones that could be hidden causes. But he warned that, as is often the case, causality is hard to prove.

As prediction turns up patterns, the data could become a “fabric” on which many different apps are based. Although Horvitz didn’t talk about combining data sets from different researchers, it’s clearly suggested by this progression. But proper de-identification and flexible patient consent become necessities for data combination. Horvitz also hopes to move from predictions to decisions, which he says is needed to truly move to evidence-based health care.

Did the conference promote more application development?

My impression (I have to admit I didn’t check with Dr. Ken Mandl, the organizer of the conference) was that this ITdotHealth aimed to persuade more people to write SMART apps, provide platforms that expose data through SMART, and contribute to the SMART project in general. I saw a few potential app developers at the conference, and a good number of people with their hands on data who were considering the use of SMART. I think they came away favorably impressed–maybe by the presentations, maybe by conversations that the meeting allowed them to have with SMART developers–so we may see SMART in wider use soon. Participants came far for the conference; I talked to one from Geneva, for instance.

The presentations were honest enough, though, to show that SMART development is not for the faint-hearted. On the supply side–that is, for people who have patient data and want to expose it–you have to create a “container” that presents data in the format expected by SMART. Furthermore, you must make sure the data conforms to industry standards, such as SNOMED for diagnoses. That can add up to a lot of conversion work.
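As a toy illustration of that conversion burden, here is a sketch that maps a purely hypothetical local diagnosis code onto SNOMED CT. A real site would need a terminology service and thousands of mappings; the crosswalk table and code values below are made up for illustration.

    # Map an EHR's local diagnosis codes to SNOMED CT concepts before the
    # data can be exposed through a SMART container. LOCAL_TO_SNOMED is a
    # hypothetical crosswalk; the concept IDs shown are illustrative.
    LOCAL_TO_SNOMED = {
        "DX-HTN-01": ("59621000", "Essential hypertension"),
        "DX-DM2-07": ("44054006", "Type 2 diabetes mellitus"),
    }

    def to_snomed(local_code):
        try:
            return LOCAL_TO_SNOMED[local_code]
        except KeyError:
            # Unmapped codes are the expensive part: each one needs a human
            # decision or a terminology-service lookup.
            raise ValueError(f"no SNOMED mapping for local code {local_code!r}")

    print(to_snomed("DX-HTN-01"))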

On the application side, you may have to deal with SMART’s penchant for Semantic Web technologies such as OWL and SPARQL. This will scare away a number of developers. However, speakers who presented SMART apps at the conference said development was fairly easy. No one matched the developer who said their app was ported in two days (most of which were spent reading the documentation) but development times could usually be measured in months.
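For a feel of what that means in practice, here is a small sketch using rdflib and SPARQL. The vocabulary and record URIs are invented for illustration and are not the actual SMART ontology; the point is simply that patient data arrives as RDF triples and gets queried with SPARQL rather than through a plain REST/JSON call.

    from rdflib import Graph, Literal, Namespace, URIRef

    EX = Namespace("http://example.org/vocab/")  # hypothetical vocabulary
    g = Graph()

    med = URIRef("http://example.org/records/patient-42/med/1")
    g.add((med, EX.drugName, Literal("lisinopril")))
    g.add((med, EX.dose, Literal("10 mg")))

    # Ask for every medication's name and dose.
    results = g.query(
        """
        SELECT ?name ?dose WHERE {
            ?med ex:drugName ?name ;
                 ex:dose ?dose .
        }
        """,
        initNs={"ex": EX},
    )
    for name, dose in results:
        print(name, dose)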

Mandl spent some time airing the idea of a consortium to direct SMART. It could offer conformance tests (but probably not certification, which is a heavy-weight endeavor) and interact with the ONC and standards bodies.

After attending two conferences on SMART, I’ve got the impression that one of its most powerful concepts is that of an “app store for health care applications.” But correspondingly, one of the main sticking points is the difficulty of developing such an app store. No one seems to be taking it on. Perhaps SMART adoption is still at too early a stage.

Once again, we are butting our heads against the walls erected by EHRs to keep data from being extracted for useful analysis. And behind this stands the resistance of providers, the users of EHRs, to giving their data to their patients or to researchers. This theme dominated a federal government conference on patient access.

I think SMART will be more widely adopted over time because it is the only useful standard for exposing patient data to applications, and innovation in health care demands these apps. Accountable Care Organizations, smarter clinical trials (I met two representatives of pharmaceutical companies at the conference), and other advances in health care require data crunching, so those apps need to be written. And that’s why people came from as far as Geneva to check out SMART–there’s nowhere else to find what they need. The technical requirements for understanding SMART seem to be within developers’ grasp.

But a formidable phalanx of resistance remains, from those who don’t see the value of data to those who want to stick to limited exchange formats such as CCDs. And as Sean Nolan of Microsoft pointed out, one doesn’t get very far unless the app can fit into a doctor’s existing workflow. Privacy issues were also raised at the conference, because patient fears could stymie attempts at sharing. Given all these impediments, the government is doing what it can; perhaps the marketplace will step in to reward those who choose a flexible software platform for innovation.

August 29 2012

President Obama participates in first Presidential AMA on Reddit

Starting around 4:30 PM ET today, President Barack Obama made history by going onto Reddit to answer questions about anything for an hour. Reddit, one of the most popular social news sites on the Internet, has been hosting “Ask Me Anything” forums — or AMAs — for years, including sessions with prominent legislators like Representative Darrell Issa (R-CA), but hosting a sitting President of the United States elevates Reddit’s prominence at the intersection of technology and politics. AllThingsD has the story of how Reddit got the President onto the site. Reddit co-founder Alexis Ohanian told Peter Kafka that “There are quite a few redditors at 1600 Pennsylvania Ave and at the campaign HQ — given the prominence of reddit, it’s an easy sell.”

President Obama made some news in the process, with respect to the Supreme Court decision that allowed super political action committees, or “Super PACs,” to become part of the campaign finance landscape.

“Over the longer term, I think we need to seriously consider mobilizing a constitutional amendment process to overturn Citizens United (assuming the Supreme Court doesn’t revisit it),” commented President Obama. “Even if the amendment process falls short, it can shine a spotlight of the super-PAC phenomenon and help apply pressure for change.”

President Obama announced that he’d be participating in the AMA in a tweet and provided photographic evidence that he was actually answering questions in an image posted to Reddit (above) and in a second tweet during the session.

The timing of the AMA was at least a little political, coming after a speech in Virginia and falling on the third day of the Republican National Convention, but it is unequivocally a first in terms of a president directly engaging with the vibrant Reddit community. Many people tweeted that they were having trouble accessing the page during the AMA, as tens of thousands of users tried to access the forum. According to The Verge, President Obama’s AMA was the most popular post in Reddit’s history, with more than 200,000 visitors on the site concurrently. (Presidential Q&As apparently melt servers almost as much as being Biebered.)

Today’s AMA is only the latest example of presidents experimenting with online platforms, from President Clinton and President Bush posting text on WhiteHouse.gov to the Obama administration rebooting that platform on Drupal. More recently, President Obama has participated in a series of online ‘town halls’ using social media, including Twitter, Facebook, LinkedIn and the first presidential Hangout on Google+.

His use of all of them deserves to be analyzed critically, in terms of whether the platforms and events were used to burnish the credentials of a tech-savvy chief executive in an election year or to genuinely answer the questions and concerns of the citizens he serves.

In analyzing the success of such an experiment in digital democracy, it’s worth looking at whether the questions answered were the ones most citizens wanted to see asked (on Reddit, counted by upvotes) and whether the answers given were rehashed talking points or specific to the intent of the questions asked. On the first part of that rubric, President Obama scored high: he answered each of the top-voted questions in the AMA, along with a few personal ones.

 

On the rest of those counts, you can judge for yourself. The president’s answers are below:

“Hey everybody – this is barack. Just finished a great rally in Charlottesville, and am looking forward to your questions. At the top, I do want to say that our thoughts and prayers are with folks who are dealing with Hurricane Isaac in the Gulf, and to let them know that we are going to be coordinating with state and local officials to make sure that we give families everything they need to recover.”

On Internet freedom: “Internet freedom is something I know you all care passionately about; I do too. We will fight hard to make sure that the internet remains the open forum for everybody – from those who are expressing an idea to those to want to start a business. And although their will be occasional disagreements on the details of various legislative proposals, I won’t stray from that principle – and it will be reflected in the platform.”

On space exploration: “Making sure we stay at the forefront of space exploration is a big priority for my administration. The passing of Neil Armstrong this week is a reminder of the inspiration and wonder that our space program has provided in the past; the curiosity probe on mars is a reminder of what remains to be discovered. The key is to make sure that we invest in cutting edge research that can take us to the next level – so even as we continue work with the international space station, we are focused on a potential mission to a asteroid as a prelude to a manned Mars flight.”

On helping small businesses and relevant bills: “We’ve really focused on this since I came into office – 18 tax cuts for small business, easier funding from the SBA. Going forward, I want to keep taxes low for the 98 percent of small businesses that have $250,000 or less in income, make it easier for small business to access financing, and expand their opportunities to export. And we will be implementing the Jobs Act bill that I signed that will make it easier for startups to access crowd-funding and reduce their tax burden at the start-up stage.”

Most difficult decision you had to make this term? “The decision to surge our forces in afghanistan. Any time you send our brave men and women into battle, you know that not everyone will come home safely, and that necessarily weighs heavily on you. The decision did help us blunt the taliban’s momentum, and is allowing us to transition to afghan lead – so we will have recovered that surge at the end of this month, and will end the war at the end of 2014. But knowing of the heroes that have fallen is something you never forget.”

On the influence of money in politics ”Money has always been a factor in politics, but we are seeing something new in the no-holds barred flow of seven and eight figure checks, most undisclosed, into super-PACs; they fundamentally threaten to overwhelm the political process over the long run and drown out the voices of ordinary citizens. We need to start with passing the Disclose Act that is already written and been sponsored in Congress – to at least force disclosure of who is giving to who. We should also pass legislation prohibiting the bundling of campaign contributions from lobbyists. Over the longer term, I think we need to seriously consider mobilizing a constitutional amendment process to overturn Citizens United (assuming the Supreme Court doesn’t revisit it). Even if the amendment process falls short, it can shine a spotlight of the super-PAC phenomenon and help apply pressure for change.”

On prospects for recent college grads – in this case, a law school grad: I understand how tough it is out there for recent grads. You’re right – your long term prospects are great, but that doesn’t help in the short term. Obviously some of the steps we have taken already help young people at the start of their careers. Because of the health care bill, you can stay on your parent’s plan until you’re twenty six. Because of our student loan bill, we are lowering the debt burdens that young people have to carry. But the key for your future, and all our futures, is an economy that is growing and creating solid middle class jobs – and that’s why the choice in this election is so important. The other party has two ideas for growth – more taxs cuts for the wealthy (paid for by raising tax burdens on the middle class and gutting investments like education) and getting rid of regulations we’ve put in place to control the excesses on wall street and help consumers. These ideas have been tried, they didnt work, and will make the economy worse. I want to keep promoting advanced manufacturing that will bring jobs back to America, promote all-American energy sources (including wind and solar), keep investing in education and make college more affordable, rebuild our infrastructure, invest in science, and reduce our deficit in a balanced way with prudent spending cuts and higher taxes on folks making more than $250,000/year. I don’t promise that this will solve all our immediate economic challenges, but my plans will lay the foundation for long term growth for your generation, and for generations to follow. So don’t be discouraged – we didn’t get into this fix overnight, and we won’t get out overnight, but we are making progress and with your help will make more.”

First thing he’ll do on November 7th: “Win or lose, I’ll be thanking everybody who is working so hard – especially all the volunteers in field offices all across the country, and the amazing young people in our campaign offices.”

How do you balance family life and hobbies with being POTUS? ”It’s hard – truthfully the main thing other than work is just making sure that I’m spending enough time with michelle and the girls. The big advantage I have is that I live above the store – so I have no commute! So we make sure that when I’m in DC I never miss dinner with them at 6:30 pm – even if I have to go back down to the Oval for work later in the evening. I do work out every morning as well, and try to get a basketball or golf game in on the weekends just to get out of the bubble. Speaking of balance, though, I need to get going so I’m back in DC in time for dinner. But I want to thank everybody at reddit for participating – this is an example of how technology and the internet can empower the sorts of conversations that strengthen our democracy over the long run. AND REMEMBER TO VOTE IN NOVEMBER – if you need to know how to register, go to Gottaregister.com. By the way, if you want to know what I think about this whole reddit experience – NOT BAD!”

On +The White House homebrew recipe: “It will be out soon! I can tell from first hand experience, it is tasty.”

A step forward for digital democracy?

The most interesting aspect of that Presidential Hangout was that it introduced the possibility of unscripted moments, where a citizen could ask an unexpected question, and the opportunity for followups, if an answer wasn’t specific enough.

Reddit doesn’t provide quite the same mechanism for accountability as a live Hangout, in terms of putting an elected official on the spot to answer. Unfortunately, the platform of Reddit itself falls short here: there’s no way to force a politician to circle back and give a better answer, in the way, say, Mike Wallace might have on “60 Minutes.”

Alexis Madrigal, one of the sharpest observers of technology and society currently gracing the pages of the Atlantic, is clear about the issues with a Reddit AMA: “it’s a terrible format for extracting information from a politician.”

Much as many would like to believe that the medium determines the message, a modern politician is never unmediated. Not in a pie shop in Pennsylvania, not at a basketball game, not while having dinner, not on the phone with NASA, not on TV, not doing a Reddit AMA. Reddit is not a mic accidentally left on during a private moment. The kind of intimacy and honesty that Redditors crave does not scale up to national politics, where no one ever lets down his or her guard. Instead of using the stiffness and formality of the MSM to drive his message home, Obama simply used the looseness and casual banter of Reddit to drive his message home. Here more than in almost anything else: Tech is not the answer to the problems of modern politics.

Today’s exchange, however, does hint at the tantalizing dynamic that makes it alluring: that the Internet is connecting you and your question to the most powerful man in the world, directly, and that your online community can push for him to answer it.

President Obama ended today’s AMA by thanking everyone on Reddit for participating and wrote that “this is an example of how technology and the internet can empower the sorts of conversations that strengthen our democracy over the long run.”

Well, it’s a start. Thank you for logging on today, Mr. President. Please come back online and answer some more follow-up questions.


August 28 2012

Seeking prior art where it most often is found in software

Patent ambushes are on the rise again, and cases such as Apple/Samsung show that prior art really has to swing the decision–obviousness or novelty alone is not a strong enough defense. Obviousness and novelty are subjective decisions made by a patent examiner, judge, or jury.

In this context, a recent conversation I had with Keith Bergelt, Chief Executive Officer of the Open Invention Network (OIN), takes on significance. OIN was formed many years ago to protect the vendors, developers, and users of Linux and related open source software against patent infringement. They do this the way companies prepare a defense: by accumulating a portfolio of patents of their own.

According to Bergelt, OIN has spent millions of dollars to purchase patents that uniquely enable Linux and open source, and has helped free software vendors and developers understand and prepare to defend against lawsuits. All OIN patents are available under a free license to those who agree to forbear suit on Linux grounds and to cross-license their own patents that read on OIN’s Linux System Definition. OIN has nearly 500 licensees and is adding a new one every three days, as everyone from individual developers to large multinationals comes to recognize its role and the value of an OIN license.

The immediate trigger for our call was an announcement by OIN that they are expanding their Linux System Definition to include key mobile Linux software packages such as Dalvik, which expands the scope of the cross licenses under the OIN license. In this way OIN is increasing the freedom of action under which a company can operate under Linux.

OIN’s expansion of its Linux System Definition affects not only Android, which seems to be in Apple’s sights, but any other mobile distribution based on Linux, such as MeeGo and Tizen. They have been interested in this area for some time, but realize that mobile is skyrocketing in importance.

Meanwhile, they are talking to their supporters about new ways of deep mining for prior art in source code. Patent examiners, as well as developers filing patents in good faith, look mostly at existing patents to find prior art. But in software, most innovation is not patented. It might not even appear in the hundreds of journals and conference proceedings that come out in the computer science field each year. It is an abstraction that emerges from code only when the code is analyzed.

A GitHub staffer told me it currently hosts approximately 25 TB of data and adds over 65 GB of new data per day. A lot of that stuff is probably hum-drum, but I bet a fraction of it contains techniques that someone else will try to gain a monopoly over someday through patents.

Naturally, inferring innovative processes from source code is a daunting exercise in machine learning. It’s probably harder than most natural language processing, which tries to infer limited meanings or relationships from words. But OIN feels we have to try. Otherwise more and more patents may impinge (which is different from infringe) on free software.
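To see why, consider even the simplest building block such mining would need: comparing code fragments for overlap. The sketch below (token shingles plus Jaccard similarity, with made-up snippets) catches surface resemblance; recognizing that two fragments embody the same technique, which is what prior art actually requires, is a far deeper problem.

    import re

    def shingles(code, n=3):
        """Return the set of n-token shingles in a code fragment."""
        tokens = re.findall(r"[A-Za-z_]\w*|\S", code)
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

    def jaccard(a, b):
        """Overlap between two shingle sets, from 0.0 to 1.0."""
        return len(a & b) / len(a | b) if a | b else 0.0

    snippet_1 = "for i in range(len(xs)): total += xs[i]"
    snippet_2 = "for j in range(len(ys)): s += ys[j]"

    print(jaccard(shingles(snippet_1), shingles(snippet_2)))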
