June 27 2013

Four short links: 27 June 2013

  1. nitrous.io — IDE “in the cloud”, as “the kids” say.
  2. smartHeadlight — headlight that tracks raindrops and doesn’t send out light to reflect off them back into your eyes causing you to clutch your head and veer off the road into the parking lot of a Hooters to which your wife will NOT enjoy being called to tow your VERY SORRY HONEY ass home. Thank heavens science can save us from this awful hypothetical scenario. (via Greg Linden)
  3. Knight Funds outline.io — it’s a public policy simulator that helps people visualize the impact that public policies like health care reform and school budget changes might have on local economies and communities. Simulators are a hugely underused way to get the public to understand policy debate. (via Julie Starr)
  4. ZXX Font — designed to be hard to OCR, though a common trick makes it pervious to OCR. Secrecy is not an option on your font menu. (via Beta Knowledge)

April 10 2013

Four short links: 10 April 2013

  1. HyperLapse — this won the Internet for April. Everyone else can go home. Check out this unbelievable video and source is available.
  2. Housing Simulator — NZ’s largest city is consulting on its growth plan, and includes a simulator so you can decide where the growth to house the hundreds of thousands of predicted residents will come from. Reminds me of NPR’s Budget Hero. Notice that none of the levers control immigration or city taxes to make different cities attractive or unattractive. Growth is a given and you’re left trying to figure out which green fields to pave.
  3. Converting To and From Google Map Tile Coordinates in PostGIS (Pete Warden) — Google Maps’ system of power-of-two tiles has become a defacto standard, widely used by all sorts of web mapping software. I’ve found it handy to use as a caching scheme for our data, but the PostGIS calls to use it were getting pretty messy, so I wrapped them up in a few functions. Code on github.
  4. So You Want to Build A Connected Sensor Device? (Google Doc) — The purpose of this document is to provide an overview of infrastructure, options, and tradeoffs for the parts of the data ecosystem that deal with generating, storing, transmitting, and sharing data. In addition to providing an overview, the goal is to learn what the pain points are, so we can address them. This is a collaborative document drafted for the purpose of discussion and contribution at Sensored Meetup #10. (via Rachel Kalmar)
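The power-of-two tile scheme in item 3 can be sketched outside PostGIS as well. The snippet below is a minimal Python illustration of the commonly used Web Mercator tile arithmetic, not Pete Warden's PostGIS functions; the function name `latlon_to_tile` and the clamping behaviour are assumptions made for this example.

```python
import math
from typing import Tuple


def latlon_to_tile(lat_deg: float, lon_deg: float, zoom: int) -> Tuple[int, int]:
    """Return the (x, y) tile that contains a point at the given zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat_rad = math.radians(lat_deg)
    # Longitude maps linearly onto the x axis.
    x = math.floor((lon_deg + 180.0) / 360.0 * n)
    # Latitude goes through the Mercator projection before scaling.
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the edge (or beyond the Mercator limits) stay in range.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


# Example: the point at latitude 0, longitude 0 at zoom level 1.
origin_tile = latlon_to_tile(0.0, 0.0, 1)
```

Because each zoom level doubles the tile count per axis, a tile's (zoom, x, y) triple works as a compact cache key, which is the use the quoted post describes.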

March 19 2013

Four short links: 19 March 2013

  1. VizCities Dev Diary — step-by-step recount of how they brought London’s data to life, SimCity-style.
  2. Google Fibre Isn’t That Impressive — For [gigabit broadband] to become truly useful and necessary, we’ll need to see a long-term feedback loop of utility and acceptance. First, super-fast lines must allow us to do things that we can’t do with the pedestrian internet. This will prompt more people to demand gigabit lines, which will in turn invite developers to create more apps that require high speed, and so on. What I discovered in Kansas City is that this cycle has not yet begun. Or, as Ars Technica put it recently, “The rest of the internet is too slow for Google Fibre.”
  3. gov.uk Recommendations on Open Source — Use open source software in preference to proprietary or closed source alternatives, in particular for operating systems, networking software, Web servers, databases and programming languages.
  4. Internet Bad Neighbourhoods (PDF) — bilingual PhD thesis. The idea behind the Internet Bad Neighborhood concept is that the probability of a host in behaving badly increases if its neighboring hosts (i.e., hosts within the same subnetwork) also behave badly. This idea, in turn, can be exploited to improve current Internet security solutions, since it provides an indirect approach to predict new sources of attacks (neighboring hosts of malicious ones).
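The neighbourhood idea in item 4 can be illustrated with a small counting exercise. This is only a sketch under stated assumptions, not the thesis's method: it treats a "neighbourhood" as a /24 subnetwork, the function name `bad_neighbourhoods` is made up, and the addresses come from ranges reserved for documentation.

```python
import ipaddress
from collections import Counter


def bad_neighbourhoods(bad_hosts, prefix_len=24):
    """Count known-bad hosts per enclosing subnet of the given prefix length."""
    counts = Counter()
    for host in bad_hosts:
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{host}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts


# Made-up observations (assumption: these hosts were seen misbehaving).
observed_bad = ["192.0.2.10", "192.0.2.77", "192.0.2.200", "198.51.100.5"]
scores = bad_neighbourhoods(observed_bad)
```

A subnet with a high count would, under the thesis's premise, be a place to expect new sources of attacks, so other hosts in it could be treated with more suspicion.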

October 26 2012

Four short links: 26 October 2012

  1. BootMetro (github) — website templates with a Metro (Windows 8) look. (via Hacker News)
  2. Kenya’s Treasury to tax M-Pesa — 10% tax on mobile money-transfer systems. M-Pesa is the largest mobile money transfer service provider in Kenya, with more than 14 million subscribers. [...] It is estimated that M-Pesa reports some 2 million transactions per day. [...] the value of money transferred through mobile platforms jumped by 41 per cent in the first six months of 2012. Never mind fighting you, you know you’re winning when they tax you! (via Evgeny Morozov)
  3. Digital Divide and Fibre Rollout — As the group of non-users gets smaller, they are likely to become more seriously disadvantaged. The NBN – and high-speed broadband more generally – will drive a wave of new applications across most areas of life, transforming Australia’s service economy in fundamental ways. Those who are not connected in 2015 may be fewer, but they will be missing out on far more – in education, health, government, commerce, communication and entertainment. The costs will also fall on service providers forced to keep supplying expensive physical and face-to-face services to this declining number of people. This will be particularly significant in remote communities, where health consultations and evacuations by flying doctors, nurses and allied health professionals could potentially be reduced through e-health diagnostics, and where Centrelink still regularly sends teams out to communities. As gov2 expands and services move online, connectivity disadvantages are compounded. (via Ellen Strickland)
  4. Smart Body Smart World (Forrester) — take note of these two consequences of Internet of Things and Quantified Self: Verticals fuse: “Health and wellness” is not its own silo, but is connected to our finances, our shopping habits, our relationships. As bodies get connected, everyone is in the body business. Retail disperses: All retailers become computing retailers, and computing-specific retailers like Best Buy go the way of Blockbuster. You wouldn’t buy a smart toothbrush at a specialty CE store; you’d be more likely to buy it in the channel that solves the rest of your hygiene needs. (via Internet of Things)

October 16 2012

Four short links: 16 October 2012

  1. cir.ca — news app for iPhone, which lets you track updates and further news on a given story. (via Andy Baio)
  2. DataWrangler (Stanford) — an interactive tool for data cleaning and transformation. Spend less time formatting and more time analyzing your data. From the Stanford Visualization Group.
  3. Responsivator — see how websites look at different screen sizes.
  4. Accountable Algorithms (Ed Felten) — When we talk about making an algorithmic public process open, we mean two separate things. First, we want transparency: the public knows what the algorithm is. Second, we want the execution of the algorithm to be accountable: the public can check to make sure that the algorithm was executed correctly in a particular case. Transparency is addressed by traditional open government principles; but accountability is different.

October 09 2012

Four short links: 9 October 2012

  1. Finland Crowdsourcing New Laws (GigaOm) — online referenda. The Finnish government enabled something called a “citizens’ initiative”, through which registered voters can come up with new laws – if they can get 50,000 of their fellow citizens to back them up within six months, then the Eduskunta (the Finnish parliament) is forced to vote on the proposal. Now this crowdsourced law-making system is about to go online through a platform called the Open Ministry. Petitions and online voting are notoriously prone to fraud, so it will be interesting to see how well the online identity system behind this holds up.
  2. WebPlatform — wiki of information about developing for the open web. Joint production of many of the $BIGCOs of the web and the W3C, so will be interesting to see, as it develops, whether it has the best aspects of each or the worst.
  3. Why Your Phone, Cable, Internet Bills Cost So Much (Yahoo) — “The companies essentially have a business model that is antithetical to economic growth,” he says. “Profits go up if they can provide slow Internet at super high prices.” Excellent piece!
  4. Probability and Statistics Cookbook (Matthias Vallentin) — The cookbook contains a succinct representation of various topics in probability theory and statistics. It provides a comprehensive reference reduced to the mathematical essence, rather than aiming for elaborate explanations. CC-BY-NC-SA licensed, LaTeX source on github.

September 27 2012

Four short links: 27 September 2012

  1. Paying for Developers is a Bad Idea (Charlie Kindel) — The companies that make the most profit are those who build virtuous platform cycles. There are no proof points in history of virtuous platform cycles being created when the platform provider incents developers to target the platform by paying them. Paying developers to target your platform is a sign of desperation. Doing so means developers have no skin in the game. A platform where developers do not have skin in the game is artificially propped up and will not succeed in the long run. A thesis illustrated with his experience at Microsoft.
  2. Learnable Programming (Bret Victor) — deconstructs Khan Academy’s coding learning environment, and explains Victor’s take on learning to program. A good system is designed to encourage particular ways of thinking, with all features carefully and cohesively designed around that purpose. This essay will present many features! The trick is to see through them — to see the underlying design principles that they represent, and understand how these principles enable the programmer to think. (via Layton Duncan)
  3. Tablet as External Display for Android Smartphones — new app, in beta, letting you remote-control via a tablet. (via Tab Times)
  4. Clay Shirky: How The Internet Will (One Day) Transform Government (TED Talk) — There’s no democracy worth the name that doesn’t have a transparency move, but transparency is openness in only one direction, and being given a dashboard without a steering wheel has never been the core promise a democracy makes to its citizens.

September 21 2012

Four short links: 21 September 2012

  1. Business Intelligence on Farms — Machines keep track of all kinds of data about each cow, including the chemical properties of its milk, and flag when a particular cow is having problems or could be sick. The software can compare current data with historical patterns for the entire herd, and relate to weather conditions and other seasonal variations. Now a farmer can track his herd on his iPad without having to get out of bed, or even from another state. (via Slashdot)
  2. USAxGITHUB — monitor activity on all the US Federal Government’s github repositories. (via Sarah Milstein)
  3. Rethinking Robotics — $22k general purpose industrial robot. “‘It feels like a true Macintosh moment for the robot world,’ said Tony Fadell, the former Apple executive who oversaw the development of the iPod and the iPhone. Baxter will come equipped with a library of simple tasks, or behaviors — for example, a “common sense” capability to recognize it must have an object in its hand before it can move and release it.” (via David ten Have)
  4. Shift Labs — Shift Labs makes low-cost medical devices for resource-limited settings. [Crowd]Fund the manufacture and field testing of the Drip Clip [...] a replacement for expensive pumps that dose fluid from IV bags.

August 28 2012

June 25 2012

Four short links: 25 June 2012

  1. Stop Treating People Like Idiots (Tom Steinberg) -- governments miss the easy opportunities to link the tradeoffs they make to the point where the impacts are felt. My argument is this: key compromises or decisions should be linked to from the points where people obtain a service, or at the points where they learn about one. If my bins are only collected once a fortnight, the reason why should be one click away from the page that describes the collection times.
  2. UK Study Finds Mixed Telemedicine Benefits -- The results, in a paper to the British Medical Journal published today, found telehealth can help patients with long-term conditions avoid emergency hospital care, and also reduce deaths. However, the estimated scale of hospital cost savings is modest and may not be sufficient to offset the cost of the technology, the report finds. Overall the evidence does not warrant full scale roll-out but more careful exploration, it says. (via Mike Pearson)
  3. Pay Attention to What Nick Denton is Doing With Comments (Nieman Lab) -- Most news sites have come to treat comments as little more than a necessary evil, a kind of padded room where the third estate can vent, largely at will, and tolerated mainly as a way of generating pageviews. This exhausted consensus makes what Gawker is doing so important. Nick Denton, Gawker’s founder and publisher, Thomas Plunkett, head of technology, and the technical staff have re-designed Gawker to serve the people reading the comments, rather than the people writing them.
  4. Informed Consent Source of Confusion (Nature) -- fascinating look at the downstream uses of collected bio data and the difficulty in gaining informed consent: what you might learn about yourself (do I want to know I have an 8.3% greater chance of developing Alzheimers? What would I do with that knowledge besides worry?), what others might learn about you (will my records be subpoenable?), and what others might make from the knowledge (will my data be used for someone else's financial benefit?). (via Ed Yong)

June 22 2012

The emerging political force of the network of networks

The shape and substance of our networked world is constantly emerging over time, stretching back over decades. Over the past year, the promise of the Internet as a platform for collective action moved from theory to practice, as networked movements of protesters and consumers have used connection technologies around the world in the service of their causes.

This month, more eyes and minds came alive to the potential of this historic moment during the ninth Personal Democracy Forum (PDF) in New York City, where for two intense days the nexus of technology, politics and campaigns came together on stage (and off) in a compelling, provocative mix of TED-style keynotes and lightning talks, longer panels, and the slipstream serendipity of hallway conversations and the backchannel on Twitter.


If you are interested in the intersection of politics, technology, social change and the Internet, PDF has long since become a must-attend event, as many of the most prominent members of the "Internet public" convene to talk about what's changing and why.

The first day began with a huge helping of technology policy, followed with a hint of triumphalism regarding the newfound power of the Internet in politics that was balanced by Jaron Lanier's concern about the impact of the digital economy on the middle class. The conference kicked off with a conversation between two United States Congressmen who were central to the historic online movement that halted the progression of the Stop Online Piracy Act (SOPA) and the Protect IP Act (PIPA) in the U.S. House of Representatives and Senate: Representative Darrell Issa (R-CA) and Senator Ron Wyden (D-OR). You can watch a video of their conversation with Personal Democracy Media founder Andrew Rasiej below:

During this conversation, Rep. Issa and Sen. Ron Wyden introduced a proposal for a "Digital Bill of Rights." They published a draft set of principles on MADISON, the online legislation platform built last December during the first Congressional hackathon.

Both Congressmen pointed to different policy choices that stand to affect billions of people, ranging from proposed legislation about intellectual property, to the broader issue of online innovation and Internet freedom, and international agreements like the Anti-Counterfeiting Trade Agreement (ACTA) or the Trans-Pacific Partnership (TPP). Such policy choices also include online and network security: Rep. Issa sponsored and voted for CISPA, whereas Sen. Wyden is opposed to a similar legislative approach in the Senate. SOPA, PIPA, ACTA and TPP have all been posted on MADISON for public comment.


On the second day of PDF, conversations and talks turned toward not only what is happening around the networked world but what could be in store for citizens in failed states in the developing world or those inhabiting huge cities in the West, with implications that can be simultaneously exhilarating and discomfiting. There was a strong current of discussion about the power of "adhocracy" and the force of the networked movements that are now forming, dissolving and reforming in new ways, eddying around the foundations of established societal institutions around the globe. Micah Sifry, co-founder of the Personal Democracy Forum, hailed five of these talks as exemplars of the "radical power of the Internet public."

These keynotes, by Chris Soghoian, Dave Parry, Peter Fein, Sascha Meinrath and Deanna Zandt, "could serve as a 50-minute primer on the radical power of the Internet public to change the world, why it's so important to nurture that public, where some of the threats to the Internet are coming from, and how people are routing around them to build a future 'intranet' that might well stand free from governmental and corporate control," wrote Sifry. (Three of them are embedded individually below; the rest you can watch in the complete video catalog at the bottom of this section.)

Given the historic changes in the Middle East and Africa over the past year during the Arab Spring, or the networked protests we've seen during the Occupy movement or over elections in Russia or austerity measures in Greece, it's no surprise that there was great interest in not just talking about what was happening, but why. This year, PDF attendees were also fortunate to hear about the experiences of netizens in China and Russia. The degree of change created by adding wireless Internet connectivity, social networking and online video to increasingly networked societies will vary from country to country. There are clearly powerful lessons that can be gleaned from the experiences of other humans around the globe. Learning where social change is happening (or not) and understanding how our world is changing due to the influence of networks is core to being a digitally literate citizen in the 21st century.

Declaring that we, as a nation or global polity, stand at a historic inflection point for the future of the Open Web or the role of the Internet in presidential politics or the balance of digital security and privacy feels, frankly, like a reiteration of past punditry, going well back to the .com boom in the 1990s.

That said, it doesn't make it less true. We've never been this connected to a network of networks, nor have the public, governments and corporations been so acutely aware of the risks and rewards that those connection technologies pose. It wasn't an accident that Muammar Gaddafi namechecked Facebook before his fall, nor that the current President of the United States (or his opponent in the upcoming election) are talking directly with the public over the Internet. One area that PDF might have dwelt more upon is the dark side of networks, from organized crime and crimesourcing to government-sponsored hacking to the consequences of poorly considered online videos or updates.

We live in a moment of breathtaking technological changes that stand to disrupt nearly every sector of society, for good or ill. Many thanks to the curators and conveners of this year's conference for amplifying the voices of those whose work focuses on documenting and understanding how our digital world is changing — and a special thanks to all of the inspiring people who are not only being the change they wish to see in the world but making it.

Below, I've embedded a selection of the PDF 12 talks that resonated with me. These videos should serve as a starting point, however, not an ending: every person on the program of this year's conference had something important to share, from Baratunde Thurston to Jan Hemme to Susan Crawford to Leslie Harris to Carne Ross to the RIAA's Cary Sherman — and the list goes on and on. You can watch all 45 talks from PDF 2012 (at least, the ones that have been uploaded to YouTube by the Personal Democracy Media team) in the player below:

Yochai Benkler | SOPA/PIPA: A Case Study in Networked Discourse and Activism

In this talk, Harvard law professor Yochai Benkler (@ybenkler) discussed using the Berkman Center's media cloud to trace how the Internet became a networked platform for collective action against SOPA and PIPA. Benkler applies a fascinating term — the "attention backbone" — to describe how influential nodes in a network direct traffic and awareness to research or data. If you're interested in the evolution of the blueprint for democratic participation online, you'll find this talk compelling.

Sascha Meinrath | Commotion and the Rise of the Intranet Era

Mesh networks have become an important — and growing — force for carrying connectivity to more citizens around the world. The work of Sascha Meinrath (@SashaMeinrath) at the Open Technology Institute in the New America Foundation is well worth following.

Mark Surman | Making Movements: What Punk Rock, Scouting, and the Royal Society Can Teach

Mark Surman (@msurman), the executive director of the Mozilla Foundation, shared a draft of his PDF talk prior to the conference. He offered his thoughts on "movement making," connecting lessons from punk rock, scouting and the Royal Society.

With the onrush of mobile apps and swift rise of Facebook, what we think about as the Internet — the open platform that is the World Wide Web — is changing. Surman contrasted the Internet today, enabled by an end-to-end principle, built upon open-source technologies and on open protocols, with the one of permissions, walled gardens and controlled app stores that we're seeing grow around the world. "Tim Berners-Lee built the idea that the web should be LEGO into its very design," said Surman. We'll see if all of these pieces (loosely joined?) fit together as well in the future.

Juan Pardinas | OGP: Global Steroids for National Reformers

There are substantial responsibilities and challenges inherent in moving forward with the historic Open Government Partnership (OGP) that officially launched in New York City last September. Juan Pardinas (@jepardinas) took the position that OGP will have a positive impact on the world and that the seat civil society has at the partnership's table will matter. By the time the next annual OGP conference rolls around in 2013, history may well have rendered its own verdict on whether this effort will endure to lasting effect.

Given diplomatic challenges around South Africa's proposed secrecy law, all of the stakeholders in the Open Government Partnership will need to keep pressure on other stakeholders if significant progress is going to be made. If OGP is to be judged as more than a PR opportunity for politicians and diplomats to make bold framing statements, government and civil society leaders will need to do more to hold countries accountable to the commitments required for participation: all participating countries must submit Action Plans after a bona fide public consultation. Moreover, they'll need to define the metrics by which progress should be judged and be clear with citizens about the timelines for change.

Michael Anti | Walking Along the Great Firewall

Michael Anti (@mranti) is a Chinese journalist and political blogger who has earned global attention for activism in the service of freedom of the press in China. When Anti was exiled from Facebook over its real names policy, his account deletion became an important example for other activists around the world. At PDF, he shared a frank perspective on where free speech stands in China, including how the Chinese government is responding to the challenges of their increasingly networked society. For perspective, there are now more Internet users in China (an estimated 350 million) than the total population of the United States. As you'll hear in Anti's talk, the Chinese government is learning and watching what happens elsewhere.





Masha Gessen | The Future of the Russian Protest Movement

Masha Gessen (@mashagessen), a Russian and American journalist, threw a bucket of ice water on any hopes that increasing Internet penetration or social media would in and of themselves lead to improvements in governance, reduce corruption, or improve the ability of Russia's people to petition their government for grievances.





An Xiao Mina | Internet Street Art and Social Change in China

This beautiful and challenging talk by Mina (@anxiaostudio) offered a fascinating insight: memes are the street art of the censored web. If you want to learn more about how Chinese artists and citizens are communicating online, watch this creative, compelling presentation. (Note: there are naked people in this video, which will make it NSFW in some workplaces.)

Chris Soghoian | Lessons from the Bin Laden Raid and Cyberwar

Soghoian (@csoghoian), who has a well-earned reputation for finding privacy and security issues in the products and services of the world's biggest tech companies, offered up a talk that made three strong points:

  1. Automatic security updates are generally quite a good thing for users.
  2. It's highly problematic if governments create viruses that masquerade as such updates.
  3. The federal government could use an official who owns consumer IT security, not just "cybersecurity" at the corporate or national level.

Zac Moffatt | The Real Story of 2012: Using Digital for Persuasion

Moffatt (@zacmoffatt) is the digital director for the Mitt Romney presidential campaign. In his talk, Moffatt said 2012 will be the first election cycle where persuasion and mobilization will be core elements of the digital experience. Connecting with millions of voters who have moved to the Internet is clearly a strategic priority for his team — and it appears to be paying off. The Guardian reported recently that the Romney campaign is closing the digital data gap with the Obama campaign.


Nick Judd wrote up further analysis of Moffatt's talk on digital strategy over at TechPresident.

Alex Torpey | The Local Revolution

Alex Torpey (@AlexTorpey) attracted widespread attention when he was elected mayor of South Orange, New Jersey last year at the age of 23. In the months since he was elected, Torpey has been trying to interest his peers in politics. His talk at PDF focused on asking for more participation in local government and on rethinking partisanship: Torpey ran as an independent. As Gov 2.0 goes local, Mayor Torpey looks likely to be one of its leaders.

Gilad Lotan | Networked Power: What We Learn From Data

If you're interested in a data-driven analysis of networked political power and media influence, Gilad Lotan's talk is a must-watch. Lotan, who tweets as @gilgul, crunched massive numbers of tweets to help the people formerly known as the audience better understand networked movements for change.






Cheryl Contee | The End of the Digital Divide

Jack and Jill Politics co-founder Cheryl Contee (@cheryl) took a profoundly personal approach when she talked about the death and rebirth of the digital divide. She posited that what underserved citizens in the United States now face isn't so much the classic concerns of the 1990s, where citizens weren't connected to the Internet, but rather a skills gap for open jobs and a lack of investment to address those issues in poor and minority communities. She also highlighted how important mentorship can be in bridging that divide. When Contee shared how Yale computer lab director Margaret Krebs helped her, she briefly teared up — and she called on technologists, innovators and leaders to give others a hand up.

Tracing the storify of PDF 12

I published a storify of Personal Democracy Forum 2012 after the event. Incomplete though it may be, it preserves some thoughtful commentary and context shared in the Twittersphere during the event.

June 21 2012

Four short links: 21 June 2012

  1. Test, Learn, Adapt (PDF) -- UK Cabinet Office paper on randomised trials for public policy. Ben Goldacre cowrote.
  2. UK EscapeTheCity Raises GBP600k in Crowd Equity -- took just eight days, using the Crowdcube platform for equity-based crowd investment.
  3. DIY Bio SOPs -- CC-licensed set of standard operating procedures for a bio lab. These are the SOPs that I provided to the Irish EPA as part of my "Consent Conditions" for "Contained Use of Class 1 Genetically Modified Microorganisms". (via Alison Marigold)
  4. Shuffling Cards -- shuffle a deck of cards until it's randomised. That order of cards probably hasn't ever been seen before in the history of mankind.
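The claim in item 4 rests on how many orderings a 52-card deck has, which is 52 factorial. The short Python sketch below computes that count and shuffles a stand-in deck; representing cards as the integers 0 to 51 is an assumption made purely for illustration.

```python
import math
import random

# Number of distinct orderings of a 52-card deck: 52 factorial.
orderings = math.factorial(52)

# A stand-in deck: each card is just a number from 0 to 51.
deck = list(range(52))

# random.shuffle reorders the list in place, leaving a permutation of the deck.
random.shuffle(deck)

print(f"52! has {len(str(orderings))} digits")
print("One shuffled order:", deck)
```

The count is roughly 8 × 10^67, which is why any one well-shuffled order is so unlikely to have turned up before.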

June 08 2012

mHealth apps are just the beginning of the disruption in healthcare from open health data

Two years ago, the potential of government making health information as useful as weather data felt like an abstraction. Healthcare data could give citizens the same "blue dot" for navigating health and illness akin to the one GPS data fuels on the glowing map of geolocated mobile devices that are in more and more hands.

After all, profound changes in entire industries take years, even generations, to occur. In government, the pace of progress can feel even slower, measured in evolutionary time and epochs.

Sometimes, history works differently, particularly given the effect of rapid technological changes. It's only a little more than a decade since President Clinton announced he would unscramble Global Positioning System (GPS) data for civilian use. President Obama's second U.S. chief technology officer, Todd Park, estimated that GPS data has unlocked some $90 billion in value in the United States.

In that context, the arc of the Health Data Initiative (HDI) in the United States might leave some jaded observers with whiplash. From a small beginning, the initiative to put health data to work has now expanded around the United States and attracted great interest from abroad, including observers from England's National Health Service eager to understand what strategies have unlocked innovation around public data sets.

While the potential of government health data driving innovation may well have felt like an abstraction to many observers, in June 2012, real health apps and services are here -- and their potential to change how society accesses health information, delivers care, lowers costs, connects patients to one another, creates jobs, empowers care givers and cuts fraud is profound. The venture capital community seems to have noticed the opportunity here: according to HHS Secretary Sebelius, investment in healthcare startups is up 60% since 2009.

Headlines about rockstar Bon Jovi 'rocking Datapalooza' and the smorgasbord of health apps on display, however, while both understandable and largely warranted, don't convey the deeper undercurrent of change.

On March 10, 2010, the initiative started with 36 people brainstorming in a room. On June 2, 2010, approximately 325 in-person attendees saw 7 health apps demoed at an historic forum in the theater of the Institute of Medicine in Washington, D.C., with another 10 apps packed into an expo in the rotunda outside. All of the apps or services used open government data from the United States Department of Health and Human Services (HHS).

In 2012, 242 applications or services that were based upon or use open data were submitted for consideration to the third annual "Health Datapalooza." About 70 health app exhibitors made it to the expo. The conference itself had some 1400 registered attendees, not counting press and staff, and was sold out in advance of the event in the cavernous Washington Convention Center in DC. On Wednesday, I asked Dr. Bob Kocher, now of Venrock Capital and the Brookings Institution, about how the Health Data Initiative has grown and evolved. Dr. Kocher was instrumental to its founding when he served in the Obama administration. Our interview is embedded below:

Revolutionizing the healthcare industry -- in HHS Secretary Sebelius's words, reformulating Wired executive editor Thomas Goetz's "latent data" to "lazy data" -- has meant years of work unlocking government data and actively engaging the developer, entrepreneurial and venture capital communities. While the process of making health data open and machine-readable is far from done, there has been incontrovertible progress in standing up new application programming interfaces (APIs) that enable entrepreneurs, academic institutions and government itself to retrieve it on demand. On Monday, in concert with the Health Datapalooza, a new version of HealthData.gov launched, including the release of new data sets that enable not just hospital quality comparisons but insurance fee comparisons as well.

Two years later, the blossoming of the HDI Forum into a massive conference that attracted the interest of the media, venture capitalists and entrepreneurs from around the nation is a development that few people would have predicted in 2010. For a nation starved for solutions to spiraling healthcare costs, and for some action from a federal government that all too frequently looks broken, it is a welcome one.

"The immense fiscal pressure driving 'innovation' in the health context actually means belated leveraging of data insights other industries take for granted from customer databases," said Chuck Curran, executive director and general counsel of the Network Advertising Initiative, when interviewed at this year's HDI Forum. For example, he suggested, look at "the dashboarding of latent/lazy data on community health, combined with geographic visualizations, to enable “hotspot”-focused interventions, or info about service plan information like the new HHS interface for insurance plan data (including the API)."

Curran also highlighted the role that fiscal pressure is playing in making both individual payers and employers a natural source of business funding and adoption for entrepreneurs innovating with health data, with apps like My Drugs Costs holding the potential to help citizens and businesses alike cut down on an estimated $95 billion in annual unnecessary spending on pharmaceuticals.

Curran said that health app providers have fully internalized smart disclosure: "it’s not enough to have open data available for specialist analysis -- there must be simplified interfaces for actionable insights and patient ownership of the care plan."

For entrepreneurs eyeing the healthcare industry and established players within it, the 2012 Health Datapalooza offers an excellent opportunity to "take the pulse of mHealth," as Jody Ranck wrote at GigaOm this week:

Roughly 95 percent of the potential entrepreneur pool doesn’t know that these vast stores of data exist, so the HHS is working to increase awareness through the Health Data Initiative. The results have been astounding. Numerous companies, including Google and Microsoft, have held health-data code-a-thons and Health 2.0 developer challenges. These have produced applications in a fraction of the time it has historically taken. Applications for understanding and managing chronic diseases, finding the best healthcare provider, locating clinical trials and helping doctors find the best specialist for a given condition have been built based on the open data available through the initiative.

In addition to the Health Datapalooza, the Health Data Initiative hosts other events which have spawned more health innovators. RockHealth, a Health 2.0 incubator, launched at its SXSW 2011 White House Startup America Roundtable. In the wake of these successful events, StartUp Health, a network of health startup incubators, entrepreneurs and investors, was created. The organization is focused on building a robust ecosystem that can support entrepreneurs in the health and wellness space.

This health data ecosystem has now spread around the United States, from Silicon Valley to New York to Louisiana. During this year's Health Datapalooza, I spoke with Ramesh Kolluru, a technologist who works at the University of Louisiana, about his work on a hackathon in Louisiana, the "Cajun Codefest," and his impressions of the forum in Washington:

One story that stood out from this year's crop of health data apps was Symcat, an mHealth app that enables people to look up their symptoms and find nearby hospitals and clinics. The application was developed by two medical students at Johns Hopkins University who happened to share a passion for tinkering, engineering and healthcare. They put their passion to work - and somehow found the time (remember, they're in medical school) to build a beautiful, usable health app. The pair landed a $100,000 prize from the Robert Wood Johnson Foundation for their efforts. In the video embedded below, I interview Craig Monsen, one of the medical students, about his application. (Notably, the pair intends to use their prize to invest in the business, not pay off medical school debt.)

There are more notable applications and services to profile from this year's expo - and in the weeks ahead, expect to see some of them here on Radar. For now, it's important to recognize the work of all of the men and women who have worked so hard over the past two years to create public good from public data.

Releasing and making open health data useful, however, is about far more than these mHealth apps: It's about saving lives, improving the quality of care, adding more transparency to a system that needs it, and creating jobs. Park spoke with me this spring about how open data relates to much more than consumer-facing mHealth apps:

As the US CTO seeks to scale open data across federal government by applying the lessons learned in the health data initiative, look for more industries to receive digital fuel for innovation, from energy to education to transit and finance. The White House digital government strategy explicitly embraces releasing open data in APIs to enable more accountability, civic utility and economic value creation.

While major challenges lie ahead, from data quality to security or privacy, the opportunity to extend the data revolution in healthcare to other industries looks more tangible now than it has in years past.

Business publications, including the Wall Street Journal, have woken up to the disruptive potential of open government data. As Michael Hickins wrote this week, "The potential applications for data from agencies as disparate as the Department of Transportation and Department of Labor are endless, and will affect businesses in every industry imaginable. Including yours. But if you can think of how that data could let someone disrupt your business, you can stop that from happening by getting there first."

This growing health data movement is not located within any single city, state, agency or company. It's beautifully chaotic, decentralized, and self-propelled, said Park this past week.

"The Health Data Initiative is no longer a government initiative," he said. "It's an American one."

May 29 2012

US CTO seeks to scale agile thinking and open data across federal government

In the 21st century, federal government must go mobile, putting government services and information at the fingertips of citizens, said United States Chief Technology Officer Todd Park in a recent wide-ranging interview. "That's the first digital government result, outcome, and objective that's desired."

To achieve that vision, Park and U.S. chief information officer Steven VanRoekel are working together to improve how government shares data, architects new digital services and collaborates across agencies to reduce costs and increase productivity through smarter use of information technology.

Park, who was chosen by President Obama to be the second CTO of the United States in March, has been (relatively) quiet over the course of his first two months on the job.

Last Wednesday, that changed. Park launched a new Presidential Innovation Fellows program, in concert with VanRoekel's new digital government strategy, at TechCrunch's Disrupt conference in New York City. This was followed by another event for a government audience at the Interior Department headquarters in Washington, D.C. Last Friday, he presented his team's agenda to the President's Council of Advisors on Science and Technology.

"The way I think about the strategy is that you're really talking about three elements," said Park, in our interview. "First, it's going mobile, putting government services at the literal fingertips of the people in the same way that basically every other industry and sector has done. Second, it's being smarter about how we procure technology as we move government in this direction. Finally, it's liberating data. In the end, it's the idea of 'government as a platform.'"

"We're looking for a few good men and women"

In the context of the nation's new digital government strategy, Park announced the launch of five projects that this new class of Innovation Fellows will be entrusted with implementing: a broad Open Data Initiative, Blue Button for America, RFP-EZ, The 20% Campaign, and MyGov.

The idea of the Presidential Innovation Fellows Program, said Park, is to bring in people from outside government to work with innovators inside the government. These agile teams will work together within a six-month time frame to deliver results.

The fellowships are basically scaling up the idea of "entrepreneurs in residence," said Park. "It's a portfolio of five projects that, on top of the digital government strategy, will advance the implementation of it in a variety of ways."

The biggest challenge to bringing the five programs that the US CTO has proposed to successful completion is getting 15 talented men and women to join his team and implement them. There's reason for optimism. Park shared via email that:

"... within 24 hours of TechCrunch Disrupt, 600 people had already registered via Whitehouse.gov to apply to be a Presidential Innovation Fellow, and another several hundred people had expressed interest in following and engaging in the five projects in some other capacity."

To put that in context, Code for America received 550 applications for 24 fellowships last year. That makes both of these fellowships more competitive than getting into Harvard in 2012, which received 34,285 applications for its next freshman class. There appears to be considerable appetite for a different kind of public service that applies technology and data for the public good.
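The comparison can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above; it assumes, for illustration, that every one of the 600 people who registered in the first 24 hours goes on to apply for one of the 15 Presidential Innovation Fellow slots, so the second number is a rough ceiling rather than a final rate. The function name is hypothetical. The number of students Harvard admitted is not given here, so it is left out.

```python
def acceptance_rate(slots, applicants):
    """Fraction of applicants who can be accepted, given a fixed number of slots."""
    return slots / applicants

# Figures quoted in the article.
code_for_america = acceptance_rate(24, 550)

# Assumption: all 600 first-day registrants apply for the 15 fellowships.
innovation_fellows = acceptance_rate(15, 600)

print(f"Code for America: {code_for_america:.1%}")
print(f"Presidential Innovation Fellows (first 24 hours): {innovation_fellows:.1%}")
```

Under those assumptions, roughly 4.4% of Code for America applicants and at most 2.5% of the early Presidential Innovation Fellows registrants could be accepted.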

Park is enthusiastic about putting open government data to work on behalf of the American people, amplifying the vision that his predecessor, Aneesh Chopra, championed around the country for the past three years.

"The fellows are going to have an extraordinary opportunity to make government work better for their fellow citizens," said Park in our interview. "These projects leverage, substantiate and push forward the whole principle of liberating data. Liberate data."

"To me, one of the aspects of the strategy about which I am most excited, that sends my heart into overdrive, is the idea that going forward, the default state of government data shall be open and machine-readable," said Park. "I think that's just fantastic. You'll want to, of course, evolve the legacy data as fast as you can in that same direction. Setting that as 'this is how we are rolling going forward' — and this is where we expect data to ultimately go — is just terrific."

In the videos and interview that follow, Park talks more about his vision for each of the programs.

A federal government-wide Open Data Initiative

In the video below, Park discusses the Presidential Innovation Fellows program and introduces the first program, which focuses on open data:

Park: The Open Data Initiative is a program to seed and expand the work that we're doing to liberate government data as a platform. Encourage, on a voluntary basis, the liberation of data by corporations, as part of the national data platform, and to actively stimulate the development of new tools and services, and enhance existing tools and services, leveraging the data to help improve Americans' lives in very tangible ways, and create jobs for the future.

This leverages the Open Government Directive to say "look, the default going forward is open data." Also the directive to "API-ize" two high priority datasets and also, in targeted ways, go beyond that, and really push to get more data out there in, critically, machine-readable form, in APIs, and to educate the entrepreneur and innovators of the world that it's there through meetups, and hackathons, and challenges, and "Datapaloozas."

We're doubling down on the Health Data Initiative, we are also launching a much more high-profile Safety Data Initiative, which we kicked off last week. An Energy Data Initiative, which kicked off this week. An education data initiative, which we're kicking off soon, and an Impact Data Initiative, which is about liberating data with respect to inputs and outputs in the non-profit space.

We're also going to be exploring an initiative in the realm of personal finance, enabling Americans to access copies of their financial data from public sector agencies and private sector institutions. So, the format that we're going to be leveraging to execute these initiatives is cloned from the Health Data Initiative.

This will make new data available. It will also take the existing public data that is unusable to developers, i.e. in the form of PDFs, books or static websites, and turn it into liquid machine-readable, downloadable, accessible data via API. Then — because we're consistently hearing that 95% of the innovators and entrepreneurs who could turn our data into magic don't even know the data exists, let alone that it's available to them — engage the developer community and the entrepreneurial community with the data from the beginning. Let them know it's there, get their feedback, make it better.

Blue Button for America

Park: The idea is to develop an open source patient portal capability that will replace My HealtheVet, which is the Veterans Administration's current patient portal. This will actually allow the Blue Button itself to iterate and evolve more rapidly, so that every time you add more data to it, it won't require heart surgery. It will be a lot easier, and of course will be open source, so that anyone else who wants to use it can use it as well. On top of that, we're going to do a lot of "biz dev" in America to get the word out about Blue Button and encourage more and more holders of data in the private sector to adopt Blue Button. We're also going to work to help stimulate more tool development by entrepreneurs that can upload Blue Button data and make it useful in all kinds of ways for patients. That's Blue Button for America.

What is RFP-EZ?

Park: The objective is "buying smarter." The project that we're working on with the Small Business Administration is called "RFP-EZ."

Basically, it's the idea of setting up a streamlined process for the government to procure solutions from innovative, high-growth tech companies. As you know, most high-growth companies regard the government as way too difficult to sell to.

That A) deprives startups and high-growth companies of the government as a marketplace and, B) perhaps even more problematically, actually deprives the government of their solutions.

The hope here is, through the actions of the RFP-EZ team, to create a process and a prototype through which the government can much more easily procure solutions from innovative private firms.

It A) opens up this emerging market called "the government" to high-tech startups and B) infects the government with more of their solutions, which are radically more, pound for pound, effective and cost efficient than a lot of the stuff that the government is currently procuring through conventional channels. That's RFP-EZ.

The 20% Campaign

Park: The 20% Campaign is a project that's being championed by USAID. It's an effort at USAID, working with other government agencies, NGOs and companies, to catalyze the movement of foreign assistance payments from cash to electronic. So, just for example, USAID pays its contractors electronically, obviously, but the contractor who, say, pays highway workers in Afghanistan or the way that police officers get paid in Afghanistan is actually principally via cash. Or has been. And that creates all kinds of waste issues, fraud, and abuse.

The idea is actually to move to electronic payment, including mobile payment. This has the potential to significantly cut waste, fraud and abuse, and to improve financial inclusion by enabling people to use their phones to access bank accounts set up for them. That leads to all kinds of good things, including safety: it's not ideal to be carrying around large amounts of cash in highly kinetic environments.

The Afghan National Police started paying certain contingents of police officers via mobile phones and mobile payments, as opposed to cash, and what happened is that the police officers started reporting up to a 30% raise. Of course, their pay hadn't changed, but basically, when it was in cash, a bunch of it got lost. This is obviously a good thing, but it's even more important if you realize that what they ultimately physically received when they were paid in cash was less than the Taliban in this province was actually paying people to join the Taliban. The mobile payment, at that level of salary, was greater than the Taliban was paying. That's a critical difference.

It's basically taking foreign assistance payments through the last mile to mobile.

MyGov is the U.S. version of Gov.uk

Park: MyGov is an effort to rapidly prototype a citizen-centric system that gives Americans access to the information and resources of government that are right for them. Think of it as a personalized channel for Americans to be able to access information resources across government and for government to get feedback from citizens about that information and those resources.

How do you plan to scale what you learned while you were HHS CTO to all of the federal government?

Park: Specifically, we're doing exactly the same thing we did with the Health Data Initiative, kicking off the initiatives with a "data jam" — an ideation workshop where we invite, just like with health data, 40 amazing tech and energy minds, tech and safety innovators, to a room — at the White House, in the case of the Safety Data Initiative, or at Stanford University, in the case of the Energy Initiative.

We walk into the room for several hours and say, "Here's a big pile of data. What would you do with this data?" And they invent 15 or 20 new classes of products or services of the future that we could build with the data. And then we challenge them, at the end of the session, to build prototypes or actual working products that instantiate their ideas in 90 days, to be highlighted at a White House-hosted Safety Datapalooza, Energy Datapalooza, Education Datapalooza, Impact Datapalooza, etc.

We also take the intellectual capital from the workshops, publish it on the White House website, and publicize the opportunity around the country: Discover the data, come up with your own ideas, build prototypes, and throw your hat in the ring to showcase at a Datapalooza.

What happens at the Datapaloozas — our experience in health guides us — is that, first of all, the prototypes and working products inspire many more innovators to actually build new services, products and features, because the data suddenly becomes really concrete to them, in terms of how it could be used.

Secondly, it helps persuade additional folks in the government to liberate more data, making it available, making it machine-readable, as opposed to saying, "Look, I don't know what the upside is. I can only imagine downsides." What happened in health is, when they went to a Datapalooza, they actually saw that, if data is made available, then at no cost to you and no cost to taxpayers, other people who are very smart will build incredible things that actually enhance your mission. And so you should do the same.

As more data gets liberated, that then leads to more products and services getting built, which then inspires more data liberation, which then leads to more products and services getting built, so you have a virtuous spiral, like what's happened in health.

The objective of each of these initiatives is not just to liberate data. Data by itself isn't helpful. You can't eat data. You can't pour data on a wound and heal it. You can't pour data on your house and make it more energy efficient. Data is only useful if it's applied to deliver benefit. The whole point of this exercise, the whole point of these kickoff efforts, is to catalyze the development of an ecosystem of data supply and data use to improve the lives of Americans in very tangible ways — and create jobs.

We have the developers and the suppliers of data actually talk to each other, create value for the American people, and then rinse, wash, repeat.

We're recruiting, to join the team of Presidential Innovation Fellows, entrepreneurs and developers from the outside to come in and help with this effort to liberate data, make it machine-readable, and get it out there to entrepreneurs and help catalyze development of this ecosystem.

We went to TechCrunch Disrupt for a reason: it's right smack dab center in the middle of people we want to recruit. We invite people to check out the projects on WhiteHouse.gov and, if they're interested in applying to be a fellow, to indicate their interest. Even if they can't come to DC for 6-plus months to be a fellow, but they want to follow one of the projects or contribute or help in some way, we are inviting them to express interest in that as well. For example, if you're an entrepreneur, and you're really interested in the education space, and learning about what data is available in education, you can check out the project, look at the data, and perhaps you can build something really good to show at the Education Datapalooza.

Is open data just about government data? What about smart disclosure?

Park: In the context of the Open Data Initiatives projects, it's not just about liberation of government health data: it's also about government catalyzing the release, on a voluntary basis, of private sector data.

Obviously, scaling Blue Button will extend the open data ecosystem. We're also doubling down on Green Button. I was just in California to host discussions around Green Button. Utilities representing 31 million households and businesses have now committed to make Green Button happen. Close to 10 million households and businesses already have access to Green Button data.

There's also a whole bunch of conversation happening about, at some point later this year, having the first utilities add the option of what we're calling "Green Button Connect." Right now, the Green Button is a download, where you go to a website, hit a green button and bam, you download your data. Green Button Connect is the ability for you to say as a consumer, "I authorize this third party to receive a continuous feed of my electricity usage data."

That creates massive additional opportunity for new products and services. That could go live later this year.

As part of the education data initiative, we are pursuing the launch and scale up of something called "My Data," which will have a red color button. (It will probably, ultimately, be called "Red Button.") This is the ability for students and their families to download an electronic copy of their student loan data, of their transcript data, of their academic assessment data.

That notion of people getting their own data, whether it's your health data, your education data, your finance data, your energy use data, that's an important part of these open data initiatives as well, with government helping to catalyze the release of that data to then feed the ecosystem.

How does open data specifically relate to the things that Americans care about, access to healthcare, reducing energy bills, giving their kids more educational opportunities, and job creation? Is this just about apps?

Park: In healthcare, for example, you'll see a growing array of examples that leverage data to create tangible benefit in many, many ways for Americans. Everything from helping me find the right doctor or hospital for my family to being notified of a clinical trial that could match my profile and save my life, and the ability to get the latest and greatest information about how to manage my asthma and diabetes via government knowledge in the National Library of Medicine.

There is a whole shift in healthcare systems away from pay-for-volume of services to basically paying to get people healthy. It goes by lots of different names — accountable care organizations or episodic payment — but the fundamental common theme is that the doctors and hospitals increasingly will be paid to keep people healthy and to co-ordinate their care, and keep them out of the hospital, and out of the ER.

There's a whole fleet of companies and services that utilize data to help doctors and hospitals do that work, like utilize Medicare claims data to help identify segments of a patient population that are at real risk, and need to get to the ER or hospital soon. There are tools that help journalists easily identify public health issues, like healthcare outcomes disparities by race, gender and ethnicity. There are tools that help county commissioners and mayors understand what's going on in a community, from a health standpoint, and make better policy decisions, like showing them food deserts. There's just a whole fleet of rapidly growing services for consumers, for doctors, nurses, journalists, employers, public policy makers, that help them make decisions, help them deliver improved health and healthcare, and create jobs, all at the same time.

That's very exciting. Look at all of those products and services; a subset of them are the ones that self-identify to us to be exhibited at the Health Datapaloozas. Look at the 20 healthcare apps that were at the first Datapalooza or the 50 that were at the second. This year, there are 230 companies that are being narrowed down to about a total of 100 that will be at the Datapalooza. They collectively serve millions of people today, either through brand new products and services or through new features on existing platforms. They help people in ways that we would never have thought of, let alone built.

The taxpayer dollars expended here were zero. We basically just took our data, made it available in machine-readable format, educated entrepreneurs that it was there, and they did the rest. Think about these other sectors, and think about what's possible in those sectors.

In education, through the data that we've made available, you can imagine much better tools to help you shop for the college that will deliver the biggest bang for your buck and is the best fit for your situation.

We've actually made available a bunch of data about college outcomes and are making more data available in machine-readable form so it can feed college search tools much better. We are going to be enabling students to download machine-readable copies of their own financial aid application, student loan data and school records. That will really turbo charge "smart scholarship" and school search capabilities for those students. You can actually mash that up with college outcomes in a really powerful, personalized college and scholarship search engine that is enabled by your personal data plus machine-readable data. Tools that help kids and their parents pick the right college for their education and get the right financial aid, that's something government is going to facilitate.

In the energy space, there are apps and services that help you leverage your Green Button data and other data to really assess your electricity usage compared to that of others and get concrete tips on how you can actually save yourself money. We're already seeing very clever, very cool efforts to integrate gamification and social networking into that kind of app, to make it a lot more fun and engaging — and make yourself money.

One dataset that's particularly spectacular that we're making a lot more usable is the EnergyStar database. It's got 40,000 different appliances, everything from washing machines to servers that consumers and businesses use. We are creating a much, much easier to use public, downloadable EnergyStar database. It's got really detailed information on the energy use profiles and performance of each of these 40,000 appliances and devices. Imagine that actually integrated into much smarter services.

On safety, the kinds of ideas that people are bringing together are awesome. They're everything from using publicly available safety data to plot the optimal route for your kid to walk home or for a first responder to travel through a city and get to a place most expeditiously.

There's this super awesome resource on Data.gov called the "Safer Products API," which is published by the Consumer Product Safety Commission (CPSC). Consumers send in safety reports to CPSC, but until March of last year, you had to FOIA [Freedom of Information Act] CPSC to get these. So what they've now done is actually publish an API which not only makes the entire database of these reports public, without you having to FOIA them, but also makes it available through an API.

One of the ideas that came up is that, when people buy products on eBay, Craigslist, etc, all the time, some huge percentage of Americans never get to know about a recall — a recall of a crib, a recall of a toy. And even when a company recalls new products, old products are in circulation. What if someone built the ability to integrate the recall data and attach it to all the stuff in the eBays and Craigslists of the world?
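As a minimal sketch of that idea, the Python below flags marketplace listings whose titles mention a recalled product. Everything in it is hypothetical: the record fields, the sample products and the `flag_recalled` helper are invented for illustration and do not reflect the actual schema of the CPSC data or of any marketplace. A real service would need far more careful matching (model numbers, manufacture dates, fuzzy names) than a substring check.

```python
import re

def normalize(text):
    """Lowercase and collapse punctuation and whitespace so titles compare loosely."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def flag_recalled(listings, recalls):
    """Return (listing, recall) pairs where a recalled product name appears in a listing title."""
    matches = []
    for listing in listings:
        title = f" {normalize(listing['title'])} "
        for recall in recalls:
            name = f" {normalize(recall['product'])} "
            if name in title:
                matches.append((listing, recall))
    return matches

# Hypothetical sample data, not real recall records or real listings.
recalls = [{"product": "Acme DreamRest Crib", "hazard": "drop-side rail can detach"}]
listings = [
    {"id": 1, "title": "Used ACME Dreamrest crib - great condition"},
    {"id": 2, "title": "Wooden toy train set"},
]

flagged = flag_recalled(listings, recalls)
for listing, recall in flagged:
    print(f"Listing {listing['id']} may be recalled: {recall['hazard']}")
```

The point of the sketch is the division of labor: the government publishes the recall records in machine-readable form, and the platform that already has the audience does the matching.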

Former CIO Vivek Kundra often touted government recall apps based upon government data during his tenure. Is this API the same thing, shared again, or something new?

Park: I think the smartest thing the government can do with data like product recalls data is not build our own shopping sites, or our own product information sites: it's to get the information out there in machine-readable form, so that lots and lots of other platforms that have audiences with millions of people already, and who are really good at creating shopping experiences or product comparison experiences, get the data into their hands, so that they can integrate it seamlessly into what they do. I feel that that's really the core play that the government should be engaged in.

I don't know if the Safer Products API was included in the recall app. What I do know is that before 2011, you had to FOIA to get the data. I think that even if the government included it in some app the government built, that it's important for it to get used by lots and lots of other apps that have a collective audience that's massively greater than any app the government could itself build.

Another example of this is the Hospital Compare website. The Hospital Compare website has been around for a long time. Nobody knows about it. There was a survey done that found 94% of Americans didn't know that there was hospital quality data that was available, let alone that there was a hospital compare website. So, the notion of A) making the hospital care data downloadable and B), we actually deployed it a year and a half ago in API form at Medicare.gov.

That then makes the data much easier for lots of other platforms to incorporate, platforms that are far more likely than HospitalCompare.gov to be able to present the information in actionable forms for citizens. Even if we build our own apps, we have to get this data out to lots of other people that can help people with it. To do that, we have to make it machine-readable, we have to put it into RESTful APIs, or at least make it downloadable, and get the word out to entrepreneurs that it's something they can use.
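To illustrate why "machine-readable, or at least downloadable" matters to a developer, here is a short Python sketch that ranks hospitals in one city from a downloaded CSV extract. The column names, hospital names and numbers are all invented for illustration; they are not the actual Hospital Compare schema or real quality data, and `best_hospitals` is a hypothetical helper.

```python
import csv
import io

# Hypothetical extract: column names and values are invented for illustration.
SAMPLE_CSV = """hospital,city,readmission_rate
General Hospital,Springfield,0.18
Riverside Medical,Springfield,0.12
Lakeview Clinic,Shelbyville,0.15
"""

def best_hospitals(csv_text, city):
    """Return the hospitals in a city, lowest readmission rate first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    local = [row for row in rows if row["city"] == city]
    return sorted(local, key=lambda row: float(row["readmission_rate"]))

ranked = best_hospitals(SAMPLE_CSV, "Springfield")
for row in ranked:
    print(row["hospital"], row["readmission_rate"])
```

A dozen lines like these are enough once the data is structured; the same question asked of a PDF or a static web page would first require scraping and cleanup.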

This is a stunning arbitrage opportunity. Even if you take all this data and you "API-ize" it, it's not automatic that entrepreneurs are going to know it's there.

Let's assume that the hospital quality data is good — which it is — and that you build it, and put it into an API. If nobody knows about it, you've delivered no value to the American people. People don't care whether you API a bunch of data. What they care about is that when they need to find a hospital, like I did, for my baby, I can get that information.

The private sector, in the places where we have pushed the pedal to the metal on this, has just demonstrated the incredible ability to make this data a lot more relevant and help a lot more people with it than we could have by ourselves.

White House photo used on associated home and category pages: white house by dcJohn, on Flickr

May 22 2012

Four short links: 22 May 2012

  1. New Zealand Government Budget App -- when the NZ budget is announced, it'll go live on iOS and Android apps. Tablet users get details, mobile users get talking points and speeches. Half-political, but an interesting approach to reaching out to voters with political actions.
  2. Health Care Data Dump (Washington Post) -- 5B health insurance claims (with anonymization attempted) to be released. Researchers will be able to access that data, largely using it to probe a critical question: What makes health care so expensive?
  3. Perl 5.16.0 Out -- two epic things here: 590k lines of changes, and announcement quote from Auden. Auden is my favourite poet, Perl my favourite programming language.
  4. WYSIHTML5 (GitHub) -- wysihtml5 is an open source rich text editor based on HTML5 technology and the progressive-enhancement approach. It uses a sophisticated security concept and aims to generate fully valid HTML5 markup by preventing unmaintainable tag soups and inline styles.

May 17 2012

Four short links: 17 May 2012

  1. The Mythology of Big Data (PDF) -- slides from a Strata keynote by Mark R. Madsen. A lovely explanation of the social impediments to the rational use of data. (via Hamish MacEwan)
  2. Scamworld -- amazing deconstruction of the online "get rich quick" scam business. (via Andy Baio)
  3. Ceres: Solving Complex Problems with Computing Muscle -- Johnny Lee Chung explains the (computer vision) uses of the open source Ceres Non-Linear Least Squares Solver library from Google.
  4. How to Start a Think Tank (Guardian) -- The answer to the looming crisis of legitimacy we're facing is greater openness - not just regarding who met who at what Christmas party, but on the substance of policy. The best way to re-engage people in politics is to change how politics works - in the case of our project, to develop a more direct way for the people who use and provide public and voluntary services to create better social policy. Hear, hear. People seize on the little stuff because you haven't given them a way to focus on something big with you.

May 15 2012

Profile of the Data Journalist: The Data News Editor

Around the globe, the bond between data and journalism is growing stronger. In an age of big data, the growing importance of data journalism lies in the ability of its practitioners to provide context, clarity and, perhaps most important, find truth in the expanding amount of digital content in the world. In that context, data journalism has profound importance for society. (You can learn more about this world and the emerging leaders of this discipline in the newly released "Data Journalism Handbook.")

To learn more about the people who are doing this work and, in some cases, building the newsroom stack for the 21st century, I conducted in-person and email interviews during the 2012 NICAR Conference and published a series of data journalist profiles here at Radar.

John Keefe (@jkeefe) is a senior editor for data news and journalism technology at WNYC public radio, based in New York City, NY. He attracted widespread attention when an online map he built using available data beat the Associated Press with Iowa caucus results earlier this year. He's posted numerous tutorials and resources for budding data journalists, including how to map data onto county districts, use APIs, create news apps without a backend content management system and make election results maps. As you'll read below, Keefe is a great example of a journalist who picked up these skills from the data journalism community and the Hacks/Hackers group.

Our interview follows, lightly edited for content and clarity. (I've also added a Twitter list of data journalists from the New York Times' Jacob Harris.)

Where do you work now? What is a day in your life like?

I work in the middle of the WNYC newsroom -- quite literally. So throughout the day, I have dozens of impromptu conversations with reporters and editors about their ideas for maps and data projects, or answering questions about how to find or download data.

Our team works almost entirely on "news time," which means our creations hit the Web in hours and days more often than weeks and months. So I'm often at my laptop creating or tweaking maps and charts to go with online stories. That said, Wednesday mornings it's breakfast at a Chelsea cafe with collaborators at Balance Media to update each other on longer-range projects and tools we make for the newsroom and then open source, like Tabletop.js and our new vertical timeline.

Then there are key meetings, such as the newsroom's daily and weekly editorial discussions, where I look for ways to contribute and help. And because there's a lot of interest and support for data news at the station, I'm also invited to larger strategy and planning meetings.

How did you get started in data journalism? Did you get any special degrees or certificates?

I've been fascinated with the intersection of information, design and technology since I was a kid. In the last couple of years, I've marveled at what journalists at the New York Times, ProPublica and the Chicago Tribune were doing online. I thought the public radio audience, which includes a lot of educated, curious people, would appreciate such data projects at WNYC, where I was news director.

Then I saw that Aron Pilhofer of the New York Times would be teaching a programming workshop at the 2009 Online News Association annual meeting. I signed up. In preparation, I installed Django on my laptop and started following the beginner's tutorial on my subway commute. I made my first "Hello World!" web app on the A Train.

I also started hanging out at Hacks/Hackers meetups and hackathons, where I'd watch people code and ask questions along the way.

Some of my experimentation made it onto WNYC's website -- including our 2010 Census maps and the NYC Hurricane Evacuation map ahead of Hurricane Irene. Shortly thereafter, WNYC management asked me to focus on it full-time.

Did you have any mentors? Who? What were the most important resources they shared with you?

I could not have done so much so fast without kindness, encouragement and inspiration from Pilhofer at the Times; Scott Klein, Al Shaw, Jennifer LaFleur and Jeff Larson at ProPublica; Chris Groskopf, Joe Germuska and Brian Boyer at the Chicago Tribune; and Jenny 8. Lee of, well, everywhere.

Each has unstuck me at various key moments and all have demonstrated in their own work what amazing things were possible. And they have put a premium on sharing what they know -- something I try to carry forward.

The moment I may remember most was at an afternoon geek talk aimed mainly at programmers. After seeing a demo of a phone app called Twilio, I turned to Al Shaw, sitting next to me, and lamented that I had no idea how to play with such things.

"You absolutely can do this," he said.

He encouraged me to pick up Sinatra, a surprisingly easy way to use the Ruby programming language. And I was off.

What does your personal data journalism "stack" look like? What tools could you not live without?

Google Maps - Much of what I can turn around quickly is possible because of Google Maps. I'm also experimenting with MapBox and Geocommons for more data-intensive mapping projects, like our NYC diversity map.

Google Fusion Tables - Essential for my wrangling, merging and mapping of data sets on the fly.

Google Spreadsheets - These have become the "backend" to many of our data projects, giving reporters and editors direct access to the data driving an application, chart or map. We wire them to our apps using Tabletop.js, an open-source program we helped to develop.

TextMate - A programmer's text editor for Mac. There are several out there, and some are free. TextMate is my fave.

The JavaScript Tools Bundle for TextMate - It checks my JavaScript code every time I save, alerting me to near-invisible, infuriating errors such as a stray comma or a missing parenthesis. I'm certain this one piece of software has given me more days with my kids.

Firebug for Firefox - Lets you see what your code is doing in the browser. Essential for troubleshooting CSS and JavaScript, and great for learning how the heck other people make cool stuff.

Amazon S3 - Most of what we build are static pages of HTML and JavaScript, which we host in the Amazon cloud and embed into article pages on our CMS.

census.ire.org - A fabulous, easy-to-navigate presentation of US Census data made by a bunch of journo-programmers for Investigative Reporters and Editors. I send someone there probably once a week.
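
The spreadsheet-as-"backend" idea mentioned above can be sketched roughly as follows. This is an illustration only, not WNYC's code and not how Tabletop.js works internally: it assumes the shared spreadsheet has been exported as CSV text, and the column names, rows and the `load_rows` helper are all invented placeholders.

```python
import csv
import io

# Stand-in for the CSV export of a spreadsheet that reporters and editors
# maintain. Every column name and value here is an invented placeholder.
SHEET_CSV = """county,candidate,votes
County A,Candidate X,120
County A,Candidate Y,95
County B,Candidate Y,140
"""

def load_rows(csv_text):
    """Parse spreadsheet CSV text into a list of dicts, one per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        # Spreadsheet cells arrive as strings, so convert numeric columns.
        row["votes"] = int(row["votes"])
        rows.append(row)
    return rows

# The app (chart, map or table) would render from these rows, so an edit
# in the spreadsheet changes the output without touching the code.
rows = load_rows(SHEET_CSV)
```

The appeal of this design is that the data stays somewhere non-programmers can edit it, while the page that displays it stays static.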

What data journalism project are you the most proud of working on or creating?

I'd have to say our GOP Iowa Caucuses feature. It has several qualities I like:

  • Mashed-up data -- It mixes live, county vote results with Patchwork Nation community types.
  • A new take -- We knew other news sites would shade Iowa's counties by the winner; we shaded them by community type and showed who won which categories.
  • Complete sharability -- We made it super-easy for anyone to embed the map into their own site, which was possible because the results came license-free from the state GOP via Google.
  • Key code from another journalist -- The map-rollover coolness comes from code built by Albert Sun, then of the Wall Street Journal and now at the New York Times.
  • Rapid learning -- I taught myself a LOT of JavaScript quickly.
  • Reusability -- We reused it for each state until Santorum bowed out.


Bonus: I love that I made most of it sitting at my mom's kitchen table over winter break.
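
The mash-up described in the list above, joining county vote results to community types and then showing who won each category, might look something like this minimal sketch. It is not the code behind the WNYC map; the county names, candidate names, community types, vote counts and the `winners_by_type` function are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical inputs: every name and number below is a placeholder.
county_results = {
    "County A": {"Candidate X": 120, "Candidate Y": 95},
    "County B": {"Candidate X": 80, "Candidate Y": 140},
    "County C": {"Candidate X": 60, "Candidate Y": 30},
}
community_type = {
    "County A": "Type 1",
    "County B": "Type 2",
    "County C": "Type 1",
}

def winners_by_type(results, types):
    """Sum votes per candidate within each community type, then pick the leader."""
    totals = defaultdict(lambda: defaultdict(int))
    for county, votes in results.items():
        ctype = types[county]  # join the two data sets on county name
        for candidate, count in votes.items():
            totals[ctype][candidate] += count
    # For each community type, the winner is the candidate with the most votes.
    return {ctype: max(cands, key=cands.get) for ctype, cands in totals.items()}

winners = winners_by_type(county_results, community_type)
```

A map would then shade each county by its entry in `community_type` and report `winners` alongside it, rather than shading by the county-level winner.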

Where do you turn to keep your skills updated or learn new things?

WNYC's editors and reporters. They have the bug, and they keep coming up with new and interesting projects. And I find project-driven learning is the most effective way to discover new things. New York Public Radio -- which runs WNYC along with classical radio station WQXR, New Jersey Public Radio and a street-level performance space -- also has a growing stable of programmers and designers, who help me build things, teach me amazing tricks and spot my frequent mistakes.

The IRE/NICAR annual conference. It's a meetup of the best journo-programmers in the country, and it truly seems each person is committed to helping others learn. They're also excellent at celebrating the successes of others.

Twitter. I follow a bunch of folks who seem to tweet the best stuff, and try to keep a close eye on 'em.

Why are data journalism and "news apps" important, in the context of the contemporary digital environment for information?

Candidates, companies, municipalities, agencies and non-profit organizations all are using data. And a lot of that data is about you, me and the people we cover.

So first off, journalism needs an understanding of the data available and what it can do. It's just part of covering the story now. To skip that part of the world would shortchange our audience, and our democracy. Really.

And the better we can both present data to the general public and tell data-driven (or -supported) stories with impact, the better we can do great journalism.

May 07 2012

Four short links: 7 May 2012

  1. Liquid Feedback -- MIT-licensed voting software from the Pirate Party. See this Spiegel Online piece about how it is used for more details. (via Tim O'Reilly)
  2. Putting Gestures Into Objects (Ars Technica) -- Disney and CMU have a system called Touché, where objects can tell whether they're being clasped, swiped, pinched, etc. and by how many fingers. (via BoingBoing)
  3. Real-time Facebook 'likes' Displayed On Brazilian Fashion Retailer's Clothes Racks (The Verge) -- each hanger has a digital counter reflecting the number of likes.
  4. Foldit Games Next Play: Crowdsourcing Better Drug Design (Nature Blogs) -- “We’ve moved beyond just determining structures in nature,” Cooper, who is based at the University of Washington’s Center for Game Science in Seattle, told Nature Medicine. “We’re able to use the game to design brand new therapeutic enzymes.” He says players are now working on the ground-up design of a protein that would act as an inhibitor of the influenza A virus, and he expects to expand the drug development uses of the game to small molecule design within the next year.
