
May 09 2013

Four short links: 9 May 2013

  1. On Google’s Ingress Game (ReadWrite Web) — By rolling out Ingress to developers at I/O, Google hopes to show how mobile, location, multi-player and augmented reality functions can be integrated into developer application offerings. In that way, Ingress becomes a kind of “how-to” template to developers looking to create vibrant new offerings for Android games and apps. (via Mike Loukides)
  2. Nanoscribe Micro-3D Printer — in contrast to stereolithography (SLA), the resolution is between 1 and 2 orders of magnitude higher: Feature sizes in the order of 1 µm and less are standard. (via BoingBoing)
  3. Thingpunk — The problem of the persistence of these traditional values is that they prevent us from addressing the most pressing design questions of the digital era: How can we create these forms of beauty and fulfill this promise of authenticity within the large and growing portions of our lives that are lived digitally? Or, conversely, can we learn to move past these older ideas of value, to embrace the transience and changeability offered by the digital as virtues in themselves? Thus far, instead of approaching these (extremely difficult) questions directly, traditional design thinking has led us to avoid them by trying to make our digital things more like physical things (building in artificial scarcity, designing them skeuomorphically, etc.) and by treating the digital as a supplemental add-on to primarily physical devices and experiences (the Internet of Things, digital fabrication).
  4. Kickstarter and NPR — The internet turns everything into public radio. There’s a truth here about audience-supported media and the kinds of money-extraction systems necessary to beat freeloading in a medium that makes money-collection hard and freeloading easy.

April 24 2013

Four short links: 2 May 2013

  1. Metrico — puzzle game for Playstation centered around infographics (charts and graphs). (via Flowing Data)
  2. The Lease They Can Do (Business Week) — excellent Paul Ford piece on money, law, and music streaming services. So this is not about technology. Nor is it really about music. This is about determining the optimal strategy for mass licensing of digital artifacts.
  3. How Effective Is a Humanoid Robot as a Tool for Interviewing Young Children? (PLoS ONE) — The results reveal that the children interacted with KASPAR very similar to how they interacted with a human interviewer. The quantitative behaviour analysis reveal that the most notable difference between the interviews with KASPAR and the human were the duration of the interviews, the eye gaze directed towards the different interviewers, and the response time of the interviewers. These results are discussed in light of future work towards developing KASPAR as an ‘interviewer’ for young children in application areas where a robot may have advantages over a human interviewer, e.g. in police, social services, or healthcare applications.
  4. Funding: Australia’s Grant System Wastes Time (Nature, paywalled) — We found that scientists in Australia spent more than five centuries’ worth of time preparing research-grant proposals for consideration by the largest funding scheme of 2012. Because just 20.5% of these applications were successful, the equivalent of some four centuries of effort returned no immediate benefit to researchers.

March 25 2013

The media-marketing merge

I ran across a program Forbes is running called BrandVoice that gives marketers a place on Forbes’ digital platform. During a brief audio interview with TheMediaBriefing, Forbes European managing director Charles Yardley explained how BrandVoice works:

“It’s quite simply a tenancy fee. A licensing fee that the marketer pays every single month. It’s based on a minimum of a six-month commitment. There’s two different tiers, a $50,000-per-month level and a $75,000-per-month level.” [Discussed at the 4:12 mark.]

Take a look at some of the views BrandVoice companies are getting. You can see why marketers would be interested.

BrandVoice example screenshots

An arrangement like this always leads to big questions: Does pay-to-play content erode trust? Is this a short-term gain that undermines long-term editorial value?

Those are reasonable things to ask, but I have a different take. When I look at BrandVoice posts like this and this, I’m indifferent toward the whole thing — the posts, the partnerships, all of it.

In my mind, these posts don’t reveal a gaping crack in The Foundation of Journalism. Nor do I have an issue with Forbes opening up new revenue streams through its digital platform. Rather, this is just more content vying for attention. It’s material that’s absorbed into the white noise of online engagement.

Now, if a piece of content earns attention — if it has real novelty or insight — that would change my view (I’m using the word “would” because this is all theoretical). I’d still need to know the source and be able to trust the information, and see clear and obvious warnings when content is published outside of traditional edit norms. But if all of those must-haves are present, is there anything wrong with interesting content that comes through a pay-to-play channel?

Heck, TV advertisers pay to spread messages through broadcast platforms, and from time to time those ads are entertaining and maybe even a little useful. Is that any different?

I’ve been neck-deep in media and marketing for years, and it’s possible my perspective is obscured by saturation. That’s why I’d like to hear other viewpoints on these media-marketing arrangements. Please chime in through the comments if you have an opinion.


Disclosure: O’Reilly Media has a blog on Forbes. It’s not part of the BrandVoice program, and there’s no financial arrangement.


March 18 2013

Why I’m changing my tune on paywalls

The Pew Research Center is out with its annual “State of the News Media” report. Much of it is what you’d expect: newspapers and local television are struggling, mobile is rising, digital revenue hasn’t — and can’t — replace traditional print revenue, and on and on.

But read carefully, and you’ll find hope.

For example, Pew says the embrace of paywalls might improve the quality of the content:

“The rise of digital paid content could also have a positive impact on the quality of journalism as news organizations strive to produce unique and high-quality content that the public believes is worth paying for.”

I used to criticize paywalls. I thought they could only work for specialized content or material that’s attached to a desired outcome (e.g., subscribe to the Wall Street Journal, use the insights to make money).

My concern was that publishers would slam walls around their existing content and ask people to pay for an experience that had once been free. That made no sense. Who wants to pay for slideshows and link bait and general news?

But content that’s “worth paying for” is a different thing altogether. Publishers who go this route are acknowledging that a price tag requires justification.

Will it work? Maybe. What I might pay is different than what you might pay. There’s that pesky return-on-investment thing to consider as well.

However, my bigger takeaway — and this is why I’m changing my tune on paywalls — is that value is now part of the paywall equation. That’s a good start.

February 20 2013

Four short links: 20 February 2013

  1. The Network of Global Control (PLoS One) — We find that transnational corporations form a giant bow-tie structure and that a large portion of control flows to a small tightly-knit core of financial institutions. [...] From an empirical point of view, a bow-tie structure with a very small and influential core is a new observation in the study of complex networks. We conjecture that it may be present in other types of networks where “rich-get-richer” mechanisms are at work. (via The New Aesthetic)
  2. Using SimCity to Diagnose My Home Town’s Traffic Problems — no actual diagnosis performed, but the modeling and observations gave insight. I always feel that static visualizations (infographics) are far less useful than an interactive simulation that can give you an intuitive sense of relationships and behaviour. Once I’d built East Didsbury, the strip of shops in Northenden stopped making as much money as they once were, and some were even beginning to close down as my time ran out. Walk along Northenden high street, and you’ll know that feeling.
  3. How the Harlem Shake Went from Viral Sideshow to Global Meme (The Verge) — interesting because again the musician is savvy enough (and has tools and connections) to monetize popularity without trying to own every transaction involving his idea. Baauer and Mad Decent have generally been happy to let a hundred flowers bloom, permitting over 4,000 videos to use an excerpt of the song but quietly adding each of them to YouTube’s Content ID database, asserting copyright over the fan videos and claiming a healthy chunk of the ad revenue for each of them.
  4. typeahead.js (GitHub) — Javascript library for fast autocomplete.
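If you haven't wired up an autocomplete widget before, here is roughly what the client-side lookup boils down to. This is not typeahead.js's actual API, just a minimal TypeScript sketch (with made-up sample data) of the prefix matching such libraries perform on each keystroke:

```typescript
// Minimal autocomplete sketch; illustrative only, not the typeahead.js API.
// The candidate list is normalized once up front, then filtered on every keystroke.

const candidates = ["Auckland", "Austin", "Australia", "Austria", "Boston"];
const normalized = candidates.map(c => ({ raw: c, key: c.toLowerCase() }));

function suggest(query: string, limit = 5): string[] {
  const q = query.trim().toLowerCase();
  if (q.length === 0) return [];
  return normalized
    .filter(c => c.key.startsWith(q)) // plain prefix match; real libraries add fuzzier scoring
    .slice(0, limit)
    .map(c => c.raw);
}

console.log(suggest("au")); // ["Auckland", "Austin", "Australia", "Austria"]
```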

February 14 2013

Four short links: 14 February 2013

  1. Welcome to the Malware-Industrial Complex (MIT) — brilliant phrase, sound analysis.
  2. Stupid Stupid xBox — The hardcore/soft-tv transition and any lead they feel they have is simply not defensible by licensing other industries’ generic video or music content because those industries will gladly sell and license the same content to all other players. A single custom studio of 150 employees also can not generate enough content to defensibly satisfy 76M+ customers. Only with quality primary software content from thousands of independent developers can you defend the brand and the product. Only by making the user experience simple, quick, and seamless can you defend the brand and the product. Never seen a better-put statement of why an ecosystem of indies is essential.
  3. Data Feedback Loops for TV (Salon) — Netflix’s data indicated that the same subscribers who loved the original BBC production also gobbled down movies starring Kevin Spacey or directed by David Fincher. Therefore, concluded Netflix executives, a remake of the BBC drama with Spacey and Fincher attached was a no-brainer, to the point that the company committed $100 million for two 13-episode seasons.
  4. wrk — a modern HTTP benchmarking tool capable of generating significant load when run on a single multi-core CPU. It combines a multithreaded design with scalable event notification systems such as epoll and kqueue.
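To make "generating significant load" concrete, here is a toy load generator in TypeScript. It just fires fixed-size batches of requests and reports the mean latency; it bears no resemblance to wrk's multithreaded epoll/kqueue event loop, and the URL and request counts are placeholders. It assumes Node 18+ so that fetch and performance are available as globals:

```typescript
// Toy HTTP load generator: fire requests in fixed-size batches and report mean latency.
// Assumes Node 18+ (global fetch and performance). The target URL is a placeholder.

async function timedGet(url: string): Promise<number> {
  const start = performance.now();
  const res = await fetch(url);
  await res.arrayBuffer(); // drain the body so the connection can be reused
  return performance.now() - start;
}

async function run(url: string, totalRequests: number, concurrency: number): Promise<void> {
  const latencies: number[] = [];
  for (let sent = 0; sent < totalRequests; sent += concurrency) {
    const batchSize = Math.min(concurrency, totalRequests - sent);
    const batch = Array.from({ length: batchSize }, () => timedGet(url));
    latencies.push(...(await Promise.all(batch)));
  }
  const mean = latencies.reduce((a, b) => a + b, 0) / latencies.length;
  console.log(`${latencies.length} requests, mean latency ${mean.toFixed(1)} ms`);
}

run("http://localhost:8080/", 200, 20).catch(console.error);
```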

February 07 2013

Looking at the many faces and forms of data journalism

Over the past year, I’ve been investigating data journalism. In that work, I’ve found no better source for understanding the who, where, what, how and why of what’s happening in this area than the journalists who are using and even building the tools needed to make sense of the exabyte age. Yesterday, I hosted a Google Hangout with several notable practitioners of data journalism. Video of the discussion is embedded below:

Over the course of the discussion, we talked about what data journalism is, how journalists are using it, the importance of storytelling, ethics, the role of open source and “showing your work” and much more.

Participants

Guests on the hangout included:

Projects

Here are just a few of the sites, services and projects we discussed:

In addition, you can see more of our data journalism coverage here.

February 05 2013

Four short links: 5 February 2013

  1. toolbar — tooltips in jQuery; cf. hint.css, which is tooltips in CSS.
  2. Security Engineering — 2ed now available online for free. (via /r/netsec)
  3. Economics of Netflix’s $100M New Show (The Atlantic) — Up until now, Netflix’s strategy has involved paying content makers and distributors, like Disney and Epix, for streaming rights to their movies and TV shows. It turns out, however, the company is overpaying on a lot of those deals. [...] [T]hese deals cost Netflix billions.
  4. Inception — a FireWire physical memory manipulation and hacking tool exploiting IEEE 1394 SBP-2 DMA. The tool can unlock (any password accepted) and escalate privileges to Administrator/root on almost* any powered on machine you have physical access to. The tool can attack over FireWire, Thunderbolt, ExpressCard, PC Card and any other PCI/PCIe interfaces. (via BoingBoing)


January 14 2013

Four short links: 14 January 2013

  1. Open Source Metrics — Talking about the health of the project based on a single metric is meaningless. It is definitely a waste of time to talk about the health of a project based on metrics like number of software downloads and mailing list activities. Amen!
  2. BitTorrent To Your TV — The first ever certified BitTorrent Android box goes on sale today, allowing users to stream files downloaded with uTorrent wirelessly to their television. The new set-top box supports playback of all popular video formats and can also download torrents by itself, fully anonymously if needed. (via Andy Baio)
  3. Tumblr URL Culture — the FOO.tumblr.com namespace is scarce and there’s non-financial speculation. People hoard and trade URLs, whose value is that they say “I’m cool and quirky”. I’m interested because it’s a weird largely-invisible Internet barter economy. Here’s a rant against it. (via Beta Knowledge)
  4. Design-Fiction Slider Bar of Disbelief (Bruce Sterling) — I love the list as much as the diagram. He lays out a sliding scale from “objective reality” to “holy relics” and positions black propaganda, 419 frauds, design pitches, user feedback, and software code on that scale (among many other things). Bruce is an avuncular Loki, pulling you aside and messing with your head for your own good.

January 10 2013

Want to analyze performance data for accountability? Focus on quality first.

Here’s an ageless insight that will endure well beyond the “era of big data”: poor collection practices and aging IT will derail any institutional efforts to use data analysis to improve performance.

According to an investigation by the Los Angeles Times, poor record-keeping is holding back state government efforts to upgrade California’s 911 system. As with any database project, beware “garbage in, garbage out,” or “GIGO.”

As Ben Welsh and Robert J. Lopez reported for the L.A. Times in December, California’s Emergency Medical Services Authority has been working to centralize performance data since 2009.

Unfortunately, it’s difficult to achieve data-driven improvements or manage against perceived issues by applying big data to the public sector if the data collection itself is flawed. The L.A. Times reported that the quality issues ranged from inconsistencies in how response times were measured, to record keeping on paper, to a failure to keep records at all.

Image Credit: Ben Welsh, who mapped 911 response time data for the Los Angeles Times.

When I shared this story with the Radar team, Nat Torkington suggested revisiting the “Observe, Orient, Decide, and Act” (OODA) loop familiar to military strategists.

“If your observations are flawed, your decisions will be too,” wrote Nat, in an email exchange. “If you pump technology investment into the D phase, without similarly improving the Os, you’ll make your crappy decisions faster.”

Alistair Croll explored the relevance of OODA to big data in his post on the feedback economy last year. If California wants to catalyze the use of data-driven analysis to improve response times that vary by geography and jurisdiction, it needs to start with the first “O.”

The set of factors at play here, however, means that there won’t be a single silver bullet for putting California’s effort back on track. Lack of participation, missing reporting standards, and old IT systems are all at issue — and given California’s ongoing financial issues, upgrading the latter and requiring local fire departments and ambulance firms to spend time and money on data collection will not be an easy sell.

Filed from the data desk

The investigative work of the L.A. Times was substantially supported by its Data Desk, a team of reporters and web developers that specializes in maps, databases, analysis and visualization. I included their interactive visualization mapping how fast the Los Angeles Fire Department responded to calls in my recent post on how data journalism is making sense of the world. When I profiled Ben Welsh’s work last year in our data journalist series, he told me this kind of project is exactly the sort of work he’s most proud of doing.

“As we all know, there’s a lot of data out there,” said Welsh, in our interview, “and, as anyone who works with it knows, most of it is crap. The projects I’m most proud of have taken large, ugly datasets and refined them into something worth knowing: a nut graf in an investigative story or a data-driven app that gives the reader some new insight into the world around them.”

The Data Desk set a high bar in this most recent investigation by not only making sense of the data, but also releasing the data behind the open source maps of California’s emergency medical agencies it published as part of the series.

This isn’t the first time they’ve made code available. As Welsh noted in a post about the series, the Data Desk has “previously written about the technical methods used to conduct [the] investigation, released the base layer created for an interactive map of response times and contributed the location of LAFD’s 106 fire station to the Open Street Map.”

Creating an open source newsroom is not easy. In sharing not only its code but its data, the Los Angeles Times is setting a notable example for the practice of open journalism in the 21st century, building out the newsroom stack and hinting at media’s networked future.

This post is part of our series investigating data journalism.

December 21 2012

Six ways data journalism is making sense of the world, around the world

When I wrote that Radar was investigating data journalism and asked for your favorite examples of good work, we heard back from around the world.

I received emails from Los Angeles, Philadelphia, Canada and Italy that featured data visualization, explored the role of data in government accountability, and shared how open data can revolutionize environmental reporting. A tweet pointed me to a talk about how R is being used in the newsroom. Another tweet linked to relevant interviews on social science and the media.

Two of the case studies focused on data visualization, an important practice that my colleague Julie Steele and other editors at O’Reilly Media have been exploring over the past several years.

Several other responses are featured at more length below. After you read through, make sure to also check out this terrific Ignite talk on data journalism recorded at this year’s Newsfoo in Arizona.

Visualizing civic health

Meredith Broussard, a professor at the University of Pennsylvania, sent us a link to a recent data journalism project she did for Hidden City Philadelphia measuring the city’s civic health. The project won honorable mention in the civic data challenge run by the National Council on Citizenship and the Knight Foundation. Data visualization was a strong theme among the winners of that challenge.

Data journalism in Philadelphia

Mapping ambulance response times

I profiled the data journalism work of The Los Angeles Times earlier this year, when I interviewed news developer Ben Welsh about the newspaper’s Data Desk, a team of reporters and web developers that specializes in maps, databases, analysis and visualization.

Recently, the Data Desk made an interactive visualization that mapped how fast the Los Angeles Fire Department responds to calls.

LA Times fire response times

Visualizing UK government spending

The Guardian Datablog is one of the best sources of interesting, relevant data journalism work, from sports to popular culture to government accountability. Every post demonstrates an emerging practice: its editors make it possible for readers to download the data themselves. Earlier this month, the Datablog put government spending in the United Kingdom under the microscope and accompanied it with a downloadable graphic (PDF).

The Guardian’s data journalism is particularly important as the British government continues to invest in open data. In June, the United Kingdom’s Cabinet Office relaunched Data.gov.uk and released a new open data white paper. The British government is doubling down on the notion that open data can be a catalyst for increased government transparency, civic utility and economic prosperity. The role of data journalism in delivering those outcomes is central.

(Note: A separate Radar project is digging into the open data economy.)

An Italian data job

The Italian government, while a bit behind the pace set in the UK, has made more open data available since it launched a national platform in 2011.

Elisabetta Tola, an Italian data journalist, wrote in to share her work on a series of Wired Magazine articles that feature data on seismic risk assessment in Italian schools. The interactive lets parents search for schools, a feature that embodies service journalism and offers more value than a static map.

Italian schools and earthquakes visualization

Tola highlighted a key challenge in Italy that exists in many other places around the world: How can data journalism be practiced in countries that do not have a Freedom of Information Act or a tradition of transparency on government actions and spending? If you have ideas, please share them in the comments or email me.

Putting satellite imagery to work

Brazil, by way of contrast, notably passed a freedom of information law this past year, fulfilling one of its commitments to the Open Government Partnership.

Earlier this year, when I traveled to Brazil to moderate a panel at the historic partnership’s annual meeting, I met Gustavo Faleiros, a journalist using open data to report on the Amazon rainforest. Faleiros is a Knight International Journalism Fellow, working in partnership with the Washington-based organizations International Center for Journalists and Internews. Today, Faleiros continues that work as the project coordinator for InfoAmazonia.org, a beautiful mashup of open data, maps and storytelling.

Faleiros explained that the partnership is training Brazilian journalists to use satellite imagery and collect data related to forest fires and carbon monoxide. He shared this video that shows a data visualization that came out of that work:

As 2012 comes to an end, the rate of Amazon deforestation has dropped to record lows. These tools help the world see what’s happening from high above.

Data-driven broadcast journalism?

I also heard about work in much colder climes when Keith Robinson wrote in from Canada. “As part of large broadcast organizations, one thing that is very satisfying about data journalism is that it often puts our digital staff in the driver’s seat — what starts as an online investigation often becomes the basis for original and exclusive broadcast content,” he wrote in an email.

Robinson, the senior producer for specials and interactive at Global News in Canada, highlighted several examples of their Data Desk’s work, including:

Robinson expects 2013 will see further investment and expansion in the data journalism practice at Global News.

Robinson also pointed to a practice that media should at least consider adopting: Global News is not only consuming and displaying open data, but also publishing the data they receive from the Canadian government. “As we make access to information requests, we’re trying to make the data received available to the public,” he wrote.

From the big picture to next steps

It was instructive to learn more about the work of two large media organizations, the Los Angeles Times and Canada’s Global News, which have been building their capacity to practice data journalism. The other international perspectives in my inbox and tweet stream, however, were a reminder that big-city newsrooms that can afford teams of programmers and designers aren’t the only players here.

To put it another way, acts of data journalism by small teams or individuals aren’t just plausible, they’re happening — from Italy to Brazil to Africa. That doesn’t mean that the news application teams at NPR, The Guardian, ProPublica or the New York Times aren’t setting the pace for data journalism when it comes to cutting edge work — far from it — but the tools and techniques to make something worthwhile are being democratized.

That’s possible in no small part because of the trend toward open source tools and social coding I’m seeing online, from Open Street Map to more open elections.

It’s a privilege to have a global network to tap into for knowledge and, in the best moments, wisdom. Thank you — and please keep the responses coming, whether you use email, Twitter or the phone. Your input is helping shape a report I’m developing that ties together our coverage of data journalism. Look for that to be published early in the new year.

Related


November 29 2012

As digital disruption comes to Africa, investing in data journalism takes on new importance

This interview is part of our ongoing look at the people, tools and techniques driving data journalism.

I first met Justin Arenstein (@justinarenstein) in Chişinău, Moldova, where the media entrepreneur and investigative journalist was working as a trainer at a “data boot camp” for journalism students. The long-haired, bearded South African instantly makes an impression with his intensity, good humor and focus on creating work that gives citizens actionable information.

Whenever we’ve spoken about open data and open government, Arenstein has been a fierce advocate for data-driven journalism that not only makes sense of the world for readers and viewers, but also provides them with tools to become more engaged in changing the conditions they learn about in the work.

He’s relentlessly focused on how open data can be made useful to ordinary citizens, from Africa to Eastern Europe to South America. For instance, in November, he highlighted how data journalism boosted voter registration in Kenya, creating a simple website using modern web-based tools and technologies.

For the last 18 months, Arenstein has been working as a Knight International Fellow embedded with the African Media Initiative (AMI), a group of the 800 largest media companies on the African continent, as a director for digital innovation. In that role, Arenstein has been creating an innovation program for the AMI, building more digital capacity in countries that are as much in need of effective accountability from the Fourth Estate as any in the world. Digital disruption hasn’t yet played itself out in Africa because of a number of factors, explained Arenstein, but he estimates that it will arrive within five years.

“Media wants to be ready for this,” he said, “to try and avoid as much of the business disintegration as possible. The program is designed to help them grapple with and potentially leapfrog coming digital disruption.”

In the following interview, Arenstein discusses the African media ecosystem, the role of Hacks/Hackers in Africa, and expanding the capacity of data journalism.

Why did you adopt the Hacks/Hackers model and scale it? Why is it relevant to what’s happening around Africa?

Justin Arenstein: African journalists are under-resourced but also poorly trained, probably even more so than in the U.S. and elsewhere. Very, very few of them have any digital skills, never mind coding skills. Simply waiting for journalists to make the leap themselves and start learning coding skills and more advanced digital multimedia content production skills is just too — well, we don’t have enough time to do that, if we’re going to beat this disruption that’s coming.

The idea was to clone parts of the basic model of Hacks/Hackers from the U.S., which is a voluntary forum and society where journalists, UI people, designers, graphics people and coders meet up on a regular basis.

Unlike in the U.S., where Hacks/Hackers is very focused on startup culture, the African chapters have been very focused on data-driven journalism and imparting some basic skills. We’re trying to avoid some of the pitfalls experienced in the U.S. and get down to using data as a key tool in creating content. A big weakness in a lot of African media is that there’s very little unique content, firstly, and that the unique content that is available is not particularly well produced. It’s not deep. It’s not substantiated. It’s definitely not linked data.

We’ve been focusing on improving the quality of the content so that the companies where these journalists work will be able to start weaning themselves from some of the bad business practices that they are guilty of and start concentrating on building up their own inventory. That’s worked really well in some of the African countries along the coastlines where there’s data access, because you’ve got cables coming in. In the hinterland of Africa, data and Internet are not widely available. The Hacks/Hackers chapters there have been more like basic computer-assisted reporting training organizations.

Like in the U.S., they all run themselves. But unlike in the U.S., we have a structured agenda, a set of protocols, an operating manual, and we do subsidize each of the chapters to help them meet the physical needs of cost. They’re not quite as voluntary as the U.S. ones; it’s a more formal structure. That’s because they’re designed to surface good ideas, to bring together a challenge that you wouldn’t ordinarily find in the media ecosystem at least, and then to help kick-start experimentation.

Do you see any kind of entrepreneurial activity coming out of them now?

Justin Arenstein: I’m not aware of any notable startups. We’ve had ideas where people are collaborating to build toward startups. I haven’t seen any products launched yet, but what we have seen is journalist-led startups that were outside of these Hacks/Hackers chapters now starting to come into the fold.

Why? Because this is where they can find some of the programming and engineering skills that they need, that they were struggling to find outside of the ecosystem. They are finding engineers or programmers, at least, but they’re not finding programmers who are tuned to content needs or to media philosophies and models. There’s a better chance that they’ll find those inside of these chapters.

The chapters are fairly young, though. The oldest chapter is about six months old now, and still fairly small. We’re nowhere near the size of some of the Latin American chapters. We have forged very strong links with them, and we follow their model a lot more closely than the U.S. model. The biggest chapter is probably about 150 members. They all meet, at a minimum, once a month. Interestingly, they are becoming the conduits not just for hackathons and “scrape-a-thons,” but are also now our local partners for implementing things like our data boot camps.

Those are week-long, intensive, hands-on experiential trainings, where we’re flying in people from the Guardian data units, the Open Knowledge Foundation and from Google. We’re actually finding the guys behind Google Refine and Google Fusion Tables and flying in some of those people, so they can see end-users in a very different environment to what they’re used to. People walk into those boot camps not knowing what a spreadsheet is and, by the end of it, they’re producing their first elementary maps and visualizations. They’re crunching data.

What stories have “data boot camp” participants produced afterward?

Justin Arenstein: Here’s an example. We had a boot camp in Kenya. NTV, the national free-to-air station, had been looking into why young girls in a rural area of Kenya did very well academically until the ages of 11 or 12 — and then either dropped off the academic record completely or their academic performance plummeted. The explanation by the authorities and everyone else was that this was simply traditional; it’s tribal. Families are pulling them out of school to do chores and housework, and as a result, they can’t perform.

Irene Choge [a Kenyan boot camp participant who attended data journalism training] started mining the data. She came from that area and knew it wasn’t that [cause]. So she looked into public data. She first assumed it was cholera, so she looked into medical records. Nothing there. She then looked into water records. From water, she started looking into physical infrastructure and public works. She discovered these schools had no sanitation facilities and that the schools with the worst performing academics were those that didn’t have sanitation facilities, specifically toilets.

What’s the connection?

Justin Arenstein: When these girls start menstruating, there’s nowhere for them to go to attend to themselves, other than into the bushes around the school. They were getting harassed and embarrassed. They either stopped going to school completely or they would stop going during that part of their cycle and, as a result, their schoolwork suffered dramatically. She then produced a TV documentary that evoked widespread public outcry and changed policies.

In addition to that, her newsroom is working on building an app. A parent who watches this documentary and is outraged will then be able to use the app to find out what’s happening at their daughter’s school. If their daughter’s school is one of those that has no facilities, the app then helps them through a text-based service to sign a petition and petition the responsible official to improve the situation, as well as link up with other outraged parents. It mobilizes them.

What we liked about her example was that it was more than just doing a visualization, which is what people think about when you say “data journalism.”

First, she used data tools to find trends and stories that had been hidden to solve a mystery. Secondly, she then did real old-fashioned journalism and went out in the field and confirmed the data wasn’t lying. The data was accurate.

Thirdly, she then used the data to give people the tools to actually act on the information. She’s using open data and finding out in your district, this is your school, this is how you impact it, this is the official you should be emailing or writing to about it. That demonstrates that, even in a country where most people access information through feature phones, data can still have a massive impact at grassroots level.

These are the kinds of successes that we are looking for in these kinds of outreach programs when it comes to open data.

How does the practice of data-driven journalism or the importance of computer-assisted reporting shift when a reporter can’t use rich media or deploy bandwidth-heavy applications?

Justin Arenstein: We’re finding something that maybe you’re starting to see inklings of elsewhere as well: data journalism doesn’t have to be the product. Data journalism can also be the route that you follow to get to a final story. It doesn’t have to produce an infographic or a map.

Maps are very good ways to organize information. They’re very poor mechanisms for consuming information. No one kicks back on a Sunday afternoon lying on their sofa, reading a map, but if a map triggers geofenced information and pushes relevant local information at you in your vicinity, then it becomes a useful mechanism.

What we’re doing in newsrooms is around investigative journalism. For example, we’re funding projects around extractive industries. We’re mapping out conversations and relationships between people. We’re then using them as analytical tools in the newsroom to arrive at better, deeper and evidence-driven reporting, which is a major flaw and a major weakness in many African media.

What capacity needs to be built in these areas? What are people doing now? What matters most?

Justin Arenstein: Investigative journalism in Africa, like in many other places, tends to be scoop-driven, which means that someone’s leaked you a set of documents. You’ve gone and you’ve verified them and often done great sleuth work. There are very few systematic, analytical approaches to analyzing broader societal trends. You’re still getting a lot of hit-and-run reporting. That doesn’t help us analyze the societies we’re in, and it doesn’t help us, more importantly, build the tools to make decisions.

Some of the apps that we are helping people build, based off of their reporting, are invariably not visualizations. They’re rather saying, “Let’s build a tool that augments the reporting, reflects the deeper data that the report is based on, and allows people to use that tool to make a personal decision.” It’s engendering action.

A lot of the fantastic work you’ve seen from people at the Guardian and others has been about telling complex stories simply via infographics, which is a valid but very different application of data journalism.

I think that, specifically in East Africa and in Southern Africa, there’s growing recognition that the media are important stewards of historical data. In many of these societies, including industrialized societies like South Africa, the state hasn’t been a really good curator of public data and public information because of their political histories.

Nation states don’t see data as an asset? Is that because technical capacity isn’t there? Or is that because data actually contains evidence of criminality, corruption or graft?

Justin Arenstein: It’s often ineptitude and lack of resources in South Africa’s instance. In a couple of other countries, it’s systematic purging of information that is perhaps embarrassing when there’s a change of regime or political system — or in the case of South Africa and many of the colonial countries, a simple unwillingness or lack of insight as to the importance of collecting data about second-class citizens, largely the black population.

The official histories are very thin. There’s nowhere near the depth of nuance or insight into a society that you would find in the U.S. or in Europe, where there’s been very good archival record keeping. Often, the media are the only people who’ve really been keeping that kind of information, in terms of news reportage. It’s not brilliant. It’s often not primary sources — it’s secondary. But the point is that often it’s the only information that’s available.

What we’re doing is working with media companies now to help digitize and turn reportage into structured data. In a vacuum, because there is no other data, suddenly it becomes an important commercial commodity. Anyone who wants to build, for example, a tourism app or a transport app, will find that there is no other information available. This may sound like a bizarre concept to most people living in data-rich countries, like the U.S., but you simply can’t find the content. That means that you have to then go out and create the content yourself before you can build the app.

Is this a different sort of a “data divide,” where a country is “data-poor?”

Justin Arenstein: Well, maybe digitally “data poor,” because what we are doing is we’re saying that there is data. We initially also had the same reaction, saying “there is no data here,” and then realized that there’s a hell of a lot of data. Invariably, it’s locked up in deadwood format. So [we're now] liberating that data, digitizing it, structuring it, and then making sure that it’s available for people to use.

How much are media entities you work with making data, as opposed to just digitizing?

Justin Arenstein: Some are making data. We haven’t, because a lot of other actors are involved in citizen data creation. We haven’t really focused too many of our very scarce resources on that component yet.

We are funding a couple of citizen reporting apps, because there’s a lot of hype around citizen data and we’re trying to see if there are models that can really work where you create credible, sourced and actionable information. We don’t believe that you’re going to be able to do that just from text messaging. We’re looking at alternative kinds of interfaces and methods for transmitting information.

Are there companies and startups that are consuming the digital data that you’re producing? If so, what are they doing?

Justin Arenstein: Outside of the News Challenge, we are co-founding something with the World Bank called “Code for Kenya.” It’s modeled fairly closely on the Mozilla Open Use Fellowships, with a few tweaks. It’s maybe a hybrid of Code for America and the Mozilla Open Fellowships.

Where Code for America focuses on cities and Mozilla focuses on newsrooms, we’ve embedded open data strategists and evangelists into the newsrooms, backed up by an external development team at a civic tech lab. They’re structuring the data that’s available, such as turning old microfiche rolls into digital information, cleaning it up and building a data disk. They’re building news APIs and pushing the idea that rather than building websites, design an API specifically for third-party repurposing of your content. We’re starting to see the first early successes. Four months in, some of the larger media groups in Kenya are now starting to have third-party entrepreneurs developing using their content and then doing revenue-share deals.

The only investment from the data holder, which is the media company, is to actually clean up the data and then make it available for development. Now, that’s not a new concept. The Guardian in the United Kingdom has experimented with it. It’s fairly exciting for these African companies because there’s arguably a larger appetite for the content, since there’s not as much content available. Suddenly, the unit value of that data is far higher than it might be in the U.K. or in the U.S.

Media companies are seriously looking at it as one of many potential future revenue streams. It enables them to repurpose their own data, start producing books and the rest of it. There isn’t much book publishing in Africa, by Africans, for Africans. Suddenly, if the content is available in an accessible format, it gives them an opportunity to mash-up stuff and create new kinds of books.

They’ll start seeing that content itself can be a business model. The impact that we’re seeking there is to try and show media companies that investing in high-quality unique information actually gives you a long-term commodity that you can continue to reap benefits from over time. Whereas simply pulling stuff off the wire or, as many media do in Africa, simply lifting it off of the web, from the BBC or elsewhere, and crediting it, is not a good business model.

Photo via International Center for Journalists.

Related:

November 26 2012

Investigating data journalism

Great journalism has always been based on adding context, clarity and compelling storytelling to facts. While the tools have improved, the art is the same: explaining the who, what, where, when and why behind the story. The explosion of data, however, provides new opportunities to think about reporting, analysis and publishing stories.

As you may know, there’s already a Data Journalism Handbook to help journalists get started. (I contributed some commentary to it). Over the next month, I’m going to be investigating the best data journalism tools currently in use and the data-driven business models that are working for news startups. We’ll then publish a report that shares those insights and combines them with our profiles of data journalists.

Why dig deeper? Getting to the heart of what’s hype and what’s actually new and noteworthy is worth doing. I’d like to know, for instance, whether tutorials specifically designed for journalists can be useful, as Joe Brockmeier suggested at ReadWrite. On a broader scale, how many data journalists are working today? How many will be needed? What are the primary tools they rely upon now? What will they need in 2013? Who are the leaders or primary drivers in the area? What are the most notable projects? What organizations are embracing data journalism, and why?

This isn’t a new interest for me, but it’s one I’d like to found in more research. When I was offered an opportunity to give a talk at the second International Open Government Data Conference at the World Bank this July, I chose to talk about open data journalism and invited practitioners on stage to share what they do. If you watch the talk and the ensuing discussion in the video below, you’ll pick up great insight from the work of the Sunlight Foundation, the experience of Homicide Watch and why the World Bank is focused on open data journalism in developing countries.

The sites and themes that I explored in that talk will be familiar to Radar readers, focusing on the changing dynamic between the people formerly known as the audience and the editors, researchers and reporters who are charged with making sense of the data deluge for the public good. If you’ve watched one of my Ignites or my Berkman Center talk, much of this won’t be new to you, but the short talk should be a good overview of where I think this aspect of data journalism is going and why I think it’s worth paying attention to today.

For instance, at the Open Government Data Conference Bill Allison talked about how open data creates government accountability and reveals political corruption. We heard from Chris Amico, a data journalist who created a platform to help a court reporter tell the story of every homicide in a city. And we heard from Craig Hammer how the World Bank is working to build capacity in media organizations around the world to use data to show citizens how and where borrowed development dollars are being spent on their behalf.

The last point, regarding capacity, is a critical one. Just as McKinsey identified a gap between available analytic talent and the demand created by big data, there is a data science skills gap in journalism. Rapidly expanding troves of data are useless without the skills to analyze them, whatever the context. An overemphasis on technical skills could exclude the best candidates for these jobs, but there will need to be training to build those skills.

This reality hasn’t gone unnoticed by foundations or the academy. In May, the Knight Foundation gave Columbia University $2 million for research to help close the data science skills gap. (I expect to be talking to Emily Bell, Jonathan Stray and the other instructors and students.)

Media organizations must be able to put data to work, a need that was amply demonstrated during Hurricane Sandy, when public open government data feeds became critical infrastructure.

What I’d like to hear from you is what you see working around the world, from the Guardian to ProPublica, and what you’re working on, and where. To kick things off, I’d like to know which organizations are doing the most innovative work in data journalism.

Please weigh in through the comments or drop me a line at alex@oreilly.com or at @digiphile on Twitter.

October 11 2012

Four short links: 11 October 2012

  1. ABalytics — dead simple A/B testing with Google Analytics. (via Dan Mazzini)
  2. Fastest Rubik Cube Solver is Made of Lego — it takes less than six seconds to solve the cube. Watch the video, it’s … wow. Also cool is watching it fail. (via Hacker News)
  3. Fairfax Watches BitTorrent (TorrentFreak) — At a government broadband conference in Sydney, Fairfax’s head of video Ricky Sutton admitted that in a country with one of the highest percentages of BitTorrent users worldwide, his company determines what shows to buy based on the popularity of pirated videos online.
  4. Web Performance Tools (Steve Souders) — compilation of popular web performance tools. Reminds me of nmap’s list of top security tools.
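For a taste of the kind of measurement those tools automate, the browser's Navigation Timing API exposes the raw numbers directly. Here is a small TypeScript sketch meant to run in a page (strip the type annotations if you paste it into a devtools console); the breakdown is illustrative, not a substitute for a real audit:

```typescript
// Log a few basic page-load metrics from the Navigation Timing API.
// Timestamps are in milliseconds, relative to the start of navigation.

function reportPageTiming(): void {
  const entries = performance.getEntriesByType("navigation") as PerformanceNavigationTiming[];
  const nav = entries[0];
  if (!nav) {
    console.log("No navigation timing entry available.");
    return;
  }
  console.log({
    ttfbMs: Math.round(nav.responseStart - nav.requestStart), // time to first byte
    domContentLoadedMs: Math.round(nav.domContentLoadedEventEnd),
    fullLoadMs: Math.round(nav.loadEventEnd),
    transferBytes: nav.transferSize,
  });
}

// Wait until the load event has finished so loadEventEnd is populated.
if (document.readyState === "complete") {
  setTimeout(reportPageTiming, 0);
} else {
  window.addEventListener("load", () => setTimeout(reportPageTiming, 0));
}
```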

October 03 2012

The missing ingredient from hyperwired debates: the feedback loop

What a difference a season makes. A few months after widespread online frustration with a tape-delayed Summer Olympics, the 2012 Presidential debates will feature the most online livestreams and wired, up-to-the-second digital coverage in history.

Given the pace of technological change, it’s inevitable that each election season will bring with it new “firsts,” as candidates and campaigns set precedents by trying new approaches and platforms. This election has been no different: the Romney and Obama campaigns have been experimenting with mobile applications, social media, live online video and big data all year.

Tonight, one of the biggest moments in the presidential campaign to date is upon us and there are several new digital precedents to acknowledge.

The biggest tech news is that YouTube, in a partnership with ABC, will stream the debates online for the first time. The stream will be on YouTube’s politics channel, and it will be embeddable.

With more and more livestreamed sports events, concerts and now debates available online, tuning in to what’s happening no longer means passively “watching TV.” The number of other ways people can tune in online in 2012 has skyrocketed, as you can see in GigaOm’s post listing debate livestreams or Mashable’s ways to watch the debates online.

This year, in fact, the biggest challenge people will have will not be finding an online alternative to broadcast or cable news but deciding which one to watch.

If you’re low on bandwidth or have a mobile device, NPR will stream the audio from the debate online and to its mobile apps. If you’re a Spanish speaker, Univision will stream the debates on YouTube with real-time translation.

The New York Times, Politico and the Wall Street Journal are all livestreaming the debates at their websites or through their apps, further eroding the line between broadcast, print and online media.

While the PBS News Hour and CSPAN’s debate hub are good options, my preference is for the Sunlight Foundation’s award-winning Sunlight Live liveblog.

There are a couple of other notable firsts. The Huffington Post will deploy its HuffPost Live platform for the first time, pulling more viewers directly into participatory coverage online.

For those looking for a more… animated approach, the Guardian and Tumblr will ‘live GIF’ the presidential debates.

Microsoft is livestreaming the debates through the Xbox, giving gamers an opportunity to weigh in on what they see. They’ll be polled through the Xbox console during the debate, which will provide more real-time data from a youthful demographic that, according to StrategyOne, still includes many voters who are not firmly committed.

Social politics

The political news cycle has long since moved from the morning papers and the nightly news to real-time coverage of events. In past years, the post-debate spin by campaigns and pundits shaped public opinion. This year, direct access to online video and to the reaction of friends, family, colleagues and media through the social web means that the spin will begin as soon as any quip, policy position or rebuttal is delivered in the debate.

Beyond real-time commentary, social media will provide useful data for the campaigns to analyze. While there won’t be a “do over,” seeing what resonated directly with the public will help the campaigns tune their messages for the next debates.

Tonight, when I go on Al Jazeera’s special debate night coverage at The Stream, I’ll be looking at a number of factors. I expect the #DenverDebate and #debates hashtags to be moving too fast to follow, so I’ll be looking at which tweets are being amplified and what we can see on Twitter’s new #debates page, what images are popping online, which links are popular, how Facebook and Google+ are reacting, and what people are searching for on Google.com.

This is quite likely to be the most social political event ever, surpassing either of the 2012 political conventions or the State of the Union address. When I watch online, I’ll be looking for what resonated with the public, not just what the campaigns are saying — although that will factor into my analysis. The @mittromney account tweets 1-2 times a day. Will they tweet more? Will @barackobama’s 19 million followers be engaged? How much and how often will they update Facebook, and to what effect?

Will they live-tweet opening statements with links to policies? Will they link to rebuttals or fact checks in the media? Will they push people to go register or comment or share? Will they echo applause lines or attack lines? In a larger sense, will the campaigns act social, themselves? Will they reshare the people’s posts about them on social platforms or keep broadcasting?

We’ll know answers to all of these questions in a few hours.

Fact-checking in real-time

Continuing a trend from the primary season, real-time fact-checking will play a role in the debate. The difference this time will be the pace and the number of players.

As Nick Judd highlighted at techPresident, the campaign response is going to be all about mobile. Both campaigns will be trying their hands at fact checking, using new adaptive microsites at barackobama.com/debate and debates.mittromney.com, dedicated Twitter accounts at @TruthTeam2012 and @RomneyResponse, and an associated subdomain and Tumblr.

Given the skin that campaigns have in the game, however, undecided or wavering voters are better off going with the Fourth Estate versions. Wired media organizations, like the newspapers streaming the debates I’ve listed above, will be using liveblogs and leveraging their digital readership to help fact check.

Notably, NPR senior social strategist Andy Carvin will be applying the same approach to fact checking during the debate as he has to covering the changes in the Middle East. To participate, follow @acarvin and use the #factcheck hashtag beginning at 8:30 ET.

It’s unclear whether debate moderator Jim Lehrer will tap into the fact-checking efforts online to push back on the candidates during the event. Then again, the wisdom of the crowds may be balanced by one man’s perspective. Given that he’s serving in that capacity for the 12th time, Lehrer possesses substantial experience of his own to draw upon in making his own decisions about when to press, challenge or revisit issues.

The rise of networked polities

In a larger sense, all of this interactivity falls far short of the promise of networked politics. In the age of the Internet, television debates look antiquated.

When it comes to how much the people are directly involved with the presidential debates of 2012, as Micah Sifry argued earlier this week, little has changed from 2008:

“Google is going to offer some kind of interactive audience dial gadget for YouTube users, which could allow for real-time audience feedback — except it’s already clear none of that feedback is going to get anywhere near the actual debate itself. As best as I can tell, what the CPD [Commission on Presidential Debates] is doing is little more than what they did four years ago, except back then they partnered with Myspace on a site called MyDebates.org that featured video streaming, on-demand playback and archival material. Oh, but this time the partner sites will include a dynamic counter showing how many people have ‘shared their voice’.”

While everyone who has access to the Internet will be able to use multiple screens to watch, read and participate in the conversation around the debates, the public isn’t going to be directly involved in the debate. That’s a missed opportunity that won’t be revisited until the 2016 campaign.

By then, it will be an even more wired political landscape. While many politicians still delegate the direct use of social media to staffers, in late 2012 it ill behooves anyone in office to be seen as technically backward by staying off these platforms entirely.

In the years ahead, open government advocates will push politicians to use the Internet to explain their votes, not just broadcast political attacks or campaign events. After all, the United States is a constitutional republic. Executives and Congressmen are obligated to listen to the people they represent. The existing ecosystem of social media platforms may give politicians new tools to interact directly with their constituents, but those tools are still relatively crude.

Yes, the next generation of social media data analytics will give politicians a dashboard of what their constituents think about their positions. It’s the next generation of polling. In the years to come, however, I’m optimistic that we’re going to see much better use of the Internet to hold politicians accountable for their campaign positions and subsequent votes.

Early experiments in creating an “OKCupid for elections” will evolve. Expect sophisticated choice engines that use social and legislative data to tell voters not only whether candidates share their positions but whether they actually voted or acted upon them. Over time, opposition candidates will be able to use that accumulated data in their campaign platforms and during debates. If a member of Congress or President doesn’t follow through with the wishes of the people, he or she will have to explain why. That will be a debate worth having.

