The bond between data and journalism grows stronger

While reporters and editors have been the traditional vectors for information gathering and dissemination, the flattened information environment of 2012 now has news breaking first online, not on the newsdesk.

That doesn't mean that the integrated media organizations of today don't play a crucial role. Far from it. In the information age, journalists are needed more than ever to curate, verify, analyze and synthesize the wash of data.

To learn more about the shifting world of data journalism, I interviewed Liliana Bounegru (@bb_liliana), project coordinator of SYNC3 and Data Driven Journalism at the European Journalism Centre.

What's the difference between the data journalism of today and the computer-assisted reporting (CAR) of the past?

Liliana Bounegru: There is a "continuity and change" debate going on around the label "data journalism" and its relationship with previous journalistic practices that employ computational techniques to analyze datasets.

Some argue that there is a difference between CAR and data journalism. They say that CAR is a technique for gathering and analyzing data as a way of enhancing (usually investigative) reportage, whereas data journalism pays attention to the way that data sits within the whole journalistic workflow. In this sense, data journalism pays equal attention to finding stories and to the data itself. Hence, we find the Guardian Datablog or the Texas Tribune publishing datasets alongside stories, or even just datasets by themselves for people to analyze and explore.

Another difference is that in the past, investigative reporters would suffer from a poverty of information relating to a question they were trying to answer or an issue that they were trying to address. While this is, of course, still the case, there is also an overwhelming abundance of information that journalists don't necessarily know what to do with. They don't know how to get value out of data. As Philip Meyer recently wrote to me: "When information was scarce, most of our efforts were devoted to hunting and gathering. Now that information is abundant, processing is more important."

On the other hand, some argue that there is no difference between data journalism and computer-assisted reporting. It is by now common sense that even the most recent media practices have histories as well as something new in them. Rather than debating whether or not data journalism is completely novel, a more fruitful position would be to consider it as part of a longer tradition but responding to new circumstances and conditions. Even if there might not be a difference in goals and techniques, the emergence of the label "data journalism" at the beginning of the century indicates a new phase wherein the sheer volume of data that is freely available online combined with sophisticated user-centric tools enables more people to work with more data more easily than ever before. Data journalism is about mass data literacy.

What does data journalism mean for the future of journalism? Are there new business models here?

Liliana Bounegru: There are all kinds of interesting new business models emerging with data journalism. Media companies are becoming increasingly innovative in the way they generate revenue, moving away from subscription-based models and advertising toward offering consultancy services, as in the case of Germany's award-winning OpenDataCity.

Digital technologies and the web are fundamentally changing the way we do journalism. Data journalism is one part of the ecosystem of tools and practices that have sprung up around data sites and services. Quoting and sharing source materials (structured data) is in the nature of the hyperlink structure of the web and of the way we are accustomed to navigating information today. By enabling anyone to drill down into data sources and find information that is relevant to them as individuals or to their community, as well as to do fact checking, data journalism provides a much-needed service coming from a trustworthy source. Quoting and linking to data sources is specific to data journalism at the moment, but seamless integration of data into the fabric of media is increasingly the direction in which journalism is heading. As Tim Berners-Lee says, "data-driven journalism is the future."

What data-driven journalism initiatives have caught your attention?

Liliana Bounegru: The data journalism project FarmSubsidy.org is one of my favorites. It addresses a real problem: The European Union (EU) is spending 48% of its budget on agriculture subsidies, yet the money doesn't reach those who need it.

Tracking payments and recipients of agriculture subsidies from the European Union to all member states is a difficult task. The data is scattered across different places and formats, with some of it missing and some scanned in from paper records. It is hard to piece it together to form a comprehensive picture of how funds are distributed. The project not only made the data available to anyone in an easy-to-understand way, but it also advocated for policy changes and better transparency laws.

Another of my favorite examples is the LRA Crisis Tracker, a real-time crisis mapping platform and data collection system. The tracker makes information about the attacks and movements of the Lord's Resistance Army (LRA) in Africa publicly available. It helps to inform local communities, as well as the organizations that support the affected communities, about the activities of the LRA through an early-warning radio network in order to reduce their response time to incidents.

I am also a big fan of much of the work done by the Guardian Datablog. You can find lots of other examples featured on datadrivenjournalism.net, along with interviews, case studies and tutorials.

I've talked to people like Chicago Tribune news app developer Brian Boyer about the emerging "newsroom stack." What do you feel are the key tools of the data journalist?

Liliana Bounegru: Experienced data journalists list spreadsheets as a top data journalism tool. Open source tools and web-based applications for data cleaning, analysis and visualization play very important roles in finding and presenting data stories. I have been involved in organizing several workshops on ScraperWiki and Google Refine for data collection and analysis. We found that participants were quickly able to ask and answer new kinds of questions with these tools.

How does data journalism relate to open data and open government?

Liliana Bounegru: Open government data means that more people can access and reuse official information published by government bodies. This in itself is not enough. It is increasingly important that journalists can keep up and are equipped with skills and resources to understand open government data. Journalists need to know what official data means, what it says and what it leaves out. They need to know what kind of picture is being presented of an issue.

Public bodies are very experienced in presenting data to the public in support of official policies and practices. Journalists, however, will often not have this level of literacy. Only by equipping journalists with the skills to use data more effectively can we break the current asymmetry, where our understanding of the information that matters is mediated by governments, companies and other experts. In a nutshell, open data advocates push for more data, and data journalists help the public to use, explore and evaluate it.

This interview has been edited and condensed for clarity.

Photo on associated home and category pages: NYTimes: 365/360 - 1984 (in color) by blprnt_van, on Flickr.
