December 21 2012

Six ways data journalism is making sense of the world, around the world

When I wrote that Radar was investigating data journalism and asked for your favorite examples of good work, we heard back from around the world.

I received emails from Los Angeles, Philadelphia, Canada and Italy that featured data visualization, explored the role of data in government accountability, and shared how open data can revolutionize environmental reporting. A tweet pointed me to a talk about how R is being used in the newsroom. Another tweet linked to relevant interviews on social science and the media.

Two of the case studies focused on data visualization, an important practice that my colleague Julie Steele and other editors at O’Reilly Media have been exploring over the past several years.

Several other responses are featured at greater length below. After you read through, make sure to also check out this terrific Ignite talk on data journalism recorded at this year’s Newsfoo in Arizona.

Visualizing civic health

Meredith Broussard, a professor at the University of Pennsylvania, sent us a link to a recent data journalism project she did for Hidden City Philadelphia. The project, which measures Philadelphia’s civic health, won an honorable mention in the civic data challenge run by the National Conference on Citizenship and the Knight Foundation. Data visualization was a strong theme among the winners of that challenge.

[Image: Data journalism in Philadelphia]

Mapping ambulance response times

I profiled the data journalism work of The Los Angeles Times earlier this year, when I interviewed news developer Ben Welsh about the newspaper’s Data Desk, a team of reporters and web developers that specializes in maps, databases, analysis and visualization.

Recently, the Data Desk made an interactive visualization that mapped how fast the Los Angeles Fire Department responds to calls.

[Image: LA Times fire response times]

Visualizing UK government spending

The Guardian Datablog is one of the best sources of interesting, relevant data journalism work, from sports to popular culture to government accountability. Every post demonstrates an emerging practice: its editors make it possible for readers to download the data themselves. Earlier this month, the Datablog put government spending in the United Kingdom under the microscope and accompanied it with a downloadable graphic (PDF).

The Guardian’s data journalism is particularly important as the British government continues to invest in open data. In June, the United Kingdom’s Cabinet Office relaunched and released a new open data white paper. The British government is doubling down on the notion that open data can be a catalyst for increased government transparency, civic utility and economic prosperity. The role of data journalism in delivering those outcomes is central.

(Note: A separate Radar project is digging into the open data economy.)

An Italian data job

The Italian government, while a bit behind the pace set in the UK, has made more open data available since it launched a national platform in 2011.

Elisabetta Tola, an Italian data journalist, wrote in to share her work on a series of Wired Magazine articles that feature data on seismic risk assessment in Italian schools. The interactive lets parents search for schools, a feature that embodies service journalism and offers more value than a static map.

[Image: Italian schools and earthquakes visualization]

Tola highlighted a key challenge in Italy that exists in many other places around the world: How can data journalism be practiced in countries that do not have a Freedom of Information Act or a tradition of transparency on government actions and spending? If you have ideas, please share them in the comments or email me.

Putting satellite imagery to work

Brazil, by way of contrast, notably passed a freedom of information law this past year, fulfilling one of its commitments to the Open Government Partnership.

Earlier this year, when I traveled to Brazil to moderate a panel at the historic partnership’s annual meeting, I met Gustavo Faleiros, a journalist working with open data focusing on the Amazon rainforest. Faleiros is a Knight International Journalism Fellow, in partnership with Washington-based organizations International Center for Journalists and Internews. Today, Faleiros continues that work as the project coordinator for, a beautiful mashup of open data, maps and storytelling.

Faleiros explained that the partnership is training Brazilian journalists to use satellite imagery and collect data related to forest fires and carbon monoxide. He shared a video that shows a data visualization that came out of that work.

As 2012 comes to an end, the rate of Amazon deforestation has dropped to record lows. These tools help the world see what’s happening from high above.

Data-driven broadcast journalism?

I also heard about work in much colder climes when Keith Robinson wrote in from Canada. “As part of large broadcast organizations, one thing that is very satisfying about data journalism is that it often puts our digital staff in the driver’s seat — what starts as an online investigation often becomes the basis for original and exclusive broadcast content,” he wrote in an email.

Robinson, the senior producer for specials and interactive at Global News in Canada, highlighted several examples of their Data Desk’s work.

Robinson expects 2013 will see further investment and expansion in the data journalism practice at Global News.

Robinson also pointed to a practice that media should at least consider adopting: Global News is not only consuming and displaying open data, but also publishing the data they receive from the Canadian government. “As we make access to information requests, we’re trying to make the data received available to the public,” he wrote.

From the big picture to next steps

It was instructive to learn more about the work of two large media organizations, the Los Angeles Times and Canada’s Global News, which have been building their capacity to practice data journalism. The other international perspectives in my inbox and tweet stream, however, were a reminder that big-city newsrooms that can afford teams of programmers and designers aren’t the only players here.

To put it another way, acts of data journalism by small teams or individuals aren’t just plausible, they’re happening, from Italy to Brazil to Africa. That doesn’t mean that the news application teams at NPR, The Guardian, ProPublica or the New York Times aren’t setting the pace for data journalism when it comes to cutting-edge work (far from it), but the tools and techniques to make something worthwhile are being democratized.

That’s possible in no small part because of the trend toward open source tools and social coding I’m seeing online, from Open Street Map to more open elections.

It’s a privilege to have a global network to tap into for knowledge and, in the best moments, wisdom. Thank you — and please keep the responses coming, whether you use email, Twitter or the phone. Your input is helping shape a report I’m developing that ties together our coverage of data journalism. Look for that to be published early in the new year.



November 29 2012

As digital disruption comes to Africa, investing in data journalism takes on new importance

This interview is part of our ongoing look at the people, tools and techniques driving data journalism.

I first met Justin Arenstein (@justinarenstein) in Chişinău, Moldova, where the media entrepreneur and investigative journalist was working as a trainer at a “data boot camp” for journalism students. The long-haired, bearded South African instantly makes an impression with his intensity, good humor and focus on creating work that gives citizens actionable information.

Whenever we’ve spoken about open data and open government, Arenstein has been a fierce advocate for data-driven journalism that not only makes sense of the world for readers and viewers, but also provides them with tools to become more engaged in changing the conditions they learn about in the work.

He’s relentlessly focused on how open data can be made useful to ordinary citizens, from Africa to Eastern Europe to South America. For instance, in November, he highlighted how data journalism boosted voter registration in Kenya, creating a simple website using modern web-based tools and technologies.

For the last 18 months, Arenstein has been working as a Knight International Fellow embedded with the African Media Initiative (AMI) as a director for digital innovation. The AMI is a group of the 800 largest media companies on the continent of Africa. In that role, Arenstein has been creating an innovation program for the AMI, building more digital capacity in countries that are as in need of effective accountability from the Fourth Estate as any in the world. Digital disruption of the media business hasn’t yet played itself out in Africa because of a number of factors, explained Arenstein, but he estimates that it will be there within five years.

“Media wants to be ready for this,” he said, “to try and avoid as much of the business disintegration as possible. The program is designed to help them grapple with and potentially leapfrog coming digital disruption.”

In the following interview, Arenstein discusses the African media ecosystem, the role of Hacks/Hackers in Africa, and expanding the capacity of data journalism.

Why did you adopt the Hacks/Hackers model and scale it? Why is it relevant to what’s happening around Africa?

Justin Arenstein: African journalists are under-resourced but also poorly trained, probably even more so than in the U.S. and elsewhere. Very, very few of them have any digital skills, never mind coding skills. Simply waiting for journalists to make the leap themselves and start learning coding skills and more advanced digital multimedia content production skills is just too — well, we don’t have enough time to do that, if we’re going to beat this disruption that’s coming.

The idea was to clone parts of the basic model of Hacks/Hackers from the U.S., which is a voluntary forum and society where journalists, UI people, designers, graphics people and coders meet up on a regular basis.

Unlike in the U.S., where Hacks/Hackers is very focused on startup culture, the African chapters have been very focused on data-driven journalism and imparting some basic skills. We’re trying to avoid some of the pitfalls experienced in the U.S. and get down to using data as a key tool in creating content. A big weakness in a lot of African media is that there’s very little unique content, firstly, and that the unique content that is available is not particularly well produced. It’s not deep. It’s not substantiated. It’s definitely not linked data.

We’ve been focusing on improving the quality of the content so that the companies where these journalists work will be able to start weaning themselves from some of the bad business practices that they are guilty of and start concentrating on building up their own inventory. That’s worked really well in some of the African countries along the coastlines where there’s data access, because you’ve got cables coming in. In the hinterland of Africa, data and Internet are not widely available. The Hacks/Hackers chapters there have been more like basic computer-assisted reporting training organizations.

Like in the U.S., they all run themselves. But unlike in the U.S., we have a structured agenda, a set of protocols, an operating manual, and we do subsidize each of the chapters to help them meet the physical needs of cost. They’re not quite as voluntary as the U.S. ones; it’s a more formal structure. That’s because they’re designed to surface good ideas, to bring together a challenge that you wouldn’t ordinarily find in the media ecosystem at least, and then to help kick-start experimentation.

Do you see any kind of entrepreneurial activity coming out of them now?

Justin Arenstein: I’m not aware of any notable startups. We’ve had ideas where people are collaborating to build toward startups. I haven’t seen any products launched yet, but what we have seen is journalist-led startups that were outside of these Hacks/Hackers chapters now starting to come into the fold.

Why? Because this is where they can find some of the programming and engineering skills that they need, that they were struggling to find outside of the ecosystem. They are finding engineers or programmers, at least, but they’re not finding programmers who are tuned to content needs or to media philosophies and models. There’s a better chance that they’ll find those inside of these chapters.

The chapters are fairly young, though. The oldest chapter is about six months old now, and still fairly small. We’re nowhere near the size of some of the Latin American chapters. We have forged very strong links with them, and we follow their model a lot more closely than the U.S. model. The biggest chapter is probably about 150 members. They all meet, at a minimum, once a month. Interestingly, they are becoming the conduits not just for hackathons and “scrape-a-thons,” but are also now our local partners for implementing things like our data boot camps.

Those are week-long, intensive hands-on experiential training, where we’re flying in people from the Guardian data units, the Open Knowledge Foundation and from Google. We’re actually finding the guys behind Google Refine and Google Fusion Tables and flying in some of those people, so they can see end-users in a very different environment to what they’re used to. People walk into those boot camps not knowing what a spreadsheet is and, by the end of it, they’re producing their first elementary maps and visualizations. They’re crunching data.

What stories have “data boot camp” participants produced afterward?

Justin Arenstein: Here’s an example. We had a boot camp in Kenya. NTV, the national free-to-air station, had been looking into why young girls in a rural area of Kenya did very well academically until the ages of 11 or 12 — and then either dropped off the academic record completely or their academic performance plummeted. The explanation by the authorities and everyone else was that this was simply traditional; it’s tribal. Families are pulling them out of school to do chores and housework, and as a result, they can’t perform.

Irene Choge [a Kenyan boot camp participant who attended data journalism training] started mining the data. She came from that area and knew it wasn’t that [cause]. So she looked into public data. She first assumed it was cholera, so she looked into medical records. Nothing there. She then looked into water records. From water, she started looking into physical infrastructure and public works. She discovered these schools had no sanitation facilities and that the schools with the worst performing academics were those that didn’t have sanitation facilities, specifically toilets.

What’s the connection?

Justin Arenstein: When these girls start menstruating, there’s nowhere for them to go to attend to themselves, other than into the bushes around the school. They were getting harassed and embarrassed. They either stopped going to school completely or they would stop going during that part of their cycle and, as a result, their schoolwork suffered dramatically. She then produced a TV documentary that evoked widespread public outcry and changed policies.

In addition to that, her newsroom is working on building an app. A parent who watches this documentary and is outraged will then be able to use the app to find out what’s happening at their daughter’s school. If their daughter’s school is one of those that has no facilities, the app then helps them through a text-based service to sign a petition and petition the responsible official to improve the situation, as well as link up with other outraged parents. It mobilizes them.

What we liked about her example was that it was more than just doing a visualization, which is what people think about when you say “data journalism.”

First, she used data tools to find trends and stories that had been hidden to solve a mystery. Secondly, she then did real old-fashioned journalism and went out in the field and confirmed the data wasn’t lying. The data was accurate.

Thirdly, she then used the data to give people the tools to actually act on the information. She’s using open data and finding out in your district, this is your school, this is how you impact it, this is the official you should be emailing or writing to about it. That demonstrates that, even in a country where most people access information through feature phones, data can still have a massive impact at grassroots level.

These are the kinds of successes that we are looking for in these kinds of outreach programs when it comes to open data.

How does the practice of data-driven journalism or the importance of computer-assisted reporting shift when a reporter can’t use rich media or deploy bandwidth-heavy applications?

Justin Arenstein: We’re finding something that maybe you’re starting to see inklings of elsewhere as well: data journalism doesn’t have to be the product. Data journalism can also be the route that you follow to get to a final story. It doesn’t have to produce an infographic or a map.

Maps are very good ways to organize information. They’re very poor mechanisms for consuming information. No one kicks back on a Sunday afternoon laying on their sofa, reading a map, but if a map triggers geofenced information and pushes relevant local information at you in your vicinity, then it becomes a useful mechanism.

What we’re doing in newsrooms is around investigative journalism. For example, we’re funding projects around extractive industries. We’re mapping out conversations and relationships between people. We’re then using them as analytical tools in the newsroom to arrive at better, deeper and evidence-driven reporting, which is a major flaw and a major weakness in many African media.

What capacity needs to be built in these areas? What are people doing now? What matters most?

Justin Arenstein: Investigative journalism in Africa, like in many other places, tends to be scoop-driven, which means that someone’s leaked you a set of documents. You’ve gone and you’ve verified them and often done great sleuth work. There are very few systematic, analytical approaches to analyzing broader societal trends. You’re still getting a lot of hit-and-run reporting. That doesn’t help us analyze the societies we’re in, and it doesn’t help us, more importantly, build the tools to make decisions.

Some of the apps that we are helping people build, based off of their reporting, are invariably not visualizations. They’re rather saying, “Let’s build a tool that augments the reporting, reflects the deeper data that the report is based on, and allows people to use that tool to make a personal decision.” It’s engendering action.

A lot of the fantastic work you’ve seen from people at the Guardian and others has been about telling complex stories simply via infographics, which is a valid but very different application of data journalism.

I think that, specifically in East Africa and in Southern Africa, there’s growing recognition that the media are important stewards of historical data. In many of these societies, including industrialized societies like South Africa, the state hasn’t been a really good curator of public data and public information because of their political histories.

Nation states don’t see data as an asset? Is that because technical capacity isn’t there? Or is that because data actually contains evidence of criminality, corruption or graft?

Justin Arenstein: It’s often ineptitude and lack of resources in South Africa’s instance. In a couple of other countries, it’s systematic purging of information that is perhaps embarrassing when there’s a change of regime or political system — or in the case of South Africa and many of the colonial countries, a simple unwillingness or lack of insight as to the importance of collecting data about second-class citizens, largely the black population.

The official histories are very thin. There’s nowhere near the depth of nuance or insight into a society that you would find in the U.S. or in Europe, where there’s been very good archival record keeping. Often, the media are the only people who’ve really been keeping that kind of information, in terms of news reportage. It’s not brilliant. It’s often not primary sources — it’s secondary. But the point is that often it’s the only information that’s available.

What we’re doing is working with media companies now to help digitize and turn reportage into structured data. In a vacuum, because there is no other data, suddenly it becomes an important commercial commodity. Anyone who wants to build, for example, a tourism app or a transport app, will find that there is no other information available. This may sound like a bizarre concept to most people living in data-rich countries, like the U.S., but you simply can’t find the content. That means that you have to then go out and create the content yourself before you can build the app.

Is this a different sort of a “data divide,” where a country is “data-poor?”

Justin Arenstein: Well, maybe digitally “data poor,” because what we are doing is we’re saying that there is data. We initially also had the same reaction, saying “there is no data here,” and then realized that there’s a hell of a lot of data. Invariably, it’s locked up in deadwood format. So [we're now] liberating that data, digitizing it, structuring it, and then making sure that it’s available for people to use.

How much are media entities you work with making data, as opposed to just digitizing?

Justin Arenstein: Some are making data. We haven’t, because a lot of other actors are involved in citizen data creation. We haven’t really focused too many of our very scarce resources on that component yet.

We are funding a couple of citizen reporting apps, because there’s a lot of hype around citizen data and we’re trying to see if there are models that can really work where you create credible, sourced and actionable information. We don’t believe that you’re going to be able to do that just from text messaging. We’re looking at alternative kinds of interfaces and methods for transmitting information.

Are there companies and startups that are consuming the digital data that you’re producing? If so, what are they doing?

Justin Arenstein: Outside of the News Challenge, we are co-founding something with the World Bank called “Code for Kenya.” It’s modeled fairly closely on the Mozilla OpenNews Fellowships, with a few tweaks. It’s maybe a hybrid of Code for America and the Mozilla OpenNews Fellowships.

Where Code for America focuses on cities and Mozilla focuses on newsrooms, we’ve embedded open data strategists and evangelists into the newsrooms, backed up by an external development team at a civic tech lab. They’re structuring the data that’s available, such as turning old microfiche rolls into digital information, cleaning it up and building a data disk. They’re building news APIs and pushing the idea that rather than building websites, design an API specifically for third-party repurposing of your content. We’re starting to see the first early successes. Four months in, some of the larger media groups in Kenya are now starting to have third-party entrepreneurs developing using their content and then doing revenue-share deals.

The only investment from the data holder, which is the media company, is to actually clean up the data and then make it available for development. Now, that’s not a new concept. The Guardian in the United Kingdom has experimented with it. It’s fairly exciting for these African companies because there’s potentially — and arguably, larger — appetite for the content because there’s not as much content available. Suddenly, the unit cost of value of that data is far higher than it might be in the U.K. or in the U.S.

Media companies are seriously looking at it as one of many potential future revenue streams. It enables them to repurpose their own data, start producing books and the rest of it. There isn’t much book publishing in Africa, by Africans, for Africans. Suddenly, if the content is available in an accessible format, it gives them an opportunity to mash-up stuff and create new kinds of books.

They’ll start seeing that content itself can be a business model. The impact that we’re seeking there is to try and show media companies that investing in high-quality unique information actually gives you a long-term commodity that you can continue to reap benefits from over time. Whereas simply pulling stuff off the wire or, as many media do in Africa, simply lifting it off of the web, from the BBC or elsewhere, and crediting it, is not a good business model.

Photo via International Center for Journalists.


October 15 2012

New ethics for a new world

Since the first of our ancestors chipped stone into a weapon, technology has divided us. Seldom more than today, however: a connected, always-on society promises health, wisdom, and efficiency even as it threatens an end to privacy and the rise of prejudice masked as science.

On its surface, a data-driven society is more transparent, and makes better use of its resources. By connecting human knowledge, and mining it for insights, we can pinpoint problems before they become disasters, warding off disease and shining the harsh light of data on injustice and corruption. Data is making cities smarter, watering the grass roots, and improving the way we teach.

But for every accolade, there’s a cautionary tale. It’s easy to forget that data is merely a tool, and in the wrong hands, that tool can do powerful wrong. Data erodes our privacy. It predicts us, often with unerring accuracy — and treating those predictions as fact is a new, insidious form of prejudice. And it can collect the chaff of our digital lives, harvesting a picture of us we may not want others to know.

The big data movement isn’t just about knowing more things. It’s about a fundamental shift from scarcity to abundance. Most markets are defined by scarcity — the price of diamonds, or oil, or music. But when things become so cheap they’re nearly free, a funny thing happens.

Consider the advent of steam power. Economist Stanley Jevons, in what’s known as Jevons’ Paradox, observed that as the efficiency of steam engines increased, coal consumption went up. That’s not what was supposed to happen. Jevons realized that abundance creates new ways of using something. As steam became cheap, we found new ways of using it, which created demand.

The same thing is happening with data. A report that took a month to run is now just a few taps on a tablet. An unthinkably complex analysis of competitors is now a Google search. And the global distribution of multimedia content that once required a broadcast license is now an upload.

Big data is about reducing the cost of analyzing our world. The resulting abundance is triggering entirely new ways of using that data. Visualizations, interfaces, and ubiquitous data collection are increasingly important, because they feed the machine — and the machine is hungry.

The results are controversial. Journalists rely on global access to data, but also bring a new skepticism to their work, because facts are easy to manufacture. There’s good evidence that we’ve never been as polarized, politically, as we are today — and data may be to blame. You can find evidence to support any conspiracy, expose any gaffe, or refute any position you dislike, but separating truth from mere data is a growing problem.

Perhaps the biggest threat that a data-driven world presents is an ethical one. Our social safety net is woven on uncertainty. We have welfare, insurance, and other institutions precisely because we can’t tell what’s going to happen — so we amortize that risk across shared resources. The better we are at predicting the future, the less we’ll be willing to share our fates with others. And the more those predictions look like facts, the more justice looks like thoughtcrime.

The human race underwent a huge shift when we banded together into tribes, forming culture and morals to tie us to one another. As groups, we achieved great heights, building nations, conquering challenges, and exploring the unknown. If you were one of those tribesmen, it’s unlikely you knew what was happening — it’s only in hindsight that the shift from individual to group was radical.

We’re in the middle of another, perhaps bigger, shift, one that’s taking us from physical beings to digital/physical hybrids. We’re colonizing an online world, and just as our ancestors had to create new social covenants and moral guidelines to work as groups, so we have to craft new ethics, rights and laws.

Those fighting for social change have their work cut out for them, because they’re not just trying to find justice — they’re helping to rewrite the ethical and moral guidelines for a nascent, always-on, data-driven species.

