March 22 2013

Sensoring the news

When I went to the 2013 SXSW Interactive Festival to host a conversation with NPR’s Javaun Moradi about sensors, society and the media, I thought we would be talking about the future of data journalism. By the time I left the event, I’d learned that sensor journalism had long since arrived and been applied. Today, inexpensive, easy-to-use open source hardware is making it easier for media outlets to create data themselves.

“Interest in sensor data has grown dramatically over the last year,” said Moradi. “Groups are experimenting in the areas of environmental monitoring, journalism, human rights activism, and civic accountability.” His post on what sensor networks mean for journalism sparked our collaboration after we connected in December 2011 about how data was being used in the media.

Associated Press visualization of Beijing air quality. See related feature.

At a SXSW panel on “sensoring the news,” Sarah Williams, an assistant professor at MIT, described how the Civic Data Design Project had partnered with the Associated Press to independently measure air quality in Beijing.

Prior to the 2008 Olympics, the coaches of the Olympic teams had expressed serious concern about the impact of air pollution on the athletes. That, in turn, put pressure on the Chinese government to take substantive steps to improve those conditions. While the Chinese government released an index of air quality, explained Williams, they didn’t explain what went into it, nor did they provide the raw data.

The Beijing Air Tracks project arose from the need to determine what the conditions on the ground really were. AP reporters carried sensors connected to their cellphones to detect particulate and carbon monoxide levels, enabling them to report air quality conditions back in real-time as they moved around the Olympic venues and city.
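
To make the idea concrete, here is a minimal sketch of what a phone-tethered sensor pipeline like the AP's might look like: read a particulate value from a serial-attached sensor and post it, geotagged, to a collection endpoint. The port, data format and URL are illustrative assumptions, not the AP's actual setup.

```python
import time

import requests  # pip install requests
import serial    # pip install pyserial

ENDPOINT = "https://example.org/api/readings"  # hypothetical newsroom API

def read_particulates(port="/dev/ttyUSB0"):
    """Read one particulate value (ug/m^3) from a sensor that prints readings over serial."""
    with serial.Serial(port, 9600, timeout=5) as conn:
        line = conn.readline().decode("ascii", errors="ignore").strip()
        return float(line)

def report(lat, lon):
    """Take one reading and post it with a timestamp and location."""
    reading = {
        "pm_ugm3": read_particulates(),
        "lat": lat,
        "lon": lon,
        "timestamp": time.time(),
    }
    requests.post(ENDPOINT, json=reading, timeout=10)

if __name__ == "__main__":
    report(39.9929, 116.3967)  # illustrative coordinates near the Olympic Green
```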

The sensor data helped the AP measure the effect of policy decisions that the Chinese government made, said Williams, from closing down factories to widespread shutdowns of different kinds of industries. The results from the sensor journalism project, which showed a decrease in particulates but conditions 12 to 25 times worse than New York City on certain days, were published as an interactive data visualization.

Associated Press mashup of particulate levels and photography at the Olympic stadium in Beijing over time.

This AP project is a prime example of how sensors, data journalism, and old-fashioned, on-the-ground reporting can be combined to shine a new level of accountability on official reports. It won’t be the last time this happens, either. Around the world, from the Amazon to Los Angeles to Japan, sensor data is now being put to use by civic media and journalists.

Sensing civic media

There are an increasing number of sensors in our lives, said John Keefe, a data news editor for WNYC, speaking at his SXSW panel in Austin. From the physical sensors in smartphones to new possibilities built with Arduino or Raspberry Pi hardware, Keefe highlighted how journalists could seize hold of new possibilities.

“Google takes data from maps and Android phones and creates traffic data,” Keefe said. “In a sense, that’s sensor data being used live in a public service. What are we doing in journalism like that? What could we do?”

The evolution of Safecast offers a glimpse of networked accountability, collecting and publishing radiation data through sensors, citizen science and the Internet. The project, which won last year’s Knight News Challenge on data, is now building the infrastructure to enable people to help monitor air quality in Los Angeles.

Sensor journalism is also being applied to make sense of the world using remote sensing data and satellite imagery. Gustavo Faleiros, who directs InfoAmazonia, a platform that maps environmental data in the Amazon, recently described how environmental reporting can be combined with civic media to collect data, with relevant projects in Asia, Africa and the Americas. For instance, Faleiros cited an environmental monitoring project led by Eric Paulos of the University of California at Berkeley’s Center for New Media, where sensors on taxis were used to gather data in Accra, Ghana.

Another area where sensor data could be applied is social justice and education. At SXSW, Sarah Williams described [slides] how the Air Quality Egg, an open source hardware device, is being used to make an argument for public improvements. At the Cypress Hills Community School, kids are bringing the eggs home, measuring air quality and putting data online, said Williams.

Air Quality Eggs at Cypress Hills Community School.

“Health sensors are useful when they can compare personal real-time data against population-wide data,” said Nadav Aharony, who also spoke on our panel in Austin.

Aharony talked about how Behavio, a startup based upon his research on smartphones and data at MIT, has created funf, an open source sensing toolkit for Android devices. Aharony’s team has now deployed an integration with Dropbox that requires no coding ability to use.

According to Aharony, the One Laptop Per Child project is using funf in tablets deployed in Africa, in areas where there are no schools. Researchers will use funf as a behavioral tool to sense how children are interacting with the devices, including whether tablets are next to one another.

Sensing citizen science

While challenges lie ahead, it’s clear that sensors will be used to create data where there was none before. At SXSW, Williams described a project in Nairobi, Kenya, where cellphones are being used to map informal bus systems.

The Digital Matatus project is publishing the data in the General Transit Feed Specification (GTFS), one of the most promising emerging global standards for transit data. “Hopefully, a year from now [we] will have all the bus routes from Nairobi,” Williams said.

Map of matatu stops in Nairobi, Kenya.
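
Publishing into GTFS is mostly a matter of emitting a handful of comma-separated files with required columns. As a minimal sketch, here is how surveyed stops could be written out as a GTFS stops.txt; the stop names and coordinates below are illustrative, not Digital Matatus data.

```python
import csv

# (stop_id, stop_name, stop_lat, stop_lon) -- the four required columns
surveyed_stops = [
    ("KE001", "Kencom", -1.2864, 36.8259),
    ("KE002", "Globe Roundabout", -1.2776, 36.8205),
]

with open("stops.txt", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["stop_id", "stop_name", "stop_lat", "stop_lon"])
    writer.writerows(surveyed_stops)
```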

Data journalism has long depended upon official data released by agencies. In recent years, data journalists have begun scraping data. Sensors allow another step in that evolution to take place, where civic media can create data to inform the public interest.

Matt Waite, a professor of practice and head of the Drone Journalism Lab at the University of Nebraska-Lincoln, joined the panel in Austin using a Google Hangout and shared how he and his students are experimenting with sensors to gather data for projects.

Journalists are going to run up against stories where no one has data, he said. “The old way was to give up,” said Waite. “I don’t think that’s the way to do it.”

Sensors give journalists a new, interesting way to enlist a distributed audience in gathering needed data, he explained. “Is it ‘capital N’ news? Probably not,” said Waite. “But it’s something people are really interested in. The easy part is getting a parts list together and writing software. The hard part is the creative process it takes to figure out what we are going to measure and what it means.”

In an interview with the Nieman Journalism Lab on sensor journalism, Waite also raised practical concerns with the quality of data collection that can be gathered with inexpensive hardware. “One legitimate concern about doing this is, you’re talking about doing it with the cheapest software you can find,” Waite told the Nieman Lab’s Caroline O’Donovan. “It’s not expertly calibrated. It’s not as sensitive as it possibly could be.”

Those are questions that will be explored practically in New York in the months ahead, when New York City’s public radio station will be collaborating with the Columbia School of Public Health to collect data about New York’s environmental conditions. They’ll put particulate detectors, carbon dioxide monitors, leg motion sensors, audio monitors, cameras and GPS trackers on bicycles and ride around the city collecting pollution data.
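
A rig like that implies a post-processing step: matching each pollution reading to the nearest GPS fix by timestamp so the data can be mapped. Here is a sketch of that join; the data structures are assumptions, not the WNYC/Columbia pipeline.

```python
from bisect import bisect_left

def nearest_fix(gps_track, t):
    """gps_track: list of (timestamp, lat, lon) tuples sorted by timestamp."""
    times = [fix[0] for fix in gps_track]
    i = bisect_left(times, t)
    candidates = gps_track[max(i - 1, 0):i + 1]  # neighbors on either side of t
    return min(candidates, key=lambda fix: abs(fix[0] - t))

def geotag(readings, gps_track):
    """readings: list of (timestamp, pm_ugm3) tuples. Yields geotagged dicts."""
    for t, pm in readings:
        ts, lat, lon = nearest_fix(gps_track, t)
        yield {"timestamp": t, "pm_ugm3": pm, "lat": lat, "lon": lon}
```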

“At WNYC, we already do crowdsourcing, where we ask our audience to do something,” said Keefe. “What if we could get our audience to do something with this? What if you could get an audience to work with you to solve a problem?”

Keefe also announced the Cicada Project, where WNYC is inviting its listeners to build homemade sensors and track the emergence of cicadas this spring across New Jersey, New York and the Northeast region.

This cicada tracker project is a 21st century parallel to the role that birders have played for decades in the annual Christmas Bird Count, creating new horizons for citizen science and public media.

Update: WNYC’s public is responding in interesting ways that go beyond donations. On Twitter, Keefe highlighted the work of a NYC-based hacker, Guan, who was able to make a cicada tracker for $20, a quarter of the cost of WNYC’s kit.
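
The tracker itself is simple enough to sketch: WNYC’s kit measured soil temperature, since periodical cicadas emerge when the ground reaches roughly 64°F. Below is one way to convert a thermistor reading into that yes/no signal; the divider wiring and beta constant are assumptions about a common 10K thermistor, not WNYC’s exact parts list.

```python
import math

B = 3950.0          # thermistor beta coefficient (assumed part)
R0 = 10_000.0       # thermistor resistance at 25 C
T0 = 298.15         # 25 C in kelvin
R_FIXED = 10_000.0  # fixed resistor in the voltage divider

def thermistor_temp_f(adc_value, adc_max=1023):
    """Convert an ADC reading from a thermistor voltage divider to Fahrenheit."""
    ratio = adc_value / adc_max
    resistance = R_FIXED * ratio / (1.0 - ratio)
    kelvin = 1.0 / (1.0 / T0 + math.log(resistance / R0) / B)
    return (kelvin - 273.15) * 9.0 / 5.0 + 32.0

soil_f = thermistor_temp_f(512)
print(f"soil: {soil_f:.1f}F, cicada emergence likely: {soil_f >= 64.0}")
```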

Sensing challenges ahead

Just as civic technologists need to be mindful of “solutionism,” so too will data journalists need to be aware of the “sensorism” that exists in the health care world, as John Wilbanks pointed out this winter.

“Sensorism is rife in the sciences,” Wilbanks wrote. “Pick a data generation task that used to be human centric and odds are someone is trying to automate and parallelize it (often via solutionism, oddly — there’s an app to generate that data). What’s missing is the epistemic transformation that makes the data emerging from sensors actually useful to make a scientific conclusion — or a policy decision supposedly based on a scientific consensus.”

Anyone looking to practice sensor journalism will face interesting challenges, from incorrect conclusions based upon faulty data to increased risks to journalists carrying the sensors, to gaming or misreporting.

“Data accuracy is both a real and a perceived problem,” said Moradi at SXSW. “Third-party verification by journalists or other non-aligned groups may be needed.”

Much as in the cases of “drone journalism” and data journalism, context, usage and ethics have to be considered before you launch a quadcopter, fire up a scraper or embed sensors around your city. The question you come back to is whether you’re facing a new ethical problem or an old ethical problem with new technology, suggested Waite at SXSW. “The truth is, for most ethical issues you can find an old analogue.”

It may be, however, that sensor data, applied to taking a “social MRI” or other uses, presents us with novel challenges. For instance, who owns the data? Who can access or use it? Under what conditions?

A GPS device is a form of sensor, after all, and one that’s quite useful to law enforcement. While the Supreme Court ruled in United States v. Jones that using a GPS device to track a person without a warrant was unconstitutional, sensor data from cellphones may provide law enforcement with equal or greater insight into a target’s movements. Journalists may well face unexpected questions about protecting sources if their sensor data captures the movements or actions of a person of interest.

“There’s a lot of concern around privacy,” said Moradi. “What data can the government request? Will private companies abuse personal data for marketing or sales? Do citizens have the right to personal data held by companies and government?”

Aharony outlined many of the issues in a 2011 paper on stealing reality, exploring what happens when criminals become data scientists.

“It’s like a slow-moving attack if you attach yourself to someone’s communication,” said Aharony, in a follow-up interview in Austin. “‘iPhonegate‘ didn’t surprise people who know about mobile app data or how the cellular network is architected. Look at what happened to Path. You can make mistakes without meaning to. You have to think about this and encrypt the data.”
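
Aharony’s “encrypt the data” advice is straightforward to follow for sensor logs at rest. As a minimal sketch, the Python cryptography package’s Fernet recipe handles the encryption itself; key management, the genuinely hard part, is elided here.

```python
from cryptography.fernet import Fernet  # pip install cryptography

key = Fernet.generate_key()  # store this somewhere safer than the device itself
fernet = Fernet(key)

record = b'{"pm_ugm3": 38.2, "lat": 40.7, "lon": -74.0}'
token = fernet.encrypt(record)           # ciphertext, safe to write to disk or upload
assert fernet.decrypt(token) == record   # only key holders can recover the reading
```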

This post is part of our series investigating data journalism.

October 22 2012

What I learned about #debates, social media and being a pundit on Al Jazeera English

The Stream - Al Jazeera English

Earlier this month, when I was asked by Al Jazeera English if I’d like to go on live television to analyze the online side of the presidential debates, I didn’t immediately accept. I’d be facing a live international audience at a moment of intense political interest, without a great wealth of on-air training. That said, I felt honored to be asked by Al Jazeera. I’ve been following the network’s steady evolution, from its beginnings in the mid-1990s to its current position as one of the best sources of live coverage and hard news from the Middle East. When Tahrir Square was at the height of its foment during the Arab Spring, Al Jazeera was livestreaming it online to the rest of the world.

I’ve been showing a slide in a presentation for months now that features Al Jazeera’s “The Stream” as a notable combination of social media, online video and broadcast journalism since its inception.

So, by and large, the choice was clear: say “yes,” and then figure out how to do a good job.

As is ever the case with new assignments, what would follow from that choice wasn’t as easy as it might have seemed. Some of the nuts and bolts of appearing were quite straightforward: Do a long pre-interview with the producer about my work and my perspective on how the Internet and social media were changing the dynamics of a live political event like the debate. (I captured much of that thinking here at Radar, in a post on digital feedback loops and the debate.) Go through makeup each time. Get wired up with a mic and an earpiece that connected me to the control room. Review each show’s outline, script and online engagement channels, from Twitter to YouTube to Google+ to Reddit.

I was also afforded a few luxuries that bordered on the surreal: a driver who picked me up and took me home from the studio. Bottled spring water. A modest honorarium to hang out in a television studio for a couple of hours and talk for a few intense minutes about what moments from the debates resonated online and why. The realization that my perspective could be seen by millions in Al Jazeera English’s international audience. People would be watching. I’d need to deliver something worth their time.

Entering The Stream

Live television doesn’t give anyone much room for error. On this particular show, The Stream, there was no room for a deep dive into analysis. We had time to answer a couple of questions about what happened on social media during the debates. Some spots were 30 seconds. Adding context under those constraints is a huge challenge. How much do you assume the people viewing know? What moments do you highlight? For this debate show, I had to assume that viewers watched the two candidates spar — but were they following the firehose of commentary on Twitter? Even if they did, given how personalized social media has become, it was inevitable that what viewers saw online would be different from what we saw in the studio.

When we saw the campaigns focus on Twitter during the debates, I saw that as news, and said as much. While the campaigns were also on Facebook, Google+, Tumblr, YouTube and blogs, along with the people formerly known as the audience, the forum for real-time social politics in the fall of 2012 remained Twitter, in all its character-limited glory.

Once the debates ended each night, campaigns and voters turned to the new watercoolers of the moment — blogs and article comment sections — to discuss what they’d seen. They went to Facebook and Google+ to share their reactions. To their credit, the Stream producers used Google+ Hangouts to immediately ask undecided voters what they thought and bring in political journalists to share their impressions. It’s a great use of the platform to involve more people in a show using the tools of the moment.

I’ve embedded each of the debate videos below, along with the full length episode of The Stream on data mining in the 2012 election. (I think I delivered, based upon the feedback I’ve received since in person and online, but I’m quite open to feedback if you’d like to comment.)

The Stream: Presidential Debates [10/3/2012]

The Stream: Vice Presidential Debate [10/11/2012]

The Stream: Presidential debates pre-show [10/16/2012]

On memes, social journalism and listening

The first two presidential debates and the vice presidential debate spawned online memes. Given the issues before the country and the world, reducing these debates to those rapid expressions and the other moments that catalyzed strong online reactions was inherently self-limiting. The role of The Stream during the debates, however, was to look at these political events through the prism of social media to explain quickly and precisely what popped online. At this point, if you’re following the election, you’ve probably heard of at least two of them: Big Bird and “binders full of women.” (I explain both in the videos embedded above.) We also saw peaks of attention and debate conflict reflected online, from Vice President Biden’s use of “malarkey” to reaction to CNN chief political correspondent Candy Crowley’s real-time correction of former Massachusetts Governor Mitt Romney’s challenge to President Obama regarding his use of “act of terror” on the day after the United States diplomatic mission in Benghazi, Libya, was attacked.

There are limits to what you can discern through highlighting memes. While it might be easy to dismiss memes as silly or out-of-context moments, I think they serve a symbolic, even totemic role for people who share them online. There’s also a simple historic parallel: animated GIFs are the political cartoons of the present.

Reducing the role of social media in networked political debates to just Twitter, GIFs and status updates, however, would be a mistake. The combination of embeddable online video, blogs and wikis is part of a blueprint for democratic participation that enables people to explore the issues debated in depth, which is particularly relevant if cable news shows fail to do so.

There’s also a risk of distracting from what we can learn about how the candidates would make policy or leadership decisions. I participated in a Google+ Hangout hosted by Storify last week about social media and elections. The panel of “social journalists” shared their perspectives on how the #debates are being covered in this hyper-connected moment — and whether social media is playing a positive role or not.

Personally, I see the role of social media in the current election as a mixed bag. Networked fact checking is a positive development. The campaigns and media alike can find interesting trends in real-time sentiment analysis, if they dive into the social data. I also see an important role for the broader Internet in providing as much analysis on policy or context as people are willing to search for, on social media or off.
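
For a sense of what “diving into the social data” means mechanically, here is a toy version of real-time sentiment tracking: score each tweet against a small word list and keep a rolling average. Newsroom systems use far larger lexicons or trained models; the word lists below are illustrative only.

```python
from collections import deque

POSITIVE = {"win", "strong", "great", "presidential"}  # illustrative lexicon
NEGATIVE = {"malarkey", "weak", "dodge", "gaffe"}

def score(tweet):
    """Crude sentiment: count of positive words minus negative words."""
    words = set(tweet.lower().split())
    return len(words & POSITIVE) - len(words & NEGATIVE)

window = deque(maxlen=500)  # rolling window of recent tweet scores

def update(tweet):
    """Add one tweet and return the current average sentiment."""
    window.append(score(tweet))
    return sum(window) / len(window)

print(update("That answer was pure malarkey"))  # -1.0
```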

There’s a risk, however, that public opinion or impressions of the debates are being prematurely shaped by the campaigns and their proxies, or that confirmation bias is being reaffirmed through homophilic relationships that are not representative of the electorate as a whole.

All that being said, after these three shows, I plan to watch the last presidential debate, on foreign policy, differently. I’m going to pocket my smartphone, sleeve my iPad and keep my laptop closed. Instead of tracking the real-time feedback during the debates and participating in the ebb and flow of the conversation, I’m just going to actively listen and take notes. There are many foreign policy questions that will confront the next President of the United States. Tonight, I want to hear the responses of the candidates, unadorned by real-time spin, fact checking, debate bingo or instant reaction.

Afterwards, I’ll go back online to read liveblogs, see where the candidates may have gone awry, and look abroad to see how the world is reacting to a debate on foreign policy that stands to directly affect billions of people who will never vote in a U.S. election. First, however, I’ll form my own impressions, supported by the virtues of solitude, not the clamor of social media.

July 25 2012

Mr. Issa logs on from Washington

To update an old proverb for the Information Age, digital politics makes strange bedfellows. In the current polarized atmosphere of Washington, certain issues create more interesting combinations than others.

In that context, it would be an understatement to say that it’s been interesting to watch how Representative Darrell Issa (R-CA) has added his voice to the open government and Internet policy community over the last several years.

Rep. Issa was a key member of the coalition of open government advocates, digital rights advocates, electronic privacy wonks, Internet entrepreneurs, nonprofits, media organizations and members of Congress that formed to oppose the passage of the Stop Online Piracy Act (SOPA) and the PROTECT IP Act (PIPA) this winter. Rep. Issa strongly opposed SOPA after its introduction last fall and, working with key allies on the U.S. House Judiciary Committee, effectively filibustered its advance by introducing dozens of amendments during the bill’s markup.

The delay created time over Congress’ holiday recess for opposition to SOPA and its Senate companion, PIPA, to build, culminating in a historic “blackout day” on January 18, 2012. Both bills were halted.

While he worked across the aisle on SOPA and PIPA, Rep. Issa has been fiercely partisan in other respects, using his powerful position as the chairman of the U.S. House Oversight and Government Reform Committee to investigate various policy choices and actions of the Obama administration and federal agencies. During the same time period, he’s also become one of the most vocal proponents of open government data and Internet freedom in Congress, from drafting legislation to standardize federal finance data to opposing bills that stood to create uncertainty in the domain name system. He also sponsored the ill-conceived Research Works Act, which died after receiving fierce criticism from open access advocates.

In recent years, Rep. Issa and his office have used the Web and social media to advance his legislative agenda, demonstrating in the process a willingness to directly engage with citizens and public officials alike on Twitter as @DarrellIssa, even to the extent of going onto Reddit to personally do an “Ask Me Anything.” Regardless of where one stands on his politics, the extent to which he and his staff have embraced using the Web to experiment with more participatory democracy has set an example that perhaps no other member of Congress has matched.

In June 2012, I interviewed Rep. Issa over the phone, covering a broad range of his legislative and oversight work, including the purpose of his new foundation and his views on regulation, open data, and technology policy in general. More context on other political issues, his personal life, business background and political career can be found at his Wikipedia entry and in Ryan Lizza’s New Yorker feature.

Our interview, lightly edited for content and clarity, is broken out into a series of posts that each explore different aspects of the conversation. Below, we talk about open government data and his new “Open Gov Foundation.”

What is the Open Gov Foundation?

In June, Representative Darrell Issa (R-CA) launched an “Open Gov Foundation” at the 2012 Personal Democracy Forum. Rep. Issa said then the foundation would institutionalize the work he’s done while in office, in particular “Project MADISON,” the online legislative markup software that his technology staff and contractors developed and launched after the first Congressional hackathon last December. If you visit the Open Gov Foundation website, you’ll read language about creating “platforms” for government data, from regulatory data to legislative data.

Congressman Issa’s office stated that the Open Gov Foundation will be registered as a non-partisan 501(c)(3) by mid-fall 2012. A year from now, he would like to have made “major headway” on the MADISON project, with the software working in a number of different places, not just the federal House.

For that to happen, the MADISON code will almost certainly need to be open sourced, a prospect that the Congressman indicated in our interview is highly likely, and integrated into other open government projects. On that count, Congressman Issa listed a series of organizations that he admires in the context of open government work, including the Sunlight Foundation, GovTrack, public.resource.org, the New York State Senate, OpenCongress and the Open Knowledge Foundation.

The general thrust of his admiration, said the Congressman, comes from the fact that these groups are not just working hard to get raw government data out there, but to build useful things with that data: tools that help bridge the gap for citizens.

What do you hope to achieve with the Open Government Foundation?

Rep. Issa: I’ve observed over 12 years that this expression that people use in Congress is actually a truism. And the expression they use is you’re entitled to your opinion but not your facts.

Well, the problem in government is that, in fact, facts seem to be very fungible. People will have their research, somebody will have theirs. Their ability to get raw data in a format where everybody can see it and then reach, if you will, opinions as to what it means, tends to be limited.

The whole goal that I’d like to have, whether it’s routing out waste and fraud — or honestly knowing what somebody’s proposal is, let’s just say SOPA and PIPA — is [to] get transparency in real-time. Get it directly to any and all consumers, knowing that in some cases, it can be as simple as a Google search by the public. In other cases, there would need to be digesting and analysis, but at least the raw data would be equally available to everyone.

What that does is it eliminates one of the steps that people like Ron Wyden and myself find ourselves in. Ron and I probably reach different conclusions if we’re given the same facts. He will see the part of the cup that is empty and needs government to fill it. And I will see the part that exists only because government isn’t providing all of the answers. But first, we have to have the same set of facts. That’s one of the reasons that a lot of our initiatives absolutely are equally desired by the left and the right, even though once we have the facts, we may reach different conclusions on policy.

Does that mean more bulk data from Congress, which you supported with an amendment to a recent appropriations bill?

Rep. Issa: Let’s say it’s not about the quantity of data; it’s about whether or not there’s meaningful metadata attached to it. If you want to find every penny being spent on breast cancer research, there’s no way to compare different programs, different dollars in different agencies today. And yet, you may want to find that.

What we learned with the control board — or the oversight board that went with the stimulus — was that you’ve got to bring together all of the data if you’re going to find, if you will, people who are doing the same things in different parts of government, and not have to find out only forensically, after rip-off artists have ripped off the government.

The other example is on the granting of grants and other programs. That’s what we’re really going for in the DATA Act: to get that level of information that can, in fact, be used across platforms to find like data that becomes meaningful information.
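
The payoff Rep. Issa describes is easy to illustrate: once spending records carry shared metadata tags, a cross-agency question becomes a simple aggregation. The records and tags below are invented for illustration, not actual DATA Act data.

```python
from collections import defaultdict

records = [
    {"agency": "NIH", "tags": ["breast-cancer", "research"], "usd": 525_000_000},
    {"agency": "DoD", "tags": ["breast-cancer", "research"], "usd": 120_000_000},
    {"agency": "NSF", "tags": ["basic-research"], "usd": 80_000_000},
]

def total_by_agency(records, tag):
    """Sum spending per agency for every record carrying the given tag."""
    totals = defaultdict(int)
    for record in records:
        if tag in record["tags"]:
            totals[record["agency"]] += record["usd"]
    return dict(totals)

print(total_by_agency(records, "breast-cancer"))
# {'NIH': 525000000, 'DoD': 120000000}
```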

Do you think more open government data will remove some of the information asymmetries around D.C.?

Rep. Issa: A lot of people have monetized the compiling of data in addition to monetizing the consulting as to what its meaning is. What we would like to do is take the monetization of data and take it down to a number that is effectively zero. Analysis by people who really have value-added will always be part of the equation.

Do you envision putting the MADISON code on GitHub, for open source developers in this country and around the world to use and deploy in their own legislatures if they wish?

Rep. Issa: Actually, the reason that we’ve formed a public nonprofit is for just that reason. I don’t want to own it or control it or to produce it for any one purpose, but rather, a purpose of open government. So if it spawns hundreds of other not-for-profits, that’s great. If people are able to monetize some of the value provided by that service, then I can also live with that.

I think once you create government information and, for that matter, appropriate private sector information, in easier and easier to use formats, people will monetize it. Oddly enough, they’ll monetize it for a fairly low price, because when something is easy, you have to create value at a low cost. When something is hard, you can charge a fortune to provide that information to those who need it.

Will you be posting the budget of the Open Gov Foundation in an open format so people know where the funding is coming from and what it’s being spent on?

Rep. Issa: Absolutely. Although, at this point, we’re not inviting any other contributions of cash, we will take in-kind contributions. But at least for the short run, I’ll fund it out of my own private foundation. Once we have a board established and a set of policies to determine the relationships that would occur in the way of people who might contribute, then we’ll open it up. And at that point, the posting would become complex. Right now, it’s fairly easy: whatever money it needs, the Issa Family Foundation will provide to get this thing up and going.
