
August 31 2011

Why the finance world should care about big data and data science

Finance experts already understand that data has value. It's the lifeblood of their industry, after all. But as O'Reilly director of market research Roger Magoulas notes in the following interview, some in the financial domain may not grasp all that data has to offer. Data science and big data have led to an expansion of data types, Magoulas says, and the associated influx of information could very well shape investment strategies and create new businesses.

How does big data apply to the financial world?

Roger Magoulas: There are two flavors of it. One is analyzing things like your investments, econometrics, trading activity, and longer-term data analysis. That's clearly part and parcel of the finance business, and people in the space already have great familiarity with this side of data.

The second flavor is the integrated approach to data in all facets of how organizations do business. This involves understanding customers, understanding competitors, understanding behavior, taking advantage of the world of sensors, and using a computational and quantitative mindset to make sense of a very confusing world.

Is there a disconnect between the finance world and terms like "data science" and "big data"?

Roger Magoulas: Everyone is struggling with the semantics, so finance isn't worse off than others. They're actually making an effort to understand it. Adding to the semantic confusion, the terms "data science" and "big data" are sometimes co-opted by organizations trying to show how they embody these attributes. That's fine, but the finance ecosystem has a responsibility to learn as much as it can about these areas. The best way to do that is directly from the data science practitioners: see the tools data scientists use and how they approach their work. That firsthand experience will help finance experts inform their investment strategies and see where the data space is heading.

What's the relationship between data science and business intelligence?

Roger Magoulas: My background is in data warehousing, and the front-end access to the data warehouse was known as "business intelligence" in the '90s. These early data warehouses were mostly constructed out of quantitative data from operational systems — things like order entry and customer service systems. "Business intelligence" tools were used to access the mostly well-understood operational data in the data warehouses.

What's changed is that we've had an explosion of data types. For example, no one was doing analysis on search terms back in the '90s because the tools to do that weren't available. Now, we need new terms to help accommodate what analysts do: natural language processing, machine learning, etc. Moreover, the old business intelligence tools were based on operational things, like how many orders a customer placed. They weren't built to tackle these new tasks.

Will data science and big data incrementally improve existing techniques with new tools? Or are we also talking about the creation of whole new industries?

Roger Magoulas: It's going to do both. The analogy might be to when open source software became widely used. While there were open source business models and companies, the real growth of open source came from companies like Google, Yahoo and Amazon that based their core technologies on the open source stacks. There was this two-headed approach that came out of the adoption of open source.

LinkedIn is an example of this two-headed approach. The company is a social network, but it uses data science tools, techniques and processes to build products that make sense of the social network for LinkedIn's clients. Would LinkedIn exist without data science? I think you can imagine a social network that just helps business people connect with each other, but the real monetization part — the thing that helped them go public — came from LinkedIn using the data they capture to identify and build products.

This interview was edited and condensed.

Strata Conference New York 2011, being held Sept. 22-23, covers the latest and best tools and technologies for data science — from gathering, cleaning, analyzing, and storing data to communicating data intelligence effectively.

Save 30% on registration with the code ORM30

Finance sessions at Strata New York

A number of sessions at the three Strata NY events (Sept. 19-23 in New York City) will examine the intersection of finance and data science. Here's a selection:

Thin and Thick Value in a Transparent Environment

Presenter: Umair Haque, Havas Media Lab, HBR

Big data is a necessary part of a transition to an economy that's not just more efficient and productive, but more efficient and productive in 21st century terms. Yet today, we're hyper-connected, but in a relative data vacuum, which leaves us prone to large-scale crises and "too big to fail" thinking. In this session, Harvard's Umair Haque looks at the future of thin and thick value in a data-driven world.

Next Best Action for MBAs

Presenter: James Kobielus, Forrester Research, Inc.

Leading-edge organizations have implemented "next best action" (NBA) technologies, such as big data analytics, within their multichannel customer relationship management programs. In this session, Forrester senior analyst James Kobielus will provide a vision, case studies, ROI metrics, and guidance for business professionals evaluating applications of NBA in their organizations.

Big Data: The Next Frontier

Presenter: Michael Chui, McKinsey Global Institute

McKinsey's influential big data report has helped define and explain the opportunity created by the torrent of data flowing daily through business. Michael Chui outlines the big picture of data innovation, challenges and competitive advantage.

The New Corporate Intelligence

Presenter: Sean Gourley, Quid

What if corporate strategists could literally draw a map to find growth opportunities? A technique called semantic clustering analysis makes this possible. When applied to technology entities worldwide, this analysis can reveal not only which innovation areas are thick with competition, but also where in the market there are opportunities, or "white spaces," ripe for innovation.

Creating a National Data Utility: Dodd-Frank Financial Reforms

Presenters: Donald F. Donahue, The Depository Trust & Clearing Corporation; Paul Sforza, U.S. Department of the Treasury

Donahue and Sforza will discuss America's first public financial services data utility. This project is being incorporated into the United States' existing information infrastructure to provide consistent, quality data to investors, institutions, and regulators.

Photo: ABOVE by Lyfetime, on Flickr

September 10 2010

Widgets, maps and an API make World Bank data sing

The new World Bank data website launching today is designed to make the Bank's vast wealth of open data easier to use. The Bank is increasing the number of indicators available on the site from 339 to more than 1,200, and it has substantially improved its API. Four different languages are supported on the site, along with an improved data browser, feedback buttons, instant search, and embeddable widgets.

"The new site shows the art of the possibility," said Eric Gundersen of Development Seed, the D.C.-based Drupal shop behind the World Bank's data catalog. "This is really actionable information. So many more NGOs [non-governmental organizations] can now make data-informed decisions if they have access."

There's more to come next month, as well, said Livia Barton, web product manager at the World Bank. The World Bank will launch new maps in October. "It will be a way to visualize the work that we're doing in countries," she said. "Like how much money is going into certain schools, or roads that we're working on, and then show if the work is paying off. Are more kids in schools moving on to the secondary part of their educations? Are infant mortality rates decreasing due to the work they're doing? Marrying mapping with operational data can speed up data-driven decision making."

Sharing open data and open code

New tools will help tell stories, but they won't make every aspect of World Bank data analysis easy. For one, World Bank workers have to integrate data input into their business processes, building a regular reporting framework. For another, there's the classic challenge of instituting governance and quality for all of that data.

"What's important is that the economists and statisticians have extremely high standards for data quality," said Barton. "Before anything goes into a catalog, it must be vetted by these teams. There are even high standards to get into the API or website." The community can help with validation via feedback buttons, which have been integrated into every page.

"One of the exciting parts is how much data there is," said Gundersen. "The other is the steps taken to make it accessible." For example, every indicator page will now have tabular data, a map view, or a data view that can be made into widgets and dropped into webpages. Queries and custom graphs are also supported.

Gundersen is excited about the release of the Drupal code that powers the site. It's now open sourced and hosted at

"This will radically reduce the barrier of entry for Drupal folks looking to work with the raw API, and it capitalizes on the long tail -- especially in the international development space -- that Drupal offers for adoption," said Gundersen.

Apps for Development coming in October

Yesterday, the World Bank previewed an Apps for Development contest that will launch on Oct. 7. Todd Park, the CTO of Health and Human Services, challenged the audience: "What are the really useful things the world needs and what are we going to do about it?"

In addition to the contest, the World Bank will host an open forum on Oct. 7 that will feature experts from the open data movement via live webcasts and a 24-hour chatroom.

One open data expert already offered his perspective at the forum yesterday. "Don't assume the data you already have is going to be used in isolation," Tim O'Reilly said. "We don't necessarily need more apps. We need apps that do the right thing."

