
August 29 2011

July 05 2011

Search Notes: Why Google's Social Analytics tools matter

The big search news over the past week has been the launch of Google Plus, but lots of other stuff has been going on as well. Read on for the rundown.

Google social analytics

Plus isn't the only social launch Google had recently. The company also pushed out social analytics features in both Google Analytics and Google Webmaster Tools.

If you use the new version of Google Analytics, you'll now see a social engagement report. Use the social plugin to configure your site for different social media platforms to monitor the behavior of visitors coming from those platforms. Do those coming from Twitter convert better than those coming from Facebook? Do those who "+1" a page spend more time on it? Those are the sorts of questions the new social reports aim to answer.

You can also use Google Webmaster Tools to see how +1 activity is impacting how searchers interact with your pages in search results. In particular, you can see if the click-through rate of a result improves when it includes +1 annotations.

This is just one example of how the silos of the web are integrating. You shouldn't think of "social" users and "search" users when you are doing audience analysis for your site. You instead have one audience who may be coming to your site any number of ways. Engaging in social media can help your site be more visible in search, as results become more personalized and pages that our friends have shared, liked, and "plussed" show up more often for us.

Some may wonder if integrations like this mean that Google is weighting social signals more strongly in search. But those kinds of questions miss the point. The specific signals will continue to change, but the important thing is to engage your audiences wherever they are. The lines will continue to blur.

Google Realtime Search goes offline "temporarily"

A few days ago, Google's realtime search mysteriously disappeared. The reason: Google's agreement with Twitter expired and Google is now working on a new system to display realtime information. While this has temporarily impacted a number of results pages (such as top shared links and top tweets on Google News), it has not impacted Google's social results, which show results that your friends have shared.

New Google UI

Google launched the first of many user interface updates last week, with the promise of many more changes to follow throughout the summer.

Google, Twitter and the FTC

But the Google world is not just about launches. The FTC formally notified Google that they are reviewing the business. Google says that they are "unclear exactly what the FTC's concerns are" but that they "focus on the user [and] all else will follow."

The Wall Street Journal reports that the investigation focuses on Google's core search advertising business, including "whether Google searches unfairly steer users to the company's own growing network of services at the expense of rival providers."

The FTC may also be investigating Twitter, due to how Twitter may be acquiring applications.

Android Open, being held October 9-11 in San Francisco, is a big-tent meeting ground for app and game developers, carriers, chip manufacturers, content creators, OEMs, researchers, entrepreneurs, VCs, and business leaders.

Save 20% on registration with the code AN11RAD

Google Plus (or is it +?)

And of course we have to dig into that well-chronicled launch. As you're no doubt aware, Google launched their latest social effort last week: Google+. Or Google Plus. Or Plus. Or +. I don't know. But it's different from Plus One (+1?). Also it's not Wave, Buzz, Social Circles. Or Facebook.

I've just started using it, so I don't have a verdict on it yet, although I don't know that I buy into Google's premise that "online sharing is awkward. Even broken." And that Google Plus will fix that. It doesn't mean I won't like the product, either. Google is of course under more scrutiny than usual since earlier social launches haven't gone over as well as they'd have liked. What do you all think of it?

Lots of sites have done comprehensive run downs, including:

(Google's Joseph Smarr, a member of the Google+ team, will discuss the future of the social web at OSCON. Save 20% on registration with the code OS11RAD.)

Yahoo search BOSS updates

Yahoo launched updates to their BOSS (Build your own search service) program. If you're a developer who uses Yahoo BOSS, you might be interested in the changes.

Schema.org and rel=author

A few weeks ago, Google, Microsoft, and Yahoo launched the schema.org alliance, which provides joint support for 100+ microdata formats. At the same time, Google announced support for rel=author, which enables site owners to provide structured markup on a page that specifies the author of the content.

The announcement seems to be a foundational announcement to encourage platform providers, such as content management system creators, to build in support of microdata formats for future use by the search engines.

On the other hand, Google has already launched integration of rel=author with search results. You can see examples of how this looks with results for the initial set of authors Google is working with.
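
To make the rel=author idea concrete, here is a minimal sketch of how a script could pull author links out of a page. The sample page, the example.com URLs, and the AuthorLinkFinder class are all made up for illustration, and this is only a reading of the markup itself, not a description of how Google processes it.

```python
from html.parser import HTMLParser


class AuthorLinkFinder(HTMLParser):
    """Collects the href of every <a> or <link> tag whose rel includes "author"."""

    def __init__(self):
        super().__init__()
        self.author_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated tokens, e.g. rel="author nofollow".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "author" in rel_tokens and attr_map.get("href"):
            self.author_urls.append(attr_map["href"])


# Hypothetical page: one link marked as the author, one ordinary link.
SAMPLE_PAGE = """
<html><body>
<p>Written by <a rel="author" href="https://example.com/authors/jane">Jane Doe</a></p>
<p>See also <a href="https://example.com/archive">the archive</a>.</p>
</body></html>
"""

finder = AuthorLinkFinder()
finder.feed(SAMPLE_PAGE)
author_urls = finder.author_urls
print(author_urls)
```

Only the link carrying rel="author" is collected; the archive link is ignored.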


June 06 2011

Google Correlate: Your data, Google's computing power

Google Correlate is awesome. As I noted in Search Notes last week, Google Correlate is a new tool in Google Labs that lets you upload state- or time-based data to see what search trends most correlate with that information.

Correlation doesn't necessarily imply causation, and as you use Google Correlate, you'll find that the relationship (if any) between terms varies widely based on the topic, time, and space.

For instance, there's a strong state-based correlation between searches for me and searches for Vulcan Capital. But the two searches have nothing to do with each other. As you see below, the correlation is that the two searches have similar state-based interest.

Picture 476.png

For both searches, the most volume is in Washington state (where we're both located). And both show high activity in New York.

State-based data

For a recent talk I gave in Germany, I downloaded state-by-state income data from the U.S. Census Bureau and ran it through Google Correlate. I found that high income was highly correlated with searches for [lohan breasts] and low income was highly correlated with searches for [police shootouts]. I leave the interpretation up to you.

Picture 443.png

Picture 445.png

By default, the closest correlations are with the highest numbers, so to get correlations with low income, I multiplied all of the numbers by negative one.
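
The negative-one trick is easy to check by hand. The sketch below assumes a Pearson-style correlation coefficient (this post doesn't say which measure Google Correlate uses) and made-up numbers for five states. Negating one series flips the sign of the correlation without changing its strength, so the strongest negative matches become the strongest positive ones.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five states (not real Census or search data).
income = [40.0, 45.0, 52.0, 60.0, 68.0]
query_volume = [9.0, 8.0, 6.5, 4.0, 3.0]  # falls as income rises

r = pearson(income, query_volume)
r_flipped = pearson([-x for x in income], query_volume)

print(r, r_flipped)  # same magnitude, opposite signs
```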

Clay Johnson looked at correlations based on state obesity rates from the CDC. By looking at negative correlations (in other words, what search queries are most closely correlated with states with the lowest obesity rates), we see that the most closely related search is [yoga mat bags]. (Another highly correlated term is [nutrition school].)

Picture 478.png

Maybe there's something to that "working out helps you lose weight" idea I've heard people mention. Then again, another highly correlated term is [itunes movie rentals], so maybe I should try the "sitting on my couch, watching movies work out plan" just to explore all of my options.

To look at this data more seriously, we can see with search data alone that the wealthy seem to be healthier (at least based on obesity data) than the poor. In states with low obesity rates, searches are for optional material goods, such as Bose headphones, digital cameras, and red wine and for travel to places like Africa, Jordan, and China. In states with high obesity rates, searches are for jobs and free items.

With this hypothesis, we can look at other data (access to nutritious food, time and space to exercise, health education) to determine further links.

Time-based data

Time-based data works in a similar way. Google Correlate looks for matching patterns in trends over time. Again, that the trends are similar doesn't mean they're related. But this data can be an interesting starting point for additional investigation.

One of the economic indicators from the U.S. Census Bureau is housing inventory. I looked at the number of months' supply of homes at the current sales rate between 2003 and today. I have no idea how to interpret data like this (the general idea is that you, as an expert in some field, would upload data that you understand). But my non-expert conclusion here is that as housing inventory increases (which implies no one's buying), we are looking to spiff up our existing homes with cheap stuff, so we turn to Craigslist.

Picture 481.png

Picture 482.png

Picture 483.png

Of course, it could also be the case that the height of popularity of Craigslist just happened to coincide with the months when the most homes were on the market, and both are coincidentally declining at the same rate.

Search-based data

You can also simply enter a search term, and Google will analyze the state or time-based patterns of that term and chart other queries that most closely match those patterns. Google describes this as a kind of Google Trends in reverse.

Google Insights for Search already shows you state distribution and volume trends for terms, and Correlate takes this one step further by listing all of the other terms with a similar regional distribution or volume trend.

For instance, regional distribution for [vegan restaurants] searches is strongly correlated to the regional distribution for searches for [mac store locations].

Picture 484.png

What does the time-trend of search volume for [vegan restaurants] correlate with? Flights from LAX.

Picture 485.png

Time-based data related to a search term can be a fascinating look at how trends spark interest in particular topics. For instance, as the Atkins Diet lost popularity, so too did interest in the carbohydrate content of food.

Picture 486.png

Interest in maple syrup seems to follow interest in the cleanse diet (of which maple syrup is a key component).

Picture 488.png
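
The "Google Trends in reverse" lookup described in this section can be sketched in a few lines: score every candidate query against a target series and keep the closest matches. The query names, the weekly volumes, and the most_correlated helper are all hypothetical, and plain Pearson correlation is an assumption here, not a statement of how Google Correlate actually ranks queries.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def most_correlated(target, candidates, top_n=3):
    """Return (query, correlation) pairs sorted from most to least correlated."""
    scored = [(name, pearson(target, series)) for name, series in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Made-up weekly volumes for a target query and three candidate queries.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
candidates = {
    "query a": [2.0, 4.1, 6.0, 8.2, 9.9, 12.0],  # rises with the target
    "query b": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],   # mirror image of the target
    "query c": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],   # loosely related
}

ranked = most_correlated(target, candidates)
for name, score in ranked:
    print(name, round(score, 3))
```

As with the real tool, a high score here only says the shapes match, not that one search has anything to do with the other.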

Drawing-based data

Don't have any interesting data to upload? Aren't sure what topic you're most interested in? Then just draw a graph!

Maybe you want to know what had no search volume at all in 2004, spiked in 2005, and then disappeared again. Easy. Just draw it on a graph.

Picture 489.png

Apparently the popular movies of the time were "Phantom of the Opera," "Darkness," and "Meet the Fockers." And we all were worried about our Celebrex prescriptions.

Picture 490.png

Picture 491.png

(Note: the accuracy of this data likely is dependent on the quality of your drawing skills.)

OSCON Data 2011, being held July 25-27 in Portland, Ore., is a gathering for developers who are hands-on, doing the systems work and evolving architectures and tools to manage data. (This event is co-located with OSCON.)

Save 20% on registration with the code OS11RAD


June 01 2011

Search Notes: Connecting Google's dots

Here's what recently caught my attention in the search space.

Google Wallet

Last week, Google unveiled Google Wallet, which, on the one hand, might be the future of payments, but on the other hand, seems like it's just using your phone instead of your credit card to pay for things. And phones so far are bulkier to carry around than credit cards. But Google says:

... because Google Wallet is a mobile app, it will do more than a regular wallet ever could. You'll be able to store your credit cards, offers, loyalty cards and gift cards, but without the bulk.

Wallet will be integrated with Google Offers (Google's answer to Groupon) and one can imagine the possible future integrations. For instance, Google could manage travel from start to finish by integrating elements of its ITA acquisition for booking, Hotpot and Places for reviews and maps, and Wallet for paying on the go.

Google Wallet will be available this summer, initially on the Nexus S.

After the unveiling of Wallet, PayPal sued. They said that Google had been nearing the end of negotiations with PayPal to make it a payment option in the Android marketplace, but instead of signing, Google hired away the PayPal executive they'd been negotiating with and built their own version.

Of course, this isn't the first time Google has been sued for hiring talent away from a competitor. And since they had the two key ex-PayPal employees introduce Google Wallet publicly, they weren't exactly keeping things on the down low to avoid this lawsuit.

Google Correlate: Mine search trends using uploaded state-based or time-based data

Google Correlate, new in Google Labs, takes the idea behind Flu Trends and makes it available to anyone, for any data. You can enter data by state or by time and find out what searches are most closely correlated. You can also simply enter a search term and see what other queries are most closely correlated (by state or by time).

This is all U.S. data for now. Google Correlate was launched in Labs, so hopefully when it graduates from there it will be launched worldwide.

Google's comic book about the product stresses that correlation does not imply causation. This data simply shows similar search patterns. But data patterns can provide insight. Flu Trends, for instance, predicts when and where flu is spreading based on how much people are searching for flu-related information. "We found aggregated flu-related queries which produced a seasonal curve that suggested actual flu activity," Google notes. They have corroborated these trends historically with government data about flu activity.

Google's worldwide market share

This column is "Search Notes," not "Google Notes," so why so much Google coverage? The fact is Google is the dominant search engine worldwide, more so even outside the U.S. Along those lines, as I was finalizing slides for a conference session in Germany, I double checked Google's search share there. I found that Google's share was relatively unchanged year over year, at more than 90% for Germany, France, the UK, and Spain. This week, comScore noted that Google is at more than 90% share in Latin America as well.

Removing content from Google

Last fall, I wrote two fairly detailed articles about removing content from Google search results:

Now, Google has made it easier for content owners to remove content. Just verify ownership of your site in Webmaster Tools, and then you can specify what pages from your site you want Google to remove from its results.


May 13 2011

Search Notes: Trying to understand Facebook's whisper campaign

Earlier this week, it seemed clear that the top news in the world of search would be the announcements that came out of Google I/O. But yesterday came word that Facebook had launched a "whisper" campaign against Google. While juicy gossip doesn't completely trump shiny gadgets, it certainly holds its own.

Does Facebook know Google runs a search engine?

Yesterday, the Daily Beast told the story of how Facebook had hired a PR firm to pitch anti-Google stories to reporters and bloggers. Facebook wanted the world to be just as outraged as they are about Google's invasion of our privacy — wait, what?

It seems that the crux of Facebook's argument was that Google organizes information about people and makes it easily accessible through its search results. (I'm fairly sure Google isn't keeping this particular feature secret.)

Facebook focused on Google's "Social Circle" results. In a statement, Facebook said:

We wanted third parties to verify that people did not approve of the collection and use of information from their accounts on Facebook and other services for inclusion in Google Social Circles — just as Facebook did not approve of use or collection for this purpose.

The PR firm Facebook hired had previously sent emails trying to drum up reporter interest. Accusations included:

Google's robots scour the web for people's social connections on different websites. These connections are then stored in a collection people's connections on different websites. This collection is then mined, creating connections between people on different websites, that those people never intended and can't control.

Google Social Circles automatically enables people to trace their contacts' connections and profile information by crawling and scraping the sites you and your contacts use, like Twitter, MySpace, YouTube, Facebook, Yelp, Yahoo and many others, likely in direct violation of the Terms of Service for those sites, unless those sites have partnered with Google on this "service," something else users ought to be aware of.

Google is a search engine. Its entire purpose is to enable users of the Internet to navigate the web's content in a structured way. Any site that doesn't want to make its content available to search engines can simply indicate as such in a robots.txt file. Or pages can be made even more private by placing them behind a login.
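
For anyone who hasn't looked at one, a robots.txt opt-out is only a couple of lines. The sketch below uses Python's standard urllib.robotparser to show how a well-behaved crawler would read it. The robots.txt content, the "ExampleBot" user agent, and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks all crawlers to stay out of /profiles/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /profiles/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler that honors robots.txt checks each URL before fetching it.
can_fetch_profile = parser.can_fetch("ExampleBot", "https://example.com/profiles/alice")
can_fetch_about = parser.can_fetch("ExampleBot", "https://example.com/about")

print(can_fetch_profile, can_fetch_about)
```

The profile URL is off limits while the rest of the site stays crawlable, which is the kind of selective opt-out any site can use.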

Facebook's CTO and COO both previously worked at Google, so one assumes they have an understanding of how search engines work.

In 2007, Facebook decided they were pretty interested in having Google's robots "scour" their profile pages so those pages would be easily available to Google searchers (and in turn Facebook could get more traffic).

Danny Sullivan over at Search Engine Land goes through the details of exactly what Google is indexing and how, but the bottom line is that search engines index the public web. Social networks and other sites have an established way to opt out.

The Chromebook arrives

The Samsung 5 3G Chromebook.

And now, on to the gadgets! At Google I/O this week, Google announced its new Chrome laptops. Part tablet and part computer, the Chromebooks are instant-on, 3G-enabled, and they have tons of battery life. The drawback? You can't run traditional client applications on them. This is cleverly noted as a benefit in the Chromebook announcement:

At the core of each Chromebook is the Chrome web browser. The web has millions of applications and billions of users. Trying a new application or sharing it with friends is as easy as clicking a link. A world of information can be searched instantly and developers can embed and mash-up applications to create new products and services. The web is on just about every computing device made, from phones to TVs, and has the broadest reach of any platform. With HTML5 and other open standards, web applications will soon be able to do anything traditional applications can do, and more.

Maybe so, but as of right now, Google Docs just doesn't offer the things I need to do in Excel and PowerPoint.

Google music and movies

Google's goal of "organizing the world's information and making it universally accessible" has made its way into movies and music. You can now rent movies on YouTube (3,000 titles for now) and Google is finally launching its music product, although at the moment you can only upload your collection and stream it.

Beyond text search

The future of search, in the short term, is about moving beyond textual input (the search box) and textual results (web pages). On the input side, Google has launched a new version of Google Goggles (which uses visual input). I love the idea of Goggles, which lets you point at things to search for information about them.

On the output side, Google has launched a kind of street view for the interiors of stores.

One day, this will all be connected. As I'm walking down the street and see a girl wearing a cute skirt, I'll be able to point my phone at it and find a store that has the skirt hanging on a rack for sale. Ah, the future.


May 06 2011

Search Notes: The high cost of search market share

Here's what caught my attention in the search world this week.

Bing's partnership with RIM: Will distribution lead to increased mobile search share?

Search market share isn't just about providing great search results. It's also about distribution. Become the default search provider in an application or on a device, and as a search engine, you've at least partially won the battle for those users (unless your search experience is so bad it drives users from their normal behavior of not changing defaults right to your competitor).

Google currently has 97% mobile market share in the United States, which is partially due to distribution — both with its Android OS and as the default search on the iPhone. (And consumers are increasingly interested in Android and iPhone over RIM and Microsoft Windows mobile.)

But Bing is trying to change the market share balance, in part by becoming the default search provider on RIM BlackBerry devices. Microsoft smartphones make up 9% of the smartphone market (vs. more than 50% for the combination of Android and iPhone). RIM makes up an additional 33%.

Some think that Microsoft's aggressive pursuit of distribution deals makes poor business sense:

Microsoft's Bing search engine is indeed gaining some share of search queries in the US market (globally, Bing is nowhere). But it is gaining this share at an absolutely mind-boggling cost. Specifically, Microsoft is gaining share for Bing by doing spectacularly expensive distribution deals, deals that don't even come close to paying for themselves in additional revenue.

How much is Microsoft spending to buy market share for Bing?

Based on an analysis of Microsoft's financial statements, Bing is paying about 3X as much for every incremental search query as it generates in revenue from that query.

Continued personalization of Google News

Radar's Alex Howard, writing recently about research around how we increasingly look online for political news, noted:

Polarization can express itself in how people group online and offline. As with so many activities online, political information gathering online requires news consumers to be more digitally literate. That may mean recognizing the potential for digital echo chambers, where unaware citizens become trapped in a filter bubble created by rapidly increasing personalization in search, commercial and social utilities like Google, Amazon and Facebook.

The research, conducted by the Pew Internet and Life Project, found that actually, we are exposed to a variety of viewpoints online. But those who are concerned about potential filter bubbles may be wary of new personalization features of Google News that use previous Google News activity to shape the "News for you" and a new "Recommended Sections" feature. Google says personalization uses both "subjects and sources," so it will expose content based on topics you're interested in (which may come from a variety of sources and viewpoints) and sources you've clicked on (which may be more likely to share your perspective).

Search and Osama Bin Laden

News events always cause search spikes, but the death of Osama Bin Laden caused an all out search frenzy. Yahoo reported a 98,550% increase in searches for the name on May 1, in part driven by teenagers wondering who he was.

Google Trends result for May 2, 2011.

Over on Search Engine Land, Danny Sullivan compared Google results on September 11, 2001, when Google posted a message on their home page advising searchers looking for new information to go elsewhere, vs. May 1, 2011, when a combination of news articles and tweets provided up-to-the-minute news in search results. (Google's inability to provide real-time news coverage on September 11, 2001 led to the creation of Google News.)


April 23 2011

Search Notes: Search and privacy and writing robots

This week, we continue looking at search privacy issues and at the ongoing battle between Google, Bing, and Yahoo. Oh, and writing robots — we'll look at those, too.

Privacy and tracking issues

Searchers don't often think about privacy, but governments certainly do, and over time, search engines have had to balance gathering as much data as possible to improve search results against concerns about privacy. In 2008, Yahoo was very vocal about their policy of only retaining data for 90 days. Now, they've changed that policy. They'll keep raw search log data for 18 months and "have gone back to the drawing board" regarding other log file data.

Microsoft and Google keep search logs for 18 months and Yahoo may have found that keeping this data for a shorter period of time put them at a competitive disadvantage. In the new book "In the Plex," Steven Levy talks about how important Google found search data to be early on.

The search behavior of users, captured and encapsulated in the logs that could be analyzed and mined, would make Google the ultimate learning machine ... Over the years, Google would make the data in its logs the key to evolving its search engine. It would also use those data on virtually every other product the company would develop.

Perhaps that's why Google hasn't added the new "do not track" header to Chrome. The data is too valuable to provide encouragement for users to opt out.

Firefox 4 includes a no tracking option. Whether sites choose to accept this is another matter.

Although, as security researcher Christopher Soghoian said to Wired:

"The opt-out cookies and their plug-in are not aimed at consumers. They are aimed at policy makers. Their purpose is to give them something to talk about when they get called in front of Congress. No one is using this plug-in and they don't expect anyone to use it."

And as the Wired article notes, the header doesn't mean much at the moment as companies aren't using it and legislation doesn't require them to.
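
For reference, the "do not track" signal is just a request header, DNT: 1, and honoring it is entirely up to the site. Here is a minimal sketch of what a server-side check could look like. The tracking_allowed function and the sample headers are hypothetical and not taken from any particular web framework.

```python
def tracking_allowed(headers):
    """Return False when the request carries the Do Not Track signal (DNT: 1)."""
    # HTTP header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"


# Hypothetical request headers from two browsers.
opted_out = tracking_allowed({"User-Agent": "ExampleBrowser", "DNT": "1"})
no_preference = tracking_allowed({"User-Agent": "ExampleBrowser"})

print(opted_out, no_preference)
```

Nothing forces a site to run a check like this, which is the point the Wired article makes.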

Bing continues to gain search share

Last week, I noted that Bing was slowly gaining search share in the United States. This week, the Bing UK blog said that they are gaining share in the UK as well. Of course, the gain between February and March of 2011 was only .28% and Google is still at 90% share, but hey, Bing will take what they can get.

Yahoo reports revenue declines

On Search Engine Land, Danny Sullivan has a great article digging into the details of Yahoo's second quarter earnings. Yahoo is blaming the revenue decline on the new partnership with Microsoft, but the article points out that the explanation isn't as easy as that, and in fact, revenue began declining long before the switch was made.

Can robots write better content than humans?

In recent weeks, Google has been in the news for tweaking its algorithms to better rank sites with unique, high-quality content rather than pages from "content farms." But in some cases, can machines write higher quality stories than people? A recent NPR story recounts a journalism face off between a robot journalist and a human journalist ... and the robot won. Certainly, algorithms are great at data extraction and in some cases, at presenting that data. But we probably don't want machines to take over the analysis, do we?


April 15 2011

Search Notes: More scrutiny for Google, more share for Bing

This week, worldwide courts continue their interest in Google while Bing is edging up in market share. That may actually be good news for Google as they fight antitrust allegations.

Google and privacy and governments

I've written in this column before about both U.S. and international courts looking at all aspects of Google, including antitrust and citizen privacy. That scrutiny continues. The Justice Department has given the go-ahead to Google's acquisition of travel technology company ITA, but it has also instituted conditions to prevent the acquisition from substantially lessening competition. Google agreed to the terms and closed the deal on April 12.

This could pave the way for an FTC antitrust investigation, however. It remains to be seen if the FTC will see the concessions stipulated by the Justice Department to be enough to forgo the investigation. As the result of another FTC investigation, Google has agreed to 20 years of privacy audits.

The U.S. isn't the only country keeping an eye on Google. Courts in Italy have ruled that for search results in Italy, Google has to filter out negative suggested queries in its autocomplete product.

Swiss courts have ruled that Google has to ensure all faces and license plates are blurred out in its Street View product. Google's technology currently catches and blurs out 98%-99% of both already, but the Swiss ruling mandates that Google blur out the remainder by hand if necessary.

In Germany, Google has stopped Street View photography, possibly to avoid burdensome requirements from German courts. Bing is already facing objections from the German government for its plans to operate a similar service.

Bing's growing market share

Both Hitwise and comScore search engine market share numbers are out, and both show Bing gaining.

Hitwise shows that Bing gained 6% in March, for a current share of 14.32%. Bing-powered search (which includes Yahoo) now stands at 30.1%. (Google lost 3% for a share of 64.42%.)

comScore March data shows that Bing's gain from the previous month is much smaller at .3%, for a current share of 13.9% (and a total Bing-powered search share of 29.6%). ComScore's data shows Google with a .3% increase as well for a current share of 65.7%.

Bing's increase may be due, in part, to increased usage of Internet Explorer 9.

Yahoo BOSS relaunches

The original version of Yahoo BOSS was intended to spark innovation in the startup industry and provide a free, white-labeled search index that developers could build from. The newest version, however, is a fairly substantial change from the original mission, as it includes branding and pricing requirements.

Of course, Yahoo search itself has changed since the original launch. When BOSS was first envisioned, Yahoo had its own search engine and was looking to disrupt the search engine landscape and compete with both Bing and Google. Now, Yahoo uses Bing's search engine, and in fact, this new version of BOSS uses Bing's index as well.

Will applications built on Yahoo BOSS continue to use the platform with these new requirements? I'd be interested in talking to developers who are facing this decision.

Google rolls out its "content farm" algorithm internationally

In late February, Google launched a substantial change to its ranking algorithms that impacted nearly 12% of queries. This change was intended to identify low-quality sites, such as those known as content farms, and reduce their ranking.

Google has now made some tweaks and has rolled out the change worldwide for all English queries. Sites around the world are already beginning to see the impact.

One tweak is that Google is now taking into account data about which sites searchers block. Google uses hundreds of signals to determine what web pages are the most useful to searchers and this is one example of how user behavior can play into that.

Online reputation management

Nick Bilton recently wrote a piece in the New York Times about the rise of online reputation management. In today's online world, a quick search for a person's name or a company can surface past indiscretions, mistakes, or the crazy rantings of someone with a grudge and passable HTML skills.

Mike Loukides followed this up with a Radar post about how he was disturbed by the idea of manipulating search results and using black hat SEO techniques to make negative information disappear.

This topic becomes more important as our lives and culture move online. Just ask Rick Santorum.

So what can you do that's not "black hat" if negative information starts appearing about you or your organization? Google recommends that you "proactively publish [positive] information." For example, make sure your business website is optimized well for search and claim ownership of your business listings on the major search engine maps.

Make sure that you've filled out profiles on social media sites, use traditional public relations to raise visibility, and get involved in the conversation. For instance, if negative forum posts appear about your company in search results, reply in those forums with additional information.

[Note: If the "traditional public relations" that you use is to raise visibility of the negative issue a la Rick Santorum, you'll likely only increase the number of search results that appear about the negative issue, as he's perhaps learned.]

If you're able to get a site owner to take down negative information about you, you can request that Google remove that page from its index. And if you have gotten a court order related to unlawful content, you can request Google remove that content from its index as well.


March 30 2011

Search Notes: The future of advertising could get really personal

This week, we imagine the future of advertising as we think about how much can really be tracked about us, including what we watch, our chats with our friends, and if we buy a lot of bacon.

Google expands its predictions

Search engines such as Google have an amazing amount of data, both in general (they do store the entire web, after all) and about what we search for (in aggregate, regionally, and categorized in all kinds of segments). In 2009, Google published a fascinating paper about predictions based on search data. The company has made use of this data in all kinds of ways, such as forecasting flu trends and predicting the impact of the Gulf oil spill on Florida tourism.

You can see the forecasted interest for all kinds of things using Google Insights for Search. Own a gardening web site? You might want to know that people are going to be looking for information on planting bulbs in April and October.

Web Search Interest: planting bulbs

Those predictions are all based on search data, but search engines can do similar things with data from websites. Google is now predicting answers to searches using its Google Squared technology. Want to know the release date of a movie or video game? Just ask Google. A Google spokesperson said this feature is for any type of query as long as they have "enough high quality sites corroborating the answer."

Movie guess

Yahoo and Bing evolve the search experience

We hear a lot about Google's experiments with changes in the user experience of search, but the other major search engines are changing as well.

When Yahoo replaced their search engine with Bing's, they said they would continue to innovate the search experience. The most recent change they've made is with Search Direct, which is similar to Google's instant search but includes rich media and advertising directly in a dropdown box.

Bing also continues to revise their user interface, the latest being tweets shown on the Bing news search results page (in a box called "public updates"). This is in addition to their "most recent" box.

Bing results

Search engines and social networks continue to change the face of advertising

Most of us don't spend much time thinking about the ads that appear next to Google search results, but search-based ads were an amazing transformation in advertising. For the first time, advertisers could target consumers who were looking for exactly what those advertisers had to offer. At scale. Want to target an audience looking to buy black waterproof boots? A snowboard roof rack for a 2007 Mini Cooper? A sparkly pink mini skirt? No problem!

Several years ago, Google introduced ads in Gmail that were intended to be contextually relevant to the email you were reading. This attempt was a bit more hit or miss. Contextual advertising is always going to be a bit less relevant than search advertising. If I'm searching for "best hiking gear," I'm likely looking to buy some. If I'm reading an article in the New York Times about hiking trails in Vermont, I might just be filling time while I wait in line to renew my driver's license. And matching advertising to email is even harder. I might open an email about hiking and wonder how I got on an outdoor mailing list.

For Gmail ads, Google is now looking to use additional signals about how you interact with your mail beyond just the content of the message. They noted that when working on the Priority Inbox feature, they found that signals that determined what mail was important could also potentially be used to figure out what types of ads you might be most interested in.

For example, if you've recently received a lot of messages about photography or cameras, a deal from a local camera store might be interesting. On the other hand if you've reported these messages as spam, you probably don't want to see that deal.

Facebook is also looking to show us ads based on conversations we're having online. This type of advertising has been available in a more general way on Facebook for some time, but this newest test shows ads based on posts in real time. AdAge's description of it sounds like it hits upon the core reason search ads are so effective:

The moment between a potential customer expressing a desire and deciding on how to fulfill that desire is an advertiser sweet spot, and the real-time ad model puts advertisers in front of a user at that very delicate, decisive moment.

Simply showing better ads in email and next to conversations in social networks is one thing, but the more interesting question is how this idea can be used more broadly. Advertising has always provided the profit for most media (television, newspapers, websites), and innovation of the kind we saw with the original search ads is critical in thinking through the future of journalism.

A breakthrough that makes advertising in online versions of videos more successful than commercials on television could be key in the transition of television to online viewing. Americans engaged in 5 billion online video viewing sessions in February 2011. We watched 3.8 billion ads, but if you are like me and watch a lot of Hulu (and many of you are, as Hulu served more video ads than anyone else), you might wonder if all of those ad views were of the same PSA.

Part of why mainstream advertisers haven't taken the leap from traditional television commercials to video ads is that TV commercials are tried and true. Why transition away from that? A good motivator would be an entirely new ad platform that takes real advantage of the online medium. (In the future, perhaps a webcam will track our facial expressions and use that data to stop showing us that annoying commercial!)

Ad platforms have been evolving their use of behavioral targeting for a while, but it's still early days. As for the changes in Gmail ads, it will be interesting to see whether the types of email we get will one day be part of the personalization algorithm for our search (and search ad) results, and whether the kinds of email lists we subscribe to and the types of things we search for will affect the video ads we see on YouTube.

Add to that the predictive elements of search and that organizations such as Rapleaf can tie our email addresses to what we buy at the grocery store (Googlers drink a lot of Mountain Dew and snack on Doritos ... and bacon) and it's pretty clear that radical shifts in personalized advertising are likely not too far away.

Google still the top place to work

One in four job applicants wants to work at Google. That's nearly twice the number who want to work at Apple. The top write-in company (a list of 150 was offered in the study) was Facebook, followed by the Department of Homeland Security. No, I don't know why either.

Google was also named the top brand of 2011. So, despite their legal woes, consumers and potential employees are still fans.


March 24 2011

Search Notes: Google and government scrutiny

This week's column explores the latest in how we access information online and how the courts and governments are weighing in.

Google continues to be one of the primary ways we navigate the web

A recent Citi report using comScore data is the latest to illustrate how much we use Google to find information online.

The report found that Google is the top source of traffic for 74% of the 35 properties analyzed and that Google traffic has remained steady or increased for 69% of them.

However, it was a slightly different picture for media sites, as many saw less traffic from Google and more traffic from Facebook.

Also, a recent Pew study found that for the 24% of Americans who get most of their political news from the internet, Google comes in third at 13% (after CNN and Yahoo).

More generally, 67% of Americans get most political news from TV and 27% rely on newspapers (the latter is down from 33% in 2002). This trend is what's being seen generally for media, as noted in a recent comprehensive study by Pew Research Center's Internet & American Life Project and Project for Excellence in Journalism, in partnership with the John S. and James L. Knight Foundation.

Google and governments, courts, and other legal entanglements

Google's mission is to "organize the world's information and make it universally accessible and useful." Notice the use of the word "world" rather than "Internet." They're organizing our email, our voice mail, and the earth.

While having everything at our fingertips at a moment's notice is awesome, it also can make governments and courts nervous.

Case in point, the U.S. Senate is planning to hold an anti-trust investigation into Google's "dominance over Internet search" and their increasing competition with ecommerce sites.

Senator Herb Kohl noted that the "Internet continues to grow in importance to the national economy." He wants to look into allegations by websites that they "are being treated unfairly in search ranking, and in their ability to purchase search advertising."

Texas also recently filed an anti-trust lawsuit against Google, looking for access to information about how both organic and paid results are ranked.

Of course, if Google reveals too much, then their systems can be gamed. Searchers won't get the best results. Site owners would lose out too as the most relevant and useful result wouldn't appear at the top of results.

Why should we trust Google to rank results fairly? Ultimately, if they build a searcher experience that doesn't benefit the searcher, they could lose users and market share, so it's in their best interest to continue on their stated path.

"Right to be forgotten"

Another fairly recent case involves the Spanish courts. Google search simply indexes and ranks content that exists on the web. When something negative appears about a person or company, they will sometimes ask Google to remove it, but Google's stance is typically that the person or company has to work with the content owner to remove the content — Google just indexes what is public. (Exceptions to this exist.)

In Spain (and other parts of Europe), someone has "the right to be forgotten," but this doesn't apply to newspapers as they are protected by freedom of expression rules. Does it apply to Google's index of that newspaper content? Apparently, it's been ruled both that freedom of expression rules don't apply to search engines and that Google is a publisher and laws that apply to newspapers apply equally to Google.

A Spanish plastic surgeon wants Google to remove a negative newspaper article from 1991 from their search results (although he can't legally ask the newspaper itself to remove the article). The Wall Street Journal sums up the case this way:

The Spanish regulator says that in situations where having material included in search results leads to a massive disclosure of personal data, the individual concerned has the right to ask the search engine to remove it on privacy grounds. Google calls that censorship.

Google does remove content based on government requests when legally obligated to do so and it makes a summary of those requests available.

Sidenote to anyone upset about a negative newspaper article appearing in search results: It's probably a bad idea to try to bribe the journalist into taking the content down.

Google can't become the "Alexandria of out-of-print books" quite yet

Search isn't the only area being scrutinized. Google has also been scanning the world's books and making them universally accessible. The courts just rejected a settlement between Google and the Authors Guild that created an opt-out model for authors. Neither Google nor the Authors Guild is happy. Authors Guild president Scott Turow said, "this Alexandria of out-of-print books appears lost at the moment."

Block any site from your Google search results

Since we all use Google to navigate the web, it makes sense that we want to be able to have our own personal Google and block the sites we don't like. Last month in this column, we talked about Google's Chrome extension that enabled searchers to create a personal blocklist. Now this ability is open to everyone. Once you click on a listing and then return to the search results, the listing you clicked includes a "block all results" link. Click that and you'll never see results from that site again. You can manage this block list in your Google account.
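
To make the idea concrete, here is a minimal sketch in Python of how a personal blocklist could filter a results page. It is an illustration only, not Google's implementation: the function names (site_of, block_site, filter_results) and the example URLs are hypothetical, and it assumes blocking works on a whole hostname.

```python
from urllib.parse import urlparse


def site_of(url: str) -> str:
    """Return the host of a URL, lowercased and without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def block_site(blocked_sites: set, clicked_url: str) -> None:
    """Hypothetical 'block all results' action: remember the clicked site."""
    blocked_sites.add(site_of(clicked_url))


def filter_results(result_urls: list, blocked_sites: set) -> list:
    """Drop every result whose site is on the personal blocklist."""
    return [url for url in result_urls if site_of(url) not in blocked_sites]


# Assumed example data: three results, one from a site the searcher dislikes.
results = [
    "http://www.example.com/guide",
    "http://spam.example.net/page1",
    "http://example.org/article",
]
blocked = set()
block_site(blocked, "http://spam.example.net/another-page")
visible = filter_results(results, blocked)
```

Because the blocklist is just a set of hostnames, managing it later (viewing or unblocking sites) would only require adding or removing entries.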

Bye, AllTheWeb!

Google may seem unstoppable, but only a few years before Google launched, another search engine was dominant on the web. Alta Vista launched in late 1995 with innovative crawling technology that helped it gain vast popularity. Alta Vista later lost out to Google and was acquired by Yahoo. In late 2010, Yahoo announced they were closing down several properties, including Alta Vista.

That hasn't happened yet, but AllTheWeb, another of Yahoo's search properties, is closing April 4th, at which time you'll be redirected to Yahoo. Alta Vista can't be far behind.


March 23 2011

In the future we'll be talking, not typing

Search algorithms thus far have relied on links to serve up relevant results, but as research in artificial intelligence (AI), natural language processing (NLP), and input methods continues to progress, search algorithms and techniques likely will adapt.

In the following interview, Stephan Spencer (@sspencer), co-author of "The Art of SEO" and a speaker at the upcoming Web 2.0 Expo, discusses how next-generation advancements will influence search and computing (hint: your keyboard may soon be obsolete).

What role will artificial intelligence play in the future of search?

Stephan Spencer: I think more and more, it'll be an autonomous intelligence — and I say "autonomous intelligence" instead of "artificial intelligence" because it will no longer be artificial. Eventually, it will be just another life form. You won't be able to tell the difference between AI and a human being.

So, artificial intelligence will become autonomous intelligence, and it will transform the way that the search algorithms determine what is considered relevant and important. A human being can eyeball a web page and say: "This doesn't really look like a quality piece of content. There are things about it that just don't feel right." An AI would be able to make those kinds of determinations with much greater sophistication than a human being. When that happens, I think it will be transformative.

What does the future of search look like to you?

Stephan Spencer: I think we'll be talking to our computers more than typing on them. If you can ask questions and have a conversation with your computer — with the Internet, with Google — that's a much more efficient way of extracting information and learning.

The advent of the Linguistic User Interface (LUI) will be as transformative for humanity as the Graphical User Interface (GUI) was. Remember the days of typing in MS-DOS commands? That was horrible. We're going to think the same about typing on our computers in — who knows — five years' time?

Web 2.0 Expo San Francisco 2011, being held March 28-31, will examine key pieces of the digital economy and the ways you can use important ideas for your own success.

Save 20% on registration with the code WEBSF11RAD

In a "future of search" blog post you mentioned "Utility Fog." What is that?

Stephan Spencer: Utility Fog is theoretical at this point. It's a nanotechnology that will be feasible once we reach the phase of molecular nanotechnology — where nano machines can self-replicate. That changes the game completely.

Nano machines could swarm like bees and take any shape, color, or luminosity. They could, in effect, create a three-dimensional representation of an object, of a person, of an animal — you name it. That shape would be able to respond and react.

Specific to how this would affect search engines and use of the internet, I see it as the next stage: You would have a visual three-dimensional representation of the computer that you can interact with.


March 18 2011

Social media design should start with human behavior

Businesses are under pressure to crack the social media code. There are all those tools and platforms to harness, and all those best practices to adopt. Staying on top of it is exhausting. Staying ahead of it is almost impossible.

Fortunately, there's a better way.

In the following interview, Paul Adams (@Padday), global brand experience manager at Facebook and a speaker at the upcoming Web 2.0 Expo, explains how a simple commitment to value can unravel the complications of social media. The key, Adams says, is to understand and serve basic human behavior.

How is social media design lacking? How can it be improved?

Paul Adams: I'm not sure we should even start with the concept of "social media design." Social behavior in humans is as old as our species, so the emergence of an Internet based on social behavior is simply our rudimentary technology catching up with offline life. Thinking about "social design" should be embedded in everything we do, and not thought of in isolation. We should think about it the same way designers of electronic appliances think of electricity — it's just there, it's the hub, powering other things.

It's problematic that many businesses focus on existing and emerging technology, and not on social behavior. Thinking about platform integration first, like Twitter or Facebook, or technologies first, like what could be enabled by "mobile location" or "real-time updates," is the wrong place to start. Often, businesses need to step back and consider what will motivate people to use what they are developing, above and beyond what exists today. Something that I've been saying for a while is that human behavior changes slowly, much slower than technology. By focusing on human behavior, not only are you much more likely to create something that people value and use, but you're more likely to protect yourself from sudden changes in technology.

Interestingly, even this may not be enough. When it comes to designing around social behavior, it's not just about meeting a need that people currently struggle with, it's about understanding why people would change their current behavior. When things don't work well, people develop workarounds and form habits. These habits are hard to shift. Why try something new, even if it looks a little better, when what you currently use works fine? This is why basic technologies such as SMS remain popular with people, and advanced technologies like Google Wave didn't catch on. SMS works, so why try something else?

People also take the path of least resistance, and trying something new involves change. This is compounded by the fact that many of these opportunities for design are latent needs. In other words, people can't see the problems they have because they have developed workarounds. So simply saying that what you developed is better won't cut it.

Search engines are integrating social media into their products. How do you see this area evolving over the next few years?

Paul Adams: Integrating content created on social media platforms into search engines raises big questions around data usage and privacy. How this evolves will be very interesting because it highlights the difference between being public and being publicized.

At SXSW last year, danah boyd spoke articulately about this — her talk is worth checking out. My take, heavily influenced by the work of danah and others, is that when most people think of the public, they think of the public as they have experienced it offline in the past, usually being outside and being bounded by space and time. The most public setting for many would be something like a large music festival, where many thousands of people are gathered, and at least hundreds can observe your behavior. But only those people who are there, at that time, can observe what you say and do. Search and discovery platforms online, however, are very different. They are bounded by neither space nor time.

The problem is that many people don't realize or understand that. They have no idea what it means to "index the web." They act in the moment, in a specific context, and don't think about how that content might look in the future, in a different context. Some technology pundits say that people should be more careful with what they post, but I strongly disagree with that. It's up to us — the people designing and building the technology — to design the right thing in the first place. Our tools should respect the context in which the content was first created.

This raises a really important question: Does the fact that a post or update was public on a blog, social media site, or review site make it permissible for anyone to take that content and publish it wherever they choose? I don't think it does. Yet, that's what search engines are starting to do. It's interesting to me to see billboard ads with tweets on them. Do those people know that what they said is plastered across town? Is that what they expected when they created the content? The unfortunate fact is that many people will probably come to understand what it means to post publicly online by exposure of something they thought had a limited audience. And I don't think that's a good thing for anybody.

How important is reputation in social media interactions?

Paul Adams: Reputation, or the broader concept of "Identity," is the cornerstone for all other interactions. People need to know who they are interacting with in order to act appropriately, and they constantly scan for cues. As with influence, this is really complex. For example, it's possible to take the view that no one has a single reputation. We are uniquely viewed by every single person, based on our previous interactions and their previous experiences. In this light, the current trend of representing reputation online, with people being assigned a single score, makes no sense.

We're also undervaluing the influence of strong ties. The people closest to us are often the ones who influence us most. To heavily generalize, people are influenced by five different groups in the following descending order: closest friends and family, groups of people they have a strong affiliation with, groups of people who are similar to them, very large groups of people, and finally, by random individuals they don't recognize.

Is there too much focus on the total number of followers or "likes"?

Paul Adams: We're still seeing the fans and followers arms race — businesses trying to gather as many fans as possible. But I think that's fundamentally wrong. It's more important to focus on quality, not quantity, of connections.

For example, many brands run competitions on social media platforms. You have to "Like" or "Follow" that business to enter. So the question is whether they are making connections with advocates of their brand, or with people who simply love competitions. If it's the latter, then they're filling their social media interactions and data with noise.

As I mentioned earlier, people are often most influenced by their closest friends. So only make connections with true advocates of your brand, and market to the friends of those fans.

Will we get to a point where "social media" is not an online thing, but a bridge between the digital and real worlds?

Paul Adams: I think we're already seeing it happening. We see Facebook, Twitter and Google Maps stickers on business windows all over town. I do think this is where it's headed. As I mentioned earlier, social media should be like electricity. It's there, powering everything, but we don't really think about it.

Our phone, or whatever we carry around with us, will probably be our primary source and producer of social media data, so it's important that when we use it, we're not burdened by its place in the ecosystem — for example, by seeing constant privacy controls or too many invasive alerts.

Fundamentally, the phone collects a number of datasets that other devices don't. It knows who we communicate with the most, who we care about the most — because it knows who we call and text most often — and it also knows where we are, where we've been, and probably where we're going. And in the near future, it will know the things we buy.

Mobile is going to be a very disruptive space, and I'm not sure how it will evolve. Rather than try and predict which technologies will be dominant, I think the safer bet for businesses is to understand how these technologies will support human behavior and how they will help people do things they are struggling to do today.

Photo: QR code by Projeto Sticker Map on Flickr


March 10 2011

Search Notes: The future is mobile. And self-driving cars

In the search world, the last week has been all about mobile.

Foursquare 3.0

At SMX West on Tuesday, Foursquare's Tristan Walker gave a keynote where he talked about expanding Foursquare as a customer loyalty and acquisition platform for business. To that end, they've launched new social and engagement features (just in time for SXSW!).

How is this related to search? Here's the key sentence from Foursquare's 3.0 announcement:

For years we've wanted to build a recommendation engine for the real world by turning all the check-ins and tips we've seen from you, your friends, and the larger foursquare community into personalized recommendations.

Foursquare's new "explore" tab lets you search for anything you want (from "coffee" to "80s music") and provides results based on all the information Foursquare has at its disposal, including places your friends have visited and the time of day.

Google is trying to get in this space with Latitude and Hotpot. After all, how can Google possibly hope to offer the same quality search results for "wifi coffee" without data about what kinds of coffee houses you and your friends frequent most often? This is personalization based on overall behavior, not just online behavior, and it's both fascinating and creepy to think about the logical next steps.

Unfortunately for Google, they missed a huge opportunity to get in on this space early when they acquired Dodgeball and effectively killed it, causing the founders to leave Google and start Foursquare.

Bing is also investing in mobile/local search, the latest being "local deals" on iPhone and Android (although not yet on Windows mobile).

Where 2.0: 2011, being held April 19-21 in Santa Clara, Calif., will explore the intersection of location technologies and trends in software development, business strategies, and marketing.

Save 25% on registration with the code WHR11RAD

Continued growth in mobile

According to discussion from a recent local online advertising conference, mobile advertising could become the dominant form of online advertising by 2015. About 5% of paid search is currently mobile, and that number could double by year's end. Google has about 98% mobile search share in the United States and 97% of mobile search spend.

Google says mobile search accounts for 15% of their total searches, distributed as follows:

  • 30% - restaurants
  • 17% - autos
  • 16% - consumer electronics
  • 15% - finance and insurance
  • 15% - beauty and personal

Continued discussion of Google's "content farm" update

As discussed last week, Google's algorithm change impacted 12% of queries and the talk about it has not died down. I wrote a diagnostic guide about analyzing data and creating an action plan and Google opened a thread in their discussion forum to get feedback from site owners.

Self-driving cars!

OK, maybe this isn't really search, except that it's coming from Google, but it's self-driving cars! We live in the future!

Search Engine Land's Danny Sullivan took some video at TED of the cars in action, including some footage inside an actual self-driving car.

Surely flying cars are next.

Got news?

News tips are always welcome, so please send them along.


March 04 2011

Computers are looking back at us

As researchers work to increase human-computer interactivity, the lines between real and digital worlds are blurring. Augmented reality (AR), just in its infant stage, may be set to explode. As the founders of Bubbli, a startup developing an AR iPhone app, said in a recent Silicon Valley Blog post by Barry Bazzell: "Once we understand reality through a camera lens ... the virtual and real become indistinguishable."


Kevin Kelly, co-founder and senior maverick at Wired magazine, recently pointed out in a keynote speech at TOC 2011 that soon the computers we're looking at would be looking back at us (the image above is from Kelly's presentation).

"Soon" turns out to be now: Developers at Swedish company Tobii Technology have created 20 computers that are controlled by eye movement. Tom Simonite described the technology in a recent post for MIT's Technology Review:

The two cameras below the laptop's screen use infrared light to track a user's pupils. An infrared light source located next to the cameras lights up the user's face and creates a "glint" in the eyes that can be accurately tracked. The position of those points is used to create a 3-D model of the eyes that is used to calculate what part of the screen the user is looking at; the information is updated 40 times per second.

The exciting part here is a comment in the article by Barbara Barclay, general manager of Tobii North America:

We built this conceptual prototype to see how close we are to being ready to use eye tracking for the mass market ... We think it may be ready.

Tobii released the following video to explain the system:

This type of increased interaction has potential across industries — intelligent search, AR, personalization, recommendations, to name just a few channels. Search models built on interaction data gathered directly from the user could also augment the social aggregation that search engine companies are currently focused on. Engines could incorporate what you like with what you see and do.


March 03 2011

Startups get social with browser extensions

As Google and Bing try to increase reach in a battle over who displays what social elements in search results, search engine add-ons are emerging to bring all social network results together.

Start-up HeyStaks Technologies launched its social search Firefox extension March 2 at the DEMO Spring 2011 conference. The extension works in tandem with established engines, displaying social results in a section above the regular search results. The product is built around "staks," which are described in a company press release:

Users can create "search staks," collections of the best Web pages from a group of users on a particular topic; these "staks" can be made public and easily shared with colleagues and friends via email, Twitter, etc., or kept private or shared on an invite-only basis.

The product provides an effective solution for users who share a common goal or shared interest, allowing them to search the web in a collaborative fashion using mainstream search engines, to make their searches much more effective by keeping the content relevance of results high.

An example of a HeyStak "search stak"

Another startup, Wajam, also recently launched a search extension. Wajam scours networks like Facebook, Twitter and Delicious and displays what it finds horizontally across the top of search results.

A Wajam search result

The Wajam site says they're currently "oversubscribed," but several reviewers have mentioned that if you send them a tweet (@wajam) requesting a subscription, or like them on Facebook, the wait isn't too long.

HeyStaks is available as a Firefox add-on and an iPhone app. The Wajam extension is available for Firefox, Chrome, Safari and Internet Explorer. Apps for the iPhone and Android are pending.


Search Notes: Google targets "content farms"

This space is meant to be a weekly roundup of the latest news in the world of search, but last week all the news was about one thing: how Google's latest algorithm changes target "content farms." Below, a recap of what this change means to you as a searcher or a content owner.

Google announces substantial algorithm change

On February 24th, Google announced that they were making a substantial change in their ranking algorithms that would impact 12% of queries. Google makes algorithm changes every day, but few are at this scale.

According to Google's blog post:

This update is designed to reduce rankings for low-quality sites — sites which are low-value add for users, copy content from other websites or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites — sites with original content and information such as research, in-depth reports, thoughtful analysis and so on.

The post noted that feedback from the personal blocklist Chrome extension was not used. However, Google did compare extension data to sites that lost ranking after the algorithm change: 84% of the sites users are blocking with the Chrome extension have been impacted.

Google said that:

We've been tackling these issues for more than a year, and working on this specific change for the past few months. And we're working on many more updates that we believe will substantially improve the quality of the pages in our results.

Change targeted at "content farms"?

Danny Sullivan at Search Engine Land talked to Google about the changes and concluded that they were likely aimed at so-called "content farms" — sites with armies of writers generating (sometimes low-quality) articles based on popular searches.

About a month ago, fledgling search startup Blekko chose another path in reducing the number of content-farm style pages in their search results: they simply banned the top 20 sites their users had marked as spam.

Why didn't Google go that route? Likely because Google doesn't want to become an editorial censor of search results. Subjectively deciding a site is low-quality, rather than using large-scale signals to determine quality, becomes a slippery slope. And beyond that, what if some pages on a site are low quality but others are useful? A site that allows thousands of writers to pen articles is undoubtedly going to include an extremely wide range of quality.

Who was impacted?

After the algorithm change came a host of compilations of affected sites and statements from companies assuring the world that they were unscathed. New stats continue to come out, including the latest data showing that Associated Content had a 90% drop in search performance.

Better results?

Shortly after Google's announcement, the Atlantic decided the change was for the better.

Google Fellow Amit Singhal said Google conducts extensive testing before rolling out changes.

If you [test] over a large range of queries, you get a very good picture of whether the new results are better than the old ... The outcome was widely positive.

Not everyone agreed, particularly those sites that dropped out of the search rankings.

What's a site owner to do?

Wired ran an article implying that Google realized it had made some mistakes and was taking appeals from innocent sites that were unfairly affected.

Google clarified that was not the case. Google is always testing and tweaking its algorithms and has substantial internal resources to judge overall search quality. That, after all, is core to its success.

If your site has been impacted by this change, keep in mind that this change was not a manual action and doesn't target specific sites. Google therefore can't manually restore a site's rankings. This change is based entirely on algorithmic signals. Take a look at your analytics data:

  • What's the bounce rate of the site from search? Do searchers click through from search results and stay on the site or do they bounce back and click on a different result?
  • If the bounce rate is high, is it high for all queries or just for certain topics?
  • Do the pages searchers land on provide high value and clearly answer the searchers' questions?
  • What pages on the site have external links? Do large sections of the site not have external links at all? What can you do to make the content more interesting and valuable or raise awareness that it exists?
  • Is the content on the site unique or is it aggregated or syndicated from other locations? If it's not unique, what value do the pages add beyond the original source?
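The first two questions above can be explored with a few lines of Python. This is only a rough sketch: the `visits` records, their layout, and the `bounce_rate_by` helper are made up for illustration and don't correspond to any particular analytics product's export format. It also treats a single-page visit as a bounce, which is only a stand-in for a searcher returning to the results page.

```python
from collections import defaultdict

# Hypothetical export of search-referred visits: (query, topic, pages_viewed).
# Assumption: a visit counts as a bounce when only the landing page was viewed.
visits = [
    ("how to fix a faucet", "home repair", 1),
    ("fix leaky faucet", "home repair", 1),
    ("faucet washer size", "home repair", 3),
    ("easy bread recipe", "recipes", 4),
    ("sourdough starter", "recipes", 2),
    ("bread recipe no yeast", "recipes", 1),
]

def bounce_rate_by(visits, key_index):
    """Return {group: bounce rate}, grouping visits by the column at key_index."""
    totals = defaultdict(int)
    bounces = defaultdict(int)
    for visit in visits:
        group = visit[key_index]
        totals[group] += 1
        if visit[2] == 1:  # single-page visit
            bounces[group] += 1
    return {group: bounces[group] / totals[group] for group in totals}

# Group by topic (column 1); pass 0 instead to group by individual query.
by_topic = bounce_rate_by(visits, 1)
for topic, rate in sorted(by_topic.items(), key=lambda item: item[1], reverse=True):
    print(f"{topic}: {rate:.0%} bounce rate")
```

Sorting by rate puts the weakest topics first, which may help show whether a problem is site-wide or confined to a few sections.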

The positive outlook is that as you make changes to your site to gain back rankings, the site will become more valuable to your audiences and they will become more engaged.

Got news?

News tips are always welcome, so please send them along.


March 02 2011

Before you interrogate data, you must tame it

IBM, Wolfram|Alpha, Google, Bing, groups at universities, and others are trying to develop algorithms that parse useful information from unstructured data.

This limitation in search is a dull pain for many industries, but it was sharply felt by data journalists with the WikiLeaks releases. In a recent interview, Simon Rogers (@smfrogers), editor of the Guardian's Datablog and Datastore, talked about the considerable differences between the first batch of WikiLeaks releases — which arrived in a structured form — and the text-filled mass of unstructured cables that came later.

There were three WikiLeaks releases. One and two, Afghanistan and Iraq, were very structured. We got a CSV sheet, which was basically the "SIGACTS" — that stands for "significant actions" — database. It's an amazing data set, and in some ways it was really easy to work with. We could do incredibly interesting things, showing where things happened and events over time, and so on.

With the cables, it was a different kettle of fish. It was just a massive text file. We couldn't just look for one thing and think, "oh, that's the end of one entry and the beginning of the next." We had a few guys working on this for two or three months, just trying to get it into a state where we could have it in a database. Once it was in a database, internally we could give it to our reporters to start interrogating and getting stories out of it.

During the same interview, Rogers said that providing readers with the searchable data behind stories is a counter-balance to the public's cynicism toward the media.

When we launched the Datablog, we thought it was just going to be developers [using it]. What it turned out to be, actually, is real people out there in the world who want to know what's going on with a story. And I think part of that is the fact that people don't trust journalists any more, really. They don't trust us to be truthful and honest, so there's a hunger to see the stories behind the stories.

For more about how Rogers' group dealt with the WikiLeaks data and how data journalism works, check out the full interview in the following video:


March 01 2011

February 25 2011

Smaller search engines tap social platforms

As the major search engines work to integrate social components into their search algorithms, it's interesting to also see niche engines tapping those same social networks for targeted results.

BuzzFeed, for example, recently launched its Pop Culture Search Engine to search pop-culture memes. It searches in a viral sort of way — the more "buzz" a story gets on social media platforms, the more likely it is to appear in the results. They also use this traffic indicator as a basis for isolating quality content.

Foodily is another targeted engine. It aggregates recipes from around the web and integrates the information with your friends' comments, recommendations, tips and recipes from Facebook. This approach creates more of a community environment for foodies, setting it apart from straight-up recipe search engines such as those on Epicurious or Food & Wine. Foodily can also search for recipes that don't contain certain ingredients. If you're allergic to garlic or out of milk, this feature might come in handy. (Note: Google's new Recipe View also allows you to select ingredients.)
