
April 22 2010

Preparing for the realtime web

The dominance of static web pages -- and their accompanying user expectations and analytics -- is drawing to a close. Taking over: the links, notes, and updates that make up the realtime web. Ted Roden, author of O'Reilly's upcoming "Building the Realtime User Experience" and a creative technologist at the New York Times, discusses the realtime web's impact in the following Q&A.

Mac Slocum: Have we shifted from a website-centric model to a user-centric model?

Ted Roden: It used to be that a user sat down at a computer and checked Yahoo and whatever else. Now, users get their Yahoo updates via Twitter and pushed into Facebook, wherever they are. So rather than a user going to a specific website, websites are coming to where the users already are.

MS: Has push technology finally found its footing with realtime applications?

TR: I think so. It's not that push technology was a solution looking for a problem, it was only a partial solution. But now broadband has a wide penetration, browsers are much more stable and resource friendly, servers are cheap or free, and the development of realtime applications has gotten drastically easier. Using push technology without all of those other bits in place was a lot more painful for everybody involved and as a result, designers and programmers stuck with standard web apps. Now we can take a lot of that for granted and think like desktop application designers.

MS: What skill sets do developers need to take advantage of the realtime web?

TR: Python and Java are certainly strong contenders at this point. Tornado is really exciting to me. But one thing is totally clear: the world of a dominant HTTP server is gone. Developers need to get comfortable with the fact that there's going to be a huge shakeup in the server world. Instead of big fat Apache servers, we'll see a lot more apps running with built-in HTTP servers that are specific to the application, or at least the type of application.
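Roden names Tornado, but the broader point is apps shipping with their own embedded HTTP server instead of sitting behind a big general-purpose one. As a minimal sketch of that idea using only Python's standard library (the `StatusHandler` class and `/status` endpoint are illustrative, not from the interview):

```python
import http.server
import threading
import urllib.request

class StatusHandler(http.server.BaseHTTPRequestHandler):
    """A tiny application-specific handler: one JSON endpoint, no Apache in sight."""
    def do_GET(self):
        body = b'{"status": "ok"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

# Bind to an ephemeral port and serve from a background thread.
server = http.server.HTTPServer(("127.0.0.1", 0), StatusHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

reply = urllib.request.urlopen(f"http://127.0.0.1:{port}/status").read()
server.shutdown()
print(reply.decode())
```

A framework like Tornado adds the non-blocking I/O that makes long-lived realtime connections cheap, but the deployment shape is the same: the server is part of the app.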

MS: How about publishers: what do they need to consider as they're working with realtime apps?

TR: There is a lot to do on the content side. Creators will have to figure out if pushing each new message is important and which messages are important. They'll also need to know if content has to change as it's distributed in different formats. Big publishers like the New York Times have long understood that a headline that works in print doesn't work on the web. And they've also figured out that regular web headlines don't work on the mobile web. At the Times, we've recently started retooling the headlines as they get pushed out to Twitter and Facebook because the content needs to fit those platforms specifically.
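Retooling a headline per platform can be as simple as fitting it into a character budget alongside a link. The helper below is a hypothetical sketch (the function name, the 140-character limit of the era, and the placeholder link are all illustrative, not the Times' actual tooling):

```python
def fit_headline(headline: str, limit: int = 140, link: str = "") -> str:
    """Trim a headline so it, plus an optional trailing link, fits in `limit` chars."""
    budget = limit - (len(link) + 1 if link else 0)
    if len(headline) > budget:
        headline = headline[: budget - 3].rstrip() + "..."
    return f"{headline} {link}".rstrip()

web_headline = ("A Very Long Feature Headline That Reads Fine on the Website "
                "but Overflows the Character Budget of a Tweet With Room for a Link " * 2)
tweet = fit_headline(web_headline, limit=140, link="https://example.com/story")
print(len(tweet) <= 140)
```

Real editorial retooling is of course more than truncation; the point is that each platform imposes its own constraints on the same content.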

MS: How can users organize all this incoming information?

TR: I think a lot of users will start relying on things like the "hide" button and "unfollow." I think more and more apps are going to ship with a "mute for a bit" button, too. So manual filtering will be a big part of this. Apps and websites will start to get smarter about what they show you as well. Facebook already tries to do this with varying degrees of success.

I tend to think information overload is a lot less of a problem than most people do. Essentially, since not long after the printing press was invented, there has been more information being created than we could consume. But we've never really had a problem with it. When people go into a library, they don't have panic attacks because of all the information. They know they can see it all, but it's not all for them. I look at my Twitter stream the same way.

MS: Are we at a point yet where realtime analytics are in place?

TR: We're a long way from having these analytics in place. There are some great services, like Chartbeat, that are getting us there. But we have work to do.

It isn't completely clear what's important in realtime analytics. For starters, we need to know a) that an article is blowing up, and b) why. That is absolutely crucial information to have. But we're going to have to start mixing realtime analytics with A/B testing and all kinds of other things to really understand what's happening.
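Mixing realtime counts with A/B testing can be sketched as a sliding-window tally keyed by article and variant. Everything here is an assumption for illustration (the 60-second window, the `record`/`rate` names, the sample timestamps), not any particular vendor's API:

```python
from collections import defaultdict, deque

WINDOW = 60.0  # seconds of history to keep per (article, variant)
hits = defaultdict(deque)  # (article, variant) -> timestamps of recent views

def record(article, variant, ts):
    hits[(article, variant)].append(ts)

def rate(article, variant, now):
    """Views inside the sliding window; old timestamps are pruned lazily."""
    q = hits[(article, variant)]
    while q and q[0] < now - WINDOW:
        q.popleft()
    return len(q)

now = 1000.0
record("story-1", "headline-a", now - 70)   # outside the window, will be pruned
record("story-1", "headline-a", now - 10)
record("story-1", "headline-b", now - 5)
record("story-1", "headline-b", now - 2)
print(rate("story-1", "headline-a", now))  # 1
print(rate("story-1", "headline-b", now))  # 2
```

Comparing the two rates in real time is the "why": headline-b is outdrawing headline-a right now, which is exactly the kind of signal batch reports surface too late.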

It isn't a problem limited to realtime analytics, either. If you want to track when a user comes from a certain URL and ends up checking out with his shopping cart, we can do that easily. But we have a tough time figuring out who that person is. Beyond that, we're going to have to start tracking the amount of influence the users have. We'll also need to continue tracking the conversation about a piece of content even if the conversation happens in far off corners of the web, far from our websites and Facebook pages.

Note: This interview was condensed and edited.

January 21 2010

Four short links: 21 January 2010

  1. DD-WRT -- replacement firmware for cheap wireless router boxes that adds new functionality like wireless bridging and quality-of-service controls (so Skype doesn't break up while you're web-browsing). Not a new thing, but worth remembering that it exists.
  2. Brain Dump of Real Time Web and WebSocket -- long primer on the different technology for real-time web apps. Conclusion is that there's no silver bullet yet, so more development work is needed. (via TomC on Delicious)
  3. Data Decs -- 3d-printing Christmas decorations based on social network data. My favourite is the blackletter 404. (via foe on Delicious)
  4. ZSync -- open source syncing application that makes it easy for app writers to connect desktop apps and iPhone apps. (via Dave Wiskus)

December 11 2009

Four short links: 11 December 2009

  1. Real Time Text Taskforce -- standardising live typing a la EtherPad and Google Wave, for accessibility reasons.
  2. NoSQL Required Reading -- papers and presentations to get up to speed in the theory and practice of scalable key-value data stores. (via Hacker News)
  3. It's Official, Data.gov 2.0 is Coming -- pointer to the design and philosophy document for the next iteration of Data.gov. Interesting to see so much activity on US open government happening now: the open government directive and progress report were released, along with a request for ideas on open access to publicly-funded science research.
  4. Breakdancing Robot -- we live in the future, and it is good. (via @hollowaynz)

November 30 2009

What Would Jane Austen Have Twittered?

After the recent Web 2.0 Expo NY--a sprawling, week-long conference and exhibition--I ducked into the Morgan Library to catch "A Woman's Wit: Jane Austen's Life and Legacy." A one-room show about an 18th century novelist seemed like the perfect antidote to a week of tech talk in the Death Star Javits Center.

As I'd hoped, the Morgan focuses on a handful of objects from Austen's life, and the commentary is thoughtful. I was surprised, though, to find myself thinking that had Twitter been around in Austen's time (1775-1817), she would likely have been a fan.

Austen wrote more than 3,000 letters, many to her sister Cassandra. They corresponded constantly, starting new letters to each other the minute they finished the last one and sharing the minutiae of their lives. From reading Austen's novels, I'd always assumed that people in her era spent a long time waiting for the mail. But the show mentions that during Austen's life, mail in London and environs was delivered six times a day. Sometimes, a letter sent in the morning was delivered the same evening, which makes snail mail sound a lot more like email or twittering.

The speed of mail at the time and the content of the Austen sisters' letters suggest that the desires to communicate instantly and to let other people know what you ate for breakfast aren't modern phenomena. Of course, Twitter lets you share your soy milk-to-cereal ratio with strangers and thus adds a layer of publishing to our updates. But people today often assume that email, Twitter and other relatively instant communication media have created a slew of brand new communication behaviors. The Jane Austen show at the Morgan suggests just the opposite: our human patterns are surprisingly consistent, and technology evolves to meet us.

Incidentally, the show doesn't say when multi-daily snail mail faded, and I wonder if it passed out of fashion with the rise of the telegraph in the mid-1800s. Anyone know?

November 11 2009

Counting Unique Users in Real-time with Streaming Databases

As the web increasingly becomes real-time, marketers and publishers need analytic tools that can produce real-time reports. As an example, the basic task of calculating the number of unique users is typically done in batch mode (e.g. daily) and in many cases using a random sample from the relevant log files. If unique user counts can be accurately computed in real-time, publishers and marketers can mount A/B tests or referral analysis to dynamically adjust their campaigns.

In a previous post I described SQL databases designed to handle data streams. In their latest release, Truviso announced technology that allows companies to track unique users in real-time. Truviso uses the same basic idea I described in my earlier post:

Recognizing that "data is moving until it gets stored", the idea behind many real-time analytic engines is to start applying the same analytic techniques to moving (streams) and static (stored) data.

Truviso uses (compressed) bitmaps and set theory to compute the number of unique customers in real-time. In the process they are able to handle the standard SQL queries associated with these types of problems: counting the number of distinct users for any given set of demographic filters. Bitmaps are built as data streams into the system, using the same underlying technology that allows Truviso to handle massive data sets from high-traffic web sites.
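Truviso's implementation is proprietary, but the bitmap-and-set-theory idea is easy to illustrate: each user id indexes a bit position, each demographic segment keeps its own bitmap, ANDing bitmaps applies filters, and counting set bits gives the distinct-user count. A toy sketch (segment names and the event stream are made up; real systems use compressed bitmap libraries, not raw Python ints):

```python
from collections import defaultdict

bitmaps = defaultdict(int)  # segment -> bitmap (Python int as a bit vector)

def observe(user_id: int, segments):
    """Set the user's bit in every segment bitmap as the event streams in."""
    for seg in segments:
        bitmaps[seg] |= 1 << user_id

# Simulated event stream: (user_id, demographic segments)
events = [
    (0, ["us", "mobile"]),
    (1, ["us", "desktop"]),
    (0, ["us", "mobile"]),   # repeat visit: setting the same bit changes nothing
    (2, ["uk", "mobile"]),
]
for uid, segs in events:
    observe(uid, segs)

def distinct(*segs):
    """Distinct users matching ALL the given segments (bitwise AND, then popcount)."""
    bm = bitmaps[segs[0]]
    for s in segs[1:]:
        bm &= bitmaps[s]
    return bin(bm).count("1")

print(distinct("us"))            # 2
print(distinct("us", "mobile"))  # 1
```

Because repeat visits are idempotent bit-sets and filters are bitwise ANDs, the count is exact and each query is cheap, which is what makes the approach viable on a live stream.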


Once companies can do simple counts and averages in real-time, the next step is to use real-time information for predictive analytics. Truviso has customers using their system for "on-the-fly predictive modeling".

The other major enhancement in this release is a major step towards parallel processing. Truviso's new execution engine processes runs or blocks of data in parallel in multi-core systems or multi-node environments. Using Truviso's parallel execution engine is straightforward on a single multi-core server, but on a multi-node cluster it may require considerable attention to configuration.
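One reason block-parallel execution composes cleanly for distinct counts is that per-block bitmaps can be OR-merged without losing exactness, even when a user appears in several blocks. A rough sketch of that property, under the same toy bitmap representation as above (the block contents are invented for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_bitmap(user_ids):
    """Build a distinct-user bitmap for one run/block of the stream."""
    bm = 0
    for uid in user_ids:
        bm |= 1 << uid
    return bm

# Three blocks of a stream, processed in parallel; users 2 and 3 span blocks.
blocks = [[0, 1, 2], [2, 3], [3, 4, 5]]
with ThreadPoolExecutor() as pool:
    partials = list(pool.map(chunk_bitmap, blocks))

merged = 0
for bm in partials:
    merged |= bm  # OR-merge: overlapping users are counted exactly once

print(bin(merged).count("1"))
```

Averages and sums have similarly mergeable partial results, which is why this style of engine scales out across cores and nodes.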

[For my previous posts on real-time analytic tools see here and here.]
