February 09 2012

It's time for a unified ebook format and the end of DRM

This post originally appeared on Publishers Weekly.

Imagine buying a car that locks you into one brand of fuel. A new BMW, for example, that only runs on BMW gas. There are plenty of BMW gas stations around, even a few in your neighborhood, so convenience isn't an issue. But if one of those other gas stations offers a discount, a membership program, or some other attractive marketing campaign, you can't participate. You're locked in with the BMW gas stations.

This could never happen, right? Consumers are too smart to buy into something like this. Or are they? After all, isn't that exactly what's happening in the ebook world? You buy a dedicated ebook reader like a Kindle or a NOOK and you're locked in to that company's content. Part of this problem has to do with ebook formats (e.g., EPUB or Mobipocket) while another part of it stems from publisher insistence on the use of digital rights management (DRM). Let's look at these issues individually.

Platform lock-in

I've often referred to it as Amazon's not-so-secret formula: Every time I buy another ebook for my Kindle, I'm building a library that makes me that much more loyal to Amazon's platform. If I've invested hundreds or even thousands of dollars in Kindle-formatted content, how could I possibly afford to switch to another reading platform?

It would be too inconvenient to have part of my library in Amazon's Mobipocket format and the rest in EPUB. Even though I could read both on a tablet (e.g., the iPad), I'd be forced to switch between two different apps. The user interfaces of any two reading apps are similar but not identical, and searching across your entire library becomes a two-step process since there's no way to access all of your content within one app.

This situation isn't unique to Amazon. The same issue exists for all the other dedicated ereader hardware platforms (e.g., Kobo, NOOK, etc.). Google Books initially seemed like a solution to this problem, but it still doesn't offer mobi formats for the Kindle, so it's selling content for every format under the sun — except the one with the largest market share.

EPUB would seem to be the answer. It's a popular format based on web standards, and it's developed and maintained by an organization that's focused on openness and broad industry adoption. It also happens to be the format used by seemingly every ebook vendor except the largest one: Amazon.

Even if we could get Amazon to adopt EPUB, though, we'd still have that other pesky issue to deal with: DRM.

The myth of DRM

I often blame Napster for the typical book publisher's fear of piracy. Publishers saw what happened in the music industry and figured the only way they'd make their book content available digitally was to tightly wrap it with DRM. The irony of this is that some of the most highly pirated books were never released as ebooks. Thanks to the magic of high-speed scanner technology, any print book can easily be converted to an ebook and distributed illegally.

Some publishers don't want to hear this, but the truth is that DRM can be hacked. It does not eliminate piracy. It not only fails as a piracy deterrent, but it also introduces restrictions that make ebooks less attractive than print books. We've all read a print book and passed it along to a friend. Good luck doing that with a DRM'd ebook! What publishers don't seem to understand is that DRM implies a lack of trust. All customers are considered thieves and must be treated accordingly.

The evil of DRM doesn't end there, though. Author Charlie Stross recently wrote a terrific blog post entitled "Cutting Their Own Throats." It's all about how publisher fear has enabled a big ebook player like Amazon to further reinforce its market position, often at the expense of publishers and authors. It's an unintended consequence of DRM that's impacting our entire industry.

Given all these issues, why not eliminate DRM and trust your customers? Even the music industry, the original casualty of the Napster phenomenon, has seen the light and moved on from DRM.

Lessons from the music industry

Several years ago, Steve Jobs posted a letter to the music industry pleading for them to abandon DRM. The letter no longer appears on Apple's website, but community commentary about it lives on. My favorite part of that letter is where Jobs asks why the music industry would allow DRM to go away. The answer is that "DRMs haven't worked, and may never work, to halt music piracy." In fact, a study last year by Rice University and Duke University contends that removing DRM can actually decrease piracy. Yes, you read that right.

I recently had an experience with my digital music collection that drove this point home for me. I had just switched from an iPhone to an Android phone and wanted to get my music from the old device onto the new one. All I had to do was drag and drop the folder containing my music in iTunes to the SD card in my new phone. It worked perfectly because the music file formats are universal and there was no DRM involved.

Imagine trying to do that with your ebook collection. Try dragging your Kindle ebooks onto your new NOOK, for example. Incompatible file formats and DRM prevent that from happening ... today. At some point in the not-too-distant future, though, I'm optimistic the book publishing industry will get to the same stage as the music industry and offer a universal, DRM-free format for all ebooks. Then customers will be free to use whatever e-reader they prefer without fear of lock-in and incompatibilities.

The music industry made the transition. Why can't we?


January 20 2012

Don't expect the end of electronics obsolescence anytime soon

This post originally appeared in Mike Loukides' Google+ feed.

A cheery post-CES Mashable article talks about "the end of obsolescence." Unfortunately, that's precisely not what the sort of vendors distributing at CES have in mind.

The general idea behind the end of obsolescence is that consumer electronics — things like Internet-enabled TVs — are software upgradeable. The vendor only needs to ship a software update, and your TV is as good as the new ones in the store.

Except it isn't. It's important to think about what drives people to buy new gear. It used to be that you bought a TV or a radio and used it for 10 or 20 years, until it broke. (Alas, repairability is no longer in our culture.) Yes, the old one didn't have all the features of the new one, but who really cared? It worked. I've got a couple of flat screen monitors in my office; they're sorta old, they're not the best, and sure, I'd like brand new ones, but they work just fine and will probably outlive the computers they're connected to.

The point of field upgrades is that your old TV will have all the "new" features — just like my office computers that get regular updates from Apple and Microsoft. But your old TV will also have its 10-year-old CPU, 10-year-old RAM, and its 10-year-old SD card slot that doesn't know what to do with terabyte SD cards. And the software upgrades will make you painfully aware of that. Instead of a TV that works just fine, you'll have a TV that works worse and worse as time goes by. If you're in the computing business, and you've used a five-year-old machine, you know how painful that can be.

End of obsolescence? No, it's rubbing obsolescence in your face. Good for vendors, maybe, but not for consumers.

See comments and join the conversation about this topic at Google+.

Photo: Solid State by skippyjon, on Flickr

September 15 2011

The evolution of data products

In "What is Data Science?," I started to talk about the nature of data products. Since then, we've seen a lot of exciting new products, most of which involve data analysis to an extent that we couldn't have imagined a few years ago. But that raises some important questions: What happens when data becomes a product, specifically, a consumer product? Where are data products headed? As computer engineers and data scientists, we tend to revel in the cool new ways we can work with data. But to the consumer, as long as the products are about the data, our job isn't finished. Proud as we may be about what we've accomplished, the products aren't about the data; they're about enabling their users to do whatever they want, which most often has little to do with data.

It's an old problem: the geeky engineer wants something cool with lots of knobs, dials, and fancy displays. The consumer wants an iPod, with one tiny screen, one jack for headphones, and one jack for charging. The engineer wants to customize and script it. The consumer wants a cool matte aluminum finish on a device that just works. If the consumer has to script it, something is very wrong. We're currently caught between the two worlds. We're looking for the Steve Jobs of data — someone who can design something that does what we want without getting us involved in the details.

Disappearing data

We've become accustomed to virtual products, but it's only appropriate to start by appreciating the extent to which data products have replaced physical products. Not that long ago, music was shipped as chunks of plastic that weighed roughly a pound. When the music was digitized and stored on CDs, it became a data product that weighed under an ounce, but was still a physical object. We've moved even further since: many of the readers of this article have bought their last CD, and now buy music exclusively in online form, through iTunes or Amazon. Video has followed the same path, as analog VHS videotapes became DVDs and are now streamed through Netflix, a pure data product.

But while we're accustomed to the displacement of physical products by virtual products, the question of how we take the next step — where data recedes into the background — is surprisingly tough. Do we want products that deliver data? Or do we want products that deliver results based on data? We're evolving toward the latter, though we're not there yet. The iPod may be the best example of a product that pushes the data into the background to deliver what the user wants, but its partner application, iTunes, may be the worst. The user interface to iTunes is essentially a spreadsheet that exposes all of your music collection's metadata. Similarly, the "People You May Know" feature on social sites such as LinkedIn and Facebook delivers recommendations: a list of people in the database who are close to you in one way or another. While that's much more friendly than iTunes' spreadsheet, it is still a list, a classic data structure. Products like these have a "data smell." I call them "overt" data products because the data is clearly visible as part of the deliverable.

A list may be an appropriate way to deliver potential contacts, and a spreadsheet may be an appropriate way to edit music metadata. But there are many other kinds of deliverables that help us to understand where data products are headed. At a recent event at IBM Research, IBM demonstrated an application that accurately predicts bus arrival times, based on real-time analysis of traffic data. (London is about to roll out something similar.) Another IBM project implemented a congestion management system for Stockholm that brought about significant decreases in traffic and air pollution. A newer initiative allows drivers to text their destinations to a service, and receive an optimized route, given current traffic and weather conditions. Is a bus arrival time data? Probably so. Is a route another list structure, like a list of potential Facebook friends? Yes, though the real deliverable here is reduced transit time and an improved environment. The data is still in the foreground, but we're starting to look beyond the data to the bigger picture: better quality of life.

These projects suggest the next step in the evolution toward data products that deliver results rather than data. Recently, Ford discussed some experimental work in which they used Google's prediction and mapping capabilities to optimize mileage in hybrid cars based on predictions about where the driver was going. It's clearly a data product: it's doing data analysis on historical driving data and knowledge about road conditions. But the deliverable isn't a route or anything the driver actually sees — it's optimized engine usage and lower fuel consumption. We might call such a product, in which the data is hidden, a "covert" data product.

We can push even further. The user really just wants to get from point A to point B. Google has demonstrated a self-driving car that solves this problem. A self-driving car is clearly not delivering data as the result, but there are massive amounts of data behind the scenes, including maps, Street View images of the roads (which, among other things, help it to compute the locations of curbs, traffic lights, and stop signs), and data from sensors on the car. If we ever find out everything that goes into the data processing for a self-driving car, I believe we'll see a masterpiece of extracting every bit of value from many data sources. A self-driving car clearly takes the next step to solving a user's real problem while making the data hide behind the scenes.

Once you start looking for data products that deliver real-world results rather than data, you start seeing them everywhere. One IBM project involved finding leaks in the public water supply of Dubuque, Iowa. Water is being used all the time, but sudden changes in usage could represent a leak. Leaks have a unique signature: they can appear at any time, particularly at times when you would expect usage to be low. Unlike someone watering his lawn, flushing a toilet, or filling a pool, leaks don't stop. What's the deliverable? Lower water bills and a more robust water system during droughts — not data, but the result of data.
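The "leaks don't stop" signature is simple enough to sketch in a few lines. What follows is only an illustration of the idea, not a description of IBM's system: the function name, the hourly readings, the 24-hour window, and the 1-liter floor are all assumptions made up for the example.

```python
from typing import Sequence


def looks_like_leak(
    hourly_liters: Sequence[float],
    window_hours: int = 24,
    floor_liters: float = 1.0,
) -> bool:
    """Return True if usage stayed above floor_liters for window_hours in a row."""
    run = 0  # consecutive hours with usage above the floor
    for reading in hourly_liters:
        run = run + 1 if reading > floor_liters else 0
        if run >= window_hours:
            return True
    return False


# Hypothetical readings: bursts of use separated by idle hours.
normal_day = [0, 0, 0, 0, 0, 0, 30, 50, 10, 0, 0, 0,
              20, 0, 0, 0, 0, 40, 60, 20, 10, 0, 0, 0]
# The same household with a steady 5 liters per hour escaping somewhere.
leaky_day = [reading + 5 for reading in normal_day]

print(looks_like_leak(normal_day * 2))  # idle hours keep resetting the run
print(looks_like_leak(leaky_day * 2))   # usage never falls back to zero
```

Ordinary use, however heavy, eventually falls back to zero; the sketch only asks whether that ever happens. A real system would presumably have to cope with noisy meters and households that never sleep.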

In medical care, doctors and nurses frequently have more data at their disposal than they know what to do with. The problem isn't the data, but seeing beyond the data to the medical issue. In a collaboration between IBM and the University of Ontario, researchers knew that most of the data streaming from the systems monitoring premature babies was discarded. While readings of a baby's vital signs might be taken every few milliseconds, they were being digested into a single reading that was checked once or twice an hour. By taking advantage of the entire data stream, it was possible to detect the onset of life-threatening infections as much as 24 hours before the symptoms were apparent to a human. Again, a covert data product; and the fact that it's covert is precisely what makes it valuable. A human can't deal with the raw data, and digesting the data into hourly summaries so that humans can use it makes it less useful, not more. What doctors and nurses need isn't data, they need to know that the sick baby is about to get sicker.

Eben Hewitt, author of "Cassandra: The Definitive Guide," works for a large hotel chain. He told me that the hotel chain considers itself a software company that delivers a data product. The company's real expertise lies in the reservation systems, the supply management systems, and the rest of the software that glues the whole enterprise together. It's not a small task. They're tracking huge numbers of customers making reservations for hundreds of thousands of rooms at tens of thousands of properties, along with various awards programs, special offers, rates that fluctuate with holidays and seasons, and so forth. The complexity of the system is certainly on par with LinkedIn, and the amount of data they manage isn't that much smaller. A hotel looks awfully concrete, but in fact, your reservation at Westin or Marriott or Days Inn is data. You don't experience it as data, however — you experience it as a comfortable bed at the end of a long day. The data is hidden, as it should be.

I see another theme developing. Overt products tend to depend on overt data collection: LinkedIn and Facebook don't have any data that wasn't given to them explicitly, though they may be able to combine it in unexpected ways. With covert data products, not only is data invisible in the result, but it tends to be collected invisibly. It has to be collected invisibly: we would not find a self-driving car satisfactory if we had to feed it with our driving history. These products are frequently built from data that's discarded because nobody knows how to use it; sometimes it's the "data exhaust" that we leave behind as our cell phones, cars, and other devices collect information on our activities. Many cities have all the data they need to do real-time traffic analysis; many municipal water supplies have extensive data about water usage, but can't yet use the data to detect leaks; many hospitals connect patients to sensors, but can't digest the data that flows from those sensors. We live in an ocean of ambient data, much of which we're unaware of. The evolution of data products will center around discovering uses for these hidden sources of data.

The power of combining data

The first generation of data products, such as CDDB, were essentially a single database. More recent products, such as LinkedIn's Skills database, are composites: Skills incorporates databases of users, employers, job listings, skill descriptions, employment histories, and more. Indeed, the most important operation in data science may be a "join" between different databases to answer questions that couldn't be answered by either database alone.
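To make the "join" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The two tables, their columns, and the sample rows are hypothetical stand-ins, not LinkedIn's actual schema; the point is only that neither table alone can say who might fill which job.

```python
import sqlite3


def people_for_open_jobs(conn: sqlite3.Connection) -> list:
    """Join job listings to user skills; neither table answers this alone."""
    query = """
        SELECT j.title, u.name
        FROM jobs AS j
        JOIN user_skills AS u ON u.skill = j.required_skill
        ORDER BY j.title, u.name
    """
    return conn.execute(query).fetchall()


# Hypothetical data, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_skills (name TEXT, skill TEXT);
    CREATE TABLE jobs (title TEXT, required_skill TEXT);
    INSERT INTO user_skills VALUES
        ('Ada', 'Hadoop'), ('Grace', 'Cassandra'), ('Linus', 'Hadoop');
    INSERT INTO jobs VALUES
        ('Data engineer', 'Hadoop'), ('Storage engineer', 'Cassandra');
""")

matches = people_for_open_jobs(conn)
for title, name in matches:
    print(f"{title}: {name}")
```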

Facebook's facial recognition provides an excellent example of the power in linked databases. In the most general case, identifying faces (matching a face to a picture, given millions of possible matches) is an extremely difficult problem. But that's not the problem Facebook has solved. In a reply to Tim O'Reilly, Jeff Jonas said that while one-to-many picture identification remains an extremely difficult problem, one-to-few identification is relatively easy. Facebook knows about social networks, and when it sees a picture, Facebook knows who took it and who that person's friends are. It's a reasonable guess that any faces in the picture belong to the taker's Facebook friends. So Facebook doesn't need to solve the difficult problem of matching against millions of pictures; it only needs to match against pictures of friends. The power doesn't come from a database of millions of photos; it comes from joining the photos to the social graph.
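A toy version of one-to-few matching shows where the leverage comes from. This is a sketch under loud assumptions: faces are reduced to made-up two-number vectors, "closest vector wins" stands in for real face recognition, and none of the names correspond to anything Facebook has published.

```python
import math
from typing import Dict, Optional, Set, Tuple

Vector = Tuple[float, float]  # hypothetical stand-in for a face descriptor


def identify(
    query: Vector,
    uploader: str,
    friends: Dict[str, Set[str]],
    gallery: Dict[str, Vector],
) -> Optional[str]:
    """Match a face against the uploader's friends only, not the whole gallery."""
    shortlist = [p for p in friends.get(uploader, set()) if p in gallery]
    if not shortlist:
        return None
    return min(shortlist, key=lambda person: math.dist(query, gallery[person]))


# Hypothetical gallery: two friends plus a crowd of strangers.
gallery: Dict[str, Vector] = {"alice": (0.0, 0.0), "bob": (1.0, 1.0)}
gallery.update({f"stranger_{i}": (5.0 + i, 5.0) for i in range(1000)})
friends = {"carol": {"alice", "bob"}}

# Carol uploads a photo; only her two friends are compared, not 1,002 people.
print(identify((0.9, 1.0), "carol", friends, gallery))
```

The matcher itself is no smarter than before. The social graph simply shrinks the question from "who in the world is this?" to "which of Carol's friends is this?"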

The goal of discovery

Many current data products are recommendation engines, using collaborative filtering or other techniques to suggest what to buy, who to friend, etc. One of the holy grails of the "new media" is to build customized, personalized news services that automatically find what the user thinks is relevant and interesting. Tools like Apple's Genius look through your apps or your record collection to make recommendations about what else to buy. "People you may know," a feature common to many social sites, is effectively a recommendation engine.

But mere recommendation is a shallow goal. Recommendation engines aren't, and can't be, the end of the road. I recently spent some time talking to Bradford Cross (@bradfordcross), founder of Woven, and eventually realized that his language was slightly different from the language I was used to. Bradford consistently talked about "discovery," not recommendation. That's a huge difference. Discovery is the key to building great data products, as opposed to products that are merely good.

The problem with recommendation is that it's all about recommending something that the user will like, whether that's a news article, a song, or an app. But simply "liking" something is the wrong criterion. A couple months ago, I turned on Genius on my iPad, and it said things like "You have Flipboard, maybe you should try Zite." D'oh. It looked through all my apps, and recommended more apps that were like the apps I had. That's frustrating because I don't need more apps like the ones I have. I'd probably like the apps it recommended (in fact, I do like Zite), but the apps I have are fine. I need apps that do something different. I need software to tell me about things that are entirely new, ideally something I didn't know I'd like or might have thought I wouldn't like. That's where discovery takes over. What kind of insight are we talking about here? I might be delighted if Genius said, "I see you have ForScore, you must be a musician, why don't you try Smule's Magic Fiddle" (well worth trying, even if you're not a musician). That's where recommendation starts making the transition to discovery.
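It's easy to see why a naive recommender keeps suggesting more of the same. The sketch below is a bare-bones, overlap-based collaborative filter. It is not how Genius works (Apple hasn't said), and the users and their app libraries are invented for the example.

```python
from collections import Counter
from typing import Dict, List, Set


def recommend(target: str, libraries: Dict[str, Set[str]], top_n: int = 3) -> List[str]:
    """Suggest apps owned by people whose libraries overlap with the target's."""
    mine = libraries[target]
    scores: Counter = Counter()
    for user, apps in libraries.items():
        if user == target:
            continue
        overlap = len(mine & apps)
        if overlap == 0:
            continue  # people unlike me never get a vote
        for app in apps - mine:
            scores[app] += overlap
    return [app for app, _ in scores.most_common(top_n)]


# Hypothetical libraries.
libraries = {
    "me": {"Flipboard", "ForScore"},
    "reader_1": {"Flipboard", "Zite"},
    "reader_2": {"Flipboard", "Zite", "Pulse"},
    "musician": {"GarageBand", "Magic Fiddle"},
}

print(recommend("me", libraries))
```

The musician's apps can never surface, because the only evidence this filter accepts is similarity to what I already own. Discovery needs some other signal, such as inferring from ForScore that I'm a musician.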

Eli Pariser's "The Filter Bubble" is an excellent meditation on the danger of excessive personalization and a media diet consisting only of stuff selected because you will "like" it. If I only read news that has been preselected to be news I will "like," news that fits my personal convictions and biases, not only am I impoverished, but I can't take part in the kind of intelligent debate that is essential to a healthy democracy. If I only listen to music that has been chosen because I will "like" it, my music experience will be dull and boring. This is the world of E.M. Forster's story "The Machine Stops," where the machine provides a pleasing, innocuous cocoon in which to live. The machine offers music, art, and food — even water, air, and bedding; these provide a context for all "ideas" in an intellectual space where direct observation is devalued, even discouraged (and eventually forbidden). And it's no surprise that when the machine breaks down, the consequences are devastating.

I do not believe it is possible to navigate the enormous digital library that's available to us without filtering, nor does Pariser. Some kind of programmatic selection is an inevitable part of the future. Try doing Google searches in Chrome's Incognito mode, which suppresses any information that could be used to personalize search results. I did that experiment, and it's really tough to get useful search results when Google is not filtering based on its prior knowledge of your interests.

But if we're going to break out of the cocoon in which our experience of the world is filtered according to our likes and dislikes, we need to get beyond naïve recommendations to break through to discovery. I installed the iPad Zite app shortly after it launched, and I find that it occasionally breaks through to discovery. It can find articles for me that I wouldn't have found for myself, that I wouldn't have known to look for. I don't use the "thumbs up" and "thumbs down" buttons because I don't want Zite to turn into a parody of my tastes. Unfortunately, that seems to be happening anyway. I find that Zite is becoming less interesting over time: even without the buttons, I suspect that my Twitter stream is telling Zite altogether too much about what I like and degrading the results. Making the transition from recommendation to true discovery may be the toughest problem we face as we design the next generation of data products.


In the dark ages of data products, we accessed data through computers: laptops and desktops, and even minicomputers and mainframes if you go back far enough. When music and video first made the transition from physical products to data products, we listened and watched on our computers. But that's no longer the case: we listen to music on iPods; read books on Kindles, Nooks, and iPads; and watch online videos on our Internet-enabled televisions (whether the Internet interface is part of the TV itself or in an external box, like the Apple TV). This transition is inevitable. Computers make us aware of data as data: one disk failure will make you painfully aware that your favorite songs, movies, and photos are nothing more than bits on a disk drive.

It's important that Apple was at the core of this shift. Apple is a master of product design and user interface development. And it understood something about data that those of us who preferred listening to music through WinAmp or FreeAmp (now Zinf) missed: data products would never become part of our lives until the computer was designed out of the system. The user experience was designed into the product from the start. DJ Patil (@dpatil), Data Scientist in Residence at Greylock Partners, says that when building a data product, it is critical to integrate designers into the engineering team from the beginning. Data products frequently have special challenges around inputting or displaying data. It's not sufficient for engineers to mock up something first and toss it over to design. Nor is it sufficient for designers to draw pretty wireframes without understanding what the product is or how it works. The earlier design is integrated into the product group and the deeper the understanding designers have of the product, the better the results will be. Patil suggested that FourSquare succeeded because it used GPS to make checking into a location trivially simple. That's a design decision as much as a technical decision. (Success isn't fair: as a Dodgeball review points out, position wasn't integrated into cell phones, so Dodgeball's user interface was fundamentally hobbled.) To listen to music, you don't want a laptop with a disk drive, a filesystem, and a user interface that looks like something from Microsoft Office; you want something as small and convenient as a 1960s transistor radio, but much more capable and flexible.

What else needs to go if we're going to get beyond a geeky obsession with the artifact of data to what the customer wants? Amazon has done an excellent job of packaging ebooks in a way that is unobtrusive: the Kindle reader is excellent, it supports note taking and sharing, and Amazon keeps your location in sync across all your devices. There's very little file management; it all happens in Amazon's cloud. And the quality is excellent. Nothing gives a product a data smell quite as much as typos and other errors. Remember Project Gutenberg?

Back to music: we've done away with ripping CDs and managing the music ourselves. We're also done with the low-quality metadata from CDDB (although I've praised CDDB's algorithm, the quality of its data is atrocious, as anyone with songs by John "Lennnon" knows). Moving music to the cloud in itself is a simplification: you don't need to worry about backups or keeping different devices in sync. It's almost as good as an old phonograph, where you could easily move a record from one room to another, or take it to a friend's house. But can the task of uploading and downloading music be eliminated completely? We're partway there, but not completely. Can the burden of file management be eliminated? I don't really care about the so-called "death of the filesystem," but I do care about shielding users from the underlying storage mechanism, whether local or in the cloud.

New interfaces for data products are all about hiding the data itself, and getting to what the user wants. The iPod revolutionized audio not by adding bells and whistles, but by eliminating knobs and controls. Music had become data. The iPod turned it back into music.

The drive toward human time

It's almost shocking that in the past, Google searches were based on indexes that were built as batch jobs, with possibly a few weeks before a given page made it into the index. But as human needs and requirements have driven the evolution of data products, batch processing has been replaced by "human time," a term coined by Justin Sheehy (@justinsheehy), CTO of Basho Technologies. We probably wouldn't complain about search results that are a few minutes late, or maybe even an hour, but having to wait until tomorrow to search today's Twitter stream would be out of the question. Many of my examples only make sense in human time. Bus arrival times don't make sense after the bus has left, and while making predictions based on the previous day's traffic might have some value, to do the job right you need live data. We'd laugh at a self-driving car that used yesterday's road conditions. Predicting the onset of infection in a premature infant is only helpful if you can make the prediction before the infection becomes apparent to human observers, and for that you need all the data streaming from the monitors.

To meet the demands of human time, we're entering a new era in data tooling. Last September, Google blogged about Caffeine and Percolator, its new framework for doing real-time analysis. Few details about Percolator are available, but we're starting to see new tools in the open source world: Apache Flume adds real-time data collection to Hadoop-based systems. A recently announced project, Storm, claims to be the Hadoop of real-time processing. It's a framework for assembling complex topologies of message processing pipelines and represents a major rethinking of how to build data products in a real-time, stream-processing context.


Data products are increasingly part of our lives. It's easy to look at the time spent in Facebook or Twitter, but the real changes in our lives will be driven by data that doesn't look like data: when it looks like a sign saying the next bus will arrive in 10 minutes, or that the price of a hotel reservation for next week is $97. That's certainly the tack that Apple is taking. If we're moving to a post-PC world, we're moving to a world where we interact with appliances that deliver the results of data, rather than the data itself. Music and video may be represented as a data stream, but we're interested in the music, not the bits, and we are already moving beyond interfaces that force us to deal with its "bitly-ness": laptops, files, backups, and all that. We've witnessed the transformation from vinyl to CD to digital media, but the process is ongoing. We rarely rip CDs anymore, and almost never have to haul out an MP3 encoder. The music just lives in the cloud (whether it's Amazon's, Apple's, Google's, or Spotify's). Music has made the transition from overt to covert. So have books. Will you have to back up your self-driving route-optimized car? I doubt it. Though that car is clearly a data product, the data that drives it will have disappeared from view.

Earlier this year Eric Schmidt said:

Google needs to move beyond the current search format of you entering a query and getting 10 results. The ideal would be us knowing what you want before you search for it...

This controversial and somewhat creepy statement actually captures the next stage in data evolution. We don't want lists or spreadsheets; we don't want data as data; we want results that are in tune with our human goals and that cause the data to recede into the background. We need data products that derive their power by mashing up many sources. We need products that deliver their results in human time, rather than as batch processes run at the convenience of a computing system. And most crucially, we need data products that go beyond mere recommendation to discovery. When we have these products, we will forget that we are dealing with data. We'll just see the results, which will be aligned with our needs.

We are seeing a transformation in data products similar to what we have seen in computer networking. In the '80s and '90s, you couldn't have a network without being intimately aware of the plumbing. You had to manage addresses, hosts files, shared filesystems, even wiring. The high end of technical geekery was wiring a house with Ethernet. But all that network plumbing hasn't just moved into the walls: it's moved into the ether and disappeared entirely. Someone with no technical background can now build a wireless network for a home or office by doing little more than calling the cable company. Data products are striving for the same goal: consumers don't want to, or need to, be aware that they are using data. When we achieve that, when data products have the richness of data without calling attention to themselves as data, we'll be ready for the next revolution.


August 19 2011

Publishing News: Amazon lands "4-Hour" author Timothy Ferriss

Here are a few highlights from this week's publishing news. (Note: Some of these stories were previously published on Radar.)

Timothy Ferriss signs with Amazon Publishing to "redefine what is possible"

Larry Kirshbaum is not sitting on his hands. Amazon hired Kirshbaum in May to head its New York operations and this week he signed his first best-selling author, Timothy Ferriss, and acquired rights to Ferriss' new book "The 4-Hour Chef."

In Amazon's press release, Ferriss made it clear that he feels Amazon, as a publisher, has a better hold on digital publishing than its competitors:

My decision to collaborate with Amazon Publishing wasn't just a question of which publisher to work with. It was a question of what future of publishing I want to embrace. My readers are migrating irreversibly into digital, and it made perfect sense to work with Amazon to try and redefine what is possible.

A few feathers were ruffled by the announcement. As noted by The Guardian, Victoria Barnsley, chief executive at HarperCollins UK, voiced concerns over Amazon's aggressive moves into the publishing sector:

Amazon's foray into book publishing ... is obviously a concern. They have very deep pockets and they are now a very, very powerful global competitor of ours ... They are very, very powerful now — in fact they are getting close to being in a sort of a monopolistic situation. They control over 90% of physical online market in UK and over 70% of the ebook market so that's a very, very powerful position to be in. So yes, it is a concern.

Amazon will publish "The 4-Hour Chef" in April 2012.

TOC Frankfurt 2011 — Being held on Tuesday, Oct. 11, 2011, TOC Frankfurt will feature a full day of cutting-edge keynotes and panel discussions by key figures in the worlds of publishing and technology.

Save 100€ off the regular admission price with code TOC2011OR

RR Donnelley's latest acquisitions position it for digital success

This week, publisher RR Donnelley acquired both LibreDigital and Sequence Personal. With these moves, RR Donnelley is doing something about the digital situation that so many bemoan — it's repositioning to give its customers what they want, how they want it. That's the root of what the publishing business is all about, after all.

In a post at The Bookseller, novelist Kate Pullinger said, "I think the big publishers have got themselves into a difficult situation with the stranglehold that Amazon, Apple and Google have on bookselling currently." One could argue the situation is more disruptive than difficult. Instead of fighting against the stranglehold, perhaps it's better to focus on the unlimited potential the disruption brings. Embracing change might be more work than staying the course on a sinking ship, but the publishers who do — like RR Donnelley — will be the ones who remain in a position to succeed.

The roles of advertising and sponsorship in the future of book publishing

This segment was written by Joe Wikert

Felix Salmon recently wrote an article talking about how the New York Times paywall is working because it's porous. He contrasts that to other paywalled sites that haven't enjoyed the same success as the Times. As I read Salmon's article I was thinking less about porous vs. rigid paywalls and more about DRM'd vs. DRM-free books.

There are definitely some similarities here. At O'Reilly we believe in a DRM-free world because we trust our customers and we believe they value our content enough to pay for it rather than steal it. It would be naive of us to think this philosophy totally eliminates the illegal sharing of content though. We just happen to believe those situations shouldn't cause you to penalize all your customers. Shoplifting happens from time to time at your local grocery store but that doesn't mean the store manager should put everything under lock and key.

But it was only when I read Fred Wilson's follow-up post to Salmon's article that I realized what other connection this has to book publishing: advertising, sponsorship and other revenue streams. As Fred points out, the Times doesn't necessarily have to charge for each online page view since they run ads on every page served.

I'm not suggesting we can suddenly give away book content and make the exact same amount of revenue with advertisements. But what I am saying is that advertising and its close cousin, sponsorship (e.g., "This book brought to you in part by..."), can and will play a role in the future of book publishing. Every publisher won't necessarily experiment with that model, but many will.

This story continues here.


  • What investors are looking for in publishing companies
  • Books as a service: How and why it works
  • A premium layer for web-based content
  • More Publishing Week in Review coverage

    June 29 2011

    Two lessons from Pottermore: Direct sales and no DRM

    This post originally appeared on Joe Wikert's Publishing 2020 Blog ("Harry Potter and the Direct, DRM-Free Sale"). It's republished with permission.

    It took her a while, but J.K. Rowling now apparently believes in the future of ebooks. Last week's Pottermore announcement featured two important publishing elements: a direct sales model and a lack of DRM.

    Harry Potter is one of those unique brands that dwarfs everything associated with it. Most Potter fans can name the author but few could tell you the publisher without looking at the book's spine. Although that's often true with other novels, Harry Potter is much more than a series of books or movies. It's an experience, or so I'm told. (I'm not a fan, have never read any of the books or seen any of the movies, but my house is filled with plenty of diehards who have told me everything I need to know.)

    Rowling realizes the strength of her brand and knows she can use it to establish direct relationships with her fans. And so via Pottermore, the author doesn't need any of the big names in ebook retailing. Why settle for a 20% royalty or a 70% cut of the top-line sale when you can keep 100% of it? And why only offer one format when some portion of your audience wants MOBI for the Kindle, others want EPUB for their Apple/Sony devices, and maybe a few more would prefer a simple PDF?

    It's not surprising that J.K. Rowling is forging ahead with a well-thought-out direct sales plan. What blows my mind is that more publishers aren't doing the same. Sure, you'll find publisher websites selling PDFs. Some even offer other formats. But rarely do you find a publisher's website with all the popular ebook formats. Regardless of what type of device you have, it sounds like you'll be able to purchase a Harry Potter ebook for it on Pottermore. I hope they take the extra step and include all the formats in one transaction like we do on

    The other smart move by Rowling is the exclusion of DRM from Pottermore ebooks. Here's an important question for authors and publishers everywhere: If Harry Potter doesn't need DRM, why does your book?! If you ditch DRM you'll be able to offer all the formats. You'll show your customers you trust them and you'll also make it far easier for them to actually use your content.



    May 10 2011

    Mobile carriers crack down on tethering

    In mobile communications, the term "tethering" refers to sharing a cell phone's data connection with another computing device. It's most commonly associated with linking a laptop to a data-enabled phone so a user can take advantage of the phone's Internet connection.

    Tethering can have a significant impact on mobile networks since tethered computers tend to use up a lot more bandwidth than mobile phones. This was likely one of the reasons AT&T was so slow to allow the tethering functionality of the iPhone on their network (it appeared more than a year after Apple announced the iPhone's tethering capabilities). And when AT&T finally did open up their network for tethering, users had to pay $20 extra per month for the feature.

    Now we're seeing AT&T sending warning notices to users it suspects of using unauthorized tethering apps and Google allowing mobile carriers to block these kinds of apps from appearing in the Android store. While many have complained about these actions, it's not surprising that the carriers would try to limit tethering to sanctioned apps and users who have opted to pay more for tethering functionality. Allowing jailbroken iPhones to tether or Android apps like PdaNet, Easy Tether, and InternetSharer that facilitate tethering without the carrier's approval is seen by the carriers as taking money out of their pockets. (Google appears to be allowing carriers to filter the Android store to block these apps for their users, but the apps are still available and installable).

    The cat-and-mouse game between carriers and unauthorized tethering is still going strong, however. For example, here's a recent tweet from the folks behind PdaNet:

    3.0 adds a feature to hide tether usage. Most users do not need to turn it on unless you have received a message from such as T-Mobile.

    This really all boils down to the fact that "unlimited" has a unique meaning in terms of mobile data plans. Cellular carriers have long been criticized for liberally using that word in situations where the service isn't truly unlimited, and their desire to make users pay more for tethering is another example of how that unlimited data plan you thought you had, really isn't unlimited. Many have faulted AT&T for their network issues and how long it took them to officially allow tethering, but at least they've clarified their data plans and language — you won't find the word "unlimited" in their rate plan descriptions anymore.

    Where does this lead? Even though customers have made it clear they feel tethering charges are unfair, the carriers have made it equally clear they plan on monetizing this feature as much as possible. With both AT&T and Verizon clamping down on unauthorized tethering, the climate for people who want to tether for free will continue to get more restrictive. But it's important to note that unlike the situation with AT&T and iOS devices, where you must jailbreak the device to get around the carrier's tethering restrictions, on the more open Android platform there will likely always be free tethering options for those who are willing to go outside of the official store for apps and don't mind breaking their carriers' terms of service.

    Photo: iPhone -> eee pc tether. by paul_irish, on Flickr

    Android Open, being held October 9-11 in San Francisco, is a big-tent meeting ground for app and game developers, carriers, chip manufacturers, content creators, OEMs, researchers, entrepreneurs, VCs, and business leaders.

    Save 20% on registration with the code AN11RAD


    April 28 2011

    ePayments Week: What does the attention around tracking mean?

    Here's what caught my attention in the payment space this week.

    3 levels of awareness about geolocation

    Before last week, many mobile users likely weren't thinking about their location data; that's changed for some. Apple, while maintaining that it hasn't been tracking users, promises to change the way location data is stored and transmitted on its mobile devices. But even as Apple's problem dissipates, the discussion has put the issue of location on the radar screen, and reporters and bloggers are digging into what it means.

    Watching the discussion over the past week, I've come to think there are three levels of awareness that mobile users have about location data. The first is simple awareness, the understanding that your cell phone and the network it's connected to have to know where you are. The second level is a sort of bargaining that comes from that realization, the idea that you should be getting something back from this data or from the people who store it. And the third level, more proactive still, is that you yourself should have access to this data so that you can do something constructive with it.

    Regarding awareness, it may have been an eye opener to insiders that so many consumers weren't aware their cell phones were tracking their movements. A report on NBC's "Today" show began with the line, "It sounds like something out of science fiction: a phone that tracks your every movement." Of course, long before smart phones with maps and check-in services, cell phones had to communicate with nearby cell towers in order to obtain service. GPS and Wi-Fi data have made the pinpointing more precise. But somehow the knowledge that Google Maps on your iPhone can place you at a certain point on the road and show your progress as your bus moves up the street didn't translate, for some people, to an understanding that the phone and your wireless carrier must know where you are and are able to keep a record of it. That's been made clear to a much larger audience now.

    In the wake of this understanding, some reporters and bloggers took up the consumer angle, wondering what we get back for giving up this data. This is an excellent point and one on which, I believe, the future of mobile commerce rests. The communication of data must be a two-way street where each party benefits. I give the navigation service my location and pace and, in turn (and for free) it repays me by displaying traffic data for any major city in 70 countries. I tell Foursquare where I'm checking in and it rewards me with information on where my friends have most recently checked in (and perhaps with a few meaningless points and badges, too). I tell Shopkick I'm near the big box stores in town and it sends me coupons for cleaning products. As mobile and location become more integral to purchases, the connection between my giving up location in exchange for another's profit will become more clear — and consumers will get more vocal about negotiating it.

    The promise was also floated that, in the hands of the right folks, this data could be analyzed to gain deep social, political, and medical insights. Robert Lee Hotz's excellent article in Wednesday's Wall Street Journal (in front of the paywall) described several academic research efforts exploring how much they can learn from smart phones. Researchers are not only tracking movements; when phones are equipped with apps that help users record eating habits, social interactions, moves, and gestures, they claim to be learning how political and social opinions are shaped and how users are influenced to make decisions. The marketing implications of this are obvious (that's the other edge of this sword), and Hotz notes that telecom carriers may already be using this data to assess who is most at risk for switching their contract to another phone company.

    Some people have taken the case to the third level (being proactive about personal data) and suggest that consumers should be able to access their data, either to see what it tells us about ourselves or to do something else with it. Richard Thaler in The New York Times proposes that wireless carriers make a version of your geolocation data available for your use, so that you could either analyze it or port it to a third party who could. One could argue that this is essentially what Apple got called out for doing, albeit without telling anyone, but Thaler's point is still valid. He cites a simple example of being able to analyze your phone usage for a better plan, though this is a service some carriers already offer if you call them up and ask.

    While it's easy to see what merchants and other third parties can provide of value in exchange for your location data (coupons, deals, reviews, directions), we've just begun to explore what we might be able to discover that's greater than a discount. As we've learned, once you provide the dataset, people do amazing things with it. In his follow-up post a week after noting the iPhone tracking details, Pete Warden points to one such use: Maria Scileppi's Living Brushstrokes project, which uses location tracking to create artistic views of people's movements.

    And the payment deals keep coming

    Returning to the nitty-gritty of the payments world ... Facebook and Google jumped into the daily coupon business over the past week. Google Offers is promising a debut in Portland soon (the same place Google first tested Hotpot, which was recently folded into Google Places) and Facebook Deals rolled out in five cities. The moves may remind you of the ways both of these companies jumped into the check-in business last fall after seeing the successful uptake of check-ins via Foursquare, Yelp, and Gowalla.

    Groupon said it didn't need Google's $6 billion last November, so Google's coming after them. Sharise Cruze on BusinessReviewUSA reported that Google will display its offers on maps, so don't be surprised if the Offers service (like Hotpot before it) gets folded into Places.

    Mobile banking 2.0

    Bank of America said it's embarking on a series of changes to revise its mobile banking. Writing in American Banker, Andrew Johnson noted that the move is a reminder that early adopters are into a second wave of refinement, applying what they've learned in the first few years of mobile banking to the next wave. Bank of America, Chase, and Wells Fargo are the top three mobile banking apps in the US, according to a report published in Still, there's a lot of room for growth: comScore's annual online credit card report finds that 20% of mobile phone cardholders use their phone to access a bank account, and 13% are doing that via a mobile app. Since mobile and online are a far cheaper way for banks to manage their customers, we can be sure they'll be at work revising the apps to make them easier to use, more appealing, and more capable.

    Got news?

    News tips and suggestions are always welcome, so please send them along.

    If you're interested in learning more about the payment development space, check out PayPal X DevZone, a collaboration between O'Reilly and PayPal.


    April 15 2011

    4 ways DRM is like airport security

    This post originally appeared on Joe Wikert's Publishing 2020 Blog ("Why DRM Is Like Airport Security"). It's republished with permission.

    While flying home from TOC Bologna, I couldn't help but think about some of the similarities between digital rights management (DRM) and airport security. Here are a few common points that come to mind:

    False sense of security — Does anyone still believe DRM systems are hacker proof? Heck, even books that have never been legally distributed in any e-format are out there as illegal downloads. Just search for the phrase "harry potter ebook downloads" and you'll see what I mean. Scanners are everywhere, so if physical books can be illegally shared, what makes you think a DRM'd title will never appear in the wild? On the airline side, I feel like we're always focusing on the last attack (e.g., underwear bomber, shoe bomber, etc.) and not focusing instead on what the next idiot will try.

    Treats everyone like a criminal — It's hard not feeling like a convict when you're going through airport security or coming back through immigration/customs. The assumption is you're guilty until proven innocent by way of x-ray machines, full-body scans and patdowns. On the book side, the fact that I can't treat my ebook purchase like I can my print book ones (e.g., can't be resold or lent to a friend indefinitely) makes me feel like the retailer and publisher simply don't trust me.

    Highly inefficient — I now have two Kindles and an iPad, and in order to move content from one to the other I have to go through Amazon so they can make sure I'm not breaking the rules. What if I don't have a web connection at that moment? I'm stuck and can't shift that book from my battery-depleted iPad to my Kindle. What's wrong with just connecting the two devices via Bluetooth? Not an option. How's this relate to airport security? Look at the crazy and inefficient lines at the airport and the inconsistencies from location to location (e.g., take your shoes off here but not there, remove your iPad here but not there, etc.)

    Introduces silly limitations — The best airport example is the simple bottle of water. Remember when all you had to do was take a swig of your water bottle to show TSA it's a harmless liquid? I miss those days. The bottled water industry must be laughing all the way to the bank as we toss half-full bottles on one side of security and then have to buy new ones on the other side. In the book world, DRM means that lending a copy, something easily done in the physical world, comes with way too many restrictions in the e-world (e.g., two-week max, can only be done once in the life of the title, etc.)

    I admit that I don't have a solution to offer the airline industry. I don't want to board a plane with a terrorist any more than you do. A pilot friend of mine made an interesting comment about this awhile back, though. He pointed out that one of the results of 9/11 is that passengers are no longer willing to be helpless victims. The shoe and underwear bomber events are examples of just how true this is.

    In other words, passengers are stepping in to fill the holes that will always exist in airport security. I suggest we follow a similar approach in the publishing world, but take it a step further: Eliminate DRM and trust our customers to not only do the right thing, but also ask them to turn in anyone they see making/offering illegal copies.

    Photo: Airport Security by glenmcbethlaw, on Flickr


    April 06 2011

    Amygdala FarmVille

    You may have received a recent email that started something like this:

    We have been informed by one of our email service providers, Epsilon, that your email address was exposed by an unauthorized entry into that provider's computer system.

    Did you rush immediately to Facebook or Twitter and ask "Who the heck is Epsilon?" I hope so because that would be super meta-ironic.

    What you just discovered was what I hope will someday be known as Jim's First Law of Personal Data Privacy, which states: "The people that know the most about you are the people you know the least about."

    The thing is, people have been trying to figure you out for a long time — long before you got your first email address or bought your first Michael Graves toaster at Target online. Many of the companies doing marketing services have roots that go back to before most of you reading this were born.

    The difference is that in the old days they had to painstakingly amass their dossier on you one psychographically informative tidbit at a time. It might have taken a decade or more, but eventually by combining data from a bunch of the places where you did business they could more or less figure you out. Having accomplished that feat of data hoarding and analysis, they might make you 5% more likely to open a piece of direct mail, or discover that people that vote for third-party candidates might also respond to Hummer ads — and these companies would get paid handsomely for that information.

    These databases were the crown jewels. If FASB accounting rules were rational, these information stores would appear on the books the way GM lists plant equipment. And they were big data before people were using the term "big data." These folks were prefixing with peta- while the rest of us were still getting our heads around tera.

    So I can only imagine the reaction in the boardrooms of those traditional firms when Facebook and Google built their Psychographic Marketing Honeypots and disguised them as a social network and a search engine. "All that data we've worked so hard to source! Merde! People just sit there all day giving it to them!"

    And the best part? Need a new field to feed your new and better algorithm? Don't spend months trying to source it from the U.S. Census or a credit card company or, even worse, merge with a frequent flyer program or a phone book to get it. It's way easier than that. Just add a field to the user profile page and they'll fill it out for you!

    I wonder when it occurred to Mark Zuckerberg that he was like a casino operator and was building two companies. The part full of bad carpet and distractions that you think you're doing business with and the part you can't see behind the scenes that does, well, other stuff. Do you think it was part of his plan from the beginning, or was it sort of a mid-stage epiphany? Never mind, don't tell me, I'm waiting for "The Social Network" to be free on Netflix. And I don't want any spoilers in the comments, please.

    The world has changed though, hasn't it? We have entered the Matrix, but it's not our body heat they want. They want the preference model encoded in our amygdala and a list of all the people that might influence that model tomorrow.

    At some level, the relationship between Facebook and one of their advertisers isn't all that different from any other marketing services firm and a company like General Motors. But the way we participate in generating our own profile while we think we're doing something else is fascinating.

    I think most people think their bargain with Facebook is like the one they had with broadcast television. I sit here a few hours a day sidestepping drudgery and you feed me ads. But as you know (or know now), that's not it at all. And it's definitely not just for those ads being served on the site.

    The bargain we make collectively with "the web" might be Faustian if it were in fact a bargain between two parties. After all, what we are actually doing is trading our most intimate selves into the ether and in return we get creepily prescient ads for erectile dysfunction medication and whatnot. But our bargain is too often an implicit one with the back room we don't know exists. Faust at least knew the terms of his agreement.

    Here's what you need to know: Your mind is advanced enough to experience a self, a self that you think has intrinsic value. But that's just a construction in your head. Your actual extrinsic value, I'm sorry to say, is just the sum of your known behaviors and the predictive model they make possible. The stuff you think of as "your data" and the web thinks of as "our data about you — read the ToS," is the grist for that mill. And Facebook's shiny front room is just a place for you to behave promiscuously and observably. While you're farming, well, fake carrots or something, they are farming your amygdala.

    Illustration: Our Brain by perpetualplum, on Flickr


    February 24 2011

    Buy where you shop gets a little easier

    Consulting crowdsourced product reviews has become a common step in many consumer shopping processes. The practice, though good for the consumer, might not be so great for the brick-and-mortar retailer. If a customer comes into a store to touch and inspect products, then returns home to compare reviews before making a final selection, an online purchase becomes far more likely than the customer returning to the store.

    A new app by Search Reviews may alter these shopping habits. Consumers can install the free app on a smartphone and use it to scan in bar codes from any of more than four million products. The app then shows a list of aggregated consumer reviews.

    As Alexia Tsotsis from TechCrunch points out, the app has a ways to go in terms of its ability to sort by date or store, and it doesn't yet incorporate geolocation. But having aggregated product reviews in hand could encourage more of those buy-where-you-shop consumer experiences.

    March 25 2010

    Web-TV convergence is already here, just not the way we expected

    Remember WebTV? It was supposed to combine the web and television experiences into a media consumption firehose. Or something like that.

    WebTV didn't work out, but that hasn't kept other companies from pursuing similar dreams of web-television utopia. Yahoo is building TV-based widgets. TiVo is banking on search across media types. Boxee wants to entwine browsing and viewing. Even Google is getting in on the act.

    But new data from Nielsen suggests they're all headed in the wrong direction. The convergence between television and web has already happened, but it's not occurring in a standalone box:

    In the last quarter of 2009, simultaneous use of the Internet while watching TV reached three and a half hours a month, up 35% from the previous quarter. Nearly 60% of TV viewers now use the Internet once a month while also watching TV.

    There are a couple of things I find notable about this:

    1. Manufacturers of televisions and set-top boxes are only now catching on to the inherent disconnect between television's lean-back experience and the web's lean-forward positioning. If you've ever entered a password through a TV remote, you know how clunky this can be. Yet, input is just the beginning of the problem. The web is a pull environment, so by default we're more engaged. Television, beyond the guide grid, is a push environment. I believe a big reason why web and TV haven't yet converged in a broadly adopted super device is because it's uncomfortable -- physically and mentally -- to rock between push and pull. It's a heck of a lot easier to prop a notebook on your lap while the TV plays in the background.

    2. The Nielsen post includes another notable conclusion:

    The research shows that Americans watch network programs online when they miss an episode or when a TV is not available. Online video is used essentially like DVR and not typically a replacement for watching TV.

    This is yet another example of why "killer" technologies are ridiculous. Consumers are filling in their consumption with online video, not replacing it. That's a huge departure from the web killing cable or killing broadcast or killing ... whatever.

    3. Extending on that: It matters if content is consumed, not how it's consumed. All kinds of effort have gone into bridging the web and TV worlds through brute force. Yet, the most successful cross-media efforts are the ones that let consumers interact through the tools they already use. What makes more sense: integrating Twitter into a television's hardware or helping users tweet during a show?

    (Via Mashable.)

    March 21 2010

    "Am Schauplatz - Bärenjäger": financial and organizational contributions made to bring about an event that is then reported on


    [BKS quote] "The Federal Communications Senate (Bundeskommunikationssenat), like the applicant, assumes that a contribution of a financial or organizational nature toward bringing about an event that is to be the subject of reporting is compatible with the objectivity requirement, provided that any influence on the authenticity of the reporting can be ruled out and suitable measures ensure that the appearance of partisanship, or of a distortion of the dimensions of the event, is prevented. ..."

