April 10 2012

State of the Computer Book Market, part 5: Wrap-Up and Digital

In this final post (posts 1-4 are available here), I will summarize the first four posts, offer a view of the top authors, and include data on electronic books and how parts of the digital world are surpassing the print world.

Here is a quick summary of posts 1-4.

In 2011 the book market as a whole sold about 9.25% fewer units than in 2010. The tech book market was up by 2% in 2011, so it outperformed the whole market. Yet our data, which is based on the Top 3000 titles for each week, shows only 0.7% growth. This means that the majority of the growth was generated by titles that sold very few copies and may not have made it into a weekly report for a given week in 2011. The market continued to follow its seasonal pattern, getting off to a fast start in 2011, taking its typical nose-dive in July, and recovering in the fall. Yet there were some anomalies, with higher peaks and deeper valleys in the trend lines for 2011. These anomalies were caused by Borders Books (BGI) going out of business.

There were 21 weeks in 2011 that were ahead of the same week in 2010. In 2010 there were only 11 weeks that were ahead of the prior year's unit sales. There were 442 more titles (from all copyright years) that made it into the Top 3000 reports during 2011, and 268 more in 2010 than in 2009. This demonstrates that the threshold to make a Top 3000 report was lower than in any other year. The average units per title, for all titles and not just new ones, increased slightly from 37.95 in 2010 to 37.96 in 2011. As for new 2011 titles, there were 349 fewer titles published that made the dataset, but they averaged 3.4 more units per title, averaged one fewer page per title, and on average cost $0.80 less than in 2010. Again, these titles had a publish date during 2011.

The biggest winners in growth order are: Tablet, Mobile Programming, Windows Consumer, Security Topics, Hardware Topics, Social Web, Computers and Society, Cloud Computing, Information Technology, and Data Topics. The areas with the largest drop in units were, in descending order: Web Page Creation, Digital Photography, Mac OS, Flash, Web Programming, Web Design Tools, Personal Computers, Linux, Software Project Management, and Personal Database. In the top-performing area of Mobile Programming, iOS was nine times as large as Android in 2009 and roughly 2.5 times as large in 2010, and today iOS developer books sell only 1.2 times as many copies as Android developer books.

From a publisher's perspective, Pearson regained the second spot at the end of 2011, behind Wiley and slightly ahead of O'Reilly. The two imprints of O'Reilly and Dummies continue to have the most diverse publishing programs due to their strong performance in all six tech categories.

The number one title, from a dollar perspective, was PMP Exam Prep, Sixth Edition: Rita's Course in a Book for Passing the PMP Exam and from a unit perspective, My iPad 2. From a dollars perspective, the PMP book has ranked in the top two since 2005. The number one programming language for three years running (2009, 2010, and 2011) was Java, with JavaScript and VBA also showing continued strong growth in 2011. That's the quick review.

Now let's turn our attention to the most important ingredient in publishing: authors. Authors are the entities that create all types of content. And there are all types of authors. Some are really like small publishing houses with "co-authors" doing most of the heavy lifting. Then there are those who do all the lifting themselves (editing, writing, testing, and coding of the content) and then move on to help promote, market and sell. These latter activities contribute to what we call an author platform. Some authors have an inherent platform by virtue of who they are or what their 9-5 job is, while others have to work hard to cultivate their platform. The most successful authors in our dataset have figured out both the upfront creation of content and the end-game of helping with marketing and sales. The table below shows the top 15 authors for 2011 and their rank for both 2011 and lifetime units. I'm also showing their % Units '11 so you can see the percentage of their lifetime units that they sold in 2011. A few did really well in 2011 and yet are not top 10 authors by lifetime units. Scott Kelby and David Pogue did not have outstanding numbers in 2011 but are the top two lifetime authors from a units perspective. Gary Rosenzweig and Patrick Kanouse both had outstanding sales in 2011 but are nowhere near the top 15 authors from a lifetime perspective.

Author_List                    2011 Rank   All Time Rank   % Units '11
Paul McFedries                 1           6               17.68%
Andy Rathbone                  2           3               9.18%
Gary Rosenzweig                3           61              59.19%
Nancy C. Muir                  4           24              25.26%
David Pogue                    5           1               7.57%
Greg Harvey                    6           4               9.10%
Edward C. Baig, Bob LeVitus    7           44              41.49%
Patrick Kanouse                8           148             90.02%
Brad Miser                     9           30              28.52%
Dan Gookin                     10          5               7.75%
Stephen L. Nelson              11          8               9.49%
John Walkenbach                12          10              10.17%
Wallace Wang                   13          13              11.10%
Scott Kelby                    14          2               4.55%
Peter Weverka                  15          19              11.41%

The noticeable change is that Scott Kelby takes the number one spot from a dollar perspective even though David Pogue sells more units. Books with slightly higher prices enable this movement in rank. Notice that Rita Mulcahy does not make the Top 15 for units sold, and yet as an author her books are ranked number seven in dollars generated. Another interesting observation is that there are about six authors who are in the mix for the top spot all-time, yet there is a significant drop-off after the top six. In the dollars view, the drop-off is even more significant after the top two authors.

When you look at the data for the top 15 authors (basically, who has produced more units and dollars), you get the following two charts, showing lifetime sales (2004-2011).

[Charts: lifetime author sales, Units and Dollars]

In 2011, Paul McFedries had his name on 56 different books (ranging from 2001 through 2011) that made our list, for an average of 1,937 units per book. His books sold the most units in 2011 but his average was the lowest of the all time top-five authors. His total was about 22,000 more units than David Pogue who saw 16 of his titles make the list with an average of 4,789 per title.


Electronic distribution and sales

Now let's move past print sales in 2011, or at least partially away from traditional channels of distribution, to discuss e-distribution. The three charts immediately below are from Bowker, which has recently released its Results Of Global eBook Research. They cover awareness of for-pay content, digital consumption by gender, and digital consumption by age. What is interesting to me is not that Indian males lead the way in both digital downloads and purchasing for-pay content, but rather that more women than men in the U.S. and U.K. are consumers of digital content. It is also no surprise, at least to me, that the 25-34 age group is the most active in consuming digital content.


[Charts from Bowker: Awareness of Paid Content; Digital Consumers by Sex; Digital Consumers by Age]

Now to take you into the tech book digital market, let's look at what has happened in the past few years with O'Reilly products. The chart immediately below shows our digital products aggregated into one number and then plotted by year and month. This gives you a perspective of how things are changing. What it does not show is that early digital copies were all PDF files that were pretty clumsy and not as useful. Now we offer our content in virtually any form our readers prefer. So with Mobi and EPUB, we are seeing the less useful PDF decline significantly. But the chart below groups all digital forms together. Only two months in 2011 were not ahead of 2010. Those two months were June and July, which coincided with Borders' liquidation of physical products.


I also think it is important to look at what O'Reilly customers purchased when visiting The chart below shows our content mix for the previous two years. The only thing declining in 2011 was our total sales for print products. Rough Cuts are early access editions of our content that are accessible on Safari. As you can see, our ebooks outsell everything by nearly 4 to 1.


What I am not sure exists is a good indicator of what the startup community uses for technical content and books. But if I had to bet, I would wager that ebooks direct from the publisher are the preferred format in the startup world.

The four charts below show O'Reilly revenue and units growth through The reason I am showing these is because the same content that goes into our print books is available in various digital forms. It is quite obvious that our customers prefer to shop on for digital copies.

The two charts on the left are showing revenue (top-left) and units (bottom-left) for 2011 exclusively. The two charts on the right are both revenue and units but are showing the trend for the previous four years. From talking with other publishers, this high-growth trend for digital books is indicative of what is happening in the market. The ebooks are just digital versions of our print products. We have not come to a point yet where the digital edition is a native creation that is a blend of live, editable code, video, text, images, links, assessment tools, and other resources all working together. At this point in time, most digital products are typically print books with a few links and some color for good measure. But don't blink because the tech book market will change quickly to these more blended content types.

[Charts: O'Reilly Product Mix - Revenue 2011; O'Reilly Print vs. eBook - Revenue Trend; O'Reilly Product Mix - Units 2011; O'Reilly Print vs. eBook - Units Trend]

Again, this data is taken from direct sales for O'Reilly and, and may not represent the whole computer book market. One point that was recently brought up in discussions at O'Reilly by our VP of online is that O'Reilly is selling more copies of digital editions than any other distributor that carries our digital copies. I think that may be due to the fact that we have DRM-free content that allows you to move your copy of your purchase to another device. For another perspective on DRM, I wrote this for our author newsletter a while back and I still believe that the ideas are sound. Have a look here.

Another key ingredient to understanding what is happening in the digital world is to look at Safari Books Online. Safari is a subscription service with more than 500,000 users. Its main focus is its B2B service that allows developers from many of the largest companies in the world to have access to technical books from most of the major publishing houses and imprints. One notable difference is that the categories with consumer-oriented titles, including the Digital Media titles, do not perform as well in Safari. Developer titles rule in Safari; so as a proxy, Safari may be one of the better predictors of a tech book market. As you can see from the chart to the left, our content in Safari is growing at a nice steady rate. In fact it is safe to say that Safari represents the second largest distribution channel for O'Reilly, with Amazon still occupying the top spot and O'Reilly direct battling for third. It will be interesting to see how the distribution of technical content unfolds in the coming years.


If you look at word clouds for the titles published in 2011 for all books, and the ones found on Safari for O'Reilly during 2011, you notice some similarities. Notably, "Development" and "Programming" are big in both images, but slightly larger on Safari. I was initially not sure why "Control" was so large on Safari, but after a bit of digging I found that Tidbits Publishing has a series called "Take Control" and O'Reilly Media is a distribution partner for them in Safari.


All print titles

Safari for O'Reilly

Thank you for reading these posts. If there is something that you are itching to see / understand more clearly, please let me know and I will try to help. I plan to excerpt updated pieces of these posts on Twitter or Google+ throughout the year. They'll come from @mikehatora or +Mike Hendrickson and will likely get re-tweeted by @oreillymedia or +O'Reilly on Google+.

April 04 2012

State of the Computer Book Market, part 3: The Publishers

In this third installment (see Post 1 and Post 2; Posts 4 and 5 to come soon), we will look at how publishers fared in 2011, as compared to 2010. The chart below shows our dashboard view of the large publishers' results for 2011. The most notable piece of information is that Wiley continues to hold the leading spot as the largest publisher (with 32% market share of units sold), while Pearson and O'Reilly both lost 1%, which was picked up by Cengage and McGraw Hill. (We'll look at revenue share later in the analysis.)


// oAnth - via RSS not usefully presented - please go to the linked article.

Next up, Post 4 will contain more analysis of programming languages. Post 5 will look at digital sales.

April 02 2012

State of the Computer Book Market, part 2: The Categories

In this second installment (the first post can be found here), we look at computer book sales in specific technology categories.

Remember that we've organized the data into six "Category Families" — Systems and Programming, Web Design and Development, Business Applications, Digital Media Applications, Consumer Operating Systems and Devices, and Computer Topics.

Within each of these Families are category group, super-category, category, and atomic category, in a five-level hierarchy. For example, Systems and Programming includes the category groups programming languages, databases, software engineering, general programming, security, and so on.

In the rest of this post, we will contrast 2011 with 2010.

As a refresher, here are two treemaps of the Category Families, with their sub-areas for the final quarters of 2011 compared to 2010. The map on the left shows the growth of the count of titles in each area and the map on the right shows the growth in units for each area.

[Treemaps: Count of Titles (left); Units (right)]

The Treemap on the left shows the number of new titles entering the Top 3000 in 2011. Security General (upper-left center), Data Analysis (left-bottom center), iPad-consumer (middle-bottom center), MacOSX (middle-bottom center) and HTML5 (upper-right corner) were the brightest green growth areas in 2011.

The Treemap on the right shows the top growing areas from a units perspective. The same areas are the top performers, but they have moved around a bit and are larger in some cases, which reflects their market share. Again, this is comparing the last quarter of 2011 with the last quarter of 2010. This time period reflects the holiday shopping season and is usually the best period for consumer topics, though not necessarily for the more technical titles, which peak early in the new year.

In the next two images, you can see how our Category Families stack up. The image on the left shows the number of titles that made the Top 3000 in a given year. Contrast that with the image on the right, which shows the number of units sold in each year. What you will notice is that the number of titles in Systems and Programming went up in 2011 to its highest level since we began tracking, yet the units sold for the Category have been going down each year. Consumer Operating Systems and Devices and Computer Topics are the two areas that went slightly up in both the number of titles and units sold in 2011. Systems and Programming is still the largest category and is a chief indicator for the health of the computer book market, and it's been in a consistent decline, at least for print books. You'll see some more positive indicators in my upcoming post on digital distribution.

[Charts: Count of Titles (left); Units (right)]

The table below shows each Category Family's compared growth between 2010 and 2011 (YoY Growth), 2010 and 2011 ranking (10Rank/11Rank) and 2010 and 2011 percent of market share (10Share/11Share).

Category Families             YoY Growth   10Rank   11Rank   10Share   11Share
Business Applications         -0.45%       2nd      2nd      21.00%    20.60%
Computer Topics / Other       15.78%       6th      6th      3.15%     4.11%
Consumer Operating Systems    4.22%        3rd      3rd      15.44%    17.27%
Digital Media                 9.29%        5th      5th      17.27%    18.58%
Systems and Programming       -0.64%       1st      1st      34.62%    35.02%
Web Design and Development    -2.58%       4th      4th      14.32%    13.72%

Before we look into categories further, let's first take a look at the words that make up all the computer titles for 2011. It's an interesting view of the words that the publishing industry puts on the front of books, in online searches, and anywhere there is metadata about content. A note about this data: I threw away the stop-words like "the," "and," "it," "with," etc. I also disregarded "Microsoft," since it is a descriptor used for various products and is redundant. Here is the "title" view of the market. What obviously pops out to me is Programming and Development, but Data came from nowhere to being a discernible word on the image [located @ 10:00 on a clock].


As the market keeps declining, the response of many publishers is to increase the number of titles published, in an attempt to gain market share. Immediately below are two bar graphs showing the trend for how many titles made it into the Bookscan dataset in a given year, and the average units sold for all titles. So this is the non-obvious point here: There are not necessarily more titles being published, but more titles making it into the dataset. This could be attributed to a lower threshold to get in. In other words, some weeks the threshold to make the Top 3000 list can be as low as 1 unit sold. It is a relative measure. The last couple of years have had lower thresholds, and thus more titles made the list but with worse average units. When the market is healthy, the threshold moves up and only the solid-performing titles make it into the Top 3000. The lower threshold is resulting in a significant decrease in the average units per title for all publishers.

[Charts: Number of Titles; Average Units]
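To make the threshold idea concrete, here is a minimal Python sketch. The weekly unit figures and the list size of 5 are invented for illustration (the real list holds 3,000 titles), and `weekly_threshold` is a hypothetical helper, not something Bookscan provides.

```python
def weekly_threshold(unit_sales, list_size=3000):
    """Return (threshold, average units) for one week's top-N list.

    The threshold is the units sold by the last title that still makes
    the list, so it is a relative measure: it depends on how every other
    title sold that week.
    """
    if not unit_sales:
        raise ValueError("need at least one title")
    top = sorted(unit_sales, reverse=True)[:list_size]
    return top[-1], sum(top) / len(top)


# Hypothetical weeks, using a 5-title list to keep the numbers readable.
healthy_week = [900, 400, 120, 60, 30, 12, 9, 4, 1]
weak_week = [700, 200, 40, 8, 3, 1, 1]

for label, week in (("healthy", healthy_week), ("weak", weak_week)):
    threshold, average = weekly_threshold(week, list_size=5)
    print(f"{label}: threshold={threshold}, average units={average:.1f}")
```

In the weak week the same list size lets a 3-unit title onto the list, and the average units per listed title falls along with the threshold.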

When we drill into the category families a bit, we see that seven of our 10 top categories (known as super-categories) sold fewer units in 2010 than in 2009, for a net loss of 244,936 units for just the top 10 areas. In other words, our bigger and typically more stable areas were selling significantly fewer units in 2010. In the first half of 2010, there were 49 super-category areas that were ahead in sales over the first half of 2009, yet six of the 49 slowed down and ended up losing enough ground to show a year-over-year decrease in units. We ended up with 43 super-categories producing more units in 2010 than they did in 2009.

The biggest winners in growth order are: Tablet, Mobile Programming, Windows Consumer, Security Topics, Hardware Topics, Social Web, Computers and Society, Cloud Computing, Information Technology, and Data Topics. The Tablet super-category went from roughly 15,000 units in the first half of 2010 to an additional 100,000 units in the second half of the year. An increase in titles fueled this growth — output tripled from 7 titles in the first half of 2010 to 22 titles by the year's end.

The areas with the largest drop in units were, in descending order: Web Page Creation, Digital Photography, Mac OS, Flash, Web Programming, Web Design Tools, Personal Computers, Linux, Software Project Management, and Personal Database. The category that surprises me the most is Web Programming. Sixteen fewer titles in the Web Programming area made the list in 2010, and only 7% of the titles sold more than 1,000 units, as compared to 11% in 2009.

The table below provides a view of the market's erosion. The Average Min value represents the "low threshold" weekly average during a given year. The Average Max is the high-range weekly average for a given year. Number of Titles is self-explanatory. You will notice that the years with the highest min had fewer overall titles represented in the data. The bottom line is that as the market erodes, it appears as though we are seeing a watering-down — more titles producing fewer units on average.

Year   Average Min   Average Max   Number of Titles
2004   9.2           1,133         7,451
2005   9.6           1,099         7,123
2006   9.6           1,315         6,881
2007   9.4           1,348         7,092
2008   8.2           1,534         7,310
2009   7.3           1,057         7,557
2010   6.7           1,112         7,792
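One way to check the "higher min, fewer titles" observation is to compute a correlation between the two columns. The sketch below uses the seven rows from the table above and a hand-written Pearson coefficient. With only seven data points this is a rough sanity check, not a statistical claim.

```python
from math import sqrt

# Figures copied from the table above (2004 through 2010).
average_min = [9.2, 9.6, 9.6, 9.4, 8.2, 7.3, 6.7]
number_of_titles = [7451, 7123, 6881, 7092, 7310, 7557, 7792]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(average_min, number_of_titles)
print(f"correlation between Average Min and Number of Titles: {r:.2f}")
```

The coefficient comes out strongly negative, which is consistent with the watering-down reading: when the weekly floor drops, more titles squeeze into the data.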

So it could be said that we've been in a bit of a tech innovation slump. But in my opinion we are in a distribution slump or holding pattern. By that I mean that we have print books and digital versions of the same thing, but have you seen any really innovative format for a tech book hit the market lately, something like what Khan Academy has done with other parts of education? They certainly have a long way to go to build out a Computer Science Curriculum. I think before publishers say we are in a tech slump, we need to look inside our own walls first and realize that we may be in a publishing slump as our consumers want different educational experiences.

Now let's look at the categories that comprise each category family. Below are some individual trend charts from our dashboard showing the 24-month period from January 2010 to December 31, 2011 for the major categories. By looking at a 24-month pattern, you get more insight into whether or not a particular area seems to be hit by seasonal factors, and whether there is a steady decline or increase for the category. It is important to look at scale on these charts because it visually shows you the relative market size. Another way to think about it is that if the trend line is high in the individual box, the category is big, and if it is low, it is a smaller category. What is interesting to note is that Consumer Operating Systems and Devices, Digital Media, and Business Applications all have a January spike, which is likely due to individuals buying "how to" books for their new computers, devices, and operating systems. This is a consistent seasonal pattern.

[Dashboard charts: Systems and Programming; Business Apps; Consumer Ops and Devices; Web Development & Design; Digital Media; Computer Topics]

The Categories (24-month rolling, January 2010 — December 2011)

When viewing the charts below, keep the reference charts above in mind. Viewing these jointly provides more context on the size of market and seasonal patterns.

Category_Family: Consumer Operating Systems and Devices

Here are the trend lines for the five main categories (cat_family) that make up Consumer Operating Systems and Devices.


This category is a medium-sized area and was one of three Category Families to show growth year-over-year. This category's growth is driven by the iPad, the iPhone and the Nook in the Portable Devices sub-category.

The consumer operating systems and devices market shows ups and downs each year and pretty closely reflects what is going on in the whole market. If you compare the growth of Mac OS X with Microsoft Windows, the Windows books had an increase in 2010 but both declined in 2011. The chart below shows how these two stack up against each other. Foreshadowing the 2012 results, I believe that the Windows category will be up because Windows 8 will ship and be a significant upgrade for most. I believe that Apple will continue to decline as they roll out $29 upgrades that are minimal. The iPad, iPhone, and Android devices will continue to soar.


Category_Family: Business/Office Applications

When comparing the Business Apps area for 2010 and 2011, there were 12 super_cats (one level below cat_family) that performed ahead of the prior year and 21 that underperformed compared to the prior year. The 21 underperforming super_cats only lost 2,090 more units than the 12 positive areas had gained, for an overall -0.44% growth rate.

The three healthiest super categories were Office Suites at 7.49% growth, Collaboration Technologies at 12.43% growth and Social Network (Facebook) at 11.73% growth, while Presentation Topics at -11.88%, Accounting at -8.21%, and Search at -15.30% saw the biggest drop in units for this Business category. It is interesting to see that Spreadsheets is pretty much the same as the market. It saw a very slight uptick, 88 more units in 2011 than in 2010, and is still a large super category in rank. Spreadsheets trails only Digital Photography and Tablets for the top spot as the biggest super category.

Here are the trend lines for the eight categories that make up Business/Office Applications.




Notice how much bigger a category "office" is than the other two ("gen bus app" & "design"). But the news in this category is that Office titles have slightly stabilized, having gone from a 4.66% decrease last year to a 0.48% decrease this year. This decline mirrors the overall market. The category has been dominated by entry-level user books. These sorts of entry-level books are driven by series that have consistent promises, and Dummies and Microsoft Press each held four spots in the top 10 best sellers list for this category. This does make sense when you think about it. I said last year that it looked like Dummies had a bit of a book dynasty, so to speak, but in 2011 Microsoft Press rocketed into this space. The Web Apps category chart is mostly dominated by books on Facebook. Who would have thought you'd need a book on how to use Facebook? These are not books on programming the Facebook APIs, but rather on how to use the social network. Foreshadowing 2012, I expect that this Category Family will continue to do well as Windows 8 will undoubtedly create more demand for Office books in 2012.

Category_Family: Web Design and Development

Web Design and Development is down 4.36% from 2010 to 2011, with 37,438 fewer units sold in this category in 2011 than in 2010. And remember, 2010 was one of the worst years we've seen in a while for this category. Eight sub-areas showed growth, led by HTML5 at 74.60% growth, JavaScript at 17.32%, and Social Web at 9.36%. If we combine HTML5 and JavaScript because they are very closely related, the combined growth rate is a healthy 41.39%, with 45,559 more units sold in 2011. O'Reilly has three of the top five books in this area, with Learning PHP, MySQL, and JavaScript leading the category in unit sales for two years in a row. Head First HTML with CSS & XHTML and JavaScript: The Good Parts also cracked the top five for us.
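The combined HTML5 and JavaScript rate is a weighted average of the two individual rates, weighted by how many units each area sold in 2010. As a rough sketch, and assuming that the 45,559 figure is the sum of both areas' gains and that the published percentages are exact, the implied 2010 sizes can be backed out as follows. The resulting bases are estimates derived from the quoted rates, not reported figures.

```python
def implied_bases(rate_a, rate_b, rate_combined, combined_gain):
    """Solve for the prior-year units of two areas.

    Uses two equations:
        rate_combined * (a + b)   = combined_gain
        rate_a * a + rate_b * b   = combined_gain
    Rates are fractions (0.746 for 74.6%).
    """
    if rate_a == rate_b:
        raise ValueError("rates must differ to separate the two bases")
    total_prior = combined_gain / rate_combined
    a = (combined_gain - rate_b * total_prior) / (rate_a - rate_b)
    return a, total_prior - a


# Rates and unit gain quoted in the text above.
html5_2010, javascript_2010 = implied_bases(0.7460, 0.1732, 0.4139, 45_559)

print(f"implied HTML5 units in 2010:      {html5_2010:,.0f}")
print(f"implied JavaScript units in 2010: {javascript_2010:,.0f}")
```

Because the inputs are rounded percentages, the outputs should be read as ballpark sizes only. The point is that a 41.39% blend of a 74.60% area and a 17.32% area implies the faster-growing area started from the smaller base.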

The areas that surprised me the most, though, were Web Programming, which saw ~25,152 fewer units sold in 2011 than in 2010, or -23.61% growth. Close behind were Web Design Tools, with 16,534 fewer units for -23.67% growth, and Web Development, with 11,684 fewer units and -28.54% growth. Yet HTML5 and JavaScript are growing. This is a bit perplexing but could be attributed to developers wanting more specific topics rather than broad-reaching topics and tools. In Web Design Tools it is mostly Dreamweaver's fall that pulls this category down. In Web Development it is website-creation books for "beginners" and "dummies" that have fallen the most.

Here are the trend lines for the eight categories that make up Web Design and Development.




Obviously the big sub categories here are "web design" and "web development." This area is dominated by titles that talk about performance, scalability, reliability, and tuning, similar to what you will find at our Velocity Conference. Foreshadowing for 2012, the area to watch is JavaScript. Doesn't everyone need to know and learn JavaScript?

Category_Family: Systems and Programming

This is the largest of our top-level category families. It is the place where most of the programming language, database, and software development titles reside. There are now 73 super_cat subcategories (super category) in this area and in 2011, 46 of the areas were negative year-over-year and only 27 areas had growth. There were -68,0295 fewer units sold in these areas during 2011. This is only a -3.14% decline, so this large family of titles actually performed slightly worse than the overall market. Mobile Programming and Data Analysis were the two biggest growing areas. Mobile Programming produced 30,636 more units for a 38.84% growth rate while Data Analysis produced 22,925 more units for 22.42% growth rate in 2011.

The top five performing categories, in order, were Mobile Programming, Data Analysis, Security Topics [+9,648 units / 5.53% growth], Java [+7,316 units / 7.33% growth], and Python [+4,886 units / 10.55% growth]. The categories with the worst performance, in order, were IT Certification [-19,078 units / -31.50% growth], Windows Administration [-14,852 units / -14.13% growth], Microsoft Programming [-13,491 units / -27.70% growth], C# [-12,993 units / -20.26% growth], and Network General [-11,234 units / -19.04% growth].
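Each unit change paired with a growth rate also implies how large the category was in 2010, since growth rate = unit change / prior-year units. The following sketch backs out those sizes from the figures quoted above. It assumes the quoted percentages are exact, so the results are approximations, and `implied_prior_units` is a hypothetical helper name.

```python
def implied_prior_units(unit_change, growth_pct):
    """Back out prior-year units from a unit change and a growth rate in %."""
    if growth_pct == 0:
        raise ValueError("a 0% growth rate says nothing about the base")
    return unit_change / (growth_pct / 100.0)


# (unit change, growth %) pairs as quoted in the text above.
reported = {
    "Mobile Programming": (30_636, 38.84),
    "Data Analysis": (22_925, 22.42),
    "Security Topics": (9_648, 5.53),
    "Java": (7_316, 7.33),
    "Python": (4_886, 10.55),
    "IT Certification": (-19_078, -31.50),
    "Windows Administration": (-14_852, -14.13),
    "Microsoft Programming": (-13_491, -27.70),
    "C#": (-12_993, -20.26),
    "Network General": (-11_234, -19.04),
}

for name, (change, pct) in reported.items():
    prior = implied_prior_units(change, pct)
    print(f"{name:24s} 2010 ~ {prior:>9,.0f}   2011 ~ {prior + change:>9,.0f}")
```

Seeing the implied sizes side by side helps separate big categories with modest percentage moves from small categories with dramatic ones.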

In the top-performing area of Mobile Programming, iOS was nine times as large as Android in 2009 and roughly 2.5 times as large in 2010, and today iOS developer books sell only 1.2 times as many copies as Android developer books. Again, these are developer books, not consumer-oriented titles. As for how the mobile developer market is shaping up, it looks like a two-horse race between iOS and Android. Windows Mobile is a blip, along with cross-platform solutions like PhoneGap, as you can see in the image directly below.


This chart shows the number of units (sum of Unit in blue bars) and the average units per title (AvgUnitsTitle in red line) for the mobile area. Android has a higher unit average, whereas iOS has more units sold because more titles made the list. This all makes me wonder about the Windows Mobile blip and whether Microsoft should just jump into the Android space too, or continue to make more from licensing it than from their own platform.

Here are the trend lines for the 12 categories that make up Systems and Programming.





Next up, Post 3 will be about the publishers, winners and losers. Post 4 will contain more analysis of programming languages. And Post 5 will look at digital sales.

March 29 2012

State of the Computer Book Market, part 1: Overall Market

Since last year's State of the Computer Book Market posts, the Tech Book market has been going through some major changes, but none has affected our industry more profoundly than Borders Group Inc (BGI) going out of business. Much of what you will see in the 2012 trends is directly related to BGI's demise, though the faint signals that the book market provides to other industries are still evident.

You can get a quick refresher on how we see Computer Book Sales as a Technology Trend Indicator and our other posts on the State of the Computer Book Market.

The data in the posts that will follow are from Bookscan's weekly top 3,000 titles sold. Bookscan measures actual cash register sales in bookstores. Simply put, whenever you buy a technology-oriented book in the United States, there's a high probability it will get recorded in this data. There are now two major retailers in tech books, Barnes & Noble and Amazon, and they make up the lion's share of Bookscan's recorded sales.

Overall Book Market Performance

Before we get to the specifics of the computer book market, let's get some context by looking at the whole book market for the week ending December 25, 2011. Everything that is printed, bound, and sold as a book, from Steve Jobs and Heaven is for Real: A Little Boy's Astounding Story of His Trip to Heaven and Back to Diary of a Wimpy Kid: Cabin Fever and StrengthsFinder 2.0 is represented in the table below.

Overall Book Market - EVERYTHING - Week Ending: 2011-12-25

All Books, All Subjects

Category | Share | YoY
Adult Non-Fiction | 40% | -8%
Computers | 1% | 2%
Adult Fiction | 19% | -11%
Juvenile Non-Fiction | 6% | 11%
Juvenile Fiction | 25% | 10%
Other | 9% | -30%
Total | 100% | -9.25%

As you can see, the computer market is up about 2% from last year, which is much better than the whole market being down more than 9%. It should be noted that the computer book market makes up only about 1% of total unit sales in bookstores and online retailers. Also, for this report and the following four posts, I am using data that is aggregated from weekly reports of the Top 3000 titles in Bookscan, so the numbers for all the long-tail titles are not included here. In other words, the data in the table above shows 2% growth, and yet our Top 3000 data shows 0.7% growth. This means that a considerable number of long-tail titles sold in 2011 yet did not make it past the minimum threshold to make a weekly Top 3000 report. If you would like to see the performance of the major book categories, this table shows percentage growth. I find it interesting that the Religion/Bibles category is the largest-growing category in an otherwise depressed market. Only 10 out of the 47 overall categories showed growth in 2011, and the computer category was one of them. We'll look at precisely what fueled the growth for the computer category in future posts.

Now on to the technology book market. The four charts below provide some perspective on how each year stacks up against prior years in units and title count, as well as average pages per book and average price per book. The third and fourth charts show how many new titles were published during a given year and what the threshold was to make the Top 3000 report on a weekly basis. The chart on the top left shows the overall units sold per year, with the red line showing the number of distinct titles that made the Bookscan Top 3000 weekly report. Looking at the two together provides better insight into what happened. Basically, as fewer units were sold, it looks as though more titles were published. Yet as you can see in the chart at the bottom of the four charts, there were fewer new titles in 2011 that contributed to the total units sold. This leads me to believe that the numeric threshold to get into the dataset of the Top 3000 titles sold on a weekly basis was lower than in previous years. In fact, the bottom right chart shows this was the case, as 2011 had the lowest threshold of all the years measured. So fewer books made the data, with fewer of them being new, and they had fewer pages and were priced lower on average than ever before. And yet the market showed a slight increase, so this must have come as a result of a few stellar performers that were likely titles published before 2011. We'll explore this in the coming posts, so hang on to this thought.

[Four charts: Units and Distinct Title Count; Average Pages and Average Price; Percent of New Titles; Minimum Threshold for Dataset]


Immediately below is the weekly trend (from the Top 3000 titles reported weekly) for the entire computer book market since 2004, when we first obtained reliable data from Bookscan. Please remember that the data represents all publishers, and not just O'Reilly. The slightly thicker red line represents the 2011 data.


The clear seasonal pattern that we've pointed out before still exists, but with more extreme fluctuations. That is, we have a strong start that declines through the summer, spikes for the fall "Back to School" season, and finishes the year strong. And each subsequent year closely mirrors the year before but with usually a percent tick or two lower. But the biggest news for 2011 was that we had some weeks that were clearly above prior years and one week that was the highest we've seen for all the years. However, this boon comes through the unfortunate Borders fire sale, and then the eventual liquidation of their remaining inventory. In my opinion, it is too early to predict what long-term effect we will see with Borders gone, because as you'll notice as soon as their doors closed, the market hit bottom. Not only did it hit bottom, but the market tanked further down than we have ever seen it. But the reason I still think we are not in a pattern of predictability yet is that the December climb up was higher than the four previous years. So in a nutshell, 2011 was a year of our highest highs and our lowest lows and that was likely due to the Borders situation.

What you won't see on this chart is that the computer book market suffered its biggest losses in 2001 and then began shrinking 20 percent a year for 3 years, until it stabilized in 2004 at about half the size it was in 2000. (We have consistent and reliable data going back to 2004.) You can now see a second cratering in the market that started in the second half of 2008 and continued through 2010 until the more volatile and unpredictable 2011. The overall market growth rates for the previous seven years are: 2005 = 1.48%; 2006 = 3.17%; 2007 = -2.00%; 2008 = -4.27%; 2009 = -15.31%; 2010 = -4.29%; 2011 = +0.7%
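To see what those seven yearly rates add up to, they can be compounded into a single index. The following is a minimal sketch, assuming the 2004 market is taken as a baseline of 1.0; the function and variable names are illustrative and not part of the original analysis.

```python
# Year-over-year growth rates (in percent) as quoted in the paragraph above.
yearly_growth_pct = {
    2005: 1.48,
    2006: 3.17,
    2007: -2.00,
    2008: -4.27,
    2009: -15.31,
    2010: -4.29,
    2011: 0.70,
}


def cumulative_index(growth_pct_by_year, base=1.0):
    """Compound yearly percentage changes into an index relative to `base`."""
    index = base
    for year in sorted(growth_pct_by_year):
        index *= 1.0 + growth_pct_by_year[year] / 100.0
    return index


# Market size at the end of 2011 relative to 2004 (assumed baseline of 1.0).
index_2011 = cumulative_index(yearly_growth_pct)
print(f"2011 units relative to 2004: {index_2011:.3f}")
```

Compounding matters here because each year's percentage applies to a smaller base than the year before, so simply adding the rates would slightly misstate the overall decline.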

So what about the market was news in 2011 other than Borders? In 2011, there were 21 weeks that were ahead of the same week in 2010. In 2010 there were 11 weeks that were ahead of the prior year's unit sales, and in 2009 there were only two. I am not willing to state that we are seeing a recovery in the market, because 0.7% growth is not exactly a strong signal in the right direction. To the optimists in the crowd, it appears as though we saw the bottom in October through November of 2011, but pessimists will believe that they've seen this before: the market looks as though it has hit bottom, but then takes another big hit downward. 2009 (the green line) was a turbulent rollercoaster ride to the bottom as well. So at this time the market is really too unstable to predict whether it will move down again or continue to recover.

Another way to look at the market is with the Treemap visualization tool. This tool helps us pick up on trends quickly, even when looking at thousands of books. It works like this:

The size of a square shows the market share and relative size of a category, while the color shows the rate of change in sales. Red is down, and green is up, with the intensity of the color representing the magnitude of the change. The following screenshot of our treemap shows gains and losses by category, comparing the fourth quarter of 2011 with the fourth quarter of 2010.
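The color rule described above can be sketched in a few lines. This is only an illustration of the idea, not the code behind our treemap tool: the function name, the RGB encoding, and the `saturation_at` cutoff (the rate of change at which the color reaches full brightness) are all assumptions.

```python
def treemap_color(roc, saturation_at=0.5):
    """Map a rate of change (as a fraction, e.g. -0.25 for -25%) to an RGB tuple.

    Positive change is green, negative change is red, and no change is black.
    Brightness scales with the magnitude of the change and is capped once
    the magnitude reaches `saturation_at` (an assumed cutoff).
    """
    if saturation_at <= 0:
        raise ValueError("saturation_at must be positive")
    intensity = min(abs(roc) / saturation_at, 1.0)
    level = round(255 * intensity)
    if roc > 0:
        return (0, level, 0)   # growth: shades of green
    if roc < 0:
        return (level, 0, 0)   # decline: shades of red
    return (0, 0, 0)           # flat: black


# A flat category, a strongly growing one, and a mildly shrinking one.
for change in (0.0, 0.65, -0.07):
    print(change, treemap_color(change))
```

The size of each square would come from the category's market share, which a treemap layout routine handles separately from the coloring.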


So what are all the boxes and colors telling us? First, remember that this compares the last quarters of the previous two years.

There were quite a few bright spots (bright green) during the last quarter of 2011. Take a look at Android Programming (in the upper-left box), the bright-green box showing growth from the fourth quarter of 2010. Right next to Android is iOS, which is larger in size but black, which means flat growth. You will also see Android again in bright green (upper-middle box). The difference is that the upper-left is for programmer-oriented books and the upper-middle is for Android consumer books [using your Droid]. Both had impressive growth in 2011 compared to 2010. In the upper-middle area are iPad and iPhone for consumers, which were flat or down. I wonder if the iPhone 4S was as big a hit as the iPhone 4. The iPad, the big black square in the upper middle, continues to impress with how big the box is at this early point in its evolution, and is flat because both years have netted lots of units sold. Look at Windows 7 just below iPad and notice that the iPad box is about 3/5 as big as Windows 7. That is amazing to me, as the iPad has grown to be almost as big as the business desktop operating system.

In 2010, Windows 7 was the number one growth area for units, followed by iPad, then Android (for consumers), and Android programming. This is unit growth, and a bit of the success for these technologies is that they are fairly new and do not have large market shares as a base to be measured against. Looking at longer-established technologies, Security, Network Security, and Digital Photography had strong unit growth.

I find it useful to organize the trends into four classifications: High Growth Categories (bright green), Moderate Growth Categories (dark green to black), Categories to Watch (all colors), and Down Categories (red to bright red). Most of these descriptions are self-explanatory, except perhaps Categories to Watch. This group contains titles that we've found are not typically susceptible to seasonal swings, as well as areas on our editorial radar. If there are categories you want to get on our watch list, please let me know.

The table below highlights and explains some of the data from the chart above, although the data is for all of 2011. The Share column shows the total market share of that category, and the ROC column shows the Rate of Change (RoC = (current_period - prev_period) / prev_period). So, for example, you can see that Mac OS books represent 2.79% of the entire computer book market, and were shrinking by -7.49% (RoC).
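For readers who want to reproduce the ROC column from their own numbers, here is a minimal sketch of the formula above. The unit counts are hypothetical values chosen only to illustrate the arithmetic; they are not taken from the Bookscan data.

```python
def rate_of_change(current_period, prev_period):
    """Return (current_period - prev_period) / prev_period as a fraction."""
    if prev_period == 0:
        # With no prior-period sales the ratio is undefined.
        raise ValueError("prev_period must be non-zero")
    return (current_period - prev_period) / prev_period


# Hypothetical unit counts for a single category in two consecutive years.
prev_units = 10_000
curr_units = 9_251

roc = rate_of_change(curr_units, prev_units)
print(f"RoC: {roc:.2%}")
```

Note that the formula is undefined for a category with no prior-year sales. The table reports 100.00% for Nook, which had no 2010 presence; the sketch raises an error for that case rather than guessing at the convention used.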

High Growth | Share | ROC | Notes
iPad | 4.46% | 65.57% | From no presence in 2009, to a 1.74% market share in 2011, to the largest market share for all topics in 2011. Ironically, physical books about the iPad sell.
Nook | 0.86% | 100.00% | This category had no 2010 presence, but is now the 29th largest category in market share. Again, the irony of this: content about a book reader sells well in physical form.
iPhone | 1.97% | 21.65% | This category has seen steady increases each year except in 2010. 2011 almost matched its high point in units, which was in 2009. Of all categories it is ranked 6th in 2011.
JavaScript | 0.98% | 18.79% | 2011 saw nearly 12,000 more units sold than 2010. 2006 was the high mark for JavaScript with 323 more units sold than 2011, but last year had the biggest growth year-over-year.
Computers and Society | 1.11% | 32.53% | This area reached its highest ever point for unit sales and moved to the 16th ranked area in 2011. It ranks 45th on the all time list.

Moderate Growth | Share | ROC | Notes
Java | 1.32% | 8.27% | A nice steady pattern for Java now. Growth in each of the previous three years. It is the 12th largest category overall and reached that same rank in 2011.
Web Design | 1.03% | 5.17% | This category pattern is like a roller-coaster, up and down. Overall it ranks 13th, but in 2011 it was 22nd even though it showed modest growth compared with 2010.
C++ | 0.93% | 6.87% | A language that was built to last, and used to build things that last. The overall sales pattern is inconsistent. Its overall rank is 19th and in 2011 it was 25th.
CSS | 0.78% | 6.34% | The pattern here has been downward since 2007, but in 2011 there was modest growth. Overall it is ranked 24th and in 2011 it was 31st.
Game Development | 0.73% | 8.96% | Growth in the previous two years is fueled in part by new books on mobile game development. About 15% of the new Game Dev books covered mobile.

Categories to Watch | Share | ROC | Notes
Office Suites | 2.71% | 6.88% | Our fourth largest category for all years measured and usually consistent. The fact that this category is up may be attributable to new laptops and desktops being purchased.
Digital Photography | 5.71% | -3.17% | A very large category with five 2011 titles producing more than 10,000 units sold. 2011 had 6 additional titles making the list, yet the category sold 11,533 fewer units than 2010.
Spreadsheets | 4.53% | 0.41% | The third largest category, with 3 titles selling more than 10,000 units; 13 fewer titles made the list in 2011 yet produced 1,172 more units than 2010.
Software Project Management | 1.97% | -4.74% | A good-size and consistent category, though it is down in units sold. 31 additional titles in this category produced 5,941 fewer units in 2011. PMP and Agile still drive the category from a units perspective.

Down Categories | Share | ROC | Notes
Windows Consumer | 4.36% | -49.63% | All time the second largest category, yet it saw 21 fewer titles make the list in 2011 and 137,958 fewer units sold. Perhaps a signal that tablets are becoming more important than the desktop?
Web Programming | 1.68% | -17.77% | Historically the 4th largest category, yet in 2011 37 fewer titles made the list and 19,083 fewer units were sold. This loss is attributed mostly to the PHP & MySQL books that used to dominate the web programming space.
Mac OS | 2.79% | -7.49% | This category is ranked #5 for all time, yet the past two years have seen significant decreases. 18 more titles made the list in 2011, but they produced 13,309 fewer units. Could it be that Snow Leopard and Lion were not significant enough releases? Or are tablets eroding this category too?
Flash | 0.77% | -45.30% | Overall ranked 14th, this category has dropped in market share from 2.01% to 0.77%; it ranked 34th in 2010 and now 45th in 2011. HTML5 seems to be the clear influence in this space.
Web Design Tools | 1.12% | -21.35% | All time ranking of 8th, yet this is the second year in a row this category takes it on the chin, with 15,196 fewer units sold and 8 fewer titles making the list in 2011. The decline in this category is mostly attributed to Dreamweaver.

Post 2 in this series will provide a closer look at the technologies within the categories. Post 3 will be about the publishers, both winners and losers. Post 4 will contain more analysis of programming languages, and Post 5 will look at digital sales.

