Colleen Doran gives some metrics on what everyone knows: Nielsen’s BookScan numbers are way low. Especially, it seems, where comics are concerned. She cites several books she has worked on, comparing her royalty statements and BookScan numbers:

According to Bookscan, it has sold 542 copies in hardcover. Ouch. What a bummer! This is accurate as of yesterday.

Except I got a royalty statement on this thing. And according to my royalty statement, this book sold 7181 copies by end of the accounting period, which was last summer. As of now, it has sold over 10,000 copies in hardcover. Respectable numbers. Not tearing up the charts, but enough to issue a new edition.

So, the accumulated Bookscan numbers are a good 93% off my actual reported sales from my publisher, as of my last royalty period.
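
For anyone who wants to check the arithmetic behind that percentage, here is a minimal Python sketch using only the two figures quoted above (542 copies per BookScan, 7,181 per the royalty statement). The function name `shortfall_pct` is an illustrative label, not anything from Doran's post.

```python
def shortfall_pct(reported: int, actual: int) -> float:
    """Percentage of actual sales that the reported figure fails to show."""
    if actual <= 0:
        raise ValueError("actual sales must be positive")
    return (actual - reported) / actual * 100


bookscan_copies = 542   # BookScan hardcover figure quoted above
royalty_copies = 7181   # royalty-statement figure quoted above

missed = shortfall_pct(bookscan_copies, royalty_copies)
print(f"BookScan misses about {missed:.1f}% of the sales on the royalty statement")
```

That works out to roughly 92.5%, in line with the "good 93%" in the quote.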

Doran posts several other anonymous books with similarly inaccurate BookScan reports. Her body of work includes books with Neil Gaiman and Warren Ellis, so you can guess some of these actual sales are high. Jim Ottaviani chimes in in the comments with his own stats, which are similarly low.

This is something I’ve heard repeatedly from comics folks—BookScan is generally considered to be about 60%-75% of a book’s sales, but for comics it seems to be lower still. For instance, even this verified million seller has to note that its BookScan numbers are low.

So should we just ignore BookScan? Not entirely. It’s still a useful metric for a given set of sales outlets. Once again, as with the Diamond charts, the numbers are NOT absolute. But they do reflect placement, and as the annual leaking of the numbers approaches, we should keep in mind that they do reflect sales that never show up in the Diamond charts.


  1. Similar story with Bookscan here in the UK. I was hugely disappointed when I got my Bookscan numbers at first. Then when my publisher’s statement came in – a vast difference, and things were looking much better (in fact I’ve gone to reprint twice). I’d say about 15% of sales showed on Bookscan! I agree – it is still useful though.

  2. One question I’d want to ask out loud (as, y’know, the guy who writes up the annual BookScan numbers) (Assuming I get them each year… I don’t have 2012 yet!), that doesn’t appear to be at all clear here is if Colleen is accounting for the channel difference between BookScan and The DM? (or even library sales, book fairs, book clubs, etc etc etc)

    By this I mean that (to the best of my knowledge), NO Direct Market retailer reports to BookScan whatsoever, and that BookScan ONLY represents the stores that report to it. Therefore, NO DM sales are included whatsoever, and that could, in some/many cases, account for that 93% difference.

    Example: Colleen cites “Spider-Man: Died in Your Arms Tonight” as only selling 188 copies, in all formats, in the last two years. Except…. the DM charts show that there’s ALSO 1557 copies in HC in Nov ’09, and 1747 in TP in April of ’10 (just linking to the latter because of the Beat’s spam filter: http://www.icv2.com/articles/news/17450.html). There’s your 94% right there.

    (Plus? DCD only reports sales that meet each month’s sales threshold [in the link above it was 332 copies for the month] — which means it’s mathematically possible for a book to sell another ~3-5k copies a year in the DM [or the equivalent of a single month’s top 5 placing], and be entirely invisible from DM reporting.)

    In other words, I think that it *is* possible that “Spider-Man: Died In Your Arms Tonight” sold something close-ish to 188 copies, in all formats, in two years, THROUGH THE “BOOK MARKET”. That seems entirely plausible to me, actually.

    I’m less sure that the problem is with BOOKSCAN, as it is with people’s UNDERSTANDING or REPRESENTATION of what BookScan *IS*, if that makes sense? BookScan ONLY represents sales to retail stores that report to BookScan…. BUT, since Amazon reportedly reports to BookScan (along with all known book-focused chains) (but not WalMart, which could shift the needle, propitiously, in some cases), it really should be reporting a pretty meaningful percentage of sales through the “bookstore channel”

    However, THAT number is absolutely ADDITIVE to several OTHER numbers, including DM numbers, library numbers, book fair numbers, small independent specialty store numbers, warehouse club numbers, and so on and so forth. For many projects — especially super-hero focused ones, or for DM-primary authors/illustrators — DM numbers *can* be the lion’s share of your volume, and you should be sure that your representation is communicating that to decision-making publishers.

    (I’m fairly happy to look up year-by-year BS numbers and DM numbers for any author who’d like to get a better understanding of where their market splits might be coming from — just contact me)
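
As a rough check on the channel-split point in the comment above, this Python sketch adds the BookScan figure to the two Direct Market chart figures cited for the Spider-Man book. It assumes those three numbers are the only known sales, which almost certainly understates the real total (libraries, below-threshold DM months and so on are left out). The helper name `channel_shares` is made up for illustration.

```python
def channel_shares(units_by_channel: dict) -> dict:
    """Return each channel's share of the combined known units, as a percentage."""
    total = sum(units_by_channel.values())
    if total <= 0:
        raise ValueError("need at least one unit sold")
    return {name: units / total * 100 for name, units in units_by_channel.items()}


known_units = {
    "BookScan (all formats)": 188,      # figure cited from Colleen's post
    "DM hardcover, Nov 2009": 1557,     # Direct Market chart figure cited above
    "DM paperback, Apr 2010": 1747,     # Direct Market chart figure cited above
}

dm_total = known_units["DM hardcover, Nov 2009"] + known_units["DM paperback, Apr 2010"]
print("Direct Market units:", dm_total)
for name, share in channel_shares(known_units).items():
    print(f"{name}: {share:.1f}%")
```

On these numbers the Direct Market accounts for 3,304 of 3,492 known copies, so BookScan sees only about 5% of them, which is where the "94%" gap comes from.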


  3. Thank you for all that Mr Hibbs, but after reading your response, I’m not at all sure you read my article.

    I stated repeatedly that these numbers were low and were probably due to the direct market:

    “And, in my experience, not even close, especially if a chunk of your sales went to the comic book market, or to libraries, or digital. ”

    The point of the article is that Bookscan is repeatedly represented as an accurate depiction of book sales, and I linked to a Canada Globe and Mail story in which publishing clients flatly state that authors get picked up and dumped on the basis of Bookscan data. Which is not even remotely accurate if the majority of your sales aren’t in Barnes and Noble. And many of us have books that sell outside that venue.

    “The takeaway for me is that people who do a lot of GN work that sells in comic shops are going to find their numbers way WAY off. In every case, my royalties report higher numbers than Bookscan. But if Bookscan numbers are the numbers that publishers use to judge the commercial viability of a creator, then that’s bad for many people in comics whose primary market is the direct market.”

    I think I was pretty clear.

    Thank you for your time.

  4. “Mr. Hibbs”? Boo! I don’t want to have to call you “Ms. Doran” in response!

    Honestly, Colleen, I don’t think your piece is that clear about the distinction between channels, because you write things like this: “Is this discrepancy unique to graphic novels because of comic book market distribution? Is Author Central not getting all the Bookscan data? Or is Bookscan really that off the mark?”

    This makes it SOUND like you don’t “get” what BookScan is — There isn’t “more” data to get somewhere, the dataset is what the dataset is. Where the problem is is that there are fifty OTHER sets of data that aren’t counted in BookScan… because they’re not *meant* to.

    I’m sorry if I misinterpreted your meaning.

    I’m not sure Heidi got it either? This piece is titled “Just how accurate is BookScan anyway?” not “By how much does BookScan miss strong sales outside of the ‘bookstore market’?” Thems two different questions.

    Now, if the complaint is that people are REPRESENTING BookScan to be something that it isn’t (ie: a record of all sales, in all channels), then those people aren’t very wise, and if executives at Book Publishing companies are making decisions on whether or not to publish an author, using incomplete information, then, as I said, the creator’s representation should be providing the accurate information as clearly as they can.

    The DM portion of this information isn’t especially secret (http://www.icv2.com/articles/news/1850.html is a good bookmark to file for those of you who don’t have it), but I don’t know of any public way to get data for any of the other channels, whatsoever.

    If book publishers are working with an expectation that “BookScan is it”, then it is up to authors to explain all of the places they have audiences, but I can’t imagine there are many book publishers who *would do a good job of actually selling comics work* who could be that myopic and misunderstanding of how the markets work? Or, at least ones who have more than a season or two of comics in the first place….

    I actually think that it is at least as likely that said un-named publishers are using “BookScan performance” as an excuse, rather than an actual reason.


  5. Perhaps BookScan is just more useful to marketers of products than it is to the producers of the products (books). As you state, BookScan measures the sales of books through channels. It’s not as though the publisher of a book doesn’t have the sales figures for a book if the author needs them. If the book sells better through some channels than others, that’s useful information, regardless of whether BookScan is used for those channels.


  6. Here’s a general rule in bookselling:
    A book gets three months to sell. (Regardless of format.)

    That’s the general cycle for a new book. Reviews and publicity will generate sales, and after that, it’s all word-of-mouth (and possibly another news cycle due to bored journalists noticing a trend).

    After three months, sales are evaluated, models are adjusted, and stock returned. It is VERY rare for a book to sell well in hardcover for a year (when the paperback is usually scheduled to replace the hardcover), either as a bestseller, or just enough to generate an automatic reorder of one copy in stock.

    That’s pretty fair, as each season has a new slate of books, store real estate is limited, and the rent must be paid. (Want a curated bookstore that keeps the best titles in stock? Visit your local library.)

    Of course, a store can ignore dictates from corporate and build a selection which feeds a clientele and generates a positive reputation for the store. I did that with the graphic novels when I was a bookseller, scanning for new titles, keeping older volumes in stock, and convincing management to let me merchandise titles creatively.

    Publishers Weekly once published annual lists of titles, sorted by ballpark ranges of sales. Has a 2012 list been collated, and how does PW tabulate sales?

  7. Yes, Mr Hibbs, you didn’t get my meaning and thank you for letting me know. I am sorry I was unclear to you.

    I won’t belabor this much longer, except to point out that I’m not sure you yourself quite get how publishing works in terms of how an illustrator like me might get a job. I don’t think you realize just how little many people in traditional publishing houses understand about the Direct Market. And that if a client is looking for an illustrator, they are not necessarily going to go to my “representation” first. They are going to check out my website and scope out my track record, possibly without even contacting me.

    You incorporate the assumption that major book houses have any interest or any need to understand how selling comics works. They believe in selling books. And what sells as a comic does not necessarily sell as a book, and vice versa. There are many projects, including some my name is on, which bombed in the Direct Market and did well elsewhere.

    And since “it is up to authors to explain all of the places they have audiences” I believe my little blog post was a part of that. I have more than a season or two in comics, thanks. I understand how this works.

    Thank you for taking the time to explain to others.

  8. I think Brian’s question regarding what BookScan *is* is a good one.

    BookScan has been a going concern since 2001, and as of 2003 claimed to capture 65-70% of sales.[1] The received wisdom is that it’s now close to 75%. By “received wisdom” I mean two things. First, almost every article I could find that talks about BookScan gives that number. I don’t know where it comes from, but it’s a safe pick, since it conforms nicely to the Pareto principle (aka the 80-20 rule). When in doubt, people default to percentages like this.

    Second, I think there is considerable doubt to be had here, because even though it’s the number quoted in article after article about BookScan, it appears that there’s no independent verification that this 75% figure is accurate. I didn’t dig extensively through databases, but I did do some searching, and I found nothing. It looks like nobody’s watching the watchmen…I mean, Nielsen.

    As an aside, even if you assume that 75% is correct, it’s still not impressive. How is it that BookScan has improved so little in the 10 years since it launched?

    Given what you can easily find from a quick stroll through author and agent blogs, there’s more than enough reason to question BookScan’s accuracy. I only have a few datapoints to compare, but they are actual data so I’ll share them. For the books I’ve published, BookScan reports only 13-26% of actual sales…and the 26% is an outlier. As I mentioned to Colleen on her blog, even if you correct for direct market sales, which I can do for most of my titles, BookScan fares poorly. To Brian’s point about channel differences, if I assume no direct market store is included in Bookscan numbers their accuracy improves a little, but they still never get closer than 40% of actual sales. (Note that I don’t think my books have a particularly strong direct market orientation, other than that they’re graphic novels.) Further, the title-to-title variation in the undercounting is so large that if I only had Nielsen’s numbers, I couldn’t even correctly rank my books in order of sales.[2]


    [1] “A means to measure” by Rhalee A. Hughes, Publishing Research Quarterly, Fall 2005, Volume 21, Issue 3, pp 12-28.

    [2] The exception is FEYNMAN, which has indeed sold more than the others. First Second did a great job with marketing and publicity, for one thing (thanks again, and again, Gina!), so starting life as a #1 NYT Bestseller — on the graphic novels list, natch — is something I recommend for everyone to try. :)

  9. “I’m not sure you yourself quite get how publishing works in terms of how an illustrator like me might get a job.”

    I absolutely do not, you’re 100% correct there!

    ” I have more than a season or two in comics, thanks. I understand how this works.”

    Yeah, sorry, that bit there wasn’t really even slightly addressed to you — it’s addressed to Beat readers who may or may not click through to your piece. Sorry if you took that as some sort of attack or something.


  10. BookScan might have flaws, but what alternative is there? A Harvard librarian didn’t know of one:

    Strange as it may seem, we know of no reliable, publicly-available way to get comprehensive statistics for book sales at this time. The only database with reasonably accurate information is Nielsen BookScan, which reports point-of-sale data, but even that claims to represent only 75% of all retail sales. BookScan is a recent (last ten years), very expensive subscription service, used primarily in the industry. Harvard does not have a subscription. [. . .]

    The bottom line is, only book publishers have comprehensive sales data, and they don’t usually make it public.

    Also, there have been complaints about BookScan for years. From 2004:

    Most in the industry say BookScan is better than a typical bestseller list but that it remains a far cry from a royalty statement or a definitive gauge like companion Soundscan. “It’s not perfect. But it’s a usable tool. The more numbers you have, the more likely you are to find the truth between them,” says agent William Clark. In other words, the only reliable thing you can say about book sales trackers is that none are fully reliable.

    When a book’s creator has it published commercially, at least he doesn’t have to market the book and track all the figures himself.


  11. Jim:

    Just to ask the dumb question aloud: when you correct for DM sales, you’re also correcting for library sales, etc.? I ask this because I’d think they’d be significant for your books.

    Interestingly, Amazon’s Author Central seems reasonably clear on what the BookScan data is and is not — https://authorcentral.amazon.com/gp/help?topicID=200580390

    Sales figures from retailers include:

    Print sales by more than 10,000 retailers, including:
    Amazon print sales*
    Barnes & Noble
    Deseret Book Company
    Follett College stores
    Returns from retailers (For example, if in one week, you sold 10 books and 2 were returned to a retailer, BookScan would show 8 books sold.)

    Sales figures do not include:

    Sales from Wal-Mart and Sam’s Club
    Sales to libraries
    Purchases by wholesalers such as Ingram [<- or Diamond!]
    Sales of used books
    Books published through CreateSpace
    Fulfillment by Amazon (FBA) sales
    Pre-orders—orders for a book before the book is released

    If a disproportionate number of your books are sold by stores that do not report to Nielsen, your sales information may underestimate your total sales.

    * Note about Amazon print sales: Sales reported depend on which retailers selling your book participate in Nielsen BookScan, and whether your book is registered with one of the companies from which Nielsen derives its list of reported ASINs. If your book is registered with the Ingram Company, for example, you will see sales info. If your book is Print on Demand, your publishing company may not report ISBNs to Ingram and you may not see sales information.


  12. “Sorry if you took that as some sort of attack or something.”

    Not at all, but I’m not sure who knows my work history. I have to assume anyone I’m communicating with on the internet doesn’t.

    It is very good of you to go into such detail, and also very good of Jim Ottaviani and Synsidar.

  13. Hi Brian…

    “Just to ask the dumb question aloud: when you correct for DM sales, you’re also correcting for library sales, etc.?”

    Not a dumb question! The numbers quoted above don’t in fact break out library sales (etc.). I was being very binary, as in direct market vs. non-direct market, which is what your yearly analysis (and the discussion above) usually focuses on. For what it’s worth, from where I sit I consider all sales as equally good…which is another problem I have with BookScan’s methodology.

    The other reason for not doing so — besides laziness/haste! — is that I can’t break things out for all years or for all titles, and the data I do have are not clean: categorization is not 100% consistent across the available years, and a few of the numbers look really crazy, so much so that at least one has to be a typo…if only I’d sold that many copies of that title!

    But taking into account the numbers that do look reliable, library sales account for 25-30% of the total, non-direct market sales I can track.

    So, using round numbers, even if you call it 30% (to give them as much benefit of the doubt as possible) BookScan is still capturing only 25-50% of the non-DM sales on my books. That’s a long way from the alleged 75%.


  14. I formerly worked at a publisher and worked with Bookscan numbers daily.

    They do NOT cover the direct market, so graphic novel sales are always going to look low on Bookscan.

    They do NOT cover Baker & Taylor, which is the country’s largest wholesaler to libraries, so library sales (which can be decent numbers for several genres, esp. children’s and graphic novels) won’t show up either.

    My usual casual commentary on Bookscan: don’t take the numbers so seriously. Ask your editor at your publisher how many copies they have SHIPPED. This is a gross number and does NOT include returns from trade bookstore megas (Amazon, B&N, BAM) or independent bookstores. Since the direct market is non-returnable, those will never show up as returns against you. Generally within twelve months of publication (usually sooner) the cycle of returns has been pretty much completed, so you can then ask your editor again for a shipped number. This shipped number will now consist of books that have shipped out that have either A) sold to customers or B) are still sitting on bookshelves and by this time are less likely to be returned to the publisher.

    It’s never been an exact science, which always made doing restock predictions for major accounts a little tricky. (I’d like to think I got pretty good at it!)

  15. There’s a couple factors for the Bookscan numbers, once you back out the DM:

    1) Do graphic novels sell disproportionally higher in the independent bookstores that don’t report to Bookscan?
    2) How much of that discrepancy is library sales?

    I suspect the library sales for the better reviewed stuff is a big part of that. You could make a really rough estimate on WorldCat of what that market is like, but I’m not sure what the % of actual library sales would be. The last time I had a look at WorldCat, it seemed like a well-reviewed GN could have 1,000-2,000 copies in the library system, IIRC. Particularly if it was of a literary bent. And graphic novels are more popular in the libraries now, so the number is probably higher.

    I would suspect Mr. Ottaviani, in particular, has good library sales. Let’s see.

    For Feynman, that’s 863 libraries carrying it. Some will have multiple copies. Chicago has 3. San Francisco has 10. NYC has 25 copies.

    Without doing a ton of digging, that’s already got it to just under 900 confirmed copies. Likely over 1000. Possibly significantly over 1000.
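
The "just under 900" figure in the comment above can be reproduced with a small Python sketch. It assumes every holding library has at least one copy and that the three named systems are each counted once among the 863, which the comment does not state outright. The function name `min_library_copies` is hypothetical.

```python
def min_library_copies(holding_libraries: int, known_copy_counts: dict) -> int:
    """Lower bound on copies: one per holding library, plus known extra copies."""
    extras = sum(max(copies - 1, 0) for copies in known_copy_counts.values())
    return holding_libraries + extras


# Figures quoted in the comment above for FEYNMAN.
holdings = 863
multi_copy_systems = {"Chicago": 3, "San Francisco": 10, "NYC": 25}

print(min_library_copies(holdings, multi_copy_systems))  # prints 898
```

This is only a floor: any other library system holding more than one copy pushes the count higher, which is why "likely over 1000" is a fair guess.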

  16. Such a breath of fresh air to read Colleen’s and Jim’s thoughts on this — the emperor, in this case, has no clothes. It’s ridiculous to me that we rely on Bookscan as if it were accurate but short by a consistent percentage (thereby encouraging people to ignore its missing data). Bookscan doesn’t record every book’s sales minus 25% — that would mean every book sells equally well in every channel and every retailer (where’s Nate Silver when you need him?!). For a title that has 40% of its sales in libraries and 30% in comics stores, Bookscan is already down to 30% before you even start to subtract the data from the who-knows-how-many bookstores who don’t report to Nielsen.

    We have this idea that the numbers are slightly but predictably off, but they’re ORDERS OF MAGNITUDE off, plus wildly inconsistent from title to title and publisher to publisher, making any decent business analysis and reporting completely impossible. At Papercutz we have series that have sold 1,000,000 copies that show up in Bookscan with 60,000 units sold. I mean, even if you take out a big Scholastic order and Wal Mart, that’s still 400,000 copies unaccounted for.

    And it’s not true that Bookscan numbers have no weight — in marketing we have to include comps in our internal sales materials. These are Bookscan numbers for books that we think are similar to the book being pitched. But good luck finding the right comp there! If I need a middle grade fantasy graphic novel that has sold 5,000 copies, I’ll find instead a bunch of great comps that supposedly only sold 300 copies, and then Amulet and Bone, which have sold way too much. This seeds doubt among the sales reps, lowering our announced print runs, and then the spiral. And that’s just my part of the job; I’ve been in acquisitions meetings where a book was passed over because Bookscan numbers for the author’s previous sales were too low. We really need an alternative ASAP. I think journalist surveys of publishers’ sales numbers like PW does is a good start.

    Oh, and I thought my list of things I love about Jim Ottaviani, my favorite comics writer, was complete. Then I learned that he footnotes his blog comments. Love that guy!

  17. So why does an anonymous internet troll post straw-man comments to pick a fight where there isn’t one, as far as I can see? Grow up, you ass.

  18. Sure, it’s true that mainstream newspaper reporters and other ill-informed people state that BookScan covers 75% of the book market as if that means all kinds of books everywhere — but why should we listen to ill-informed people?

    BookScan only covers the US, for starters — many books will have a substantial life outside of that territory. And it always has been very clear about which accounts report to it and which don’t — it covers most of the largest retailers of books (Wal-Mart is joining as of the beginning of 2013), but does not include the mass of small bookstores, nor does it cover all of the other outlets that sell books sometimes, or sell a few specific niche books. And, as others have said, it doesn’t do a good job of covering sales to libraries, either.

    So there are publishing areas where BookScan is less useful: craft books, some travel categories, anything that gets picked up in large numbers by non-traditional retailers (like those books in Starbucks), and, yes, graphic novels. What it *does* cover well, not surprisingly, are the core areas of major publishing: fiction and narrative non-fiction, the kind of books that dominate sales and keep most of the big publishing companies afloat. For the vast majority of those titles, the BookScan rule of thumb (add about 25% to account for libraries and indies) works well, and it’s thus the only way publishing people can accurately gauge how books from other publishers are doing. (Nobody in publishing uses BookScan to check out their own books’ *real* sales; we all have internal systems for that.)

    The smart publishing people who work in the areas that are less well covered by BookScan know that (and complain about it, of course, since it makes their jobs harder), and they make allowances for those other markets. (In the case of the DM, relatively accurate numbers are available separately from Diamond, so anyone looking to sign up a comics project with any hope of hitting a comics-shop audience would have to be an idiot not to check the numbers in that channel. I’m not claiming that those idiots don’t exist, of course.)

    I think what Colleen Doran has really unearthed is that there are stupid, badly informed people — some of them (at least momentarily) in positions of power at publishing houses. (And many of them writing for major newspapers like the Globe & Mail.) Those people don’t understand what BookScan really measures, and use it wrongly. That’s not the fault of the tool; it’s the fault of the idiot using the tool.

    Since she came out of the DM world — and most of her projects are focused at that world — it’s only to be expected that they would sell primarily in that world. Since BookScan does not, and has never claimed to, include sales figures in the DM, her numbers are vastly more inaccurate than in the genres that BookScan was made for. That doesn’t mean that anyone is lying or misrepresenting things; it means that BookScan is a very bad tool for calculating the sales of products that mostly sell within the DM. And we all should have known that already, since that’s not what it measures.

    This all seems analogous to looking at a thermometer in your living room and complaining that it doesn’t tell you how cold it is outside.

  19. This may be obvious, but it’s entirely possible (even likely) that Bookscan captures 75% of all book sales in total and is still way off for any individual book (or segment of books), depending on how much of that book’s (or segment’s) sales go through outlets Bookscan doesn’t capture.

    I haven’t been in publishing for more than 6 years, but back then the main thing our imprints used the data for was to compare our sales against our competitors, and to look for any “me too” publishing opportunities. The underlying assumption in the analysis was that a competitor title would be represented proportionally across all outlets as our title. Probably fairly reasonable for consumer computer publishing at that time, but as Jesse Post points out above, not at all reasonable for other segments in other times.

    Also, Jim Ottaviani rocks.

  20. Todd: Thanks for pointing that out regarding FEYNMAN. The percentages I quote here don’t include that book, though, since (a) it’s (happily!) an outlier for me, (b) I don’t have the breakdown of the sales to do any analysis, and (c) even if I did, it’s not my place to share them. Regardless, I suspect you’re right that it has sold a bunch of copies to libraries, and I’m delighted about that.

    Jesse: You made my day, so thank you. I’ll keep footnoting. (But not here…sorry!)

    Andrew: Thanks for this. I still question the 75% coverage reported about BookScan, at least in part because I’ve never seen it verified. Just repeated.

    I’ll buy it for (quoting you) “core areas of major publishing: fiction and narrative non-fiction” but note that there are three qualifiers there: core areas, major publishing, and fiction and narrative non-fiction. So when you say a few lines later that it’s the “only way publishing people can accurately gauge how books from other publishers are doing” I question the use of the word “accurately”! Sorry if this sounds like I’m picking on a specific choice of words, but I point it out because it’s the way BookScan numbers seem to be used all the time, by everyone: “You can’t really trust the numbers in this context, but hey, look: Graphic Novel X sold much better/worse than Graphic Novel Y (or John Grisham Novel Z). Publisher X is no doubt very pleased/disappointed, etc…”

    Regarding your discussion of the direct market, I’ll repeat myself here briefly, with a clarification to make up for the missing context: BookScan is capturing only 25-50% of the non-DM [and non-library] sales on my books. That’s a long way from the alleged 75%.

    As mentioned above, my books are not particularly DM-oriented. I think Colleen and Jesse would say the same about many of their titles as well. So it’s not just the DM that’s producing these huge error bars in our BookScan numbers.


  21. Re the 75 percent claim:

    Articles routinely state that BookScan tracks about 75 percent of book sales, but some say that BookScan is used by about 75 percent of book vendors. In a 10/28/11 press release, Nielsen stated:

    Nielsen BookScan, which monitors the English-language book industry worldwide, gathers point-of-sale book data from about 12,000 locations across the U.S., representing about 75% of the nation’s book sales. Print-book data providers include all major booksellers and Web retailers, as well as food stores (excluding Walmart and Sam’s Club). E-book data providers include all major e-book retailers.

    Meanwhile, there were reportedly about 2.57 billion “units” sold in 2010. Twenty-five percent of 2.57 billion is a big number–but a publisher should know what outlets are selling its books.


  22. @Dean — Exactly right: the ratios between two publishers will always be different, depending on the title. Marvel could sell a bonanza of books during an author tour to stores that aren’t reporting, and Archie could get a huge win at WalMart. The 900 Archie units Bookscan reports could have been 273,785, while the 5,000 Marvel units were actually 6,712; and then, maddeningly, the next two compared titles will have a different level of inaccuracy.

    @Andrew — You reminded me about foreign sales, often a huge part of a book’s (and a publisher’s!) success, yet for some reason the publishing trade press (including comics blogs) don’t seem to care too much about it. Hollywood trades often cover a film’s foreign gross receipts right in the lede. Not Bookscan’s fault, just weird.

    Also, I don’t think Bookscan is more accurate for some types of books because the problem of not all missing data points being equal remains. If you’re trying to understand how the latest Michael Collins book sold (which is mass market fiction, one of your qualified safe areas), not having WalMart and airport sales really puts a dent in that. Comics suffer the same potential for misrepresentation as Stephen King novels. And, as Jim points out, not every graphic novel dominates in the DM. It’s our weakest channel, to the point of being marginal (and I’m working hard every day to fix that, honest!). As a representation of our general trade sales, it fails.

    The idea that Bookscan is a generally useful but not-100%-accurate guideline is commonly held, but it’s not even a useful guideline; it’s just random data. It’s like looking at a thermometer outside in the snow, reading 34 degrees, shrugging and saying, “close enough,” then looking at your neighbor’s thermometer and seeing 86 degrees. It’s a mess.

  23. Thanks, Synsidar. I should have tried to find that myself. (And thanks for the compliment, Dean!)

    Taking Nielsen’s press release at its word and combining it with what Andrew tells us, I’d bet that BookScan is indeed useful and accurate in what Andrew called the core areas of major publishing: prose fiction and narrative non-fiction. If I had to bet, I’d wager that they now capture even more than 75% of sales in those two specific areas, but either way any under- or over-count is probably consistent enough to make BookScan’s numbers useful for estimating actual, total sales in those areas.

    What that implies, however, is this: Since prose fiction and narrative non-fiction sales dominate the Nielsen charts[1], they can miss a whole lot of the remaining sales and still claim an overall accuracy of ~75%, even if their numbers are wrong for, e.g., graphic novels.

    To paraphrase Jesse, outside of prose fiction and narrative non-fiction, BookScan’s numbers can be both wildly and, even worse, inconsistently inaccurate and still not affect Nielsen’s overall performance.

    Since this matches my experience and the only actual data we have in this discussion, I of course assume that this line of reasoning is correct, and that this is what’s happening! But unlike above, where I talk about numbers I have a lot of confidence in, I am making a bunch of guesses and assumptions here, so join me out on this limb at your own risk!


    [1] Looking at last year’s Nielsen numbers, they break adult fiction into 12 types (which includes “graphic novels” as a single lump; no distinction between Joe Sacco and Brian Bendis, apparently!) and non-fiction into 16. Juvenile has 8 types for each. My best guess is that in total, ~419M books sold and tracked by BookScan fall into Andrew’s two categories. Out of a total of 591M sold last year, that’s ~70% of the market Nielsen tracks. (It may be higher — I tried to be conservative here.)
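
    To make the arithmetic behind that argument concrete, here is a minimal Python sketch. It takes the footnote's tracked totals (419M of 591M) as given and treats every capture rate as a hypothetical input, not a measured value; the function name `implied_other_capture` is made up for illustration. It asks: if BookScan really captured 75% of all sales and a somewhat higher share of the core categories, what share of everything else would it be capturing?

```python
def implied_other_capture(tracked_core, tracked_total, overall_rate, core_rate):
    """Capture rate outside the core categories implied by the inputs.

    tracked_* are units BookScan reports; the rates are fractions of
    actual sales captured. Both rates are assumptions, not known values.
    """
    tracked_other = tracked_total - tracked_core
    actual_total = tracked_total / overall_rate   # all sales, tracked or not
    actual_core = tracked_core / core_rate        # actual core-category sales
    actual_other = actual_total - actual_core     # whatever is left over
    if actual_other <= 0:
        raise ValueError("core_rate is too low to be consistent with overall_rate")
    return tracked_other / actual_other


# Footnote figures: ~419M of 591M tracked units are in the core categories.
for core_rate in (0.80, 0.85, 0.90):
    rate = implied_other_capture(419e6, 591e6, overall_rate=0.75, core_rate=core_rate)
    print(f"core capture {core_rate:.0%} -> implied capture elsewhere {rate:.0%}")
```

    Under these assumptions the lump outside the core comes out below 75% whenever the core is above it. Graphic novels are only one slice of that lump, so their own rate could be lower still without disturbing the overall claim.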

  24. Jesse:

    “The idea that Bookscan is a generally useful but not-100%-accurate guideline is commonly held, but it’s not even a useful guideline; it’s just random data.”

    That seems like a fairly dramatic overstatement — it appears to be generally useful data in the context of what the data is: sales at Amazon, B&N, Hastings, etc. I’ve yet to hear a compelling case that it isn’t accurate within its own data-set?

    To me this would be like arguing that the Diamond data is “useless” because, man, all it counts is comic book shops.

    For the value of publishers-who-publish-comics, it seems to me that [BookScan + DCD] is going to be reasonably close to the actual retail sales of most GNs — it’s going to be very far off in some categories (like… books that are listed in Scholastic Book Fairs, for one example) — but your stone average GN from most publishers doesn’t seem like it has a tremendous number of sales opportunities outside the major chains, Amazon and the DM? Or am I wrong?


    Looking back, you’ve made two statements that I can’t jibe together exactly. First, you said that “if I assume no direct market store is included in Bookscan numbers their accuracy improves a little, but they still never get closer than 40% of actual sales.” Then you said, “But taking into account the numbers that do look reliable, library sales account for 25-30% of the total, non-direct market sales I can track. So, using round numbers, even if you call it 30% (to give them as much benefit of the doubt as possible) BookScan is still capturing only 25-50% of the non-DM sales on my books.” Those two numbers don’t seem to add up, exactly?

    Sorry if it seems like I’m asking dumb questions!


  25. Hi Brian,

    Again, not dumb! The reasoning and math look like this:

    First, the direct market and that 40%: Say I sold 1000 books this year. Without correcting for the DM, I said BookScan reported 13-26% of actual sales. Let’s use their best performance (26%), so that’s 260. I do know DM sales, and they’re 10-15% of the total over all sales channels for most titles. (This varies, both by title and by whether I’m looking at a new release or a backlist title, but that’s the average.) Let’s give BookScan the benefit of the doubt again and use 15%, so in our 1000 book example, that’s 150 sold to the DM. So, BookScan can’t be expected to know about those, meaning the total sales they should have counted is 1000-150 = 850. They reported 260, and 260/850 ≈ 31%.

    So why did I say 40%? Well, it’s true that all books are different, so if you want to be charitable you can say I gave BookScan further benefit of the doubt by rounding up generously. Or, you could say that I made a math error, which is what really happened. Sorry about that, and the resulting confusion!

    Now, libraries and the final 25-50% figure: Again, say G.T. Labs sold 1000 books last year and I sold 150 of those to the DM, so the non-DM sales BookScan should have reported was 850. Some of those remaining books were sold to libraries, which you wisely prompted me to correct for. My estimate for library sales is 30%, or in this example, 300 books. That leaves 850-300=550 copies sold last year that BookScan should know about.

    Going back to the 130-260 copies BookScan says were sold, do the division (130/550 and 260/550) and you get a final result of BookScan reporting 24-47% of what it should know about, which I rounded up to 25-50%.
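
    The two corrections above can be sketched in a few lines of Python. The numbers are the round ones from the 1000-book example; the function name `capture_rate` is hypothetical and only there for illustration.

```python
def capture_rate(reported, total_sold, excluded=()):
    """Share of the sales BookScan could have seen that it actually reported.

    excluded lists unit counts for channels BookScan is not expected to
    cover (here: direct market and library sales).
    """
    in_scope = total_sold - sum(excluded)
    if in_scope <= 0:
        raise ValueError("excluded channels account for all sales")
    return reported / in_scope


total = 1000     # books sold across all channels in the example
dm = 150         # direct market sales (15% of total)
library = 300    # library sales (the 30% estimate)

# Step 1: correct for the direct market only, using the best case of 260 reported.
print(f"{capture_rate(260, total, [dm]):.1%}")

# Step 2: correct for libraries too, for both ends of the 130-260 range.
for reported in (130, 260):
    print(f"{reported}: {capture_rate(reported, total, [dm, library]):.1%}")
```

    Each excluded channel shrinks the denominator, so the rate goes up with every correction, yet in this example it still stays under half.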

    Hope that makes sense!


    p.s. Since it was addressed to him I should probably let Jesse answer this, but I’m going to give you my take on this statement anyway:

    “To me this would be like arguing that the Diamond data is “useless” because, man, all it counts is comic book shops.”

    Not really. What we’re saying is BookScan somehow misses a significant percentage of sales it purports to know about.

    Never mind calling out the specific retailers they name. They say they’re capturing “about 75% of the nation’s book sales.” We’re saying that when we look at numbers we know are reliable, we find they capture much less than that and that BookScan’s numbers aren’t consistent enough — they miss 50% to 75% of my sales — to even compare sales between one graphic novel title and another.

  26. I think when Bookscan says they’re accounting for 75% of the book sales, they’re not factoring the DM and comics into that. Granted, that’s a sector that’s grown in the last several years, but it’s a small piece of the bookselling pie.

    I’ve always liked to refer to the Diamond estimates + Bookscan as “confirmed kills.” You can add in the Worldcat if you like. You can bump the Diamond estimates 10% (for what actually shows up in the top 300) and you can probably bump the Bookscan numbers 25% if you want to get a little closer. I’m not sure what a realistic adjustment for Bookscan would be.
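
    As a rough illustration of that adjustment, a Python sketch might look like this. The 10% and 25% bumps are the guesses from the paragraph above, the unit counts are invented, and `bumped_estimate` is a hypothetical name.

```python
def bumped_estimate(diamond_units, bookscan_units,
                    diamond_bump=0.10, bookscan_bump=0.25):
    """Scale each channel's reported units by a guessed undercount and sum them.

    The bumps are rules of thumb, not measured corrections, so the result
    is a rough guess rather than a measured total.
    """
    return diamond_units * (1 + diamond_bump) + bookscan_units * (1 + bookscan_bump)


# Invented example: 2,000 units in the Diamond estimates, 800 in Bookscan.
raw = 2000 + 800
print(raw, round(bumped_estimate(2000, 800)))
```

    Keeping the bumps as parameters makes it easy to try other guesses, since nobody here claims to know the right Bookscan adjustment.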

    The sales channels are fragmented and that’s the data we have to work with. Comics aren’t a genre, so you’re probably not going to have similar percentage sales in the unreported part of the market for Spider-Man and Maus. And children’s comics are an entirely different world.

  27. Yes, precisely what Jim said; Bookscan isn’t as accurate as it should be, given the channels it represents (I’ve been avoiding wild guesses until now, but I do think it’s because of the many bookstores missing from the “hundreds” reporting in.) I’m not doing any of the actual math Jim is with his books (though maybe I should when I’m next in front of a real computer), but when you take away all our non-Bookscan channels (DM, school and library, Walmart, Scholastic, and direct sales) Bookscan is still missing our remaining sales by miles, depending on the day.

    Or sometimes not, and that’s the second hit against its usefulness: sometimes, depending on the title, Bookscan’s missing data points MATTER more. I’d say 90% of our Three Stooges books sell in school-and-library, a good portion of Classics Illustrated moves through educational wholesalers, Ninjago kills at Walmart, and Smurfs do fantastically well in comics stores. That’s the randomness of the data — missing data on those channels for Nancy Drew doesn’t matter much, but it matters big for Three Stooges. How is a passerby to know the difference when researching sales figures? And if that’s the point of the service — to help researchers understand a book’s performance within a margin of error — I’m not sure how it helps to have a data hole of unknown size. Now, I grant you this may not be a gross failure of responsibility on Bookscan’s part since they have never claimed to capture everything, but I do wish we consumers of the data understood just how deeply wrong the figures can be, rather than thinking of it as “Mostly right, minus Walmart.”

    (By the way, I’m not sure most folks realize just how drastic an order from Walmart’s 3,700 stores can be — missing data from a chain that can quintuple B&N sales is a really, really big miss!)

    With DM numbers, I at least know they’re capturing data from 100% of the stores that have a Diamond account, and then they’re skewed by that weighted-average-against-Batman thing that I’m too feeble to understand, which at least consistently skews every book the same percentage. I actually think ICv2 numbers are super helpful and a model of how Bookscan should work.

  28. @Brian — School and Library would be the other big channel for an average GN (if by average you mean straight-up literary fiction and narrative nonfiction GNs, then they have the same sales opportunities as prose, plus the beautiful bonus provided by DM comics experts). Mass market friendly media tie-in GNs also have the drug and grocery channel, on which Bookscan is mum.

    Of the ones that are left — Mass Market chains and Bookstore Trade — I think the data problem lies in the “etc.” you mentioned. Bookscan may address those channels but not all the stores/chains within it.

  29. This is a great discussion, thanks Jim and Jesse, but it’s going to scroll off the front page any minute now.

    Ultimately, I think that sales to consumers are dramatically different than sales to libraries/schools, and should not be conflated, despite “a sale is a sale” from the publisher side.

    I too wish we had fully consumer data, but since we don’t, I’ll take partial, flawed data any day of the week.


  30. This is one of the areas I’ve been digging into since my days at Brodart Co. (library wholesaler). My quest began with determining the formula for getting a book onto the NYT Bestsellers lists, which can be quite the task. This led me to Bowker to see if I could get an accurate count of how many new titles were registered in the graphic novel category each year. This led me to the Book Industry Study Group and the Association of American Publishers, neither of which has graphic novels officially recognized as a category. Even when I was at Diamond we had pretty good data, but we were also aware of gaps in information coming from the traditional book trade market, especially when it came to the educational market.
    As you have all figured out by now, BookScan is a marketing tool run and promoted by Nielsen, and since they are the only reporting source covering big box retail, they don’t seem to have any problem ignoring challenges to the accuracy of their data. It’s basically the same sort of smoke and noise that Bezos uses when he talks about the incredible sales of the Kindles; oddly enough, I’ve only seen three Kindles on any of the planes I’ve flown on since these were produced, while iPads were everywhere. So, BookScan and Kindle are quite a bit like PT Barnum in the way they present ‘factual’ information.
    As for the way DM figures into the math, in general, traditional publishers pretty much overlook comic shops as a channel because they are viewed as a niche market that speaks a totally foreign language. They, as Jim and Colleen have shown, hold the overall numbers close to the vest. If you are a publisher/author like Jim, you learn the best ways to sort these things out. Baker & Taylor, Brodart, Ingram, and Follett all provide their sales data to the pubs and the distribution partners and that is probably going to be your best source for the real numbers.
    A couple of the ‘outliers’ I tend to study are the publishers’ catalogs and their booth displays at the trade shows (Book Expo America, American Library Association, National Council of Teachers of English…), and I’m seeing a pretty strong increase in the number of graphic novel titles publishers are adding each season. I also canvass the booth personnel to see how strong their knowledge is. It used to be “huh-oh, these things?” where now they can tell you what genres and age ranges they have in the graphic category.
    So for me, I study the Diamond reports, Amazon, BookScan and also take into account stories in the book trade publications that talk about overall sales or initial print runs for books like Smile, Bone, Diary of a Wimpy Kid; even the print runs for GNs by James Patterson and Stephenie Meyer help to inform the picture, as their successes create more of the Me Too reaction by the traditional houses.

    Basically, it’s a long response saying that BookScan ain’t even close as it actually represents a much smaller snapshot of the overall market.

  31. Hi Brian,

    “Ultimately, I think that sales to consumers are dramatically different than sales to libraries/schools, and should not be conflated, despite “a sale is a sale” from the publisher side.”

    I accept that they’re different. Retailers don’t make any money, for one thing, and we both like it when retailers make money! (I’m not just saying that: the only store I go into more than once a month is my local comic shop. I want it to thrive, and to be there next month.)

    But why do you still think we’re conflating these? I’ve corrected for the library/school sales above. I’ll repeat the results, hoping that the third time is the charm: for my work, BookScan captures only 25-50% of the sales to consumers it purports to capture. Jesse has noted that he thinks he’d see similar results if he ran the numbers…and he’s dealing with bigger sales numbers and many more titles than I am.

    “I too wish we had fully consumer data, but since we don’t, I’ll take partial, flawed data any day of the week.”

    Why? I don’t know what you think you’re learning about graphic novels from it. Jesse and I (as well as Colleen and others…and now John, speaking from the distributor side) have pointed out that there’s every reason to believe that the data is worse than flawed. It’s unpredictably and inconsistently inaccurate.

    I’ll take one last stab at illustrating what that means, via some made up health numbers, using my experience of BookScan capturing only 25% of actual retail sales as my analogy’s benchmark.

    So, you go to the doctor’s office and they tell you this:

    * Brian, your weight is 100 lb. … though it may be as high as 400 lb.
    * We measure your height as 2 ft. … though you may be as tall as 8 ft.
    * And hey, your cholesterol = 100 mg/dL … which is great, though it may be as high as 400 mg/dL, which is really (really) bad.

    Do I need to go on? (Note that the exact same numbers apply to me, and we both know that I’m shorter and scrawnier than you, though our cholesterol levels may be the same, for all I know.) Anyway, if your doctor told you the above, would you trust any subsequent information you were told, much less a diagnosis?

    I know what I’d do, every day of the week.


  32. Sorry, Jim, that was mostly in response to Jesse’s second post in front of my last one: “School and Library would be the other big channel for an average GN”. These ARE sales, no doubt, and they (yay!) make money for publishers, but “complaining” that a retail sales chart doesn’t include them doesn’t seem valuable in a situation where people (or at least those who make the decisions) should know we’re talking apples-to-apples. As the person who tries to generate the BookScan list each year, I work really really hard to make sure people understand the data’s limitations and sources — Christ, my boilerplate seems to grow by 2-300 words every year! I’ll probably be referencing this thread this year, as well!

    “Why? I don’t know what you think you’re learning about graphic novels from it. … there’s every reason to believe that the data is worse than flawed. It’s unpredictably and inconsistently inaccurate.”

    What I think is that there are some relative things that can be learned from any slice of data, as long as the observer is aware of the limitations of that data. As a for example, I recently posted the top 100-ish books of 2012 for Comix Experience. I think this is interesting and valuable data, though it doesn’t begin to encompass “sales of GNs in the US” or in California, or, Christ, probably even in San Francisco. It’s probably even off by significant numbers for “the Haight-Ashbury” because Booksmith (an indy book store) almost certainly sells a reasonable amount of GNs, and on 1-2 titles, most likely topped my sales significantly.

    I don’t know, maybe because I spend too much time thinking about it, looking at it, sifting through it, I don’t exactly look at sales data for declarative truths (well, any more at least — I used to a lot 10 years ago!), but for the underlying trend of action. But then, I’m looking at multiple streams of data, including from my own individual walled laboratory.

    There: perfect example of the value of BookScan to me — each year, when I write the report, I find 10-20 books that I wasn’t aware of, either literally, or from pure sales potential. “Oh, Amazon and the chains are selling a lot of [x]? I should get behind that book more”. I also discover things that (gasp! Shock!) Diamond never even carried in the first place. Other retailers have also told me they do the same when they see the data like that.

    Finally, I think that even if we’re talking about wildly different and swinging percentages from book to book (which I still think, if we were fanatically scrupulous in talking about exactly the same way of measuring and reporting that measurement from observer to observer, still only truly applies to a relatively small number out of the ~30k SKUs out there), I truly wonder if the actual number that we’re discussing is a meaningful enough one to draw a different conclusion from in the first place.

    By this I mean, I looked up a few titles discussed in this thread, and we’re sometimes talking about BookScan numbers of a few hundred copies. Even if you quintupled the BookScan number, you’re still talking about a sales band of “nearly too small to sweat”, because any annual number below maybe 3k is almost certainly indicating books without broad-based support, interest, or awareness, where a significant number of those copies spend 11 months of the year keeping your rack warm, or are otherwise simply “that’s what we got special orders for this year.”

    I mean, in a field of 30k comic book related SKUs, I’d be stunned if even 10% of them *actually* were racked in more than the barest handful of venues, and that for a goodly majority of them Amazon and/or special order pretty much IS the sole means-to-market. (certainly, I only have about 8k SKUs at this point, and I’m “a bookstore that specializes in comics”, because I’ve moved to insisting a book turns once every 18 months) On the 2011 BookScan full chart, there are nearly 18k (!) SKUs that have reported sales of 100 copies or less (!!)

    The biggest “solve for x” problem I see is figuring out just how many independent bookstores there are — I can’t find anything that feels even vaguely reliable to me. Physical chain bookstore locations are now, I think, down under 1000 — Barnes & Noble has, I think, 700, and BAM under 50? But are there 500 indies? 2000? 10k? More? Is there any way to know, other than counting on the equivalent of a yellow pages? That has a pretty significant difference and implication for the market, and, frankly, how one markets books.

    (I mean, SF in 2013 has 16 bookstores, or so Yelp tells me — but not a one of them is a chain. Downtown real estate is WAY too expensive for bookstores. I’m told Borders was paying well more than a million dollars a month in rent for their Union Square location, nutsy!)

    (One last parenthetical — 2013 SF also has 12 comic book stores. Is it *possible* that there are an equal number, or possibly a superior number, of “comic book stores” as there are “bookstores”? Freaky.)

    I really wish a few publishers would consistently track the same data points in the same way, correct for other markets, and publicly talk about specifics of the “missing” sales so we could at least try to understand the extent of what we’re missing in a systemic way. I’m still not certain that even you and Jesse are approaching the data in the same way, and you’re the only one dealing in even somewhat specific numbers.

    Hopefully I’m getting the 2012 BS numbers in the next week or two, and I will be keeping all of this thread in mind when I write my annual report.


    Anyway, I’m babbling now.

  33. Thanks for the clarification, Brian, and it’s great to hear some of the intangible — or at least not obvious to me until you spelled them out — benefits you and others get out of your analysis of the BookScan numbers. I hadn’t thought of those, and they make good sense.

    I’ll still wince, and maybe even complain out loud, if you conclude your BookScan analysis with claims that I don’t think the data are good enough to support, but honest and true: I look forward to reading it, as always!


    p.s. You’re too polite to say it, but some of those low sales numbers you saw were probably for my books. That’s in part because G.T. Labs has been all backlist for the last few years as I focused on writing for larger (much larger!) publishers. Those new projects have done well, but a rising tide hasn’t lifted all my metaphorical boats as high as I’d like. Most titles do make money, but have not yet made me and my artist collaborators rich. And that’s the other part of it. I don’t think my books are tailor-made for huge mass market appeal and getting rich. (Your typical comics reader: “A graphic novel biography of Niels Bohr. Awesome…I mean, wait. Who?”) I don’t find this shocking, and even anticipated that, when, in 1997, I wrote the company tagline: “Comics about scientists? What a crazy experiment!”

    I mention this not just in the interest of transparency, but because I think it actually helps the argument re. BookScan’s accuracy, in the sense that the folks who’ve shared their experience exhibit good diversity. My self-published (non-fiction) graphic novels are midlist/low-midlist. Colleen’s (fiction, with big name creators involved…in addition to hers!) do much better, selling in the tens of thousands. Probably higher, for some titles. I hope so, anyway! And Jesse’s (which include well known licensed properties) in some cases sell over a million. We’re all seeing the same unpredictable, but large, inaccuracies, though.

    And now I’m babbling too, probably!

  34. Let me just say that I’ve enjoyed the “babbling” on all sides, and am looking forward to Brian’s annual look at the BookScan numbers even more than usual. Thanks for the thoughtful discussion, everyone!

  35. Hey Brian,
    I checked the American Booksellers Association’s website http://bookweb.org/index.html
    and see they have 1760 bookstore owners/members as the audience for their marketing efforts. Keep in mind that quite a few of these owners have multiple locations, like Books and Books in the Miami/South Florida area, Powell’s in Portland, OR, or Anderson’s Bookshops in Chicagoland. I’ve sent a query to see if I can get more definitive data on this.
    Also, you may want to consider adding the ABA’s bestseller data to your resource list, as there is more loyalty in their reporting (from their members) than you will get from BookScan.
    I agree with your position that BookScan, as poor a source as it may be, is still a source.
    Jim, your analogy is freakin hilarious! I’ve always hoped to be around 6’2″ and 185 but my doctor keeps telling me I’m 5’8″ and ‘around’ 185.

  36. Totally agree that this is a great conversation! And Brian’s last post made me realize that I don’t even know what my offered alternative solution would be — it’s true that sales numbers will never be totally accurate unless publishers publicly post their data every week, which would have a terrible, terrible impact on trying to bring new books and authors to market. But I do think that somewhere out there is a way to get the numbers more REASONABLY accurate, and a lot of that could come from Nielsen improving their reporting, maybe incentivizing big chains and retailers and distributors to report — I don’t know, I’m armchair quarterbacking a part of the business I know nothing about.

    On the School and Library question, Brian, I guess the answer to “Why does that data even matter?” depends on what reason you have for looking at the data in the first place. If you’re trying to measure a book’s success, audience reach, and the general health of a publisher, I’d say that channel is a huge factor (especially for kids’ comics, my area — Lerner’s comics sales are almost entirely from this channel). Library circulation is a way to measure a book’s popularity with its audience, and the more books circulate, the more they’ll be ordered. That’s why libraries order 40 copies of the new Stephen King book instead of one or two. I guess the main question library sales don’t answer is, “How is the bookselling industry doing?” But I think most people use Bookscan as a kind of scorecard for the giant football game of the publishing industry: “X publisher is so much stronger than Y publisher in Z category,” or “X author’s sales have really plummeted since moving to Y publisher,” etc.

    I also agree with you that a lot of the beauty of this data is in the eye of the beholder, and thank you for pointing that out. If people are USING it correctly (and I think the situations you described are correct), Bookscan data is certainly useful as a kind of hint at things, something that removes a little bit of the smudge on the window. Our Ninjago series (the one that’s sold over 1M copies) is certainly “generally doing well” with its 118,000 copies reported in Bookscan, and if that’s the question being asked I think it’s fair to say Bookscan delivered the correct answer. It’s a high enough number to keep us on the best-seller lists and interest retailers in promotions, etc. A book that has 300 copies reported could, in some cases, be “generally not doing well.”

    But even by this measure, I think, the numbers can be misleading. A book that appears to have sold 200 copies may not have actually sold more than 3,000 copies, but a book that appears to have sold 2,500 copies may have sold over 10,000, and 10,000 is enough to be a best-seller these days for literary fiction. Missing data points will never turn a long-tail book into a blockbuster, but the missing points are so unpredictable that ALL the numbers should be questioned, and the unpredictable inaccuracy should be taken into account by anyone trying to answer a more detailed question with the figures.
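
    One way to see how wide that uncertainty is: divide a reported figure by a range of plausible capture rates. In this Python sketch the 7% and 50% bounds are assumptions picked from figures quoted in this thread (Colleen's 93% shortfall, and the best case in Jim's corrected numbers), not anything Bookscan publishes, and `implied_range` is a hypothetical helper.

```python
def implied_range(reported, low_rate=0.07, high_rate=0.50):
    """Return (min, max) actual sales consistent with a reported figure.

    low_rate and high_rate bound the fraction of real sales that shows up
    in the reported number. Both are assumptions, not Bookscan parameters.
    """
    if not 0 < low_rate <= high_rate <= 1:
        raise ValueError("rates must satisfy 0 < low_rate <= high_rate <= 1")
    return round(reported / high_rate), round(reported / low_rate)


for reported in (200, 2500):
    low, high = implied_range(reported)
    print(f"{reported} reported -> somewhere between {low} and {high} sold")
```

    With those bounds the 200-copy book stays under 3,000 while the 2,500-copy book could land well past 10,000, which is in line with the comparison above. The width of each range is the part a casual reader of the charts never sees.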

    @John — you are a data superhero! I feel bad that you have spent all this time trying to reverse engineer formulas and compile data from a dozen different sources, and I can’t even be bothered to open up a new tab in my browser and look up some sales figures to support this discussion. You win!
