I’ve recently been harking back to my roots as a veteran Business Intelligence architect, which was pretty much the entirety of my working life before joining Bibliocloud (leaving aside the brief Merchant Navy adventure, MFI warehouse work, and extensive Baskin Robbins experience, all of which threatened my life in one way or another).

And something that has been nagging at me for the past few years is how best to make the most of the data that we are storing for publishers. What insights could we deliver beyond the usual sets of static reports and (shudder) pie charts?

One challenge to overcome is the wide variation in the data that publishers store in the system. Clients using the royalties functionality are pretty well placed – in general they will have good quality book metadata, contact information, sales, advances, and royalties calculation results. That’s definitely something to work with.

For clients who do not need royalty functionality, or who have not got to that stage in their adoption, we might have inventory data fed directly from a distributor, from which we can make some estimate of sales.

Or we might simply have metadata – products, prices, subject codes, publication dates, contributors, audiences, etc. That’s not too bad – we can visualise the publication schedule in terms of how many works and products are in the pipeline in comparison to historical trends, and dive into metadata completion.

The second challenge has been to find the right technology for the task. In the initial stages of exploring functionality I do not want to spend time setting up a server on AWS and performing data migrations. So I’m looking for a starting point that is economical in both price and time, has the potential to scale later on, and needs the minimum of data preparation.

I settled on Tableau Desktop Personal Edition, which is currently priced at USD35 per user per month, and is an easy install (on Mac at least – I don’t see it being harder on Windows). Tableau also provide a free Reader application, which can be used to explore data exported as a package from Desktop, allowing me to try out some data analysis and share it with clients.

Lastly, how to access data? At Bibliocloud we are rolling out very fast and efficient CSV-formatted downloads from the system for all clients (I just dropped 200MB of daily inventory data to my desktop in about 30 seconds). And a CSV format, done correctly, is a great start.

Onwards, to the visualisations …

I looked at sales data for a client who has implemented the royalties functionality, and has loaded aggregated sales data for quite a reasonable historical period of about eight years.

Here is a very small taste of the work so far.

In What Part Of The Year Do Different Formats Sell?

Looking at the seasonality of sales by product form, the stand-out here is the relatively high level of hardback sales around the Christmas sales period – November sales are three times the June sales. Is that a surprise to anyone? We all know that everyone loves a hardback for Christmas.

[Chart: seasonality of sales by product form]

And it’s trivial, of course, to look at this by subject, and to look for confirmation that biographies of football managers might be more prone to this effect than romance fiction.

On the face of it, that looks like an open-and-shut case, but there’s quite a list of caveats. Not the least of those is the month of the year in which products are most likely to be published.
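For anyone wanting to reproduce the month-by-format comparison outside Tableau, a rough sketch in Python might look like the following. The column names (`product_form`, `sale_month`, `quantity`) and the sample figures are invented for illustration; they are not the Bibliocloud export format or the client's data.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout and invented figures, for illustration only.
SAMPLE_CSV = """product_form,sale_month,quantity
Hardback,2016-06,100
Hardback,2016-11,300
Hardback,2017-06,120
Hardback,2017-11,360
Paperback,2016-06,400
Paperback,2016-11,420
"""


def monthly_totals_by_form(csv_text):
    """Sum quantity per (product form, calendar month 1-12) across all years."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # sale_month is assumed to be 'YYYY-MM'; keep only the month number.
        month = int(row["sale_month"].split("-")[1])
        totals[(row["product_form"], month)] += int(row["quantity"])
    return dict(totals)


totals = monthly_totals_by_form(SAMPLE_CSV)
ratio = totals[("Hardback", 11)] / totals[("Hardback", 6)]
print(f"Hardback November/June ratio: {ratio:.1f}")
```

Summing across years like this hides exactly the caveat mentioned above: it says nothing about when the products were published.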

When Do We Publish Different Formats?

[Chart: month of publication by format]

That would not matter if the post-publication sales profile for hardbacks were not so peaky…

Do Hardback Sales Decline Faster Than Paperback?

[Chart: post-publication sales profile, hardback versus paperback]

Yes, they do. And that has interesting implications for inventory management – how closely do individual product sale profiles cleave to this average, and how much inventory are you holding ten months after publication?
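One way to build this kind of post-publication profile by hand is to re-index each sale by the number of months since publication and then average per format. The sketch below assumes a hypothetical row layout of (format, product id, publication month, sale month, quantity), and the numbers are made up.

```python
from collections import defaultdict


def months_between(pub, sale):
    """Whole months from publication month to sale month, both 'YYYY-MM'."""
    pub_year, pub_month = map(int, pub.split("-"))
    sale_year, sale_month = map(int, sale.split("-"))
    return (sale_year - pub_year) * 12 + (sale_month - pub_month)


def decay_profile(rows):
    """Average units per product at each month offset, keyed by (form, offset).

    Only products with a sales row at a given offset are counted in that
    offset's average, so months with no recorded sales are not treated as zero.
    """
    units = defaultdict(int)
    products = defaultdict(set)
    for form, product_id, pub, sale, quantity in rows:
        key = (form, months_between(pub, sale))
        units[key] += quantity
        products[key].add(product_id)
    return {key: units[key] / len(products[key]) for key in units}


# Invented rows: (form, product id, publication month, sale month, quantity).
rows = [
    ("Hardback", "H1", "2016-10", "2016-10", 500),
    ("Hardback", "H1", "2016-10", "2016-11", 300),
    ("Hardback", "H2", "2017-10", "2017-10", 700),
    ("Hardback", "H2", "2017-10", "2017-11", 100),
    ("Paperback", "P1", "2016-05", "2016-05", 200),
    ("Paperback", "P1", "2016-05", "2016-06", 180),
]
profile = decay_profile(rows)
for (form, offset), average in sorted(profile.items()):
    print(form, offset, average)
```

Comparing an individual product's own offsets against this average is then a simple subtraction, which is one way to start on the inventory question.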

And do we deliberately publish Christmas-friendly hardbacks in October?

More analysis required…


And just jumping sideways for a moment, here is a simple-to-generate total sales forecast.

[Chart: total sales forecast]

Note how the linear trend and seasonality have been picked out here.
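To make "linear trend plus seasonality" a little more concrete, here is a deliberately simple sketch: fit a straight line by least squares, average what is left over for each month of the year, and add the two back together for future months. This is not Tableau's own method (its forecasting is, as far as I know, based on exponential smoothing), and the function names and sample history are my own invention.

```python
def fit_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def seasonal_offsets(values, slope, intercept, period=12):
    """Average residual from the trend line for each position in the cycle."""
    buckets = [[] for _ in range(period)]
    for x, y in enumerate(values):
        buckets[x % period].append(y - (slope * x + intercept))
    return [sum(b) / len(b) if b else 0.0 for b in buckets]


def forecast(values, steps, period=12):
    """Project the trend forward and add back the average seasonal offset."""
    slope, intercept = fit_trend(values)
    offsets = seasonal_offsets(values, slope, intercept, period)
    n = len(values)
    return [slope * x + intercept + offsets[x % period] for x in range(n, n + steps)]


# Invented monthly totals: gentle growth with a bump in the eleventh month.
history = [100 + 2 * x + (50 if x % 12 == 10 else 0) for x in range(36)]
print([round(v) for v in forecast(history, steps=12)])
```

Because the trend is fitted before the seasonal offsets, a strong seasonal bump will pull the line slightly; fitting both together would be more accurate, but this is enough to see the idea.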

Early Learnings

So here are my conclusions with this mini-project so far.

Firstly, having thought about the range of publishers and their data that we currently store, data exploration of this kind is going to follow a very particular path for each of them. We have non-GB based clients with only a peripheral interest in BIC codes, or who only publish eBooks, or who publish only a few very high value products, and others who have to consider both trade and academic audiences (and who consequently have very different predominant sales channels for different works and products).

Secondly, systems such as Bibliocloud are sitting on an absolutely vast range of data that should be very amenable to analysis, covering metadata, data sharing, sales, inventory, and royalties. When you combine these data sets you have extraordinary potential for insight. At what stage prior to publication are you sending ONIX to different partners, and how often do different products have data changes? What’s the correlation between values of high-level BIC and BISAC main subject codes? And so on.

But make sure you are using a system that enables you to access that data.

Thirdly, it is entirely possible to do quick and efficient Business Intelligence exploration projects once you have found the right tools. I’ve enjoyed working with Tableau so far, and the range of free learning materials on their support site has been excellent. If you’re trying to do non-trivial analysis in Excel, stop right now – we have at least two clients who are already using Tableau successfully, so find some time in your schedule to go and get yourself a trial.

And finally, no pie charts.

