The Digital Divide in Grocery and CPG Data

We’re delighted to have guest contributor Keith Anderson, vice president, strategy & insights for Profitero, sharing his expertise in ecommerce intelligence. Please send comments and questions directly to Keith. His contact details can be found at the end of this article.

Ecommerce growth is exploding. But what data is really available to CPG companies accustomed to data-driven strategic planning and execution?

While total retail growth barely outpaces inflation, online retail continues to grow at 3-5x the pace. According to the US Census Bureau, in the fourth quarter of 2014 (the latest period for which data is available), ecommerce sales increased by 14.6% year-over-year, while total retail sales increased 3.7%.

Note: For the sake of this discussion, “online retail” and “ecommerce” include all purchases transacted online, including those picked up at or fulfilled from a brick-and-mortar store.

  • Walmart and Target are accelerating investments in programs like online grocery pick-up and subscriptions for everyday essentials.
  • Amazon, ever-evolving, launched its national Prime Pantry program for shelf-stable CPG products and is expanding AmazonFresh, its full-basket grocery delivery model, to new markets.
  • Well-funded entrants like Instacart, Boxed, and Jet are poised to lead another wave of innovation.

Though grocery and CPG products are among the most under-penetrated categories online, some industry analysts believe that 2015 is an inflection point for the industry. A joint study released by IRI, Google, and Boston Consulting Group in August 2014 projects that online share of CPG sales will follow a “1-5-10” trajectory, from 1% of sales in 2014 to 5% by 2018, with 10% share a realistic possibility as growth compounds. The study also projects that online growth will account for about 50% of total CPG growth over the period.
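To put the “1-5-10” trajectory in perspective, here is a back-of-the-envelope sketch of the yearly growth in online share implied by the first leg (1% in 2014 to 5% by 2018). It assumes the share compounds at a constant rate each year, which is a simplifying assumption rather than something the study states, and the function name is purely illustrative.

```python
def implied_annual_growth(start_share, end_share, years):
    """Constant yearly growth rate that takes start_share to end_share."""
    return (end_share / start_share) ** (1 / years) - 1


# First leg of the "1-5-10" trajectory: 1% of sales in 2014, 5% by 2018.
rate = implied_annual_growth(0.01, 0.05, 2018 - 2014)

# Year-by-year share under the constant-rate assumption.
path = [(year, 0.01 * (1 + rate) ** (year - 2014)) for year in range(2014, 2019)]

print(f"Implied annual growth in online share: {rate:.1%}")
for year, share in path:
    print(f"{year}: {share:.2%}")
```

Under that assumption, online share of CPG sales would need to grow by roughly 50% per year. Note that this is growth in share, not in dollar sales.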

With around 1% of all CPG products currently purchased online, much of that share shift is still ahead. For context, the study cites categories like Toys, Sporting Goods, and Small Appliances, each of which has online penetration rates over 10% and whose online growth accelerated after reaching a “digital tipping point”.

Even within the broader grocery and CPG landscape, some categories are likely to accelerate online faster than others. Shelf-stable products with higher price-to-weight ratios are likely to migrate more quickly. Amazon is reported to be a top five customer of P&G’s Pampers brand, for example.

Other sub-categories like perishable products create logistical complexity that may lead to a different growth curve.

Still, markets like the UK and France, the growth of click-and-collect, and models like Instacart illustrate the increasing viability of online grocery, even for perishable goods.

A separate Deloitte study estimates that 29% of US grocery sales are digitally influenced, which extends the impact of digital well beyond purchases transacted online.

Data: The Digital Divide

Despite the growth and assumed superior measurability of online retail, most CPG companies are at a data disadvantage online compared to offline.

Fundamental data on the total online retail market like category size, growth rates, and brand shares are largely unavailable at the level of detail and frequency that CPG manufacturers have come to expect for brick-and-mortar retail.

From a shopper data perspective, the situation is better – online behavioral panels (which monitor shoppers’ actual online behaviors through various tracking methods) are inherently more scalable than self-reported panels commonly used for in-store retail (which simply ask shoppers to recall their behavior). But these too have their limitations.

There are three types of data available now for assessing CPG online retail. I’ll cover each in turn.

1) Retailer direct data

As Sally explained here, retailer direct data is supplied to manufacturers directly by individual retailers and is typically limited to that manufacturer’s own products, sometimes with category totals.

Compared to retailer direct data for offline sales, there’s a lot of maturing still to do for online sales.

Ideally, all online retailers would provide daily, product-level data on unit and dollar sales (among other metrics). Unfortunately, some retailers only provide data aggregated to the brand level; others report only on a monthly or quarterly basis; and still others share nothing at all.

For manufacturers, the utility of this data is mostly limited to analyzing their business with an individual online retailer, since synthesizing and standardizing data from various retailers is a major challenge.
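As a rough illustration of why standardizing is hard, consider one retailer that supplies daily, product-level data and another that supplies only monthly brand totals. The two can be compared only at the coarsest shared grain, here brand by month. The sketch below uses entirely hypothetical retailers, field names, and figures; real feeds differ in layout, identifiers, and calendars.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily, product-level feed from "Retailer A".
retailer_a_daily = [
    {"day": date(2015, 1, 5), "sku": "A-001", "brand": "BrandX", "units": 10, "dollars": 89.90},
    {"day": date(2015, 1, 6), "sku": "A-002", "brand": "BrandX", "units": 4, "dollars": 47.96},
    {"day": date(2015, 2, 2), "sku": "A-001", "brand": "BrandX", "units": 7, "dollars": 62.93},
]

# Hypothetical monthly, brand-level feed from "Retailer B".
retailer_b_monthly = [
    {"month": "2015-01", "brand": "BrandX", "units": 30, "dollars": 270.00},
]


def roll_up_to_brand_month(daily_rows):
    """Aggregate daily product-level rows to (month, brand) totals."""
    totals = defaultdict(lambda: {"units": 0, "dollars": 0.0})
    for row in daily_rows:
        key = (row["day"].strftime("%Y-%m"), row["brand"])
        totals[key]["units"] += row["units"]
        totals[key]["dollars"] += row["dollars"]
    return totals


def combine(daily_rows, monthly_rows):
    """Put both feeds on the same brand-by-month grain."""
    combined = []
    for (month, brand), t in sorted(roll_up_to_brand_month(daily_rows).items()):
        combined.append({"retailer": "A", "month": month, "brand": brand,
                         "units": t["units"], "dollars": round(t["dollars"], 2)})
    for row in monthly_rows:
        combined.append({"retailer": "B", "month": row["month"], "brand": row["brand"],
                         "units": row["units"], "dollars": row["dollars"]})
    return combined


combined = combine(retailer_a_daily, retailer_b_monthly)
for row in combined:
    print(row)
```

The trade-off is that the product-level and daily detail from the first retailer is lost in the combined view, which is part of why this data is most useful one retailer at a time.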

Still, there are signs of improvement:

  • Peapod has invested in its data and analytics capabilities and now provides data to manufacturers in both free and premium versions, including alignment with data for Ahold’s stores
  • Walmart is reportedly enhancing its Retail Link direct data platform to improve its ecommerce reporting capabilities
  • Amazon has set the gold standard for online direct data with Amazon Retail Analytics (ARA), available in free and premium versions

Just as specialty data and analytics firms like Dunnhumby emerged to leverage brick-and-mortar retailers’ direct data, vendors like One Click Retail (which specializes in Amazon analytics) are emerging to help manufacturers unlock value in online retailer direct data.

2) Syndicated data

As in the offline world, syndicated data comes in two primary flavors: shopper data and sales data.

Shopper Data

Household panels operated by leading syndicated data providers like IRI and Nielsen already integrate self-reported online purchasing attitudes and behaviors.

Additionally, online behavioral panel operators like ComScore and WPP’s Millward Brown Digital provide analytics on shoppers’ observed behaviors on desktop and mobile devices.

Many of these players have partnerships, so it’s worth asking each vendor what’s available.

Sales Data

Unlike their offline counterparts, the largest online retailers have not been persuaded to pool their transaction data for their (and the industry’s) benefit.

However, demand for a total-market view is growing by the hour as ecommerce growth and share gain continue to accelerate. I expect progress this year and encourage commercial leaders, analysts, and shopper marketers to ask retailers and data vendors what to expect and when.

3) Ecommerce intelligence

Online retail storefronts themselves are another key source of data. This data is similar to in-store audit data, but it is gathered without ever having to set foot in a store.

Companies like my firm, Profitero, monitor the “digital shelf”—essentially anything a shopper would see on an online retailer’s homepage, category page, search result page, or product page.

Analytics focus on critical performance drivers in areas like pricing and promotion, product and assortment attributes, product content (images, titles, descriptions, ratings & reviews), and search and category ranking.

Because this data is collected independently of retailers and is driven by technology, it has some interesting characteristics:

  • Data can be collected for any retailer, in any language
  • Competitive data can be collected for benchmarking and analysis
  • Data can be collected daily (or more frequently) and locally for online retailers with local assortments, pricing, and stock (like Peapod and AmazonFresh)
  • Digital shelf data can be aligned and integrated with existing Nielsen or IRI hierarchies and retailer direct sales data (where available)
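The last point, aligning digital shelf data with an existing hierarchy, can be pictured as a lookup through a crosswalk from each retailer's item identifier to a common product key. The sketch below is a simplified illustration only: the UPCs, retailer names, item IDs, and field names are all hypothetical, and it does not describe any vendor's actual method.

```python
# Hypothetical product hierarchy keyed on UPC (for example, from a syndicated database).
hierarchy = {
    "000000000001": {"category": "Baby", "brand": "BrandX"},
    "000000000002": {"category": "Pet Food", "brand": "BrandY"},
}

# Hypothetical crosswalk from (retailer, retailer item id) to UPC.
item_to_upc = {
    ("RetailerA", "A-001"): "000000000001",
    ("RetailerB", "B-778"): "000000000002",
}

# Hypothetical digital shelf observations collected from product pages.
shelf_observations = [
    {"retailer": "RetailerA", "item_id": "A-001", "price": 24.99, "in_stock": True},
    {"retailer": "RetailerB", "item_id": "B-778", "price": 18.49, "in_stock": False},
    {"retailer": "RetailerB", "item_id": "B-999", "price": 5.00, "in_stock": True},
]


def align(observations, crosswalk, hierarchy):
    """Attach hierarchy attributes to each observation; collect the ones that do not match."""
    matched, unmatched = [], []
    for obs in observations:
        upc = crosswalk.get((obs["retailer"], obs["item_id"]))
        attrs = hierarchy.get(upc)
        if attrs is None:
            unmatched.append(obs)
        else:
            matched.append({**obs, "upc": upc, **attrs})
    return matched, unmatched


matched, unmatched = align(shelf_observations, item_to_upc, hierarchy)
print(f"{len(matched)} matched, {len(unmatched)} unmatched")
```

Keeping the unmatched items visible matters in practice, since online-only items and retailer-specific packs may not appear in the existing hierarchy at all.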

For an example of this kind of intelligence, check out Profitero’s free monthly Amazon FastMovers reports, which benchmark best-selling products in 10 categories at Amazon in the US and UK including Grocery & Gourmet Food, Baby, Beauty, Chocolate, Health & Personal Care, Pet Food, and more.

By itself, ecommerce intelligence is a rich new source of insight for CPG manufacturers. But we see even more long-term potential in tying this data to complementary sources of syndicated and retailer direct data.

Getting Better Every Day

There’s no question that CPG companies deserve better, more actionable data on the fastest-growing channel in their industry.

But there’s a long way to go, and retailers, syndicated data providers, and manufacturers will each play a role in advancing the industry. 2015 should be a year of progress.

About the Author

Keith Anderson is vice president, strategy & insights for Profitero, a venture-backed global provider of ecommerce intelligence for retailers and brands. He leads Profitero’s product strategy and premium analytics services.

He can be reached at @KeithAnderson or on LinkedIn.

