Analytics are for small retailers too - an Excel tool with no added cost or staff

By Swathi Jeedigunta, PhD

Executive Summary

  • According to Deloitte research, the gap between smaller and larger retailers in their ability to spend on “big data analytics” tools is substantial, particularly in an industry with already razor-thin net profit margins.

  • Research published in the Journal of Business Research shows that “persistence modeling” of future sales can be achieved with far fewer variables and far less expertise than traditional algorithms require.

  • Given detailed historical sales records, and when limited to specific goods, the autoregressive model can be implemented immediately by smaller retailers with no additional costs for staff or software.

The prohibitive costs of data analytics for small retailers

The use of big data analytics in retail can increase operating margins by more than 60% (“Big Data: The next frontier for innovation, competition and productivity”, McKinsey Global Institute, 2011). However, small to mid-sized companies have to set aside a considerable 2-6% of their total budget to implement big data analytics (“Data Analytics: How much does it cost for a small/mid-sized company”, Octolis, 2022). Given that retailers (general or food) have, on average, net profit margins of 2-3%, investing in big data analytics is simply not feasible for every business (“What is a Good Profit Margin?”, Fast Capital 360, 2022).

Persistence modeling of future sales, as a subset of big data analytics, has many simple but effective use cases in retail with a positive financial impact. For instance, Walgreens analyzed product sales trends against weather patterns and identified increased demand for Pantene anti-frizz products during seasons with higher humidity. By targeting digital ads and promotions for these products during humid seasons, Walgreens saw a 10% increase in sales of those Pantene products (“The Power of Retail Analytics”, Yodlee.com, 2019).

It should be no surprise that small and medium-sized retailers are less able to invest the same money in big data analytics. According to Deloitte’s September 2022 Chief Marketing Officer Survey, 75% of retailers and wholesalers had invested in data analytics to improve their digital marketing in the last year (by far the most commonly reported kind of investment). Yet just 33-45% of companies with sales from $10 million to $100 million had invested in data analytics, whereas 71-87% of companies with sales from $100 million to $10 billion had done so. Likewise, just 50-55% of companies with headcounts from 50 to 500 had invested in marketing analytics, but 65-90% of companies with headcounts from 500 to 5,000 had done so.

McKinsey research suggests that if analytics were made more accessible, even small enterprises could benefit and generate more than $100 million in new revenue (“Marketing & Sales Big Data, Analytics, and the Future of Marketing & Sales”, McKinsey & Co, 2015). Persistence modeling is just one piece of the big data analytics puzzle for retailers, but certainly an important one.

“Persistence modeling” of future sales: a new, easier way

Unobservable variability in the market makes it difficult to track sales trends over multiple years and to measure the impact of marketing campaigns. To model future sales properly, existing persistence modeling systems require intensive computational analysis across multiple variables, which has limited their use by managers, most of whom lack this computational expertise. Typically, a series of tests needs to be carried out to identify the endogenous and exogenous variables (the number of which varies by market) to use in the model. A recent paper published in the Journal of Business Research developed a new, simplified and more accessible approach to persistence modeling, termed the autoregressive model with drift (ARD).

The paper describes a system that reduces the number of variables needed for persistence modeling by substituting variables in the equation with simpler equations. It proposes an approximate maximum likelihood estimation procedure to estimate the values of three variables: the persistence coefficient (φ), the noise covariance term (q) and the market constant term (ω), which are then used to predict the number of goods sold. Other models typically rely on additional tests, such as Granger causality tests and unit root tests, which involve determining further autoregressive models and test statistics. The ARD, by contrast, models the market using only historical sales data, with no need for further statistical tests. Because many of the complex calculations of φ, q and ω can be done automatically in Excel, businesses no longer need to hire new employees trained in data analysis or purchase new software to carry out this analysis.
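To make the three quantities concrete, here is a minimal Python sketch of a first-order autoregressive model with a constant term. It is an illustration under stated assumptions, not the paper's procedure: it assumes the form y_t = ω + φ·y_(t-1) + noise with noise variance q, and it estimates the parameters by ordinary least squares (which matches conditional maximum likelihood when the noise is Gaussian) rather than the paper's approximate maximum likelihood method. The function names fit_ard and forecast_ard are hypothetical.

```python
import random


def fit_ard(sales):
    """Estimate (omega, phi, q) for the ASSUMED model
    y_t = omega + phi * y_{t-1} + noise, using ordinary least squares.
    This is a stand-in for the paper's estimation procedure, not a copy of it."""
    x = sales[:-1]  # previous-period sales, y_{t-1}
    y = sales[1:]   # current-period sales, y_t
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("sales are constant, so phi cannot be estimated")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx                   # persistence coefficient
    omega = mean_y - phi * mean_x     # market constant term
    residuals = [b - (omega + phi * a) for a, b in zip(x, y)]
    q = sum(r * r for r in residuals) / n  # noise variance estimate
    return omega, phi, q


def forecast_ard(last_value, omega, phi, horizon):
    """Roll the fitted equation forward `horizon` periods from the last observation."""
    forecasts = []
    level = last_value
    for _ in range(horizon):
        level = omega + phi * level
        forecasts.append(level)
    return forecasts


if __name__ == "__main__":
    # Synthetic daily sales for illustration only; not data from the paper.
    random.seed(0)
    sales = [100.0]
    for _ in range(364):
        sales.append(20.0 + 0.8 * sales[-1] + random.gauss(0.0, 5.0))
    omega, phi, q = fit_ard(sales)
    print("omega:", round(omega, 2), "phi:", round(phi, 3), "q:", round(q, 2))
    print("next 7 days:", [round(v, 1) for v in forecast_ard(sales[-1], omega, phi, 7)])
```

The same regression can be reproduced in a spreadsheet by regressing each day's sales on the previous day's sales, which is why a model of this kind needs nothing beyond historical sales data.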

To test their model, the authors used a large Walmart dataset with five years of sales for three categories of products (foods, hobbies and household items) and assessed how accurately the model predicted the number of items sold over an 84-day sales period. They then compared the accuracy of the ARD's predictions with that of four widely used, more complex models: (a) the automatic ARIMA modeling algorithm, (b) the exponential smoothing approach, (c) the THETA method and (d) the forecast combination method, which combines the previous three models. Accuracy was measured using the Mean Absolute Scaled Error (MASE) and the Root Mean Squared Scaled Error (RMSSE), and on these measures the ARD model outperformed the other models overall. It showed the best accuracy on RMSSE and was the best for most, but not all, of the categories on MASE. Certain subcategories of goods, such as foods and household items, were better predicted by other models according to the MASE score, but there was only a minor difference in the MASE and RMSSE scores between the best model and the ARD. The ARD therefore appears to be at least comparable to other competitive models, with the added benefit of being simpler to implement.
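For readers unfamiliar with the two accuracy scores, the sketch below computes MASE and RMSSE using their commonly used definitions: the forecast error is divided by the in-sample error of a naive "tomorrow equals today" forecast. Whether the paper scales in exactly this way is an assumption, and the function names and sample figures are hypothetical.

```python
import math


def _naive_scale(train, power):
    """Mean of |y_t - y_{t-1}| ** power over the training history.
    This is the in-sample error of a one-step naive forecast."""
    diffs = [abs(b - a) ** power for a, b in zip(train[:-1], train[1:])]
    if not diffs:
        raise ValueError("training history needs at least two observations")
    scale = sum(diffs) / len(diffs)
    if scale == 0:
        raise ValueError("training history is constant, so the scale is zero")
    return scale


def mase(train, actual, forecast):
    """Mean Absolute Scaled Error: forecast MAE divided by the naive in-sample MAE."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / _naive_scale(train, 1)


def rmsse(train, actual, forecast):
    """Root Mean Squared Scaled Error: sqrt of forecast MSE over naive in-sample MSE."""
    mse = sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    return math.sqrt(mse / _naive_scale(train, 2))


if __name__ == "__main__":
    history = [10, 12, 11, 13]   # illustrative past sales
    actual = [14, 15]            # what was really sold afterwards
    forecast = [13, 17]          # what a model predicted
    print("MASE:", mase(history, actual, forecast))
    print("RMSSE:", rmsse(history, actual, forecast))
```

For both scores, lower is better, and a value below 1 means the forecast beat the naive benchmark on average. RMSSE penalizes large misses more heavily than MASE, which is one reason the two scores can rank models differently.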


Computation within the power of Excel

Despite the promising benefits of this approach to persistence modeling, it is important for retailers looking to implement big data analytics to take the following into consideration:

  1. Day One: An Excel template for forecasting the market for your business is readily available in the article, so managers should read the paper for themselves and explore how easy the model is to use in practice.

  2. After: Strong data management is key to ensuring an accurate record of sales. The analysis requires detailed records of item categories and their sales across the seasons of the year in order to identify seasonal trends and, in turn, effective marketing strategies. Larger datasets make long-term trends easier to identify, so the model will be most beneficial for companies with many years of sales records, which allow the values of the three variables to be determined more accurately. Consistently comparing the actual state of the market in real time with the values predicted by the model will also be key as managers begin using this system to guide their business.

  3. Finally: While the model has been demonstrated for basic consumer goods, it has not been thoroughly tested for specialty retailing such as luxury or clothing items. Those considering this model to guide their marketing decisions are advised to apply it first to one sub-category of products or services rather than using it immediately to guide all marketing decisions in the company. Results for that sub-category can then be compared with a different (but ideally comparable) category where persistence modeling was not used to guide marketing decisions, to judge whether the model predicts market trends accurately.
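The ongoing comparison of predicted and actual sales recommended above can be kept very simple during a pilot. The sketch below is one possible way to do it: it computes the total absolute error as a share of total actual sales and flags the days where the miss was unusually large. The function name tracking_report, the 20% tolerance and the sample figures are hypothetical choices for illustration, not values from the paper.

```python
def tracking_report(actual, forecast, tolerance=0.20):
    """Compare forecast sales with actual sales for one pilot sub-category.

    Returns the weighted absolute percentage error (total absolute error
    divided by total actual sales), the 1-based days whose error exceeds
    `tolerance` times the average daily sales, and a flag suggesting the
    model be reviewed. The tolerance is an assumed threshold."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and the same length")
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("no sales recorded in the comparison period")
    wape = sum(abs(a - f) for a, f in zip(actual, forecast)) / total_actual
    mean_actual = total_actual / len(actual)
    flagged_days = [
        day
        for day, (a, f) in enumerate(zip(actual, forecast), start=1)
        if abs(a - f) > tolerance * mean_actual
    ]
    return {
        "wape": wape,
        "flagged_days": flagged_days,
        "needs_review": wape > tolerance,
    }


if __name__ == "__main__":
    # Illustrative figures only.
    actual_units = [100, 110, 90, 100]
    forecast_units = [98, 105, 120, 100]
    print(tracking_report(actual_units, forecast_units))
```

Running the same report on the pilot sub-category and on the comparison category gives managers a like-for-like view before extending the model to more of the business.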

____________________________

Swathi Jeedigunta earned a PhD in Molecular Genetics from the University of Toronto and is a member of the Graduate Management Consulting Association (GMCA Canada). The research applications proposed in this article are solely the views of the author and do not necessarily reflect the views of the original academic journal article authors nor any individual member of our Editorial Board.
