How Much “Signal” is there in Your Brand Tracker? Part I
Posted: 15/02/2017
Opinion polls didn’t exactly have a great reputation to begin with. Then, during the Scottish referendum, the Brexit vote and the Trump election, the polls got it wrong. For such knife-edge votes, the polls were simply too noisy to be useful, despite the comparatively high accuracy levels reported by the poll providers.
The problem for Marketing Directors and Brand Managers is that this same polling noise also wrecks the accuracy of their brand trackers. From month to month these trackers bounce around, undermining their credibility and teaching the business not to take them too seriously. The first step towards a solution is to understand the various causes of this noise. So imagine our excitement when one of our clients gave us access to data from two parallel trackers with overlapping question sets, providing a unique opportunity to explore this very question.
The graphic shows how the reliability of the overlapping questions in these two trackers deteriorates as we introduce different sources of noise. First, we divided the participant-level data from just the larger tracker into two samples. Given the sample size and identical survey conditions, the R-Squared between these two sets of brand ratings, a measure of the agreement between the two samples, came out at a reassuring 100%. We then re-ran the analysis, changing how the data were sampled so as to measure the impact of different types of noise:
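For readers who want to run this kind of split-half check on their own tracker data, here is a minimal sketch in Python. The respondent_id, brand and rating column names are our own illustrative assumptions, not the client’s actual schema:

```python
import numpy as np
import pandas as pd

def split_half_r2(df: pd.DataFrame, seed: int = 0) -> float:
    """Randomly split respondents into two halves, average the brand
    ratings within each half, and return the R-Squared between the
    two resulting sets of brand images."""
    rng = np.random.default_rng(seed)
    respondents = rng.permutation(df["respondent_id"].unique())
    half_a = set(respondents[: len(respondents) // 2])

    in_a = df["respondent_id"].isin(half_a)
    means_a = df[in_a].groupby("brand")["rating"].mean()
    means_b = df[~in_a].groupby("brand")["rating"].mean()

    # Series.corr aligns the two sets of means on brand before correlating
    return means_a.corr(means_b) ** 2
```

With thousands of respondents in each half, as in the larger tracker, this statistic should sit very close to 100%.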
- External Priming: The second bar relates to the images generated using data drawn from consecutive days. The observed 10-point drop in R-Squared is not due to movements in the actual underlying brand image [1]. Rather, the image estimates across the two days aren’t perfectly correlated because people’s responses are being distorted by extraneous factors that vary from day to day, such as general mood. This so-called ‘Sunny Day’ effect explains why all the ratings in our trackers showed statistically significant increases during the 2012 London Olympics.
- Sample Size: Instead of using thousands of respondents, we then lowered the sample size to a few hundred. This introduces the type of sampling error cited by pollsters and provides a useful benchmark against which to compare the other two noise sources: the smaller the sample, the greater the risk of collecting an unrepresentative set of opinions. Accordingly, the observed R-Squared between the brand ratings falls by another 20 points.
- Survey Priming: Finally, the datasets were drawn from the two different trackers, so there are now three sources of discrepancy: the two datasets relate to different days, use the reduced sample size and come from two different surveys. The two surveys were broadly similar but not identical (e.g. one includes a section on media exposure and the other a section on product experience). This final change delivers the biggest hit, with the R-Squared falling a further 36 points. The different survey contexts prime the participants and distort their responses to otherwise identical questions. The toy simulation after this list sketches how the three effects stack up.
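The simulation below is our illustration, not the client’s data: the variance parameters are invented purely to show the mechanism, so the exact numbers it prints won’t match the chart. It generates pairs of samples around a fixed set of true brand images, adds day-level and survey-level shifts, and reports the R-Squared between the two sets of sample means:

```python
import numpy as np

rng = np.random.default_rng(42)
N_BRANDS = 30
true_image = rng.normal(5.0, 0.5, N_BRANDS)  # fixed underlying brand images

def sample_means(n, day_sd=0.0, survey_sd=0.0):
    """Mean rating per brand for one sample of n respondents, with
    optional day-level ('Sunny Day') and survey-context distortions."""
    day_shift = rng.normal(0.0, day_sd, N_BRANDS)
    survey_shift = rng.normal(0.0, survey_sd, N_BRANDS)
    ratings = true_image + day_shift + survey_shift + rng.normal(0.0, 2.0, (n, N_BRANDS))
    return ratings.mean(axis=0)

def r2(x, y):
    return np.corrcoef(x, y)[0, 1] ** 2

# Same day, same survey, thousands of respondents: near-perfect agreement
print(r2(sample_means(5000), sample_means(5000)))
# Consecutive days: day-level priming shifts each sample independently
print(r2(sample_means(5000, day_sd=0.15), sample_means(5000, day_sd=0.15)))
# ...plus only a few hundred respondents: sampling error kicks in
print(r2(sample_means(200, day_sd=0.15), sample_means(200, day_sd=0.15)))
# ...plus different survey contexts: the biggest hit of all
print(r2(sample_means(200, day_sd=0.15, survey_sd=0.35),
         sample_means(200, day_sd=0.15, survey_sd=0.35)))
```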
[Figure: Sources of Tracking Error]
So there are two main headlines from this analysis. First, as a rule of thumb, whatever estimation error you are quoted on a brand image (i.e. the sample-size-driven noise), double or triple it to get a better sense of the actual image uncertainty, because the 10 points of day-level noise and 36 points of survey-context noise sit on top of the 20 points of sampling error. In many cases this revised error margin will be so large as to render the observed brand images, and their implications for the business, nearly meaningless.
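The “double or triple” rule of thumb follows directly from those noise shares. A back-of-the-envelope calculation (our arithmetic, treating the R-Squared drops as additive variance shares):

```python
sampling_noise = 20      # R-Squared points lost to sample size alone
other_noise = 10 + 36    # day-level priming + survey-context effects
total_noise = sampling_noise + other_noise

# On the variance scale, total noise is over triple the sampling noise...
print(total_noise / sampling_noise)            # ~3.3
# ...which is roughly double on the error-margin (standard deviation) scale
print((total_noise / sampling_noise) ** 0.5)   # ~1.8
```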
Second, we’ve shown with this example that when you ask exactly the same question in two different surveys, the responses you get agree with an R-Squared of only 35%. That means the typical tracker response comprises roughly 35% real opinion and 65% noise. Accordingly, how to improve the signal-to-noise ratio of your brand tracker, and other fieldwork, is the subject of our next post.
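For the statistically minded, the 35/65 split follows from the standard parallel-test model (our framing, reading the reported R-Squared as the share of variance common to the two surveys). If each measured image is true opinion plus independent survey noise,

$$X_1 = S + E_1, \qquad X_2 = S + E_2,$$

then the expected agreement between the two measurements is

$$\frac{\operatorname{Var}(S)}{\operatorname{Var}(S) + \operatorname{Var}(E)},$$

so an agreement of 35% implies that roughly 35% of the variance in any single measurement is signal and the remaining 65% is noise.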
[1] Indeed, this approach was probably excessively cautious, given most brand images hardly move from one year to the next, let alone within 24 hours. The order in which the days were sampled was also randomised between the two samples, so each sample contained equal shares of first-day and second-day responses.