Not so long ago, when Nielsen was a research organization, I had a series of conversations with top executives of the company on a variety of research issues. We touched on a number of topics, including some of the audience measurement techniques being used by other research companies.
I was assured that Nielsen would never do IAG-type research or model demographic data (as Rentrak was beginning to do). Obviously, we all agreed, both were just bad research. Although these are two completely different audience measurement products, I mention them together because they indicate how far Nielsen has begun to stray from its long-standing position as the research gold standard.
Of course, Nielsen has since purchased IAG and re-named it TV Brand Effect (apparently without making any improvements). Now Nielsen has announced it will soon be modeling demographic data and incorporating it into its marketplace currency standard – namely its national peoplemeter ratings.
I’m reminded that at a Nielsen national client meeting six or seven years ago a top Nielsen executive got on stage and actually proclaimed that validating its research was no longer Nielsen’s number one concern. Rather, getting products to market would take precedence (and glitches would be dealt with as they arose).
And here we are. It’s a vastly different media world than it was just a few years ago, and I find myself for the first time considering whether modeling is the way to go. I’m leaning against it.
The Problem With National TV Samples in the 21st Century
Twenty years ago, everybody pretty much had the same viewing platforms and devices. When something new came out to either add additional channels or enhance the TV viewing experience, most people eventually got it. This remained true through 2000, as VCRs reached about 90% of TV homes (and DVRs were just being introduced).
In today’s media world, not everyone gets everything anymore. Nearly half the country still does not have a DVR. Television is a fundamentally different medium among DVR owners, even when they are watching live TV. In a few years, a third of all viewers might be heavy mobile TV viewers (mostly under 30), while the rest of the country may watch TV that way only occasionally. Similarly, viewing TV content online will be heavy among one segment of the audience, while the growing 45+ population might actually be watching more TV on a television set.
Those who watch (as opposed to merely have access to) TV content on multiple platforms will vary by content provider, platform, device, demographic group, income, life stage, and numerous other factors.
As viewing options expand, samples become less capable of measuring them.
The whole premise of Nielsen’s national TV sample is that you and your cohorts, defined by a dozen or so categories Nielsen has determined to impact TV viewing (income, geography, presence of kids, etc.), have similar viewing patterns.
This worked quite well from the 1960s (when three broadcast networks accounted for more than 90% of all TV viewing) through the 1990s (when the average household could still receive only about 30 channels and could watch TV content only on a television set). If I watched Seinfeld, and came into the office talking about being the master of my domain, virtually everyone (at least everyone Nielsen said was just like me) knew exactly what I was talking about. My viewing was indeed remarkably similar to that of my “cohorts”.
These days, simply because of the number of viewing options available, differing device ownership, and varying levels of streaming and DVR usage, examples of you and your “cohorts” watching different things (or the same things at different times) are commonplace.
I recently noted to some of my friends that the woman who played Sarah Connor in the new Terminator movie is one of the stars of Game of Thrones. None of them watches that show, and none of them knew who she was.
One of my favorite shows is TNT’s The Last Ship. A friend, who even 5 or 10 years ago watched almost all the same shows as I did, has never seen it.
If I, and one of my cohorts, watch the same program, but he time-shifts it to tomorrow, and I watch it live today, how is a sample supposed to project that?
Today there are just too many niche and time-shifted viewing options for anything less than huge (I mean really huge) samples to accurately measure program ratings. I might love sci-fi programming and comic books (which I do), while my neighbor, who Nielsen says fits my profile in every measurable way, is a football and basketball junkie. Well, we both might be watching the same number of TV hours per week, and even the same amount of live versus time-shifted viewing. But I might be watching The Walking Dead, The Flash, and old Star Trek: The Next Generation episodes, while my neighbor has ESPN, the NFL Network, or NBA League Pass on every night. Nielsen might accurately project that men of a certain age, income, and family size are watching television, but it may no longer be able to distinguish what they are actually watching.
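To put rough numbers on that intuition, here is a minimal simulation sketch in Python. The population size, the 25,000-person panel, the 0.5% niche program rating, and the 1,000-person demo cell are all my own illustrative assumptions, not Nielsen’s actual figures; the point is only that total usage projects from a panel almost perfectly, while a niche program’s rating within a narrow demo does not.

```python
# A minimal sketch, not Nielsen's methodology: why a panel that projects
# total TV usage accurately can still be very noisy at the program level.
# The population size, panel size, 0.5% niche rating, and 1,000-person
# demo cell are all illustrative assumptions.
import random

random.seed(42)

POPULATION = 1_000_000   # hypothetical adults in the target universe
PANEL = 25_000           # hypothetical national panel
NICHE_RATING = 0.005     # assumed true rating of a niche program (0.5%)

# Weekly TV hours per person: ~20 on average, with wide individual variation.
hours = [max(0.0, random.gauss(20, 10)) for _ in range(POPULATION)]
# Whether each person watched the niche program.
watched = [random.random() < NICHE_RATING for _ in range(POPULATION)]

true_hours = sum(hours) / POPULATION
true_rating = sum(watched) / POPULATION

# Project both numbers from a single random panel.
panel = random.sample(range(POPULATION), PANEL)
est_hours = sum(hours[i] for i in panel) / PANEL
est_rating = sum(watched[i] for i in panel) / PANEL

# The same program rating estimated from only the panelists who fall into
# one narrow demo cell (assume ~1,000 of them) is far noisier still.
cell = panel[:1_000]
cell_rating = sum(watched[i] for i in cell) / len(cell)

print(f"Total usage:  true {true_hours:.2f} hrs/wk, panel {est_hours:.2f} hrs/wk")
print(f"Niche rating: true {true_rating:.2%}, panel {est_rating:.2%}")
print(f"Demo cell:    estimated {cell_rating:.2%} from 1,000 panelists")
```

Run it a few times with different seeds: the total-hours estimate barely moves, while the demo-cell rating can easily come back 40% or 50% away from the truth.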
Nielsen samples are uncannily accurate in measuring total television usage for broad demographic segments. I was involved in spearheading the Council for Research Excellence’s landmark Video Consumer Mapping Study as co-chair of its Media Consumption and Engagement Committee. One of the study’s lesser-known findings was that Nielsen’s overall usage levels for households and broad demographic segments, such as adults 18-49, were remarkably similar to the observed behavior. Once you started to look at narrower age groups, however, the reported Nielsen data strayed significantly from the observed viewing data. And this was just overall TV usage, not individual program ratings.
More recently, when I was head of Research at ION, we would often ask Nielsen how an hour-long show like Criminal Minds could possibly lose half of its adult 25-54 audience in the second half hour while gaining 40% among adults 18-24. The next day it might be the exact opposite. They would diligently analyze the data and then tell us that 10 people in the sample had switched to something else, and that this resulted in a reported ratings decline of 48%. There was really nothing we could do about it other than make Nielsen run the same analysis every time we noticed audience fluctuations that defied common sense. This is certainly not unique to ION. If you scrutinize the daily cable ratings of virtually any network, you’ll see many “illogical” fluctuations. Small ratings and small samples mean greater statistical error. If I were at a cable network now, I would favor almost anything that would dramatically increase sample sizes.
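To make the arithmetic concrete, here is a back-of-the-envelope sketch of that kind of swing. The in-tab panelist counts and the per-panelist projection weight are illustrative assumptions of mine, not actual ION or Nielsen figures.

```python
# Back-of-the-envelope sketch of the example above. The in-tab panelist
# counts and the per-panelist projection weight are illustrative
# assumptions, not actual Nielsen figures.
first_half = 21                  # panelists in the demo watching the first half hour
second_half = first_half - 10    # ten of them tune away
weight = 5_000                   # hypothetical persons each panelist represents

decline = 1 - second_half / first_half
print(f"Reported audience: {first_half * weight:,} -> {second_half * weight:,} "
      f"({decline:.0%} decline)")
# Reported audience: 105,000 -> 55,000 (48% decline)
```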
The Problem With Modeling
The problem with modeling is that it is not good research for measuring viewing behavior, no matter how good the modelers are. And calling them “scientists” doesn’t make them better researchers. We are supposed to accept that modeling viewing behavior from Nielsen’s national 25,000 peoplemeter sample onto 13,000 local TV set meters will effectively double the national sample because Nielsen tells us the “math works out.” It’s not quite just making up numbers, but it’s close. You can model the overall reach of a network or group of networks, but not individual average program ratings or time spent viewing. In other words, overall viewing is much more predictable; specific viewing much less so.
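To illustrate why, here is a deliberately naive sketch of modeling demographics onto household tuning data that carries no persons information. The 80/20 “truth” and the 50/50 assignment rule are purely hypothetical and describe no one’s actual model; they simply show how aggregates can come out right while program-level (here, genre-level) demos come out wrong.

```python
# A deliberately naive sketch of demographic modeling on household tuning
# data that has no persons information. The 80/20 "truth" and the 50/50
# assignment rule are illustrative assumptions, not anyone's actual model.
import random

random.seed(7)

HOUSEHOLDS = 100_000
# Assumed truth: in each one-man/one-woman household, sports tuning is done
# by the man 80% of the time and drama tuning by the woman 80% of the time.
TRUE_MALE_SHARE = {"sports": 0.8, "drama": 0.2}

true_male = {"sports": 0, "drama": 0}
modeled_male = {"sports": 0, "drama": 0}

for _ in range(HOUSEHOLDS):
    for genre in ("sports", "drama"):   # one tuning event per genre per household
        if random.random() < TRUE_MALE_SHARE[genre]:
            true_male[genre] += 1
        # The model cannot see who held the remote, so it assigns 50/50.
        if random.random() < 0.5:
            modeled_male[genre] += 1

print(f"Total male tuning: true {sum(true_male.values()):,}, "
      f"modeled {sum(modeled_male.values()):,}")
for genre in ("sports", "drama"):
    print(f"{genre:>6}: true male share {true_male[genre] / HOUSEHOLDS:.0%}, "
          f"modeled {modeled_male[genre] / HOUSEHOLDS:.0%}")
```

A real model would be far smarter than a coin flip, but the basic limitation remains: whatever the model cannot observe at the person level gets pulled toward the average, and program-level demos are precisely that kind of detail.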
If the sample itself is not adequate in projecting viewing behavior of the population at large, what’s the point of modeling data based on that sample? Well, if we think that the modeled data won’t be any worse than the current sample’s data (I’m not sure), then effectively increasing the sample size is a good thing. That’s a big if.
I would rather have Nielsen’s experts model demographic data from its existing samples than have some other company do it. For example, I recently was evaluating Rentrak on behalf of ION Media Networks, and I questioned how they were modeling their demographic ratings (since with set-top ratings there is no persons data). I questioned them because the numbers they were producing did not look right to me. When I saw what they were doing, it was so ridiculous, and produced results that were so absurd, it made me question everything else they were doing (even though most of their other products seemed fine). The fact that they told me I was the first person to question their methodology led me to believe that no one else was actually using Rentrak’s “demographic” ratings (other possible reasons were too depressing for me to consider).
At least with Nielsen, they will be continuously examining the data to make sure there is some logic to the results.
Why No Industry Consensus?
So why couldn’t the Video Advertising Bureau (VAB) arrive at a consensus as to whether modeling is a good idea? I wasn’t in those meetings, but I suspect the reason is no one has any idea how this will impact Nielsen’s reported ratings. Traditionally, the networks have favored any methodological change if they believe their ratings will go up. But the unknown is scary and could negatively impact their business. Improvements to TV research methodology often cause broadcast ratings to move in one direction and cable ratings the other (the higher rated tend to decline, the lower rated tend to rise). Now that broadcast and cable networks are part of the same organization, those natural divisions may have been highlighted in their discussions.
When I was on the agency side (I was EVP of Research at Magna Global), it was traditionally the media agencies that decided what research would be used as buying and selling ratings currency. That no longer seems to be the case. Ten years ago, Nielsen would have been laughed out of the room if they proposed modeling demographic data (of course, 10 years ago they never would have even considered it).
I was one of the founding members of the Council for Research Excellence, which should be advising Nielsen on how to proceed here. This type of thing is the CRE’s entire reason for being. Let’s not forget it was created to pre-empt government intervention in Nielsen’s measurement process. The purpose of the CRE, to quote from its website:
“Methodological research is concerned with the accuracy of audience measurements and the effects of possible changes in methods. It provides the foundation for valid, reliable and credible audience measurement.
The Council for Research Excellence is intended to give Nielsen's client base greater voice in the design and execution of methodological research.”
I haven’t heard anything from the CRE on this yet, but a POV distributed to the industry would be a great idea. CRE members come from all sides of the industry, and its seal of approval would give Nielsen’s plan a credibility that will otherwise be lacking, no matter how much “impact data” Nielsen releases. And if the CRE says that Nielsen needs to reconsider its plans, that is the one thing that might influence Nielsen to stop, or at least pause.