Some ad copy-testing lessons for content copy-testing

Copywriters of ads have long been wary of how well ad copy-testing works–and not just because, like the rest of us, they are sensitive to criticism of their work. Should the writers of informational digital-content copy also be wary of copy-testing? Do the problems with ad copy-testing apply to content copy-testing?

Even top researchers (e.g., Arthur Kover, Journal of Advertising Research, 1996, 36:2) have acknowledged that ad copywriters had legitimate concerns about how well ad copy-testing can indicate which version of an ad will be more effective, or even whether the copy needs improvement. Most of the copywriters’ objections boil down to two: the survey environment is (1) too distraction-free and (2) too rational compared to the environment where ads are consumed. Let’s look at the thinking behind those two concerns.

  • Copywriters have said that “the survey environment of copy-testing is too different from the distraction-filled environment in which the copy appears in real life.” This was unquestionably an issue when most of the copy being tested consisted of ads. In real life, ads often appear peripherally in cluttered settings or as undesired interruptions to the content people have opted to consume, so copywriters go to great lengths to make the interruption grab viewers’ attention. Early copy-testing methods rarely re-created these cluttered settings and thus underestimated the value of attention-getting ads. Later research methods partly solved this problem by leading respondents to believe that they would be asked about the TV or magazine content they were to view (thus distracting respondents from the ads), but then asking them about the ads. Regardless, insufficient distraction is almost a non-issue when copy-testing digital content. In the real world, consumers opt to read digital content, usually by clicking a headline link or a link in an e-newsletter. This ensures that the content will receive some level of attention. By the time the person has opted in, attention-getting has already been achieved (presumably by the content’s headline, which is harder to test accurately via a survey). So survey pre-testing does not need to distract respondents away from the content, because respondents will generally not be distracted away from digital content when consuming it in the real world.
  • And copywriters have said that “the research process of filling out pages full of checkboxes evokes excessively rational responses.” This complaint was especially pertinent when the copy being tested consisted of full-screen TV ads whose central thrust often hinged on visual- and music-driven emotional appeal. Much Web content is different. Even though the long-lasting success of Web content also depends on its emotional appeal—its ability to tell a story that deeply resonates—much Web content seeks to appeal as much to reason as to emotion. Pre-testing is different these days, too. A fair amount of digital-content copy pre-testing occurs not in mall intercepts or telephone interviews, but on the Web. Gone is the experience of having to take pen to paper or answer questions directly to an interviewer. Now, the click-and-progress process of completing an online survey is fairly similar to how consumers progress from one piece of content to the next on the Web. In other words, both Web content and the process of getting to that content already put users into a relatively rational mode that isn’t all that different from how users complete a survey online. So the excess rationality of surveys relative to digital content is much less than their excess rationality relative to TV ads. Nevertheless, the excess rationality of surveys remains an issue to be guarded against. Researchers are working to improve the online survey environment to counter it, while taking care not to introduce unusual screen backgrounds and interactions that, independent of the content being tested, would create their own dynamics and skew results.


In sum, at least one of the copywriters’ major concerns with ad copy-testing does not seem to apply to digital copy-testing: the lack of distraction in the survey environment no longer seems a problem, because people opt to read digital content rather than being interrupted by it. The other concern—that the survey process is too rationalistic—seems less severe than with TV ads, because much Web content hinges on rational appeals. But this latter concern persists. When the content being tested is clearly making emotional appeals, good researchers will know to rely less on purely quantitative results and look toward qualitative-research learnings, perhaps obtained earlier in the content-development process, or from open-ended or oblique emotion-detecting questions included in the formal copy-testing.


Market segmentation for the Web

Market segmentation traditionally assumes that people can be segmented into groups that can be separately marketed to, based on their different product needs. But suppose, as often happens on the Web, the customer arrives in your midst—on your site—before he or she has any inkling of his or her different product needs. And suppose what happens next—to that customer on your site—will strongly influence how that customer ends up perceiving his or her particular product needs. Now, what is your segmentation strategy for this Web stage of the game?

For that stage, I endorse the segmentation approach advocated by many theorists of Web design: segment your visitors by (a) what is unlikely to change as the result of their visit, e.g., lifestyle profile and prior level of experience with your product, and (b) their content preferences.

Some aspects of their content preferences will be unaffected by their visit (e.g., a general preference for graphics over statistics), but other aspects will be affected (e.g., as buyers learn which features matter in a product category, their interest in which products possess those features will increase). In particular, segmenting your non-current-customer Web visitors primarily by their content preferences seems to make a lot of sense.

This way, your Web designers and copywriters can generate Web pages and content that appeal to the personas, defined with heavy reference to their content preferences, that represent those key segments.
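Persona-based segmentation can be made concrete in a few lines of code. Here is a minimal sketch, with invented preference dimensions and persona profiles, that assigns each visitor to the persona whose content-preference profile is closest:

```python
import math

# Hypothetical content-preference scores per visitor, each on a 0-1 scale:
# (preference for graphics over text, preference for depth over overview)
visitors = {
    "v1": (0.9, 0.2),
    "v2": (0.1, 0.8),
    "v3": (0.8, 0.3),
}

# Personas defined by their typical content preferences (assumed values).
personas = {
    "visual_skimmer": (0.85, 0.15),
    "analytical_reader": (0.15, 0.85),
}

def nearest_persona(prefs):
    """Assign a visitor to the persona with the closest preference profile."""
    return min(personas, key=lambda name: math.dist(prefs, personas[name]))

segments = {v: nearest_persona(prefs) for v, prefs in visitors.items()}
```

In practice the preference scores would come from observed content choices or survey answers, and the persona profiles from cluster analysis of those data; the fixed numbers above are purely illustrative.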


Avoiding multicollinearity through conjoint analysis

The premier research companies have helped to make conjoint analysis and its sequel, discrete choice analysis, into popular survey-design research methods. These methods ask respondents to evaluate a series of products or, in special applications where the attributes center on positioning rather than product features, a set of brands.

Each product is generally presented as a bundle of listed attributes, e.g., a series of notebook computers, each with a different list of memory size, processor speed, screen size, price, and brand. In conjoint analysis, respondents are asked to rate or rank each of the products (not the attributes, but the overall products) on some desired outcome variable, such as likelihood to purchase. In discrete choice analysis, respondents are generally asked to choose (rather than rate) one product each from several sets of competing products.

The analysis of the survey’s results reveals how much the addition or subtraction of a particular attribute would affect the preference for the overall product. After calibration with historical results and inclusion of cost factors, researchers can then use these results to predict the changes in market share or profits that would occur as the result of changing the product’s attributes (or the brand’s messaging).

Research clients like such methods because they seem very “real world” compared to prior methods. Prior methods, still very much in use today, asked respondents to rate the importance of each attribute to their purchase decisions, rather than to choose or rate overall products the way consumers do when preparing “short lists” or shopping.

Nevertheless, from an analytical point of view, the greatest benefit of conjoint/discrete-based survey designs is that they allow the researcher to avoid the problem of multicollinearity. If the objective of the research project is to figure out how important each factor (or product attribute) is to a market segment’s perception of a product or brand, it’s frustrating to learn “because the factors have multicollinearity, i.e., generally appear in the company of each other or change levels simultaneously (e.g., people who say quality is important are less likely to say that price is important), we can’t estimate the independent impact of each factor.” Conjoint and discrete analysis almost entirely avoid this problem by asking respondents to evaluate products whose attribute levels are varied independently of one another, so that no two attributes are systematically paired. For example, to figure out the independent impacts of price and quality, respondents can be shown products that have both high quality and high price, high quality but low price, and low quality but high price. It is this capacity of conjoint and discrete analysis to isolate the impact of each attribute that makes them such powerful aids to decision making–of the businesses marketing the products, not the respondents taking the survey.
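The mechanics can be sketched in miniature. In the toy example below (attribute names and utility numbers are invented for illustration), a full 2×2×2 factorial design makes the attribute columns uncorrelated, so each attribute’s part-worth can be recovered independently as a simple difference of means:

```python
from itertools import product
from statistics import mean

# Full factorial design: every combination of attribute levels appears once,
# so attribute columns are uncorrelated (no multicollinearity by construction).
attributes = ["high_quality", "high_price", "brand_a"]
profiles = list(product([0, 1], repeat=3))

# Hypothetical "true" preferences used to simulate a respondent's ratings.
def rating(profile):
    q, p, b = profile
    return 5.0 + 2.0 * q - 1.5 * p + 0.5 * b

ratings = {profile: rating(profile) for profile in profiles}

# Because the design is orthogonal, each attribute's part-worth is simply the
# mean rating at its high level minus the mean rating at its low level.
def part_worth(i):
    high = mean(r for prof, r in ratings.items() if prof[i] == 1)
    low = mean(r for prof, r in ratings.items() if prof[i] == 0)
    return high - low

effects = {attr: part_worth(i) for i, attr in enumerate(attributes)}
# effects recovers the simulated utilities: quality +2.0, price -1.5, brand +0.5
```

Real studies use fractional factorial designs, many respondents, and regression or hierarchical Bayes estimation rather than raw mean differences, but the orthogonality idea is the same.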


Reading on integrating experimental and observational research design

Because people stumble into the career of market research from so many other fields, many books on research and survey design need to be written at a basic level.

The books listed below, though, are different. Many leap to a level of mental gymnastics that some of us haven’t experienced since college. But the exercise can be good for you! Even if you initially only understand 10% of it, that 10% can set you free to rise above the mundane.

Unfortunately, since I haven’t finished reading these books myself, I can’t yet give you a full mapping of which fit together, which are too tough, and which might be skipped over.

  • Leslie Kish, Survey Sampling (1965) [Classic text for dealing with imperfect sampling (non-response, lack of coverage) and complex sampling (multi-stage and multi-level)]
  • Kerlinger and Lee, Foundations of Behavioral Research (1999) [Comprehensive text that reaches from the past almost into the present]
  • Shadish, Cook, and Campbell, Experimental and Quasi-Experimental Designs for Generalized Causal Inference (2002) [The best book, and the one that new methods aspire to beat]
  • Paul R. Rosenbaum, Observational Studies (2002) [The new generation: propensity-scoring…]
  • Donald B. Rubin, Matched Sampling for Causal Effects (2006) [Key thinker of new generation]
  • Judea Pearl, Causality: Models, Reasoning and Inference (2000, 2009) [Coming out of the field of computer science, Pearl writes almost as a mathematical philosopher, challenging both the new generation and the old, and giving you a new understanding of structural equation modeling]

Merging Surveys and the Experimental Method

Designing market research to learn what we want from participants is already hard enough. Why do we have to compound the challenge by introducing “advanced methods” or “analytics”? Finish writing one report on a survey with the feeling “we still don’t really know which actions to take,” and you’ll sense the answer. Survey research does a terrific job of gathering descriptive information but, even with regression and correlation, it has a hard time confidently revealing what causes what.

Faced with this apparent defect of surveys, it’s tempting to retreat from the survey to a series of one-off “marketing experiments.” Let’s try this home page. No, let’s try that. How about this change? But even if one of these marketing experiments gets lucky and leads to a jump in some performance metric, how do you generalize the result to other pages? You are left with the nagging feeling that you still don’t know enough about the “why we succeeded” to apply the learnings elsewhere. Suddenly the survey doesn’t look so bad because, by randomly sampling a target population or set of market choices, at least its results are generalizable.

The solution, as others have argued, is neither the descriptive survey nor a series of simple experiments, but research design that permits surveys to behave like generalizable experiments.
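A simple version of this idea is the split-sample survey experiment: respondents are randomly assigned one of two stimuli inside the survey, turning the comparison into a randomized experiment whose result generalizes to the sampled population. The sketch below uses invented counts; the analysis is a standard two-proportion z-test:

```python
from math import sqrt, erf

# Illustrative counts from a hypothetical survey experiment: respondents are
# randomly assigned one of two headline versions, then asked a yes/no question.
yes_a, n_a = 120, 400   # version A: 30% said they would read on
yes_b, n_b = 150, 400   # version B: 37.5% said they would read on

p_a, p_b = yes_a / n_a, yes_b / n_b
pooled = (yes_a + yes_b) / (n_a + n_b)

# Two-proportion z-test; random assignment makes the comparison causal.
se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se          # roughly 2.24 for these counts

# Two-sided p-value from the standard normal CDF.
p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
```

Because assignment to version A or B is random within a probability sample, the estimated lift is both causal (like a lab experiment) and generalizable (like a survey), which is exactly the combination the paragraph above argues for.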


Market Research Groups on LinkedIn

Members of LinkedIn have created a slew of “Groups” (moderated bulletin boards).  Unfortunately, the marketing and market-research groups with the largest memberships seem to attract the most entries from those just trying to win business rather than to raise substantive issues or share ideas. 

So far, I’ve found three LinkedIn groups to be potentially useful: Consumer Insights, Next Gen Market Research, and Marketing Science.  

Consumer Insights has, well, some genuinely insightful people.  I am especially pleased to see the interest not just in anthropological approaches, but in behavioral economics.

Next Gen seems to have strong enough management to fend off the promotional entries that clutter other larger groups.  If it keeps up the good work, it could become the strongest.

Marketing Science could get into research-design analytics better than the other marketing and research groups on LinkedIn.  (The “Marketing ROI and Effectiveness” group could be good, but seems to attract too much clutter.  The “Marketing Experiments” group has potential, but its tight connection to its namesake company may somehow hinder flourishing discussion.)

Outside of LinkedIn, the “Marketing Research Roundtable” often contains excellent professional thinking and expertise, especially on the statistical end of things.