
Chapter 3 Starting at the Beginning: Experimenting to Discover What Shape “Wins”

[Figure 3.1 labels: Twix, Kit Kat, Cookie Barz, Mounds, Snickers, Reese’s Cups, Milky Way, and Hershey Bar, each marked with its length, width, and height in inches (one candy is marked with a diameter and a height instead).]

Figure 3.1 Schematic of the eight candies to be tested, showing their dimensions.

Figure 3.2 Example of the stimulus for Mounds.

All three outcomes make the task just a little less clinical, less impersonal. It’s always good research practice to make the respondent feel comfortable at the start of the interview, and at that time sort out ambiguities about the stimuli and the scale.

The respondent evaluated each of the eight different “packages” of chocolate candies (really schematics of packages) in a unique random order. There is a reason behind this randomization. Most research with test stimuli presents them in a random order to avoid the problems that often occur when everyone tests the stimuli in one single order. For example, respondents tend to assign a higher rating to the stimulus tested first. This is the “tried-first” response bias. When you test the same stimuli in the same position, you will introduce this bias. The first stimulus will be “up-rated,” which gives a false reading for that stimulus. If it is a poorly accepted product, then you may overestimate how good it is, just because of that boost from the first position. The most prudent thing is either to have a “dummy” first stimulus, whose data you discard (a so-called training stimulus), or to rotate the order of the stimuli. Many researchers do both, beginning with a test stimulus to “teach the respondents what to do,” and then testing the stimuli in a rotated or randomized order. The data from the first stimulus are irrelevant. Often this first test stimulus does not even come from the set of stimuli, but is just something convenient to show in that first position.

Each candy appeared on the screen, one at a time, so-called monadically, or more correctly “sequentially monadic.” The respondent looked at the candy and pressed the appropriate number to show how strongly he felt about the candy, based only on the picture, schematic of the package, brand name, and, of course, the particular rating question.

Figure 3.3 Example of the stimulus for KIT KAT.

All of the concepts you are about to see refer to a:

CANDY BAR

Please take your time and evaluate each concept (screen) thoroughly. Once you have evaluated the concept, please enter your rating based on the following question. The entire concept should be rated as a whole. Please use the entire 1–9 scale to express your opinion.

How interested are you in purchasing this product?

PLEASE USE THE ENTIRE 1 TO 9 SCALE.

NOT AT ALL INTERESTED 1 2 3 4 5 6 7 8 9 VERY INTERESTED

It is not necessary to press <enter> after your rating.

Some clarification is in order here. Often you will read about the results of experiments. Sometimes these experiments will be done for reasons that won’t be totally clear from what you read. You might ask “Why perform the studies in such a cumbersome way, when you can get the results more easily by simply doing the study in a different, more direct way?” Chances are that when you read the results of a study and feel this way, the study was done in the way you are reading for reasons other than what is being immediately presented. And such is the case here. The study was done for a variety of other reasons. That is the reason for the unusual structure of the stimuli: brand name at the top, shapes and dimensions in the body of the stimulus. Here we discuss the study for didactic reasons, to illustrate how to gather and analyze the data.

Analyzing the Data: What Do We Look For?

Even before we begin to look at the substantive results (i.e., how the different candies performed), we might want to look at the way we measure acceptance. As you will see here and in other chapters throughout this book, we can take at least two different paths when we measure acceptance. We can look at the average level of feeling, the intensity of acceptance. Or we can look at a more “black and white” measure: accept or do not accept. The former, measuring intensity of feeling, comes from psychological science. The latter, accept or do not accept, comes from sociology. From sociology, this all-or-nothing membership in a group migrated into market research, a more applied discipline with a different intellectual history, a different perspective about what the numbers mean, and indeed a different worldview about the numbers that are really meaningful to look at.

Most scientists coming from either the “hard sciences” or “psychology” look at the intensity of feeling. That’s what’s captured in the rating scale we saw in Figure 3.4. The respondent who reads this question is thinking about the “intensity” of feeling, the degree of interest in this particular candy bar. The respondent doesn’t think of “groups of people,” nor “Do I belong in this group of acceptors or would I rather be put into a group of rejecters?” In fact, when we think more deeply about the way the individual respondent must approach the scale, we realize that the only way that respondent can answer is on the basis of his own personal opinion. Anything else, average intensity or percent membership in a group, must be a subsequent abstraction from the responses of many people.

One consequence of this intellectual heritage from the physical sciences and from psychology is that the appropriate measure of central tendency across many respondents is simply the mean or the median. The scale represents “amount of” and the average rating is the measure of central tendency (i.e., if nothing else were to be known about the product, then the average is the best guess). Now with that in mind, look at Figure 3.5. We see the arithmetic averages on the x-axis in this scatter plot. Each of the eight circles in the graph corresponds to one of the eight products. The abscissa or x-axis clearly shows that the eight products don’t cluster in one region, but rather distribute across the scale. Certainly there aren’t any poor candies (nothing averaging below 4.5), but on the other hand, that makes some intuitive sense. How can commercial chocolate candies be uninteresting?

Commercial research doesn’t really care very much about the intensity of a single person’s feeling, except when it comes to making changes to the product. The commercial researcher also doesn’t necessarily care about the average degree of acceptance, at least when the study is funded by marketing. A subtle change in focus takes place, but it is a change that is very important to highlight and always to keep in mind.

Although the individual respondent can only rate his or her feeling, the commercial researcher usually searches for the number or, more typically, the proportion of individual respondents who feel a certain way (i.e., those who exhibit a specific, predefined set of responses). In our case it is the percent of respondents who are interested in the product. In some cases it is the percent of respondents who are highly satisfied with the product, and in some other cases it is the percent of respondents who are dissatisfied with the product.

Operationally, there is a big difference between measuring intensity of feeling and assigning a person to the group who is satisfied. S.S. Stevens, Professor of Psychophysics at Harvard University, would go out of his way to hammer home this difference, proclaiming that “Nothing is quite as difficult in science as converting a continuous or reasonably continuous scale into a binary measure.” It sounds simple, but the thinking has to be clear. What is the rule by which we can convert this 9-point scale of a person’s interest into that person’s membership in the class of “I like or accept the stimulus” versus “I dislike or reject the stimulus”?

For our candy study, and as a matter of course, we arbitrarily chose the three highest ratings, 7–9, to represent a high degree of interest or acceptance. This is called the “top-3 box.” A person who rates one of the test stimuli 7, 8, or 9 is assumed to “accept” that stimulus (i.e., to fall into the acceptor group). A person who rates that stimulus 1, 2, 3, 4, 5, or 6 is assumed not to “accept” that stimulus. Thus, for each stimulus a person evaluates, that person can either accept or not accept the stimulus, or perhaps more colloquially like or not like the stimulus. We will not analyze degree of liking, but instead simply tally up the number of people who accept versus do not accept the stimulus.

Our specific choice, 7–9, is arbitrary. The top third of the scale is a fairly stringent measure. Of course we could make the criterion even more stringent, by looking only at the top two scale points (called top-2 box), or even the top scale point (top-1 box or top box). The reality is that the more stringent we make the criteria for “acceptance,” the stronger the acceptance has to be (which is a good thing), but the fewer the number of respondents in the pool (which is a bad thing). Therefore, looking at the top-3 box is a reasonable compromise. We have a reasonably strong level of interest that a person has to exhibit before being counted, but we don’t make the level of acceptance so high that we make the data very sparse by having very few acceptors for any test stimulus.

Armed with this “newer” way of looking at acceptance, namely counting the number of respondents who rate a candy as 7–9, let’s look at how these two measures covary. They should be reasonably correlated. The greater the number of respondents who like a candy (i.e., the more people who rate the candy as 7–9), the higher should be the average. We see this happy state of affairs in Figure 3.5. The abscissa or x-axis shows the mean rating, and the ordinate or y-axis shows the percent top-3 box. Each of the eight circles is one of the candies. The relation is almost perfect.

[Figure 3.5 axes: x-axis, Means (4 to 8); y-axis, % Top-3 Box (30 to 90).]

Figure 3.5 How the percent top-3 box (percent rating 7–9) covaries with the arithmetic average. Each circle corresponds to one of the eight candies. Although the two measures of acceptance correlate highly, they signify very different things about acceptance, and lead to different things that one can say about a product.

The approach of counting membership in the acceptor group rather than estimating average liking comes from sociology. Sociologists are interested in how many people exhibit a specific behavior, rather than being interested in the intensity of that behavior. So, when we deal with measures of acceptance throughout this book, for the most part we will deal with these percents, basing our approach on the sociological way of analyzing data. Market researchers have accepted that sociological approach and incorporated it into the way they think about problems. Parenthetically, when we deal with the consumer as a measuring instrument, to assess specific aspects or attributes of the package/product, we will revert back to the average or mean as the measure of central tendency.
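The two ways of summarizing a 9-point rating, average intensity versus membership in the acceptor group, can be sketched in a few lines of Python. This is only an illustration: the ratings below are made up and are not data from the candy study, and `top_box_percent` is a hypothetical helper name.

```python
from statistics import mean

def top_box_percent(ratings, k=3, scale_max=9):
    """Percent of ratings falling in the top k points of a 1..scale_max scale."""
    cutoff = scale_max - k + 1          # k=3 on a 9-point scale gives cutoff 7
    acceptors = sum(1 for r in ratings if r >= cutoff)
    return 100.0 * acceptors / len(ratings)

# Hypothetical ratings from ten respondents for one stimulus (illustration only).
ratings = [9, 8, 7, 7, 6, 5, 8, 3, 9, 7]

avg = mean(ratings)                      # intensity view: average rating
top3 = top_box_percent(ratings, k=3)     # membership view: percent rating 7-9
top2 = top_box_percent(ratings, k=2)     # stricter criterion: percent rating 8-9
top1 = top_box_percent(ratings, k=1)     # strictest criterion: percent rating 9

print(f"mean={avg:.1f} top3={top3:.0f}% top2={top2:.0f}% top1={top1:.0f}%")
# prints: mean=6.9 top3=70% top2=40% top1=20%
```

With these invented ratings, tightening the criterion from top-3 to top-1 box shrinks the acceptor group from 7 respondents to 2, which is the sparsity trade-off described for the choice of cutoff.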

What We Found

The most important thing in research is, of course, the results. That’s why we do the study in the first place. The issue is, however, what do we look for? The question itself sounds a bit strange. After all, most people feel that when doing research one ought to begin the effort with a well-formulated question, some knowledge about the types of answers that one might get, and of course the ability to move from the data one observes toward either confirming or rejecting the hypothesis. We are taught this “linear,” structured way of scientific thinking from the early grades when we learn about science. The same worldview pervades most of the work we do later on in our professional lives. Of course the reality is a bit different. We often have vague hypotheses, try to answer the questions, but always keep our eyes open for new, interesting side paths, results that intrigue and add valuable insights. And often we don’t even really need to have a simple hypothesis to prove or disprove. Rather, we enter the experiment looking to discover patterns. The search for meaning in nature, for patterns and systematics rather than hypotheses, is what drives us. And that’s what happened here.

When we began this specific research project on the shape of package designs for candy, we did not begin with a specific hypothesis to prove or disprove. Rather, we went in looking for patterns. The patterns that we seek tell us how nature works. We don’t know the regularities, but we’ll know them when we see them. For this specific study on chocolate, we are really looking for a simple pattern. That pattern can be described as “The way the eight products line up. Are they the same, or do they score differently?”

With that in mind, let us look at the summary data for the eight products, first for the total panel, and then look at the results from males versus females, and for adults versus teenagers. We are able to look at these “breaks” or “subgroups” because during the recruiting, we ensured that we had at least 40 teens and that the respondents were more or less divided evenly between male and female.

Let’s look now at the summarized data in Table 3.1. We see our eight candy packages, in the order of their “performance” (i.e., in descending order of top-3 box purchase intent). The results are quite remarkable, not so much for the performance of one or two products but because even within a popular product category we can see an enormous range of acceptance. Three of the candies perform exceptionally well (Reese’s Peanut Butter Cups, KIT KAT, and Twix). Two of the candies perform poorly (Mounds, Cookie Barz).

Consumer researchers are accustomed to looking at subgroups of respondents in addition to the data from the total panel. These subgroups are typically defined either by geo-demographics (e.g., gender, age, income, market), or by purchase behavior (e.g., category usage, brand used most often, and the like). In our candy study, all of the respondents were recruited to be category purchasers, so there is no question that we are dealing with the correct target sample. The exceptionally high performance of three products and the poor performance of two other products cannot, therefore, be traced to an unusual group of respondents. All respondents were appropriate for the study, and furthermore, all respondents evaluated every single one of the candies.

Table 3.1 Summary results from evaluation of pictures of eight candy bars, using a graphical display of the package. The numbers in the body of the table are percent top-3 box (ratings of 7, 8, 9) for purchase intent, rated on a 9-point scale. Each respondent rated all eight pictures.

Candy                         Total Sample  Males  Females  Teens 12–17  Adults 18–49
Reese’s Peanut Butter Cups              81     73       89           89            73
KIT KAT                                 75     73       77           72            79
Twix                                    75     72       78           78            72
Snickers                                67     64       71           63            72
Hershey’s Chocolate Bar                 61     55       67           57            65
Milky Way                               58     60       57           67            49
Cookie Barz                             41     37       46           44            38
Mounds                                  33     33       33           27            38
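One way to read the subgroup columns of Table 3.1 is to compute simple gaps in percentage points. The sketch below is an illustration using only the values printed in the table; the variable names are hypothetical. It says nothing about whether a gap exceeds random variability, since that would require subgroup sizes that the table does not report.

```python
# Percent top-3 box from Table 3.1: (total, males, females, teens, adults).
table_3_1 = {
    "Reese's Peanut Butter Cups": (81, 73, 89, 89, 73),
    "KIT KAT": (75, 73, 77, 72, 79),
    "Twix": (75, 72, 78, 78, 72),
    "Snickers": (67, 64, 71, 63, 72),
    "Hershey's Chocolate Bar": (61, 55, 67, 57, 65),
    "Milky Way": (58, 60, 57, 67, 49),
    "Cookie Barz": (41, 37, 46, 44, 38),
    "Mounds": (33, 33, 33, 27, 38),
}

# Spread of acceptance across the eight candies in the total sample.
totals = [row[0] for row in table_3_1.values()]
total_range = max(totals) - min(totals)

# Signed gaps in percentage points for each candy.
gender_gap = {name: row[2] - row[1] for name, row in table_3_1.items()}  # females - males
age_gap = {name: row[3] - row[4] for name, row in table_3_1.items()}     # teens - adults

print("range across candies:", total_range)
for name in table_3_1:
    print(f"{name:28s} F-M={gender_gap[name]:+3d}  teens-adults={age_gap[name]:+3d}")
```

The total-sample range of 48 points (81 for Reese’s Peanut Butter Cups versus 33 for Mounds) is the “enormous range of acceptance” the text refers to.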

By having each respondent evaluate all eight samples, we eliminate any bias that may be caused by the respondent. Each respondent serves as his own control. This powerful but relatively simple approach, having all respondents evaluate all samples, is called a “within-subjects design.” You will encounter this strategy many times during the course of the book. It is a simple precaution that, at once, increases the strength of the data by reducing a host of biases.
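A within-subjects design with a randomized presentation order can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function name `presentation_orders`, the seed, and the “practice item” training stimulus are all assumptions. Note also that independent shuffles make each order random but do not guarantee that no two respondents share the same order.

```python
import random

CANDIES = ["Twix", "KIT KAT", "Cookie Barz", "Mounds", "Snickers",
           "Reese's Peanut Butter Cups", "Milky Way", "Hershey's Chocolate Bar"]

def presentation_orders(n_respondents, stimuli, seed=0, training=None):
    """Return one randomized presentation order per respondent.

    Every respondent sees every stimulus exactly once (within-subjects).
    If `training` is given, it is shown first and its rating should be discarded.
    """
    rng = random.Random(seed)   # seeded so the plan is reproducible
    orders = []
    for _ in range(n_respondents):
        order = stimuli[:]      # copy, so the master list is never mutated
        rng.shuffle(order)
        if training is not None:
            order = [training] + order
        orders.append(order)
    return orders

# Hypothetical plan for five respondents, each starting with a practice item.
orders = presentation_orders(5, CANDIES, seed=42, training="practice item")
for i, order in enumerate(orders, start=1):
    print(i, order)
```

Placing a discarded training stimulus first and then shuffling the real stimuli combines the two precautions against the tried-first bias: no test candy systematically occupies the first position.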

Now let’s look at what the data tell us. We will look at our statistic, the percent top-3 box. Keep in mind that this statistic tells us the proportion of individuals who we classify as acceptors for each candy, based on how they rate the picture of the product and the schematic of the package.

What do we see? Of course we see the data in a neat tabular form! But, again, what do we really see? Certainly we see some differences. Yet, being realistic, can we legitimately say that we see differences among groups, or are the differences that we see merely the result of random variability that always occurs in scientific

Moskowitz et al., 2006; Meilgaard et al., 2007; Meullenet et al., 2007). We know that people don’t agree with each other, that there is interpersonal variation, and that what one person likes another person may not like. The notion of percent top-3 box brings that idea to life. Since we focus on the percent of the respondents who rated a candy 7–9 on a 9-point scale, we instantly realize there are the others who did not rate the candy 7–9.

We saw no differences in general patterns when we looked at the summary statistics, by candy, across gender and across age (see Table 3.1). So, how should we proceed? One way is to plot the distribution of the ratings for each of our eight candies, to see how these distributions appear.

Let’s look at the distributions in Figure 3.7. We see our eight candy bars, with a so-called density distribution. You can’t really see the individual circles (they’re very small), but each column comprises a set of filled circles, one circle per respondent. The trick here is to see whether the people distribute across the scale, or whether they clump at one location. Distribution means that people differ; clumping means that the people are identical. Of course, if you have close to 80% top-3 box, you’re not likely to have much distribution beyond 7–9.

What’s quite interesting about these distributions is a sense that there may be different populations in our group of respondents. It’s probably not the case for products like KIT KAT that score very highly across most of the respondents. It’s more likely for those candies that score modestly well, or even poorly. A good example is Mounds. Mounds scored at the bottom of the group, at least based on the picture of the product, the structure of