The most important reason that statistical independence is required by so many statistical procedures is that really horrible errors can happen when statistical dependencies creep into our methods. The examples of these errors are legion. One of the worst and most common types of error arises when we cannot use a truly random sample, because things are happening sequentially in time in an order that we cannot control. We can see an extreme case of this in our example of the safe and unsafe months for eating oysters (in the preceding section). Suppose that the safe and unsafe months were scattered throughout the year. Then the likelihood of a safe month following an unsafe month would be far greater than it is. The reason that this probability is so low is that (a) safe and unsafe months cluster
together and (b) unless we have a calendar, some darts, and a blindfold, we have no control over the order in which the months appear. The oyster example is an extreme one (by design), but this type of problem occurs all too often in real life in ways severe enough to cause real problems for statistics.
Let’s return to our example from the definition of systematic sampling (in Chapter 2, ‘‘What Is Statistics?’’): taking every tenth product unit off the assembly line for testing. If defects in manufacture tend to cluster, then taking every tenth unit may work, so long as the clusters are either much bigger than or smaller than ten units. If defects tend to cycle rhythmically, showing up regularly after every so many good units, then we had better be sure that the cycle isn’t a multiple of ten. For instance, if our stamping machine stamps out molds for our products 100 at a time, and the 47th one is always bad, because that part of the stamp is broken, then testing every 10th unit will never catch the problem. One possible way of avoiding these types of problems would be to wait until the units are in bins and pick our test samples from the bins. This gives us much more control over the order in which things are sampled. Even here, we have to be careful, as units that come off the assembly line one after another may tend to cluster together in the bin. We should pick a test sample from the bin, stir things about a bit, and then pick another, and so forth.
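The stamping-machine example can be checked with a short simulation. This is only a sketch under stated assumptions: the batch size of 100 and the bad 47th position come from the text, while the length of the production run, the alternative interval of 7, and the names `is_defective` and `defects_found` are invented here for illustration.

```python
import random

BATCH_SIZE = 100     # units stamped per batch (from the text)
BAD_POSITION = 47    # position in every batch that is always bad (from the text)
N_UNITS = 10_000     # assumed length of the production run


def is_defective(unit_number):
    """Units are numbered from 1; the 47th unit of each batch of 100 is bad."""
    return unit_number % BATCH_SIZE == BAD_POSITION


def defects_found(sampled_units):
    """Count how many of the sampled unit numbers are defective."""
    return sum(1 for unit in sampled_units if is_defective(unit))


# Systematic sample described in the text: units 10, 20, 30, ...
every_tenth = range(10, N_UNITS + 1, 10)

# Assumed alternative: an interval that shares no common factor with 100.
every_seventh = range(7, N_UNITS + 1, 7)

# A simple random sample of the same size as the every-tenth sample.
rng = random.Random(0)
random_sample = rng.sample(range(1, N_UNITS + 1), len(every_tenth))

print("defects caught, every 10th unit:", defects_found(every_tenth))
print("defects caught, every 7th unit: ", defects_found(every_seventh))
print("defects caught, random sample:  ", defects_found(random_sample))
```

Every tenth unit lands on positions 10, 20, ..., 100 within a batch, so position 47 is never tested, no matter how long the line runs. An interval of 7, or a random sample, eventually reaches the bad position.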
If things are this difficult in a nicely structured, organized place like an assembly line, imagine how much more difficult they are out in the world, where our customers and vendors are. Let’s look again at the example of the Supreme Court. Suppose we are big fans of the Court and want to catch sight of the justices. We go to the court, find ourselves a place to sit in The Great Hall, and wait for a glimpse of each justice. There are some good things about this strategy. The Great Hall is on the main floor, as are all the justices’ chambers. We can expect them to go back and forth while consulting with one another. It is even sampling with replacement, because if one justice crosses the Hall leaving her office, she may also be the very next justice we see, on her way back. In fact, this may be a problem, because having just left one’s office may increase (or decrease) the likelihood that that same justice will be seen again soon. Worse, Republicans may spend more time with Republicans than with Democrats. So, when they gather to go to lunch, seeing a Republican first may mean it is more likely that the next person we see will be a Republican as well. Of course, these statistical dependencies (whether or not they are the results of clustering) are only a problem if they interfere with getting useful information for our decision. A few examples will illustrate how bad these problems can get. First, there is the classic goof (discussed by Huff & Geis,
1954) by the Literary Digest (a magazine in olden days), which predicted that Republican Alf Landon would defeat the then-President Franklin Delano Roosevelt in a landslide in the election of 1936. (FDR won in a landslide.)
Part of the problem was that this political opinion poll drew much of its sample from telephone directories. Back in 1936, at the height of the Great Depression, a lot of folks couldn’t afford phones, and those people didn’t much like the Republicans, whom they blamed for the Great Depression. If only people with telephones voted, Alf Landon might have become President.
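A toy simulation shows how a poll that reaches only telephone owners can point to the wrong winner. Every number in it is invented for illustration and is not historical data: the share of voters with phones, the two support rates, and the sample sizes are all assumptions, as are the names `make_voter` and `fdr_share`.

```python
import random

rng = random.Random(1936)

# Invented numbers for illustration only; these are NOT historical figures.
P_HAS_PHONE = 0.40           # assumed share of voters who own a telephone
P_FDR_GIVEN_PHONE = 0.40     # assumed FDR support among phone owners
P_FDR_GIVEN_NO_PHONE = 0.75  # assumed FDR support among everyone else


def make_voter():
    """Create one hypothetical voter whose vote depends on phone ownership."""
    has_phone = rng.random() < P_HAS_PHONE
    p_fdr = P_FDR_GIVEN_PHONE if has_phone else P_FDR_GIVEN_NO_PHONE
    return {"has_phone": has_phone, "votes_fdr": rng.random() < p_fdr}


def fdr_share(voters):
    """Fraction of the given voters who support FDR."""
    return sum(v["votes_fdr"] for v in voters) / len(voters)


population = [make_voter() for _ in range(100_000)]

# Poll 1: sample only from voters who can be reached by telephone.
phone_owners = [v for v in population if v["has_phone"]]
phone_poll = rng.sample(phone_owners, 2_000)

# Poll 2: simple random sample from the whole electorate.
random_poll = rng.sample(population, 2_000)

print("FDR share, whole population:", round(fdr_share(population), 3))
print("FDR share, phone-only poll: ", round(fdr_share(phone_poll), 3))
print("FDR share, random poll:     ", round(fdr_share(random_poll), 3))
```

Under these assumed numbers, the phone-only poll predicts a loss for FDR while the random poll tracks the population. Making the phone-only sample larger does not help, because the error comes from who can be sampled, not from how many are sampled.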
Another big problem is the post-hoc hypothesis, where someone decides what causal relationship to look for after they check the data. The problem here is that, after the data are collected, the conditional probabilities change. Suppose that we are playing cards and we think that someone is cheating and that the deck is stacked. We predict that if a third party turns over the top card, it will be the Jack of Spades. Someone steps up and turns over the card and, lo and behold, it is the Jack of Spades. If the deck were not stacked, the odds on our making that prediction successfully would be 51 to 1 against. Either we just got very lucky, or else the deck was stacked. Now, suppose, instead of making a prediction, we wait until the card is turned over. It is a
Jack of Spades, which is bad for us, and we lose the hand. Then, we say, ‘‘It’s
the Jack of Spades. I knew it! This deck is stacked!’’ Should any of the other players believe us? The likelihood that the deck is stacked given that the card we predict appears is very different from the likelihood that the deck is stacked given that a card we don’t like appears. (Using conditional probabilities, this can be shown mathematically as well, but the calculations are complex and beyond the scope of this book.)
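A heavily simplified version of the comparison can be sketched with Bayes' rule. The assumptions are strong and are not from the text: we start out 1% sure the deck is stacked, a stacked deck always produces the evidence in question, and half of the 52 cards would have been bad for us. The function name `posterior_stacked` is made up for this sketch.

```python
from fractions import Fraction


def posterior_stacked(prior, p_evidence_if_stacked, p_evidence_if_fair):
    """Bayes' rule: probability the deck is stacked, given the evidence."""
    numerator = prior * p_evidence_if_stacked
    return numerator / (numerator + (1 - prior) * p_evidence_if_fair)


# Assumption: before any card is turned, we are 1% sure the deck is stacked.
prior = Fraction(1, 100)

# Case 1: we named the Jack of Spades in advance, and it appeared.
# A fair deck produces that exact card with probability 1/52.
predicted = posterior_stacked(prior, Fraction(1), Fraction(1, 52))

# Case 2: we complain only after seeing a card we do not like.
# Assumption: 26 of the 52 cards would have been bad for us, so a fair
# deck produces "a card we don't like" with probability 26/52.
after_the_fact = posterior_stacked(prior, Fraction(1), Fraction(26, 52))

print("stacked, given our prediction came true:", predicted, float(predicted))
print("stacked, given a card we dislike:       ", after_the_fact, float(after_the_fact))
```

With these assumed inputs, the successful prediction raises the probability of a stacked deck from 1% to 52/151 (about 34%), while the after-the-fact complaint raises it only to 2/101 (about 2%). Different assumptions give different figures, but the gap between the two cases remains.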
A common form of post-hoc hypothesis is the multiple comparisons problem. The problem here is that, with a lot of data, even when it is properly selected, some patterns are bound to show up. Suppose we measure all our sales in order to see if the weather affects sales. We sell hundreds of products. Instead of checking to see if overall sales change with the weather, we check the sales of each and every product to see if the sales change with the weather. And, lo and behold, the sales of Part #36503 match the weather exactly! This is a case of asking too many questions on just one topic. Ask enough questions and, just by sheer luck, the answer to one of them will be ‘‘yes.’’
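The effect is easy to reproduce with purely random numbers. The sketch below assumes 300 products and 12 monthly observations (both figures are invented), generates sales that have nothing to do with the weather, and then looks for the product that matches the weather best. The helper `pearson` is written out by hand so the example needs only the standard library.

```python
import math
import random

rng = random.Random(36503)

N_PRODUCTS = 300  # "hundreds of products"; the exact count is an assumption
N_MONTHS = 12     # assumed number of observations per product


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Weather and sales are independent random noise: no real relationship exists.
weather = [rng.gauss(0, 1) for _ in range(N_MONTHS)]
correlations = [
    pearson(weather, [rng.gauss(0, 1) for _ in range(N_MONTHS)])
    for _ in range(N_PRODUCTS)
]

strongest = max(correlations, key=abs)
print("strongest correlation found by chance:", round(strongest, 2))

# If each check has a 5% false-alarm rate and the checks are independent,
# this is the chance that at least one product "matches" the weather.
family_wise_error = 1 - (1 - 0.05) ** N_PRODUCTS
print("chance of at least one false alarm:", family_wise_error)
```

Even though none of the simulated products depends on the weather, the best of 300 usually shows a correlation strong enough to look convincing on its own. Checking overall sales once, or adjusting the threshold for the number of questions asked, avoids being fooled this way.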
The thing that all of these examples have in common is that the sampling procedure affects the probabilities of the events of interest. Sampling every tenth unit from the assembly line increased the probability of missing certain defects. Relying on telephone listings for the political opinion poll increased the probability of sampling Republicans. Waiting until we saw the Jack drawn (dramatically) increased the probability that we would say that the card was a Jack. Checking each product’s sales separately increased the probability that we would find at least one pattern that matched the weather. This is yet another reason why, when we do statistics, we must be very careful about how we sample and we must document the sampling procedure very precisely.