3.4 Population
4.2.1 Checking the representativeness of the sample
These considerations are primarily aimed at ensuring the representativeness of the sample with respect to the property under investigation.
A good sample, as we have seen in the discussion of other inductive generalisations, is one which is large enough and which is representative of the population. We will now consider in more detail why these criteria matter, and the ways we may try to ensure that they have been satisfied.
5.3.2.1 (i) SAMPLE SIZE
Other things being equal, the larger the sample, the better supported the conclusion will be. If we want to conclude that all dogs bark, we should ensure that we have observed a significant number of dogs, not just one or two. If we want to justify a claim that two thirds of voters prefer John Howard to Peter Costello, we will need to have asked a substantial number of voters for their opinion.
The size of the sample will determine the margin of error in a statistical inference. The margin of error, expressed as a percentage, is the allowance you would have to make for inaccuracy in your statistical generalisation. The smaller the sample, the greater the likelihood of inaccuracy and therefore the greater the margin of error.
Suppose, for example, you surveyed a sample of 100 voters and discovered that 55% of them were planning to vote for a particular candidate. Even with an ideal sample, a sample of 100 will give you a margin of error of around 10 percentage points, so you could not conclude from the research you had conducted that 55% of the population would vote for that candidate. At best, you would be justified in concluding that somewhere between 45% and 65% of voters in the population would vote for that candidate.
It is not necessary for our purposes to know the details of how margins of error are calculated. (Anyone interested could have a look at http://nilesonline.com/stats/, an introduction to statistics designed for journalists but useful for any non-statistician.) The important thing to remember is that a larger sample will give you a smaller margin of error, meaning that the prevalence of the property in the sample, which is the basis of the generalisation, is likely to be closer to the actual prevalence of that property in the population.
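For readers who would like a rough feel for the numbers anyway, the sketch below uses the standard normal-approximation formula for a sample proportion. The function name, the 95% confidence level (the multiplier 1.96) and the list of sample sizes are assumptions made for this illustration; the formula also presumes an ideal random sample.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    p: proportion observed in the sample (0.55 means 55%)
    n: number of people in the sample
    z: multiplier for the confidence level (1.96 is the usual 95% value)
    Assumes a simple random sample and the normal approximation.
    """
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # Same observed result (55%), increasingly large samples.
    for n in (100, 400, 1000, 2500):
        moe = margin_of_error(0.55, n)
        print(f"n = {n:4d}: 55% give or take {moe * 100:.1f} percentage points")
```

With 100 respondents the result comes out a little under 10 percentage points, in line with the figure of around 10% quoted above. Because the margin shrinks with the square root of the sample size, quadrupling the sample only halves the margin of error.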
The size of the sample is often emphasised as being of great importance, but ensuring that we have a sufficiently large sample is really just a way of trying to ensure that the sample is representative of the population. How large is large enough will depend on the confidence we have that the sample is likely to be representative. The two main influencing factors in determining how representative a sample is likely to be are how the sample is chosen, and the homogeneity of the population with respect to relevant properties.
5.3.2.2 (ii) REPRESENTATIVENESS
To ensure representativeness of the sample to the population, a researcher must make sure that any differences which exist in the population which might be relevant to the property under investigation are represented in similar proportions in the sample. This is easier said than done, but we will look at two of the points which need to be addressed. Specifically: is the population likely to be homogeneous with respect to the property under investigation, and how should the sample be chosen?
5.3.2.3 Is the population homogeneous with respect to relevant properties?
A population is homogeneous with respect to a certain property if that property is one which can be expected to occur fairly evenly throughout the population. If the property is one which is subject to a lot of variation in the population, you will need a larger sample to take account of possible differences within the population. A population which has a lot of variation with respect to a property is said to be heterogeneous with respect to that property.
A population is homogeneous with respect to a property if that property is distributed evenly throughout the population.
A population is heterogeneous with respect to a property if the property is unevenly distributed throughout the population.
A larger sample will be needed for heterogeneous populations, to make sure that the differences in the population are reflected in the sample.
For example, the population "humans" is heterogeneous with respect to a property such as opinions on a particular topic, or favourite colours, or religious beliefs. If you wanted to get an accurate impression of the occurrence of such properties within the community, you would need to have a large and well chosen sample to make sure that all relevant differences were reflected in that sample.
The same population, however, would be homogeneous with respect to other properties, like having two kidneys. If you wanted to know how many kidneys human beings had, you would not need to examine very many, since physiological characteristics like this are known to be fairly constant across all human beings.
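The link between variation and sample size can be made concrete with a small calculation. The sketch below inverts the usual normal-approximation formula for a proportion to estimate how many people would need to be sampled for a given margin of error. The function name, the 3-point target margin, and the two proportions (an even 50/50 split for a divisive opinion, 99% for a near-universal trait) are all assumptions chosen for illustration, and the approximation is only rough for proportions very close to 0 or 1.

```python
import math


def required_sample_size(p, margin, z=1.96):
    """Smallest sample size giving the requested margin of error.

    p: expected proportion of the population with the property
    margin: desired margin of error (0.03 means 3 percentage points)
    z: multiplier for the confidence level (1.96 for roughly 95%)
    Assumes a simple random sample and the normal approximation.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    # Heterogeneous case: opinion split evenly (hypothetical figure).
    print("evenly split opinion:", required_sample_size(0.50, 0.03))
    # Nearly homogeneous case: a trait shared by 99% (hypothetical figure).
    print("near-universal trait:", required_sample_size(0.99, 0.03))
```

Under these assumptions the evenly split property calls for a sample of roughly a thousand, while the near-universal one needs only a few dozen. This mirrors the point made above: the more a property varies across the population, the larger the sample has to be.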
Note that to make these sorts of decisions requires some background knowledge. It is only because we know something about physiological characteristics of humans that we are able to conclude that a small sample would be sufficient. This can lead to difficulties, because in some cases it is hard to know what properties of a population are relevant in determining the size of sample required for a particular investigation.
What are the 'relevant properties'? This question is impossible to answer in any precise and general way, but in many cases what will be relevant to a particular investigation will be fairly clear. For example, if we wanted to investigate the voting intentions of Australians, such characteristics as age and socio-economic background might be relevant, but characteristics such as eye colour and singing ability will presumably not be.
Once we have an idea of how large the sample would have to be to reflect relevant differences in the population, we still need to think about how the sample should be chosen, to ensure a representative sample which is not biased in any way that might compromise the reliability of the research.
5.3.2.4 How was the sample selected?
The manner in which a sample is chosen will often affect its representativeness, because some methods of sample selection are more likely to lead to biased samples than others.
Examples of good sampling methods are random sampling, and stratified sampling:
Random samples: A sample is random if every member of the population has the same chance of being part of the sample. Assuming the sample is sufficiently large, having a random sample is the best way to ensure that there are no biases built in.
Random samples are particularly beneficial if it is not obvious what properties are relevant to the characteristic you are investigating. The problem with random sampling is that it is impractical for many purposes.
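As a minimal sketch of what "the same chance of being part of the sample" means in practice, the example below draws a simple random sample from a complete list of the population using Python's standard library. The voter list, the sample size and the seed are hypothetical. Note that the method presupposes a full list of the population to draw from, which is one reason random sampling is often impractical.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k distinct members, each equally likely to be chosen.

    population: a list containing every member of the population
    k: the sample size
    seed: optional, makes the draw repeatable for checking
    """
    rng = random.Random(seed)
    return rng.sample(population, k)


# Hypothetical population: a complete register of 10,000 voters.
voters = [f"voter_{i}" for i in range(10_000)]
sample = simple_random_sample(voters, 100, seed=42)

if __name__ == "__main__":
    print(len(sample), "voters drawn, for example:", sample[:3])
```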
Stratified samples: Where random sampling is not possible, a stratified sample is the next best thing. In a stratified sample, researchers determine what properties they think are relevant to the property they are investigating, and construct a sample which has a similar distribution of those properties to the population.
Suppose, for example, a bank wanted to get an accurate impression of how it was viewed by its customers. It would be advisable to choose a sample which was stratified in such a way that it represented the different kinds of customers (those with transaction accounts, credit cards, residential loans, investment loans, business accounts and so on) in similar proportions to their actual representation in the population of the bank's entire customer base. If the sample were skewed in favour of home loan customers, for example, or businesses, the results obtained from the research would not be generalisable.
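The bank example can be sketched as follows. The customer counts, the sample size of 200 and the function names are invented for illustration; the idea shown is proportional allocation, where each customer type receives a share of the sample matching its share of the customer base, and members are then drawn at random within each type.

```python
import random


def proportional_allocation(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to their sizes.

    Fractions are rounded with the largest-remainder rule so the
    counts always add up to sample_size.
    """
    total = sum(stratum_sizes.values())
    exact = {name: sample_size * size / total
             for name, size in stratum_sizes.items()}
    counts = {name: int(share) for name, share in exact.items()}
    leftover = sample_size - sum(counts.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


def stratified_sample(strata, sample_size, seed=None):
    """Draw a random sample within each stratum, sized proportionally."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    counts = proportional_allocation(sizes, sample_size)
    return {name: rng.sample(strata[name], counts[name]) for name in strata}


# Hypothetical customer base of 10,000, grouped by main product.
customer_counts = {
    "transaction_account": 6000,
    "credit_card": 2500,
    "residential_loan": 1000,
    "investment_loan": 300,
    "business_account": 200,
}
strata = {name: [f"{name}_{i}" for i in range(size)]
          for name, size in customer_counts.items()}
survey = stratified_sample(strata, 200, seed=1)

if __name__ == "__main__":
    for name, members in survey.items():
        print(f"{name}: {len(members)} customers surveyed")
```

A skewed sample in the sense described above would be one where these per-type counts departed markedly from the customer base, for instance 100 of the 200 drawn from residential loan customers.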
Examples of sampling methods which can be problematic are choosing samples in a way which is relevant to the property under investigation, and self-selection.
Was the sample chosen in a way which was relevant to the property under investigation? (Is it likely that the sample is atypical with respect to that property?)
Be cautious of samples which have been selected in such a way that the sample is likely to over- or under-represent properties relevant to the one under investigation. One famous example of this problem in political polling occurred in the 1936 US presidential election. A magazine sent out a questionnaire to determine voting intention, and received two and a half million responses. Based on this very large sample, they predicted a landslide win to Landon over Roosevelt. In fact, there was a landslide in favour of Roosevelt, who won with 62% of the vote. The accepted explanation for what went wrong in this sampling was that the questionnaires were mailed out to people selected from the telephone book and club membership lists. But in 1936, telephone ownership and club membership were concentrated among wealthier Americans, so the less affluent voters were significantly underrepresented in the sample. Roosevelt, the Democratic candidate, received a greater proportion of the votes from less affluent Americans, with the result that he won the election, contrary to the poll's prediction.
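The lesson that a huge sample cannot rescue a biased selection method can be illustrated with a small simulation. Every number below is invented for the purpose (the split between affluent and less affluent voters, the level of support in each group, the sample sizes), and the sampling frame is simplified so that only affluent voters can be reached. It is not a reconstruction of the 1936 poll.

```python
import random

rng = random.Random(1936)  # fixed seed so the run is repeatable


def make_voter():
    """Return (is_affluent, supports_r) for one simulated voter.

    Hypothetical electorate: 30% affluent; candidate R is supported by
    40% of affluent voters and 70% of less affluent voters.
    """
    is_affluent = rng.random() < 0.30
    supports_r = rng.random() < (0.40 if is_affluent else 0.70)
    return is_affluent, supports_r


def share_for_r(voters):
    """Proportion of the given voters who support candidate R."""
    return sum(1 for _, supports_r in voters if supports_r) / len(voters)


population = [make_voter() for _ in range(200_000)]
true_share = share_for_r(population)

# Biased frame (simplified): only affluent voters can be reached.
frame = [voter for voter in population if voter[0]]
big_biased_sample = rng.sample(frame, 50_000)

# Much smaller sample, but drawn at random from everyone.
small_random_sample = rng.sample(population, 1_000)

if __name__ == "__main__":
    print(f"whole population:        {true_share:.1%}")
    print(f"50,000 from biased list: {share_for_r(big_biased_sample):.1%}")
    print(f"1,000 chosen at random:  {share_for_r(small_random_sample):.1%}")
```

Under these assumptions the sample of 50,000 lands near the affluent group's 40% rather than the population figure of about 61%, while the random sample of 1,000 stays within a few points of the truth.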
A local example of this sort of problem was a phone poll conducted by Triple J in 1992, on the subject of whether or not marijuana should be decriminalised. They had over 10,000 responses, of which 96% were in favour of decriminalisation. This is a large sample, but it is unlikely to be representative, because Triple J's audience cannot be expected to be typical of Australians in their attitude towards marijuana. Another reason that this sample would not be representative is that it was self-selected.
Was the sample self-selected?
A sample is self-selected if the members of the sample chose to be part of the sample. Phone-in polls are a common kind of self-selected sample. The reason that they are generally unrepresentative is that only people who feel strongly about a particular issue would bother calling in. This means that the views of those who ring in are unlikely to represent the views of the population as a whole.
(5a) After Princess Diana's death in 1997, The People, a British Sunday newspaper, conducted a phone poll on the question: "Were Diana and Dodi killed as part of a secret operation?". They had nearly 6000 responses, of which 98% answered "yes".
In that case it may seem obvious that the people who rang up are unlikely to reflect the views of the population, but this sort of problem often occurs in more subtle ways. These are examples of poll results from polls conducted on Channel Nine's Sunday program:
(5b) "Do you believe you have ever been sick or suffered illness because of bad air on planes?" (4/6/00) Result: Yes - 63.5%; No - 36.5%
(5c) "Should the Reserve Bank intervene to ensure the banks deliver lower credit card transaction fees?" (25/3/01) Result: Yes - 92.5%; No - 7.5%
(5d) "Would you donate a kidney for a friend or loved one?" (26/11/00) Result: Yes - 83%; No - 17%
None of these results are likely to be representative of the views of the population. In all of these polls, people who would vote "yes" are likely to feel more strongly about these issues, and are therefore more likely to vote at all. (Even given the option, people don't tend to spend time and money ringing up to say "I don't know" or "I don't care".)
In the first case above, it seems unlikely that many people would bother to ring up to say that aeroplane air had not made them sick. This poll could certainly not be validly used to infer that 63.5% of people had suffered illnesses because of bad air on planes.
In the second case, again, the people who disagreed with this proposal would probably mainly be those who didn't think that credit card transaction fees were an important enough issue for the Reserve Bank to become involved. They would also be unlikely to think it was an important enough issue to cast their vote on.
In the third example, the issue is not a trivial one, but it seems unlikely that people would ring up to say that they would not be prepared to donate a kidney. Again, the people who would be most likely to vote are the people who would be going to vote yes. There is also the possibility that this result would not reflect the proportion of people who would actually donate a kidney, because it is far easier to ring in and say that you would donate a kidney, than to actually commit to doing so.
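The effect described in these three cases can be put into a short calculation. The sketch below assumes, purely for illustration, that opinion in the population is evenly divided and that people who would answer "yes" are five times as likely to ring in as people who would answer "no". The function name and all of the rates are hypothetical.

```python
def observed_yes_share(true_yes, call_rate_yes, call_rate_no):
    """Share of "yes" votes a phone-in poll would be expected to report.

    true_yes: proportion of the population who would answer yes
    call_rate_yes: chance that a yes-holder bothers to ring in
    call_rate_no: chance that a no-holder bothers to ring in
    """
    yes_calls = true_yes * call_rate_yes
    no_calls = (1 - true_yes) * call_rate_no
    return yes_calls / (yes_calls + no_calls)


if __name__ == "__main__":
    # Hypothetical: opinion split 50/50, but 5% of yes-holders call
    # in compared with only 1% of no-holders.
    share = observed_yes_share(0.50, 0.05, 0.01)
    print(f"poll would report about {share:.0%} yes")
```

With these made-up rates an evenly divided population produces a poll result of more than 80% "yes". When the two groups are equally likely to call, the function simply returns the true proportion, which shows that the distortion comes entirely from who chooses to respond.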
Self-selection will quite often be an issue to some extent, because in most cases people cannot be forced to take part in a study, so respondents will have selected themselves by agreeing to take part. This will be something to think about with market research conducted by telephone, for example, because such a large proportion of people who are invited to take part will not do so.
When evaluating research, think about whether there are likely to be relevant differences between the people who respond, and those who don't.
These are the sorts of issues which need to be considered in relation to the adequacy of the sample, when evaluating statistical arguments. It is also important to think about the adequacy of the research methods used.