Introduction
The laws of physics, together with mathematical models and tools, are quite capable of describing natural phenomena deterministically. But the more complex the situation to be described becomes, i.e., the more influences need to be considered to describe it accurately, the more we find our results to be erroneous. And on a scale where quantum effects also play a role, we have to deal with true uncertainty.
Statistics gives us a means to deal with probabilities or errors in measurements. With the help of statistics, we can (among other things) describe data, calculate estimates of its distribution, and decide whether to reject a hypothesis in a rational and reproducible manner.
Data
Data may come in discrete steps, for instance, the number of patients who benefit from a certain therapy, or they may be continuous, meaning that the values can take infinitely many values, even in a finite interval. An example of continuous data is the masses of tablets.
In reality, however, all our measuring devices have a limited measuring accuracy, so in principle all experimental data come in discrete quantities. This is important when statistical methods demand continuous data as a prerequisite, as the Mann-Whitney U test does. We have to deal carefully with values that occur multiple times; statisticians call them “ties”. (Ties are not supposed to occur with continuous data because there will always be a difference, however small it may be. But in gathering real data, ties become real, too, because of the limited precision of the measurement.)
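As a small illustration of how ties arise from limited measurement precision, the following sketch counts repeated values in a data set. The tablet masses are hypothetical values recorded to 0.1 mg, and the helper name `find_ties` is an assumption made for this example.

```python
from collections import Counter


def find_ties(values):
    """Return each value that occurs more than once, with its count."""
    counts = Counter(values)
    return {value: count for value, count in sorted(counts.items()) if count > 1}


# Hypothetical tablet masses in mg, read from a balance with 0.1 mg resolution.
masses_mg = [250.1, 249.8, 250.1, 250.4, 249.8, 250.0]

ties = find_ties(masses_mg)
print(ties)
```

A rank-based method such as the Mann-Whitney U test would need a tie correction (or a note of caution) for data like these.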
Data Visualisation
In statistics we want to explore data and make inferences from it. An important first step is to visualise the data. The human brain has developed extraordinary capabilities for pattern recognition, and thus we can grasp important statistical parameters like location and spread of the data, or spot correlations, clusters, outliers, and so on, with a single glance.
Stem-and-leaf Plot
A simple means to display data is the stem-and-leaf plot. It puts the data in order and provides visual information on the location, spread, and form of the data. It retains at least two significant digits of each data point.
A stem-and-leaf plot is constructed as follows:1
1. Split each score or value into two sets of digits. The first or leading set of digits is the stem, and the second, or trailing, set is the leaf.
2. Draw a vertical line, and list all possible stem digits to the left of the line from lowest to highest.
3. For each data point write the leaf values on the line labeled by the appropriate stem number.
If appropriate, you can list each stem digit twice and put leaves starting with digits 0 to 4 on the first line, and leaves starting with digits 5 to 9 on the second.
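The construction steps above can be sketched in a few lines of Python. This is a minimal sketch for non-negative integer data with a one-digit leaf; the function name `stem_and_leaf` and the parameter `leaf_unit` are assumptions made for illustration. The data are the student heights from Example 1.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=10):
    """Build the lines of a stem-and-leaf plot for non-negative integers.

    Each value is split into a stem (value // leaf_unit) and a
    leaf (value % leaf_unit), as in step 1 of the construction.
    """
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, leaf_unit)
        stems[stem].append(leaf)
    lines = []
    # Step 2: list every possible stem from lowest to highest, even empty ones.
    for stem in range(min(stems), max(stems) + 1):
        # Step 3: write the leaves on the line labelled by their stem.
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines


heights_cm = [195, 191, 198, 185, 158, 170, 160, 158, 172, 165, 185, 169, 187,
              180, 178, 172, 180, 173, 168, 168, 172, 174, 160, 184, 171]

plot_lines = stem_and_leaf(heights_cm)
print("\n".join(plot_lines))
```

The output reproduces the five rows of Table 10-1. Splitting each stem into two lines (leaves 0 to 4 and 5 to 9) is not covered by this sketch.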
Samples and Populations
Many experiments have as an objective the definition or comparison of two or more groups of data. For example, one may wish to compare the efficacy of two antihypertensive agents or a new antipsychotic drug versus a placebo. Or it may be desired to estimate the average drug content and variability of a batch of tablets. In virtually all such experiments, it is not realistic to observe all possible experimental units. In fact, sometimes the entire population of conceivable observations cannot be identified completely. The potential experimental material for a clinical study comparing an antipsychotic drug to a placebo would include not only patients but also persons with the disease who are not yet diagnosed. All of these people are the population or universe. Clearly, one would not perform an experiment that included the entire population for many reasons:
• Not all of these people could be identified.
• The time or money to conduct such a huge experiment is not available.
• To include so many people in such an experiment could be dangerous or unethical.
It is not necessary to run such a large experiment to arrive at a fair conclusion regarding the efficacy of the drug. In fact, in most cases, the test consists of a relatively small sample taken from a relatively large population.
Example 1. Example of a stem-and-leaf plot
For example, consider the heights of the students of my statistics class. Their heights in centimetres are as follows:
195, 191, 198, 185, 158, 170, 160, 158, 172, 165, 185, 169, 187, 180, 178, 172, 180, 173, 168, 168, 172, 174, 160, 184, and 171.
The corresponding stem-and-leaf plot is presented in Table 10-1.
Table 10-1. Stem-and-leaf plot of students’ heights
15 | 88
16 | 005889
17 | 01222348
18 | 004557
19 | 158
Another more concrete example is the process of sampling in quality control. It may be of interest to estimate the proportion of defective tablets or the average drug content and uniformity of tablets in a production batch. Certainly in the latter case not every tablet in the batch would be examined because the test is destructive, i.e., the tablet is destroyed during the analysis for drug content. Rather, a sample of 20 tablets would be chosen to estimate the average drug content of the more than 1 million tablets in the batch.
Thus, in typical experiments in the pharmaceutical sciences, a small sample from the population is examined in order to make inferences about the large population.
Summary Numbers
In the next step we try to reduce the amount of data. We can try to specify a distribution by a mathematical model (e.g., normal distribution) and a finite set of parameters of that distribution. A sentence like, “The data follow a normal distribution with mean μ and variance σ²” contains more information than a thousand or more data points. This is because any normal distribution is completely determined by its mean and its variance (or its standard deviation, the square root of the variance).
Location Parameters
The single most important parameter for any distribution of data from ordinal or interval scales is its location parameter or central value.
Mode
For data from nominal scales, we can only name a mode, i.e., the value that occurs most frequently. A data set or distribution may have more than one mode; it is then called bimodal or multimodal.
Median
The median is a location parameter for data on ordinal or interval scales. At most one half of the data are lower than the median, and at most one half are higher. If the data set consists of an odd number of observations (say n), the median equals the ((n + 1)/2)th observation in the ordered data set. For an even number of observations, the median can be stated either as the average of the (n/2)th and the (n/2 + 1)th values (for data from an interval scale) or as the (n/2)th or (n/2 + 1)th value itself, then referred to as the “lower median” and “upper median,” respectively.
For a complete population the median is identical to the 50% quantile.
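The rule for odd and even numbers of observations can be sketched as follows. The function name `lower_upper_median` is an assumption made for this example; the first data set is the student heights from Example 1, the second is a hypothetical four-value set chosen to show the even case.

```python
def lower_upper_median(values):
    """Return (lower median, upper median) of a non-empty data set.

    For an odd number of observations both entries are the
    ((n + 1)/2)th ordered value; for an even number they are the
    (n/2)th and (n/2 + 1)th ordered values.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        middle = ordered[(n + 1) // 2 - 1]  # positions are 1-based in the text
        return middle, middle
    return ordered[n // 2 - 1], ordered[n // 2]


heights_cm = [195, 191, 198, 185, 158, 170, 160, 158, 172, 165, 185, 169, 187,
              180, 178, 172, 180, 173, 168, 168, 172, 174, 160, 184, 171]

odd_case = lower_upper_median(heights_cm)     # n = 25, the 13th ordered value
even_case = lower_upper_median([7, 1, 5, 3])  # hypothetical data, n = 4
averaged = sum(even_case) / 2                 # only meaningful on an interval scale
print(odd_case, even_case, averaged)
```

Returning both medians keeps the function usable for ordinal data, where averaging the two middle values would not be meaningful.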
Mean
The arithmetic mean can only be stated for data from an interval (or ratio) scale. The population mean is commonly denoted by the Greek letter μ, whereas the mean of a sample data set is referred to as x̄. The sample mean x̄ is an unbiased estimator of the population mean μ. The mean is defined as the sum of the observed values divided by the number of measurements:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
which makes it the average of the observed values.
Spread Parameters
Other important information about a set of data is the variabil-ity or spread of the data.
Range
The difference between the largest and smallest value in a data set is known as the range.
Variance
Another measure of the variability of the data, one that depends on all values, is the standard deviation σ or its square, the variance σ².
For independent random variables X and Y, the variances are additive: Var(X + Y) = Var(X) + Var(Y). For instance, the content of a drug in the parts of a divisible tablet varies by the variance Var_tbl of the content of the whole tablet, plus the variance Var_div introduced by the breaking: Var_tot = Var_tbl + Var_div. The additive property of variances is the key to analysis of variance (ANOVA, see below).
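The following sketch computes the range, variance, and standard deviation for the student heights of Example 1 using Python's standard `statistics` module, and then illustrates the additivity of variances. Two assumptions are made here: `statistics.variance` uses the sample formula with divisor n − 1, and the two variance components for the divisible tablet are hypothetical numbers chosen only to show the arithmetic.

```python
import math
import statistics

heights_cm = [195, 191, 198, 185, 158, 170, 160, 158, 172, 165, 185, 169, 187,
              180, 178, 172, 180, 173, 168, 168, 172, 174, 160, 184, 171]

data_range = max(heights_cm) - min(heights_cm)    # largest minus smallest value
sample_variance = statistics.variance(heights_cm)  # sample variance, divisor n - 1
sample_sd = statistics.stdev(heights_cm)           # square root of the variance

# Hypothetical variance components for a divisible tablet (units: mg squared).
var_tbl = 4.0   # variance of the content of the whole tablet
var_div = 9.0   # variance introduced by breaking the tablet
var_tot = var_tbl + var_div   # variances of independent sources add
sd_tot = math.sqrt(var_tot)   # standard deviations do NOT add: 2 + 3 != sd_tot

print(data_range, round(sample_variance, 2), round(sample_sd, 2), var_tot, round(sd_tot, 3))
```

Note that the total standard deviation (about 3.6) is smaller than the sum of the component standard deviations (2 + 3 = 5); only the variances are additive.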
Covariance
When considering data in more than one dimension, we are usually interested in the extent of the relationship between any two dimensions. For example, if we take the masses, compaction forces, hardness, and friability of a set of tablets, we might want to know whether compaction forces increase or decrease with the mass of the granules in the die, or how hardness or friability change with the compaction force. The answers might seem trivial in this example, but the statistical concept of covariance can give a quantitative answer to this question. The covariance is a generalisation of the variance. If we rewrite the formula for the calculation of the variance as:
$$\mathrm{Var}(x) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})}{n - 1}$$
we find that the covariance is quite similar:
$$\mathrm{Cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
We have just replaced the second factor in the numerator by the difference of value and mean in another dimension. (Thus, if we calculate the covariance of one dimension with itself, we get the variance.) From the equation above it is also obvious that Cov(x,y) = Cov (y,x).
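The sign of the covariance Cov(x,y) tells us how the two variables x and y vary together: if the covariance is positive, y tends to increase as x increases; if it is negative, y tends to get smaller with increasing x; if the covariance is zero, x and y are uncorrelated, i.e., there is no linear relationship between them (independent variables always have zero covariance, but zero covariance alone does not prove independence).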
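A minimal sketch of the covariance calculation follows. It assumes the sample form with divisor n − 1, and all tablet data (compaction force, hardness, friability) are hypothetical values invented for illustration, not measurements from the text.

```python
import statistics


def covariance(x, y):
    """Sample covariance of two equally long sequences (divisor n - 1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    products = ((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sum(products) / (len(x) - 1)


# Hypothetical tablet data.
force_kn = [5, 10, 15, 20, 25]
hardness_n = [40, 62, 75, 96, 110]
friability_pct = [1.2, 0.9, 0.7, 0.4, 0.3]

cov_hardness = covariance(force_kn, hardness_n)        # positive: hardness rises with force
cov_friability = covariance(force_kn, friability_pct)  # negative: friability falls with force

# Zero covariance without independence: y is fully determined by x here.
x_sym = [-2, -1, 0, 1, 2]
y_sym = [xi ** 2 for xi in x_sym]
cov_sym = covariance(x_sym, y_sym)

print(cov_hardness, cov_friability, cov_sym)
```

The last case shows why a covariance of zero only rules out a linear trend: y depends completely on x, yet the positive and negative products cancel.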
Frequency Distributions
A frequency distribution of a data set can be constructed by counting the number of data points falling into a series of intervals (usually of equal size). The frequency distribution and its corresponding graph, a histogram or bar chart, show the distribution of the data, its central value (e.g., mean or median), and variability (e.g., SD or range). Example 2 shows the weights of 50 weanling rats to be used in an experiment.
Example 2. The weights in grams of 50 rats at weaning
Table 10-2 is a frequency distribution with 13 intervals derived from the data given in Example 2. A rule of thumb is to use 8 to 20 intervals, depending on the quantity and spread of the data.
The histogram or bar chart of these data is shown in Figure 10-1.
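Counting data points into equal-width intervals can be sketched as below. Because the raw rat weights of Example 2 are not reproduced in this section, the sketch uses the student heights from Example 1 instead; the function name `frequency_distribution`, the starting value of 155 cm, and the interval width of 5 cm are assumptions chosen for illustration.

```python
from collections import Counter


def frequency_distribution(values, start, width):
    """Count integer values into equal-width intervals beginning at `start`.

    Returns a list of (lower limit, upper limit, frequency) tuples,
    including empty intervals between the lowest and highest value.
    """
    if min(values) < start:
        raise ValueError("start must not exceed the smallest value")
    counts = Counter((value - start) // width for value in values)
    n_intervals = max(counts) + 1
    return [(start + k * width, start + (k + 1) * width - 1, counts.get(k, 0))
            for k in range(n_intervals)]


heights_cm = [195, 191, 198, 185, 158, 170, 160, 158, 172, 165, 185, 169, 187,
              180, 178, 172, 180, 173, 168, 168, 172, 174, 160, 184, 171]

table = frequency_distribution(heights_cm, start=155, width=5)
for lower, upper, frequency in table:
    # A row of asterisks serves as a simple text bar chart.
    print(f"{lower}-{upper}  {frequency:2d}  {'*' * frequency}")
```

With these settings the 25 heights fall into 9 intervals, which is within the rule of thumb of 8 to 20 intervals.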
Bias, Precision, and Accuracy
Precision refers to the reproducibility of a series of measurements. If the values are very close to each other, the measurements are said to be precise. Accuracy refers to the closeness of measurements to the true value. For example, if a tablet contains exactly 200 mg of drug, and three analyses show a drug content of 205, 205, and 206 mg, it might be concluded that the analysis is precise, but not accurate. Bias refers to a systematic difference from the true value. Figure 10-2 illustrates these concepts.
The three assays observed above seem to be biased on the high side, i.e., errors in the assay procedure result in too-high values. Figure 10-2 shows that “precise” data need not be accurate. In fact, there need not be any relationship or correlation between the qualities of precision and accuracy. Note that biased data cannot be accurate but can be precise.
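For the three assay results quoted above, bias and precision can be put into numbers. In this sketch the bias is estimated as the mean result minus the true value and the precision is expressed as the sample standard deviation; both choices are common conventions assumed here rather than definitions given in the text.

```python
import statistics

true_content_mg = 200.0
assays_mg = [205.0, 205.0, 206.0]

# Estimated bias: how far the average result lies from the true value.
bias_mg = statistics.mean(assays_mg) - true_content_mg

# Precision: how closely the results agree with each other.
spread_mg = statistics.stdev(assays_mg)

print(f"bias = {bias_mg:.2f} mg, standard deviation = {spread_mg:.2f} mg")
```

The standard deviation (about 0.6 mg) is small compared with the bias (about 5.3 mg), which is the numerical counterpart of "precise, but not accurate".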
Bias arises not only in experimental measurements but also in experimental design. Bias can be introduced into an experiment, not because of an error in an experimental measurement, but because of poor judgment. For example, consider an experiment where the efficacy of oral and buccal nitroglycerin is to be compared by administering both products to 20 patients on two different occasions and measuring the time to an angina attack in a treadmill test. Each of the 20 patients will receive both the oral and buccal forms. If each patient receives the buccal drug on Monday and the oral drug on the following Sunday, a bias may be observed in the experimental results even if the measurements are not biased. This could be due to either the day of the week when the test was given (gloomy Monday versus a holiday weekend day) or an order effect, where the effect differs depending on which drug is given first. For example, there may be psychological factors causing the response to the drug taken first to be systematically better (or worse) than that to the drug taken second, or the weather may be such as to cause more positive results on the first occasion. In the latter case, differences between the two dosage forms would be exaggerated (biased) in favor of the drug administered first, the buccal drug. To obviate this potential bias, we would give ten of the patients the oral drug first (Monday) and the buccal drug second (Sunday). The other ten patients would receive the products in the opposite order. Perhaps an improvement in this design would be to test the drugs on the same day of the week, e.g., Monday.
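The balanced order of administration can be sketched as a small allocation routine. The text only requires that ten patients receive each order; choosing those ten at random is an additional assumption made here, as are the function name `assign_crossover`, the patient numbering, and the seed.

```python
import random


def assign_crossover(patient_ids, seed=None):
    """Split patients into two equally sized sequence groups.

    Half of the patients receive the oral drug first, the other half
    the buccal drug first. Random allocation is an assumption of this
    sketch; the text only asks for ten patients per order.
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "oral_first": sorted(shuffled[:half]),
        "buccal_first": sorted(shuffled[half:]),
    }


patients = range(1, 21)  # hypothetical patient numbers 1 to 20
groups = assign_crossover(patients, seed=2024)
print(groups)
```

With both orders equally represented, a day-of-week or order effect influences the two dosage forms to the same extent and no longer favors one of them.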
Design of Experiments and Collection of Data
The application of statistics in the analysis of data is optimal when the data are collected in a planned or designed manner.
If data are analyzed after the fact (retrospective analysis), great care should be taken to examine the data for possible bias.
For example, prescription-volume data gathered for the years 1970 to 1980 may be available only from cities with populations greater than 500 000 or from cities in the Western States. Clearly, conclusions from such data should not be applied indiscriminately to the entire country. Also, the information may have been gathered on a voluntary basis; without knowledge of the characteristics of those who did and did not supply the information, the conclusions could be tainted.
The manner in which data are collected is connected to the planning and design of experiments. In the collection of data, a small sample generally is taken from a large population or universe. Sometimes a sample is taken inadvertently when the original intention was to obtain data from the whole population. For example, when a questionnaire is sent to every pharmacist in the state, there always will be some people who do not respond to the questionnaire, and anything less than 100% response constitutes a sample. Several examples of sampling methods are described below.
Sampling by Questionnaire
Suppose that questionnaires on the sales of certain drugs were sent to all pharmacists in a state and only 50% were returned.
In this type of survey, the results tabulated from such a sample probably would be biased because those who did not return the questionnaire would not be represented in the sample.
Table 10-2. Frequency distribution of rat weights
Weight group (g)    Frequency
24–25               1
26–27               2
28–29               4
30–31               6
32–33               8
34–35               9
36–37               7
38–39               3
40–41               5
42–43               2
44–45               1
46–47               1
48–49               1
Figure 10-1. Bar chart showing frequency distribution of weights of 50 weanling rats (data in Example 2). Horizontal axis: weight (grams); vertical axis: frequency.
Figure 10-2. Diagram illustrating bias, precision, and accuracy. The shots on targets 1 and 2 are biased; in both cases the shots cluster away from the bull’s-eye. The clusters on targets 3 and 4 both are unbiased; the center of each cluster is on the bull’s-eye. The shots on targets 1 and 3 are precise; both sets are bunched together. The shots on targets 2 and 4 are scattered widely, hence imprecise. Only the shots on target 3 are accurate, that is, precise and unbiased.
It has been shown that persons who respond may have different characteristics from those who do not respond. In this hypothetical example, perhaps the unanswered questionnaires came largely from pharmacists who had large drug sales and were too busy to answer. In another community a pharmacist may have little or no sales of the drugs, resulting in a nonresponse. The reason for each unanswered questionnaire is unknown. These unreturned questionnaires cause a bias, the direction and magnitude of which are unknown.
Other potential errors in this type of response that may introduce bias include the way in which the question is asked, the order in which questions are asked, and the psychological interaction between the interviewer and respondent. Questionnaire and survey techniques that can be employed to reduce or eliminate bias in the sample of responses have been proposed by mathematical statisticians.2
For example, public opinion polls use certain statistical sampling techniques that not only reduce bias but also optimize the information gathered. The Census Bureau has information about the percentages of men, women, and children in the US in various income and nationality groups, in addition to many other detailed categorizations. A sample may be designed to contain the same proportion of particular group(s) as that in the population. Instead of mailing questionnaires, interviewers may be recruited and assigned quotas of the types of people to interview. The interviewers fill out the questionnaires for each respondent during the interview, ensuring a complete response.
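Designing a sample with the same group proportions as the population amounts to a proportional allocation of interview quotas. The sketch below uses integer arithmetic and gives any leftover interviews to the groups with the largest remainders; this rounding rule, the function name `proportional_quotas`, and the population counts are all assumptions made for illustration, not census figures.

```python
def proportional_quotas(population_counts, sample_size):
    """Allocate interview quotas in proportion to group sizes.

    Quotas are rounded down first; remaining interviews go to the
    groups with the largest remainders (an assumed rounding rule).
    """
    total = sum(population_counts.values())
    quotas = {group: count * sample_size // total
              for group, count in population_counts.items()}
    remainders = {group: count * sample_size % total
                  for group, count in population_counts.items()}
    leftover = sample_size - sum(quotas.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for group in by_remainder[:leftover]:
        quotas[group] += 1
    return quotas


# Hypothetical population counts for three groups.
population = {"men": 3700, "women": 3900, "children": 2400}

quotas_200 = proportional_quotas(population, sample_size=200)
quotas_50 = proportional_quotas(population, sample_size=50)
print(quotas_200, quotas_50)
```

Each interviewer would then be assigned a share of these quotas, so that the completed sample mirrors the population on the chosen characteristics.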
It is not possible to elaborate fully on the various methods of sampling here. One should be aware of problems in sampling and aware that a sampling design can be used that will give the limits of error of the resulting compilation for any given cost.2
Sampling in the Chemical Laboratory
The procedure for gathering data in the laboratory differs from that of the questionnaire. Different kinds of sampling processes include the sampling of material to be assayed chemically or physically, sampling of analytical reagents and instruments when multiple instruments are available, and sampling of analysts, i.e., the chemists who will perform the assay.
By way of illustration, several samples may be taken from a large lot of digitalis leaves for the chemical determination of acid-insoluble ash, or drug may be analyzed in samples taken from a blend. For the sample to be representative of the lot, the samples should be taken from different parts of the lot to ensure that every part of the lot is represented. Determinations from five samples taken from the same part of a lot (e.g., the top of a container) probably will have values closer together than five samples taken from different parts of the lot (e.g., the top, top-middle, middle, low-middle, and bottom of a container).
Despite the good precision, the former five samples may give a biased estimate of the average value for the lot. The more heterogeneous the lot, the more effort should be expended in being sure that every part of the lot is represented by a sample.
It might be that the granulation having the most drug is in the bottom of the lot; samples all taken from the top would give an estimate of average drug content that is considerably lower than the true value.
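The difference between sampling one part of a lot and sampling all parts can be illustrated with a small simulation. Everything in this sketch is hypothetical: the lot is modelled as five layers whose mean drug content (in % of label claim) rises from top to bottom, the assay error is drawn from a normal distribution with an assumed standard deviation of 0.2, and the seed is fixed only to make the run repeatable.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical lot: mean drug content (% of label claim) in each layer,
# with the richest granulation at the bottom.
layer_means = {"top": 96.0, "top-middle": 98.0, "middle": 100.0,
               "low-middle": 102.0, "bottom": 104.0}
ASSAY_SD = 0.2  # assumed random error of a single determination


def assay(layer):
    """Simulate one determination on a sample taken from the given layer."""
    return rng.gauss(layer_means[layer], ASSAY_SD)


top_only = [assay("top") for _ in range(5)]          # five samples from the top
all_layers = [assay(layer) for layer in layer_means]  # one sample per layer

true_mean = statistics.mean(layer_means.values())
print(f"true lot mean:      {true_mean:.1f}")
print(f"top only:   mean {statistics.mean(top_only):.1f}, SD {statistics.stdev(top_only):.2f}")
print(f"all layers: mean {statistics.mean(all_layers):.1f}, SD {statistics.stdev(all_layers):.2f}")
```

The top-only samples agree closely with each other yet underestimate the lot average, whereas the samples spread over all layers scatter more but centre on the true value.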
Another aspect of sampling in a chemical determination is the sampling of the chemists who perform the chemical analysis.
If a single chemist makes several determinations on portions taken from the same sample of thoroughly mixed material, one expects the results to be more precise than if several chemists made these determinations. Probably the true reproducibility of a method can be indicated only in terms of how closely an