

The problem of information overload is a well-recognised symptom of the information era. Today’s businesses create such large amounts of computer-generated information that management is often unable to make full use of all available data. The primary role of statistics is to provide managers with mathematical tools that will help them to organise and analyse data in an effective and meaningful way. By summarising the essential features and relationships of data, managers are better able to interpret details on product performance, patterns of consumer behaviour, sales forecasts, and other areas of interest. A brief summary of the main terms used in business statistics is now given.

A ‘population’ is a collection of all the items (data, facts or observations) of interest to the decision-maker. A ‘sample’ is a subset of a population. The set of measurements from the whole population is called a ‘census’. Normally, taking a complete census is economically impossible. Even if a manager could take a census, there are obvious constraints, namely: (i) time constraints – it is quicker to ask a question of 100 people than 10,000; (ii) cost constraints – it is cheaper to ask 100 than 10,000 people!

Decision-making means choosing between two or more alternatives. Good decision-making is based on evaluating which alternative has the best chance of succeeding. When managers refer to the chance of something occurring, they are using probability in the decision-making process. Common methods associated with probability are permutations, combinations and various probability rules (including Bayes’s Rule). Statistical sampling techniques include all those using random or probability sampling, whereby data items are selected by chance alone. Statistics can be divided into three main areas, namely, (i) descriptive statistics, (ii) probability and (iii) inferential statistics.

Descriptive Statistics – Organising and Presenting Data

Descriptive statistics consist of techniques that help decision-makers to present data in a meaningful way. Frequency distribution tables, histograms, pie charts, scatter plots, and bar charts are some of the tools that allow data to be organised and presented in a manageable form (Groebner, Shannon et al., Chapters 1–3). The easiest method of organising data is to construct a frequency distribution table using classes. A class is simply a specified interval of interest, usually having an upper and lower limit. Presentation techniques such as frequency histograms can, however, be misleading when making comparisons between two different sets of data. For example, by varying the number and width of class intervals, histograms from two quite different sets of data may appear very similar!

Other statistical techniques are needed to test for differences between data groups. The more common of these attempt to determine whether the two groups have the same distribution, i.e., whether they have the same central location and spread. The central location is simply the middle or centre of a set of data. Common measures of central location are the mode, median and arithmetic mean. Statistical measures of spread are also called measures of variation (or dispersion) because they focus on fluctuations that occur on either side of the central location. The more common measures of spread are the range, variance and standard deviation.
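The measures named above can be illustrated with a short Python sketch that uses the standard `statistics` module. The data set is hypothetical (it does not come from the text), and the variance and standard deviation are computed on the assumption that the values form a whole population rather than a sample.

```python
import statistics

# Hypothetical data set, chosen only for illustration (not from the text).
values = [4, 8, 8, 5, 10, 7]

# Measures of central location
mode = statistics.mode(values)      # most frequently occurring value
median = statistics.median(values)  # middle value of the sorted data
mean = statistics.mean(values)      # arithmetic mean

# Measures of spread, treating the data as a complete population
value_range = max(values) - min(values)
variance = statistics.pvariance(values)
std_dev = statistics.pstdev(values)

print("mode:", mode, "median:", median, "mean:", mean)
print("range:", value_range, "variance:", variance, "std dev:", std_dev)
```

For data drawn as a sample from a larger population, `statistics.variance` and `statistics.stdev` would be the usual choice instead.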

EXAMPLE 2.2 Graphical presentation using Chart Wizard

The Wheelie Company is about to introduce a new line of tyres for racing bicycles. The quality control manager has been asked to present the results of a recent test of tyres on a hundred bicycles competing in the Tour de France. The following is a list of how far the 100 bicycles got (to the nearest 100 km) before one of the tyres failed to meet minimum EC standards.

38 24 12 36 41 40 45 41 40 47
26 15 48 44 29 43 28 29 37 10
37 45 29 31 23 49 41 47 41 42
61 40 40 45 37 55 47 42 28 38
38 48 18 16 39 50 14 52 33 32
51 10 49 21 44 31 43 34 49 48
28 39 28 36 56 54 39 31 35 36
32 20 54 25 39 44 25 42 50 41
 9 34 32 34 42 40 43 32 30 45
20 29 14 19 38 46 46 39 40 47

The quality control manager has decided to use a frequency histogram for his presentation. His first task is to convert the raw data into a number of groups or classes, and then count the number of values which fall into each class. The number of values in each class is called the ‘class frequency’. The ideal number of classes N can be found by using the ‘rule of thumb’ inequality, which states that $2^N$ must be greater than the number of observations, O. In this example, O = 100, and so $2^7 > 100$, i.e., N = 7.

$$\text{Class width} = \frac{\text{largest value} - \text{smallest value}}{\text{number of classes, } N} = \frac{61 - 9}{7} = 7.43$$
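As a quick check of the arithmetic, the rule of thumb and the class-width formula can be sketched in Python. The function name `number_of_classes` is hypothetical, and taking the smallest N that satisfies the inequality is an assumption about how the rule is meant to be applied.

```python
def number_of_classes(observations: int) -> int:
    """Return the smallest N for which 2**N exceeds the number of observations.

    Assumption: the rule of thumb is read as 'take the smallest such N'.
    """
    n = 1
    while 2 ** n <= observations:
        n += 1
    return n


# Values taken from Example 2.2
largest, smallest, observations = 61, 9, 100

n_classes = number_of_classes(observations)
class_width = (largest - smallest) / n_classes

print("number of classes:", n_classes)
print("class width:", round(class_width, 2))
```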

Rounding the class width of 7.43 down to 7 and applying it across the data range, starting with the first class (9–16), gives the following frequency distribution table:

Range             9–16   17–24   25–32   33–40   41–48   49–56   57–64
Class frequency      8       7      19      26      28      11       1
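The class frequencies can be reproduced from the raw data with a few lines of Python. This is a minimal sketch: the class limits are copied from the table and treated as inclusive at both ends (an assumption), and the variable names are illustrative only.

```python
# Raw data from Example 2.2 (distance in 100s of km before a tyre failed)
distances = [
    38, 24, 12, 36, 41, 40, 45, 41, 40, 47,
    26, 15, 48, 44, 29, 43, 28, 29, 37, 10,
    37, 45, 29, 31, 23, 49, 41, 47, 41, 42,
    61, 40, 40, 45, 37, 55, 47, 42, 28, 38,
    38, 48, 18, 16, 39, 50, 14, 52, 33, 32,
    51, 10, 49, 21, 44, 31, 43, 34, 49, 48,
    28, 39, 28, 36, 56, 54, 39, 31, 35, 36,
    32, 20, 54, 25, 39, 44, 25, 42, 50, 41,
    9, 34, 32, 34, 42, 40, 43, 32, 30, 45,
    20, 29, 14, 19, 38, 46, 46, 39, 40, 47,
]

# Class limits as given in the frequency distribution table (inclusive)
class_limits = [(9, 16), (17, 24), (25, 32), (33, 40),
                (41, 48), (49, 56), (57, 64)]

# Count how many observations fall into each class
frequencies = [
    sum(lower <= d <= upper for d in distances)
    for lower, upper in class_limits
]

for (lower, upper), frequency in zip(class_limits, frequencies):
    print(f"{lower}-{upper}: {frequency}")
```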

The quality manager’s next step is to enter this frequency table into a spreadsheet as shown in Figure 2.7. Excel’s ChartWizard is used to obtain a histogram. The steps required to produce graphical output are given on lines 33–43 of the spreadsheet.

Probability – Measuring Levels of Uncertainty

Probability is the chance that something will happen. Probabilities are expressed as fractions or decimals. If an event is assigned a probability of 0, this means that the event can never happen. If the event is given a probability of 1 then it will always happen. Classical probability defines the probability that an event will occur, given that each of the outcomes is equally likely, as

$$P(\text{event}) = \frac{\text{number of ways that the event can occur}}{\text{total number of possible outcomes}}$$

For example, what is the probability of rolling a 4 on the first throw of a die? Since the total number of possible outcomes is 6, and the number of ways that 4 can be achieved with one throw is 1, the answer is P(event) = 1/6.

A probability distribution is similar to a frequency distribution because each uses intervals to group data items into more meaningful form. Probability distributions can be either discrete or continuous. A discrete probability distribution describes instances where the variable of interest can take on only a limited number of values, e.g., the rolling of a die is limited to one of six numbers. In a continuous probability distribution, the variable can take any value in a given range, e.g., measuring a child’s growth over a specified time period. The mean of a discrete probability distribution is referred to as its ‘expected value’.
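The die example lends itself to a small illustration of both classical probability and expected value. The sketch below assumes a fair six-sided die and uses exact fractions to avoid rounding; the variable names are illustrative only.

```python
from fractions import Fraction

# A fair six-sided die: every face is assumed to be equally likely
faces = [1, 2, 3, 4, 5, 6]

# Classical probability: one favourable outcome out of six possible outcomes
p_four = Fraction(1, len(faces))

# Expected value of a discrete distribution: sum of (value x probability)
expected_value = sum(face * Fraction(1, len(faces)) for face in faces)

print("P(rolling a 4):", p_four)
print("expected value:", float(expected_value))
```

Note that the expected value of 3.5 is not itself a possible outcome of a single throw; it is the long-run average over many throws.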

There are many different probability distributions, both discrete and continuous. The four more commonly used are (i) the Binomial distribution, which is a discrete distribution used to describe many business activities, (ii) the Poisson distribution which is a discrete distribution often used to count the number of occurrences of some event in a given period of time, (iii) the exponential distribution which is a continuous distribution used to measure the length of time needed to perform some activity, and (iv) the important continuous distribution known as the normal distribution.

[Figure 2.7: spreadsheet for Example 2.2 - The Wheelie Company]

No. of classes   Class interval   Frequency   Relative frequency
1                9–16                  8            0.08
2                17–24                 7            0.07
3                25–32                19            0.19
4                33–40                26            0.26
5                41–48                28            0.28
6                49–56                11            0.11
7                57–64                 1            0.01
Totals                               100            1

Relative frequency is the ratio of the class frequency to the total frequency, e.g. relative frequency for class 1 is 8/100 = 0.08, for class 2 is 7/100 = 0.07, etc.

Using ChartWizard to obtain graphical output – see Appendix for details

[1] Highlight the (shaded) range C5:D11
[2] Click the ChartWizard button (with coloured columns) on the standard tool bar
[3] From ChartWizard's Step 1 dialog box, choose Chart type 'column' and the Chart sub-type (shown in row 1, column 1). Click twice on the 'next' button to go to Step 3.
[4] Step 3 presents Chart Options. Enter titles for X and Y axes and then click on the 'Legend' tab. Clear the 'Show Legend' box. Then click on 'next' and 'finish' buttons.
[5] Click anywhere on a column with the right-hand button and choose 'Format Data Series' option. Click on 'Options' tab, change 'Gap width' to zero, and then click 'OK'.
[6] Position the top-left of the chart in cell B14. Then click on the lower-right handle and drag it into cell H30.

[Histogram: x-axis '100s of km travelled before one tyre failed', classes 9–16 to 57–64; y-axis 'Frequency', scaled 0 to 30]

Example 2.3 - Using Excel's Binomial distribution function BINOMDIST

BINOMDIST(number, size, probability, cumulative)

where number is the number of successful outcomes; size is the sample size; probability is the sample probability; cumulative is a logical value. If cumulative is set to TRUE then the function returns the cumulative Binomial probability; if FALSE it returns the individual Binomial probability.

In this example, number = no. of sales made per day, size = 20, probability = 0.1

Answers
(i)   Probability of no sales, P(0)     = BINOMDIST(0,20,0.1,FALSE)    = 0.1216
(ii)  Probability of four sales, P(4)   = BINOMDIST(4,20,0.1,FALSE)    = 0.0898
(iii) Probability of more than 4 sales  = 1 - BINOMDIST(4,20,0.1,TRUE) = 0.0432
(iv)  Probability of 4 or more sales    = 1 - BINOMDIST(3,20,0.1,TRUE) = 0.1330

Figure 2.8 Checking out sales data with the BINOMDIST function.
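For readers without Excel to hand, the four answers in Figure 2.8 can be checked with a short Python sketch. The helper functions `binom_pmf` and `binom_cdf` are hypothetical names written here to mirror BINOMDIST with cumulative set to FALSE and TRUE respectively; they are not part of any library.

```python
from math import comb


def binom_pmf(number: int, size: int, probability: float) -> float:
    """Individual binomial probability (BINOMDIST with cumulative = FALSE)."""
    return (comb(size, number)
            * probability ** number
            * (1 - probability) ** (size - number))


def binom_cdf(number: int, size: int, probability: float) -> float:
    """Cumulative binomial probability (BINOMDIST with cumulative = TRUE)."""
    return sum(binom_pmf(k, size, probability) for k in range(number + 1))


size, probability = 20, 0.1  # twenty calls per day, P(sale) = 0.1

p_no_sales = binom_pmf(0, size, probability)            # (i)
p_four_sales = binom_pmf(4, size, probability)          # (ii)
p_more_than_four = 1 - binom_cdf(4, size, probability)  # (iii)
p_four_or_more = 1 - binom_cdf(3, size, probability)    # (iv)

for label, value in [("(i)", p_no_sales), ("(ii)", p_four_sales),
                     ("(iii)", p_more_than_four), ("(iv)", p_four_or_more)]:
    print(label, round(value, 4))
```

Answers (iii) and (iv) differ only in whether the probability of exactly four sales is included, which is why the cumulative function is called with 4 in one case and 3 in the other.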

Probabilities can be either individual or cumulative as demonstrated by Excel’s BINOMDIST function in Figure 2.8. The cumulative probability is the sum of all probabilities up to and including the particular probability. For example, consider the following probability distribution table which gives the individual probabilities of selling different numbers of items:

No. of items sold    20     18     15     10
Probability          0.1    0.2    0.3    0.4

The cumulative probability of selling 15 items or more, usually denoted as P(≥15), is the sum of the probabilities of selling 15, 18 and 20, i.e., P(≥15) = 0.3 + 0.2 + 0.1 = 0.6. Similarly, the cumulative probability of selling 18 items or more, i.e., P(≥18), is the sum of the probabilities of selling 18 and 20 = 0.2 + 0.1 = 0.3. Conversely, the probability of selling fewer than 18 items, P(<18) = 1 − P(≥18) = 0.7, which is the same as the sum of the probabilities of selling 10 and 15 items. Note that in any probability distribution table, the sum of all the probabilities is always unity, i.e., 0.1 + 0.2 + 0.3 + 0.4 = 1.
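The cumulative calculations for the items-sold table can be sketched in Python as follows. Exact fractions are used so that the sums come out as 0.6, 0.3 and 0.7 without floating-point noise; the function name `prob_at_least` is hypothetical.

```python
from fractions import Fraction

# Probability distribution table: number of items sold -> probability
distribution = {
    20: Fraction(1, 10),
    18: Fraction(2, 10),
    15: Fraction(3, 10),
    10: Fraction(4, 10),
}


def prob_at_least(threshold: int) -> Fraction:
    """Cumulative probability of selling `threshold` items or more."""
    return sum(p for items, p in distribution.items() if items >= threshold)


p_ge_15 = prob_at_least(15)          # 0.3 + 0.2 + 0.1
p_ge_18 = prob_at_least(18)          # 0.2 + 0.1
p_lt_18 = 1 - p_ge_18                # complement of P(>=18)
total = sum(distribution.values())   # must always equal 1

print(float(p_ge_15), float(p_ge_18), float(p_lt_18), float(total))
```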

EXAMPLE 2.3 Using Excel’s binomial distribution function BINOMDIST

A salesperson makes twenty calls per day to randomly selected houses. If the probability of the salesperson making a sale is 0.1, use Excel’s binomial distribution function BINOMDIST to find the probability of (i) no sales (ii) four sales (iii) more than four sales (iv) four or more sales. The spreadsheet of Figure 2.8 utilises both the individual and cumulative probability features of the distribution function BINOMDIST (click Excel’s Function Wizard button fxto

Inferential Statistics – Drawing Conclusions from Samples

Inferential statistics, usually abbreviated to inference, is a process by which conclusions are reached on the basis of examining only a part of the total data available. A typical example of inference is an opinion poll that is used to predict the voting pattern of a country’s population during an election. Statistical inference can be divided into two main areas – estimation and hypothesis testing. Estimation is concerned with drawing conclusions from population samples. The objective of hypothesis testing is to use sample information to decide whether a manufacturer’s claim about a product should be confirmed or refuted.

In operations management, quality-control testing relies heavily on statistical estimation to accept or reject production output. A quality-control manager will take a random sample of products and if it is found that the number of defective items is too high the batch will be rejected. Adjustments must then be made to the production process in order to eliminate, or at least reduce, the level of deficiency.

Most people at some time or other have purchased a box of matches with the label inscribed ‘contents 100 approx’. If anyone bothered to count the number of matches in a random sample of six boxes, they would most likely find that the contents varied from, say 98 to 102. It would be most unusual, if not unique, if every one of the six boxes contained the same number of matches. Putting this observation into statistical terms, when the mean is calculated from a sample the value obtained, X̄, depends on which sample (of the many possible samples that could be chosen) is observed.

The difference between the population mean, µ (pronounced mu), and the sample mean, X̄, is called the sampling error. Two samples from the same population are likely to have different sample values and therefore possibly lead to different conclusions being drawn. Consequently, managers need to understand how sample means are distributed throughout the population, i.e., they need to understand the concepts of sampling distribution. Consider the following example.

EXAMPLE 2.4 Sampling error model

The investment manager of AstroReturns stockbrokers has been asked by a client to determine the average return on her portfolio investment of six stocks. The returns on each stock for last year are:

Stock A B C D E F

Return (%) 8 11 –3 18 3 5

The population mean µ for the six stocks is (8 + 11 − 3 + 18 + 3 + 5)/6 = 7%.

In this example, in order to illustrate the concept of sampling error, the investment manager will base his report on a simple random sample of three stocks from the six available. Because the population is so small, consisting of only six stocks, the investment manager could easily have carried out a census (i.e., shown that µ = 7). In Table 2.3, the twenty possible combinations of samples, along with their sample means, X̄, have been arranged in ascending order from 1.67 to 12.33.

Table 2.3

Stock     Average      Stock     Average      Stock     Average
sample    % return     sample    % return     sample    % return
CEF        1.67        CDE        6.00        DEF        8.67
ACE        2.67        BEF        6.33        ADE        9.67
ACF        3.33        CDF        6.67        ADF       10.33
BCE        3.67        ABE        7.33        BDE       10.67
BCF        4.33        ACD        7.67        BDF       11.33
ABC        5.33        ABF        8.00        ABD       12.33
AEF        5.33        BCD        8.67

When the sampling errors, µ – X̄, are calculated, they show a wide variation, ranging from +5.33 to −5.33. The client could therefore be seriously misled, depending upon which sample(s) the investment manager included in his report. Being aware that there will always be a sampling error is only part of the problem. Since the investment manager cannot know in advance how large the sampling error will be, he must organise his X̄ data in order to obtain a clearer picture of how the sample means are distributed.
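The twenty sample means of Table 2.3 and their sampling errors can be reproduced by enumerating every combination of three stocks. The following Python sketch is offered as a check on the table rather than as the method used in the text; the variable names are illustrative only.

```python
from itertools import combinations

# Last year's returns (%) for the six stocks in Example 2.4
returns = {"A": 8, "B": 11, "C": -3, "D": 18, "E": 3, "F": 5}

# Population mean: the result a full census would give
population_mean = sum(returns.values()) / len(returns)

# Mean return for every possible sample of three stocks
sample_means = {
    "".join(stocks): sum(returns[s] for s in stocks) / 3
    for stocks in combinations(returns, 3)
}

# Sampling error = population mean minus sample mean
sampling_errors = {
    name: population_mean - mean for name, mean in sample_means.items()
}

# List the samples in ascending order of sample mean, as in Table 2.3
for name, mean in sorted(sample_means.items(), key=lambda item: item[1]):
    print(name, f"{mean:5.2f}", f"{sampling_errors[name]:+.2f}")
```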

The easiest way of presenting data is to construct a frequency distribution table. All of the sample means, X̄, lie within the range 1–13. By taking six classes of interval size 2, ranging from 1–3 to 11–13, the graph of Figure 2.9 can be created to show the distribution of all possible X̄ values, i.e., the sampling distribution of X̄. It can be seen that the distribution approximately follows a normal bell-shaped curve. This feature illustrates one of the characteristics of the important central limit theorem. Another aspect of the central limit theorem is that the ‘mean of the means’, i.e., the mean of the sampling distribution, is equal to the population mean, µ.
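Both observations can be checked numerically. The sketch below groups the twenty sample means into the six classes described above and computes the mean of the means. It assumes that each class includes its lower limit and excludes its upper limit; since no sample mean falls exactly on a class limit, this choice does not affect the counts. Exact fractions are used so that the comparison with µ is not disturbed by rounding.

```python
from fractions import Fraction
from itertools import combinations

# Last year's returns (%) for the six stocks in Example 2.4
returns = {"A": 8, "B": 11, "C": -3, "D": 18, "E": 3, "F": 5}

# Exact sample mean for every possible sample of three stocks
sample_means = [
    Fraction(sum(sample), 3) for sample in combinations(returns.values(), 3)
]

# Six classes of interval size 2, from 1-3 up to 11-13
class_limits = [(1, 3), (3, 5), (5, 7), (7, 9), (9, 11), (11, 13)]
frequencies = [
    sum(lower <= mean < upper for mean in sample_means)
    for lower, upper in class_limits
]

# The 'mean of the means' compared with the population mean
mean_of_means = sum(sample_means) / len(sample_means)
population_mean = Fraction(sum(returns.values()), len(returns))

for (lower, upper), frequency in zip(class_limits, frequencies):
    print(f"{lower}-{upper}: {'*' * frequency}")
print("mean of means:", mean_of_means, "population mean:", population_mean)
```

With only twenty samples the shape is a coarse, symmetric approximation to the bell curve, peaking in the two middle classes.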

Using Excel’s ChartWizard, follow the steps below to obtain the graph in Figure 2.9: