Measures of central tendency

Measures of central tendency show the middle or center of a sample or population. For example, the average value of a dataset is a measure of central tendency. There are three widely used measures of central tendency, and each has advantages and disadvantages:

Mode, median, and mean.

These three measures provide an instructive example of why understanding whether your data is nominal, ordinal, or interval is important. We address them in increasing order of complexity.

Mode

The mode of a dataset is simply the most commonly observed value. There is no formula for computing the mode, and the procedure is the same for both samples and populations. First sort the observations from the lowest to the highest. Then compute the frequency of each observation; the value with the highest frequency is the mode.

For example, suppose a sample consists of the following values: 0, 3, 8, 2, 2, 6, 5

The sorted values are as follows: 0, 2, 2, 3, 5, 6, 8

Because 2 appears twice in this sample, and no other value appears more than once, 2 is the mode.

One interesting feature of the mode is that, unlike the mean or the median, a dataset can have two or more modes, or no mode at all. For example, here’s a sample with two modes:

1, 1, 3, 5, 7, 8, 8, 9

In this case, the modes are 1 and 8, since each of these values appears twice in the sample, whereas none of the others appears more than once.

An example of a sample with no mode: 0, 2, 4, 6, 8, 10

None of the values is repeated, so there is no mode.
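The sort-and-count procedure above can be sketched in Python. This is only a minimal illustration: the function name `modes` is made up for this example, and returning an empty list when no value repeats is an assumption that mirrors this section's convention that such a sample has no mode.

```python
from collections import Counter


def modes(values):
    """Return every mode of `values`, sorted; [] if no value repeats.

    `modes` is a hypothetical helper name used only for this illustration.
    """
    counts = Counter(values)          # frequency of each distinct value
    if not counts:                    # empty dataset: nothing to report
        return []
    highest = max(counts.values())    # largest observed frequency
    if highest == 1:                  # nothing repeats, so no mode (this section's convention)
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([0, 3, 8, 2, 2, 6, 5]))       # [2]
print(modes([1, 1, 3, 5, 7, 8, 8, 9]))    # [1, 8]
print(modes([0, 2, 4, 6, 8, 10]))         # []
```

Note that Python's standard `statistics.multimode` follows a different convention: when no value repeats, it returns every value rather than an empty list.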

The mode can be useful for non-numeric data, when computing a mean or a median would be impossible. The mode is essentially the only statistical measure that can be correctly applied to nominal data.
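Because finding the mode only requires counting, the same idea works on labels that have no numeric meaning. The short sketch below uses `multimode` from Python's standard `statistics` module; the list of colors is invented purely for illustration and is not data from the text.

```python
from statistics import multimode

# Hypothetical nominal data: colors have no order and cannot be averaged.
colors = ["red", "blue", "red", "green", "blue", "red"]

# multimode returns a list of the most frequent value(s).
print(multimode(colors))   # ['red']
```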

Median

The median of a dataset is its midpoint. In other words, half of the observations are below the median, and half are above it. There is no single formula for computing the median; you compute it in two steps. The first step is to sort the observations from the lowest to the highest value. The next step is to identify the “central” observation in the dataset.

For example, suppose a sample consists of the following values: 1, 4, 2, 7, 5

Here are the sorted values: 1, 2, 4, 5, 7

The central observation is 4: two of the observations (1, 2) are below it, and two (5, 7) are above. Therefore, 4 is the median of this sample.

For an even number of observations, the procedure changes slightly. For example, suppose that a sample consists of the following values:

1, 4, 2, 7, 5, 6

Here are the sorted values: 1, 2, 4, 5, 6, 7

The third smallest and fourth smallest values are considered to be the two central values; in this case, they are 4 and 5. Take the midpoint between 4 and 5, which is 4.5. This is the median of the sample. Half of the observations in the sample (1, 2, 4) are below 4.5, and half of the observations (5, 6, 7) are above 4.5.
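The two cases (odd and even number of observations) can be sketched as a small Python function. The name `sample_median` is a hypothetical one chosen for this illustration; it simply follows the sort-then-find-the-center steps described above.

```python
def sample_median(values):
    """Median via the two steps in the text: sort, then find the center.

    `sample_median` is a hypothetical name used only for this illustration.
    """
    ordered = sorted(values)              # step 1: lowest to highest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:                        # odd count: one central observation
        return ordered[mid]
    # even count: midpoint between the two central observations
    return (ordered[mid - 1] + ordered[mid]) / 2


print(sample_median([1, 4, 2, 7, 5]))      # 4
print(sample_median([1, 4, 2, 7, 5, 6]))   # 4.5
```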

There is no difference in computing the median of a population. You compute the median of a sample and the median of a population in exactly the same way.

The median can be a more meaningful measure than the mean when a dataset contains extremely large or small values, which are known as outliers. Outliers are the topic of Chapter 10.
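To see why, compare how the two measures react to a single extreme value. The sketch below uses Python's standard `statistics` module; both lists are hypothetical values chosen only to illustrate the effect.

```python
from statistics import mean, median

# Hypothetical values, chosen only to illustrate the effect of one outlier.
typical = [1, 2, 4, 5, 7]
with_outlier = [1, 2, 4, 5, 700]   # same data, largest value replaced by an outlier

print(mean(typical), median(typical))             # 3.8 4
print(mean(with_outlier), median(with_outlier))   # 142.4 4
```

The outlier pulls the mean from 3.8 to 142.4, while the median stays at 4.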

Calculating the median requires you to be able to sort your data from lowest to highest. Obviously, then, your data needs to be ordered. As explained at the beginning of this chapter, ordering is exactly the property that characterizes ordinal data. So the median is a meaningful measure for ordinal data.

Mean

In statistics, the word mean is a synonym for average. You compute a mean by adding up all the elements in a dataset and then dividing by the number of elements. The equations in this section show how to compute the mean of a sample and the mean of a population.

Suppose a researcher chooses a sample of prices of a gallon of gas in a major city. The sample consists of the following eight prices:

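$3.98, $4.19, $3.79, $3.99, $3.78, $3.69, $3.97, $4.13

He or she computes the mean price of this sample as follows:

\[
\bar{X} = \frac{3.98 + 4.19 + 3.79 + 3.99 + 3.78 + 3.69 + 3.97 + 4.13}{8} = \frac{31.52}{8} = 3.94
\]

The prices sum to $31.52. Divided by the sample size of 8, this gives a mean of $3.94 per gallon.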

If the eight gas stations that were randomly chosen reflect the underlying population of gas prices, then the sample mean of $3.94 per gallon will provide a good estimate of the mean price for the entire population.
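The arithmetic for the gas price sample can be checked with a few lines of Python. This is a minimal sketch; the variable names are chosen for illustration, and the results are rounded to cents for display.

```python
# The eight sample prices from the text, in dollars per gallon.
prices = [3.98, 4.19, 3.79, 3.99, 3.78, 3.69, 3.97, 4.13]

total = sum(prices)        # add up all the elements
n = len(prices)            # number of elements in the sample
sample_mean = total / n    # divide the sum by the sample size

print(f"sum = {total:.2f}, n = {n}, mean = {sample_mean:.2f}")
# sum = 31.52, n = 8, mean = 3.94
```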

In general, the mean of a sample is computed as follows:

\[
\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}
\]

This formula uses the following terms:

X̄ (pronounced “X bar”) is the mean of the sample.

Σ (the upper case Greek letter sigma) is used to indicate that a sum is being computed.

n is the number of elements in the sample.

i is an index used to assign a number to each sample element, ranging from 1 to n.

X_i is a single element in the sample.

The mean of a population is computed as follows:

\[
\mu = \frac{\sum_{i=1}^{N} X_i}{N}
\]

The new term in this formula is μ, which is the Greek letter mu. It represents the mean of a population. Also note here the convention of using a capital N to represent the population size. This helps to avoid confusion with the sample size.

It’s common practice in statistics to use Greek letters to represent summary measures of populations, and Latin letters to represent summary measures of samples.

Calculating the mean (and, in fact, the measures discussed in the rest of this chapter) requires that your data meet the requirements of interval data. Remember, interval data represents measurements that can be compared. Averages aren’t meaningful if your observations can’t be compared at face value.

See Chapter 4 for info on the normal distribution. This distribution has some very nice properties from a mathematical standpoint. One of the things that makes it “normal” is that the three measures described in this section coincide. In other words, if a distribution is normal, then the mean, median, and mode are all equal.

In document Análisis de la viabilidad de un sistema de energía geotérmica en una vivienda unifamiliar (pages 47-53)