Proper organization of the data is critical if they are to be helpful in continuous improvement. Rational subgrouping allows the sample data to be collected and presented in a manner that will reveal the sources of variation. Data for each subgroup need to be collected from a small area so that conditions within the subgroup are homogeneous.
Data collected and presented in such a manner can potentially reveal variations from two sources: within subgroup and across subgroup. We would like to treat the within-subgroup variation as noise so that we can look for any signals between subgroups over time.
Example: A department has five punch presses, each of which has five cavities, that make Product A. Data can be collected in any of the following ways:
1. Select one sample from each press on the hour to create a subgroup size of five
2. Select five samples (one from each cavity) within each press
3. Select five samples within each cavity in each press
If method one is chosen, we are looking at the variation of the five presses from one hour to the next. If method two is chosen, we are looking at the variation within each press. If method three is chosen, we are looking at the variation within each cavity of a press.
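As a minimal sketch of the two sources of variation, the following assumes method two (each press is a subgroup of its five cavities). The readings and the names `readings_by_press`, `within_range`, and `between_range` are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical readings in arbitrary integer units: one subgroup per press,
# one reading per cavity (method two in the example above).
readings_by_press = {
    "press 1": [500, 502, 501, 499, 503],
    "press 2": [510, 511, 509, 512, 508],
    "press 3": [500, 501, 502, 498, 499],
    "press 4": [505, 504, 506, 507, 503],
    "press 5": [499, 500, 501, 502, 498],
}

subgroup_means = {p: mean(v) for p, v in readings_by_press.items()}
subgroup_ranges = {p: max(v) - min(v) for p, v in readings_by_press.items()}

# Within-subgroup variation (treated as noise): the average subgroup range.
within_range = mean(list(subgroup_ranges.values()))

# Across-subgroup variation (where signals are sought): spread of the subgroup means.
between_range = max(subgroup_means.values()) - min(subgroup_means.values())

print("average within-subgroup range:", within_range)
print("range of subgroup means:", between_range)
```

In this made-up data set the subgroup means differ by more than the typical spread inside any one press, which is the kind of between-subgroup signal that rational subgrouping is meant to expose.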
Proper subgrouping of the data is critical to process improvement efforts. When subgroups are not rationally created, useful information may not be obtained.
A variety of basic statistical tools are available to help in understanding the process, where problems are occurring, and which problems are trivial and which are significant.
Histograms, Pareto charts, and check sheets are some basic tools that can be used by any team member to better understand what is occurring within a process.
2.3.2.1. Histograms
A histogram is a bar graph that provides a picture of the shape and spread of the data gathered by showing the frequency of the measurements or occurrences. A histogram can be constructed whenever quantitative (continuous) data or qualitative (categorical) data are involved. The measurement scales are shown on the x axis, and the frequency in each interval is shown on the y axis. The height of each bar represents the frequency of observations within that interval.
Histogram Statistics
A histogram and accompanying data allow for the following statistics to be calculated:
Mean: The arithmetic average of all the values. This is the sum of all the data in the data set divided by the number of observations in the data set.
Minimum: The smallest number in the data set.
Maximum: The biggest number in the data set.
Standard deviation: The amount of variation or dispersion from the mean (the “average”). A small standard deviation value indicates that data are clustered together, while a large value indicates that data are spread apart.
Bin width: The x axis distance between the left and right edges of each bin in the histogram.
Number of classes: The number of bins in the histogram.
Skewness: Is the histogram symmetrical? If so, skewness is zero. If the left-hand tail is longer, skewness is negative. If the right-hand tail is longer, skewness is positive.
Kurtosis: A measure of the peakedness of a distribution. Measured as excess kurtosis, the normal curve has a kurtosis of zero. Positive kurtosis indicates a “peaked” distribution, and negative kurtosis indicates a “flat” distribution.
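The statistics listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed method: the function name `histogram_statistics` is hypothetical, the standard deviation uses the sample form (dividing by n - 1), and skewness and kurtosis use simple moment-based formulas, which are only one of several conventions in use.

```python
from statistics import mean, stdev

def histogram_statistics(values):
    """Summary statistics for a data set with at least two distinct values."""
    n = len(values)
    avg = mean(values)
    # Central moments in population form (dividing by n).
    m2 = sum((x - avg) ** 2 for x in values) / n
    m3 = sum((x - avg) ** 3 for x in values) / n
    m4 = sum((x - avg) ** 4 for x in values) / n
    return {
        "mean": avg,
        "minimum": min(values),
        "maximum": max(values),
        "standard deviation": stdev(values),  # sample form; an assumption
        "skewness": m3 / m2 ** 1.5,           # 0 for symmetrical data
        "kurtosis": m4 / m2 ** 2 - 3,         # excess kurtosis: 0 for a normal curve
    }

# Hypothetical data sets: one symmetrical, one with a long right-hand tail.
symmetric = histogram_statistics([1, 2, 3, 4, 5])
right_tailed = histogram_statistics([1, 1, 1, 2, 10])

print(symmetric["skewness"], right_tailed["skewness"])
```

The symmetrical set yields a skewness of zero, while the set with the long right-hand tail yields a positive skewness, matching the sign convention described above.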
Histogram Interpretation
The shape of a histogram will vary depending on the choice of the size of the intervals.
Once constructed, the histogram will show the pattern of variation in the data.
Histograms show shape, skewness, and modes.
Shape
The shape of the distribution conveys important information such as the probability distribution of the data. The normal curve (Figure 2.3.2.1-1), with its bell shape, means the data collected from the process under study have a normal distribution, which is the desired shape. If we know that the data fit into the normal distribution, we can use the probability tables from the standard normal distribution to make predictions about the data.
Symmetry (Skewness)
The skewed curve tells us that the distribution is not symmetrical. Positive skewed, or right skewed, data are so named because the “tail” of the distribution points to the right, while with negative skewed data, the distribution’s tail points to the left (Figure 2.3.2.1-2).
Real estate prices are usually skewed, since a small number of homes in a suburb may sell for far more (or far less) than most of the others. Because such extreme values pull the mean toward the tail, the median is usually a better indicator of the typical selling price than the mean.
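A small numerical illustration of the real estate point, using hypothetical selling prices (in thousands of an arbitrary currency unit) in which one home sells far above the rest:

```python
from statistics import mean, median

# Hypothetical selling prices; the last sale forms a long right-hand tail.
prices = [200, 210, 220, 230, 900]

average_price = mean(prices)    # pulled upward by the single 900 sale
typical_price = median(prices)  # middle value, unaffected by how large the outlier is

print("mean:", average_price, "median:", typical_price)
```

Here four of the five homes sold for 230 or less, so the median describes a typical sale far better than the mean does.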
Modes
The mode is the value that occurs most frequently in a set of data. It is found by simply counting the number of times each value occurs in a data set. A distribution with one such high point, such as a normal distribution, is called unimodal. A distribution that has two modes is called bimodal (Figure 2.3.2.1-3) and indicates that one may have collected data from a mixed population. For example, data could have been collected from two different machines making the same product. If this is the case, data need to be collected on each of the machines separately. Having more than two modes means that the distribution is multimodal; a mix of several populations is a probable cause, so the data may need to be stratified.
Constructing a Histogram
The following steps are involved in constructing a histogram:
1. Identify a characteristic of a product for which data need to be collected. For example, consider the volume of a bottle of detergent.
2. Determine the sample size, ideally about 100 data points or 100 bottles.
3. Select the samples and record the measurement value. In this case, the volume of detergent in each bottle sampled is recorded. Record the value on a tabular sheet in time order.
4. Find the range of the data set. This is done by studying the tabular sheet to find the highest and lowest volumes. If the highest volume is 156.3 oz and the lowest is 150 oz, then the range is 6.3 oz.
5. Decide on how many bins you will need for your histogram. Too few or too many will make interpretation of the data difficult. For continuous data as in the example, it often depends on the precision of the measurement instrument and of the equipment. For categorical data, it is simply the number of categories being used. In this example, seven bins should be used since the range is 6.3.
6. For continuous data, determine the width of each bin. Make sure the bins do not overlap. In this example, the bins will have increments of 0.9 oz. Since the lowest value is 150 oz, the bin boundaries fall at 150.0, 150.9, 151.8, 152.7, 153.6, 154.5, 155.4, and 156.3 oz.
7. Post the data on the check sheet. Each value recorded on the tabular sheet becomes a tally mark on the check sheet.
8. Construct the histogram by using the information on the check sheet.
9. Record other information in the legend such as shift, operator, machine, and product name.
10. Calculate the histogram statistics from the 100 values in the data set.
11. Interpret the shape, symmetry, and mode of the histogram.
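Steps 4 through 8 can be sketched as follows for the detergent example (lowest value 150 oz, seven bins of 0.9 oz). The sample volumes and the function names `bin_edges` and `bin_counts` are hypothetical, and the sketch assumes measurements are recorded to the nearest 0.1 oz, so it works in integer tenths of an ounce to keep boundary values from being misplaced by floating-point rounding.

```python
def bin_edges(low_oz=150.0, width_oz=0.9, n_bins=7):
    """Return the n_bins + 1 bin boundaries in ounces."""
    low, width = round(low_oz * 10), round(width_oz * 10)
    return [(low + i * width) / 10 for i in range(n_bins + 1)]

def bin_counts(values_oz, low_oz=150.0, width_oz=0.9, n_bins=7):
    """Count how many values fall in each bin (the check-sheet tallies)."""
    low, width = round(low_oz * 10), round(width_oz * 10)
    counts = [0] * n_bins
    for value in values_oz:
        tenths = round(value * 10)
        if not low <= tenths <= low + n_bins * width:
            raise ValueError(f"{value} oz is outside the histogram range")
        # A value on a boundary goes in the higher bin; the maximum goes in the last bin.
        index = min((tenths - low) // width, n_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical sample of recorded volumes (a real study would use about 100).
volumes = [150.0, 150.8, 150.9, 152.7, 153.0, 153.1, 154.4, 156.3]
edges = bin_edges()
counts = bin_counts(volumes)

# Print a simple text histogram, one row per bin.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:6.1f} to {right:6.1f} oz | {'X' * count}")
```

Assigning boundary values to the higher bin is a design choice made here so that no value is counted twice; the opposite convention works equally well as long as it is applied consistently.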
2.3.2.2. Pareto Charts
Italian economist Vilfredo Pareto was interested in finding out the distribution of wealth in his country. Accordingly, he collected data and showed that the wealth was unevenly distributed, with about 80% of the wealth in the hands of about 20% of the people. Dr. Joseph Juran applied the same principle to business situations, and it became known as the Pareto principle. The Pareto principle, which is the basis for the Pareto chart, states that a few errors or defects account for most of the problems, or that 80% of the effects are the result of 20% of the causes. It is also known as the 80/20 rule or the law of the vital few. The significance of the Pareto principle is that it helps an organization identify the vital few causes behind most of its process issues.
A visual representation of the Pareto principle, the Pareto chart is a simple graphical technique for rank-ordering data from the most important to the least important. Data are displayed in a format to compare the relative significance of events, costs, or any other measure. By distinguishing the significant errors and defects from those of lesser importance, organizations can leverage maximum improvement by focusing attention on a few of the causes rather than all of the causes.
It is a tool commonly used in problem solving. In the “Define” phase of the DMAIC structure, it can be used to identify those significant few problems so people can target them for process improvement and narrow the scope of the problem to focus the team. In the “Measure” phase of the DMAIC structure, it can be used as a drill-down tool to get to the most likely cause of a problem, to provide a basis for action.
As organizations conduct their business, data can be generated and collected from many sources, such as errors, defects, lead time, customer complaints, causes for rework, scrap, customer returns, and field failures. A common example is where 80% of sales volume or sales revenue comes from 20% of the customers. Capturing data on these measures of an organization’s processes and creating a Pareto chart can help identify different classes or types of problems. It also graphically displays the results so that the significant few problems emerge from the general background and enable sound business decisions.
Constructing a Pareto Chart
Following are common steps for constructing a Pareto chart:
1. Decide on what data to collect, for example, a quality measure of a process (such as customer returns, scrap, or lead times) or a productivity/efficiency measure (such as downtime causes).
2. Create a preliminary list of categories. For example, if selecting customer returns, the category may be “Reason for customer return.” If selecting scrap or rework/reprocessing, the category may be “Cause for scrap or rework/reprocessing.”
3. Decide on a time frame for data collection. Should it be one year, six months, or some other measure? Decide who will collect the data and how the data will be collected. If needed, a check sheet can be created to capture the raw data and the source of data identified.
4. Collect the data for the desired time frame. Tally the occurrences in each problem category.
5. Create a table to calculate the values for the Pareto chart using the data (Table 2.3.2.2-1).
6. Enter the different categories in the first column.
7. Enter the frequency of occurrence against that category in the second column.
8. Once the first two columns are completed as shown in Table 2.3.2.2-2, compute the grand totals. The grand total equals the sum of the frequency of occurrence in all the categories. Enter this value at the bottom of column two.
9. Compute the individual percentage for each category. Individual percentage equals the frequency of occurrence divided by the grand total and multiplied by 100. Enter this value against each category in column three. This value represents the individual percentage contribution of that category to all the problems.
10. Calculate the cumulative percentage. The cumulative percentage for the first category is the same as the individual percentage. The cumulative percentage for the second category is the total of the individual percentages for the first two categories, and so on. Once the cumulative percentage has been computed, the Pareto chart can be drawn.
11. Draw a horizontal x axis and two vertical y axes. Mark the left y axis in increments from 0 to the grand total and label the axis “Frequency.” Mark the right y axis in increments from 0 to 100 and label the axis “Percent.”
12. Construct the Pareto chart starting on the left with the highest-frequency category and ending with the lowest frequency. The height of each bar should correspond to the frequency of occurrences for that category.
13. Label the bars with the category name under the horizontal x axis.
14. Place a dot in the chart that corresponds to the cumulative percentage value shown on the right axis for each of the categories. Connect the dots with a line showing the cumulative percentage total reached with the addition of each problem category. The line should end at the 100% mark on the right axis.
15. Title the chart and include a brief synopsis of the data collection and source data.
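The table calculations in steps 5 through 10 can be sketched as follows. The category names, the counts, and the function name `pareto_table` are hypothetical. The sketch sorts the categories from highest to lowest frequency before accumulating, so the cumulative column matches the left-to-right order of the bars in step 12, and it derives each cumulative percentage from the running frequency total so that the last row is exactly 100.

```python
def pareto_table(frequencies):
    """Return rows of (category, frequency, individual %, cumulative %)."""
    grand_total = sum(frequencies.values())
    # Rank categories from the highest frequency to the lowest.
    ranked = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)
    rows, running_total = [], 0
    for category, frequency in ranked:
        running_total += frequency
        individual_pct = frequency * 100 / grand_total
        cumulative_pct = running_total * 100 / grand_total
        rows.append((category, frequency, individual_pct, cumulative_pct))
    return rows

# Hypothetical tallies, deliberately entered out of rank order.
defects = {"Dents": 30, "Other": 5, "Scratches": 50, "Misalignment": 15}
rows = pareto_table(defects)

for category, frequency, individual_pct, cumulative_pct in rows:
    print(f"{category:<14}{frequency:>5}{individual_pct:>8.1f}{cumulative_pct:>8.1f}")
```

The bars of the chart are drawn from the frequency column, and the line is drawn through the cumulative percentage column.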
Example of a Pareto Chart
A company collected data on the number of accidents in the various departments for a one-year period (Figure 2.3.2.2-1).
The Pareto chart shows that the plating department has the highest number of accidents, with an individual contribution of 27.1%. Plating, heat-treat, assembly, and grinding combined account for 80.6% of the accidents.
2.3.2.3. Check Sheets
Data can be collected in many ways. One method is the use of check sheets. Check sheets allow people to collect and record data on a real-time basis at the location of the data source. The check sheet is typically used for data on locations. This can be anything from defect locations on a product to injuries on a body location to locations in a facility where injuries or other events occur. Like the Pareto chart, the check sheet attempts to narrow down occurrences or root causes to the vital few areas of needed focus.
In order to collect data, a check sheet needs to be set up. To set up a check sheet, you must decide on the type of data you want to collect and the time frame required for collecting the data. A check sheet captures the data at the source and is an input to process improvement tools such as Pareto charts and histograms.
Data are recorded on a check sheet by placing a tick mark; for example, “III” or “XXX” indicates three instances for that location during the observation. Once the check sheet is completed, it is read by observing the number of tick marks on the sheet against each location.
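The tick-mark recording described above can be mimicked in a few lines. This is only a sketch: the location names and the function name `tally` are hypothetical, and each observation is assumed to arrive as a plain text label.

```python
from collections import Counter

def tally(observations):
    """Return one string of tick marks per location, in order of first appearance."""
    counts = Counter(observations)
    return {location: "I" * count for location, count in counts.items()}

# Hypothetical observations recorded in the order they occurred.
observations = [
    "location A", "location B", "location A",
    "location A", "location C", "location B",
]
sheet = tally(observations)

for location, marks in sheet.items():
    print(f"{location:<12}{marks}")
```

Reading the completed sheet is then a matter of comparing the number of marks per row, and the same counts can be passed on to a Pareto chart or histogram.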
Check sheets can be used for any of the following:
To categorize observations. Observations can be categorized and the frequency of occurrence captured under each category.
To show the location of an occurrence, as in a measles chart. This type of check sheet is known as a pictogram (see Section 2.3.4.2, “Concentration Diagrams”).
To record inspection data. The check sheet is segmented into measurement intervals and the actual measurements are recorded as tick marks in the appropriate interval.
Constructing a Check Sheet
1. Decide what data to collect and why they need to be collected. What question will be answered by collecting these data? Who will collect the data? Select the appropriate data to be collected that will address the purpose.
2. Decide on the frequency, timing, and location for collecting the data.
3. Construct a form to collect the appropriate data. A table or picture may be used.
4. Create an operational definition for the data to be collected. In other words, define the categories, defect types, locations, and so on, that will help the person gathering the data determine the category where a tick mark should be placed.
5. Provide training to the person recording the data on how the data should be collected and recorded.
6. Let people know what is going on, and how the data will be used, in the area where the data collection is taking place.
7. Keep the data honest; do not discard data that disagree with the hypothesis.
8. Determine how the data will be analyzed.
9. Do not punish or blame people for what the data reveal. If you penalize individuals, subsequent data collection efforts may not be successful.
Example of a Check Sheet
A high school wanted to collect data on errors that occur on an essay test administered to students. A check sheet was constructed and data gathered (Table 2.3.2.3-1).
Example of a Pictogram (Concentration Diagram)
A car service center identified the location of rust on car doors by having a picture of the car door and marking the location of the rust for every car it serviced. “X” marks the location of the rust. At the end of the study, the pictogram identified areas where the service center will have to provide rust prevention. The pictogram (Figure 2.3.2.3-1) indicates that rust on the bottom of the door is a problem. There is no rust on the other areas of the car door.
Histograms, Pareto charts, and check sheets are all tools utilized in root cause problem solving. Each of these tools can be used for defining an issue or determining the significant root causes of a particular problem.