One of the purposes of control charts is to estimate the average and standard deviation of a process. The average is easy to calculate and understand – it is just the average of all the results. The standard deviation is a little more difficult to understand – and to complicate things, there are multiple ways that it can be determined – each giving a different answer.
Sometimes people ask why some software packages give different values for the control limits. Which program is correct? The answer is probably both. The difference is simply how the standard deviation is estimated. The objective of this newsletter is to show three different, but common, ways that the standard deviation may be estimated. We will look at data that are formed into subgroups and the control limits on the X chart.
In this issue:
- Data Subgroups
- Three Ways to Estimate the Standard Deviation
- Average of Subgroup Ranges
- Average of Subgroup Standard Deviations
- Pooled Standard Deviation
Data Subgroups
The data we will use are shown in Table 1. We have 10 subgroups, each containing 3 observations or results.
Table 1: Subgroup Data
So, our subgroup size is constant for each of the 10 subgroups. The subgroup average, range and standard deviation have also been calculated for use below. The overall sum and average are given for subgroup averages, subgroup ranges and subgroup standard deviations – again for use below. There may be some minor differences due to rounding.
Three Ways to Estimate the Standard Deviation
We will look at three different ways to estimate the standard deviation. The choice of method affects how the control limits are calculated. Control limits for the X chart are given by:

$$UCL = \bar{\bar{X}} + \frac{3\sigma}{\sqrt{n}}$$

$$LCL = \bar{\bar{X}} - \frac{3\sigma}{\sqrt{n}}$$

where UCL and LCL are the upper and lower control limits, $\bar{\bar{X}}$ is the overall average (the average of the subgroup averages), n is the subgroup size, and σ is the estimated standard deviation of the individual values. Remember: the standard deviation of the subgroup averages is equal to the standard deviation of the individual values divided by the square root of the subgroup size. These control limit equations may look different from the ones you normally use, but they are equivalent, as will be shown below.
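As a minimal sketch of the control limit calculation described above, the Python function below takes an overall average, an estimate of σ, and the subgroup size. The function name `xbar_limits` and the overall average of 100 are assumptions made for illustration; they do not come from Table 1. The value σ = 8.36 is the average-range estimate reported later in this newsletter.

```python
import math


def xbar_limits(grand_mean, sigma, n):
    """Return (LCL, UCL) for an X chart.

    sigma is the estimated standard deviation of the individual values,
    so sigma / sqrt(n) is the standard deviation of the subgroup averages.
    """
    half_width = 3 * sigma / math.sqrt(n)
    return grand_mean - half_width, grand_mean + half_width


# Illustrative inputs: the overall average of 100 is made up;
# sigma = 8.36 and n = 3 match the average-range example in this newsletter.
lcl, ucl = xbar_limits(grand_mean=100.0, sigma=8.36, n=3)
print(round(lcl, 2), round(ucl, 2))  # 85.52 114.48
```

Whichever method is used to estimate σ, only the `sigma` argument changes; the rest of the calculation stays the same.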
The value of σ depends on the method you use to estimate it. We will look at three methods for estimating σ for subgroup data:
- Average of the subgroup ranges
- Average of the subgroup standard deviations
- Pooled standard deviation
Average of Subgroup Ranges
The average of the subgroup ranges is the classical way to estimate the standard deviation. The average range is simply the average of the subgroup ranges when the subgroup size is constant:

$$\bar{R} = \frac{1}{k}\sum_{i=1}^{k} R_i$$

where Ri is the range of the ith subgroup and k is the number of subgroups. The standard deviation is then estimated from the following equation:

$$\sigma = \frac{\bar{R}}{d_2}$$

where d2 is a constant that depends on subgroup size. Table 2 shows the values of d2 based on subgroup sizes up to 20. From the table, you can see that d2 for a subgroup size of 3 is 1.693.
Table 2: Constants for Control Charts
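The average range method can be sketched in a few lines of Python. The subgroup values below are made up for illustration and are not the Table 1 data. The function name `sigma_from_average_range` is also an assumption, and only the d2 constants for subgroup sizes 2 to 5 are included.

```python
# Standard d2 constants for subgroup sizes 2 through 5.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}


def sigma_from_average_range(subgroups):
    """Estimate sigma as (average subgroup range) / d2."""
    n = len(subgroups[0])
    if any(len(g) != n for g in subgroups):
        raise ValueError("this sketch assumes a constant subgroup size")
    ranges = [max(g) - min(g) for g in subgroups]
    r_bar = sum(ranges) / len(ranges)
    return r_bar / D2[n]


# Hypothetical data: 4 subgroups of size 3 (not the newsletter's Table 1).
subgroups = [[98, 105, 110], [102, 95, 101], [107, 99, 112], [93, 104, 100]]
sigma_hat = sigma_from_average_range(subgroups)
print(round(sigma_hat, 2))  # 6.35 (average range 10.75 divided by 1.693)
```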
For the data in Table 1, the average range and σ are given by:
Using the estimate of the standard deviation from the average range, we can now calculate the control limits:
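You may not be used to calculating control limits this way for the X chart. You probably use the following equations:

$$UCL = \bar{\bar{X}} + A_2\bar{R}$$

$$LCL = \bar{\bar{X}} - A_2\bar{R}$$

where $\bar{\bar{X}}$ is the overall average, $\bar{R}$ is the average range, and A2 is a constant that depends on subgroup size. Consider just the UCL. There are two different equations for the UCL above, which must give the same result. So,

$$\bar{\bar{X}} + \frac{3\sigma}{\sqrt{n}} = \bar{\bar{X}} + A_2\bar{R}$$

This can be rearranged to the following:

$$\frac{3\sigma}{\sqrt{n}} = A_2\bar{R}$$

Substituting $\bar{R} = d_2\sigma$ and solving for A2 gives:

$$A_2 = \frac{3}{d_2\sqrt{n}}$$

Substituting in d2 and n for our example gives:

$$A_2 = \frac{3}{1.693\sqrt{3}} = 1.023$$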
This is the value of A2 for a subgroup size of 3 that you find in the tabulated control chart constants. For a table of these values, please see our newsletter on X-R control charts. So, both methods for calculating the control limits are equivalent. The X control chart for these data is shown in Figure 1.
Figure 1: X Based on Sigma from Average Range
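The relationship between A2 and d2 derived above is easy to check numerically. The short Python sketch below uses a hypothetical helper, `a2_from_d2`, to reproduce the tabulated A2 value for a subgroup size of 3.

```python
import math


def a2_from_d2(d2, n):
    """A2 = 3 / (d2 * sqrt(n)), from equating the two UCL equations."""
    return 3 / (d2 * math.sqrt(n))


a2 = a2_from_d2(d2=1.693, n=3)
print(round(a2, 3))  # 1.023
```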
Average of Subgroup Standard Deviations
The average of the subgroup standard deviations could also be used to estimate the standard deviation. When the subgroup size is constant, the average of the subgroup standard deviations is given by:

$$\bar{s} = \frac{1}{k}\sum_{i=1}^{k} s_i$$

where si is the standard deviation of the ith subgroup and k is the number of subgroups. The standard deviation is then estimated from the following equation:

$$\sigma = \frac{\bar{s}}{c_4}$$

where c4 is a constant that depends on subgroup size. The values of c4 are shown in Table 2 above. For n = 3, the value of c4 is 0.8862. For the data in Table 1, the average standard deviation and σ are given by:
This value of σ is different from that estimated by the average range, which was 8.36. Thus, the control limits will be different also. The control limits based on the standard deviation estimated from the subgroup standard deviations are:
The differences are not large, but there are differences. This is why control limits can be slightly different from one software package to another: it is usually the way the standard deviation is estimated. The control chart with these limits will look about the same as in Figure 1 – just with the control limits a little wider in this example.
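The average standard deviation method can be sketched as follows. The constant c4 is computed here from its standard closed form using the gamma function rather than read from a table. The function names and the subgroup values are assumptions for illustration, not the Table 1 data.

```python
import math
import statistics


def c4(n):
    """Bias-correction constant c4 for a sample of size n (n >= 2)."""
    return math.sqrt(2 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


def sigma_from_average_stdev(subgroups):
    """Estimate sigma as (average subgroup standard deviation) / c4."""
    n = len(subgroups[0])
    if any(len(g) != n for g in subgroups):
        raise ValueError("this sketch assumes a constant subgroup size")
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    s_values = [statistics.stdev(g) for g in subgroups]
    s_bar = sum(s_values) / len(s_values)
    return s_bar / c4(n)


print(round(c4(3), 4))  # 0.8862, matching the value quoted for n = 3

# Hypothetical data: 4 subgroups of size 3 (not the newsletter's Table 1).
subgroups = [[98, 105, 110], [102, 95, 101], [107, 99, 112], [93, 104, 100]]
sigma_hat = sigma_from_average_stdev(subgroups)
print(sigma_hat)
```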
Pooled Standard Deviation
The pooled standard deviation, sp, can also be used to estimate the standard deviation. The standard deviation, σ, is equal to the pooled standard deviation divided by c4:

$$\sigma = \frac{s_p}{c_4}$$

$$s_p = \sqrt{\frac{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(X_{ij} - \bar{X}_i\right)^2}{\sum_{i=1}^{k}\left(n_i - 1\right)}}$$

where Xij is the jth observation in the ith subgroup, Xi is the average of the observations in the ith subgroup, ni is the number of observations in the ith subgroup, k is the number of subgroups, and c4 is the constant defined above. This time, c4 depends on the degrees of freedom (df), which is given by the sum of the ni - 1 values (the denominator under the square root sign). Thus,

$$df = \sum_{i=1}^{k}\left(n_i - 1\right) = 10(3 - 1) = 20$$
For df = 20, the value of c4 is 0.9869. The estimated standard deviation is then given by:
This third method of estimating the standard deviation gives another value for σ. The control limits will be different as well:
The control chart will look the same as Figure 1 – again with slightly different control limits.
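The pooled calculation can be sketched in Python as shown below. The function names and the subgroup values are assumptions for illustration, not the Table 1 data. The sketch follows this newsletter in looking up c4 at the degrees of freedom, which gives 0.9869 for df = 20. Some software is understood to evaluate c4 at df + 1 instead, which gives 0.9876 for df = 20, so treat the choice of argument as an assumption to check against your own package.

```python
import math


def c4(n):
    """Bias-correction constant c4, evaluated at n (n >= 2)."""
    return math.sqrt(2 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


def pooled_stdev(subgroups):
    """Return (sp, df): the pooled standard deviation and its degrees of freedom."""
    sum_sq = 0.0
    df = 0
    for g in subgroups:
        mean = sum(g) / len(g)
        sum_sq += sum((x - mean) ** 2 for x in g)  # within-subgroup squared deviations
        df += len(g) - 1                           # each subgroup adds n_i - 1
    return math.sqrt(sum_sq / df), df


# Hypothetical data: 4 subgroups of size 3 (not the newsletter's Table 1).
subgroups = [[98, 105, 110], [102, 95, 101], [107, 99, 112], [93, 104, 100]]
sp, df = pooled_stdev(subgroups)
sigma_hat = sp / c4(df)  # c4 looked up at df, as in this newsletter
print(df)  # 8
print(sigma_hat)

print(round(c4(20), 4))  # 0.9869, the value used above for df = 20
print(round(c4(21), 4))  # 0.9876, the value if c4 is evaluated at df + 1
```

Because the pooled formula sums over ni - 1, this sketch also works when the subgroup sizes are not all the same.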
This newsletter has looked at three different methods of estimating the standard deviation from data that are in subgroups. Each method gives a different value for the estimated standard deviation:
- σ from the average range = 8.36
- σ from the average standard deviation = 8.60
- σ from the pooled standard deviation = 8.66
This leads to different values for the control limits. Which method is correct? All three are correct. Dr. Donald Wheeler has suggested that the average range method is more robust than the pooled standard deviation. But in the end, the important thing is the story that the control chart is telling you about your process. Minor changes in the estimate of the standard deviation will not change this in most cases.
Thanks so much for reading our publication. We hope you find it informative and useful. Happy charting and may the data always support your position.
Dr. Bill McNeese
BPI Consulting, LLC