**September 2013**

(Revised May 2016)

Gage R&R analysis is a common technique for analyzing “how good” your measurement system is. The R&R stands for repeatability and reproducibility. You want a measurement system that can tell the difference between parts or samples from your process – or, at a minimum, tell you if a part is within specifications. A Gage R&R analysis helps you answer these questions.

There are a few things to consider when determining how to set up a Gage R&R analysis. First, is your test non-destructive or destructive? With a non-destructive test, the part is not altered during the testing. Different operators can measure the same part over and over. With a destructive test, the part is altered or destroyed during the test. Different operators cannot measure the same part. The next decision is the type of Gage R&R design to use: crossed or nested.

This month’s publication helps you determine what type of Gage R&R design to use.

In this issue:

- Gage R&R Overview
- Selecting a Gage R&R Design
- Crossed Gage R&R Design for Non-Destructive Measurement Systems
- Crossed Gage R&R Design for Destructive Measurement Systems
- Nested Gage R&R Design for Destructive Measurement Systems
- Summary

### Gage R&R Overview

A Gage R&R study is a designed experiment to study the variation in measurement results. The experiment is designed to determine how much variation is due to the test method and how much is due to the operators. This is done by performing measurements on parts from the process.

Repeatability is the “within operator” variation. It measures the variation one operator has when measuring the same “part” (and the same characteristic) using the same test measurement system more than one time. Repeatability is the measurement system error. It is sometimes called the equipment variation.

Reproducibility is the “between operators” variation. It is the variation due to different operators when measuring the same characteristic on the same part using the same measurement system. It is sometimes called the operator variation.

### Selecting a Gage R&R Design

Figure 1 lays out how to select the appropriate Gage R&R design.

**Figure 1: Selecting a Gage R&R Design**

The first question is whether the measurement system is non-destructive or destructive. If it is non-destructive, then the parts are not altered during testing. In this case, you use a crossed Gage R&R design. In this type of design, you select a number of parts from the process (e.g., ten parts) and each operator tests each part multiple times.

If the part is altered during testing, then the measurement system is destructive. You cannot just select parts from the process and have each operator test each part. The part is altered or destroyed, so each part can only be tested one time by one operator. At this point, you have to begin to think about “batches.”

To conduct a Gage R&R for a destructive measurement system, you have to make a critical assumption. You have to assume that you are able to identify a batch of parts where the parts are so close to being alike that you can reasonably assume that the parts in the batch are the “same.” You are making the assumption that the batch is homogeneous. In a perfect world, if you measure any part of the batch for the same characteristic, you will get the same result.

The next question for the destructive measurement system involves the number of parts in each batch. Are there enough parts in each batch so that each operator in the study can run at least two parts from each batch? For example, suppose you are measuring the hardness of a steel bar and would like to do a Gage R&R on the hardness tester. You consider each bar to be a batch. You can cut the bar into 10 pieces to be tested for hardness. Each of these pieces is considered to be the same. Suppose you have three operators and you want each operator to test two pieces per batch. You need 6 pieces per bar to do this. So, you have enough pieces for each operator to test pieces from each batch. In this case, you use a crossed Gage R&R design.

Now suppose you were using 6 operators and wanted each operator to test 2 pieces per batch. This requires 12 pieces per batch, but each batch (bar) only provides 10 pieces. Since there are not enough pieces per batch, you will have to set up the design so each operator measures 2 pieces from a unique batch (bar). This is the nested Gage R&R design.
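The decision logic in Figure 1 and the steel bar example can be sketched in a few lines of Python. This is only an illustration of the reasoning above: the function name `select_gage_rr_design` and its parameters are hypothetical and are not part of any software package.

```python
def select_gage_rr_design(destructive, n_operators, trials_per_operator,
                          pieces_per_batch=None):
    """Return "crossed" or "nested" following the selection logic in Figure 1.

    All names here are illustrative assumptions, not an established API.
    """
    if not destructive:
        # Non-destructive: every operator can measure every part repeatedly.
        return "crossed"
    if pieces_per_batch is None:
        raise ValueError("pieces_per_batch is required for a destructive test")
    # Destructive: each piece can be tested once, so each batch must supply
    # enough pieces for every operator to run all of their trials.
    pieces_needed = n_operators * trials_per_operator
    return "crossed" if pieces_per_batch >= pieces_needed else "nested"


# Steel bar example: 10 pieces per bar, 2 pieces per operator per bar.
print(select_gage_rr_design(True, 3, 2, pieces_per_batch=10))  # 6 pieces needed
print(select_gage_rr_design(True, 6, 2, pieces_per_batch=10))  # 12 pieces needed
```

With three operators the bar supplies enough pieces, so the sketch returns a crossed design. With six operators it does not, so the sketch returns a nested design.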

These designs are described in more detail below.

### Crossed Gage R&R Design for Non-Destructive Measurement Systems

A part or sample is not altered during a non-destructive gage R&R. For example, if you are measuring the length of a part, that part is not changed during the measurement. To perform a non-destructive gage R&R, you must decide on the following for the study:

- The number of operators
- The number of parts
- The number of trials

For example, you may decide to use 2 operators, 3 parts and 2 trials. This means that operator 1 will measure each of the 3 parts 2 times. Operator 2 will measure the same 3 parts 2 times. In this example, there will be a total of twelve measurement results. This type of experiment is called a “crossed” gage R&R because each operator tests each part. The experimental layout is shown below in Figure 2.

**Figure 2: Crossed Gage R&R for Non-Destructive Measurement System**
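The crossed layout described above (2 operators, 3 parts, 2 trials) can also be enumerated in code. The following Python sketch simply lists every operator, part and trial combination. The labels are placeholders chosen for illustration.

```python
from itertools import product

operators = ["Operator 1", "Operator 2"]
parts = ["Part 1", "Part 2", "Part 3"]
trials = [1, 2]

# "Crossed" means every operator measures every part on every trial.
crossed_runs = list(product(operators, parts, trials))

for operator, part, trial in crossed_runs:
    print(f"{operator} measures {part}, trial {trial}")

print(f"Total measurements: {len(crossed_runs)}")
```

The total is 2 x 3 x 2 = 12 measurements, matching the count given in the text.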

Figure 2 shows the 2 trials for part 1 that Operator 1 runs. Suppose Operator 1 gets the following test results for the two trials: 0.29 and 0.41. The results are not the same. Not surprising. Everything varies. The first question to ask is:

*What are the sources of variation in these two numbers?*

The part is the same. The operator is the same. The variation is simply a measure of how well one operator can repeat the test on the same part. So, the source of variation in these numbers is how repeatable the test method is – the repeatability. It is a measure of how close the operator is to getting the same result each time he measures the same part. Each set of trials for each part, under each operator, gives another estimate of the repeatability.
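As a rough illustration of how repeat trials turn into a repeatability estimate, the sketch below uses the range of each pair of trials divided by the standard control chart constant d2 = 1.128 for subgroups of size two. This is an assumption made for illustration: some Gage R&R references use an adjusted constant when there are only a few subgroups, and a single pair such as 0.29 and 0.41 gives only a very crude estimate. The function name is hypothetical.

```python
D2_SUBGROUP_OF_TWO = 1.128  # standard d2 constant for subgroups of size 2


def repeatability_sigma(trial_pairs):
    """Estimate the repeatability standard deviation from pairs of repeat trials.

    Each pair is (trial 1, trial 2) for one part measured by one operator.
    """
    ranges = [abs(first - second) for first, second in trial_pairs]
    average_range = sum(ranges) / len(ranges)
    return average_range / D2_SUBGROUP_OF_TWO


# The single pair from the text; a real study would supply one pair for
# every part and operator combination.
pairs = [(0.29, 0.41)]
sigma_repeatability = repeatability_sigma(pairs)
print(round(sigma_repeatability, 3))
```

Each additional part and operator combination adds another range to the average, which is why the estimate improves as the study grows.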

Reproducibility is the variation in the average results for different operators. It will also include the interaction between operators and parts if it is significant. The part-to-part variation examines the variation in part averages. You hope that the part-to-part variation accounts for the vast majority of the variation in the results and that repeatability and reproducibility account for very little of the variation.

There are two methods commonly used to analyze the results of a crossed gage R&R:

- Average and range method
- Analysis of variance (ANOVA)

More recently, Dr. Donald Wheeler’s “EMP” (Evaluating the Measurement Process) approach is becoming the preferred way to perform a Gage R&R analysis.

All these analysis methods have been covered in previous newsletters on measurement systems analysis.
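For readers who want to see the mechanics of the ANOVA method, here is a minimal Python sketch of the variance component calculation for a balanced crossed study. It is a simplified illustration, not the implementation used by any particular software. The function name and the data values (apart from 0.29 and 0.41) are made up. It always adds the operator-by-part interaction to reproducibility rather than first testing whether the interaction is significant.

```python
def crossed_variance_components(data):
    """Variance components for a balanced crossed Gage R&R study.

    data[i][j] is the list of repeat trials for part i measured by operator j.
    """
    p, o, r = len(data), len(data[0]), len(data[0][0])

    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    grand = mean(x for part in data for cell in part for x in cell)
    part_means = [mean(x for cell in part for x in cell) for part in data]
    op_means = [mean(x for part in data for x in part[j]) for j in range(o)]
    cell_means = [[mean(cell) for cell in part] for part in data]

    # Sums of squares for parts, operators, interaction and repeat error.
    ss_part = o * r * sum((m - grand) ** 2 for m in part_means)
    ss_op = p * r * sum((m - grand) ** 2 for m in op_means)
    ss_inter = r * sum(
        (cell_means[i][j] - part_means[i] - op_means[j] + grand) ** 2
        for i in range(p) for j in range(o))
    ss_error = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(p) for j in range(o) for x in data[i][j])

    ms_part = ss_part / (p - 1)
    ms_op = ss_op / (o - 1)
    ms_inter = ss_inter / ((p - 1) * (o - 1))
    ms_error = ss_error / (p * o * (r - 1))

    # Negative estimates are set to zero, a common convention.
    interaction = max(0.0, (ms_inter - ms_error) / r)
    operator = max(0.0, (ms_op - ms_inter) / (p * r))
    return {
        "repeatability": ms_error,
        "reproducibility": operator + interaction,
        "part_to_part": max(0.0, (ms_part - ms_inter) / (o * r)),
    }


# Hypothetical data: 3 parts x 2 operators x 2 trials.
example = [
    [[0.29, 0.41], [0.35, 0.30]],
    [[0.80, 0.72], [0.85, 0.90]],
    [[1.20, 1.31], [1.10, 1.25]],
]
for source, variance in crossed_variance_components(example).items():
    print(source, round(variance, 5))
```

The hope stated above translates into the `part_to_part` variance being much larger than the `repeatability` and `reproducibility` variances.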

### Crossed Gage R&R Design for Destructive Measurement Systems

Each part is altered or destroyed with a destructive measurement system. So, each operator cannot measure the same part multiple times. As discussed above, this is where you have to make a critical assumption. You have to assume that you are able to identify a batch of material that is so close to the same that you can reasonably assume that the parts in the batch are the “same.” You are making the assumption that the batch is homogeneous.

For example, suppose you make a powdered product like polyvinyl chloride (PVC). You take a one-pound “batch” of PVC from the final product silo. You want to test the material for inherent viscosity. The sample you test is destroyed during the testing. You can mix the sample very well and assume that the batch is homogeneous. Thus, the samples from this one-pound batch are the “same.”

The next question you must consider is the design setup. If all the operators can measure parts from each batch, then you can use the crossed Gage R&R described above. Figure 3 shows this design setup.

**Figure 3: Crossed Gage R&R for Destructive Measurement Systems**

In this design, you have 2 operators measuring 2 parts from 2 batches. There are at least 4 parts per batch so a crossed design can be used. Operator 1 measures parts 1 and 2 from batch 1 while Operator 2 measures parts 3 and 4 from batch 1. These parts are different parts – but the assumption is that they come from a homogeneous batch of material. Likewise, Operator 1 measures parts 1 and 2 from batch 2 and Operator 2 measures parts 3 and 4 from batch 2. These are different parts from a different batch – but batch 2 is considered to be homogeneous, i.e., the parts in batch 2 are the same.

What causes variation in the results between part 1 and part 2 in the design shown in Figure 3? It is the same test and the same operator – similar to the crossed Gage R&R for non-destructive tests explained above, where the difference is due to the measurement system repeatability. But unlike the non-destructive crossed design, the same part is not being tested. We are assuming that the parts are the same since they come from a homogeneous batch. In reality, the difference is due to the measurement system repeatability and the within-batch variation. The assumption is that the batch is homogeneous and that the within-batch variation should be small. If a large amount of variation comes from part 1 and part 2 results, then you should question whether you really have a homogeneous batch.

### Nested Gage R&R Design for Destructive Measurement Systems

This situation is the same as for the crossed Gage R&R for destructive measurement systems until you get to the point of determining if there are sufficient parts from each batch for each operator to test. If there are, as shown above, you use the crossed Gage R&R design.

But if there are not sufficient parts in each batch for each operator to test multiple times, then you must use a nested Gage R&R design. Figure 4 shows how a nested Gage R&R design for destructive measurement systems is laid out.

**Figure 4: Nested Gage R&R**

Suppose that each homogeneous batch only contains two parts. There are not sufficient parts in each batch for both operators to test. In this case, only one operator will test the two parts in a single batch. Each operator will test different parts in different batches. So, Operator 1 will test the two parts in Batch 1 and the two parts in Batch 2. Operator 2 will test the two parts in Batch 3 and the two parts in Batch 4.
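The nested layout just described can be written out explicitly. In this Python sketch the batch assignments follow the text, and the variable names are illustrative. The key difference from a crossed layout is that each batch appears under one operator only.

```python
# Each batch is "nested" under exactly one operator.
nested_assignment = {
    "Operator 1": ["Batch 1", "Batch 2"],
    "Operator 2": ["Batch 3", "Batch 4"],
}
parts_per_batch = 2

nested_runs = [
    (operator, batch, f"Part {part}")
    for operator, batches in nested_assignment.items()
    for batch in batches
    for part in range(1, parts_per_batch + 1)
]

for operator, batch, part in nested_runs:
    print(f"{operator} tests {part} from {batch}")

print(f"Total measurements: {len(nested_runs)}")
```

There are 2 operators x 2 batches x 2 parts = 8 measurements, and no batch is shared between operators.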

When Operator 1 runs the two parts in batch 1, what are the sources of variation present? Again, it is the same operator. And we are assuming that parts 1 and 2 are essentially the same. So, this is estimating the repeatability of the test method. But it also contains the within-batch variability. This is true for destructive tests whether you use the crossed or nested Gage R&R design.

ANOVA is used to analyze the results of a nested Gage R&R for destructive measurement systems. The average and range method cannot be used. We will look at how this is done next month.
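Ahead of that fuller treatment, the following Python sketch shows one common way the variance components of a balanced nested study are computed. It is an illustration under stated assumptions, not the method of any specific software: the function name and the data are hypothetical, and negative variance estimates are set to zero.

```python
def nested_variance_components(data):
    """Variance components for a balanced nested Gage R&R study.

    data[j][k] is the list of results for operator j on that operator's
    k-th batch. Batches are unique to each operator.
    """
    o, b, n = len(data), len(data[0]), len(data[0][0])

    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    grand = mean(x for op in data for batch in op for x in batch)
    op_means = [mean(x for batch in op for x in batch) for op in data]
    batch_means = [[mean(batch) for batch in op] for op in data]

    ss_op = b * n * sum((m - grand) ** 2 for m in op_means)
    ss_batch = n * sum(
        (batch_means[j][k] - op_means[j]) ** 2
        for j in range(o) for k in range(b))
    ss_error = sum(
        (x - batch_means[j][k]) ** 2
        for j in range(o) for k in range(b) for x in data[j][k])

    ms_op = ss_op / (o - 1)
    ms_batch = ss_batch / (o * (b - 1))    # batches within operators
    ms_error = ss_error / (o * b * (n - 1))

    return {
        # Includes within-batch variation, as noted in the text.
        "repeatability": ms_error,
        "reproducibility": max(0.0, (ms_op - ms_batch) / (b * n)),
        "batch_to_batch": max(0.0, (ms_batch - ms_error) / n),
    }


# Hypothetical results laid out as in Figure 4:
# Operator 1 tests Batches 1 and 2; Operator 2 tests Batches 3 and 4.
example = [
    [[10.1, 10.3], [11.0, 10.8]],
    [[9.7, 9.9], [10.6, 10.4]],
]
for source, variance in nested_variance_components(example).items():
    print(source, round(variance, 5))
```

Because operators are compared against the batch-within-operator mean square, the reproducibility estimate is set to zero whenever the operator mean square is smaller than the batch mean square, even if the operator averages differ.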

### Summary

This newsletter has shown the differences in setting up a gage R&R study for non-destructive testing and a gage R&R for destructive testing. For non-destructive measurement systems, you will always use the crossed Gage R&R design because each operator can test each part multiple times. With the destructive measurement systems, you have to be able to define batches that are homogeneous – batches where the parts are considered to be the “same.” You will use the crossed Gage R&R analysis if each operator can test multiple parts from each batch. However, if there are not enough parts in each batch to accomplish this, you will use the nested Gage R&R design.

Good

Please share GR&R Excel template as well so that we can work on it.

Kindly share it in proper excel format… ::)

Hello,

We don't have a template. This is part of the SPC for Excel software package. You can search for free Gage R&R templates on-line but they will have limited ability.

I'm currently analyzing the performance of a portfolio of 34 projects to determine how much we unnecessarily spent to give the customer what they wanted. I've asked 13 individuals who can estimate this value, but each for only a subset of these projects. I did repeatability measurements (asked twice) for 6 of those operators. I therefore have some repeatability and some reproducibility data. If I set up a 34 x 13 x 2 table (or even a 34 x 6 x 2 table) however there will be lots of holes. Any idea on how to do a Gage R&R (or similar) test to get an idea as to the accuracy of this measure? Thanks.

Most likely this is not really a Gage R&R. You can probably set up an analysis using control charts, but I would need to see the data. There are lots of issues, including the size of the project. Are you doing it as a %? You can send me the data if you would like me to look at it: bill@SPCforExcel.com.

Thanks Bill. I will send you the data once I have it all which will be by early next week. Still need to get some repeatability measurements. Yes, it is in terms of % unnecessary spend. It is inherently subjective so no real 'standard' to apply it to.

I know that a sample size of 10 is commonly used in industry, but I need to know how we can determine this sample size with a statistical rationale. Does anyone know how to determine the sample size for an attribute or variable Gage study?

Please refer to this publication on our website:

https://www.spcforexcel.com/knowledge/control-chart-basics/how-much-data-do-i-need-calculate-control-limits

While it talks about control limits, the same holds for Gage R&R. You want around 30 degrees of freedom. Start by making the number of operators times the number of parts at least 15. Then with 2 trials you are at your 30 df.

Hi, I’m currently doing a Gage R&R study on Rockwell hardness of sintered metal products. Since the hardness test is a destructive test that only alters (does not destroy) the piece, I’m doing a crossed R&R using 10 pieces of one particular product, 3 trials and 3 appraisers. In the end, each piece is tested 9 times in different locations, assuming it is “the same part.” The %R&R of the total variation is over 90 with an ndc of 0.5. But using the tolerance range for Rockwell hardness of that particular piece, I have a %R&R of tolerance of about 5, since all the pieces fall within a hardness between 69 and 72 while the tolerance range is 15 (only a small part of the tolerance range has been used). I cannot tell whether I can ignore the total variation and be satisfied with a %R&R of 5 of the tolerance, or whether I also have to consider the total variation. I know that the hardness test already has some inherent problems since a piece is not completely homogeneous, but I guess this is more relevant here since sintered metal products are less homogeneous than pure metal products. I would be very happy if someone could help me understand the right way to go with this type of study. Thanks.

So, your Gage R&R study says that, based on the variation in the parts you used in your study, there is a large % GRR based on this total variation. Question – are the parts representative of what you produce? If you are going to use the total variation based on the parts, this should be the case. If not, you can use a standard deviation from your process results instead. Either do a control chart on the hardness results for the part and estimate the standard deviation (sigma) or simply calculate sigma from a lot of data. If the parts are representative, it simply means that the Rockwell test is not very good at telling the difference between parts in the process – this means you really can't use it for process control.

But if you base the results on the tolerance, you have 5% GRR. This simply means that the test is good enough to tell if the product is in spec or not.
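For readers following this exchange, the sketch below shows how the two percentages and the number of distinct categories (ndc) are commonly calculated. It is a minimal Python illustration using widely published conventions: a 6-sigma spread for the tolerance comparison (some references use 5.15) and ndc = 1.41 x part sigma / GRR sigma. The function name and the sigma values are hypothetical, chosen only so the numbers resemble the situation described. Only the tolerance of 15 comes from the question.

```python
import math


def gage_rr_metrics(sigma_grr, sigma_part, tolerance, sigma_multiplier=6.0):
    """Common Gage R&R summary metrics (conventions vary between references)."""
    sigma_total = math.sqrt(sigma_grr ** 2 + sigma_part ** 2)
    return {
        "pct_grr_of_total_variation": 100 * sigma_grr / sigma_total,
        "pct_grr_of_tolerance": 100 * sigma_multiplier * sigma_grr / tolerance,
        "ndc": 1.41 * sigma_part / sigma_grr,
    }


# Hypothetical sigmas: the parts in the study vary very little (sigma_part is
# small), so the measurement error dominates the total variation.
metrics = gage_rr_metrics(sigma_grr=0.125, sigma_part=0.06, tolerance=15)
for name, value in metrics.items():
    print(name, round(value, 2))
```

With these made-up values the measurement system accounts for about 90% of the total variation but only 5% of the tolerance, which is the same pattern as in the question: the test cannot separate parts that are nearly identical, yet it is adequate for deciding whether a part is in spec.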

I have to conduct an MSA study of a push-pull gauge by applying a load and checking at which load the lock on the sample part breaks. I cannot divide the part into multiple sections because the lock would be cut. One part will be one batch. Please confirm how the study should be done.

You would have to have parts that are similar – the idea of a batch described above – so similar that you can consider them to be the same bar. It is a destructive test.

Hi Bill, we are seeing poor Assay and Content Uniformity results for a product. I want to design a study that explains the variability of the measurement system. Our product is typically extracted from its device and injected by HPLC. I have limited product to use. However, I can spike the known analyte, continue with the extraction and inject by HPLC.

As the product is spiked, would the potential study design be classed as "Crossed Gage R&R for Non-Destructive Measurement Systems"? Also, as this is for Assay (5 devices is 1 sample injected) and Content Uniformity (1 sample is 1 injected), it is expected that outliers may be present. So should I spike at the nominal and also at outlier amounts?

My example study:

- Operators: 2
- Spike Levels: Low, Nominal and High (each concentration level n=10), prepared by each operator (60 points in total)
- Replicates: Each operator will inject their own samples on a separate HPLC system

I am not familiar with what you are doing. Are your samples large enough to split in two and test each part? If so, you can use this approach:

https://www.spcforexcel.com/knowledge/measurement-systems-analysis/monitoring-destructive-test-methods

This will give you the measurement system variability. Not on an operator by operator basis though. If you have enough in a "batch" for each operator to test at least twice, then you can use the crossed gage R&R method.

I have a question. If we have only one batch in a destructive test, can GRR be applied? We designed a study with 3 operators and 6 parts (regarded as repeats) to calculate GRR for a destructive test. But with this design (18 runs), we can't confirm batch-to-batch variation. That is, we can only calculate the reproducibility and repeatability (batch-to-batch variation = 0; the batch is homogeneous). Thanks.

I am not clear on what you are doing. Is the part destroyed during testing? Do you have just six parts?

In the destructive case, we would like to evaluate the Gage R&R for measurement system analysis. The design (1 batch) is as follows:

- Analyst 1: part1, part2, part3, part4, part5, part6
- Analyst 2: part7, part8, part9, part10, part11, part12
- Analyst 3: part13, part14, part15, part16, part17, part18

Repeatability is part-to-part variation (repeat and part-to-part variation compounded). Reproducibility is analyst-to-analyst variation.

Q: Is it possible to evaluate this as a Gage R&R? (%GRR shall not exceed 30%.) Unlike the usual Gage R&R, repeatability is not calculated from a normal design (10 parts * 3 analysts * 6 repeats).

Q: Is it okay to remove outliers before the Gage R&R?

Question: I am to conduct my first gage r&r and I've been asked to test and compare parts of different width, but doesn't this bias the entire analysis? And I'm not talking about small differences; some parts are three times the width as others. If you compare them in the same gage r&r, the part-to-part difference must overshadow all potential differences between operators etc? Or have I completely misunderstood some fundamentals of this analysis method?

You are correct. One way to improve the results from a Gage R&R is to throw in some pieces with much more variation. If you want the Gage R&R to be able to tell the difference between parts with a smaller width, then you need to use those parts in the study. Otherwise the part-to-part difference dominates. Focus on what your process produces. If there are widely different specs or sizes produced, you need to do the gage study for each product.

Hi, thanks for sharing the knowledge here. I have one question: the reproducibility shows 0%. How is reproducibility captured in this method? Please help me.

If reproducibility is 0% it means that all the operators got the same result for each part – if I understand what you are asking.

Hi, thanks for your reply.

1. Our test is a destructive test.
2. We used the Gage R&R (nested) method.
3. The sample results differ for each operator, yet we are getting a reproducibility of 0%. Why?

Please send me the data and I will look at it. bill@SPCforExcel.com

Hi Bill, should the run order be 1 2 3 1 2 3 or can this be 1 1 2 2 3 3 for variable non-destructive test method validation? Does the run order matter?

When you run a Gage R&R study, the runs should be in random order. So you want to randomize the runs. You can use Excel's random number generator to do that.
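As an alternative to Excel's random number generator, the run order can be randomized with a short script. This Python sketch assumes a crossed study with placeholder labels for 2 operators, 3 parts and 2 trials. The fixed seed is only there so the example gives the same order each time and would normally be left out.

```python
import random
from itertools import product

operators = ["Operator 1", "Operator 2"]
parts = ["Part 1", "Part 2", "Part 3"]
trials = [1, 2]

# Start from the full list of runs in standard order...
runs = list(product(operators, parts, trials))

# ...then shuffle a copy so the standard order is kept for reference.
rng = random.Random(2013)  # fixed seed only for a repeatable example
run_order = runs[:]
rng.shuffle(run_order)

for sequence, (operator, part, trial) in enumerate(run_order, start=1):
    print(f"Run {sequence}: {operator}, {part}, trial {trial}")
```

Every run still appears exactly once. Only the sequence in which the measurements are made changes.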

Hi. Reading the articles with interest. I am curious about how you would set it up including another dimension. Let's say the homogeneity of the sample-taking process is also suspect. So, you would run multiple sample-taking operators, multiple samples, multiple analysis operators, with multiple parts. Let's assume, for simplicity, that the sample contains enough material to withdraw enough parts homogeneously – or that they are repeatable. Would you simply "add" another layer to the Gage R&R layout (and multiply the required number of trials by the number of "sample-taking operators")? Follow-up: is there functionality in SPC for Excel that allows for this change of "dimensions"?

The SPC for Excel software does not allow changing the level for Gage R&R. You have multiple operators measuring multiple parts, multiple times. I will have to take a closer look at what you are asking about. Do you have some example data you can send me? bill@SPCforExcel.com

Sorry for the late reply, but the answer is – not yet! 😉 We have not decided to run the trial including the "sample-taking process homogeneity"; we ran with 3 sources of variance (part, operator, analysis). It is of interest to also include the sampling process, so that you take an equal number of parts from a set of sample procedures. For example, drawing a liquid sample from a) the top of the tank, b) the bottom of the tank or c) in-stream when filling the tank could introduce separate distributions of results according to whether the sample comes from a type a, b or c sampling process. This variance would be nice to see alongside the standard ones.

-Stian

I have a situation where I need a push-out test for a seal. I have 7 different parts, all in several different sizes. This is a destructive test… Each part can only be tested once. Obviously, it is destroyed… So based on your writing, should I use a nested test structure? Also, I need a definition of batch. By batch, do you mean the same parts from different production runs? Or does batch mean different parts run during a similar period of production? As an example: 7 different parts, all different sizes, run in week 1 form a batch of 7 different parts/sizes. Then a 2nd batch of 7 different parts/sizes comes from week 2… Please clarify…

Hi Jerry,

A batch is a group of parts that can be considered the “same.” I am not sure about your production process, but most of the time the “same” parts are made in the same production run. You say 7 different parts. Is each part produced in a separate run? Or are there more parts of that size in the run?

Yes… Each of the 7 parts is produced on the same machine, but at different times. The machine accepts different injection molding dies… So between each run a changeover is done and then the next part is run. All 7 of the parts are the same material, but different sizes and configurations. A series of tests is to be performed to qualify each production run: 1.) A non-destructive inner O-ring height check. 2.) A destructive surface roughness check. 3.) A destructive push-out test of the O-ring bottom surface. So to qualify that the equipment is repeatable, reproducible and capable, we are looking to first qualify the test equipment before running a Gage R&R and capability study of all 7 parts. Therefore, I would consider a batch equal to 10 parts, from each production run, tested with different operators. This should be sufficient to qualify the test equipment as being able to provide repeatable test results. What is your opinion?

Each batch consists of the "same" part – so you have seven different batches corresponding to each part. You make multiple parts per batch and you assume those are the same.

Hi Bill, 3 questions – 1. Is there a recommended minimum number of samples per batch and number of operators when the process variation is known to be large? 2. Should failing parts be created as a group/batch for inclusion in the study? 3. How do you decide if % study variation or % study tolerance is used as the acceptance criterion? Thanks

Hello,

The more data you have the better, of course. Usually 3 operators and ten parts with a replicate of 2 is sufficient. I would use a historical standard deviation if you have it. If not, the parts should reflect the variation in the process. Please see this link for acceptance criteria:

Acceptance Criteria for Measurement Systems Analysis (MSA) | BPI Consulting (spcforexcel.com)

Best Regards,

Bill