Anderson-Darling Test for Normality


Thanks so much for reading our publication. We hope you find it informative and useful. Happy charting and may the data always support your position.

Sincerely,

Dr. Bill McNeese
BPI Consulting, LLC

Comments (54)

  • AnonymousJune 30, 2011 Reply

    awesome article !

  • AnonymousJuly 3, 2011 Reply

    Very illustrative, easy to adopt, and enables anyone to tackle similar issues irrespective of age, education and position.

  • AnonymousMay 29, 2012 Reply

    Thanks for the info.

    You will never know how much you helped!

  • AnonymousSeptember 8, 2012 Reply

    Well explained topic, thanks

  • AnonymousApril 14, 2015 Reply

    Very well explained in places, slightly ambiguous in others. Shame about the grammar used throughout the piece!

    • billApril 15, 2015 Reply

      And what is wrong with the grammar? Reads fine to me!

  • RamasubramanianAugust 3, 2015 Reply

    How is the Anderson-Darling test different from the Shapiro-Wilk test for normality?

    • billAugust 3, 2015 Reply

      I have seen varying data on which approach is better, including cases where Shapiro-Wilk has more power. But I have not looked too much into the Shapiro-Wilk test.

  • LukeSSeptember 24, 2015 Reply

    Hi. This is really useful, thank you. However, is there any way to increase the amount of data that can be analysed in this workbook? I’ve got 750 samples. I did change the maximum values in the formulas to include a bigger data sample but wasn’t sure if the formulas would be compromised, e.g. E$701 =IF(ISBLANK(E2), NA(),SMALL(E$2:E$1000,F2))

    • billSeptember 24, 2015 Reply

      You can use the workbook with larger sample sizes. You just need to be sure that the range is changed in all formulas, including Avg, stdev, n, S and the ones containing SMALL.

  • JamesJanuary 14, 2016 Reply

    Hi, thanks for the info. I'm reproducing the steps in Excel, but I don't want to compare with a normal distribution. I have my own set of data and I want to check it against my own distribution. In this case, how do I generate F(Xi) using the 10,000 data points I have for the distribution?

    • billJanuary 15, 2016 Reply

      I am not sure I understand what you want to do. Maybe this:

      • Sort your data in a column (say column A) from smallest to largest.
      • In column B, put the numbers from 1 to 10,000.
      • In cell C1, enter =B1/10000.
      • Copy that formula down from C2 to C10000.
      • Plot A vs. C to generate the CDF.
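
The spreadsheet steps above can be sketched outside Excel as well. This is a minimal Python sketch, assuming the simple i/n rule from the reply; the function names `empirical_cdf` and `cdf_at` are made up for illustration and are not part of the workbook.

```python
from bisect import bisect_right


def empirical_cdf(data):
    """Return (sorted values, cumulative fractions i/n), mirroring columns A and C."""
    xs = sorted(data)                      # column A: data sorted ascending
    n = len(xs)
    fractions = [i / n for i in range(1, n + 1)]   # column C: rank / n
    return xs, fractions


def cdf_at(x, xs):
    """F(x): fraction of the sorted observations xs that are <= x."""
    return bisect_right(xs, x) / len(xs)


# Hypothetical data, only to show the shape of the result.
sample = [4.2, 1.5, 3.3, 2.8, 5.0]
xs, fractions = empirical_cdf(sample)
for value, frac in zip(xs, fractions):
    print(value, frac)
```

Plotting `xs` against `fractions` gives the step-shaped CDF, and `cdf_at` could stand in for F(Xi) when comparing a second data set against it.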
  • StefanMarch 29, 2016 Reply

    Is it possible to explain the correction in the calculation of the Z-value (see column L of sheet 2 in the embedded Excel sheet)?

    The P value is not calculated as i/n, but is corrected and calculated as (i-0.3)/(n+0.4). Is it possible to give some substantiation of the 0.3 and 0.4 used?

    • billMarch 29, 2016 Reply

      The method used is the median rank method for uncensored data. This gives p = (i-0.3)/(n+0.4). There are other methods that could be used. For example, you could use (i-0.5)/n, i/(n+1), or simply i/n.
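
The four plotting-position formulas in the reply above can be compared side by side. The sketch below is illustrative only: the dictionary keys are labels chosen here, and the z-values come from Python's standard `statistics.NormalDist`, not from the workbook. The (i-0.3)/(n+0.4) form is usually known as Benard's approximation to the median rank.

```python
from statistics import NormalDist


def plotting_positions(n, method="median_rank"):
    """Return p_i for i = 1..n using one of the formulas from the reply."""
    formulas = {
        "median_rank": lambda i: (i - 0.3) / (n + 0.4),  # used in the workbook
        "midpoint": lambda i: (i - 0.5) / n,
        "mean_rank": lambda i: i / (n + 1),
        "naive": lambda i: i / n,                        # gives p = 1 for i = n
    }
    rule = formulas[method]
    return [rule(i) for i in range(1, n + 1)]


def z_values(n, method="median_rank"):
    """Convert plotting positions to standard normal z-values."""
    inverse = NormalDist().inv_cdf   # raises StatisticsError for p = 0 or p = 1
    return [inverse(p) for p in plotting_positions(n, method)]


for name in ("median_rank", "midpoint", "mean_rank"):
    print(name, [round(z, 3) for z in z_values(5, name)])
```

The plain i/n rule is left out of the loop on purpose: its last point has p = 1, which has no finite z-value, and that is one practical reason a corrected formula is used.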

  • StefanMarch 30, 2016 Reply

    Thanks

  • JiniMay 18, 2016 Reply

    This is a really very informative article. I came to know about this useful test. Thanks.

  • LaurenceFebruary 9, 2018 Reply

    Hi, great article!! Can this be adapted for the lognormal distribution? I tried altering the formula in column H but it gave me some odd-looking results (p = 1). Many thanks.

    • billFebruary 19, 2018 Reply

      Thank you. Yes, it can be adapted to calculate the Anderson-Darling statistic; however, the p value calculation changes depending on the type of distribution you are examining. The SPC for Excel software uses the p value calculations for various distributions from the book Goodness-of-Fit Techniques by D'Agostino and Stephens.
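
As a rough illustration of the point above, the Anderson-Darling statistic itself only needs the CDF of the distribution being tested. The sketch below assumes the standard computational form AD = -n - S with S = Σ (2i-1)/n · [ln F(x_i) + ln(1 - F(x_(n+1-i)))]. The function names and the sample values are hypothetical, and the lognormal case is handled by testing the logged data for normality. It does not compute a p value, which, as noted, depends on the distribution.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling(data, cdf):
    """AD statistic for data against any CDF whose values lie strictly in (0, 1)."""
    xs = sorted(data)
    n = len(xs)
    s = sum(
        (2 * i - 1) / n
        * (math.log(cdf(xs[i - 1])) + math.log(1.0 - cdf(xs[n - i])))
        for i in range(1, n + 1)
    )
    return -n - s


def ad_normal(data):
    """AD statistic against a normal fitted with the sample mean and std dev."""
    fitted = NormalDist(mean(data), stdev(data))
    return anderson_darling(data, fitted.cdf)


def ad_lognormal(data):
    """AD statistic for a lognormal fit: test ln(x) for normality (needs x > 0)."""
    return ad_normal([math.log(x) for x in data])


# Hypothetical right-skewed data, for illustration only.
skewed = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
print("normal fit   :", round(ad_normal(skewed), 3))
print("lognormal fit:", round(ad_lognormal(skewed), 3))
```

A smaller statistic means a closer fit, so for data like this the lognormal value should come out well below the normal one.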

  • AnonymousFebruary 23, 2018 Reply

    Hi! Please tell me how the p-value is determined. Thanks!

    • billFebruary 23, 2018 Reply

      The p values come from the book mentioned above.  They are in tabular form usually.  If your AD value is from x to y, the p value is z.
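
The "from x to y, the p value is z" lookup described above is often written as a piecewise formula. The sketch below uses the coefficients commonly quoted for the normal case with mean and standard deviation estimated from the data; they are not given anywhere in this thread, so treat them as an assumption and check them against the book before relying on them. The input is the adjusted statistic AD*, and the function name is made up for illustration.

```python
import math


def ad_p_value_normal(ad_star):
    """Approximate p value for the adjusted AD statistic (normal case only).

    Coefficients are the commonly quoted ones and are an assumption here;
    they are fitted approximations and not meant for extreme values.
    """
    if ad_star >= 0.6:
        p = math.exp(1.2937 - 5.709 * ad_star + 0.0186 * ad_star ** 2)
    elif ad_star > 0.34:
        p = math.exp(0.9177 - 4.279 * ad_star - 1.38 * ad_star ** 2)
    elif ad_star > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * ad_star - 59.938 * ad_star ** 2)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * ad_star - 223.73 * ad_star ** 2)
    return min(max(p, 0.0), 1.0)   # keep the approximation inside [0, 1]


for value in (0.1, 0.3, 0.5, 1.0):
    print(value, round(ad_p_value_normal(value), 4))
```

Each range of AD* has its own curve, which is why the same AD value maps to a different p value when a different distribution is being tested.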

  • AnonymousFebruary 26, 2018 Reply

    You said that the value of AD needs to be adjusted for small sample sizes. What is the range of number of data for it to be considered "small"? Thank you. By the way, this article is awesome! :)

    • billFebruary 28, 2018 Reply

      Thanks for the comments. I usually use the adjusted AD all the time. As n gets very large, the adjusted and unadjusted values become the same.
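
To see how quickly the adjustment fades, here is a small sketch assuming the usual small-sample correction for the normal case, AD* = AD · (1 + 0.75/n + 2.25/n²). The formula is not stated in this thread, so it is an assumption here, and the function name is illustrative.

```python
def adjusted_ad(ad, n):
    """Small-sample adjusted AD statistic (assumed normal-case correction)."""
    return ad * (1 + 0.75 / n + 2.25 / n ** 2)


# Ratio AD*/AD for a few sample sizes; it approaches 1 as n grows.
for n in (10, 30, 100, 1000):
    print(n, round(adjusted_ad(1.0, n), 5))
```

With n = 10 the statistic is inflated by a little under 10%, while with n = 1000 the difference is under 0.1%, which matches the remark that the two become the same for large n.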

  • AnonymousFebruary 27, 2018 Reply

    Hi! I have two sets of data and I'm going to check whether they differ significantly using a z-test. I know that the z-test requires normally distributed data. Should I determine the p value for both sets combined or for each set separately? Thanks!

    • billFebruary 28, 2018 Reply

      You can do that. How big is your sample size? If it is too small, you might get an inaccurate result from doing this test. If the sample size is too large, the z test may show a difference that is not really significant from a practical point of view. Use your knowledge of the process. Is there any reason to believe that the data would not be normally distributed?

      • AnonymousMarch 1, 2018 Reply

        No reason really. But in our thesis, it is necessary to determine first whether the data are normally distributed through the p value. We have a sample size of 150 for each. Since I have two sets of data, do you think the p-value should be determined from each set of data? And why is that?

        • billMarch 1, 2018 Reply

          If you have 150 data points for each set, I would start with a histogram. If it looks somewhat normal, don't worry about it. If not, then run the Anderson-Darling test with the normal probability plot. You do this with both sets of data since I assume they come from two different processes.

  • Ricardo SchvartzaidMarch 14, 2018 Reply

    Very helpful information. TKS!!!

  • MerihJune 15, 2018 Reply

    Great Article. Thank you. 

  • futureNovember 30, 2018 Reply

    The text gives a value for AD statistic as "2.88" whereas the Excel sheet states "2.37". What's correct?

    • billNovember 30, 2018 Reply

      Both the text and the workbook have the AD as 0.237. I don't see a 2.88 anywhere in the text.

  • AnonymousJanuary 15, 2019 Reply

    This article was really useful, thank you!! The workbook made it super easy to follow along with the steps and 

  • Danny BMarch 18, 2019 Reply

    Awesome! Top quality stats lesson – will return in future. Thanks.

  • Prasanth BApril 10, 2019 Reply

    Nice Article on AD normality test.

  • AnonymousApril 19, 2019 Reply

    Thank you so much for this article and the attached workbook! It makes the test and the results so much easier to understand and interpret for a high school student like me. This has helped me a lot in a research project I did where I tested if the probability of successfully shooting three-pointers in basketball was normally distributed.

  • BradJuly 9, 2019 Reply

    This is extremely valuable information and very well explained. This greatly improved my understanding of testing normal distribution for process capability studies. Thanks for making this available for novices like myself.

  • AnonymousAugust 12, 2019 Reply

    Hello, this is a very useful article, but I have a question. I have 1800 data points. My value for AD is 10 and my S is approx. 3,500,000. Are those high numbers normal, or might there be a mistake on my part? My p value is 2.1*10^-24, which even for this test seems a bit low. If I plot all the points, they are very close to the line in the middle. Thanks again for the article.

    • billAugust 12, 2019 Reply

      The Anderson-Darling test is not very good with large data sets like yours. Large data sets can give small p values even if they are from a normal distribution. I would just do a histogram and ask if it looks bell-shaped. That is all the proof you need, I think.

      • AnonymousAugust 13, 2019 Reply

        It does look bell-shaped. That's the reason I tested with the Anderson-Darling test. The problem with a purely visual test like looking at a histogram is that it's not scientific, and I have to write a paper on it. Can you recommend a different test for such big data sets?

        • billAugust 13, 2019 Reply

          Not really; large data sets tend to make many tests too sensitive. I would suggest you fit a normal curve to the data and see what the p-value is for the fit. That would be more scientific, I guess – but if it looks normal, I would be suspicious of any test that says it is not normal.

  • tompFebruary 6, 2020 Reply

    Hello, this is a super article. But I have a problem.

    I tried to use the VBA code from the link in the article, but as a result I only get something like -85,0097 in the cell with the function for this sample of data:

    23,787
    23,795
    23,708
    23,809
    23,839
    23,785
    23,757
    23,798
    23,71
    How to get S, AD, ADstar and Pvalue??

    Thanks in advance.

    • billFebruary 6, 2020 Reply

      The data are running together.  Can you send the data to me in an excel spreadsheet please?  [email protected]

  • Ani Peralai April 29, 2020 Reply

    Can you please tell me what changes need to be made if the distribution changes? Let's say my data is known to follow a Weibull distribution. How do the calculations of the p-value and the Anderson-Darling coefficient differ? Or do they remain the same?

    • billApril 30, 2020 Reply

      The p value and Anderson Darling coefficient are dependent on the distribution you are testing.

  • Ani Peralai April 29, 2020 Reply

    I have another question. What happens when the data is right censored? Do these calculations change?

    • billApril 30, 2020 Reply

      I have not looked into right censored data, so I don't have an answer for you.

  • OsamaNovember 20, 2020 Reply

    Great article, simple language and easy-to-follow steps. I have one question: what if I want to check other types of distributions? Is there a function in Excel, similar to NORMDIST(), for other types of distributions?

    • billNovember 20, 2020 Reply

      Yes. You can see a list of all statistical functions in Excel by going to Formulas, More Functions, and Statistical. Our software has distribution fitting capabilities and will calculate it for you automatically.

  • AnonymousMay 22, 2021 Reply

    I AM VERY IMPRESSED WITH YOUR ELABORATIONS. THANK YOU. DR. KAVITHA MOHANDAS 

  • Devin DowningJune 16, 2022 Reply

    Hello Dr. Bill,

    I’ve turned into a long-term follower of your YouTube videos and the posts on your webpage; the content is insightful and helpful. I have been reading and studying as much as I can online about the Anderson-Darling normal probability plot and keep venturing back to your website to learn.

    In the link below you mentioned the p-value needs to be > 0.20: “If the p-value (probability) for the Anderson-Darling statistic is less than 0.05, there is statistical evidence that the data are not normality distributed. If the p-value is greater than 0.20, the conclusion is that the data are normally distributed.” I’ve heard you mention a p-value > 0.200 several times in your videos and I want to understand more.

    Normal Probability Plot Help | BPI Consulting (spcforexcel.com)

    So my question is: what makes a p-value threshold of > 0.200 so significant? I enjoy your content and YouTube videos, keep up the good work! I look forward to your response, cheers!

    • billJune 17, 2022 Reply

      Hello Devin,
      Thank you for your kind words.
      On the p value: most things you read say the cutoff point is 0.05. If you are below that, then you reject the null hypothesis that the data are normally distributed. If you are above it, you accept the null hypothesis – the data are normally distributed.

      I never liked this because if you are at 0.049 you reject and if you are at 0.051 you accept. Someone taught me – don’t remember who, too many years ago – that if the p value is less than or equal to 0.05, you reject the null hypothesis, if it is between 0.05 and 0.2, you don’t know – you need to collect more data, and if it is greater than 0.2, you accept the null hypothesis. So, I don’t have a reference for this approach, but I like it and have used it for years now.

      Thanks,

      Bill
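
The three-zone rule described in the reply above can be written down directly. This is only a sketch of that rule of thumb, with the 0.05 and 0.20 cutoffs taken from the reply; the function name and the returned wording are illustrative.

```python
def interpret_p_value(p, reject_at=0.05, accept_above=0.20):
    """Classify an Anderson-Darling p value using the three-zone rule of thumb."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p <= reject_at:
        return "reject: evidence the data are not normally distributed"
    if p > accept_above:
        return "accept: treat the data as normally distributed"
    return "inconclusive: collect more data"


for p in (0.049, 0.051, 0.20, 0.35):
    print(p, "->", interpret_p_value(p))
```

With this rule, 0.049 and 0.051 no longer land on opposite sides of a single hard cutoff: the first is rejected, while the second falls into the "collect more data" zone.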

  • AnonymousAugust 19, 2022 Reply

    Thank you very much! Very useful article. Can the explained steps be used to check exponential behaviour of the data as well?

    • billAugust 21, 2022 Reply

      Yes, but you have to use the exponential distribution in place of the normal distribution.

  • VLBJanuary 4, 2024 Reply

    Since the p value is only a threshold, one cannot rely on it alone. You should check normality by various methods; every process has a different distribution curve.
