Chi Square Goodness of Fit Test in Excel (Part 1)
View application MS EXCEL Chi-square Pearson test for simple hypotheses.
After obtaining the experimental data (i.e., when there is a fetch) normally selects the distribution law, best describing the random variable represented by the given sample. Check how well the experimental data describes the selected theoretical distribution law, is carried out using criteria of consent. the Null hypothesis, is usually a hypothesis of equality of the distribution of a random variable to some theoretical law.
First, consider the use of the goodness of fit Pearson X2 (Chi-square) in the case of simple hypotheses (parameters of the theoretical distribution are assumed to be known). Then criteria the case of complex hypotheses when it is only a form of distribution, and the parameters of this distribution and the value of statistics X2 are estimated/calculated on the basis of one and the same fetch.
note: the goodness of fit Pearson X2 in the case of composite hypotheses, see article Verification of complex hypotheses by Chi-square Pearson in MS EXCEL.
note: In English literature the procedure the goodness of fit Pearson X2 has the title The chi-square goodness of fit test.
Recall the procedure for testing hypotheses:
- based on fetch calculates the value of statistics that matches the type of the proposed hypothesis. For example, testing the hypothesis about equality of average ? some specified value ?0 using tstatistics (if standard deviation not known),
- subject to the truth of null hypothesis, distribution of statistics is known and can be used to compute probabilities (for example, for tstatistics is a stationary distribution),
- derived from fetch statistics is compared with the critical for the given significance level value (?-the quantile),
- null hypothesis reject if statistics is greater than the critical (or if the probability of getting this value statistics (p value) significance level, what is an equivalent approach).
will Hold hypotheses for different distributions.
Suppose that two people play dice. Each player has his own set of bones. Players take turns throwing at once on 3 dice. Each round the winner is the one who will throw more than sixes. The results are recorded. One of the players after 100 rounds had a suspicion that the bones of his opponent is unbalanced, because he often wins (often throwing sixes). He decided to analyze how likely it is that the number of outcomes of the enemy.
note: since the cubes of 3, then over time you can throw 0, 1, 2 or 3 sixes, i.e. a random variable can take 4 values.
From probability theory we know that if the dice are symmetrical, the probability of sixes is subject to binomial law. Therefore, after 100 rounds the frequency of occurrence of sixes can be calculated using the formula
the formula assumes that the cell A7 contains the corresponding number of dropped sixes in a single round.
note: The calculations are given in the the example file, on sheet Discrete.
For comparison, observed (Observed) and theoretical frequency (Expected) convenient to use histogram.
With a significant deviation of observed frequencies from theoretical distribution, null hypothesis on the distribution of a random variable theoretical law, must be rejected. That is, if the dice of the opponent unbalanced, then the observed frequencies will be "significantly different" from binomial distribution.
In our case, at first glance, the frequencies are close enough and without calculation it is difficult to draw an unambiguous conclusion. Apply Pearson fitting criterion X2, instead of subjective statements "materially different" that can be done on the basis of comparison of histograms, to use the mathematically correct statement.
Use the fact that, owing to the law of large numbers observed frequency (Observed) with growth of fetch n tends to the probability corresponding to the theoretical law (in this case, binomial law). In our case the sample size n equal to 100.
we Introduce test statistics, which we denote X2:
where Ol is the observed frequency of events that the random variable took a certain valid values, El – is the corresponding theoretical frequency (Expected). L is the number of values that can take the random variable (in our case equal to 4).
note: The statistics is a special case of statistics is used to calculate the criterion of independence, Chi-square (see test of independence Chi-square in MS EXCEL).
As can be seen from the formula, this statistics is a measure of the closeness of the observed frequencies to the theoretical, i.e., with it you can evaluate the "distance" between these frequencies. If the sum of these "distances" "too big", then these frequencies are "substantially different". Clearly, if our cube is symmetric (i.e., apply the binomial law), then the probability that the sum of the "distances" will be too big to be small. To calculate this probability we need to know the distribution of statistics X2 (statistics X2 computed on the basis of a random fetch, so it is a random variable and hence has its probability distribution).
a multidimensional analogue of integral theorem of de Moivre-Laplace it is known that when n>,? our random variable X2 asymptotically distributed the law of the X2 with L - 1 degrees of freedom.
so if the calculated value of statistics X2 (the amount of "distance" between frequencies) will be more than a certain limit value, we will have reason to reject null hypothesis. And when you check the parametric hypotheses, the limit value is specified via the significance level. If the probability that the statistics of X2 will take a value less than or equal to the calculated (pvalue), will be less than the significance level, null hypothesis can be rejected.
In our case, the value of statistics is 22,757. The probability that the statistics of X2 will take a value greater than or equal 22,757 very small (0,000045) and can be calculated by the formulas
note: ХИ2.TEST() specially created to test the Association between two categorical variables (see article about the criterion of independence).
the Probability 0,000045 significantly less than the usual significance level 0,05. So, the player has every reason to suspect your opponent of dishonesty (null hypothesis about his honesty is rejected).
When using criterion X2 you must follow the to the volume fetch n was large enough otherwise it will be incompetent approximation X2-distribution distribution statistics X2. It is generally believed that it is sufficient that the observed frequencies (Observed) was more than 5. If not, then the small frequencies are combined into one or join other frequencies, whereby the combined value attributed to the total likelihood and therefore decreases the number of degrees of freedom X2-distribution.
in order to improve the quality of applications criterion X2 (increase power), you must reduce the intervals of the partition (to increase L and thus increase the number of degrees of freedom), however, this is hindered by the limitation on the number caught in each interval of observation (d. b.>,5).
note: the example above is a special case test for independence Chi-square (chi-square test), which allows you to determine whether there is a relationship between two categorical variables (see test of independence Chi-square in MS EXCEL).
TIP: About test other types of hypotheses, see article statistical hypothesis testing in MS EXCEL.
Pearson fitting Criterion X2 can be used also in the case of continuous distributions.
here are some sample, consisting of 200 values. the Null hypothesis claims that fetch is made of the standard normal distribution.
note: the Random variable the example file, on the sheet Continuous generated using the formula =STANDARDS.ST.ARR(Rand()). Therefore, new value of fetch are generated at each recalculation of the worksheet.
whether an existing dataset normal distribution is possible to visually assess schedule of checking for normality (normal probability plot).
As can be seen from charts, sample values quite well along a straight line. However, as in the discrete case for testing the hypothesis applies Pearson fitting Criterion X2.
To do this, we divide the range of variation of a random variable into intervals with a step of 0.5 standard deviations. Compute the observed and theoretical frequencies. The observed frequencies we compute using the function FREQUENCY(), and the theoretical – using the function STANDARDS.ST.DIS().
note: As discrete event, you must make sure that fetch was big enough and in the interval were >,5 values.
Calculate statistics X2 and compare it with the critical value for the given significance level (0,05). Since we broke the range of variation of a random variable into 10 intervals, the number of degrees of freedom is 9. The critical value can be calculated according to the formula
the chart above shows that the value of statistics is of 8.19, which is significantly higher critical value – the null hypothesis is not rejected.
the following chart where you fetch took an unlikely value and based on the criterion consent of Pearson's X2 the null hypothesis was rejected (despite the fact that the random values were generated using the formula =STANDARDS.ST.ARR(Rand()), providing a sample of standard normal distribution).
Null hypothesis rejected although visually the data are close enough to a straight line.
as an example, take sample of the continuous uniform distribution U(-3, 3). In this case, even from the graph it is obvious that null hypothesis must be rejected.
Criterion consent of Pearson's X2 also confirms that null hypothesis must be rejected.