# Chi Square Goodness of Fit Test in Excel (Part 1)

*Applying the Pearson Chi-square test for simple hypotheses in **MS EXCEL**.*

After obtaining experimental data (i.e., when we have a *sample*), one usually selects the distribution law that best describes the random variable represented by the given *sample*. Checking how well the experimental data are described by the selected theoretical distribution law is carried out using *goodness-of-fit tests*. The *null hypothesis* is usually the hypothesis that the distribution of the random variable equals some theoretical law.

First, we consider the use of the *Pearson X^{2} (Chi-square) goodness-of-fit test* in the case of simple hypotheses (the parameters of the theoretical distribution are assumed to be known). Then we consider the case of composite hypotheses, when only the form of the distribution is specified, and both the parameters of this distribution and the value of the *X^{2} statistic* are estimated/calculated on the basis of one and the same *sample*.

**note**: For the *Pearson X^{2} goodness-of-fit test* in the case of composite hypotheses, see the article Verification of complex hypotheses by Chi-square Pearson in MS EXCEL.

**note**: In the English literature this procedure is called *the chi-square goodness of fit test*.

Recall the procedure for testing hypotheses:

- based on the *sample*, the value of a *statistic* matching the type of the proposed hypothesis is computed. For example, to test the hypothesis that the mean μ equals some specified value μ_{0}, the *t-statistic* is used (when the standard deviation is not known);
- assuming the *null hypothesis* is true, the distribution of the *statistic* is known and can be used to compute probabilities (for example, the *t-statistic* follows Student's t-distribution);
- the value of the *statistic* computed from the *sample* is compared with the critical value for the given *significance level* (the α-quantile); the *null hypothesis* is rejected if the statistic is greater than the critical value (or, equivalently, if the probability of obtaining this or a more extreme value of the statistic (the p-value) is less than the *significance level*).
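As an illustration of the first step, the *t-statistic* mentioned above can be computed by hand. A minimal Python sketch (the sample values and μ_{0} below are hypothetical, chosen only for illustration):

```python
import math
import statistics

def t_statistic(sample, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)), where s is the
    sample standard deviation; used when sigma is unknown."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)   # (n - 1) sample standard deviation
    return (mean - mu0) / (s / math.sqrt(n))

# Hypothetical sample; test H0: mu = 4.9
t = t_statistic([5.1, 4.9, 5.0, 5.2, 4.8], 4.9)
```

This value of t would then be compared with the critical value of Student's t-distribution with n - 1 degrees of freedom.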

We will now test *hypotheses* for several different distributions.

## Discrete case

Suppose two people play dice. Each player has his own set of dice. The players take turns throwing 3 dice at once. The winner of each round is the one who throws more sixes. The results are recorded. After 100 rounds one of the players began to suspect that his opponent's dice are unbalanced, because the opponent often wins (often throws sixes). He decided to analyze how likely the opponent's observed outcomes are.

**note**: since there are 3 dice, in each round one can throw 0, 1, 2 or 3 sixes, i.e. the random variable can take 4 values.

From probability theory we know that if the dice are symmetric, the number of sixes thrown follows the binomial law. Therefore, after 100 rounds the expected frequencies of sixes can be calculated using the formula

=BINOM.DIST(A7,3,1/6,FALSE)*100

The formula assumes that cell **A7** contains the corresponding number of sixes thrown in a single round.
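Outside of Excel, the same expected frequencies can be reproduced directly from the binomial formula. A sketch in Python (the 100 rounds and 3 dice are taken from the example above):

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials,
    the counterpart of Excel's BINOM.DIST(k, n, p, FALSE)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Expected frequencies of 0, 1, 2, 3 sixes over 100 rounds of 3 dice
expected = [binom_pmf(k, 3, 1/6) * 100 for k in range(4)]
# expected ≈ [57.87, 34.72, 6.94, 0.46]; the values sum to 100
```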

**note**: The calculations are given in the example file, on the sheet Discrete.

To compare the *observed* (Observed) and *theoretical* (Expected) frequencies it is convenient to use a histogram.

If the observed frequencies deviate significantly from the theoretical distribution, the *null hypothesis* that the random variable follows the theoretical law must be rejected. That is, if the opponent's dice are unbalanced, the observed frequencies will be "significantly different" from the *binomial distribution*.

In our case, at first glance the frequencies are close enough, and it is difficult to draw an unambiguous conclusion without calculation. Instead of the subjective judgment "materially different", which could be made by comparing the *histograms*, we apply the *Pearson goodness-of-fit test X^{2}* in order to use a mathematically rigorous statement.

We use the fact that, owing to the *law of large numbers*, the observed frequency (Observed) tends, as the *sample* size n grows, to the probability given by the theoretical law (in this case, the *binomial law*). In our case the sample size n equals 100.

We introduce the *test statistic*, which we denote X^{2}:

X^{2} = Σ_{l=1}^{L} (O_{l} – E_{l})^{2} / E_{l}

where O_{l} is the observed frequency of the event that the random variable took a certain admissible value, E_{l} is the corresponding theoretical frequency (Expected), and L is the number of values the random variable can take (in our case 4).
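In code the statistic is just the sum over all L values of (O_{l} – E_{l})^{2}/E_{l}. A Python sketch (the expected counts are the binomial frequencies for 100 rounds of the dice example; the observed counts below are invented purely for illustration):

```python
def chi_square_statistic(observed, expected):
    """X^2 = sum over all bins of (O_l - E_l)^2 / E_l."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [50, 41, 8, 1]                      # hypothetical counts
expected = [57.8704, 34.7222, 6.9444, 0.4630]  # binomial frequencies, n = 100
stat = chi_square_statistic(observed, expected)  # ≈ 2.99
```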

**note**: This *statistic* is a special case of the *statistic* used in the *Chi-square test of independence* (see Chi-square test of independence in MS EXCEL).

As can be seen from the formula, this *statistic* is a measure of the closeness of the observed frequencies to the theoretical ones, i.e. it can be used to estimate the "distance" between these frequencies. If the sum of these "distances" is "too big", then the frequencies are "substantially different". Clearly, if our dice are symmetric (i.e. the *binomial law* applies), the probability that the sum of the "distances" turns out too big will be small. To calculate this probability we need to know the distribution of the *statistic* X^{2} (the *statistic* X^{2} is computed from a random *sample*, so it is itself a random variable and hence has its own *probability distribution*).

From a multidimensional analogue of the *de Moivre-Laplace integral theorem* it is known that as n→∞ our random variable X^{2} is asymptotically distributed according to the X^{2} law with L – 1 degrees of freedom.

So, if the computed value of the *statistic* X^{2} (the sum of the "distances" between frequencies) exceeds a certain limit value, we have grounds to reject the *null hypothesis*. As when testing *parametric hypotheses*, the limit value is set via the *significance level*. If the probability that the statistic X^{2} takes a value greater than or equal to the computed one (the *p-value*) is less than the *significance level*, the *null hypothesis* can be rejected.

In our case the value of the statistic is 22.757. The probability that the statistic X^{2} takes a value greater than or equal to 22.757 is very small (0.000045) and can be calculated with the formulas

=CHISQ.DIST.RT(22.757, 4-1) or

=CHISQ.TEST(Observed, Expected)

**note**: CHISQ.TEST() was created specifically to test the association between two categorical variables (see the article about the test of independence).

The probability 0.000045 is significantly less than the usual *significance level* of 0.05. So the player has every reason to suspect his opponent of dishonesty (the *null hypothesis* of his honesty is rejected).
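The p-value above can also be reproduced without Excel. A Python sketch of the chi-square survival function for integer degrees of freedom (a counterpart of CHISQ.DIST.RT, built on the standard recurrence for the regularized upper incomplete gamma function):

```python
import math

def chi2_sf(x, df):
    """P(X >= x) for a chi-square variable with integer df degrees
    of freedom; the counterpart of Excel's CHISQ.DIST.RT(x, df)."""
    z = x / 2.0
    if df % 2 == 1:
        sf, k = math.erfc(math.sqrt(z)), 1   # base case: df = 1
    else:
        sf, k = math.exp(-z), 2              # base case: df = 2
    # Q(a + 1, z) = Q(a, z) + z^a * e^-z / Gamma(a + 1), with a = k/2
    while k < df:
        sf += z ** (k / 2) * math.exp(-z) / math.gamma(k / 2 + 1)
        k += 2
    return sf

p = chi2_sf(22.757, 4 - 1)   # ≈ 0.000045, matching the value in the text
```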

When using the *X^{2} test*, you must make sure that the *sample* size n is large enough; otherwise the approximation of the distribution of the *statistic X^{2}* by the X^{2}-distribution will be invalid. It is generally believed that it is sufficient for the observed frequencies (Observed) to be greater than 5. If this is not the case, small frequencies are combined with one another or with neighboring frequencies; the combined bin is assigned the total probability, and the number of degrees of freedom of the X^{2}-distribution is reduced accordingly.

To improve the quality of the *X^{2} test* (to increase its power), the partition intervals should be made smaller (increasing L and hence the number of *degrees of freedom*); however, this is constrained by the requirement on the number of observations falling into each interval (which must be greater than 5).
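The merging of sparse intervals described above can be sketched as a small helper. The function and its threshold are illustrative (the conventional rule of thumb of at least 5 per bin is used as the default, applied here to the expected counts):

```python
def merge_small_bins(observed, expected, min_count=5):
    """Merge each bin whose expected count is below min_count into its
    left neighbour, scanning from the right; each merge removes one bin
    (and therefore one degree of freedom). The leftmost bin is left
    as-is in this sketch."""
    obs, exp = list(observed), list(expected)
    i = len(exp) - 1
    while i > 0:
        if exp[i] < min_count:
            obs[i - 1] += obs[i]
            exp[i - 1] += exp[i]
            del obs[i], exp[i]
        i -= 1
    return obs, exp

# The sparse "3 sixes" bin from the dice example folds into the "2 sixes" bin
obs, exp = merge_small_bins([50, 41, 8, 1], [57.87, 34.72, 6.94, 0.46])
# obs == [50, 41, 9], exp ≈ [57.87, 34.72, 7.40]
```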

**note**: the example above is a special case of the *Chi-square test of independence*, which allows one to determine whether there is a relationship between two categorical variables (see Chi-square test of independence in MS EXCEL).

**TIP**: For testing other types of hypotheses, see the article Statistical hypothesis testing in MS EXCEL.

## Continuous case

The *Pearson goodness-of-fit test X^{2}* can also be used in the case of continuous distributions.

Consider a *sample* consisting of 200 values. The *null hypothesis* claims that the *sample* is drawn from the standard normal distribution.

**note**: The random values in the example file, on the sheet Continuous, are generated using the formula =NORM.S.INV(RAND()). Therefore, new *sample* values are generated at each recalculation of the worksheet.

Whether the given dataset follows the normal distribution can be assessed visually with a normal probability plot.

As can be seen from the chart, the sample values lie quite well along a straight line. However, as in the discrete case, the *Pearson goodness-of-fit test X^{2}* can be applied to *test the hypothesis*.

To do this, we divide the range of variation of the random variable into intervals with a step of 0.5 standard deviations, and compute the observed and theoretical frequencies: the observed frequencies with the function FREQUENCY(), and the theoretical ones with NORM.S.DIST().
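The same binning can be sketched outside Excel. In Python (the sample is generated from the standard normal purely for illustration; `phi` plays the role of NORM.S.DIST(x, TRUE)):

```python
import math
import random

def phi(x):
    """Standard normal CDF, the counterpart of NORM.S.DIST(x, TRUE)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

random.seed(0)                               # illustrative reproducible sample
n = 200
sample = [random.gauss(0.0, 1.0) for _ in range(n)]

# Interval boundaries every 0.5 standard deviations, open tails: 10 intervals
inner = [-2.0 + 0.5 * i for i in range(9)]   # -2.0, -1.5, ..., 2.0
bounds = [-math.inf] + inner + [math.inf]

observed = [sum(a < x <= b for x in sample) for a, b in zip(bounds, bounds[1:])]
expected = [n * (phi(b) - phi(a)) for a, b in zip(bounds, bounds[1:])]
# In practice the thin tail bins may need merging, as in the discrete case.
```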

**note**: As in the *discrete case*, you must make sure that the *sample* is large enough and that each interval contains more than 5 values.

We calculate the statistic X^{2} and compare it with the critical value for the given *significance level* (0.05). Since we divided the range of variation of the random variable into 10 intervals, the number of degrees of freedom is 9. The critical value can be calculated with the formula

=CHISQ.INV.RT(0.05, 9) or

=CHISQ.INV(1-0.05, 9)

The chart above shows that the value of the statistic is 8.19, which is significantly lower than the *critical value* (about 16.92), so the *null hypothesis* is not rejected.

The following chart shows a case where the *sample* took an unlikely set of values and, based on the *Pearson goodness-of-fit test X^{2}*, the null hypothesis was rejected (even though the random values were generated by the formula =NORM.S.INV(RAND()), which produces a *sample* from the *standard normal distribution*).

The *null hypothesis* was rejected, although visually the data lie quite close to a straight line.

As another example, take a *sample* from the continuous uniform distribution U(-3, 3). In this case even the graph makes it obvious that the *null hypothesis* must be rejected.

The *Pearson goodness-of-fit test X^{2}* also confirms that the *null hypothesis* must be rejected.