# Chi Square Goodness of Fit Test in Excel (Part 2)

*We consider the application of Pearson's chi-square test to composite hypotheses in MS EXCEL.*

When testing composite hypotheses, we specify only the form of the distribution; its parameters, in contrast to the simple-hypothesis case, are unknown. These unknown parameters must first be estimated from the *sample*, and then the statistic X^{2} is calculated (by the same procedure as for simple hypotheses).

**Note**: It is recommended to first get acquainted with *Pearson's X^{2} (chi-square) goodness-of-fit test* in the case of simple hypotheses; see the article Test simple hypotheses by Chi-square Pearson in MS EXCEL.

For composite hypotheses, the p-value, which we compare with the significance level, is calculated using the *X^{2}* distribution with L-k-1 degrees of freedom, where L is the number of intervals (categories) and k is the number of estimated parameters.

If the probability that a random variable with the *X^{2}* distribution with L-k-1 degrees of freedom takes a value greater than the calculated statistic X^{2}_{0}, i.e., P{X^{2}_{L-k-1} > X^{2}_{0}}, is less than the *significance level*, the *null hypothesis* is rejected.
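The decision rule can be sketched outside the spreadsheet as well. The Python snippet below is a minimal illustration, not part of the sample file: the helper names `chi2_sf` and `reject_null` are hypothetical, and the upper-tail probability is computed from the standard closed-form series for integer degrees of freedom (it should correspond to what Excel's CHISQ.DIST.RT returns).

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X2 > x) of a chi-square variable with integer df."""
    if df < 1 or x < 0:
        raise ValueError("df must be >= 1 and x must be >= 0")
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{j=0}^{df/2 - 1} y^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{j=1}^{(df-1)/2} y^(j - 1/2) / Gamma(j + 1/2)
    total = math.erfc(math.sqrt(y))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * y ** (j - 0.5) / math.gamma(j + 0.5)
    return total

def reject_null(x2_0, n_bins, n_params, alpha=0.05):
    """Apply the rule: reject when P(X2_{L-k-1} > X2_0) < alpha."""
    df = n_bins - n_params - 1          # L - k - 1
    p_value = chi2_sf(x2_0, df)
    return p_value < alpha, p_value, df
```

For instance, `reject_null(x2_0, n_bins=4, n_params=1)` uses 4-1-1=2 degrees of freedom, as in the Poisson example that follows.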

Here are two examples of testing composite hypotheses.

## Poisson Distribution

We test the hypothesis that the number of defects in chips follows a Poisson distribution. A *sample* of 50 chips was examined.

Based on the *sample*, we estimate the rate λ (*lambda*), which is the only parameter of the *Poisson distribution* (it equals the sample mean; see the sheet Complex.hypothesise in the sample file). Using this estimate, we calculate the theoretical frequencies, e.g., =POISSON.DIST(0, λ, FALSE) for 0 defects.
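As an illustration of this step, the sketch below estimates the rate from grouped counts and derives the expected frequency of each category. The observed counts are made up for the example (they are not the data from the sample file), and the function names are hypothetical. The last category is treated as open-ended ("3 or more"), so its probability is taken as 1 minus the sum of the others.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events, the analogue of POISSON.DIST(k, lam, FALSE)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def expected_frequencies(lam, n, n_bins):
    """Expected counts for bins 0, 1, ..., n_bins-2 and an open-ended last bin."""
    probs = [poisson_pmf(k, lam) for k in range(n_bins - 1)]
    probs.append(1.0 - sum(probs))      # "n_bins-1 or more"
    return [n * p for p in probs]

# Hypothetical data: number of chips with 0, 1, 2 and 3 defects
# (assumes no chip in this made-up sample has more than 3 defects).
observed = [30, 14, 5, 1]
n = sum(observed)                                           # 50 chips
lam = sum(k * c for k, c in enumerate(observed)) / n        # sample mean
expected = expected_frequencies(lam, n, len(observed))
```

The expected counts sum to the sample size, which is a quick sanity check before computing the statistic.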

As you can see from the figure above, the random variable (the number of defects in a chip) takes 4 values (the fourth value corresponds to the case of "3 or more" defects). Therefore L=4, and the number of *degrees of freedom* is 4-1-1=2.

We compute the value of the statistic X^{2}_{0} and then the *p-value*, which we compare with the *significance level* of 0.05. In our case the *null hypothesis* that the number of defects follows a *Poisson distribution* cannot be rejected, because the *p-value* (0.676) is greater than 0.05.

It is generally recommended that the expected count in each interval be at least 5. In our case this condition is not met, because for 3 or more defects the theoretical frequency is less than 2. We therefore combine the intervals "3 or more" and "2 defects" into a single interval.

Do not forget to reduce the number of degrees of freedom by 1, because L decreased by 1 (it is now 3-1-1=1). As a result, the *p-value* also changes (0.396), but we still have no grounds to reject the null hypothesis.
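The merging step can be sketched as follows. This is a simplified illustration with made-up observed and expected counts (not the values from the sample file); the helper `merge_small_bins` is hypothetical and only merges from the right tail, which is where the sparse category sits in this example. With one degree of freedom, the upper-tail chi-square probability reduces to `erfc(sqrt(x/2))`.

```python
import math

def merge_small_bins(observed, expected, min_expected=5.0):
    """Merge the last bin into its neighbour until its expected count is large enough."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < min_expected:
        last_o, last_e = obs.pop(), exp.pop()
        obs[-1] += last_o
        exp[-1] += last_e
    return obs, exp

# Hypothetical counts for 0, 1, 2 and "3 or more" defects.
observed = [30, 14, 5, 1]
expected = [29.14, 15.73, 4.25, 0.88]

obs, exp = merge_small_bins(observed, expected)
x2_0 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
df = len(obs) - 1 - 1                       # L - k - 1 with k = 1 (lambda)
p_value = math.erfc(math.sqrt(x2_0 / 2))    # valid only because df == 1 here
```

Merging changes both the statistic and the degrees of freedom, so the p-value has to be recomputed rather than reused.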

## Normal Distribution

We now test a composite hypothesis for a continuous distribution.

A specialist in the quality department is testing electronic devices. The hypothesis is that the output voltage of the device follows a normal distribution.

To *test the hypothesis*, a *sample* of 100 devices was taken; the sample mean is 4.999 V and the standard deviation is 0.066.

In contrast to the discrete case (*Poisson distribution*), we need to divide the continuous range of the random variable into several intervals. Usually the boundaries of the intervals are chosen so that the theoretical frequency is the same for each interval.

We divide the range into 8 parts. The boundaries of the intervals must be chosen so that the probability that the random variable falls into any given interval equals 1/8=0.125. These boundaries can be calculated using the function =NORM.INV(1/8*i, 4.999, 0.066), where i = 1, ..., 7 is the serial number of the boundary.
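The same boundaries can be reproduced with Python's standard library, which may be useful for checking the spreadsheet. In this sketch the function name `equal_probability_boundaries` is hypothetical; `NormalDist.inv_cdf` plays the role of the inverse normal function. Only the 7 interior boundaries are computed, since the first and last intervals extend to minus and plus infinity.

```python
from statistics import NormalDist

def equal_probability_boundaries(mean, stdev, n_intervals):
    """Interior boundaries that split a normal distribution into equally likely intervals."""
    dist = NormalDist(mu=mean, sigma=stdev)
    return [dist.inv_cdf(i / n_intervals) for i in range(1, n_intervals)]

# Estimates from the article: sample mean 4.999, standard deviation 0.066.
bounds = equal_probability_boundaries(4.999, 0.066, 8)
```

Because the normal distribution is symmetric, the fourth boundary coincides with the sample mean and the others are placed symmetrically around it.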

The number of degrees of freedom equals 8-2-1=5, because we estimated 2 parameters of the *normal distribution* (μ and σ) from the *sample*.

The rest of the procedure is the same as for testing *simple hypotheses* (for the calculations, see the sheet Complex.hypothetial in the sample file).
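For readers who want to follow the remaining steps outside the spreadsheet, here is a rough Python sketch. It uses a synthetic sample generated with a fixed seed (an assumption for illustration only, not the measurements from the article), and the function name `equal_probability_chi2` is hypothetical. It fits the two parameters, counts observations in 8 equally likely intervals, and computes the statistic and the degrees of freedom.

```python
import random
from bisect import bisect_right
from statistics import NormalDist, mean, stdev

def equal_probability_chi2(sample, n_intervals=8):
    """Chi-square statistic for a normal fit using equally likely intervals."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))     # 2 estimated parameters
    bounds = [fitted.inv_cdf(i / n_intervals) for i in range(1, n_intervals)]
    observed = [0] * n_intervals
    for x in sample:
        observed[bisect_right(bounds, x)] += 1           # index of the interval containing x
    expected = n / n_intervals                           # same for every interval
    x2_0 = sum((o - expected) ** 2 / expected for o in observed)
    df = n_intervals - 2 - 1
    return observed, x2_0, df

# Synthetic voltages, for illustration only.
rng = random.Random(0)
sample = [rng.gauss(5.0, 0.07) for _ in range(100)]
observed, x2_0, df = equal_probability_chi2(sample)
```

The resulting statistic is then compared against the X^{2} distribution with `df` degrees of freedom, exactly as in the simple-hypothesis case.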

**TIP**: For tests of other types of hypotheses, see the article Statistical hypothesis testing in MS EXCEL.