# Chi-squared distribution in EXCEL

*Consider the Distribution of CHI-square. Using MS EXCEL functions **ХИ2.DIS()** construct graphs of the distribution function and probability density, explain the use of that distribution for purposes of mathematical statistics.*

*Distribution CHI-square* *( X^{2}, ХИ2, English. *

*Chi*

*squared*

*distribution*

*)*is used in different methods of mathematical statistics:

- when building confidence intervals for variance,
- the testing of hypotheses on the variance of the normal distribution,
- in testing the hypotheses using the goodness of fit Pearson (whether empirical data are consistent with our assumption of a theoretical distribution function or not, eng. Goodness-of-fit)
- in testing the hypotheses using the criterion of independence, Chi-square (used to define the relationship between two categorical variables, eng. Chi-square test of association).

**Definition**: If x_{1}, x_{2}, ..., x_{n} are independent random variables, distributed according to standard normal law N(0,1), the distribution of the random variable Y=x_{1}^{2}+ x_{2}^{2} +...+ x_{n}^{2} has *distribution X^{2 }*with n degrees of freedom.

*Distribution* * X^{2}* depends on one parameter, called

*freedom*(

*df*

*,*

*degrees*

*of*

*freedom*). For example, when building confidence intervals for variance

*the number of degrees of freedom*is equal to df=n-1, where n is the size of the

*fetch*.

*Density* * X^{2}* expressed by the formula:br>

if y>,0

**TIP**: Read more about *distribution Function *and* probability Density *see article distribution Function and the probability density in MS EXCEL.

## Graphic functions

*Distribution X^{2}* has an asymmetric shape, average value equal to n, variance equal to 2n.

the example file, on sheet Graph given *graphics of the density function* and probabilities *cumulative distribution function*.

**note**: To build a *distribution function* and *probability density* you can use a chart of type *Graph* or *Point* (with smooth lines and no dots). Learn more about charting, read the article Basic charts.

## Useful *ХИ2-distribution*

Let x_{1}, x_{2}, ..., x_{n} independent random variables, distributed according to *normal* with the same parameters ? and? and *X _{CP}* is

*arithmetic average*these values of x.

Then the random variable

*y*is

*X ^{2}*

*distribution*with n-1 degree of freedom. Using the definition of sample variance the above expression can be rewritten in the following way:

Consequently, *sampling distribution* statistics y when *sample* from *normal distribution*, has *X ^{2}*

*distribution*with n-1 degree of freedom.

This property we need when constructing the confidence interval in the estimate of the variance of the distribution. Because *variance* can only be a positive integer, and *X ^{2}*

*distribution*is used to evaluate the

*y*D. b. >,0, as specified in the determination.

## ХИ2-distribution in MS EXCEL

In MS EXCEL, beginning with version 2010, *X ^{2}*

*distribution*has a special ХИ2.DIS() English name – CHISQ.DIST(), which allows to calculate the

*probability density*(see the formula above) and

*integrated distribution function*(the probability that a random variable X with

*ХИ2*-

*distribution*, will take a value less than or equal to x, P{X <,= x}).

**note**: as *ХИ2-distribution* is a special case of gamma distribution, the formula =GAMMA.DIS(x,n/2,2,TRUE) for positive integer n returns the same result as the formula =ХИ2.DIS(x,n, TRUE) or =1-ХИ2.DIS.PH(x,n). And the formula =GAMMA.DIS(x,n/2,2,FALSE) returns the same result as the formula =ХИ2.DIS(x,n, FALSE), i.e. *probability density* *ХИ2 distribution.*

ХИ2.DIS.Pkh() returns the *distribution function*, specifically, right-sided probability, i.e. P{X > x}. It is obvious that true equality

=ХИ2.DIS.PH(x,n)+ ХИ2.DIS(x,n,TRUE)=1

since the first term calculates the probability P{X >, x}, and the second is P{X <,= x}.

To MS EXCEL 2010 in EXCEL was just a function ХИ2РАСП(), which allows you to calculate the right probability, i.e. P{X > x}. Possible new features of MS EXCEL 2010 ХИ2.DIS() ХИ2.DIS.Pkh() override the functionality of this feature. Function ХИ2РАСП() left in MS EXCEL 2010 for compatibility.

ХИ2.DIS() is the only function that returns a *probability density ХИ2-distribution *(the third argument must be equal to FALSE). The other functions return *integral distribution function*, i.e. the probability that a random variable will take the value from the specified range: P{X <,= x}.

the Above functions in MS EXCEL is given in the example file, on sheet Function.

## Examples

Find the probability that a random variable X will take a value less than or equal to the specified *x*: P{X <,= x}. This can be done in several functions:

= ХИ2.DIS(x, n, TRUE)

=1-ХИ2.DIS.PH(x, n)

=1-ХИ2РАСП(x, n)

ХИ2.DIS.Pkh() returns the probability P{X >, x}, the so-called right-sided probability, and therefore to find P{X <,= x}, it is necessary to subtract the result from 1.

Find the probability that a random variable X will take a value greater than the specified *x*: P{X >, x}. This can be done in several functions:

= 1-ХИ2.DIS(x, n, TRUE)

=ХИ2.DIS.PH(x, n)

=ХИ2РАСП(x, n)

## Reverse function ХИ2-distribution

the Reverse function is used to calculate *alpha*-quantiles, i.e. to calculate the values *x* for a given probability *alpha*, *x* must satisfy the expression P{X <,= x}=*alpha*.

ХИ2.OBR() is used to calculate the *confidence intervals for variance of normal distribution*.

ХИ2.OBR.Pkh() is used to calculate the top quantile, i.e. if the argument of the function specified significance level of 0.05 for example, the function will return a value of the random variable x for which P{X > x}=0,05. As a comparison: function ХИ2.OBR() will return a value of the random variable x for which P{X<,=x}=0,05.

In MS EXCEL 2007 and earlier, instead of ХИ2.OBR.Pkh() used ХИ2ОБР().

the Above functions can be interchanged, because the following formulas return the same result:

=HEE.OBR(alpha,n)

=ХИ2.OBR.PCH(1-alpha,n)

=ХИ2ОБР(1 - alpha,n)

Some examples are given in the the example file, on sheet Function.

## Functions MS EXCEL using ХИ2-distribution

In MS EXCEL there is another feature to use for calculations *ХИ2-distribution* is ХИ2.TEST(). An earlier version of her - ХИ2ТЕСТ(). For more information about using this function, see article about the test of independence Chi-square.

the following is the correspondence of Russian and English names of the functions:

ХИ2.DIS.Pkh() - eng. name CHISQ.DIST.RT, i.e., CHI-SQuared DISTribution Right Tail, the right-tailed Chi-square(d) distribution

ХИ2.OBR() - eng. name CHISQ.INV, i.e., CHI-SQuared distribution INVerse

ХИ2.Pkh.OBR() - eng. name CHISQ.INV.RT, i.e. CHI-SQuared distribution INVerse Right Tail

ХИ2РАСП() - eng. the name of the CHIDIST function is equivalent to a CHISQ.DIST.RT

ХИ2ОБР() - eng. name CHIINV, i.e. CHI-SQuared distribution INVerse

## Estimation of distribution parameters

because usually *ХИ2-distribution* is used for the purposes of mathematical statistics (computation of *confidence intervals* *test hypotheses, etc.)* and almost never to build models of real values, then for this distribution the discussion of the estimates of the distribution parameters is not performed herein.

## the Approach ХИ2 distribution normal distribution

the number of degrees of freedom n>,30 *distribution X ^{2}* is well approximated by a

*normal distribution*with

*average*?=n and

*variance ?*=2*n (see the sample file sheet Approximation).

**TIP**: About other distributions of MS EXCEL, you can read the article Distribution of a random variable in MS EXCEL.