Chi-squared test

From Wikipedia, the free encyclopedia

Statistical hypothesis test

Chi-squared distribution, showing χ² on the x-axis and p-value (right tail probability) on the y-axis.

A chi-squared test (also chi-square or χ² test) is a statistical hypothesis test used in the analysis of contingency tables when the sample sizes are large. In simpler terms, this test is primarily used to examine whether two categorical variables (two dimensions of the contingency table) are independent in influencing the test statistic (values within the table).^[1] The test is valid when the test statistic is chi-squared distributed under the null hypothesis; this is the case for Pearson's chi-squared test and variants thereof. Pearson's chi-squared test is used to determine whether there is a statistically significant difference between the expected frequencies and the observed frequencies in one or more categories of a contingency table. For contingency tables with smaller sample sizes, Fisher's exact test is used instead.

In the standard applications of this test, the observations are classified into mutually exclusive classes. If the null hypothesis that there are no differences between the classes in the population is true, the test statistic computed from the observations follows a χ² frequency distribution. The purpose of the test is to evaluate how likely the observed frequencies would be assuming the null hypothesis is true.

Test statistics that follow a χ² distribution occur when the observations are independent. There are also χ² tests for testing the null hypothesis of independence of a pair of random variables based on observations of the pairs.

The term chi-squared test often refers to tests for which the distribution of the test statistic approaches the χ² distribution asymptotically, meaning that the sampling distribution of the test statistic (if the null hypothesis is true) approximates a chi-squared distribution more and more closely as sample sizes increase.

History

In the 19th century, statistical analytical methods were mainly applied in biological data analysis, and it was customary for researchers such as Sir George Airy and Mansfield Merriman to assume that observations followed a normal distribution. Karl Pearson criticized their works in his 1900 paper.^[2]

At the end of the 19th century, Pearson noticed the existence of significant skewness within some biological observations. In order to model the observations regardless of whether they were normal or skewed, Pearson, in a series of articles published from 1893 to 1916,^[3]^[4]^[5]^[6] devised the Pearson distribution, a family of continuous probability distributions that includes the normal distribution and many skewed distributions. He proposed a method of statistical analysis that consists of using the Pearson distribution to model the observations and then performing a test of goodness of fit to determine how well the model fits them.

Pearson's chi-squared test

Other examples of chi-squared tests

One test statistic that follows a chi-squared distribution exactly is the one used to test whether the variance of a normally distributed population has a given value, based on a sample variance. Such tests are uncommon in practice because the true variance of the population is usually unknown. However, there are several statistical tests where the chi-squared distribution is approximately valid:

Fisher's exact test

For an exact test used in place of the 2 × 2 chi-squared test for independence when all the row and column totals were fixed by design, see Fisher's exact test. When the row or column margins (or both) are random variables (as in most common research designs), Fisher's exact test tends to be overly conservative and underpowered.^[10]
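As a rough illustration of what "exact" means here, the sketch below conditions on all four margins of a 2 × 2 table and sums hypergeometric probabilities directly, with no chi-squared approximation. The function name, the example counts, and the two-sided rule (add up every table that is no more probable than the observed one) are assumptions of this sketch; other two-sided conventions exist.

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value for a 2 x 2 table [[a, b], [c, d]].

    Conditions on all margins and sums the hypergeometric probabilities of
    every table that is no more likely than the observed one.
    """
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # The small relative tolerance guards against floating-point ties.
    return min(1.0, sum(prob(x) for x in range(lo, hi + 1)
                        if prob(x) <= p_obs * (1 + 1e-9)))

# Hypothetical 2 x 2 counts in which every margin equals 4.
p_value = fisher_exact_two_sided([[3, 1], [1, 3]])
print(p_value)
```

For these hypothetical counts the five possible tables have probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, so the p-value is 34/70.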

Binomial test

For an exact test used in place of the 2 × 1 chi-squared test for goodness of fit, see binomial test.
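The following sketch compares the two approaches on made-up data: an exact two-sided binomial p-value and the chi-squared approximation for the same 2 × 1 table. The function names and the counts are hypothetical. The chi-squared tail for one degree of freedom is computed as erfc(√(x/2)), which follows from the statistic being the square of a standard normal variable.

```python
from math import comb, erfc, sqrt

def binomial_two_sided_p(k, n, p0):
    """Exact two-sided p-value: total probability of all outcomes that are
    no more likely than the observed count k under Binomial(n, p0)."""
    pmf = [comb(n, i) * p0 ** i * (1 - p0) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(p for p in pmf if p <= threshold))

def chi_squared_2x1_p(k, n, p0):
    """Approximate p-value from the 2 x 1 goodness-of-fit statistic.
    With 1 degree of freedom the upper tail equals erfc(sqrt(x / 2))."""
    expected = [n * p0, n * (1 - p0)]
    observed = [k, n - k]
    x = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return x, erfc(sqrt(x / 2))

# Hypothetical data: 15 successes in 20 trials, null proportion 0.5.
exact_p = binomial_two_sided_p(15, 20, 0.5)
statistic, approx_p = chi_squared_2x1_p(15, 20, 0.5)
print(exact_p, statistic, approx_p)
```

With a sample this small the two p-values differ noticeably, which is the situation in which the exact test is usually preferred.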

Other chi-squared tests

Cochran–Mantel–Haenszel chi-squared test
McNemar's test, used in certain 2 × 2 tables with pairing
Tukey's test of additivity
The portmanteau test in time-series analysis, testing for the presence of autocorrelation
Likelihood-ratio tests in general statistical modelling, for testing whether there is evidence of the need to move from a simple model to a more complicated one (where the simple model is nested within the complicated one)

Yates's correction for continuity

Main article: Yates's correction for continuity

Using the chi-squared distribution to interpret Pearson's chi-squared statistic requires one to assume that the discrete probability of observed binomial frequencies in the table can be approximated by the continuous chi-squared distribution. This assumption is not quite correct and introduces some error.

To reduce the error in approximation, Frank Yates suggested a correction for continuity that adjusts the formula for Pearson's chi-squared test by subtracting 0.5 from the absolute difference between each observed value and its expected value in a 2 × 2 contingency table.^[11] This reduces the chi-squared value obtained and thus increases its p-value.
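The effect of the correction can be seen in a minimal sketch that computes the statistic for a 2 × 2 table with and without subtracting 0.5. The function name and the counts are hypothetical. Clamping the adjusted difference at zero is a common safeguard added here as an assumption; it is not part of the description above.

```python
def pearson_2x2(table, yates=False):
    """Pearson chi-squared statistic for a 2 x 2 table of counts.

    With yates=True, 0.5 is subtracted from each |observed - expected|
    before squaring (clamped at zero so the correction never overshoots).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(table[i][j] - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    return statistic

# Hypothetical counts, chosen only for illustration.
table = [[12, 5], [6, 14]]
plain = pearson_2x2(table)
corrected = pearson_2x2(table, yates=True)
print(plain, corrected)
```

The corrected statistic is never larger than the uncorrected one, so the corresponding p-value is never smaller.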

Chi-squared test for variance in a normal population

If a sample of size n is taken from a population having a normal distribution, then there is a result (see distribution of the sample variance) which allows a test to be made of whether the variance of the population has a pre-determined value. For example, a manufacturing process might have been in stable condition for a long period, allowing a value for the variance to be determined essentially without error. Suppose that a variant of the process is being tested, giving rise to a small sample of n product items whose variation is to be tested. The test statistic T in this instance could be set to be the sum of squares about the sample mean, divided by the nominal value for the variance (i.e. the value to be tested as holding). Then T has a chi-squared distribution with n − 1 degrees of freedom. For example, if the sample size is 21, the acceptance region for T with a significance level of 5% is between 9.59 and 34.17.
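A small sketch of this calculation follows. The measurements and the nominal variance are invented for illustration, and the function name is hypothetical. The acceptance limits are the ones quoted above for n = 21; they are not recomputed here.

```python
from statistics import fmean

def variance_test_statistic(sample, sigma0_squared):
    """T = sum of squares about the sample mean / hypothesised variance."""
    mean = fmean(sample)
    return sum((x - mean) ** 2 for x in sample) / sigma0_squared

# Hypothetical measurements (n = 21) and nominal variance, for illustration.
sample = [10.0 + 0.1 * k for k in range(-10, 11)]
sigma0_squared = 0.25

T = variance_test_statistic(sample, sigma0_squared)
# Acceptance region quoted in the text for n = 21 (20 degrees of freedom)
# at the 5% significance level.
lower, upper = 9.59, 34.17
print(T, lower <= T <= upper)
```

For this made-up sample the sum of squares is 7.7, so T is about 30.8 and falls inside the acceptance region; the null value of the variance would not be rejected.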

Example chi-squared test for categorical data

Suppose there is a city of 1,000,000 residents with four neighborhoods: A, B, C, and D. A random sample of 650 residents of the city is taken and their occupation is recorded as "white collar", "blue collar", or "no collar". The null hypothesis is that each person's neighborhood of residence is independent of the person's occupational classification. The data are tabulated as:

	A	B	C	D	Total
White collar	90	60	104	95	349
Blue collar	30	50	51	20	151
No collar	30	40	45	35	150
Total	150	150	200	150	650

Let us take the proportion of the sample living in neighborhood A, 150/650, to estimate what proportion of the whole 1,000,000 live in neighborhood A. Similarly we take 349/650 to estimate what proportion of the 1,000,000 are white-collar workers. By the assumption of independence under the hypothesis we should "expect" the number of white-collar workers in neighborhood A to be

$$150 \times \frac{349}{650} \approx 80.54$$

Then in that "cell" of the table, we have

$$\frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \frac{(90 - 80.54)^2}{80.54} \approx 1.11$$

The sum of these quantities over all of the cells is the test statistic; in this case, $\approx 24.57$. Under the null hypothesis, this sum has approximately a chi-squared distribution whose number of degrees of freedom is

$$(\text{number of rows} - 1)(\text{number of columns} - 1) = (3 - 1)(4 - 1) = 6$$

If the test statistic is improbably large according to that chi-squared distribution, then one rejects the null hypothesis of independence.
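The arithmetic of this example can be checked with a short sketch in plain Python. The function name is hypothetical; the counts are those of the table above. It builds the expected counts from the row and column totals, sums the per-cell contributions, and reports the degrees of freedom.

```python
# Observed counts from the table above: rows are occupations,
# columns are neighborhoods A, B, C, D.
observed = [
    [90, 60, 104, 95],   # white collar
    [30, 50, 51, 20],    # blue collar
    [30, 40, 45, 35],    # no collar
]

def chi_squared_independence(table):
    """Return (statistic, degrees_of_freedom, expected) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected

statistic, dof, expected = chi_squared_independence(observed)
print(round(expected[0][0], 2))  # expected white-collar count in A
print(round(statistic, 2), dof)
```

Run on the table above, this should reproduce the expected count of about 80.54, the statistic of about 24.57, and the 6 degrees of freedom given in the text.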

A related issue is a test of homogeneity. Suppose that instead of giving every resident of each of the four neighborhoods an equal chance of inclusion in the sample, we decide in advance how many residents of each neighborhood to include. Then each resident has the same chance of being chosen as do all residents of the same neighborhood, but residents of different neighborhoods would have different probabilities of being chosen if the four sample sizes are not proportional to the populations of the four neighborhoods. In such a case, we would be testing "homogeneity" rather than "independence". The question is whether the proportions of blue-collar, white-collar, and no-collar workers in the four neighborhoods are the same. However, the test is done in the same way.

Applications

In cryptanalysis, the chi-squared test is used to compare the distribution of plaintext and (possibly) decrypted ciphertext. The lowest value of the test statistic means that the decryption was successful with high probability.^[12]^[13] This method can be generalized for solving modern cryptographic problems.^[14]
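One simple way to picture this is a Caesar cipher: try every shift, compare the letter counts of each candidate decryption with typical English letter frequencies, and keep the shift with the lowest chi-squared value. The sketch below does that under several assumptions: the frequency table holds rounded approximate values, the function names are hypothetical, and the sample sentence was chosen only for illustration.

```python
import string

# Approximate relative letter frequencies in English text (percent).
# These are rounded reference values and an assumption of this sketch.
ENGLISH_FREQ = {
    'a': 8.2, 'b': 1.5, 'c': 2.8, 'd': 4.3, 'e': 12.7, 'f': 2.2, 'g': 2.0,
    'h': 6.1, 'i': 7.0, 'j': 0.15, 'k': 0.77, 'l': 4.0, 'm': 2.4, 'n': 6.7,
    'o': 7.5, 'p': 1.9, 'q': 0.095, 'r': 6.0, 's': 6.3, 't': 9.1, 'u': 2.8,
    'v': 0.98, 'w': 2.4, 'x': 0.15, 'y': 2.0, 'z': 0.074,
}

def shift_text(text, shift):
    """Caesar-shift lowercase letters by `shift`; leave other characters."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord('a') + shift) % 26 + ord('a')))
        else:
            out.append(ch)
    return ''.join(out)

def chi_squared_vs_english(text):
    """Chi-squared distance between letter counts in `text` and English."""
    letters = [ch for ch in text if ch in string.ascii_lowercase]
    n = len(letters)
    total = sum(ENGLISH_FREQ.values())
    stat = 0.0
    for letter, freq in ENGLISH_FREQ.items():
        expected = n * freq / total
        observed = letters.count(letter)
        stat += (observed - expected) ** 2 / expected
    return stat

def crack_caesar(ciphertext):
    """Return the shift whose decryption has the lowest chi-squared value."""
    return min(range(26),
               key=lambda s: chi_squared_vs_english(shift_text(ciphertext, -s)))

plaintext = ("it was the best of times it was the worst of times "
             "it was the age of wisdom it was the age of foolishness")
ciphertext = shift_text(plaintext, 3)
best_shift = crack_caesar(ciphertext)
print(best_shift, shift_text(ciphertext, -best_shift))
```

Very short ciphertexts may not contain enough letters for the frequency comparison to single out the right shift, so the approach is best treated as a ranking heuristic.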

In bioinformatics, the chi-squared test is used to compare the distribution of certain properties of genes (e.g., genomic content, mutation rate, interaction network clustering, etc.) belonging to different categories (e.g., disease genes, essential genes, genes on a certain chromosome, etc.).^[15]^[16]

References

1. "Chi-Square - Sociology 3112 - Department of Sociology - The University of Utah". soc.utah.edu. Retrieved 2022-11-12.
2. Pearson, Karl (1900). "On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling". Philosophical Magazine. Series 5. 50 (302): 157–175. doi:10.1080/14786440009463897.
3. Pearson, Karl (1893). "Contributions to the mathematical theory of evolution [abstract]". Proceedings of the Royal Society. 54: 329–333. doi:10.1098/rspl.1893.0079. JSTOR 115538.
4. Pearson, Karl (1895). "Contributions to the mathematical theory of evolution, II: Skew variation in homogeneous material". Philosophical Transactions of the Royal Society. 186: 343–414. Bibcode:1895RSPTA.186..343P. doi:10.1098/rsta.1895.0010. JSTOR 90649.
5. Pearson, Karl (1901). "Mathematical contributions to the theory of evolution, X: Supplement to a memoir on skew variation". Philosophical Transactions of the Royal Society A. 197 (287–299): 443–459. Bibcode:1901RSPTA.197..443P. doi:10.1098/rsta.1901.0023. JSTOR 90841.
6. Pearson, Karl (1916). "Mathematical contributions to the theory of evolution, XIX: Second supplement to a memoir on skew variation". Philosophical Transactions of the Royal Society A. 216 (538–548): 429–457. Bibcode:1916RSPTA.216..429P. doi:10.1098/rsta.1916.0009. JSTOR 91092.
7. Cochran, William G. (1952). "The Chi-square Test of Goodness of Fit". The Annals of Mathematical Statistics. 23 (3): 315–345. doi:10.1214/aoms/1177729380. JSTOR 2236678.
8. Fisher, Ronald A. (1922). "On the Interpretation of χ² from Contingency Tables, and the Calculation of P". Journal of the Royal Statistical Society. 85 (1): 87–94. doi:10.2307/2340521. JSTOR 2340521.
9. Fisher, Ronald A. (1924). "The Conditions Under Which χ² Measures the Discrepancy Between Observation and Hypothesis". Journal of the Royal Statistical Society. 87 (3): 442–450. JSTOR 2341149.
10. Campbell, Ian (2007-08-30). "Chi-squared and Fisher–Irwin tests of two-by-two tables with small sample recommendations". Statistics in Medicine. 26 (19): 3661–3675. doi:10.1002/sim.2832. ISSN 0277-6715. PMID 17315184.
11. Yates, Frank (1934). "Contingency table involving small numbers and the χ² test". Supplement to the Journal of the Royal Statistical Society. 1 (2): 217–235. doi:10.2307/2983604. JSTOR 2983604.
12. "Chi-squared Statistic". Practical Cryptography. Archived from the original on 18 February 2015. Retrieved 18 February 2015.
13. "Using Chi Squared to Crack Codes". IB Maths Resources. British International School Phuket. 15 June 2014.
14. Ryabko, B. Ya.; Stognienko, V. S.; Shokin, Yu. I. (2004). "A new test for randomness and its application to some cryptographic problems" (PDF). Journal of Statistical Planning and Inference. 123 (2): 365–376. doi:10.1016/s0378-3758(03)00149-6. Retrieved 18 February 2015.
15. Feldman, I.; Rzhetsky, A.; Vitkup, D. (2008). "Network properties of genes harboring inherited disease mutations". PNAS. 105 (11): 4323–432. Bibcode:2008PNAS..105.4323F. doi:10.1073/pnas.0701722105. PMC 2393821. PMID 18326631.
16. "chi-square-tests" (PDF). Archived from the original (PDF) on 29 June 2018. Retrieved 29 June 2018.

Weisstein, Eric W. "Chi-Squared Test". MathWorld.
Corder, G. W.; Foreman, D. I. (2014). Nonparametric Statistics: A Step-by-Step Approach. New York: Wiley. ISBN 978-1118840313.
Greenwood, Cindy; Nikulin, M. S. (1996). A guide to chi-squared testing. New York: Wiley. ISBN 0-471-55779-X.
Nikulin, M. S. (1973). Chi-squared test for normality. Proceedings of the International Vilnius Conference on Probability Theory and Mathematical Statistics. Vol. 2. pp. 119–122.
Bagdonavicius, Vilijandas B.; Nikulin, Mikhail S. (2011). "Chi-squared goodness-of-fit test for right censored data". International Journal of Applied Mathematics & Statistics. 24: 30–50. MR 2800388.
