A New Test for Independence in 2×2 Contingency Tables
DOI:
https://doi.org/10.18778/0208-6018.330.04
Keywords:
independence test, 2×2 contingency table, logarithmic minimum statistics, modular statistics, power divergence statistics, Monte Carlo method
Abstract
The statistical literature offers many tests of independence between two qualitative variables in two‑way contingency tables (CTs), and in 2×2 CTs in particular. This paper compares four of them: the chi‑square test, the most popular of the power divergence statistics; the modular test; the d‑square test, a modification of Pearson’s test; and the logarithmic minimum test, a new proposal. Critical values for these tests were determined with the Monte Carlo method. To compare the tests, a measure of the untruthfulness of H0 was proposed and the power of each test was calculated.
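As an illustration of how a Monte Carlo critical value might be obtained, the following is a minimal sketch for the chi‑square statistic only. The abstract does not give the simulation design, so everything here is an assumption made for the example: the function names (`chi_square_2x2`, `sample_table`, `mc_critical_value`), the multinomial sampling scheme with independent margins, the marginal probabilities, the number of replications, and the convention of returning 0 for a table with an empty margin.

```python
import random


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    denom = row1 * row2 * col1 * col2
    if denom == 0:
        # Assumed convention: a table with an empty margin carries no
        # evidence against independence, so the statistic is set to 0.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def sample_table(n, p_row, p_col, rng):
    """Draw one 2x2 table of n observations with H0 (independence) true.

    Each observation falls in row 1 with probability p_row and,
    independently, in column 1 with probability p_col.
    """
    counts = [0, 0, 0, 0]  # order: a, b, c, d
    for _ in range(n):
        i = 0 if rng.random() < p_row else 1
        j = 0 if rng.random() < p_col else 1
        counts[2 * i + j] += 1
    return counts


def mc_critical_value(n, p_row=0.5, p_col=0.5, alpha=0.05, reps=20000, seed=1):
    """Empirical (1 - alpha) quantile of the statistic simulated under H0."""
    rng = random.Random(seed)
    stats = sorted(
        chi_square_2x2(*sample_table(n, p_row, p_col, rng))
        for _ in range(reps)
    )
    index = min(reps - 1, int((1 - alpha) * reps))
    return stats[index]


if __name__ == "__main__":
    # Hypothetical settings: sample size 100, significance level 0.05.
    print(mc_critical_value(n=100))
```

The same loop could in principle be reused for any other statistic by replacing `chi_square_2x2`, and the power of a test could be estimated by sampling tables under a chosen alternative and counting how often the statistic exceeds the simulated critical value.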