A Multivariate Extension of McNemar’s Test Based on Permutations

The purpose of this publication is to propose a permutation test to detect the departure from symmetry in multidimensional contingency tables. The proposal is a multivariate extension of McNemar’s test, which can be applied to 2 × 2 contingency tables. The proposal may also be treated as a modification of Cochran’s Q test, which is used for testing dependency for multivariate binary data. The form of the test statistic that allows us to detect the departure from counts symmetry in multidimensional contingency tables is presented in the article. Permutation of the observations was used to estimate the empirical distribution of the test statistic. The considerations were supplemented with examples of the use of the multivariate test for simulated and real data. The application of the proposed test allows us to detect the asymmetrical distribution of counts in multivariate contingency tables.


Introduction
McNemar's test was proposed in 1947 (McNemar, 1947). It is a statistical test used for paired nominal data. This test is applied to 2 × 2 contingency tables with a binomial outcome, with matched pairs of subjects, to determine whether there is marginal homogeneity. Typical applications involve two independent raters providing dichotomous judgments for the same set of ratings, or a panel of separate raters responding on two occasions to the same dichotomous variable. Bowker (1948) presented a generalisation of McNemar's test for variables with k (k > 2) categories. The generalisations of McNemar's test for square tables larger than 2 × 2 are often referred to as the Stuart-Maxwell test (Stuart, 1955; Maxwell, 1970). Some of the modifications concern the extension of the test application to quantitative data, others to nominal polytomous data, and still others to multidimensional dependent dichotomous data.
The purpose of this publication is to propose a permutation test to detect the departure from symmetry in multidimensional contingency tables. The proposed test, like the Cochran Q test, leads to testing the null hypothesis on the independence of k (k > 2) binary variables. The null hypothesis is the same as in Cochran's Q test, but the alternative hypotheses in these tests are different. The use of the Cochran Q test leads to the detection of existing differences in the percentage of responses for individual variables, and the proposed test, like the Bowker test, lets us detect asymmetry of counts in multivariate contingency tables.

McNemar's test
Let us consider paired data $(Y_{i1}, Y_{i2})$ for i = 1, 2, …, n with the binary response: "0" and "1". There are four possible outcomes for each pair: (0, 0), (0, 1), (1, 0) and (1, 1). Let us assume that: a) the sample of n subjects has been randomly selected from the population; b) each of the n subjects in the contingency table is independent of the other observations; c) the scores of subjects are in the form of a dichotomous categorical measure involving two categories; d) the sample size should not be extremely small.
The chi-square distribution is employed to calculate the McNemar's test statistic (McNemar, 1947). When the sample size is small, some sources endorse the use of a correction for continuity, while other sources prefer the exact binomial probability for the data to be computed instead of the chi-square based statistic (Fay, 2011). Suppose $(Y_{i1}, Y_{i2})$ for i = 1, 2, …, n are identically and independently distributed bivariate data vectors. Let the mean vector of $(Y_{i1}, Y_{i2})$ be $(p_1, p_2)$ and the null hypothesis $H_0: p_1 = p_2$ against the alternative $H_1: p_1 \neq p_2$. The observable data may be arranged in a 2 × 2 contingency table (see Table 1).
Source: own elaboration based on McNemar (1947)

The empirical probabilities for cells $\pi_{ij}$ are shown in Table 2.

            Y2 = 0    Y2 = 1    Total
Y1 = 0      π00       π01       π0.
Y1 = 1      π10       π11       π1.
Total       π.0       π.1       1

Source: own elaboration based on McNemar (1947)

McNemar's test could be used for testing the hypothesis

\[ H_0: \pi_{01} = \pi_{10} \]

(it means that the theoretical proportion of cell b equals the proportion of cell c in the underlying population the sample represents), against the alternative hypothesis

\[ H_1: \pi_{01} \neq \pi_{10}. \]

The test statistic in McNemar's test has the following form (McNemar, 1947):

\[ Q = \frac{(b - c)^2}{b + c}. \tag{1} \]

Under the null hypothesis, with a sufficiently large number of discordant pairs (b and c), the statistic Q has a chi-square distribution with one degree of freedom.
For small sample sizes, the modified form $Q_C$ of the test statistic (1) with continuity correction should be calculated. This statistic has the following form (Sheskin, 2011):

\[ Q_C = \frac{(|b - c| - 1)^2}{b + c}. \tag{2} \]

Statistics $Q$ and $Q_C$ measure the asymmetry of counts in the contingency table (see Table 1). The test leads to rejection of the null hypothesis in the case of counts asymmetry in the contingency table.
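The two chi-square forms of the statistic can be illustrated with a minimal Python sketch. It assumes the standard forms Q = (b - c)^2 / (b + c) and Q_C = (|b - c| - 1)^2 / (b + c), where b and c are the discordant counts; the function name `mcnemar_statistic` is hypothetical. The p-value uses the identity that the upper tail of a chi-square variable with one degree of freedom equals erfc(sqrt(q / 2)).

```python
import math


def mcnemar_statistic(b, c, correction=False):
    """Return (statistic, p_value) for the discordant counts b and c.

    correction=False gives Q = (b - c)^2 / (b + c);
    correction=True gives Q_C = (|b - c| - 1)^2 / (b + c).
    """
    if b + c == 0:
        raise ValueError("no discordant pairs: the statistic is undefined")
    diff = abs(b - c)
    if correction:
        diff -= 1
    stat = diff ** 2 / (b + c)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts, chosen only for illustration.
q, p = mcnemar_statistic(10, 2)
q_c, p_c = mcnemar_statistic(10, 2, correction=True)
print(f"Q = {q:.3f}, p = {p:.4f}")
print(f"Q_C = {q_c:.3f}, p = {p_c:.4f}")
```

The concordant counts do not enter either statistic, which is why the function takes only b and c.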
An alternative form of the McNemar's test statistic is based on the normal distribution (Sheskin, 2011). Equation (3) can be employed in place of the test statistic (1):

\[ z = \frac{b - c}{\sqrt{b + c}}. \tag{3} \]

Another form of the test statistic (2) could be written as follows:

\[ z_C = \frac{|b - c| - 1}{\sqrt{b + c}}. \tag{4} \]

For a small sample size, instead of statistic (2) with continuity correction, the exact version of McNemar's test could be used (Sheskin, 2011). In this case, the odds ratio is calculated (Fay, 2011): where:

Some modifications and extensions of McNemar's test
McNemar's test may be applied only to binary categories for each outcome. In some cases the outcomes could be classified into k categories, where k is greater than two. The paired data that result from this type of experiment can be summarised in a k × k contingency table of counts. Such data can be analysed using the Bowker test (Bowker, 1948).

Table 3.
Source: own elaboration based on Bowker (1948)

The form of the null hypothesis in the Bowker symmetry test could be written as follows:

\[ H_0: \pi_{ij} = \pi_{ji} \quad \text{for all } i \neq j, \quad i, j = 1, 2, \dots, k. \]

If $n_{ij}$ is the count in the i-th row and j-th column of the contingency table (see Table 3), then the test statistic could be written in the following form (Bowker, 1948):

\[ B = \sum_{i<j} \frac{(n_{ij} - n_{ji})^2}{n_{ij} + n_{ji}}. \tag{6} \]

Under the null hypothesis, the test statistic B has an asymptotic chi-square distribution with k(k - 1)/2 degrees of freedom. In the case of k = 3, the statistic (6) has the following form:

\[ B = \frac{(n_{12} - n_{21})^2}{n_{12} + n_{21}} + \frac{(n_{13} - n_{31})^2}{n_{13} + n_{31}} + \frac{(n_{23} - n_{32})^2}{n_{23} + n_{32}}. \tag{7} \]

It is visible that only counts that are symmetric in pairs ($n_{12}$ and $n_{21}$, $n_{13}$ and $n_{31}$, $n_{23}$ and $n_{32}$) in the contingency table are compared.
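The pairwise comparison of symmetric counts can be sketched in Python as follows. The sketch assumes the usual form of the Bowker statistic, the sum over i < j of (n_ij - n_ji)^2 / (n_ij + n_ji); the function name `bowker_statistic` and the example tables are hypothetical. Skipping pairs with n_ij + n_ji = 0 is an assumption of this sketch, not something stated in the text.

```python
def bowker_statistic(table):
    """Return (B, df) for a square k x k table given as a list of rows."""
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("the contingency table must be square")
    stat = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            total = table[i][j] + table[j][i]
            if total == 0:
                # Assumption: an empty pair of cells contributes nothing.
                continue
            stat += (table[i][j] - table[j][i]) ** 2 / total
    # Nominal degrees of freedom, k(k - 1)/2.
    return stat, k * (k - 1) // 2


# Hypothetical 3 x 3 table of paired counts.
table = [
    [10, 5, 3],
    [1, 8, 6],
    [3, 2, 7],
]
b_stat, df = bowker_statistic(table)
print(f"B = {b_stat:.3f} with {df} degrees of freedom")
```

For a 2 × 2 table the loop visits a single pair of cells, so the result coincides with the uncorrected McNemar statistic.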
There are some well-known extensions of McNemar's test. The generalisations of McNemar's test for square tables larger than 2 × 2 are often referred to as the Stuart-Maxwell test (Stuart, 1955; Maxwell, 1970). Feuer and Kessler (1989) discussed generalisations of McNemar's test for the case of two independent samples of paired univariate binary responses. They considered the null hypothesis that the marginal changes in each of two independently sampled tables are equal. Agresti and Klingenberg (2005) (see also Klingenberg, Agresti, 2006) considered methods for comparing two independent multivariate binary vectors. Pesarin (2001) considered methods for comparing two dependent vectors of sample proportions. Westfall, Troendle and Pennello (2010) considered the problem of multiple comparisons of dependent proportions. They argued that multiple comparisons of dependent proportions can be made more powerful by utilising testing methods and incorporating dependence structures. They proposed a method that utilises stepwise testing and discrete characteristics for the exact McNemar's test.

McNemar (1947) assumed that the data being analysed are measured on a nominal or ordinal scale. However, experimental data may often be measured on at least an interval scale. Oyeka (2012) proposed an extension of McNemar's test which could be used for data measured on an interval or ratio scale. This modification is based on a transformation of the data from a continuous scale to a nominal scale.

Cochran's Q test is an extension of McNemar's test to more than two matched samples (Donald, Shahren, 2018). When the Cochran's Q test statistic is computed with only k = 2 groups, the results are equivalent to those obtained from McNemar's test. Cochran's Q could also be considered a special case of the Friedman test (Sheskin, 2011): when the responses are binary, the Friedman test becomes Cochran's Q test.
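Suppose that there are k binary measurements on each of n subjects. Let $y_{ij}$ be the binary response from subject i in category j (i = 1, 2, …, n, j = 1, 2, …, k), with success = 1 and failure = 0. The null hypothesis for Cochran's Q test is that there are no differences between the categories (Sheskin, 2011). If the calculated probability is low (the p-value is less than the selected significance level α), the null hypothesis is rejected, and it can be concluded that the proportions in at least 2 of the k variables are significantly different from each other.
The null hypothesis in Cochran's Q test could be written as follows:

\[ H_0: p_1 = p_2 = \dots = p_k, \]

where $p_j$ denotes the probability of success for the j-th variable. The test statistic has the form

\[ Q = \frac{(k - 1)\left[k \sum_{j=1}^{k} C_j^2 - T^2\right]}{kT - \sum_{i=1}^{n} R_i^2}, \]

where $C_j = \sum_{i=1}^{n} y_{ij}$ is the number of successes for the j-th variable, $R_i = \sum_{j=1}^{k} y_{ij}$ is the number of successes for the i-th subject and $T = \sum_{j=1}^{k} C_j$ is the total number of successes. Under the null hypothesis, the test statistic Q has an asymptotic chi-square distribution with k - 1 degrees of freedom.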
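As a rough illustration, Cochran's Q can be computed directly from cell counts of a k-dimensional binary contingency table. The sketch below assumes the textbook form Q = (k - 1)[k ΣC_j² - T²] / (kT - ΣR_i²), with C_j the column totals, R_i the row totals and T the grand total; the function name `cochran_q` and the dictionary input format are hypothetical. The input uses the discordant counts quoted later in the paper for the simulated and the survey data; the all-zero and all-one cells are omitted because they do not change Q.

```python
import math


def cochran_q(cell_counts):
    """Return (Q, df) from a dict mapping 0/1 response patterns to counts."""
    k = len(next(iter(cell_counts)))
    col_totals = [0] * k      # C_j: successes per variable
    sum_row_sq = 0            # sum of R_i^2 over subjects
    for pattern, count in cell_counts.items():
        successes = sum(pattern)
        sum_row_sq += count * successes ** 2
        for j, value in enumerate(pattern):
            col_totals[j] += count * value
    total = sum(col_totals)   # T: grand total of successes
    denominator = k * total - sum_row_sq
    if denominator == 0:
        raise ValueError("all subjects are concordant: Q is undefined")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - total ** 2) / denominator
    return q, k - 1


simulated = {(0, 0, 1): 1, (1, 1, 0): 0, (0, 1, 1): 0,
             (1, 0, 0): 6, (1, 0, 1): 0, (0, 1, 0): 4}
survey = {(0, 0, 1): 29, (1, 1, 0): 13, (0, 1, 1): 22,
          (1, 0, 0): 53, (1, 0, 1): 8, (0, 1, 0): 44}

for name, counts in (("simulated", simulated), ("survey", survey)):
    q, df = cochran_q(counts)
    # For df = 2 the chi-square upper tail is exactly exp(-q / 2).
    print(f"{name}: Q = {q:.3f}, df = {df}, p = {math.exp(-q / 2):.4f}")
```

With these counts the sketch gives Q ≈ 3.455 (p ≈ 0.1778) and Q ≈ 3.846 (p ≈ 0.1462), in line with the values reported in the empirical sections. The closed-form p-value applies only to k = 3; other values of k need a general chi-square tail function.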

Proposal of a multivariate extension of McNemar's test
Suppose $(Y_{i1}, Y_{i2}, \dots, Y_{ik})$ for i = 1, 2, …, n and k > 2 are identically and independently distributed k-variate binary data vectors. As in McNemar's test, the multivariate binary data could be arranged in a contingency table. In the case of k variables, this will be a k-dimensional contingency table. An example of a three-dimensional contingency table is shown in Table 4.

Source: own elaboration
The hypothesis that the theoretical probabilities in symmetric cells in a multidimensional contingency table are equal will be considered. This hypothesis could be written as follows:

\[ H_0: \pi_{i_1 i_2 \dots i_k} = \pi_{(1 - i_1)(1 - i_2) \dots (1 - i_k)}, \quad i_1, i_2, \dots, i_k \in \{0, 1\}. \]

The idea of the test statistic $M_k$ is based on the test statistic (6) used in the Bowker symmetry test. The main goal is to detect counts asymmetry in a contingency table. The independent permutation of each variable is used to obtain the empirical distribution of the $M_k$ statistic under the null hypothesis. Permutation tests have optimum properties, which supports their practical use (Oden, Wedel, 1975). The recommended number of data permutations is N ≥ 1000 (Pesarin, 2001; Kończak, 2016). The value of the test statistic for the sample data is denoted by $M_{k0}$. The estimated p-value is calculated as follows:

\[ \hat{p} = \frac{\operatorname{card}\{i : M_{ki} \geq M_{k0}\}}{N}, \]

where $M_{ki}$ is the value of the test statistic in the i-th permutation (i = 1, 2, …, N) and $M_{k0}$ is the value of the test statistic for non-permuted data.
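The permutation procedure described above can be sketched in Python. The exact form of the statistic is not reproduced here, so the sketch makes a loud assumption: it uses a Bowker-type sum of (n_x - n_x')^2 / (n_x + n_x') over pairs of symmetric cells x and x' = 1 - x, and it leaves out the pair formed by the all-zero and all-one cells. The function names, the seed handling and the example data are hypothetical.

```python
import random
from collections import Counter


def asymmetry_statistic(rows):
    """Bowker-type asymmetry measure over symmetric cells (assumed form)."""
    counts = Counter(tuple(row) for row in rows)
    k = len(rows[0])
    stat = 0.0
    seen = set()
    for cell in list(counts):
        mirror = tuple(1 - v for v in cell)
        if cell in seen or mirror in seen:
            continue  # each symmetric pair is counted once
        seen.add(cell)
        if sum(cell) in (0, k):
            continue  # assumption: the all-zero/all-one pair is excluded
        total = counts[cell] + counts[mirror]
        stat += (counts[cell] - counts[mirror]) ** 2 / total
    return stat


def permutation_p_value(rows, n_perm=1000, seed=None):
    """Return (observed statistic, estimated p-value)."""
    rng = random.Random(seed)
    observed = asymmetry_statistic(rows)
    columns = [list(col) for col in zip(*rows)]
    hits = 0
    for _ in range(n_perm):
        for col in columns:
            rng.shuffle(col)  # independent permutation of each variable
        if asymmetry_statistic(list(zip(*columns))) >= observed:
            hits += 1
    return observed, hits / n_perm


# Hypothetical data: 30 subjects, 3 binary variables.
rows = [(1, 0, 0)] * 6 + [(0, 1, 0)] * 4 + [(0, 0, 1)] * 1 + [(0, 0, 0)] * 19
observed, p_hat = permutation_p_value(rows, n_perm=1000, seed=1)
print(f"observed statistic = {observed:.3f}, estimated p-value = {p_hat:.3f}")
```

Because the p-value is a Monte Carlo estimate, it changes with the seed and with the number of permutations; fixing the seed only makes a single run reproducible.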

Multivariate extension - empirical verification
Two examples of the use of the proposed multivariate permutation extension of McNemar's test are presented. The first example is based on simulated data and the other is based on real data obtained from the Diagnoza społeczna survey (2019). The results of this test are compared to another extension, Cochran's Q test.
The random sample of size n = 30 of the random vector $Y = (Y_1, Y_2, Y_3)$ is shown in Table 6.

Source: own elaboration

Based on the source data from Table 6, the data could be presented in a three-dimensional contingency table (see Table 7). Cochran's Q test for the data presented in Table 6 leads to the decision that there is not enough evidence to reject the null hypothesis (Q = 3.455, p-value = 0.1778).

Source: own calculations

The symmetric element in the three-dimensional contingency table for the element (0, 0, 1) with count $n_{001} = 1$ is the element (1, 1, 0) with count $n_{110} = 0$. The symmetric element for the element (0, 1, 1) with count $n_{011} = 0$ is the element (1, 0, 0) with count $n_{100} = 6$. The symmetric element for the element (1, 0, 1) with count $n_{101} = 0$ is the element (0, 1, 0) with count $n_{010} = 4$.

The proposed multivariate test uses a data permutation method to estimate the distribution of the test statistic $M_k$. Due to the Monte Carlo procedure, the estimated p-values may differ in subsequent simulations. The proposed test was run N = 1000 times for the considered data. The p-values ranged from 0.000 to 0.009, and their empirical distribution is shown in Figure 1. In each of the N tests, the decision was to reject the null hypothesis.

[Figure 1: empirical distribution of the estimated p-values for the data from Table 6. Source: own elaboration]

This example shows the difference between the Cochran Q test and the proposed multivariate permutation extension of McNemar's test. In the case of counts asymmetry in the three-dimensional contingency table, the proposed test leads to the rejection of the null hypothesis even if the Cochran Q test leads to the decision that there is not enough evidence to reject it.

Empirical verification - real data case
To show the differences between the Cochran Q test and the proposed modification of McNemar's test, the data from the Diagnoza społeczna survey (2019) were used. Diagnoza społeczna is based on panel research. Researchers return to the same households every few years, with the first sample taken in 2000 and the last in 2015. One of the questions concerns escaping into alcohol in order to deal with problems and difficulties; the answers from 2007, 2011 and 2015 were used. The results are shown in Table 8.

Table 8. Three-dimensional contingency table for the following question "I reach for alcohol" (0 - "no", 1 - "yes")

The percentages of "yes" answers in 2007, 2011 and 2015 are equal to 3%, 3% and 2.4%, respectively. This leads to the statement that there is not enough evidence to reject the null hypothesis in the Cochran Q test (Q = 3.846, p-value = 0.1462).
It is visible that the counts in the three-dimensional contingency table are not symmetric. The proposed permutation extension of McNemar's test should detect the asymmetry in these counts. The symmetric element in the three-dimensional contingency table (see Table 8) for the element (0, 0, 1) with count $n_{001} = 29$ is the element (1, 1, 0) with count $n_{110} = 13$. The symmetric element for the element (0, 1, 1) with count $n_{011} = 22$ is the element (1, 0, 0) with count $n_{100} = 53$. The symmetric element for the element (1, 0, 1) with count $n_{101} = 8$ is the element (0, 1, 0) with count $n_{010} = 44$.
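The size of the asymmetry in each pair can be gauged with a short worked calculation. It assumes that every pair of symmetric cells is compared through a Bowker-type term, as in statistic (6); this is an illustration of the imbalance in the counts, not the reported value of the proposed test statistic.

```latex
\begin{align*}
\frac{(n_{001} - n_{110})^2}{n_{001} + n_{110}} &= \frac{(29 - 13)^2}{29 + 13} = \frac{256}{42} \approx 6.10, \\
\frac{(n_{011} - n_{100})^2}{n_{011} + n_{100}} &= \frac{(22 - 53)^2}{22 + 53} = \frac{961}{75} \approx 12.81, \\
\frac{(n_{101} - n_{010})^2}{n_{101} + n_{010}} &= \frac{(8 - 44)^2}{8 + 44} = \frac{1296}{52} \approx 24.92.
\end{align*}
```

Each term, if treated on its own as a McNemar-type statistic, would exceed 3.84, the 5% critical value of the chi-square distribution with one degree of freedom. The proposed test, however, bases its decision on the permutation distribution rather than on this reference value.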
The permutation test was performed N = 1000 times. In each run, there was strong evidence to reject the null hypothesis: the p-values ranged from 0.000 to 0.009. The asymmetry in the three-dimensional contingency table leads to the rejection of the null hypothesis by the proposed multivariate permutation extension of McNemar's test.

Conclusions
A permutation-based multivariate extension of McNemar's test was proposed in the paper. This test could be used to detect dependency among multidimensional dependent binary variables. The test can also be considered a modification of the Cochran Q test: it is similar to the Cochran Q test, but its test statistic is based on the statistic used in the Bowker symmetry test. The main idea of the proposed multivariate permutation test is to detect counts asymmetry in the contingency table. Examples using simulated and real data were presented in the paper. The presented calculations have shown that the proposal leads to effective detection of counts asymmetry in a multidimensional contingency table, which is the special property of the proposed test.