Zhu, Tianming
- Publication (Metadata only): Test of the equality of several high-dimensional covariance matrices: A normal-reference approach
As the field of big data continues to evolve, there is an increasing necessity to evaluate the equality of multiple high-dimensional covariance matrices. Many existing methods rely on approximations to the null distribution of the test statistic or its extreme-value distributions under stringent conditions, leading to outcomes that are either overly permissive or excessively cautious. Consequently, these methods often lack robustness when applied to real-world data, as verifying the required assumptions can be arduous. In response to these challenges, we introduce a novel test statistic utilizing the normal-reference approach. We demonstrate that the null distribution of this test statistic shares the same limiting distribution as a chi-square-type mixture under certain regularity conditions, with the latter reliably estimable from data using the three-cumulant matched chi-square-approximation. Additionally, we establish the asymptotic power of our proposed test. Through comprehensive simulation studies and real data analysis, our proposed test demonstrates superior performance in terms of size control compared to several competing methods.
- Publication (Embargo): Two-sample test for high-dimensional covariance matrices: A normal-reference approach
Testing the equality of the covariance matrices of two high-dimensional samples is a fundamental inference problem in statistics. Several tests have been proposed, but they are either too liberal or too conservative when the required assumptions are not satisfied, which shows that they are not always applicable in real data analysis. To overcome this difficulty, a normal-reference test is proposed and studied in this paper. It is shown that under some regularity conditions and the null hypothesis, the proposed test statistic and a chi-squared-type mixture have the same limiting distribution. It is then justified to approximate the null distribution of the proposed test statistic using that of the chi-squared-type mixture. The distribution of the chi-squared-type mixture can be well approximated using a three-cumulant matched chi-squared-approximation with its approximation parameters consistently estimated from the data. The asymptotic power of the proposed test under a local alternative is also established. Simulation studies and a real data example demonstrate that the proposed test works well in general scenarios and outperforms the existing competitors substantially in terms of size control.
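The three-cumulant matched chi-squared-approximation named in this abstract can be illustrated with a minimal sketch. It assumes a mixture of the form T = sum_r lambda_r * A_r with independent A_r ~ chi-squared(1) and known nonnegative coefficients lambda_r, and approximates T by beta0 + beta1 * chi-squared(d) by matching the first three cumulants. The function name and the example coefficients are hypothetical; the paper estimates the approximation parameters from data, which this sketch does not attempt.

```python
def three_cumulant_match(lambdas):
    """Return (beta0, beta1, d) so that beta0 + beta1 * chi2(d) matches the
    first three cumulants of T = sum_r lambda_r * chi2(1), terms independent.

    Assumption: the coefficients are nonnegative and not all zero.
    """
    # Cumulants of the mixture: mean, variance, third cumulant.
    k1 = sum(lambdas)
    k2 = 2.0 * sum(lam ** 2 for lam in lambdas)
    k3 = 8.0 * sum(lam ** 3 for lam in lambdas)
    if k3 <= 0:
        raise ValueError("need nonnegative coefficients, not all zero")

    # Cumulants of beta0 + beta1 * chi2(d):
    #   mean = beta0 + beta1 * d, variance = 2 * beta1^2 * d,
    #   third cumulant = 8 * beta1^3 * d.
    # Solving the three matching equations gives:
    beta1 = k3 / (4.0 * k2)
    d = 8.0 * k2 ** 3 / k3 ** 2
    beta0 = k1 - 2.0 * k2 ** 2 / k3
    return beta0, beta1, d


# Hypothetical coefficients, for illustration only.
beta0, beta1, d = three_cumulant_match([3.0, 1.5, 0.5])
```

When all coefficients equal 1, the mixture is itself a chi-squared variable and the matching returns beta0 = 0, beta1 = 1, with d equal to the number of terms, which is a quick sanity check on the formulas.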
Scopus© Citations 1
- Publication (Embargo): A global test for heteroscedastic one-way FMANOVA with applications
Multivariate functional data are prevalent in various fields such as biology, climatology, and finance. Motivated by the World Health Data applications, in this study, we propose and examine a global test for assessing the equality of multiple mean functions in multivariate functional data. This test addresses the one-way Functional Multivariate Analysis of Variance (FMANOVA) problem, which is a fundamental issue in the analysis of multivariate functional data. While numerous analysis of variance tests have been proposed and studied for univariate functional data, only a limited number of methods have been developed for the one-way FMANOVA problem. Furthermore, our global test has the ability to handle heteroscedasticity in the unknown covariance function matrices that underlie the multivariate functional data, which is not possible with existing methods. We establish the asymptotic null distribution of the test statistic as a chi-squared-type mixture, which depends on the eigenvalues of the covariance function matrices. To approximate the null distribution, we introduce a Welch–Satterthwaite type chi-squared-approximation with consistent parameter estimation. The proposed test exhibits root-n consistency, meaning it possesses nontrivial power against a local alternative. Additionally, it offers superior computational efficiency compared to several permutation-based tests. Through simulation studies and applications to the World Health Data, we highlight the advantages of our global test.
Scopus© Citations 1
- Publication (Embargo): A fast and accurate kernel-based independence test with applications to high-dimensional and functional data.
Testing the dependency between two random variables is an important inference problem in statistics since many statistical procedures rely on the assumption that the two samples are independent. To test whether two samples are independent, a so-called HSIC (Hilbert–Schmidt Independence Criterion)-based test has been proposed. Its null distribution is approximated either by permutation or a Gamma approximation. In this paper, a new HSIC-based test is proposed. Its asymptotic null and alternative distributions are established. It is shown that the proposed test is root-n consistent. A three-cumulant matched chi-squared-approximation is adopted to approximate the null distribution of the test statistic. By choosing a proper reproducing kernel, the proposed test can be applied to many different types of data including multivariate, high-dimensional, and functional data. Three simulation studies and two real data applications show that in terms of level accuracy, power, and computational cost, the proposed test outperforms several existing tests for multivariate, high-dimensional, and functional data.
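For readers unfamiliar with HSIC, the following is a minimal sketch of the classical biased (V-statistic) HSIC estimate, trace(K H L H) / n^2, where K and L are kernel Gram matrices of the two samples and H is the centering matrix. It is not the new test statistic proposed in the paper. The Gaussian kernel, the fixed bandwidth, the restriction to scalar observations, and all function names are assumptions made for illustration.

```python
import math


def gaussian_gram(values, bandwidth):
    """Gram matrix of a Gaussian kernel on scalar observations."""
    return [
        [math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
        for a in values
    ]


def center(gram):
    """Double-center a symmetric Gram matrix, i.e. compute H K H."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    # For a symmetric matrix the column means equal the row means.
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic_biased(x, y, bandwidth=1.0):
    """Biased HSIC estimate trace(K H L H) / n^2 for paired scalar samples."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    k_centered = center(gaussian_gram(x, bandwidth))
    l_gram = gaussian_gram(y, bandwidth)
    # trace(K H L H) = trace((H K H) L) = sum_ij (H K H)_ij * L_ij, L symmetric.
    total = sum(k_centered[i][j] * l_gram[i][j] for i in range(n) for j in range(n))
    return total / n ** 2
```

Larger values indicate stronger dependence between the paired samples; turning the estimate into a test still requires a null distribution, which the abstract says is obtained by permutation, a Gamma approximation, or the chi-squared-approximation adopted in the paper.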
Scopus© Citations 1
- Publication (Open Access): One-way MANOVA for functional data via Lawley-Hotelling trace test
Functional data arise from various fields of study and there have been numerous works on their analysis. However, most existing methods consider the univariate case, and methodology for multivariate functional data analysis is rather limited. In this article, we consider testing equality of vectors of mean functions for multivariate functional data, i.e., functional one-way multivariate analysis of variance (MANOVA). To this aim, we study the asymptotic null distribution of the functional Lawley–Hotelling trace (FLH) test statistic and approximate it by a Welch–Satterthwaite type χ² approximation. We describe two approaches to estimating the parameters in the χ² approximation ratio-consistently. The resulting FLH test has the correct asymptotic level, is root-n consistent in detecting local alternatives, and is computationally efficient. The numerical performance is examined via some simulation studies and application to three real data examples. The proposed FLH test is comparable with four existing tests based on permutation in terms of size control and power. The major advantage is that it is much faster to compute.
WOS© Citations 2, Scopus© Citations 5
- Publication (Open Access): Two-sample Behrens–Fisher problems for high-dimensional data: A normal reference F-type test
The problem of testing the equality of mean vectors for high-dimensional data has been intensively investigated in the literature. However, most of the existing tests impose strong assumptions on the underlying group covariance matrices which may not be satisfied or can hardly be checked in practice. In this article, an F-type test for two-sample Behrens–Fisher problems for high-dimensional data is proposed and studied. When the two samples are normally distributed and when the null hypothesis is valid, the proposed F-type test statistic is shown to be an F-type mixture, a ratio of two independent χ²-type mixtures. Under some regularity conditions and the null hypothesis, it is shown that the proposed F-type test statistic and the above F-type mixture have the same normal and non-normal limits. It is then justified to approximate the null distribution of the proposed F-type test statistic by that of the F-type mixture, resulting in the so-called normal reference F-type test. Since the F-type mixture is a ratio of two independent χ²-type mixtures, we employ the Welch–Satterthwaite χ²-approximation to the distributions of the numerator and the denominator of the F-type mixture respectively, resulting in an approximation F-distribution whose degrees of freedom can be consistently estimated from the data. The asymptotic power of the proposed F-type test is established. Two simulation studies are conducted and they show that in terms of size control, the proposed F-type test outperforms two existing competitors. The good performance of the proposed F-type test is also illustrated by a COVID-19 data example.
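The Welch–Satterthwaite χ²-approximation used for the numerator and the denominator can be sketched as a two-moment matching. The sketch assumes each mixture has the form sum_r lambda_r * A_r with independent A_r ~ χ²(1) and known positive coefficients, and approximates it by beta * χ²(d). The function name and the example coefficients are hypothetical, and the data-based estimation of the degrees of freedom described in the abstract is not covered.

```python
def welch_satterthwaite(lambdas):
    """Return (beta, d) so that beta * chi2(d) matches the mean and variance
    of sum_r lambda_r * chi2(1) with independent terms.

    Assumption: the coefficients are positive.
    """
    s1 = sum(lambdas)                        # mean of the mixture
    s2 = sum(lam ** 2 for lam in lambdas)    # half the variance of the mixture
    if s1 <= 0 or s2 <= 0:
        raise ValueError("need positive coefficients")
    # Matching beta * d = s1 and 2 * beta^2 * d = 2 * s2 gives:
    beta = s2 / s1
    d = s1 ** 2 / s2
    return beta, d


# Hypothetical coefficients for the numerator and denominator mixtures.
_, d1 = welch_satterthwaite([4.0, 2.0, 1.0])
_, d2 = welch_satterthwaite([1.0] * 20)
# d1 and d2 play the role of the two approximate degrees of freedom.
```

Applying the matching separately to the two mixtures yields a pair of approximate degrees of freedom, which is the idea behind the approximating F-distribution described above.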