Master of Arts (Applied Linguistics)

Recent Submissions

Now showing 1 - 5 of 230
  • Publication
    Restricted
    A corpus-based frame semantic analysis of commercialized listening tests: implications for content validity
    (2024)
    Zhao, Yufan

    Commercialized listening tests can significantly affect test-takers’ lives, as they are often required for purposes such as immigration, employment, and university admission. However, there is a noticeable research gap regarding the content validity of these tests. To address this gap, this study examines the semantic features of the simulated mini-lectures in the listening sections of the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) to explore the content validity of the two tests.

    This study utilized two study corpora: the IELTS corpus with 68 mini-lectures (46,823 words) and the TOEFL corpus with 285 mini-lectures (207,296 words). The reference corpus comprised 59 lectures from the Michigan Corpus of Academic Spoken English (MICASE), totaling 571,354 words. The theoretical framework employed in the study is frame semantics, which holds that words should be understood within the cognitive frames they evoke. The data were submitted to Wmatrix5 for automated semantic tagging, which generated 488 semantic frames. Three comparisons were conducted: IELTS vs. TOEFL, IELTS vs. MICASE lectures, and TOEFL vs. MICASE lectures.
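    Corpus comparisons of this kind typically rank semantic tags by a log-likelihood keyness statistic, which is the measure Wmatrix reports. A minimal sketch of that calculation follows; the tag frequencies are invented for illustration and are not taken from the thesis, though the corpus sizes match those stated above.

```python
import math

def log_likelihood(a, b, c, d):
    """Log-likelihood keyness: a, b are a semantic tag's frequencies in the
    study and reference corpora; c, d are the corpus sizes in words."""
    e1 = c * (a + b) / (c + d)  # expected frequency in corpus 1
    e2 = d * (a + b) / (c + d)  # expected frequency in corpus 2
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

# Hypothetical tag: 120 hits in the 46,823-word IELTS corpus vs.
# 890 hits in the 571,354-word MICASE reference corpus.
score = log_likelihood(120, 890, 46823, 571354)
# A score above 6.63 corresponds to p < .01 (chi-square, 1 df),
# so this tag would count as significantly overused in IELTS.
```

    Tags are then sorted by this score to find the semantic domains that are over- or under-represented in one corpus relative to the other.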

    The results suggest that the mini-lectures of IELTS listening tests cover fewer academic discourse fields than TOEFL mini-lectures. Therefore, it is suggested that IELTS test developers prioritize materials resembling genuine academic lectures over non-specialist texts. TOEFL test developers should extend the coverage of the test content and continue to mirror the academic discourse.

    Furthermore, the IELTS and TOEFL mini-lectures showed similarity to the academic lecture corpus in 78% and 64% of the examined semantic frames, respectively, underlining their relative authenticity. Similarly, a pervasive ‘objectivity’ was evident across all three corpora, with emotion-related categories being sparse. Nevertheless, specific topics, such as politics, war, and intimate and sexual relationships, were notably absent from the test corpora, even though they appeared in the academic lecture corpus.

    Finally, as the simulated mini-lectures in IELTS and TOEFL are significantly shorter than authentic lectures, the positive results supporting the authenticity of the simulated lectures are attenuated. It is necessary to confirm whether these mini-lectures in the listening tests can engage test takers in the same cognitive processes as authentic academic lectures.

  • Publication
    Restricted
    A meta-analysis of the reliability of L2 reading comprehension assessments
    (2024)
    Zhao, Huijun

    Score reliability is one of the major facets of modern validity frameworks in language assessment. Specifically, within the argument-based validation of assessments, reliability functions as indispensable evidence in the cause-effect dynamic linking generalization and higher-level validity inferences. The present study aims to determine the average reliability of L2 reading tests, identify potential moderators of reliability in L2 reading comprehension tests, and explore the power of reliability in predicting the relationship between generalization and explanation inferences.

    A reliability generalization (RG) meta-analysis was conducted to compute the average reliability coefficient of L2 reading comprehension tests and identify the predictor variables that moderate reliability. I examined 1883 individual studies from the Scopus, Web of Science, ERIC, and LLBA databases for possible inclusion and assessed 266 studies as eligible under the inclusion criteria. From these, I extracted 85 Cronbach’s alpha estimates from 60 studies (2002-2023) that reported Cronbach’s alpha properly and coded 28 potential predictors comprising characteristics of the study, the test, and the test-takers. A linear mixed-effects model (LMEM) analysis was subsequently conducted to test the predictive power of the reliability coefficient in the relationship between generalization and explanation inferences. I further examined the impact of Cronbach’s alpha on the correlation between L2 reading comprehension tests and various language proficiency measures. This involved the reliability estimates of reading comprehension tests from 24 studies and 189 correlation data points between the reading comprehension tests and measures of language proficiency categorized into 11 groups.

    The RG meta-analysis found an average reliability of 0.78 (95% CI [0.76, 0.80]), with 40% of Cronbach’s alpha coefficients falling below the lower bound of the confidence interval. A heterogeneity test of Cronbach’s alphas indicated significant heterogeneity across studies (I² = 97.58%), with variance partitioned into sampling error (2.42%), within-study (24.64%), and between-study (72.95%) differences. The number of test items and test-takers’ L1 were found to explain 19.65% and 13.70%, respectively, of the variation in reliability coefficients across the studies. The LMEM analysis showed that alpha coefficients do not predict the correlation between reading comprehension tests and other measures of language proficiency. The implications of this study, its limitations, and directions for future research are further discussed.
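    A pooled alpha with a confidence interval and an I² heterogeneity statistic, as reported above, comes from standard random-effects meta-analytic machinery. A minimal sketch using the DerSimonian-Laird estimator follows; the choice of estimator is an assumption, and the alphas and sampling variances below are invented, not the study's data.

```python
import math

def dersimonian_laird(effects, variances):
    """Random-effects pooling of reliability coefficients with the
    DerSimonian-Laird tau^2 estimator; returns the pooled estimate,
    a 95% CI, and the I^2 heterogeneity percentage."""
    k = len(effects)
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                    # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2

# Illustrative Cronbach's alphas and their sampling variances (invented)
alphas = [0.72, 0.78, 0.81, 0.85, 0.69, 0.90]
variances = [0.0009, 0.0012, 0.0008, 0.0010, 0.0015, 0.0007]
mean_alpha, ci, i2 = dersimonian_laird(alphas, variances)
```

    An I² near 100%, as in the study, means almost all observed variation in alphas reflects genuine between-study differences rather than sampling error, which is what motivates the moderator analysis.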

  • Publication
    Restricted
    What makes articles highly cited? A bibliometric analysis of the top 1% most cited research in applied linguistics (2000-2022)
    (2024)
    Zhang, Sai

    Citation counts, although controversial, have long been used as a yardstick for research evaluation. The normative view regards citing as a means to credit scientific contributions, so the number of citations reflects not only scholarly attention but also research quality. However, the application of social constructivist theory introduces a nuanced perspective, asserting that a variety of factors unrelated to scientific merit can potentially influence citation counts. This dual nature of citation practices has been widely discussed across disciplines, yet it remains an underexplored domain in applied linguistics. This bibliometric study, with a particular interest in highly cited papers, aimed to investigate the citation patterns of applied linguistics research over two decades as well as the complexity that underpins their making.

    The dataset consists of 302 Quartile-1 journal papers that rank in the top 1% by citations in the applied linguistics literature (2000-2022), with detailed bibliometric information collected from Scopus (as of March 2023). Building on the literature, we considered a total of eleven extrinsic factors that are independent of scientific quality but could potentially affect citation counts, covering journal-related, author-related, and article-related features. Descriptive analysis was applied to map the citation landscape of the targeted papers over time as characterized by each factor. After a preliminary look at the bivariate relationships between variables through correlation analysis, multiple linear regression models were adopted to simultaneously examine the extent to which the predictor variables are associated with citation outcomes.

    The results showed that in the best regression model, time-normalized citations were significantly predicted by six factors: journal prestige, accessibility, co-authorship, research performance, title, and subfield of applied linguistics. The remaining five factors, including internationality, geographical origin, funding, references, and methodology, did not exhibit statistical significance. Certain underlying social mechanisms were further unraveled, among which visibility explains the roles of the significant factors in a unified manner, accelerating the recognition and dissemination of research discoveries in the field. The explanatory strength of all predictors together was limited (R²=.208, p<.05), but this was expected, considering that they are extrinsic properties unrelated to scientific merit. There is no doubt that the major citation driver should be the intrinsic quality of research, and the remaining variance may also be explained by other extrinsic features yet to be explored.
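    A multiple regression of this kind, where several extrinsic predictors jointly explain only a modest share of citation variance, can be sketched as follows. The data are synthetic; only the sample size of 302 and the count of six predictors echo the study, and the coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for six extrinsic predictors (e.g. journal prestige,
# accessibility, co-authorship, research performance, title, subfield).
n = 302
X = rng.normal(size=(n, 6))
# Citations depend weakly on the predictors plus large unexplained noise,
# mimicking a model whose extrinsic predictors leave most variance unexplained.
beta = np.array([0.3, 0.2, 0.25, 0.2, -0.15, 0.1])
y = X @ beta + rng.normal(scale=1.0, size=n)

# Ordinary least squares with an intercept column, and the R^2 statistic
X1 = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
resid = y - X1 @ coef
r_squared = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
```

    With noise variance dominating the signal, the fitted R² stays low even though every predictor is "real", which illustrates why a small R² is compatible with several individually significant coefficients.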

    To the best of our knowledge, this is the first study to investigate a host of factors contributing to high citations in applied linguistics research. Implications of the research were also discussed, addressing the needs of both applied linguistics researchers and policymakers. We further suggested a more comprehensive approach to evaluative bibliometrics, integrating both qualitative and quantitative indicators so that the reward system recognizes only good research practices.

  • Publication
    Restricted
    An eye-tracking and neuroimaging investigation of negative wording in an L2 metacognitive awareness questionnaire
    (2024)
    Wang, Xinhe
    This study set out to investigate the effect of cognitive load, related to constructs (the five MALQ constructs) versus wording (negatively worded and non-negated items), on respondents' responses to the Metacognitive Awareness Listening Questionnaire (MALQ), a widely used instrument for assessing self-perceived metacognitive awareness strategies. Respondents’ (N=109) eye movements, measured by an eye-tracker, and brain activation levels, measured by functional near-infrared spectroscopy (fNIRS), were obtained to examine their cognitive load while responding to the MALQ. Distinct gaze behavior and neural activation associated with negatively worded items were identified, indicating increased cognitive load in the presence of such items, and English-as-a-second-language (EL2) participants were found to exhibit higher cognitive load than English-as-a-first-language (EL1) respondents. Additionally, linear mixed-effects models (LMEMs) were employed to test the power of eye behaviors, neural activations, language, constructs, and wording in predicting MALQ results. The results showed that although models based on the two wording conditions accounted for a significant amount of variation in respondents’ MALQ scores, they had relatively lower explanatory power (R²) than models based on the five constructs. The implications of these findings and recommendations for future studies and questionnaire design are discussed.
  • Publication
    Restricted
    Reliability generalization meta-analysis of L2 listening tests
    (2024)
    Shang, Yuxin
    Second language (L2) listening has been widely examined for decades, and L2 listening assessments have received much attention. A reliable L2 listening test is recognized as a valuable tool for researchers and teachers to better assess learners’ listening ability and draw valid inferences. However, a synthesized view of the reliability of L2 listening tests is lacking. To investigate the reliability of L2 listening tests and explore potential factors affecting it, a reliability generalization (RG) meta-analysis was conducted in the present study. Of 260 journal articles screened, 167 were excluded from the analysis because they did not report reliability coefficients. Thus, a total of 122 reliability coefficients of L2 listening tests from 92 published articles were collected and submitted to a linear mixed-effects RG analysis. The papers were coded with a coding scheme consisting of 16 variables classified into three categories: study descriptions, test features, and statistical results. The results showed an average reliability of 0.8181 (CI [0.8034, 0.8329]), with 40% of the reliability estimates falling below the lower bound of the CI. Publication bias and heterogeneity were found in the reliability of L2 listening tests, indicating that low reliability coefficients were likely omitted from some published studies. In addition, two factors predicting the reliability of L2 listening tests were the number of items and test type (standardized vs. researcher-/teacher-designed tests). The study also found that reliability is not a moderator of the relationship between L2 listening scores and theoretically relevant constructs. Reliability induction, i.e., citing reliability coefficients from previous studies, was identified in the reporting of the reliability of L2 listening tests. Implications for researchers and teachers were discussed.