Now showing 1 - 10 of 77
  • Publication
    Metadata only
    Investigating the visual content of a commercialized academic listening test: Implications for validity
    (Elsevier, 2024)
    Hou, Zhuohan; Azrifah Zakaria
    As the incorporation of visual modes in listening tests gradually gains traction in second language (L2) assessment, the inclusion of such visuals raises questions about the role of visual modes in meaning-making during listening and about test validity. In this study, we investigated the visual features of the International English Language Testing System (IELTS) listening test by applying a social semiotic multimodal framework. Our corpus comprised 300 visuals from 256 academic listening testlets published between 1996 and 2022. Unlike past social semiotic multimodal analyses, which relied on qualitative methods, our study adopted a series of visualizations and quantitative statistical analyses of frequency and dispersion measures, using the general linear model to examine the visuals from a social semiotic multimodal perspective. The results revealed significant variation in the visual structures of the testlets. Based on a post-hoc analysis, we propose recommendations for further research on multimodal materials in listening assessment and discuss the implications of the observed variation for the validity of the IELTS listening test. This study may be considered the first attempt to examine L2 listening assessment from a corpus-based social semiotic multimodal perspective and may inspire further investigations of multimodal listening.
  • Publication
    Open Access
    Investigating the construct validity of the MELAB Listening Test through the Rasch analysis and correlated uniqueness modeling
    (University of Michigan, 2010)
    This article evaluates the construct validity of the Michigan English Language Assessment Battery (MELAB) listening test by investigating the underpinning structure of the test (or construct map), possible construct underrepresentation, and construct-irrelevant threats. Data for the study, from the administration of a form of the MELAB listening test to 916 international test takers, were provided by the English Language Institute of the University of Michigan. The researchers sought evidence of construct validity primarily through correlated uniqueness models (CUM) and the Rasch model. A five-factor CUM was fitted to the data but did not display acceptable measurement properties. The researchers then evaluated a three-trait confirmatory factor analysis (CFA) model that fitted the data sufficiently. This fitting model was further evaluated with parceled items, which supported the proposed CFA model. Accordingly, the underlying structure of the test was mapped out as three factors: the ability to understand minimal-context stimuli, short interactions, and long-stretch discourse. The researchers propose this model as the tentative construct map of this form of the test. To investigate construct underrepresentation and construct-irrelevant threats, the Rasch model was used. This analysis showed that the test was relatively easy for the sample and that the listening ability of several higher-ability test takers was not sufficiently tested by the items. This is interpreted as a sign of test ceiling effects and minor construct underrepresentation, although the researchers argue that the test is intended to distinguish students who have the minimum listening ability to enter a program from those who do not. The Rasch model provided support for the absence of construct-irrelevant threats by showing the adherence of the data to unidimensionality and local independence, and the good measurement properties of the items. The final assessment of the observed results showed that the generated evidence supported the construct validity of the test.
  • Publication
    Open Access
    A synthetic review of cognitive load in distance interpreting: Toward an explanatory model
    (Frontiers, 2022)
    Zhu, Xuelian
    Distance Interpreting (DI) is a form of technology-mediated interpreting which has gained traction due to the high demand for multilingual conferences, live-streaming programs, and public service sectors. The current study synthesized the DI literature to build a framework that represents the construct and measurement of cognitive load in DI. Two major areas of research were identified, i.e., causal factors and methods of measuring cognitive load. A number of causal factors that can induce change in cognitive load in DI were identified and reviewed. These included factors derived from tasks (e.g., mode of presentation), environment (e.g., booth type), and interpreters (e.g., technology awareness). In addition, four methods for measuring cognitive load in DI were identified and surveyed: subjective methods, performance methods, analytical methods, and psycho-physiological methods. Together, the causal factors and measurement methods provide a multifarious approach to delineating and quantifying cognitive load in DI. This multidimensional framework can be applied as a tool for pedagogical design in interpreting programs at both the undergraduate and graduate levels. It can also provide implications for other fields of educational psychology and language learning and assessment.
    WOS© Citations: 2 | Scopus© Citations: 5
  • Publication
    Open Access
    A scientometric analysis of applied linguistics research (1970-2022): Methodology and future directions
    (De Gruyter, 2024)
    Azrifah Zakaria
    In this study, we provide a scientometric analysis of 43,685 studies published in 51 quartile-1 journals in the field of applied linguistics (1970–2022). Scientometric analysis uses citation records to quantitatively compute networks of cited works and map out how published works have been cited. We adapted a multi-stage scientometric method consisting of database identification, dataset generation, document co-citation analysis, research cluster identification, and cluster characterization. A number of major research clusters were identified, and a high degree of interconnectedness in theoretical base was observed between the clusters. The pre-2000 publications had a conspicuous focus on theories derived from language use, which might be said to have set the tone for the maturation of the field. By contrast, the clusters that emerged from the 2000s showed more specificity and granularity in focus and scope, suggesting the beginning of a research era with more specialized directions. Despite this trend, we identified influential publications that received several spikes in citations in different eras, indicating their continued temporal and thematic relevance across clusters. In addition, we found evidence of inter-cluster cross-pollination. We discuss how each cluster should be characterized in terms of its knowledge base and knowledge front: highly cited works form the knowledge base of a cluster, while novel works form its knowledge fronts. Future directions for the field are highlighted.
    Scopus© Citations: 8
  • Publication
    Open Access
    An automatized semantic analysis of two large-scale listening tests: A corpus-based study
    (Sage, 2024)
    Zhao, Yufan
    This study examined the semantic features of the simulated mini-lectures in the listening sections of the International English Language Testing System (IELTS) and the Test of English as a Foreign Language (TOEFL) based on automatized semantic analysis to explore the content validity of the two tests. Two study corpora were utilized: the IELTS corpus with 56 mini-lectures (38,944 words) and the TOEFL corpus with 285 mini-lectures (207,296 words). The reference corpus comprised 59 lectures from the Michigan Corpus of Academic Spoken English (MICASE), totaling 571,354 words. The corpora were submitted to automatized semantic tagging using Wmatrix5. Three comparisons were conducted: IELTS versus TOEFL, IELTS versus MICASE lectures, and TOEFL versus MICASE lectures. The results suggest that IELTS and TOEFL mini-lectures shared 78% and 64% of the same semantic features as MICASE, respectively, supporting their relative content validity. Nevertheless, specific semantic categories, such as politics, war, and intimate and sexual relationships, were notably absent from the test corpora, even though they appeared in the academic lecture corpus. In addition, causal connectors were frequently used in both tests, while the mini-lectures of the IELTS listening tests covered fewer academic discourse fields than the TOEFL mini-lectures. Implications for content validity are discussed.
  • Publication
    Metadata only
    A multidimensional analysis of a high-stakes English listening test: A corpus-based approach
    (MDPI, 2024)
    Tao, Xuelian
    The Gaokao, also known as China’s national college entrance exam, is a high-stakes exam for nearly all Chinese students. English has long been one of the three most important subjects, and listening plays an important role in the Gaokao English test. However, relatively little research has been conducted on local versions of the Gaokao’s English listening tests. This study analyzed the linguistic features and corresponding functional dimensions of the three text types in the Gaokao’s listening test, investigating whether the papers used in three major regions of China were differentiated in terms of the co-occurrence patterns of lexicogrammatical features and dimensions of the transcripts. A corpus consisting of 170 sets of test papers (134,913 words) covering 31 provinces and cities from 2000 to 2022 was analyzed using a multidimensional analysis, from which six exclusive dimensions were extracted. The results showed meaningful differences across short conversations, long conversations, and monologues with regard to the scores on the six dimensions; the regions also differed significantly on three dimensions, namely Syntactic and Clausal Complexity, Oral versus Literate Discourse, and Procedural Discourse, while Time Period was not associated with any differences. Implications for language teaching and assessment are discussed.

    Scopus© Citations: 1
  • Publication
    Open Access
    A neurocognitive investigation of test methods and gender effects in listening assessment
    (Taylor & Francis, 2020)
    Ng, Li Ying; Foo, Stacy; Esposito, Gianluca
    This is the first study to investigate the effects of test methods (while-listening performance and post-listening performance) and gender on measured listening ability and brain activation under test conditions. Functional near-infrared spectroscopy (fNIRS) was used to examine three brain regions associated with listening comprehension: the inferior frontal gyrus and posterior middle temporal gyrus, which subserve bottom-up processing in comprehension, and the dorsomedial prefrontal cortex, which mediates top-down processing. A Rasch model reliability analysis showed that listeners were homogeneous in their listening ability. Additionally, there were no significant differences in test scores across test methods and genders. The fNIRS data, however, revealed significantly different activation of the investigated brain regions across test methods, genders, and listening abilities. Together, these findings indicated that the listening test was not sensitive to differences in the neurocognitive processes underlying listening comprehension under test conditions. The implications of these findings for assessing listening and suggestions for future research are discussed.
    WOS© Citations: 9 | Scopus© Citations: 19
  • Publication
    Metadata only
    Does the peer review mode make a difference? An exploratory look at undergraduates’ performances and preferences in a writing course
    (Elsevier, 2024)
    Hsieh, Yi-Chen; Leong, Alvin Ping; Lin, Yu-Ju
    The importance of peer review practice in writing courses has been strongly supported by pedagogical research. Adopting a mixed-methods approach, this study investigated three peer review modes in an undergraduate academic writing course through the lens of students’ writing performances and perceptions. The three modes are (i) face-to-face peer review (F2F), (ii) anonymous computer-mediated peer review (CMPR), and (iii) blended peer review (a blend of F2F and anonymous CMPR). Three classes enrolled in an academic writing course participated in this study. Students’ assignments were collected to analyze their writing performances. Focus group discussions (FGDs) were administered to investigate students’ perceptions of the peer review modes, including the perceived usefulness of the feedback and the review processes. The findings show that the students’ writing performances improved significantly after the peer review session in all three modes, with the anonymous CMPR and blended modes proving more effective than the F2F mode. The participants generally preferred the blended mode, which addresses the limitations of both F2F and anonymous CMPR by leveraging the merits of each. We propose the use of the blended peer review mode to accommodate different learning needs and maximize the effectiveness of peer review practice.

  • Publication
    Metadata only
    The log-linear cognitive diagnosis modeling (LCDM) in second language listening assessment
    (Routledge, 2019)
    Toprak, Tugba Elif
    This chapter focuses on log-linear cognitive diagnosis modeling (LCDM), a general framework that allows researchers to flexibly specify a large family of diagnostic classification models (DCMs). Although the LCDM has important advantages over other core DCMs, it remains relatively under-researched in language assessment. This chapter first provides language testers with an introduction to the theoretical and statistical underpinnings of the LCDM. Next, it demonstrates how the LCDM can be applied to a high-stakes listening comprehension test. Finally, it presents guidelines on how to estimate and interpret the model, item, and examinee parameters with readily available software.