Development and validation of computerised adaptive tests with and without performance diagnosis
Type
Thesis
Abstract
This research has focused on an application, in actual classroom practice, of computerised adaptive testing (CAT) in assessing learning achievement. The concept of CAT, an application of item response theory, has become feasible with the advent of high-speed computers. Basically, the testing in CAT is tailored to an examinee's ability level: the computer first obtains an estimate of the examinee's ability level based on his or her responses to initial test items, and then makes subsequent item-selection decisions that are most appropriate for measuring his/her performance level.
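The estimate-then-select cycle described above can be illustrated with a minimal sketch. It is not the procedure used in the thesis: the two-parameter logistic (2PL) model, the maximum-information selection rule, the expected a posteriori (EAP) estimate with a standard normal prior, the stopping thresholds, and all function names here are assumptions chosen only for illustration.

```python
import math

# Ability grid from -4.0 to 4.0 in steps of 0.1 (assumed range).
GRID = [g / 10.0 for g in range(-40, 41)]


def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def estimate_ability(responses):
    """EAP ability estimate and posterior SD on GRID.

    responses: list of (a, b, correct) tuples for the items answered so far.
    A standard normal prior is assumed.
    """
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalised N(0, 1) prior
        for a, b, correct in responses:
            p = p_correct(theta, a, b)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(bank, answer, max_items=20, se_target=0.4):
    """Administer items adaptively.

    bank: dict mapping item id -> (a, b) calibrated parameters.
    answer: callable taking an item id and returning True/False.
    Stops when the posterior SD falls to se_target, max_items is reached,
    or the bank is exhausted (hypothetical stopping rule).
    """
    remaining = dict(bank)
    responses, administered = [], []
    theta, se = 0.0, float("inf")  # start at the prior mean
    while remaining and len(administered) < max_items and se > se_target:
        # Pick the unused item that is most informative at the current estimate.
        item_id = max(remaining, key=lambda i: item_information(theta, *remaining[i]))
        a, b = remaining.pop(item_id)
        correct = answer(item_id)
        responses.append((a, b, correct))
        administered.append(item_id)
        theta, se = estimate_ability(responses)
    return theta, se, administered
```

Because the loop ends as soon as the ability estimate is precise enough, different examinees can receive tests of different lengths, which is the sense in which an adaptive test is variable-length.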
Based on the literature review, investigative directions for this study were conceptualised in terms of the development and validation of computerised adaptive tests with and without the provision of performance diagnosis. 'Item response theory' and the unitary validity concept formed the theoretical basis of the research.
This research on CAT was thus partitioned into three closely related studies. Study 1 focused on the calibration of a biology test-item bank under item response theory (IRT). It also examined relevant practical issues encountered in the development of a computerised item bank. In Study 2, tests were constructed automatically from the calibrated pool of test items. Whilst the IRT-calibrated item bank was used to construct both conventional fixed-length and adaptive variable-length tests, the functioning of the item pool was investigated particularly for computerised adaptive testing (CAT). Study 3 validated the computerised adaptive tests using students in a Singapore school. The effects of student factors - i.e. gender, ability, computer familiarity (viz. computer ownership, computing experience and frequency of computer use), and affective characteristics (viz. attitudes towards computers and towards science learning) - on three measures of CAT outcomes, namely test performance, attitude towards the test administration, and benefits derived from the testing, were investigated.
Results of this study indicated comparability of test scores obtained with the computerised adaptive tests and those obtained with a parallel paper-and-pencil test. In brief, ability was the best predictor of test performance in biology using CAT. Overall, the students were positive towards the CAT administration, but they were most bothered by their inability to review test items during the testing. Students in different subgroups defined by gender, computer familiarity, and affective characteristics differed in their reactions to certain aspects of CAT. However, these differences did not appear to affect their test performance on CAT. Satisfactory benefits were derived from the use of CAT. These included an average 10% reduction in test length and 50% reduction in test time, and the testing showed adequate adaptability to individual students' ability levels. Students who took the CAT with performance diagnosis reported that the feedback and diagnosis facility was useful.
With these insights into the use of CAT in a secondary school in Singapore, implications for future implementation and use of computerised adaptive tests in school-based testing are discussed. The limitations of the research are also explained and suggestions for further research indicated.
Date Issued
1998
Call Number
LB3051 Che
Date Submitted
1998