Complementing Human Judgment of Essays Written by English Language Learners with E-Rater[R] Scoring
ARTICLE

Language Testing Volume 27, Number 3, ISSN 0265-5322

Abstract

E-rater[R] is an automated essay scoring system that uses natural language processing techniques to extract features from essays and to statistically model human holistic ratings. Educational Testing Service has investigated the use of e-rater, in conjunction with human ratings, to score one of the two writing tasks on the TOEFL-iBT[R] writing section. In this article, we describe the TOEFL iBT writing section and an e-rater model proposed to provide one of two ratings for the Independent writing task. We discuss how the evidence for a process that uses both human and e-rater scoring is relevant to four components in a validity argument: (a) Evaluation--observations of performance on the writing task are scored to provide evidence of targeted writing skills; (b) Generalization--scores on the writing task provide estimates of expected scores over relevant parallel versions of the task and across raters; (c) Extrapolation--expected scores on the writing task are consistent with other measures of writing ability; and (d) Utilization--scores on the writing task are useful in educational contexts. Finally, we propose directions for future research that will strengthen the case for using complementary methods of scoring to improve the assessment of EFL writing. (Contains 1 figure and 5 tables.)

Citation

Enright, M.K. & Quinlan, T. (2010). Complementing Human Judgment of Essays Written by English Language Learners with E-Rater[R] Scoring. Language Testing, 27(3), 317-334.

This record was imported from ERIC on April 19, 2013.

ERIC is sponsored by the Institute of Education Sciences (IES) of the U.S. Department of Education.

Copyright for this record is held by the content creator. For more details see ERIC's copyright policy.

Keywords