An Overview of Current Research on Automated Essay Grading
Salvatore Valenti, Francesca Neri, Alessandro Cucchiarelli, Università Politecnica delle Marche, Italy
JITE-Research Volume 2, Number 1, ISSN 1539-3585 Publisher: Informing Science Institute
Essays are considered by many researchers as the most useful tool to assess learning outcomes, implying the ability to recall, organize and integrate ideas, the ability to express oneself in writing, and the ability to supply, rather than merely identify, interpretation and application of data. It is in the measurement of such outcomes, corresponding to the evaluation and synthesis levels of Bloom’s (1956) taxonomy, that essay questions serve their most useful purpose. One of the difficulties of grading essays is the perceived subjectivity of the grading process. Many researchers claim that the subjective nature of essay assessment leads to variation in the grades awarded by different human assessors, which students perceive as a great source of unfairness. This issue may be addressed through the adoption of automated assessment tools for essays. A system for automated assessment would at least be consistent in the way it scores essays, and enormous cost and time savings could be achieved if the system can be shown to award grades within the range of those awarded by human assessors. This paper presents an overview of current approaches to the automated assessment of free text answers. Ten systems, currently available either as commercial systems or as the result of research in this field, are discussed: Project Essay Grade (PEG), Intelligent Essay Assessor (IEA), Educational Testing Service I, Electronic Essay Rater (E-Rater), C-Rater, BETSY, Intelligent Essay Marking System, SEAR, Paperless School free text Marking Engine and Automark. For each system, the general structure and the performance claimed by the authors are described. In the last section of the paper, an attempt is made to compare the performance of the described systems. The most common problems encountered in the research on automated essay grading are the absence both of a good standard to calibrate human marks and of a clear set of rules for selecting master texts.
A first conclusion is that, in order to truly compare the performance of the systems, some sort of unified measure should be defined. Furthermore, the lack of a standard data collection is identified. Both of these problems represent interesting issues for further research in this field.
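As an illustration of the criterion mentioned above (grades falling "within the range of those awarded by human assessors"), the following is a minimal sketch in Python. It is not a measure defined by the paper or by any of the ten systems; the function names and the sample scores are hypothetical and chosen only for this example.

```python
def within_human_range(machine_score, human_scores):
    """Return True if the machine score lies between the lowest
    and the highest score given by the human assessors."""
    return min(human_scores) <= machine_score <= max(human_scores)


def range_agreement_rate(machine_scores, human_score_sets):
    """Fraction of essays whose machine score falls inside the human range."""
    if len(machine_scores) != len(human_score_sets):
        raise ValueError("one set of human scores is needed per essay")
    if not machine_scores:
        raise ValueError("at least one essay is required")
    hits = sum(
        within_human_range(m, h)
        for m, h in zip(machine_scores, human_score_sets)
    )
    return hits / len(machine_scores)


# Made-up scores for four essays, each marked by two human assessors.
machine = [4, 3, 5, 2]
humans = [(4, 5), (2, 3), (3, 4), (2, 2)]
rate = range_agreement_rate(machine, humans)
print(f"machine score within human range: {rate:.0%}")
```

A rate computed this way depends on how many human assessors mark each essay and on the grading scale, which is one reason results reported for different systems are hard to compare without a shared measure and a shared data collection.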
Valenti, S., Neri, F. & Cucchiarelli, A. (2003). An Overview of Current Research on Automated Essay Grading. Journal of Information Technology Education: Research, 2(1), 319-330. Informing Science Institute.