Machine Scoring Fails the Test, Says New Statement from the National Council of Teachers of English


Using Computers to Grade Writing Shortchanges Students.

Computers cannot recognize or judge those elements that we most associate with good writing, asserts the new position statement, Machine Scoring Fails the Test, from the National Council of Teachers of English. Computers can’t assess elements such as logic, clarity, accuracy, effective appeals to audience, different forms of organization, quality of evidence, humor or irony, or effective uses of repetition.

When computers are used to evaluate student writing, students are denied the chance to have their writing assessed for anything but a few limited surface features, and teachers are compelled to ignore what is most important in writing instruction in order to teach what is least important. High-stakes assessments that rely on computers to grade student writing are particularly destructive.

Chris Anson, chair of the task force that created the position statement, notes, “We teach students that writing is an invention by humans for human communication. But when we have machines evaluate their writing, we tell students that we think so little about what they’re communicating that we aren’t willing to read and respond to it ourselves.” In addition, task force member Peggy O’Neill explains in an accompanying video why the limited kind of writing that machines can score is not the complex writing expected in college and the workplace.

Computers don’t “make the grade” in the assessment of writing for many reasons, including:

  • Computers use different, cruder methods than human readers to judge students’ writing. For example, some systems gauge the sophistication of vocabulary by measuring the average length of words and how often the words are used, or they gauge the development of ideas by counting the length and number of sentences per paragraph (a simplified illustration of such surface metrics appears after this list).
  • Computers get progressively worse at scoring as the length of the writing increases. As a result, test makers must design shorter writing tasks that don’t represent the range and variety of writing assignments needed to prepare students for the more complex writing they will encounter in college and the workplace.
  • Computer scoring favors the most objective "surface" features of writing (grammar, spelling, punctuation), but problems in these areas are often created by the testing conditions and are the most easily rectified in normal writing conditions when there is time to revise and edit. Also, privileging surface features disproportionately penalizes nonnative speakers of English who may be on a developmental path that machine scoring fails to recognize.
  • Computer scoring discriminates against students who are less familiar with using technology to write or complete tests. Further, machine scoring disadvantages school districts that lack funds to provide technology tools for every student and skews technology acquisition toward devices needed to meet testing requirements.
  • Computer scoring systems can be "gamed" because they are poor at working with human language, further weakening the validity of their assessments and separating students not on the basis of writing ability but on whether they know and can use machine-tricking strategies.
  • Computers only score as well as humans when the humans are trained to score like the computers (for example, being told not to make judgments on the accuracy of information).
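
For readers curious what these surface measurements look like in practice, the sketch below is a minimal, hypothetical illustration; it does not reproduce any vendor’s actual scoring algorithm, and the surface_metrics function is invented for this example. It computes only the kinds of crude proxies described above: average word length as a stand-in for vocabulary sophistication, and sentence and paragraph counts as a stand-in for the development of ideas. Nothing in it can assess logic, accuracy, audience, or evidence.

```python
# Hypothetical sketch of the crude "surface" metrics described above.
# This is illustrative only, not any real scoring engine's method.

import re

def surface_metrics(essay: str) -> dict:
    # Paragraphs separated by blank lines; words and sentences found
    # with simple pattern matching.
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]

    return {
        # "Vocabulary sophistication" approximated by average word length.
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
        # "Development of ideas" approximated by sentences per paragraph.
        "sentences_per_paragraph": len(sentences) / max(len(paragraphs), 1),
        "word_count": len(words),
    }

if __name__ == "__main__":
    sample = ("Writing is an invention by humans for human communication.\n\n"
              "It rewards clarity, evidence, and attention to audience.")
    print(surface_metrics(sample))
```

Two essays of similar length but very different quality would receive nearly identical numbers from measures like these, which is precisely the limitation the statement criticizes.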

There are alternatives to machine scoring, especially high-quality assessment systems that support teaching and learning, such as portfolio assessment; teacher assessment teams; balanced assessment plans that involve more localized (classroom- and district-based) assessments designed and administered by classroom teachers; and "audit" teams of teachers, teacher educators, and writing specialists who visit districts to review samples of student work and the curriculum that has yielded them.

Machine Scoring Fails the Test includes an extensive, annotated bibliography of research on human and machine scoring of writing.

# # #

The National Council of Teachers of English, with over 35,000 members and subscribers worldwide, is dedicated to improving the teaching and learning of English and the language arts at all levels of education.

Media Contact: Millie Davis