GLEU: Automatic Evaluation of Sentence-Level Fluency

Dr Mark Dras
Senior Lecturer
Centre for Language Technology
Macquarie University

(A joint HAIL/SALS-SIG Seminar)

Tuesday 24th April 2007 at 11am

 

Abstract

In evaluating the output of language technology applications (machine translation, natural language generation, summarisation), automatic evaluation techniques generally conflate faithfulness to source content with fluency of the resulting text. We have developed an automatic evaluation metric that estimates fluency alone. We first examine the use of parser outputs as metrics and show that they correlate with human judgements of the fluency of generated text. We then build a machine learner based on these parser outputs; it performs better than the individual parser output metrics, approaching a lower bound on human performance. We have also investigated how the metric is affected by the different language models used to generate sentences, and show that while individual parser metrics can be 'fooled' depending on the generation method, the machine learner provides a consistent estimator of fluency.

This is joint work with Andy Mutton and Stephen Wan.

Short resume

Mark Dras is a senior lecturer in the Department of Computing at Macquarie University. He's interested in machine translation, paraphrase, and formal grammars, notably (courtesy of a postdoc at the University of Pennsylvania) Tree Adjoining Grammar.
