GLEU: Automatic Evaluation of Sentence-Level Fluency
Dr Mark Dras
Tuesday 24th April 2007 at 11am

Abstract
In evaluating the output of language technology applications -- MT, natural language generation, summarisation -- automatic evaluation techniques generally conflate measurement of faithfulness to source content with fluency of the resulting text. We have developed an automatic evaluation metric to estimate fluency alone, by examining the use of parser outputs as metrics, and we show that they correlate with human judgements of generated text fluency. From this we have developed a machine learner based on these, which performs better than the individual parser output metrics, approaching a lower bound on human performance. We have also investigated the effect on the metric of different language models for generating sentences, and show that while individual parser metrics can be 'fooled' depending on generation method, the machine learner provides a consistent estimator of fluency. This is joint work with Andy Mutton and Stephen Wan.

Short resume
Mark Dras is a senior lecturer in the Department of Computing at Macquarie University. He's interested in machine translation, paraphrase, and formal grammars, notably (courtesy of a postdoc at the University of Pennsylvania) Tree Adjoining Grammar.