Probabilistic Parsing of a Context-Sensitive Grammar ... OR
... How to get your Search Engine to answer a simple question
Simon Williams
CSIRO-MIS Business Intelligence Group
Tuesday 02 October at 11am
Abstract
Good arguments have been made that context-free grammars
(CFG) do not possess the generative power to capture the full complexity
of natural languages. However, CFGs are still widely used for natural
language processing (NLP) applications; sometimes because the application
does not demand the sophistication of a more powerful grammar, but
often because there is a perception that more powerful grammars are unwieldy
and slow in application. Viewed from the perspective of another
fairly recent trend in NLP, the introduction of probabilistic language
models, this perception turns out to be unfair. We will describe
our favourite mildly context-sensitive grammar, Combinatory Categorial
Grammar (CCG), and its advantages from a linguistic perspective.
Further, we will give a probabilistic model for CCG that leads to
an efficient and flexible parser. Another advantage of the probabilistic
approach is that it allows not only the integration of additional
information (linguistic or otherwise) to improve parser performance
but also the integration of the parser into larger probabilistic
systems where it can be useful. We will show how this might work
for a proposed question-answering system.
Short resume
Simon Williams learnt mathematics, physics and an unhealthy
combination of the two in Adelaide and Oxford. He taught mathematics
for a while at Adelaide before penury drove him to apply his skills
to radar signal processing at the Defence Science and Technology
Organisation, where he learnt statistical signal processing. He
moved to CSIRO where he did his usual trick of choosing to do the
hardest thing possible and started working in natural language processing.
He wasn't wrong.