Probabilistic Parsing of a Context-Sensitive Grammar ... OR ... How to get your Search Engine to answer a simple question

Simon Williams
CSIRO-MIS Business Intelligence Group

Tuesday 02 October at 11am

Abstract

Good arguments have been made that context-free grammars (CFGs) do not possess the generative power to capture the full complexity of natural languages. However, CFGs are still widely used for natural language processing (NLP) applications: sometimes because the application does not demand the sophistication of a more powerful grammar, but often because there is a perception that more powerful grammars are unwieldy and slow in application. Viewed from the perspective of another fairly recent trend in NLP, the introduction of probabilistic language models, this perception turns out to be unfair. We will describe our favourite mildly context-sensitive grammar, Combinatory Categorial Grammar (CCG), and its advantages from a linguistic perspective. Further, we will give a probabilistic model for CCG that leads to an efficient and flexible parser. Another advantage of the probabilistic approach is that it allows not only the integration of additional information (linguistic or otherwise) to improve parser performance, but also the integration of the parser into larger probabilistic systems where it can be useful. We will show how this might work for a proposed question-answering system.

Short resume

Simon Williams learnt mathematics, physics and an unhealthy combination of the two in Adelaide and Oxford. He taught mathematics for a while at Adelaide before penury drove him to apply his skills to radar signal processing at the Defence Science and Technology Organisation, where he learnt statistical signal processing. He moved to CSIRO where he did his usual trick of choosing to do the hardest thing possible and started working in natural language processing. He wasn't wrong.
