Stephen Wan

Research Scientist, ICT

Contact

Cnr Vimiera and Pembroke Roads
Marsfield NSW 2122

Tel: 61 2 93724703
Stephen.Wan@csiro.au

Biography

My role in the Search, Language and Social Media group is to research and implement applications utilising advances in Natural Language Processing (aka Computational Linguistics). Specifically, my research is focused on finding those actionable nuggets of information (documents, paragraphs, sentences, facts, keywords) within a document that help a user perform his or her task. As such, I am interested in Automatic Text Summarization and Information Extraction, and their use in delivering contextualised information for the user, taking into account the user's interests, tasks and information query needs. We are currently developing information-based services in the e-government and academic domains.

I generally use tools and methods developed in the related fields of Information Retrieval, Statistical Text Generation (specifically, statistical syntax models and language modelling), and Supervised and Unsupervised Machine Learning in order to build such applications.

I am also interested in the following application spaces: Government 2.0; creative uses of NLP; digital libraries; NLP for the web; social media technologies; and improved methods for scholarly research and publishing.

I also recently completed a PhD in the area of Automatic Text Summarisation (June 2010). For more information about my thesis, please visit http://www.ict.csiro.au/staff/stephen.wan/phd (see links below).

In previous research, I explored the use of summarisation techniques in the email domain. This work was conducted while I was visiting Columbia University to work on Email Thread Summarization.

Current projects:
Social media monitoring for improved government services: http://www.csiro.au/partnerships/HSDRA.html
IBES: A summarisation extension for Firefox
CSIBS: The Citation-Sensitive In-Browser Summariser

At CSIRO, I worked on the following projects (see links below for project pages):
CARRS: Computer Automated Road Report System
TIDDLER: Tailored Information Delivery

Academic Qualifications

2010 Doctor of Philosophy, Macquarie University
2001 Bachelor of Science (Honours, First Class), Macquarie University
1999 Bachelor of Science, Adelaide University
1999 Bachelor of Arts, Adelaide University

Recent Professional Experience

2006-Present Research Scientist, CSIRO
2005-2006 Editorial Assistant, Journal of Computational Linguistics
2004 Academic Tutor, Macquarie University
2003 Visiting Researcher, Columbia University
2000-2002 Research Engineer
1999 Computer Science Practical Supervisor, Macquarie University
1998-1999 Summer Vacation Scholar, CSIRO
1998 Computer Science Practical Supervisor, Adelaide University
1997 Summer Vacation Researcher, Microsoft Research Institute, Macquarie University

Achievements & Awards

2010 Endeavour Award
2010 Best Reviewer Award EMNLP 2010
2009 Finalist Elsevier Grand Challenge
2008 Semi-Finalist Elsevier Grand Challenge
2006 Outstanding Reviewer Certificate, COLING/ACL 2006
2006 Award for Best Student Presentation, Australasian Language Technology Workshop 2006
2002-2005 CSIRO Mathematical and Information Sciences Top-Up Scholarship
2002-2005 Research Award for Areas of Centres of Excellence, Macquarie University

Other Highlights

2009 Selected for the CSIRO Talent Management Program

Summary of Science & Technical Output

Books/Book chapters 3
Journal 2
Refereed Conference/Workshop 27
Technical/Client Reports 2
Invited Presentations 0
Patents 0

Grants

2010 Endeavour Award
2009 Finalist Elsevier Grand Challenge
2002-2005 Research Award for Areas of Centres of Excellence, Macquarie University

Student Supervision

2011 James McHugh, Andrew Gall
2010 James McHugh, Andrew Gall, Vishal Juneja, Zac Turnbull
2009 James McHugh, Andrew Gall
2009 Julien Blondeau
2009 Michael Muthukrishna
2009 James McHugh
2008 Julien Blondeau

Science Citizenship

2011 Co-organised the Monolingual Text-to-Text Generation Workshop at ACL 2011 (https://sites.google.com/site/texttotext2011/) with Katja Filippova.
2010 EMNLP 2010 Best Reviewer Award
2010 Program Committee for ACL, NAACL, SIGIR, AIRS, EMNLP 2010
2009 ACM Computing Survey reviewer
2009 Program Committee for UCNLG 2009
2009 Program Committee for SIGIR 2009
2009 Program Committee for ADCS 2009
2009 Program Committee for ACL-IJCNLP 2009
2008 Program Committee for COLING 2008
2008 Program Committee for EMNLP 2008
2008 Program Committee for ALTA 2008
2007 Program Committee for ACL'07 Student Research Workshop
2006 Program Committee for ACL'06 Student Research Workshop
2005 Co-Chair of the International Natural Language Generation Conference (INLG 2006)
2005 Co-Chair of the ACL'05 Student Research Workshop
2005 SIGGEN Mailing List Maintainer
2005 Reviewer for Journal of Computational Linguistics
2004 Student member on the board for the Australasian Language Technology Association (ALTA)
2004 Student member on the board for the Special Interest Group on Text Generation (SIGGEN)
2004 Program Committee for ACL SRW '04
2004 Co-chair of the Australasian Language Technology Workshop (ALTW 2004)
2003 Organiser of the Language Technology Seminar and SALS-SIG Series
2000-2001 Co-organiser of the HAIL seminars
2000 Maintained the OZCHI 2000 conference website.  

Top 10 Publications

Publication details
Stephen Wan, Cécile Paris, and Robert Dale (2009) "Supporting Browsing-Specific Information Needs: Introducing the Citation-Sensitive In-Browser Summariser". To appear in the Journal of Web Semantics.
Stephen Wan, Mark Dras, Robert Dale and Cécile Paris (2009) Improving Grammaticality in Statistical Sentence Generation: Introducing a Dependency Spanning Tree Algorithm with an Argument Satisfaction Model. In the Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL 2009). Athens, Greece.
Stephen Wan, Robert Dale, Mark Dras and Cécile Paris (2008) Seed and Grow: Augmenting Statistically Generated Summary Sentences using Schematic Word Patterns. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2008), 543-552. Hawaii, USA.
Andrew Mutton, Mark Dras, Stephen Wan and Robert Dale (2007) GLEU: Automatic Evaluation of Sentence-Level Fluency. In the Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics. Prague, Czech Republic.
Stephen Wan, Mark Dras, Robert Dale and Cécile Paris (2006) Using Dependency-based Features to Take the "Para-farce" out of Paraphrase. Proceedings of the Australasian Language Technology Workshop 2006 (ALTW 2006), 131-138. Sydney, Australia.
Stephen Wan, Mark Dras, Robert Dale and Cécile Paris (2005) Towards statistical paraphrase generation: preliminary evaluations of grammaticality. In the Proceedings of The 3rd International Workshop on Paraphrasing (IWP2005) at IJCNLP 2005. Jeju Island, South Korea.
Stephen Wan and Kathleen McKeown. (2004) Generating Overview Summaries of Ongoing Email Thread Discussions. In Proceedings of COLING 2004, the 20th International Conference on Computational Linguistics. Geneva, Switzerland.
Cécile Paris, Stephen Wan, Ross Wilkinson and Mingfang Wu. (2001). Generating Personal Travel Guides – and who wants them? In Proceedings of the International Conference on User Modelling (UM2001); Sonthofen, Germany, July 13-18, 2001.
Ross Wilkinson, ShiJian Lu, Francois Paradis, Cécile Paris, Stephen Wan, and Mingfang Wu. (2000) Generating Personal Travel Guides from Discourse Plans. In Proceedings of International Conference on Adaptive Hypermedia and Adaptive Web-based Systems. Trento, Italy, August, 2000.
Wan, Stephen and Verspoor, Cornelia Maria. (1998). Automatic English-Chinese name transliteration for development of multilingual resources. In Proceedings of COLING-ACL'98, the joint meeting of 17th International Conference on Computational Linguistics and the 36th Annual Meeting of the Association for Computational Linguistics, Montreal, Canada.

Links