John A. Carroll: Research projects

Research projects

Current projects

Promoting the release and utilization of semantically analyzed Japanese-English recipe data: in collaboration with research groups at NII (Tokyo) and at Tokyo and Kyoto Universities, Japan (NII Collaborative Research Strategy grant).
Partner in the international DELPH-IN (Deep Linguistic Processing with HPSG) collaboration.

Past projects

Centre for the Improvement of Population Health through E-health Research (CIPHER): contributing expertise on extracting information from the free text of electronic patient records, in order to maximise the utility of routine health-related data on the UK population (MRC and other UK government and charity funders).
Mobile Commerce as a Service: developing a universal mobile transaction platform using SMS text messaging as the user interface so that people can more easily purchase and gift goods and services (Innovate UK / Technology Strategy Board).
Patient Records Enhancement Programme (PREP): developing methodologies for understanding and exploiting free text to enhance the utility of primary care electronic patient records (Wellcome Trust).
Ranking Word Senses for Disambiguation: Models and Applications: devising and testing ways to estimate the frequency distributions of senses of words from raw (unannotated) text (EPSRC).
COGENT: Controlled Generation of Text: investigating the non-determinism in wide-coverage generation of text, and developing reflective techniques for controlling it effectively (EPSRC).
MEANING - Developing Multilingual Web-scale Language Technologies: collecting and analysing language data from the Web on a large scale, building more comprehensive multilingual lexical knowledge bases to support improved word sense disambiguation (EU 5th Framework).
DEEP THOUGHT - Hybrid Deep and Shallow Methods for Knowledge-Intensive Information Extraction: devising ways to combine robust shallow methods for language analysis with deep semantic processing, and demonstrating the approach in business intelligence, automated email processing and document production support applications (EU 5th Framework).
Robust Accurate Statistical Parsing (RASP): integrating and extending several strands of research on robust statistical parsing and automated grammar and lexicon induction, to produce a new parsing toolkit (EPSRC).
PSET: Practical Simplification of English Text: building a computer system that takes in English newspaper text and outputs a simplified version with broadly similar meaning; the intended users are people with aphasia that impairs their comprehension of written English (EPSRC).
LEXSYS: Analysis of Naturally-occurring English Text with Stochastic Lexicalized Grammars: developing a robust wide-coverage parsing system for English text, exploiting a combination of statistical information from corpora, inheritance hierarchies for imposing structure on language data, and lexicalised grammars (EPSRC).
SPARKLE (Shallow PARsing and Knowledge extraction for Language Engineering): developing shallow parsing technology in four European languages together with corpus-based lexical acquisition techniques, and deploying parsers in multilingual information retrieval and speech dialogue systems (EU 4th Framework).
Robust Analysis of Unrestricted English Text: developing and integrating knowledge-based and statistical processing techniques for accurate and robust analysis of unrestricted English text (EPSRC).