Academic responsibilities: TEACHING: 2 modules, 8 BSc and MSc research projects, 16 personal tutees; RESEARCH: AI4AI Artificial Intelligence for Arabic and Islamic research leader; SUPERVISION: 10 PhDs, 1 RA; SUPPORT: MSc Tutor; PLUS: writing publications and EPSRC, AHRC, CRUK and EU project proposals; Peer Review College for AHRC, EPSRC, QNRF, HK-RGC and H2020; PhD external examiner; journal and conference reviewer; visitor, SUSTECH Sudan; Arabic-L and Language@Leeds research network admin

Employment history and previous academic responsibilities:

1996-2013 Senior Lecturer, U of Leeds: Research, Teaching, NLP research group leader

1994-1996 Director, JISC CALAS NTI: Management of JISC UK R&D programme

1991-1994 National Coordinator, JISC KBSI: Management of JISC UK R&D programme

1990-1991 SERC Advanced Research Fellow: Research with SERC/MoD network

1984-1990 Lecturer, U of Leeds: Teaching, Research, developing NLP within AI team

1981-1984 Research Associate, U of Lancaster: Research, developing grant applications


Academic career breaks from Leeds University:

2013-7: visiting scholar, SUSTECH Sudan Uni of Science and Technology, Khartoum  

2012-3: visiting scholar, Linguistics and English Language Dept, University of Manchester

2011-3: visiting scholar, Computing Dept, King Saud University, Riyadh, Saudi Arabia

2000: visiting scholar, Dept of Computer Science, University of Sheffield  

1994-6: JISC Computer Analysis of Language And Speech New Technologies Initiative

1995: visiting scholar, Dept of Computer Applications, Dublin City University

1994: visiting scholar, Inst for Language & Artificial Intelligence, Tilburg University  

1991-4: National Coordinator, JISC Knowledge Based Systems Initiative

1990-1991: Science and Engineering Research Council, Advanced Research Fellowship

1989: visiting scholar, Max Planck Institute and Dept of Language, Nijmegen University


Research Grants:

Artificial Intelligence Networking (Leeds University Academic Development Fund, 2017)

Natural Language Processing Working Together With Arabic And Islamic Studies (Engineering and Physical Sciences Research Council, 2013-2015)

e-Health GATEway to the Clouds (Joint Information Systems Committee, 2012)

Web-based resources for Islamic Studies (Higher Education Academy, 2011)

Detecting Terrorist Activities: Making Sense (Engineering and Physical Sciences Research Council, 2010-2013)

Copying-Identifier for Biomedical Science Reports (Higher Education Funding Council, 2002)

ISLE - Interactive spoken language education (European Union, 1998-2001)

Information extraction from air traffic control (Visionair International, 1994-1997)

Mapping between corpus annotation schemes (Nuffield Foundation, 1994)

Computer based resources in KBS & SALT (Higher Education Funding Council, 1994-1996)

Knowledge based timetable information (Higher Education Funding Council, 1994-1995)

AMALGAM corpus annotation (Science and Engineering Research Council, 1993-1997)

Computer analysis of language and speech (Higher Education Funding Council, 1993-1995)

Pilot project for CCALAS (Leeds University Research Support, 1993-1994)

A speech-oriented stochastic parser (Ministry of Defence, 1992-1993)

Knowledge-Based Systems Initiative (Joint Information Systems Committee, 1991-1994)

Neural network parsers trained with realistic corpora (British Telecom, 1990-1991)

Advanced Fellowship in IT (Science and Engineering Research Council, 1990-1991)

ABC Arabic By Computer (British Society for Middle Eastern Studies, 1989-1990)

COMMUNAL Convivial man machine understanding (Ministry of Defence, 1987-1989)

A simulated annealing parser for authentic English (Ministry of Defence, 1986-1989)


PhD research supervision:

 Z Ahmed, due 2020, Hafs and Warsh Quran Arabic corpora

 J Alasamari, due 2020, Arabic and English verb systems in the Quran corpus

 N Ahmad, due 2019, Malay natural language processing for retrieval from Malay Qur'an

 A Alshutayri, due 2019, Arabic dialects classification

 L Aldhubayi, due 2019, Corpus-based methods to extend Arabic WordNet

 A Alghamdi, due 2018, Arabic corpus-informed lexicon of formulaic sequences

 A Alosaimi, due 2018, Ensemble morphosyntactic analyser for classical Arabic

 M Alqahtani, due 2018, Quranic Arabic semantic search tool based on ontology of concepts

 S Alrehaili, due 2017, Ontology concept and relation learning from the Qur’an corpus

 J Jaafar, due 2017, Data mining and machine learning to predict acute coronary illness

 A Alfaifi, 2016, Arabic learner corpus and a system for Arabic error annotation

 S Danso, 2016, Text analytics to predict time and cause of death from verbal autopsies

 K Dukes, 2013, Statistical parsing by machine learning from a Classical Arabic treebank

 S Hina, 2013, Semantic tagging of medical narratives using SNOMED CT

 A Muhammad, 2012, Annotation of conceptual co-reference and similarity in the Qur'an

 J Washtell, 2011, Distributional meaning in text: distance, expectation, and composition

 M Sawalha, 2011, Open-source resources and standards for Arabic word structure analysis

 C Brierley, 2011, Prosody resources for automated phrase break prediction

 O Nancarrow, 2011, Tagging of adverbs in modern English corpora

 F Su, 2010, Computational modelling of word sense sentiment

 N Abbas, 2009, Qurany 'Search for a Concept' tool and website

 A Roberts, 2008, Unsupervised machine learning for grammatical inference

 D Elliott, 2006, Corpus-based machine translation evaluation via automated error detection

 B AbuShawar, 2005, A corpus based approach to generalise a chatbot system

 L Al-Sulaiti, 2004, Designing and developing a Corpus of Contemporary Arabic

 T Oba, 2003, HTK to analyse prosody in the ISLE corpus of spoken learner's English

 J Elliott, 2003, Natural language learning for SETI Search for Extra-Terrestrial Intelligence

 X Duan, 2001, Lexical Semantic Association Between Web Documents

 G Churcher, 1997, Speech dialogue analysis using linguistic knowledge

 G Demetriou, 1997, Lexical semantics for human-computer speech communication

 M Schillo, 1996, Working while driving: corpus-based in-car personal assistant

 X Zhang, 1996, MIRTH Chinese and English search engine: a multilingual retrieval tool

 C Souter, 1996, A corpus-trained parser for systemic-functional syntax

 A Bull, 1996, Aerobic dance exercise: a corpus-based computational linguistics approach

 U Jost, 1994, Probabilistic language modeling for speech recognition

 S Arnfield, 1994, Prosody and syntax in corpus-based analysis of spoken English

 J Hughes, 1993, Automatically acquiring a classification of words

 T O’Donoghue, 1993, Reversing the process of generation in Systemic Grammar


Research interests

My research speciality is corpus linguistics and text analytics: applying Machine Learning and Data Mining to a CORPUS of text (in English, Arabic, or other languages) to detect "interesting" and "useful" features or patterns. For example:

  • Detecting terrorist activities, by analysing documents from terrorist suspects to highlight suspicious parts of the text.
  • Analysis and text-mining of the Quran, to find links and patterns in the Quran verses and chapters which are of interest to Islamic and Religious Studies scholars.
  • Detecting cause of death from verbal autopsy text documents describing the circumstances of the death.

I lead research projects, including: the Leeds contribution to the EPSRC/ESRC/CPNI-funded £2.2M research project IDEAS Factory - Detecting Terrorist Activities: Making Sense, as featured in The Engineer; the JISC-funded project e-Health GATEway to the Clouds, to enable secure research access to anonymized e-Health patient records; the HEA-funded project to collect web-based resources for teaching and research in Islamic Studies; and the EPSRC-funded project Natural Language Processing Working Together With Arabic And Islamic Studies.

I work with the Language @ Leeds research group (language@lists.leeds.ac.uk) and the Artificial Intelligence Research Group. See Guidelines on writing a successful PhD proposal, and my video clip.
I am proud of the 40 or so research students I have supervised, who went on to work in a range of careers, including web search, banking and finance, text analytics, translation and language consulting, online news, voice-to-text, the search for extra-terrestrial intelligence, and, of course, as University academics! See feedback from current and past students.




Qualifications

  • PhD (Leeds) Corpus linguistics and language learning, 2008 http://etheses.whiterose.ac.uk/7504/
  • BA 1st Class (Lancaster) Computing and Linguistics, 1981

Professional memberships

  • EPSRC (Engineering and Physical Sciences Research Council) Peer Review College
  • AHRC (Arts and Humanities Research Council) Peer Review College
  • ACL (Association for Computational Linguistics) member

Student education

My teaching includes: supervision of PhD, MSc and BSc student research projects; lecturing to Computing MSc and BSc students; and personal tutorials for MSc students. I have also taught at international conference tutorials and summer schools, for example the Aston Summer School on Corpus Linguistics.

I have taught a range of subjects, including Data Mining, Text Analytics, Language, Natural Language Processing, Computational Linguistics, Knowledge Management and Adaptive Systems, Technologies for Knowledge Management, Corpus Linguistics, Object Oriented Programming, Professional Development, Knowledge Based Systems, Knowledge Discovery, Perceptual Systems, Future Directions in Distributed Multimedia Systems, Introductory Programming, Numerical Methods, Artificial Intelligence, Databases and Information Systems, and Arabic Natural Language Processing.

Research groups and institutes

  • Applied Computing in Biology, Medicine and Health
  • Artificial Intelligence