I am a Senior Research Associate at the University of Cambridge and a member of the Natural Language and Information Processing Group. My main focus is the ALTA project, an industry collaboration with Cambridge English that aims to create new technologies for language learning and assessment. I'm also a visiting lecturer at Tartu University, where I teach a course on machine learning and language modelling.
Before that, I was a member of the Research team at SwiftKey, where we focused on potential future technologies, thinking a few years ahead. One of my main projects was the neural network language model for text prediction, and my work resulted in a pending patent application on high-efficiency neural networks on mobile devices. SwiftKey has since been acquired by Microsoft.
I received my PhD as a member of Churchill College, Cambridge, with a thesis titled "Minimally supervised dependency-based methods for natural language processing", under the supervision of Professor Ted Briscoe. Before that, in 2008-2009, I completed an MPhil course in the Computer Lab called Computer Speech, Text and Internet Technology. The topic of my dissertation was "Adaptive Interactive Information Extraction".
I also studied for three years at Tallinn University of Technology, where I received my bachelor's degree with the thesis "Creating a Model for Audiovisual Speech in Estonian".
Download: My CV
My main areas of interest include:
- neural networks and deep learning models
- educational applications and automated assessment
- distributional and compositional semantics
- unsupervised and semi-supervised learning
- text mining
- information extraction
- (bio)medical applications of NLP
My hobbies include piano, guitar, dancing (salsa, rock'n'roll, ballroom/latin), nature, geocaching, good movies and good books.
See the separate page that lists my projects.
E-mail: marek ät marekrei dot com
Attending to characters in neural sequence labeling models. In Proceedings of the 26th International Conference on Computational Linguistics (COLING-2016). Osaka, Japan, 2016
A Joint Model for Word Embedding and Word Morphology. In Proceedings of the 1st Workshop on Representation Learning for NLP (RepL4NLP-2016). Berlin, Germany, 2016
Compositional Sequence Labeling Models for Error Detection in Learner Writing. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL-2016). Berlin, Germany, 2016
Automatic Text Scoring Using Neural Networks. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL-2016). Berlin, Germany, 2016
Sentence Similarity Measures for Fine-Grained Estimation of Topical Relevance in Learner Essays. In Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications (BEA). San Diego, United States, 2016
Online Representation Learning in Recurrent Neural Language Models. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP). Lisbon, Portugal, 2015
Looking for hyponyms in vector space. In Proceedings of the Eighteenth Conference on Computational Natural Language Learning (CoNLL-14). Baltimore, Maryland, United States, 2014
Parser lexicalisation through self-learning. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2013). Atlanta, United States, 2013
Minimally supervised dependency-based methods for natural language processing. PhD thesis, University of Cambridge. Cambridge, United Kingdom, 2013
Unsupervised Entailment Detection between Dependency Graph Fragments. In Proceedings of the 2011 Workshop on Biomedical Natural Language Processing (BioNLP-11). Portland, United States, 2011
Intelligent Information Access from Scientific Papers. In Current Challenges in Patent Information Retrieval, edited by Mihai Lupu, Katja Mayer, John Tait and Anthony J. Trippe. Springer, Dordrecht, 2011
Combining Manual Rules and Supervised Learning for Hedge Cue and Scope Detection. The 14th Conference on Natural Language Learning (CoNLL-10). Uppsala, Sweden, 2010
Adaptive Interactive Information Extraction. MPhil thesis, Computer Laboratory, University of Cambridge, 2009