Marek Rei

I am a researcher in Machine Learning and Natural Language Processing. My work focuses on improving machine learning architectures for representation learning, transfer learning, autoregressive modeling and multi-task optimization. Most of my research is applied to Natural Language Understanding and to tasks that benefit from capturing the semantics of text, such as structured prediction, language modeling, grammatical error detection, sentiment analysis and text classification.

I am a Senior Lecturer of Machine Learning at Imperial College London and a Visiting Researcher at the University of Cambridge. I am an AI Advisor for Gotya Technologies and Esgrid Technologies. I also provide consultancy services through Perception Labs.

Previously, I worked in the Research team at SwiftKey, where we developed experimental technologies for language modeling and natural language processing. One of my main projects was the neural network language model for text prediction. SwiftKey has since been acquired by Microsoft.

I received my PhD from the University of Cambridge as a member of Churchill College, with a thesis on Minimally supervised dependency-based methods for natural language processing, under the supervision of Professor Ted Briscoe. Before that, in 2008-2009, I completed the MPhil course in Computer Speech, Text and Internet Technology at the Computer Lab. The topic of my dissertation was Adaptive Interactive Information Extraction.

I also studied for three years at Tallinn University of Technology, where I received my bachelor's degree with the thesis Creating a Model for Audiovisual Speech in Estonian.

Download: My CV

Research interests

My main areas of interest include:

transfer learning
representation learning
neural networks and deep learning models
large language models
unsupervised and semi-supervised learning
educational applications
(bio)medical applications of NLP

I provide consultancy services in the areas of machine learning and natural language processing. If you are interested, feel free to get in touch.

Contact

E-mail: marek@marekrei.com

Publications

Predicting cell type-specific epigenomic profiles accounting for distal genetic effects [bioRxiv] Alan E Murphy, William Beardall, Marek Rei, Mike Phuycharoen and Nathan G Skene bioRxiv, 2024

Prompting open-source and commercial language models for grammatical error correction of English learner text [arXiv] Christopher Davis, Andrew Caines, Øistein Andersen, Shiva Taslimipoor, Helen Yannakoudakis, Zheng Yuan, Christopher Bryant, Marek Rei and Paula Buttery ArXiv, 2024

When and Why Does Bias Mitigation Work? [pdf] Abhilasha Ravichander, Joe Stacey and Marek Rei In Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023) Singapore, 2023

Did the Neurons Read your Book? Document-level Membership Inference for Large Language Models [arXiv] Matthieu Meeus, Shubham Jain, Marek Rei and Yves-Alexandre de Montjoye ArXiv, 2023

The alignment of companies' sustainability behavior and emissions with global climate targets [link] Simone Cenci, Matteo Burato, Marek Rei and Maurizio Zollo Nature Communications, 2023

On the application of Large Language Models for language teaching and assessment technology [arXiv] Andrew Caines, Luca Benedetto, Shiva Taslimipoor, Christopher Davis, Yuan Gao, Øistein Andersen, Zheng Yuan, Mark Elliott, Russell Moore, Christopher Bryant, Marek Rei, Helen Yannakoudakis, Andrew Mullooly, Diane Nicholls and Paula Buttery In Proceedings of the AIED 2023 Workshop on Empowering Education with LLMs (AIED LLM 2023) Tokyo, Japan, 2023

Logical Reasoning for Natural Language Inference Using Generated Facts as Atoms [arXiv] Joe Stacey, Pasquale Minervini, Haim Dubossarsky, Oana-Maria Camburu and Marek Rei ArXiv, 2023

Improving Robustness in Knowledge Distillation Using Domain-Targeted Data Augmentation [arXiv] Joe Stacey and Marek Rei ArXiv, 2023

Modelling Temporal Document Sequences for Clinical ICD Coding [pdf] Clarence Ng, Diogo Santos and Marek Rei In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023) Dubrovnik, Croatia, 2023

An Extended Sequence Tagging Vocabulary for Grammatical Error Correction [pdf] Stuart Mesham, Christopher Bryant, Marek Rei and Zheng Yuan In Findings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023) Dubrovnik, Croatia, 2023

Finding the Needle in a Haystack: Unsupervised Rationale Extraction from Long Text Classifiers [arXiv] Kamil Bujel, Andrew Caines, Helen Yannakoudakis and Marek Rei ArXiv, 2023

Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models [pdf] [arXiv] Joe Stacey, Pasquale Minervini, Haim Dubossarsky and Marek Rei In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022) Abu Dhabi, United Arab Emirates, 2022

Multimodal Conversation Modelling for Topic Derailment Detection [pdf] Zhenhao Li, Marek Rei and Lucia Specia In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022) Abu Dhabi, United Arab Emirates, 2022

Control Prefixes for Parameter-Efficient Text Generation [pdf] [arXiv] Jordan Clive, Kris Cao, Marek Rei In Proceedings of the Second Workshop on Generation, Evaluation & Metrics (GEM 2022) Abu Dhabi, United Arab Emirates, 2022

Probing for targeted syntactic knowledge through grammatical error detection [pdf] Christopher Davis, Christopher Bryant, Andrew Caines, Marek Rei and Paula Buttery In Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL 2022) Abu Dhabi, United Arab Emirates, 2022

An Analysis of Corporate Sustainability Behaviour Through the Lens of Empirical Fitness Landscapes [pre-print] Simone Cenci, Marek Rei and Maurizio Zollo SSRN pre-print under review, 2022

Business sustainability behaviour and alignment with climate targets [pre-print] Simone Cenci, Matteo Burato, Marek Rei and Maurizio Zollo Research Square pre-print under review, 2022

Guiding Visual Question Generation [pdf] [arXiv] [video] Nihir Vedd, Zixu Wang, Marek Rei, Yishu Miao, Lucia Specia In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT 2022) Seattle, Washington, USA, 2022

Memorisation versus Generalisation in Pre-trained Language Models [pdf] [arXiv] [poster] [video] Michael Tänzer, Sebastian Ruder and Marek Rei In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022) Dublin, Ireland, 2022

Supervising Model Attention with Human Explanations for Robust Natural Language Inference [arXiv] [poster] Joe Stacey, Yonatan Belinkov and Marek Rei In Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI 2022) *Acceptance rate: 15%* Virtual Conference, 2022

Visual Cues and Error Correction for Translation Robustness [pdf] [arXiv] [video] [code] Zhenhao Li, Marek Rei, Lucia Specia In Findings of the Association for Computational Linguistics: EMNLP 2021

GiBERT: Introducing Linguistic Knowledge into BERT through a Lightweight Gated Injection Method [pdf] [arXiv] [video] [code] Nicole Peinelt, Marek Rei and Maria Liakata In Findings of the Association for Computational Linguistics: EMNLP 2021

Zero-shot Sequence Labeling for Transformer-based Sentence Classifiers [pdf] [arXiv] [video] [code] Kamil Bujel, Helen Yannakoudakis and Marek Rei In Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP 2021) Virtual Conference, 2021

How Metaphors Impact Political Discourse: A Large-Scale Topic-Agnostic Study Using Neural Metaphor Detection [pdf] [arXiv] Vinodkumar Prabhakaran, Marek Rei and Ekaterina Shutova In Proceedings of the 15th AAAI International Conference on Web and Social Media (ICWSM 2021) *Acceptance rate: 21.4%* Atlanta, USA, 2021

Contextual Sentence Classification: Detecting Sustainability Initiatives in Company Reports [arXiv] Dan Hirlea, Christopher Bryant, Maurizio Zollo, Marek Rei ArXiv, 2021

Seeing Both the Forest and the Trees: Multi-head Attention for Joint Classification on Different Compositional Levels [pdf] [arXiv] [code] Miruna Pislar and Marek Rei In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020) Virtual Conference, 2020

Grammatical error detection in transcriptions of spoken English [pdf] [dataset] Andrew Caines, Christian Bentz, Kate Knill, Marek Rei and Paula Buttery In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020) Virtual Conference, 2020

Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses [pdf] [arXiv] [dataset] [video] Simon Flachs, Ophélie Lacroix, Helen Yannakoudakis, Marek Rei and Anders Søgaard In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020) *Acceptance rate: 22.4%* Virtual Conference, 2020

Multidirectional Associative Optimization of Function-Specific Word Representations [pdf] [arXiv] [video] [code] Daniela Gerz, Ivan Vulić, Marek Rei, Roi Reichart and Anna Korhonen In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020) *Acceptance rate: 25.2%* Seattle, USA, 2020

Verbal Multiword Expressions for Identification of Metaphor [pdf] [video] Omid Rohanian, Marek Rei, Shiva Taslimipoor and Le An Ha In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020) *Acceptance rate: 25.2%* Seattle, USA, 2020

Bad Form: Comparing Context-Based and Form-Based Few-Shot Learning in Distributional Semantic Models [pdf] [arXiv] Jeroen Van Hautte, Guy Emerson and Marek Rei In Proceedings of the Second Workshop on Deep Learning for Low-Resource NLP (DeepLo 2019) Hong Kong, China, 2019

Modelling the interplay of metaphor and emotion through multitask learning [pdf] Verna Dankers, Marek Rei, Martha Lewis and Ekaterina Shutova In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019) Hong Kong, China, 2019

Semi-Supervised Bootstrapping of Dialogue State Trackers for Task-Oriented Modelling [pdf] Bo-Hsiang Tseng, Marek Rei, Paweł Budzianowski, Richard Turner, Bill Byrne and Anna Korhonen In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019) Hong Kong, China, 2019

Neural and FST-based approaches to grammatical error correction [pdf] Zheng Yuan, Felix Stahlberg, Marek Rei, Bill Byrne and Helen Yannakoudakis In Proceedings of the 14th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2019) Florence, Italy, 2019

Context is Key: Grammatical Error Detection with Contextual Word Representations [pdf] [arXiv] [code] Samuel Bell, Helen Yannakoudakis and Marek Rei In Proceedings of the 14th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2019) Florence, Italy, 2019

CAMsterdam at SemEval-2019 Task 6: Neural and graph-based feature extraction for the identification of offensive tweets [pdf] Guy Aglionby, Christopher Davis, Pushkar Mishra, Andrew Caines, Helen Yannakoudakis, Marek Rei, Ekaterina Shutova and Paula Buttery In Proceedings of the International Workshop on Semantic Evaluation 2019 (SemEval 2019) Minneapolis, USA, 2019

A Simple and Robust Approach to Detecting Subject-Verb Agreement Errors [pdf] Simon Flachs, Ophélie Lacroix, Marek Rei, Helen Yannakoudakis and Anders Søgaard In Proceedings of the 17th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019) Minneapolis, USA, 2019

Jointly Learning to Label Sentences and Tokens [pdf] [arXiv] [code] [slides] [poster] Marek Rei and Anders Søgaard In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019) *Acceptance rate: 16.2%* Honolulu, USA, 2019

Advance Prediction of Ventricular Tachyarrhythmias using Patient Metadata and Multi-Task Networks [arXiv] [poster] Marek Rei, Josh Oppenheimer and Marek Sirendi In Proceedings of the NeurIPS Workshop on Machine Learning for Health (ML4H 2018) Montreal, Canada, 2018

Sequence classification with human attention [pdf] [code] Maria Barrett, Joachim Bingel, Nora Hollenstein, Marek Rei and Anders Søgaard In Proceedings of the SIGNLL Conference on Computational Natural Language Learning (CoNLL 2018) *Special award for the best paper on research inspired by human language learning and processing* Brussels, Belgium, 2018

Scoring Lexical Entailment with a Supervised Directional Similarity Network [pdf] [arXiv] [code] [slides] [video] Marek Rei, Daniela Gerz and Ivan Vulić In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018) *Acceptance rate: 24.9%* Melbourne, Australia, 2018

Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens [pdf] [arXiv] [slides] [video] [code] Marek Rei and Anders Søgaard In Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018) New Orleans, United States, 2018

Variable Typing: Assigning Meaning to Variables in Mathematical Text [pdf] [slides] [video] Yiannos Stathopoulos, Simon Baker, Marek Rei and Simone Teufel In Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018) New Orleans, United States, 2018

Neural Multi-task Learning in Automated Assessment [arXiv] Ronan Cummins, Marek Rei arXiv:1801.06830, 2018

Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection [pdf] [arXiv] [slides] [code] [video] Marek Rei, Luana Bulat, Douwe Kiela and Ekaterina Shutova In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP-2017) *Acceptance rate: 26%* Copenhagen, Denmark, 2017

Neural Sequence-Labelling Models for Grammatical Error Correction [pdf] Helen Yannakoudakis, Marek Rei, Øistein E. Andersen and Zheng Yuan In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP-2017) *Acceptance rate: 26%* Copenhagen, Denmark, 2017

Artificial Error Generation with Machine Translation and Syntactic Patterns [pdf] [arXiv] Marek Rei, Mariano Felice, Zheng Yuan and Ted Briscoe In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA-2017) Copenhagen, Denmark, 2017

Auxiliary Objectives for Neural Error Detection Models [pdf] [arXiv] [slides] Marek Rei and Helen Yannakoudakis In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA-2017) Copenhagen, Denmark, 2017

An Error-Oriented Approach to Word Embedding Pre-Training [pdf] [arXiv] Youmna Farag, Marek Rei and Ted Briscoe In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA-2017) Copenhagen, Denmark, 2017

Detecting Off-topic Responses to Visual Prompts [pdf] [arXiv] [poster] Marek Rei In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA-2017) Copenhagen, Denmark, 2017

Semi-supervised Multitask Learning for Sequence Labeling [pdf] [arXiv] [poster] [code] Marek Rei In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL-2017) Vancouver, Canada, 2017

Attending to characters in neural sequence labeling models [pdf] [arXiv] [poster] [code] Marek Rei, Gamal K.O. Crichton and Sampo Pyysalo In Proceedings of the 26th International Conference on Computational Linguistics (COLING-2016) Osaka, Japan, 2016

A Joint Model for Word Embedding and Word Morphology [pdf] [arXiv] Kris Cao and Marek Rei In Proceedings of the 1st Workshop on Representation Learning for NLP (RepL4NLP-2016) Berlin, Germany, 2016

Compositional Sequence Labeling Models for Error Detection in Learner Writing [pdf] [arXiv] [poster] [code] Marek Rei and Helen Yannakoudakis In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL-2016) Berlin, Germany, 2016

Automatic Text Scoring Using Neural Networks [pdf] [arXiv] [poster] Dimitrios Alikaniotis, Helen Yannakoudakis and Marek Rei In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL-2016) Berlin, Germany, 2016

Sentence Similarity Measures for Fine-Grained Estimation of Topical Relevance in Learner Essays [pdf] [arXiv] [weights] [code] [slides] Marek Rei and Ronan Cummins In Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications (BEA) San Diego, United States, 2016

Online Representation Learning in Recurrent Neural Language Models [pdf] [poster] Marek Rei In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP) Lisbon, Portugal, 2015

Looking for hyponyms in vector space [pdf] [vectorsets] [dataset] [slides] [poster] Marek Rei and Ted Briscoe In Proceedings of the Eighteenth Conference on Computational Natural Language Learning (CoNLL-14) Baltimore, Maryland, United States, 2014

Parser lexicalisation through self-learning [pdf] [poster] Marek Rei and Ted Briscoe In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2013). Atlanta, United States, 2013

Minimally supervised dependency-based methods for natural language processing [TR pdf] Marek Rei PhD thesis, University of Cambridge Cambridge, United Kingdom, 2013

Unsupervised Entailment Detection between Dependency Graph Fragments [pdf] [dataset] Marek Rei and Ted Briscoe In Proceedings of the 2011 Workshop on Biomedical Natural Language Processing (BioNLP-11). Portland, United States, 2011

Intelligent Information Access from Scientific Papers [Springer] [draft pdf] Ted Briscoe, Karl Harrison, Andrew Naish-Guzman, Andy Parker, Marek Rei, Advaith Siddharthan, David Sinclair, Mark Slater and Rebecca Watson Current Challenges in Patent Information Retrieval, edited by Mihai Lupu, Katja Mayer, John Tait and Anthony J. Trippe. Springer, Dordrecht, 2011

Combining Manual Rules and Supervised Learning for Hedge Cue and Scope Detection [pdf] Marek Rei and Ted Briscoe The 14th Conference on Natural Language Learning (CoNLL-10). Uppsala, Sweden, 2010

Adaptive Interactive Information Extraction [pdf] Marek Rei MPhil thesis Computer Laboratory, University of Cambridge, 2009