Project proposals for the 2018/2019 MPhil in Advanced Computer Science, at the University of Cambridge.

Below are some project suggestions for the current academic year. I recommend discussing them with me before applying.

If you have a project idea in mind already, in the area of machine learning and language, then also feel free to get in touch.

1) Breaking (and Fixing) Grammatical Error Detection Systems with Adversarial Examples

Proposer: Marek Rei
Supervisors: Marek Rei and Helen Yannakoudakis

Automated detection of grammatical errors in text can be approached with a neural sequence labeler, classifying each word in a sentence as being correct or not in a given context (Rei & Yannakoudakis, 2016). The correction of grammatical errors is often performed as a machine translation task, with a sequence-to-sequence model learning to translate from incorrect to correct sentences (Felice et al., 2014). In both of these cases, the supervised systems are trained on human-annotated examples and therefore learn to pick up frequent patterns in the dataset. However, there are infinitely many ways of making errors in text and learning them explicitly from data is very difficult. While these systems can successfully memorize common errors, they are not always able to generalize and tell the difference between correct and incorrect text.

The first stage of this project involves analyzing the weaknesses of existing systems and creating a new dataset of adversarial examples designed to trick error detection and error correction models into giving incorrect predictions (either false positives or false negatives). After that, potential solutions can be applied to improve the models and make them more robust to these kinds of examples. This could involve automatic generation of adversarial examples that can be included in the training data, or improvements in the neural model architecture that allow the system to capture more accurate representations of the data.
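As a rough illustration of what automatic generation of adversarial examples might look like, the sketch below injects a single substitution error into a correct sentence and keeps the result only if a detector gets it wrong. Everything here is an assumption made for illustration: the `CONFUSION_SETS` table, the `perturb` and `find_adversarial` helpers, and the detector interface (a callable mapping a token list to a list of 0/1 labels) are hypothetical and not part of any existing system. Note also that a substitution does not always produce a real error ("a" to "the" is often still grammatical), so generated labels would need filtering or manual checking.

```python
import random

# Hypothetical confusion sets: each key maps to plausible erroneous replacements.
CONFUSION_SETS = {
    "a": ["an", "the"],
    "an": ["a", "the"],
    "the": ["a"],
    "in": ["on", "at"],
    "on": ["in", "at"],
    "at": ["in", "on"],
    "is": ["are"],
    "are": ["is"],
}


def perturb(tokens, rng, max_edits=1):
    """Return (new_tokens, labels); labels[i] is 1 where a token was replaced."""
    candidates = [i for i, tok in enumerate(tokens) if tok.lower() in CONFUSION_SETS]
    rng.shuffle(candidates)
    chosen = set(candidates[:max_edits])
    new_tokens, labels = [], []
    for i, tok in enumerate(tokens):
        if i in chosen:
            new_tokens.append(rng.choice(CONFUSION_SETS[tok.lower()]))
            labels.append(1)
        else:
            new_tokens.append(tok)
            labels.append(0)
    return new_tokens, labels


def find_adversarial(sentences, detector, rng, tries=5):
    """Collect perturbed sentences on which the detector disagrees with the injected labels."""
    found = []
    for tokens in sentences:
        for _ in range(tries):
            perturbed, gold = perturb(tokens, rng)
            if sum(gold) == 0:
                break  # no token in this sentence can be perturbed
            if detector(perturbed) != gold:
                found.append((perturbed, gold))
                break
    return found


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy detector that never flags anything, so every injected error is a false negative.
    never_flags = lambda toks: [0] * len(toks)
    for sent, gold in find_adversarial([["She", "is", "in", "the", "garden"]], never_flags, rng):
        print(sent, gold)
```

In practice the toy detector would be replaced by a trained error detection model, and the collected examples could either form a test set or be added back into the training data.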

For training the models for error detection and correction, the manually labelled FCE (Yannakoudakis et al., 2011) and CoNLL14 (Ng et al., 2014) datasets are available.


Compositional Sequence Labeling Models for Error Detection in Learner Writing
Marek Rei and Helen Yannakoudakis. 2016.

Grammatical error correction using hybrid systems and type filtering.
Mariano Felice, Zheng Yuan, Øistein E. Andersen, Helen Yannakoudakis and Ekaterina Kochmar. 2014.

Semi-supervised Multitask Learning for Sequence Labeling
Marek Rei. 2017.

Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction
Christopher Bryant, Mariano Felice and Ted Briscoe. 2017.

A New Dataset and Method for Automatically Grading ESOL Texts
Helen Yannakoudakis, Ted Briscoe and Ben Medlock. 2011.

The CoNLL-2014 Shared Task on Grammatical Error Correction
Ng et al. 2014.

2) Multi-task Learning for Detecting Figurative Language in Context

Proposer: Marek Rei
Supervisors: Marek Rei and Katia Shutova (University of Amsterdam)

Metaphor is pervasive in our everyday communication, enriching it with sophisticated imagery and helping us to reconcile our experience in the world with our conceptual system. For instance, when we talk about “curing juvenile delinquency” or “corruption transmitting through the government ranks”, we view the general concept of crime (the target concept) in terms of the properties of a disease (the source concept). Such metaphorical associations are broad generalisations that allow us to project knowledge and inferences across domains, and our metaphorical use of language is a reflection of this process.

Given its ubiquity, metaphorical language poses an important problem for natural language understanding (Shutova and Teufel, 2010). The detection of metaphorical phrases in context can be approached as a sequence labeling task: given an input sentence, the system needs to label individual tokens as belonging to a literal or figurative part of the sentence. This project could investigate multi-task learning methods for extending neural sequence labeling models. The aim is that, by introducing auxiliary supervision into the network, the model learns more robust language representations and detects metaphorical phrases in context more accurately.

Multi-task optimization has not been explored for metaphor detection before and there are a number of auxiliary objectives to choose from. Some examples:

The main task could be to improve performance on the NAACL-FLP shared task dataset. An existing neural sequence labeler is recommended as a starting point.
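The multi-task setup described above can be sketched as a weighted sum of losses: the main metaphor labeling loss plus a down-weighted loss for an auxiliary task that shares the same token sequence. This is only a minimal sketch under stated assumptions. The function names, the `aux_weight` parameter and its default value are hypothetical, and the auxiliary label set is a generic placeholder rather than any specific objective.

```python
import math


def cross_entropy(probs, gold):
    """Mean negative log-likelihood; probs[i] is a distribution over labels for token i."""
    return -sum(math.log(p[g]) for p, g in zip(probs, gold)) / len(gold)


def multitask_loss(main_probs, main_gold, aux_probs, aux_gold, aux_weight=0.1):
    """Main-task loss plus a weighted auxiliary loss over the same tokens."""
    main = cross_entropy(main_probs, main_gold)
    aux = cross_entropy(aux_probs, aux_gold)
    return main + aux_weight * aux


if __name__ == "__main__":
    # Two tokens; main task is literal (0) vs. metaphorical (1).
    main_probs = [[0.9, 0.1], [0.2, 0.8]]
    main_gold = [0, 1]
    # Placeholder auxiliary task with three labels per token.
    aux_probs = [[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
    aux_gold = [0, 2]
    print(multitask_loss(main_probs, main_gold, aux_probs, aux_gold))
```

In a neural implementation both sets of probabilities would come from output layers on top of a shared encoder, so gradients from the auxiliary loss also update the shared representations.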


Leong, Chee Wee Ben, Beata Beigman Klebanov, and Ekaterina Shutova. "A report on the 2018 VUA metaphor detection shared task." Proceedings of the Workshop on Figurative Language Processing. 2018.

Shutova, Ekaterina, and Simone Teufel. "Metaphor Corpus Annotated for Source-Target Domain Mappings." LREC. Vol. 2. No. 2. 2010.

Rei, Marek, Luana Bulat, Douwe Kiela, and Ekaterina Shutova. "Grasping the finer point: A supervised similarity network for metaphor detection." EMNLP (2017).

Barrett, Maria, Joachim Bingel, Nora Hollenstein, Marek Rei, and Anders Søgaard. "Sequence classification with human attention." Proceedings of the 22nd Conference on Computational Natural Language Learning. 2018.

Rei, Marek. "Semi-supervised multitask learning for sequence labeling." ACL (2017).

3) Joint Text Classification on Multiple Levels with Multiple Labels

Proposer: Marek Rei
Supervisor: Marek Rei

This summer, we published a neural model for performing zero-shot sequence labeling, without the model seeing any examples of token-level annotations, by using sentence-level supervision instead (Rei and Søgaard, 2018). The architecture takes advantage of a self-attention framework and changes it so that the individual attention weights start functioning as token-level predictions.

Even more recently, we extended the model to take advantage of supervision on both levels (Rei and Søgaard, 2019). The attention function and sequence labeling predictions are still joined together, but now each one receives a training signal. By learning to perform the same task on both granularities, the model learns better composition functions and is regularized towards a more robust solution, improving performance on both levels.
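To make the idea of attention weights acting as token-level predictions concrete, here is a minimal sketch. It assumes that each token receives a raw score which is squashed with a sigmoid, so the value can be read directly as a binary token prediction, and that these values are then normalised to sum to one to form the attention weights. The sigmoid normalisation, the squared-error losses and all function names are assumptions made for illustration, not a statement of the published formulation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attend(token_scores, token_vectors):
    """Turn raw per-token scores into token predictions, attention weights and a sentence vector."""
    token_preds = [sigmoid(s) for s in token_scores]  # each in (0, 1): a token-level prediction
    total = sum(token_preds)
    weights = [p / total for p in token_preds]  # attention weights, summing to 1
    dim = len(token_vectors[0])
    sentence_vec = [
        sum(w * vec[d] for w, vec in zip(weights, token_vectors)) for d in range(dim)
    ]
    return token_preds, weights, sentence_vec


def joint_loss(sentence_prob, sentence_gold, token_preds, token_gold=None, token_weight=1.0):
    """Sentence-level loss, plus a token-level term when token annotations are available."""
    loss = (sentence_prob - sentence_gold) ** 2
    if token_gold is not None:
        loss += token_weight * sum((p - g) ** 2 for p, g in zip(token_preds, token_gold))
    return loss


if __name__ == "__main__":
    scores = [-2.0, 3.0, -1.0]  # raw scores for three tokens
    vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy token representations
    preds, weights, sent = attend(scores, vectors)
    print(preds, weights, sent)
```

Calling `joint_loss` without `token_gold` corresponds to the setting with sentence-level supervision only, while passing token labels adds a training signal on both levels.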

The weakness of the current model is that it works only for binary tasks: an attention mechanism assumes one weight per token, which matches the task of binary sequence labeling. Therefore, we can perform tasks like grammatical error detection and uncertain language detection, but not 3-way sentiment classification or named entity recognition. The goal of this project is to extend the neural architecture to work for tasks with an arbitrary number of classes. The best solution for this is not immediately apparent and will need some experimentation. One possible approach is to implement multi-head attention, similar to the transformer architecture (Vaswani et al., 2017).
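One way the multi-class extension might look is sketched below: each class gets its own attention distribution over the tokens, giving one score per token per class and one sentence vector per class. This is a speculative sketch, not a tested design. Here "head" simply means one attention distribution per class, which is simpler than the scaled dot-product heads of the transformer, and the function name and the argmax decoding of token labels are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def multi_head_attend(token_scores, token_vectors):
    """token_scores[i][k] is the raw score of token i for class k (one head per class)."""
    num_classes = len(token_scores[0])
    dim = len(token_vectors[0])
    token_preds = [[sigmoid(s) for s in row] for row in token_scores]

    class_vectors = []
    for k in range(num_classes):
        total = sum(row[k] for row in token_preds)
        weights = [row[k] / total for row in token_preds]  # attention for class k
        class_vectors.append(
            [sum(w * vec[d] for w, vec in zip(weights, token_vectors)) for d in range(dim)]
        )

    # Token-level label: the class whose head gives this token the highest score.
    token_labels = [max(range(num_classes), key=lambda k: row[k]) for row in token_preds]
    return token_preds, class_vectors, token_labels


if __name__ == "__main__":
    scores = [[2.0, -2.0, 0.0], [-2.0, 2.0, 0.0]]  # two tokens, three classes
    vectors = [[1.0, 0.0], [0.0, 1.0]]
    preds, class_vecs, labels = multi_head_attend(scores, vectors)
    print(labels, class_vecs)
```

The per-class sentence vectors could then feed a sentence-level classifier, although how best to combine them and how to handle tokens that belong to no class are open questions that would need experimentation.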


Rei, Marek, and Anders Søgaard. "Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens." NAACL-HLT (2018).

Rei, Marek, and Anders Søgaard. "Jointly Learning to Label Sentences and Tokens." AAAI (2019).

Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. "Attention is all you need." In Advances in Neural Information Processing Systems, pp. 5998-6008. 2017.