This page is for the module on Constructing and Evaluating Word Embeddings, which I am teaching at the University of Cambridge together with Dr Ekaterina Kochmar.


Representing words as low-dimensional vectors allows systems to take advantage of semantic similarities between words, generalise to unseen examples and improve accuracy on a wide range of NLP tasks. Advances in neural networks and representation learning have opened new and exciting ways of learning word embeddings with unique properties.
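As a minimal sketch of what "taking advantage of semantic similarities" can look like in practice, the snippet below compares words by the cosine of the angle between their vectors. The three-dimensional vectors are hand-made toy values chosen for illustration, not trained embeddings, and the function name is our own.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

# Toy, hand-made vectors (illustrative values only, not trained embeddings).
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

sim_cat_dog = cosine_similarity(toy_vectors["cat"], toy_vectors["dog"])
sim_cat_car = cosine_similarity(toy_vectors["cat"], toy_vectors["car"])

# With these toy values, "cat" is closer to "dog" than to "car".
print(sim_cat_dog > sim_cat_car)
```

With real embeddings the same comparison is usually run over vectors with a few hundred dimensions, but the computation is unchanged.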

In this module we will provide an introduction to the classical vector space models and cover the most influential research in neural embeddings from the past couple of years, including word similarity and semantic analogy tasks, word2vec models and task-specific representation learning. We will also discuss the most recent advances in the field, including multilingual embeddings and multimodal vectors using image detection.
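The semantic analogy task mentioned above ("man is to king as woman is to ?") is commonly solved with a vector-offset method: compute b - a + c and return the vocabulary word whose vector is most similar to the result, excluding the three query words. The sketch below assumes hand-made two-dimensional toy vectors and a hypothetical helper name `solve_analogy`; it illustrates the idea and is not the code of any particular toolkit.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' using the offset b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The query words themselves are excluded from the candidates.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(vectors[w], target))

# Toy vectors (illustrative values only): roughly (royalty, gender).
toy_vectors = {
    "king":  [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man":   [0.1, 1.0],
    "woman": [0.1, -1.0],
    "apple": [-1.0, 0.1],
}

answer = solve_analogy(toy_vectors, "man", "king", "woman")
print(answer)
```

Excluding the query words matters in practice, because the nearest neighbour of the offset vector is often one of the inputs.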

By the end of the course you will have learned to construct word representations using both traditional and various neural network models. You will learn about different properties of these models and how to choose an approach for a specific task. You will also get an overview of the most recent and notable advances in the field.

Lecture slides

Introductory lecture on word embeddings

Background Reading:

Baroni et al. (2014). Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors

Mikolov et al. (2013). Efficient Estimation of Word Representations in Vector Space

Mikolov et al. (2013). Linguistic Regularities in Continuous Space Word Representations

Papers for student presentations:

Socher et al. (2012). Semantic Compositionality through Recursive Matrix-Vector Spaces

Levy & Goldberg (2014, CoNLL best paper). Linguistic Regularities in Sparse and Explicit Word Representations

Hermann & Blunsom (2014, ACL). Multilingual Models for Compositional Distributed Semantics

Faruqui et al. (2015, best paper at NAACL). Retrofitting Word Vectors to Semantic Lexicons

Norouzi et al. (2014, ICLR). Zero-Shot Learning by Convex Combination of Semantic Embeddings

Useful links

Word2vec, a tool for creating word embeddings

Word vectors pretrained on 100B words. More information on the word2vec homepage.

Tool for converting word2vec vectors between binary and plain-text formats. You can use this to convert the pre-trained vectors to plain-text.

Vectors trained using 3 different methods (counting, word2vec and dependency relations) on the same dataset (BNC).

An online tool for evaluating word vectors on 12 different word similarity datasets.

t-SNE, a tool for visualising word embeddings in 2D.
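Once vectors have been converted to the word2vec plain-text format mentioned in the links above, they can be read with a few lines of code. The sketch below assumes the usual layout of that format: a header line with the vocabulary size and the dimensionality, followed by one line per word containing the word and its space-separated values. The function name `load_text_vectors` and the sample data are hypothetical.

```python
import io

def load_text_vectors(lines):
    """Parse word2vec plain-text vectors from an iterable of lines."""
    lines = iter(lines)
    # Header line: "<vocabulary size> <dimensionality>"
    header = next(lines).split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    if len(vectors) != vocab_size:
        raise ValueError("header vocabulary size does not match the data")
    return vectors

# Tiny in-memory sample standing in for a real file (made-up values).
sample = io.StringIO("2 3\ncat 0.9 0.8 0.1\ndog 0.8 0.9 0.2\n")
vectors = load_text_vectors(sample)
print(sorted(vectors))
```

For a real file, an open file handle can be passed in place of the in-memory sample, since the function only needs an iterable of lines.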