Word Embeddings 2017

This page is for the module on Constructing and Evaluating Word Embeddings that I am teaching at the University of Cambridge, together with Dr Ekaterina Kochmar.

Description

Representing words as low-dimensional vectors allows systems to take advantage of semantic similarities, generalise to unseen examples and improve accuracy on a wide range of NLP tasks. Advances in neural networks and representation learning have opened new and exciting ways of learning word embeddings with distinct properties.

In this topic we will provide an introduction to a range of vector space models and cover the most influential research in neural embeddings from the past couple of years, including word similarity and semantic analogy tasks, word2vec models and task-specific representation learning. We will also discuss the most recent advances in the field including multilingual embeddings, multimodal vectors using image detection, and building character-based representations.

By the end of the course you will have learned to construct word representations using both traditional and various neural network models. You will learn about different properties of these models and how to choose an approach for a specific task. You will also get an overview of the most recent and notable advances in the field.
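
As a small illustration of the word similarity and semantic analogy tasks mentioned above, here is a minimal sketch using cosine similarity and the vector offset method. The 3-dimensional vectors are invented purely for illustration (real embeddings typically have hundreds of dimensions and are learned from a corpus), and the function names `cosine` and `analogy` are hypothetical helpers, not part of any tool listed on this page.

```python
import numpy as np

# Toy vectors, made up for this example only; not taken from any trained model.
vectors = {
    "king":  np.array([0.8, 0.9, 0.1]),
    "queen": np.array([0.8, 0.1, 0.9]),
    "man":   np.array([0.2, 0.9, 0.1]),
    "woman": np.array([0.2, 0.1, 0.9]),
    "apple": np.array([0.1, 0.5, 0.5]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' with the vector offset method.

    Returns the word whose vector is closest (by cosine) to
    vec(b) - vec(a) + vec(c), excluding the three query words.
    """
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Word similarity: compare how close "king" is to "man" and to "woman".
sim_king_man = cosine(vectors["king"], vectors["man"])
sim_king_woman = cosine(vectors["king"], vectors["woman"])

# Semantic analogy: man is to king as woman is to ?
answer = analogy("man", "king", "woman", vectors)
```

Excluding the query words from the candidates is a common convention in analogy evaluation, because the nearest neighbour of the offset vector is otherwise often one of the inputs.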

Lecture slides

Introductory lecture on word embeddings

Background Reading:

Baroni et al. (2014). Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors

Mikolov et al. (2013). Efficient Estimation of Word Representations in Vector Space

Mikolov et al. (2013). Linguistic Regularities in Continuous Space Word Representations

Levy et al. (2015). Improving Distributional Similarity with Lessons Learned from Word Embeddings

Socher et al. (2012). Semantic Compositionality through Recursive Matrix-Vector Spaces

Papers for student presentations:

Levy & Goldberg (2014, CoNLL best paper). Linguistic Regularities in Sparse and Explicit Word Representations

Faruqui et al. (2015, best paper at NAACL). Retrofitting Word Vectors to Semantic Lexicons

Hermann and Blunsom (2014, ACL). Multilingual Models for Compositional Distributed Semantics

Jozefowicz et al. (2016, arXiv preprint). Exploring the Limits of Language Modeling

Norouzi et al. (2014, ICLR). Zero-Shot Learning by Convex Combination of Semantic Embeddings

Kiela and Bottou (2014, EMNLP). Learning Image Embeddings using Convolutional Neural Networks for Improved Multi-Modal Semantics

Resources & Datasets

Word2vec, a tool for creating word embeddings
https://code.google.com/archive/p/word2vec/

Word vectors pre-trained on 100B words. More information on the word2vec homepage.
https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing

Tool for converting word2vec vectors between binary and plain-text formats. You can use this to convert the pre-trained vectors to plain-text.
https://github.com/marekrei/convertvec
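
Once the vectors are in plain text, they can be loaded with a few lines of code. The sketch below assumes the usual word2vec text layout: a header line with the vocabulary size and dimensionality, followed by one line per word containing the word and its space-separated values. That the converted file follows exactly this layout is an assumption, and `read_text_vectors` is a hypothetical helper name, not part of word2vec or convertvec.

```python
import io
import numpy as np

def read_text_vectors(handle):
    """Read plain-text word vectors from an open text handle.

    Assumed layout: first line "<vocab_size> <dim>", then one line per
    word: "<word> <v1> <v2> ... <v_dim>".
    """
    header = handle.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in handle:
        parts = line.split()
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        word = parts[0]
        vectors[word] = np.array([float(x) for x in parts[1:]], dtype=np.float32)
    return vectors, vocab_size, dim

# A tiny in-memory stand-in for a converted file (made-up values).
sample = io.StringIO("2 3\ncat 0.1 0.2 0.3\ndog 0.4 0.5 0.6\n")
vectors, vocab_size, dim = read_text_vectors(sample)
```

For a real file, the same function can be called on `open(path, encoding="utf-8")`, where `path` points to the converted plain-text vectors. The full pre-trained set is large, so reading only the words needed for an experiment may be preferable.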

Vectors trained using three different methods (counting, word2vec and dependency relations) on the same dataset (BNC).
http://www.marekrei.com/projects/vectorsets/

An online tool for evaluating word vectors on 12 different word similarity datasets.
http://www.wordvectors.org/
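
For reference, word similarity evaluation is commonly done by computing the cosine similarity of each word pair and reporting the Spearman rank correlation against human judgements. The following is a minimal sketch of that procedure, not a description of how the online tool is implemented. The toy vectors and human scores are invented for illustration, and `rank`, `spearman` and `evaluate` are hypothetical helper names.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank(values):
    """Ranks starting at 1, with tied values given their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    ranks = np.empty(len(values))
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(rank(x), rank(y))[0, 1])

def evaluate(vectors, pairs):
    """Score (word1, word2, human_score) pairs, skipping out-of-vocabulary words.

    Returns the Spearman correlation and the number of pairs actually used.
    """
    model_scores, gold_scores = [], []
    for w1, w2, gold in pairs:
        if w1 in vectors and w2 in vectors:
            model_scores.append(cosine(vectors[w1], vectors[w2]))
            gold_scores.append(gold)
    return spearman(model_scores, gold_scores), len(gold_scores)

# Made-up 2-dimensional vectors and human scores, for illustration only.
vectors = {
    "cat":   np.array([1.0, 0.1]),
    "dog":   np.array([0.9, 0.2]),
    "car":   np.array([0.1, 1.0]),
    "truck": np.array([0.2, 0.9]),
}
pairs = [
    ("cat", "dog", 9.0),
    ("cat", "car", 2.0),
    ("dog", "truck", 3.0),
    ("cat", "unicorn", 5.0),  # "unicorn" is out of vocabulary and is skipped
]
rho, n_used = evaluate(vectors, pairs)
```

Reporting the number of pairs used alongside the correlation matters when comparing models, since vocabularies differ and skipped pairs change what is being measured.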

t-SNE, a tool for visualising word embeddings in 2D.
http://lvdmaaten.github.io/tsne/

GloVe model and pre-trained vectors
http://nlp.stanford.edu/projects/glove/

Global context vectors
http://www.socher.org/index.php/Main/ImprovingWordRepresentationsViaGlobalContextAndMultipleWordPrototypes

Multilingual vectors
http://www.cs.cmu.edu/~mfaruqui/soft.html

Retrofitting word vectors to semantic lexicons
https://github.com/mfaruqui/retrofitting