
Month: January 2019

ML and NLP Publications in 2018

It is time for another yearly update of the publication statistics in Machine Learning and Natural Language Processing. The field has continued to grow very rapidly, both in the number of publications and the number of attendees, breaking all sorts of previous records. Perhaps most notably, the initial release of NeurIPS conference tickets sold out in 11 minutes and 38 seconds. In this post I will provide some finer-grained statistics on these numbers, showing which authors and organizations are publishing the most at specific conferences.

This year, I have included the following conferences/journals: ACL, EMNLP, NAACL, EACL, COLING, TACL, CL, CoNLL, NeurIPS, ICML, ICLR, AAAI. This selection aims to cover the most well-known and high-ranking venues for publishing work on both machine learning and language technologies. Compared to last year, I've removed SemEval, as it focuses heavily on shared task papers and I'm not including these for other conferences either. I've also added AAAI, which is one of the bigger conferences and was previously missing from the rankings. NeurIPS (previously known as NIPS) changed its name this year, and for consistency I will use the new name to refer to all the previous iterations as well.

This analysis is done automatically with a collection of scripts that I've continued to improve over the years. The paper lists are crawled from online proceedings, and author names can usually be found there as well. Organization names need to be extracted straight from the PDFs, which can lead to quite a bit of noise. I've created various methods for detecting and mapping different types of names, but let me know if you spot any remaining errors.
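To illustrate the kind of name mapping described above, here is a minimal Python sketch. It is not the actual code behind this analysis. The alias table, the function names (normalize, canonical_org, count_orgs) and the example strings are all assumptions made up for illustration. The idea is to clean each raw affiliation string, look it up in a hand-maintained table of known variants, and set aside anything unmapped for manual review instead of guessing.

```python
import re
from collections import Counter

# Hypothetical alias table (an assumption for this sketch):
# cleaned lowercase variant -> canonical organization name.
ALIASES = {
    "mit": "Massachusetts Institute of Technology",
    "massachusetts institute of technology": "Massachusetts Institute of Technology",
    "univ of cambridge": "University of Cambridge",
    "university of cambridge": "University of Cambridge",
    "cambridge university": "University of Cambridge",
}


def normalize(raw):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = raw.lower()
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def canonical_org(raw):
    """Return the canonical name for a raw string, or None if unknown."""
    return ALIASES.get(normalize(raw))


def count_orgs(affiliations):
    """Count canonical organizations; collect strings that could not be mapped."""
    counts = Counter()
    unmapped = []
    for raw in affiliations:
        org = canonical_org(raw)
        if org is None:
            unmapped.append(raw)
        else:
            counts[org] += 1
    return counts, unmapped
```

Calling count_orgs on a list of extracted affiliation strings returns the per-organization counts together with the strings that matched no alias. Keeping the unmapped strings visible is one simple way to spot the remaining errors that noisy PDF extraction tends to produce.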

While this post highlights the authors and organizations who have published the most in the past year, I want to stress that I do not think publication quantity is something that we as a field should be pursuing or rewarding. As the graphs below show, the field is becoming more and more popular, and this rapid increase in numbers comes with widely varying quality. Authoring one piece of groundbreaking work is always better than releasing ten totally forgettable incremental papers. This post is just meant to give a light, high-level view of who is currently publishing and at which conferences, and perhaps provide a bit of inspiration for new researchers with great ideas.

Venues

We start off by looking at the publications at all the conferences from 2012 to 2018. Most of the ML venues continued their growth in the number of published papers, with AAAI and NeurIPS going past the 1,000 paper mark. EMNLP and NAACL also had record years by quite a margin, whereas ACL and COLING stayed closer to their previous numbers. EACL took this year to rest, and the number of papers in TACL and CL has remained relatively stable throughout the years.