Only two weeks into the last month of the year, arxiv.org, the popular repository for ML research papers, has already seen close to 600 uploads. This gives an idea of the pace at which machine learning research is proceeding.
Copyright by www.analyticsindiamag.com
However, keeping track of all this research is almost impossible. Every year, the research that makes the most noise usually comes from companies like Google and Facebook, from top universities like MIT, from research labs and, most importantly, from conferences like NeurIPS or ACL.
- CVPR: 1,470 research papers on computer vision accepted from 6,656 valid submissions.
- ICLR: 687 out of 2,594 papers made it to ICLR 2020, a 26.5% acceptance rate.
- ICML: 1,088 papers accepted from 4,990 submissions.
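The acceptance rates implied by these counts can be checked with a few lines of arithmetic. The sketch below uses only the figures listed above; the variable names are illustrative.

```python
# (accepted, submitted) pairs copied from the list above.
venues = {
    "CVPR": (1470, 6656),
    "ICLR": (687, 2594),
    "ICML": (1088, 4990),
}

# Acceptance rate in percent for each venue.
rates = {
    name: 100 * accepted / submitted
    for name, (accepted, submitted) in venues.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```

By this calculation, roughly one in five submissions to CVPR and ICML was accepted, against roughly one in four at ICLR.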
In this article, we have compiled a list of interesting machine learning research work that has made some noise this year.
Natural Language Processing
GPT-3
This is the seminal paper that introduced the most popular ML model of the year, GPT-3. In the paper, titled “Language Models are Few-Shot Learners”, the OpenAI team used the same model and architecture as GPT-2, including the modified initialisation, pre-normalisation, and reversible tokenisation, except that the layers of the transformer alternate dense and locally banded sparse attention patterns. The GPT-3 model achieved promising results in the zero-shot and one-shot settings, and in the few-shot setting it occasionally surpassed state-of-the-art models.
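The zero-shot, one-shot and few-shot settings differ only in how many solved examples are placed in the prompt before the query, and the model's weights are not updated. The sketch below illustrates that idea. The `build_prompt` helper, the `=>` separator and the example pairs are assumptions made for illustration, not OpenAI's code or prompt format.

```python
def build_prompt(task_description, demonstrations, query):
    """Assemble a text prompt with zero or more solved examples before the query."""
    lines = [task_description]
    for source, target in demonstrations:
        lines.append(f"{source} => {target}")
    # The query is left unanswered so the model can complete it.
    lines.append(f"{query} =>")
    return "\n".join(lines)


# Illustrative translation pairs (hypothetical prompt content).
examples = [("sea otter", "loutre de mer"), ("cheese", "fromage")]
task = "Translate English to French:"

zero_shot = build_prompt(task, [], "peppermint")           # no demonstrations
one_shot = build_prompt(task, examples[:1], "peppermint")  # one demonstration
few_shot = build_prompt(task, examples, "peppermint")      # several demonstrations

print(few_shot)
```

In each case the resulting string would be sent to the language model as is, so the only thing that changes between settings is the amount of context.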
ALBERT: A Lite BERT
Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks, but memory use grows and training times become longer. To address these problems, the authors presented two parameter-reduction techniques, factorized embedding parameterization and cross-layer parameter sharing, to lower memory consumption and increase the training speed of BERT. The authors also used a self-supervised loss that focuses on modelling inter-sentence coherence and consistently helped downstream tasks with multi-sentence inputs. According to the results, this model established new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks while having fewer parameters than BERT-large.
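One of the two techniques in the ALBERT paper, factorized embedding parameterization, replaces the single vocabulary-by-hidden embedding matrix with two smaller matrices. The sketch below only counts parameters to show the effect. The sizes are assumptions chosen for illustration, and the helper name is hypothetical.

```python
def embedding_params(vocab_size, hidden_size, embedding_size=None):
    """Count embedding parameters, optionally factorized through a smaller size."""
    if embedding_size is None:
        # One V x H matrix.
        return vocab_size * hidden_size
    # A V x E matrix followed by an E x H projection.
    return vocab_size * embedding_size + embedding_size * hidden_size


# Illustrative sizes (assumed): vocabulary, hidden size, embedding size.
V, H, E = 30000, 768, 128

full = embedding_params(V, H)
factorized = embedding_params(V, H, E)

print(f"unfactorized: {full:,}")
print(f"factorized:   {factorized:,}")
```

With these assumed sizes the embedding parameters shrink by a factor of almost six, and the gap widens as the hidden size grows.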
Beyond Accuracy: Behavioral Testing of NLP Models with CheckList
In this paper, Microsoft Research, along with the University of Washington and the University of California, Irvine, introduced CheckList, a model-agnostic and task-agnostic methodology for testing NLP models. It won the best paper award at the ACL conference this year. CheckList includes a matrix of general linguistic capabilities and test types that facilitates comprehensive test ideation, as well as a software tool to generate a large and diverse number of test cases quickly.
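The idea of generating many test cases from a template and checking one linguistic capability can be sketched in plain Python. This is not the CheckList tool's API. The helpers and the keyword-based `toy_sentiment` classifier are hypothetical stand-ins, and the example tests negation with a simple expected-label check.

```python
from itertools import product


def expand_template(template, slots):
    """Fill a template with every combination of slot values."""
    names = list(slots)
    return [
        template.format(**dict(zip(names, combo)))
        for combo in product(*(slots[name] for name in names))
    ]


def toy_sentiment(text):
    """Hypothetical stand-in for a real model: a tiny keyword classifier."""
    negative_words = {"terrible", "awful"}
    words = text.lower().replace(".", "").split()
    return "negative" if any(w in negative_words for w in words) else "positive"


def failure_rate(cases, predict, expected):
    """Share of cases where the prediction differs from the expected label."""
    failures = [case for case in cases if predict(case) != expected]
    return len(failures) / len(cases), failures


# Negated negative adjectives should not be classified as negative.
cases = expand_template(
    "The {thing} was not {adjective}.",
    {"thing": ["flight", "service", "food"], "adjective": ["terrible", "awful"]},
)
rate, failures = failure_rate(cases, toy_sentiment, expected="positive")
print(f"{len(cases)} cases, failure rate {rate:.0%}")
```

The keyword classifier ignores the word "not", so it fails every generated case. That is the kind of behaviour an accuracy number on a held-out set can hide.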
Linformer
Linformer is a Transformer architecture that tackles the self-attention bottleneck in Transformers. Its new self-attention mechanism computes the contextual mapping in time and memory that are linear in the sequence length, reducing self-attention to an O(n) operation in both space and time complexity.
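The linear cost comes from projecting the keys and values from n rows down to a fixed number of rows before attention, so the score matrix is n by k rather than n by n. The sketch below is a minimal single-head version in pure Python under stated assumptions: the projection matrices are random here, whereas in the paper they are learned, and all function names are illustrative.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        top = max(row)
        exps = [math.exp(value - top) for value in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def linformer_attention(q, k, v, e_proj, f_proj):
    """Attention with keys and values projected from n rows to k_dim rows.

    q, k, v are n x d matrices; e_proj and f_proj are k_dim x n matrices.
    """
    d = len(q[0])
    k_small = matmul(e_proj, k)             # k_dim x d
    v_small = matmul(f_proj, v)             # k_dim x d
    scores = matmul(q, transpose(k_small))  # n x k_dim instead of n x n
    scaled = [[s / math.sqrt(d) for s in row] for row in scores]
    weights = softmax_rows(scaled)
    return weights, matmul(weights, v_small)  # n x k_dim, n x d


rng = random.Random(0)
n, d, k_dim = 8, 4, 2  # sequence length, head size, projected length (assumed)
q, k, v = (random_matrix(n, d, rng) for _ in range(3))
e_proj = random_matrix(k_dim, n, rng)  # random stand-ins for learned projections
f_proj = random_matrix(k_dim, n, rng)

weights, output = linformer_attention(q, k, v, e_proj, f_proj)
print(len(weights), len(weights[0]))  # rows and columns of the attention weights
print(len(output), len(output[0]))    # rows and columns of the output
```

Because `k_dim` stays fixed as `n` grows, the number of attention scores scales with n rather than n squared.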
Plug and Play Language Models
Plug and Play Language Models (PPLM) combine a pre-trained language model with one or more simple attribute classifiers that steer text generation without any further training of the language model. According to the authors, model samples demonstrated control over sentiment styles, and extensive automated and human-annotated evaluations showed attribute alignment and fluency. […]
Read more: www.analyticsindiamag.com