A Brief History of Deep Learning Frameworks

The past decade has seen a burst of algorithms and applications in machine learning, especially deep learning. Behind this burst is a wide variety of tools and frameworks.

They are the scaffolding of the revolution: the widespread adoption of frameworks like TensorFlow and PyTorch enabled many practitioners to more easily assemble models using well-suited domain-specific languages and a rich collection of building blocks.

Looking back at the evolution of frameworks, we can clearly see a tightly coupled relationship between frameworks and algorithms. This virtuous cycle of interdependence propels the rapid development of frameworks and tools into the future.

Stone Age (early 2000s)

The concept of neural networks has been around for a while. Before the early 2000s, there were a handful of tools that could be used to describe and develop neural networks, including MATLAB, OpenNN, and Torch. These tools were either not tailored specifically to neural network model development, or they had complex user APIs and lacked GPU support. During this time, practitioners had to do a lot of heavy lifting when using these primitive frameworks.

Bronze Age (~2012)

In 2012, Alex Krizhevsky et al. from the University of Toronto proposed a deep neural network architecture, later known as AlexNet [1], that achieved state-of-the-art accuracy on the ImageNet dataset and outperformed the second-place contestant by a large margin. This outstanding result sparked excitement about deep neural networks, and since then various deep neural network models have kept setting new accuracy records on ImageNet.

Around this time, early frameworks such as Caffe, Chainer, and Theano came into being. Using these frameworks, users could conveniently build complex deep neural network models such as CNNs, RNNs, and LSTMs. In addition, these frameworks supported multi-GPU training, which significantly reduced training time and made it possible to train large models that previously could not fit into the memory of a single GPU. Among these frameworks, Caffe and Theano used a declarative programming style, while Chainer adopted an imperative programming style. These two distinct programming styles set two different development paths for the frameworks that were yet to come.
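
To make the difference between the two styles concrete, here is a minimal sketch in plain Python. It does not use the actual Caffe, Theano, or Chainer APIs; the names `Node`, `placeholder`, `run`, and `linear` are hypothetical and exist only for illustration. The first half builds a symbolic graph and evaluates it later (declarative), while the second half computes each value as soon as the statement runs (imperative).

```python
# Toy illustration only: these classes and functions are assumptions made
# for this sketch, not the API of any real framework.

# Declarative style: describe the computation as a graph, execute it later.
class Node:
    def __init__(self, op, inputs=(), name=None):
        self.op = op          # "input", "add", or "mul"
        self.inputs = inputs  # child nodes this node depends on
        self.name = name      # only set for placeholder inputs

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def placeholder(name):
    """Create a symbolic input whose value is supplied at run time."""
    return Node("input", name=name)


def run(node, feed):
    """Evaluate a graph node given concrete values for its placeholders."""
    if node.op == "input":
        return feed[node.name]
    left, right = (run(child, feed) for child in node.inputs)
    return left + right if node.op == "add" else left * right


x, w, b = placeholder("x"), placeholder("w"), placeholder("b")
y_graph = x * w + b  # nothing is computed here; only a graph is built
declarative_result = run(y_graph, {"x": 2.0, "w": 3.0, "b": 1.0})


# Imperative style: every statement is executed immediately.
def linear(x, w, b):
    y = x * w        # the product is computed right away
    return y + b


imperative_result = linear(2.0, 3.0, 1.0)
```

Both paths compute the same value. The practical difference is that the declarative version has the whole graph available before execution, which leaves room for ahead-of-time optimization, whereas the imperative version can be stepped through and debugged like ordinary Python.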

