History of Deep Learning

Photo by Scott Webb on Unsplash

Deep learning has dramatically improved the state of the art in machine learning tasks such as machine translation, speech recognition, and visual object detection, as well as in other domains such as drug discovery and genomics (LeCun, et al., 2015). Researchers are also extending deep learning beyond these traditional tasks: Osako et al. use recurrent neural networks to denoise speech signals, Gupta et al. use autoencoders to discover patterns in gene expression, Gatys et al. use a neural model to generate images in different styles, and Wang et al. use deep learning to perform sentiment analysis across multiple modalities simultaneously (Wang & Raj, 2017).

According to the AI Index 2021 report, the number of peer-reviewed AI publications is growing exponentially (Zhang, et al., 2021).

Figure 1 Peer-Reviewed AI Publications by Year

To appreciate the current models, however, it helps to understand how deep learning evolved over the years. The history of machine learning can be traced back to about 300 BC, when Aristotle introduced associationism, which is regarded as its starting point (Wang & Raj, 2017).

Table 1 — Brief History of Machine Learning (Wang & Raj, 2017)

As seen in Table 1, progress in AI stalled in the 1960s and 1970s. Some applications of machine learning failed outright, most notably a US Navy-funded study of machine translation from Russian to English. Minsky and Papert also proved that Rosenblatt’s perceptron could solve only linearly separable problems. They knew that multiple layers could overcome this limit, but no algorithm for training such a network existed at the time. Today that algorithm is known as back-propagation.
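To make the linear-separability limit concrete, here is a minimal sketch in Python. It is not taken from the cited sources: the function names (`train_perceptron`, `accuracy`), the learning rate, and the epoch cap are illustrative assumptions. It trains a single-layer perceptron with the classic error-driven update rule on AND, which is linearly separable, and on XOR, which is not.

```python
# Illustrative sketch only: a single-layer perceptron with a step activation.
# Names, learning rate, and epoch cap are assumptions, not from the sources.

AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def predict(w, b, x):
    """Step activation over a weighted sum of the two inputs."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def train_perceptron(samples, epochs=100, lr=1.0):
    """Apply the perceptron update rule until an epoch has no errors."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(w, b, x)
            if delta != 0:
                # Move the decision boundary toward the misclassified point.
                w[0] += lr * delta * x[0]
                w[1] += lr * delta * x[1]
                b += lr * delta
                errors += 1
        if errors == 0:
            break  # every sample is classified correctly
    return w, b


def accuracy(samples, w, b):
    """Fraction of samples the trained perceptron classifies correctly."""
    correct = sum(1 for x, target in samples if predict(w, b, x) == target)
    return correct / len(samples)


if __name__ == "__main__":
    for name, samples in (("AND", AND_SAMPLES), ("XOR", XOR_SAMPLES)):
        w, b = train_perceptron(samples)
        print(name, "accuracy:", accuracy(samples, w, b))
```

On AND the updates settle on a separating line and accuracy reaches 1.0. On XOR no single line separates the two classes, so accuracy stays below 1.0 however many epochs are allowed.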

In 1973, the Lighthill report gave a very pessimistic prognosis for many core aspects of the field, stating, for example, that “In no part of the field have the discoveries made so far produced the major impact that was then promised” (Lighthill, 1973). After the report, much research funding was cut and a quiet period began, now known as the first AI winter.

After the first AI winter, interest rose again in the 1980s, driven by commercial products such as expert systems. As Schuchmann notes, these systems were handcrafted by surveying experts and encoding their knowledge as “if-then” rule sets (Schuchmann, 2019). Applications included financial planning, medical diagnosis, geological exploration, and microelectronic circuit design. In 1984, the magazine Business Week published the headline “AI: It is here” (Schuchmann, 2019). That same year, John McCarthy criticized expert systems for lacking common sense and knowledge of their own limitations. He described a wrong decision made by an expert system for the medical treatment of Vibrio cholerae, and he added that complex domains such as vision and speech have too many edge cases for engineers to capture in rules. Later, Schwarz, the director of DARPA ISTO (Defense Advanced Research Projects Agency/Information Science and Technology Office), reported that expert systems had shown “very limited success in particular areas, followed immediately by failure to reach the broader goal at which these initial successes seem at first to hint.” After that, funding for AI research started to decrease and the second AI winter began.

Figure 2 Milestones of AI research after the 1950s (Schuchmann, 2019)

The road to 2012 was not easy, but that year the deep learning revolution finally began.


LeCun, Y., Bengio, Y. & Hinton, G., 2015. Deep Learning. Nature, Volume 521, pp. 436–444.

Lighthill, J., 1973. Lighthill report. [Online]
Available at: https://en.wikipedia.org/wiki/Lighthill_report

Schuchmann, S., 2019. History of the Second AI Winter. [Online]
Available at: https://towardsdatascience.com/history-of-the-second-ai-winter-406f18789d45

Wang, H. & Raj, B., 2017. On the Origin of Deep Learning. arXiv preprint arXiv:1702.07800v4.

Zhang, D. et al., 2021. The AI Index 2021 Annual Report. [Online]
Available at: https://aiindex.stanford.edu/wp-content/uploads/2021/03/2021-AI-Index-Report_Master.pdf




AI Researcher and Entrepreneur

Baris Cekic
