Key figures in neural network history

cs.stanford.edu research

Neural Networks - History - CS Stanford

https://cs.stanford.edu/people/eroberts/courses/soco/projects/neural-networks…

**History: The 1940's to the 1970's** In 1943, neurophysiologist Warren McCulloch and mathematician Walter Pitts wrote a paper on how neurons might work. In order to describe how neurons in the brain might work, they modeled a simple neural network using electrical circuits. MADALINE was the first neural network applied to a real world problem, using an adaptive filter that eliminates echoes on phone lines. It is based on the idea that while one active perceptron may have a big error, one can adjust the weight values to distribute it across the network, or at least to adjacent perceptrons. Despite the later success of the neural network, traditional von Neumann architecture took over the computing scene, and neural research was left behind. In the same time period, a paper was written that suggested there could not be an extension from the single layered neural network to a multiple layered neural network.


youtube.com video

Neural Networks Explained: From 1943 Origins to Deep Learning ...

https://www.youtube.com/watch?v=AA2ettRM6_Q

The AI Guy, 10 Jun 2024. Discover the fascinating history of neural networks, from their origins in 1943 to the groundbreaking deep learning advancements of today. Learn how pioneering scientists like Warren McCulloch, Walter Pitts, Frank Rosenblatt, John Hopfield, Geoffrey Hinton, and others contributed to this revolutionary field. Understand key developments like the perceptron, backpropagation, and the role of GPUs in transforming AI. Join us on this journey through time to see how neural networks have evolved to shape modern machine learning and artificial intelligence.


medium.com article

A Quick History of Neural Nets: From Inglorious to Incredible

https://medium.com/insight-data/a-quick-history-of-neural-nets-from-ingloriou…

By Lauren Holzbauer (Insight, Medium). In 1957, Frank Rosenblatt applied McCulloch and Pitts’s idea to early AI when he invented the “Perceptron.” The perceptron is probably the simplest of all neural net architectures: it has only a single layer and uses a simple step function as its activation function. Since training neural nets requires so much computing power, it wasn’t until GPUs were widely available that researchers had access to cheaper, faster computers that could handle the demands of training deep networks. The ILSVRC served as an international platform for neural nets to emerge as a powerful tool for image classification. With a better grasp of the evolution of neural nets over the past 70(!!) years, I have a better appreciation for them and the small group of researchers who persisted through the many ups and downs, widespread criticism, and long AI “winters” (denoted by snowflakes in the image above).
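To make the "single layer plus step function" description concrete, here is a minimal sketch of a perceptron in plain Python. The training loop uses the classic perceptron update rule, which the snippet above does not spell out, so treat it as an assumption; the function names (`step`, `predict`, `train_perceptron`) and the AND-gate data are purely illustrative.

```python
def step(z):
    """Step activation: fires (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def predict(weights, bias, x):
    """Single-layer perceptron: weighted sum of the inputs, then the step function."""
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_perceptron(samples, epochs=10, lr=1.0):
    """Assumed training rule: nudge weights by lr * error * input for each sample."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Illustrative data: the logical AND function, which is linearly separable.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_DATA)
predictions = [predict(weights, bias, x) for x, _ in AND_DATA]
print(predictions)
```

A single layer like this can only separate classes with a straight line (or hyperplane), which is why AND works here while a function such as XOR would not.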


dataversity.net article

A Brief History of Neural Networks - Dataversity

https://www.dataversity.net/articles/a-brief-history-of-neural-networks/

Deep learning uses neural networks, a data structure design loosely inspired by the layout of biological neurons. (It should be noted, Rosenblatt’s primary goal was not to build a computer that could recognize and classify images, but to gain insights about how the human brain worked.) The Perceptron neural network was originally programmed with two layers, the input layer and the output layer. This was the first design of a deep learning model using a convolutional neural network. The early designs of neural networks (such as the Perceptron) did not include hidden layers, but two obvious ones (input/output). In 1989, deep learning became an actuality when Yann LeCun, et al., experimented with the standard backpropagation algorithm (created in 1970), applying it to a neural network. In 2009, Nvidia supported the “big bang of deep learning.” At this time, many successful deep learning neural networks received training using Nvidia GPUs. GPUs have become remarkably important in machine learning. Deep learning algorithms are supported by neural networks.


galileo-unbound.blog article

A Short History of Neural Networks - Galileo Unbound

https://galileo-unbound.blog/2025/02/05/a-short-history-of-neural-networks/

Drawing from the work of McCulloch and Pitts, his team constructed a software system and then constructed a hardware model that adaptively updated the strength of the inputs, that they called neural weights, as it was trained on test images. PDP was an exciting framework for artificial intelligence, and it captured the general behavior of natural neural networks, but it had a serious problem: How could all of the neural weights be trained? The breakthrough that propelled Geoff Hinton to world-wide acclaim was the success of AlexNet, a neural network constructed by his graduate student Alex Krizhevsky at Toronto in 2012 consisting of 650,000 neurons with 60 million parameters that were trained using two early Nvidia GPUs. It won the ImageNet challenge that year, enabled by its deep architecture and representing a marked advancement that has been proceeding unabated today.


magazine.caltech.edu research

The Roots of Neural Networks: How Caltech Research Paved the ...

https://magazine.caltech.edu/post/ai-machine-learning-history

In 1980, Hopfield left Princeton for Caltech in part due to the Institute’s “splendid computing facilities,” which he would use to test and develop his ideas for neural networks. “Hopfield extracted the essence of neurons.” Abu-Mostafa notes that the theoretical paper Hopfield published in 1982, “Neural networks and physical systems with emergent collective computational abilities,” is the fifth-most-cited Caltech paper of all time. “His network was trained to dig a hole in the landscape corresponding to the image pattern being trained,” adds Erik Winfree (PhD ’98), professor of computer science, computation and neural systems, and bioengineering at Caltech, and a former CNS student of Hopfield’s. Even before Anandkumar joined Caltech in 2017, she says she “was fascinated by physics.” In 2011, she analyzed how the success of learning algorithms is tied to the phase transition in the Ising model, the same model upon which Hopfield built his network.


en.wikipedia.org article

History of artificial neural networks - Wikipedia

https://en.wikipedia.org/wiki/History_of_artificial_neural_networks

Sections listed include LSTM, Deep learning, and Transformer. Snippet fragments: "... popularized backpropagation.[[31]](https://en.wikipedia.org/wiki/History_of_artificial_neural_networks#cite_note-32)" and "They reported up to 70 times faster training.[[85]](https://en.wikipedia.org/wiki/History_of_artificial_neural_networks#cite_note-86)"


developer.nvidia.com article

Deep Learning in a Nutshell: History and Training - NVIDIA Developer

https://developer.nvidia.com/blog/deep-learning-nutshell-history-training/

The main hurdle at this point was to train big, deep networks, which suffered from the vanishing gradient problem, where features in early layers could not be learned because no learning signal reached these layers. Additional material: Deep Learning in Neural Networks: An Overview. Backpropagation of errors, or often simply backpropagation, is a method for finding the gradient of the error with respect to weights over a neural network. Figure 1: Backpropagation for an arbitrary layer in a deep neural network. We can imagine a forward pass in which a matrix (dimensions: number of examples x number of input nodes) is input to the network and propagated through it, where we always have the order (1) input nodes, (2) weight matrix (dimensions: input nodes x output nodes), and (3) output nodes, which usually also have a non-linear activation function (dimensions: examples x output nodes).
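The matrix dimensions in the forward pass described above can be checked with a small sketch. This is plain Python rather than a GPU library, and the sigmoid activation, the helper names (`matmul`, `forward`), and the numeric values are assumptions chosen only for illustration; the snippet itself does not name a specific activation function.

```python
import math


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def sigmoid(z):
    """Assumed non-linear activation; any differentiable non-linearity would do."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, weights):
    """(examples x input nodes) times (input nodes x output nodes), then activation."""
    pre_activation = matmul(inputs, weights)
    return [[sigmoid(v) for v in row] for row in pre_activation]


# 4 examples x 3 input nodes (illustrative values).
inputs = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, -1.0],
    [0.5, 0.5, 0.5],
    [2.0, -1.0, 0.0],
]
# 3 input nodes x 2 output nodes (illustrative values).
weights = [
    [0.1, -0.2],
    [0.0, 0.3],
    [-0.1, 0.2],
]

outputs = forward(inputs, weights)
shape = (len(outputs), len(outputs[0]))
print(shape)  # (4, 2): examples x output nodes
```

In practice the same multiplication would typically be done with an array library such as NumPy or on a GPU; the nested lists here only keep the example dependency-free.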
