Mohamad Hassoun, author of Fundamentals of Artificial Neural Networks (MIT Press, 1995) and a professor of electrical and computer engineering at Wayne State University, adapts an introductory section from his book in response.
Artificial neural networks are parallel computational models, comprising densely interconnected adaptive processing units. These networks are composed of many simple processors (relative, say, to a PC, which generally has a single, powerful processor) acting in parallel to model nonlinear static or dynamic systems, where a complex relationship exists between an input and its corresponding output.
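To make the idea of "densely interconnected adaptive processing units" concrete, here is a minimal sketch of one common kind of unit (a weighted sum passed through a nonlinearity) and a tiny network of three such units. The weights, biases, and the choice of the logistic sigmoid are illustrative assumptions, not a prescription:

```python
import math

def unit_output(inputs, weights, bias):
    """One processing unit: a weighted sum of its inputs
    passed through a nonlinearity (here, the logistic sigmoid)."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-s))

def network(x):
    """A tiny network: two units act in parallel on the same
    input, and a third unit combines their outputs."""
    h1 = unit_output(x, [0.5, -0.3], 0.1)   # illustrative weights
    h2 = unit_output(x, [-0.8, 0.9], 0.0)
    return unit_output([h1, h2], [1.2, -1.1], 0.2)
```

The "adaptive" part is that the weights and biases are not fixed by a programmer but adjusted from data, as described next.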
A very important feature of these networks is their adaptive nature, in which "learning by example" replaces "programming" in solving problems. Here, "learning" refers to the automatic adjustment of the system's parameters so that the system can generate the correct output for a given input; this adaptation process is reminiscent of the way learning occurs in the brain via changes in the synaptic efficacies of neurons. This feature makes these models very appealing in application domains where one has little or incomplete understanding of the problem to be solved, but where training data is available.
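The "automatic adjustment of the system's parameters" can be sketched for the simplest case, a single linear unit trained with delta-rule (least-mean-squares) updates. The training data, learning rate, and number of passes below are illustrative assumptions; the point is only that each example nudges the parameters so the error shrinks:

```python
def train(examples, lr=0.1, epochs=200):
    """Learning by example on one linear unit: for each (input, target)
    pair, nudge the weights and bias in the direction that reduces
    the output error (the delta rule)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in examples:
            y = w[0] * x[0] + w[1] * x[1] + b    # unit's current output
            err = target - y                      # how far off it is
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err                         # adjust toward the target
    return w, b

# Teach the unit to reproduce y = 2*x0 - x1 from examples alone.
data = [([1, 0], 2), ([0, 1], -1), ([1, 1], 1), ([2, 1], 3)]
w, b = train(data)
```

No rule "y = 2*x0 - x1" is ever programmed in; the unit recovers it purely from the training pairs, which is the sense in which learning replaces programming.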
One example would be to teach a neural network to convert printed text to speech. Here, one could pick several articles from a newspaper and generate hundreds of training pairs—an input and its associated, "desired" output sound—as follows: the input to the neural network would be a string of three consecutive letters from a given word in the text. The desired output that the network should generate could then be the sound of the second letter of the input string. The training phase would then consist of cycling through the training examples and adjusting the network parameters—essentially, learning—so that any error in output sound would be gradually minimized for all input examples. After training, the network could then be tested on new articles. The idea is that the neural network would "generalize" by being able to properly convert new text to speech.
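The generation of training pairs described above can be sketched directly: slide a three-letter window across a word and pair each window with the sound of its middle letter. The phoneme labels here are made-up placeholders, not a real pronunciation dictionary:

```python
def make_pairs(word, phonemes):
    """Build (input, desired output) training pairs for the
    text-to-speech example: the input is a window of three
    consecutive letters, and the desired output is the sound
    of the middle letter. phonemes[i] is the (hypothetical)
    sound label for word[i]."""
    pairs = []
    for i in range(len(word) - 2):
        window = word[i:i + 3]       # three consecutive letters
        target = phonemes[i + 1]     # sound of the second letter
        pairs.append((window, target))
    return pairs

pairs = make_pairs("network", ["n", "eh", "t", "w", "er", "r", "k"])
```

Cycling through many such pairs drawn from whole newspaper articles, and adjusting the network after each one, is the training phase; testing on windows from unseen articles probes generalization.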
Another key feature is the intrinsic parallel architecture, which allows for fast computation of solutions when these networks are implemented on parallel digital computers or, ultimately, when implemented in customized hardware. In many applications, however, they are implemented as programs that run on a PC or computer workstation.
Artificial neural networks are viable models for a wide variety of problems, including pattern classification, speech synthesis and recognition, adaptive interfaces between humans and complex physical systems, function approximation, image compression, forecasting and prediction, and nonlinear system modeling.
These networks are "neural" in the sense that they may have been inspired by the brain and neuroscience, but not necessarily because they are faithful models of biological, neural or cognitive phenomena. In fact, many artificial neural networks are more closely related to traditional mathematical and/or statistical models, such as nonparametric pattern classifiers, clustering algorithms, nonlinear filters and statistical regression models, than they are to neurobiological models.