For decades, Hava Siegelmann has explored the outer reaches of computing with great curiosity and great conviction.
The conviction shows up in a belief that there are forms of computing that go beyond the one that has dominated for seventy years, the so-called von Neumann machine, based on the principles laid down by Alan Turing in the 1930s. She has long championed the notion of “Super-Turing” computers with novel capabilities.
And curiosity shows up in various forms, including her most recent work, on “neuromorphic computing,” a form of computing that may more closely approximate the way that the brain functions.
Siegelmann, who holds two appointments, one with the University of Massachusetts at Amherst as professor of computer science, and one as a program manager at the Defense Advanced Research Projects Agency, DARPA, sat down with ZDNet to discuss where neuromorphic computing goes next, and the insights it can bring about artificial intelligence, especially why AI succeeds and fails.
Today’s deep learning form of AI, for all its achievements, has serious shortcomings, in Siegelmann’s view.
“There are many issues with deep learning,” says Siegelmann. “You see the brittleness of it: If it is presented with a new situation, it won’t know what to do. Generalization is very thin with deep learning; only when the new data has the same statistical properties as the training and validation data will generalization work.”
That critique of today’s AI is by no means unique to Siegelmann, but her path to that realization is a bit different from that of other researchers. In nearly thirty years of published work, Siegelmann has pointed out how AI is hamstrung by the limited nature of the computers on which AI is built.
The von Neumann machine, inspired by Turing’s early writing on “universal computers,” and by the neurophysiological studies of Warren McCulloch and Walter Pitts in the 1940s, is based on a program that is designed and then run. That means the computer never adapts, says Siegelmann.
And that cannot be a good basis for intelligence, since all forms of life in the real world show adaptation to a changing environment.
A consistent thread in Siegelmann’s work is the contention that Turing himself realized this. Siegelmann has produced something like a secret history of Turing, showing he was not satisfied with the basic digital computer and was striving for something else.
“Turing’s true passion — to find a more appropriate machine to describe nature and intelligence — has been almost universally overlooked,” Siegelmann wrote in a 2013 paper, “Turing on Super-Turing and adaptivity.” The computer industry ignored Turing’s fascination with the brain for 70 years, resorting instead to simply building somewhat better versions of the digital computer design he theorized in the 1930s.
None other than Geoffrey Hinton, one of three winners of this year’s ACM Turing Award for lifetime achievement, and a dean of deep learning, seems to agree with Siegelmann about Turing, at least in part.
As Hinton remarked in an interview in 2017, “I think in the early days, back in the 50s, people like [John] von Neumann and Turing […] were far more inspired by the brain. Unfortunately, they both died much too young, and their voice wasn’t heard.”
Siegelmann’s conviction and curiosity have produced a bevy of work lately on so-called spiking neural nets, a form of neuromorphic computing. Such networks of artificial neurons only perform work when a stimulus makes them spike in electrical potential. The field, first conceived by Caltech professor Carver Mead in the early 80s, has been an obsession for firms including IBM for years, with little practical result.
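To make the spiking idea concrete, here is a minimal sketch of a leaky integrate-and-fire neuron, a common textbook simplification and not a model taken from Siegelmann's papers or from any vendor's chip. The function name `simulate_lif` and the threshold, leak, and reset values are illustrative assumptions.

```python
def simulate_lif(inputs, threshold=1.0, leak=0.9, reset=0.0):
    """Return a list of 0/1 spikes for one leaky integrate-and-fire neuron.

    All parameter values are illustrative assumptions, not measured constants.
    """
    potential = reset
    spikes = []
    for current in inputs:
        # The stored potential decays a little, then the new input is added.
        potential = leak * potential + current
        if potential >= threshold:
            # Crossing the threshold emits a spike and resets the neuron.
            spikes.append(1)
            potential = reset
        else:
            spikes.append(0)
    return spikes


quiet = simulate_lif([0.0] * 10)   # no stimulus at all
driven = simulate_lif([0.4] * 10)  # a steady, modest stimulus

print("spikes without stimulus:", sum(quiet))
print("spike train with stimulus:", driven)
```

With no input the neuron never fires, which is the property that makes such designs attractive when energy is scarce: output is produced only when a stimulus pushes the potential over the threshold.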
Chip giant Intel has a longstanding project called “Loihi” to make spiking nets viable, though skeptics abound. Facebook’s Yann LeCun, who pioneered the development of convolutional neural networks, has said there are no viable ways to train spiking neural networks.
Siegelmann is aware of the critique and agrees there are problems; she shied away from spiking nets until recently. “They are a bit stuck in a local minimum,” she says of spiking neuron researchers. “I don’t know why they haven’t been bolder in exploring new possibilities.”
As for Siegelmann, her reasons for exploring spiking are two-fold. “One is that we always talk about neural computation, but we don’t understand how it’s working; and two, if we want to compute at the edge, with minimal battery power, spiking neurons can work better for that, ultimately [than artificial neural networks].”
Siegelmann is convinced there are clues to how the brain functions in the way that energy is managed in a system of spiking neurons. “The issue of energy is basic for everything,” she contends.
“Realistic paradigms of computation require consideration of resource constraints, and energy is the leading constraint,” she explains.
In Siegelmann’s view, spiking neurons in the standard models are too fixated on the spike in energy that makes a neuron contribute to an output from the network. Not enough attention is paid to another factor, the inhibition of spikes by competing neurons.
Inhibitions are what allow some signals to quiet other signals, to prevent a constant avalanche of neural firing. “Spiking networks cannot assume only excitatory neurons,” she contends.
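The quieting effect of inhibition can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the model from Siegelmann's lab: three simple integrate-and-fire neurons receive constant drives, and each spike subtracts a fixed amount from every other neuron's potential. The function name `simulate_population` and all numeric values are hypothetical.

```python
def simulate_population(drives, steps, inhibition, threshold=1.0, leak=0.9):
    """Count spikes per neuron when each spike inhibits all other neurons.

    `inhibition` is the amount subtracted from every other neuron's potential
    whenever a neuron fires (0.0 means excitation only). Values are illustrative.
    """
    n = len(drives)
    potentials = [0.0] * n
    counts = [0] * n
    for _ in range(steps):
        # Leak, then integrate each neuron's constant input drive.
        potentials = [leak * v + d for v, d in zip(potentials, drives)]
        fired = [i for i, v in enumerate(potentials) if v >= threshold]
        for i in fired:
            counts[i] += 1
            potentials[i] = 0.0  # reset after a spike
        # Every spike pushes down the potential of all the other neurons,
        # but never below zero.
        for i in fired:
            for j in range(n):
                if j != i:
                    potentials[j] = max(0.0, potentials[j] - inhibition)
    return counts


drives = [0.6, 0.4, 0.3]
excitation_only = simulate_population(drives, steps=20, inhibition=0.0)
with_inhibition = simulate_population(drives, steps=20, inhibition=1.0)

print("excitation only:", excitation_only)
print("with inhibition:", with_inhibition)
```

Without inhibition, all three neurons fire regularly. With it, the most strongly driven neuron silences its competitors, a crude winner-take-all effect that hints at how inhibitory signals keep firing from cascading.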
In a paper in May of this year, Siegelmann theorized that neurons in the initial stages of a spiking network compete with other neurons by sending “inhibitory” signals. That paper, lead-authored by Daniel J. Saunders of the lab that Siegelmann set up at U Mass., the Biologically Inspired Neural and Dynamical Systems Laboratory, showed that there is something important going on with “local” clusters of neurons. It is something not unlike “convolutions” in the convolutional neural network of machine learning, which learn features of data at various places in a data sample by having signals from that one area repeatedly emphasized in the network. (Siegelmann participated in the research at U Mass., before coming to DARPA in 2016, and the work is unrelated to her work for DARPA.)
Such locally-connected spiking neural networks, as they are called, cannot yet compete with deep learning’s best results, but they preserve important aspects of the “topology” of neurons in the brain, says Siegelmann. Inhibitory neurons work in concert with what are called “glia cells,” non-neuronal cells in the nervous system, which form “energy pools.”
“The use of more than one kind of inhibitor is crucial for understanding of brain-like dynamics,” says Siegelmann. “It enables developmental self-assembly into small clusters,” a crucial structural feature left out of typical spiking neural nets.
Siegelmann is not alone in being a new convert to spiking neurons. Earlier this year, Terry Sejnowski of the Salk Institute in La Jolla, California, who is a pioneer in machine learning and was a mentor to Geoffrey Hinton, joined with researchers in his lab to publish research on spiking nets in which learning was accomplished via a transfer of parameters from an ordinary deep learning network. Sejnowski told ZDNet he is enthusiastic about the potential for spiking nets down the road.
Despite her critique of deep learning, Siegelmann has produced work that may explain some of its results. In a 2015 paper in the Nature journal Scientific Reports called “The Global Landscape of Cognition,” Siegelmann and others in her lab showed that fMRI data, which tracks blood flow in the brain, reveals a hierarchy of brain activity that is not unlike the successive layers of activity of deep neural networks that contribute to a “gradient” of higher and higher levels of abstraction.
In a paper from this May, titled “Deep Neural Networks Abstract Like Humans,” Siegelmann built upon that 2015 research, by predicting how deep learning systems perform on inference tests. “Our findings reveal parallels in the mechanism underlying abstraction in DNNs and those in the human brain,” wrote Siegelmann and colleagues.
Still, there is the nagging problem that deep learning is static: its weights don’t adapt to new data in the wild. That shortcoming is an artifact, in Siegelmann’s view, of the fact that machine learning is crafted within the von Neumann machine, and its static, limited programming model.
That shortcoming is the subject of Siegelmann’s primary work at DARPA, the “Lifelong Learning Machines” project, or “L2M,” under the aegis of DARPA’s Microsystems Technology Office. (Siegelmann currently holds two program manager designations, with the other one being at the Information Innovation Office.)
“Lifelong Learning is the next generation of AI,” says Siegelmann. The big problem in deep learning, part of its failure to adapt, is that artificial networks can lose track of the weight updates that previously constituted learned information, a phenomenon known as “catastrophic forgetting.” The main goal of L2M, she says, is “a system that can learn from experience, and apply experiences toward new circumstances so that it is not as brittle.”
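Catastrophic forgetting can be shown in miniature. The following is a deliberately tiny sketch, assuming a model with a single weight trained by plain gradient descent; it is not drawn from the L2M program, and the function names `train` and `error`, the two "tasks," and the learning rate are all hypothetical.

```python
def train(weight, slope, steps=200, lr=0.1):
    """Fit y = weight * x to the target y = slope * x, using the point x = 1."""
    for _ in range(steps):
        # Gradient of the squared error (weight - slope)^2 with respect to weight.
        weight -= lr * 2 * (weight - slope)
    return weight


def error(weight, slope):
    """Squared error of the model on the task defined by `slope`."""
    return (weight - slope) ** 2


# Task A: learn y = 2x. Task B: learn y = -2x.
w = train(0.0, slope=2.0)
error_a_before = error(w, 2.0)

# Training on task B reuses, and overwrites, the same weight.
w = train(w, slope=-2.0)
error_a_after = error(w, 2.0)

print("error on task A after learning A:", error_a_before)
print("error on task A after then learning B:", error_a_after)
```

Because the second round of training moves the only weight to suit task B, the error on task A climbs from nearly zero to a large value. Real networks have millions of weights, but the same overwriting is what the term describes.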
The implications go beyond merely improving performance on test sets. DARPA has identified AI as a frontier of cyber-attacks. Traditional cyber-security defenses are designed for fixed assaults, things such as viruses and “ROP gadgets” that are written once and distributed broadly for a specific kind of attack. But AI brittleness to new data means deep learning networks may face a continually shifting threat environment in so-called adversarial attacks.
For that reason, Siegelmann oversees another project, this one within the Information Innovation Office, called “Guaranteeing AI Robustness against Deception,” or GARD. As she describes the goal, “there can be a million different attacks, but our defenses should be sufficient for any attack.” A construct such as L2M can be the kind of AI that can adapt, one where “we think about types of defenses that change over time.” An initial meeting of GARD is planned for December, with results expected to materialize a year later.
Of course, any venture such as L2M or GARD that aims to surpass the von Neumann limitation would need a new machine, a departure from today’s available computer hardware. L2M is focused on “new ideas and new software,” says Siegelmann, but, “for complete capabilities, we will need new hardware as well.” It’s not clear what that new hardware should be, she concedes.
Most of the activity of computing giants such as Intel, and startups such as Graphcore, is focused on simply speeding up the matrix multiplication calculations that underlie deep learning. An entire industry is at the moment transfixed by the commercial opportunity of making deep learning just a bit better on benchmarks, something of which Siegelmann is well aware.
Siegelmann is not alone in looking for radical solutions. The “Electronics Resurgence Initiative,” or ERI, is a DARPA project that aims to elevate the US’s status in computing hardware broadly, which some worry is falling short of where it should be. As Siegelmann sees the ERI, “we need new ideas so that what we do here is better than what is done today in hardware.”
Pushing very different avenues of inquiry is a challenge in a world dominated by the state of the art in deep learning. But Siegelmann’s curiosity seems able both to incorporate those trends and to reach for something lying beyond them. At least, stasis is not a great option. “If you only want to continue the state of the art in AI, there is no reason to think about anything but deep learning,” says Siegelmann. “And you will stay there forever.”