Neuromorphic computing is an approach to efficiently solving complicated learning and cognition problems, as the human brain does, using electronics. To efficiently implement the functionality of biological neurons, nanodevices and their circuit implementations are exploited. Here, we describe a general-purpose spiking neuromorphic system that can solve on-the-fly learning problems, based on magnetic domain wall analog memristors (MAMs) that exhibit many different states with persistence over the lifetime of the device. The research includes micromagnetic and SPICE modeling of the MAM, CMOS neuromorphic analog circuit design of synapses incorporating the MAM, and the design of hybrid CMOS/MAM spiking neuronal networks in which the MAM provides variable synapse strength with persistence. Simulations show that the MAM-boosted neuromorphic system can achieve persistence, can demonstrate deterministic, fast, on-the-fly learning with the potential for reduced circuit complexity, and can provide increased capabilities over an all-CMOS implementation.

Here, we describe a brain-plausible neuromorphic on-the-fly learning system built with hybrid CMOS/MAM technologies, with the advantages of on-chip, online, and timing-perceptive learning without forgetting. This spiking neuromorphic system embeds spike timing–dependent plasticity (STDP) learning, which supports learning while in operation and as circumstances change. The proposed system is fully designed with transistor-level circuit details, and no external computing units are required for the demonstrated applications. This is the first STDP-based analog hardware learning implementation that learns to detect differences in signal timing. In this system, the network learns the temporal relation of the input sequence using the same neuron and synapse designs. A broad range of perceptual learning tasks can benefit from this brain-plausible design. Our design includes a physical model of the MAM, analog neuromorphic circuits with CMOS/MAM hybridization, and neuronal networks constituted by analog circuit models of neural elements. The spintronic device enables deterministic memristive behavior with ultra-low-energy operation, which has inspired a feasible VLSI (very large scale integration) neuromorphic design achieving an on-the-fly learning process with spike train signals.

Spintronic devices have been proposed as promising hardware candidates for neuromorphic computing due to their prominent properties such as nonvolatility, low power consumption, and compatibility with CMOS (complementary metal-oxide semiconductor) technologies (9–12). Numerous theoretical neuromorphic proposals have been explored based on spintronic devices, and some of them have been experimentally demonstrated (13–16). Recently, it has also been reported that a spintronic device incorporating a magnetic domain wall (DW) exhibits the functionality of an analog memristor (17, 18), promoting its implementation in neuromorphic circuits different from existing proposals. This magnetic domain wall analog memristor (MAM) potentially provides a uniform programming signal and long retention time, which is required for implementing persistent memory in analog synapse circuits.

Bioplausible neuromorphic systems exploiting the brain’s computational methods contain hardware realizations of neural plasticity and complex synaptic connections. Neuromorphic systems range from those built with custom asynchronous analog circuits to those built with conventional synchronous digital processors (1–4), and from those that mimic biological behavior precisely to those that mimic it coarsely. One of the main challenges in constructing these neuromorphic systems is the need for persistent memory embedded in the neural processing. This need, coupled with the advantages of small, low-power circuitry, self-assembly, and the ability to provide more three-dimensional connectivity, has led neuromorphic researchers to examine the use of nanotechnologies in conjunction with custom analog circuits (5–8).

RESULTS

Magnetic domain wall analog memristor

As the key component of our neuromorphic architecture, the characteristic behavior of the MAM is analyzed first. The structure of a MAM is illustrated in Fig. 1. It consists of a heavy metal (HM) layer, a magnetic free layer with perpendicular magnetic anisotropy (PMA), a tunnel barrier, and a magnetic fixed layer. The magnetic free layer hosts a magnetic DW that separates the spin-up (blue) and spin-down (red) regions. When an electrical current flows in the negative x direction through the HM, a y-polarized spin current is injected into the free layer via the spin Hall effect (Fig. 1) and drives the DW against the direction of the current. Because of pinning effects (both intrinsic and extrinsic), a critical current is required to initiate DW motion. Only a current with amplitude above the critical value can trigger DW motion; the critical current here is 0.1 mA (~1 × 10¹¹ A m⁻²). The tunnel magnetoresistance (TMR) of this device can be read out by a vertical (z direction) electrical current. Note that the reading current used here is well below the critical current for DW motion, so it does not affect the position of the DW. The TMR is determined by the relative magnetization direction between the two magnetic layers and thus depends on the exact position of the DW in the free layer. Because the DW motion is nearly continuous, subject to the sample shape, defects, and other variations, the TMR also changes continuously, imitating the continuously varying strength of an analog synapse.

Fig. 1 Schematic view of the MAM. Yellow arrows indicate the magnetization direction. An electrical current flowing in the x direction can induce DW motion in the magnetic free layer. The TMR of this device is read out using a vertical current (z direction). Inset: Calculated resistance of the device after injecting both positive and negative currents with an amplitude of 5 × 10¹¹ A m⁻² and a duration of 1 ns.
Combining the in-plane and out-of-plane currents, this MAM device achieves the functionality of an analog memristor. A fixed in-plane current pulse with amplitude above the critical value is used to move the DW. Each incoming current pulse changes the DW position and the corresponding resistance state, and the resistance state can be read out via the out-of-plane current. Here, we consider a magnetic free layer with geometry 1000 nm × 108 nm × 1 nm. A Néel-type magnetic DW is stabilized in the free layer. A series of micromagnetic simulations was performed to capture the microscopic dynamics of the MAM (for details of the micromagnetic simulation, see Methods). Furthermore, PMA variations, which are common in experiments and can affect device performance, were also considered in the simulations. Fifty MAM devices with different PMA variations were simulated and implemented in our circuit simulations. The calculated resistance (averaged over the 50 devices) as a function of the input current pulse is shown in Fig. 1 (inset) and is fitted to the reported experimental values (18). Depending on the current direction, the device resistance either increases or decreases quasi-linearly, similar to the Set/Reset functionality of an analog memristor. To demonstrate the potential of this novel spintronic analog memristor in neuromorphic circuits and systems, an integrated SPICE model is used for circuit simulation. We extracted the micromagnetic simulation results as a lookup table and implemented it in Verilog-A to create a SPICE model. The MAM is modeled as a four-terminal device, where two terminals serve as the resistor and the other two control the resistance Set/Reset. The resistance is decreased by the Set signal until a minimum value is reached and increased by the Reset signal until a maximum value is reached. The result is shown in Fig. 2D.
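The pulse-driven behavior described above can be summarized in a minimal behavioral sketch. The ~0.1-mA critical current is taken from the text; the R_min/R_max bounds, the number of resolvable DW positions, and the per-pulse step are illustrative assumptions, not values from the Verilog-A model:

```python
class MAM:
    """Behavioral sketch of a magnetic domain wall analog memristor.

    The DW position sets the fraction of the free layer that is
    antiparallel to the fixed layer, so the resistance varies
    quasi-linearly between r_min and r_max. Only in-plane current
    pulses above the critical current move the DW, and pinning near
    both ends clamps the position. All numbers except the critical
    current are assumed for illustration.
    """

    def __init__(self, r_min=5e3, r_max=10e3, steps=50, i_crit=1e-4):
        self.r_min, self.r_max = r_min, r_max
        self.steps = steps        # resolvable DW positions (assumed)
        self.i_crit = i_crit      # critical current, ~0.1 mA (text)
        self.pos = steps // 2     # DW starts mid-track

    def pulse(self, i_pulse):
        """Apply one fixed-width in-plane current pulse.

        Positive current -> Set (resistance decreases); negative
        current -> Reset (resistance increases). Sub-critical pulses
        leave the DW pinned in place.
        """
        if abs(i_pulse) < self.i_crit:
            return self.resistance()
        step = -1 if i_pulse > 0 else 1
        # end pinning: the position saturates at either extreme
        self.pos = min(max(self.pos + step, 0), self.steps)
        return self.resistance()

    def resistance(self):
        """Read out the TMR; the small read current does not move the DW."""
        frac = self.pos / self.steps
        return self.r_min + frac * (self.r_max - self.r_min)
```

A Set pulse above the critical current lowers the resistance by one discrete step, a sub-critical pulse leaves it unchanged, and repeated Reset pulses drive it to the clamped maximum.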
The synaptic plasticity of the following neuromorphic circuits is based on this device.

Fig. 2 Synapse circuit implementation and simulation results. (A) BioRC synapse circuit. NT, neurotransmitter quantity; Re, reuptake control; KR, K+ channel receptor quantity control. (B) Simulation results of the synapse circuit with 45-nm CMOS. (C) Resistive multistate synapse circuit. (D) Simulation results of the resistive multistate synapse circuit with a hybrid of 45-nm CMOS and MAM.

Multistate synapse circuit

Figure 2 (A and B) shows a Biomimetic Real-Time Cortex (BioRC) analog CMOS excitatory synapse circuit in 45-nm technology (19) and its transient simulation results. The input action potential (AP) consists of spikes generated by a CMOS axon hillock circuit (20) with a maximum amplitude of 0.65 V. The excitatory postsynaptic potential (EPSP) magnitude of this particular synapse circuit is approximately 14% of the AP, with about five times the duration of the AP. This simplified BioRC synapse design realizes short-term memory through the duration of the EPSP and can support long-term memory by adjusting the input of the neurotransmitter knob NT, which controls the neurotransmitter concentration in the synapse. Other BioRC synapses also allow control of ion channel receptor concentration, providing another memory mechanism (21). However, the BioRC synapse as implemented in CMOS does not provide persistent memory unless the NT and receptor controls are generated continuously, because charge leakage occurs. To biologically mimic a multistate human brain synapse, the resistance properties of the MAM are exploited in this neuromorphic system. Figure 2 (C and D) shows the circuit design and simulation results. The Set/Reset signal varies the resistance of the MAM, and the output voltage of the synapse circuit is applied to the resistance terminal of the MAM, producing an excitatory postsynaptic current (EPSC) output. A load capacitor is connected at the MAM output to measure the EPSP variation in voltage. The input AP is generated by an axon hillock circuit, and the Set/Reset signal is applied as a stimulus. The resistance of the MAM changes deterministically with each Set/Reset signal, which is a fixed current pulse with ±0.5-mA amplitude and 1-ns duration generated by a pulse generator.
The MAM also contains pinning areas for the DW near its ends so that, when the resistance of the MAM reaches its maximum (minimum) value, it no longer responds to the Reset (Set) signal.
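The conversion from synapse voltage to EPSC to EPSP on the load capacitor can be illustrated with a simple discretized RC sketch. The element values, time step, and waveform below are illustrative assumptions (the 10 MΩ/500 fF pair echoes the dendritic RC element quoted later), not the actual circuit values:

```python
def epsc_trace(v_syn, r_mam, c_load=500e-15, r_leak=10e6, dt=1e-10):
    """Sketch of the hybrid synapse output stage (Fig. 2C).

    The MAM resistance r_mam turns the synapse output voltage v_syn
    into an EPSC by Ohm's law, which charges the load capacitor; a
    leak resistor slowly discharges it. A lower r_mam (stronger
    synapse) therefore yields a larger EPSP. Forward Euler
    integration with an assumed time step dt.
    """
    v_cap, out = 0.0, []
    for v in v_syn:
        i_epsc = (v - v_cap) / r_mam   # current through the MAM
        i_leak = v_cap / r_leak        # leak to ground
        v_cap += (i_epsc - i_leak) * dt / c_load
        out.append(v_cap)
    return out
```

Sweeping r_mam between its Set and Reset extremes modulates the EPSP peak, which is the synaptic-strength mechanism the multistate synapse relies on.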

On-chip learning method

A learning method, STDP, is embedded in this spiking neuromorphic hardware system. STDP is implemented as a circuit in 45-nm CMOS and embedded with a synapse and a MAM to form the basic learning element shown in Fig. 3.

Fig. 3 Illustration of the basic STDP learning element implementation, including pre/postsynaptic neurons simplified to the axon hillock, the synapse circuit with MAM, the STDP learning circuit, a current mirror for isolation, and a capacitor for current integration.

STDP learning

In a spiking neuromorphic system, it is natural to use biomimetic STDP as the learning method. In STDP, the weight dynamics depend not only on the current weight but also on the timing relationship between presynaptic and postsynaptic APs (22). This means that, besides the synaptic weight, each synapse keeps track of the recent presynaptic spike history. In our STDP model, every time a presynaptic spike arrives at the synapse, it causes a charge accumulation X_pre in the diffusion capacitance of a transistor, and the charge then decays gradually. Every time the postsynaptic neuron spikes, a charge X_post is likewise accumulated and then decays. When a postsynaptic spike arrives at the synapse through back propagation from the axon hillock, the weight change Δω is calculated on the basis of the presynaptic charge. A simplified STDP mathematical model is given below:

Δω = −η(−1) ∂X_pre/∂t,  if X_post > C
Δω = 0,                 if X_post ≤ C        (1)

where η is the amount of weight change at each time step of a synapse, X_pre is the accumulated charge from the pre signal, X_post is the accumulated charge from the post signal, and C is a constant greater than X_pre. The assumption behind this equation is that both X_pre and X_post rise instantly and then decay over time. If the post signal arrives before the pre signal decays to 0, the weight of this synapse will be increased by η. If only the post signal arrives, that is, the output neuron fires without any presynaptic neuron having fired within a finite time window, the weight of this synapse will be decreased by η. The novel biomimetic circuit implementation of this STDP equation is shown in Fig. 4A; only eight transistors and two pulse generators are used in the circuit. The circuit operates on charges, thus avoiding continuous current to save power. Each synapse in this design has its own STDP circuit, so the power consumption of memory operations can be reduced.
η is analogous to the resistance change of the MAM for each pulse. X_pre and X_post are analogous to electrical charges, and these charge signals decay over time through biasing transistors connected to ground. If and only if the pre signal arrives first, the connected post-gated transistor will be charged. Then, if the post signal arrives successively, the Set pulse will be triggered and the Reset signal is inhibited by discharging. The resistance of the MAM will be decreased by one Set pulse to increase the strength of this synapse. If only the post signal arrives, the Set signal will not be triggered, because the precharging of the pre transistor gate is absent, and the post signal will trigger the Reset pulse without discharging inhibition. The resistance of the MAM will be increased by one Reset pulse to decrease the strength of this synapse. All the charging nodes in this circuit are discharged by a constant bias transistor to implement the differential timing factor dQ/dt. In the mathematical model, the amplitude of Δω is a continuous value depending on the product of η and the differential term. In the circuit implementation, however, the amplitude of Δω is a discrete value: the resistance change of the MAM for each pulse. A positive-edge input triggers this circuit to generate one current pulse output with fixed amplitude and duration, in this case 0.5 mA and 1 ns. The simulation results shown in Fig. 4B present three scenarios:

1) Scenario 1: If both pre and post spikes arrive in sequence and the timing interval is short enough, the Set pulse is triggered and the resistance of the MAM is decreased. If the resistance of the MAM reaches its minimum value, the MAM will no longer respond to the Set pulse.

2) Scenario 2: If both pre and post spikes arrive in sequence with a timing interval slightly longer than in the first scenario, neither the Set nor the Reset signal will be triggered. This scenario is shown in Fig. 4B around 70 ns. The reason for this behavior is that the decayed X_pre cannot serve as a source to enable the Set pulse trigger, but it is strong enough to discharge X_post and thereby inhibit the post signal from triggering the Reset signal.

3) Scenario 3: If only the post signal arrives, or the time interval between the pre and post signals is long enough, the Reset signal is triggered and the resistance is increased. If the resistance of the MAM reaches its maximum value, the MAM will no longer respond to the Reset pulse.

Fig. 4 STDP circuit implementation and simulation results. (A) STDP learning circuit. (B) Simulation results of the STDP learning circuit and the MAM response.
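The three scenarios can be condensed into a small decision sketch, with a decaying exponential trace standing in for the transistor-gate charge X_pre. The decay constant and both thresholds are illustrative assumptions, not values extracted from the circuit:

```python
import math

def stdp_pulse(dt_post_minus_pre, tau=5e-9, set_th=0.5, inhibit_th=0.1):
    """Decide the Set/Reset pulse for one pre/post spike pair.

    dt_post_minus_pre: seconds between the pre and the following post
    spike, or None if no pre spike preceded the post spike. tau is
    the assumed decay constant of the X_pre charge; set_th and
    inhibit_th are assumed trigger/inhibit levels.

    Returns 'set' (potentiate: MAM resistance down), 'reset'
    (depress: resistance up), or 'none'.
    """
    if dt_post_minus_pre is None:
        return 'reset'          # scenario 3: post spike alone
    # X_pre decays from 1 toward 0 between the pre and post spikes
    x_pre = math.exp(-dt_post_minus_pre / tau)
    if x_pre > set_th:
        return 'set'            # scenario 1: short interval
    if x_pre > inhibit_th:
        return 'none'           # scenario 2: Set blocked, Reset inhibited
    return 'reset'              # scenario 3: interval too long
```

Short intervals potentiate, intermediate intervals leave the weight unchanged, and long intervals or lone post spikes depress, matching the Fig. 4B scenarios.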

Neuronal networks

We connect neuron and synapse circuits in a networked manner to create a neuronal network. Unlike traditional artificial neural networks, which process mathematical models of neural elements on digital processors, our neuromorphic hardware models an asynchronous neuronal network with custom analog circuits that model biological neurons to first order. Both pattern recognition and timing-perceptive neuronal networks are presented in this section, and the details are discussed below.

Feed-forward neuronal network for pattern recognition

The basic model of the feed-forward neuronal network is shown in Fig. 5A. Figure 5C shows the configuration of the feed-forward neuronal network, which consists of 25 input neurons, 20 output neurons, and 500 synapses with initial random strengths. The randomness generation circuit is introduced in (23). The input neuron is simplified to the axon hillock circuit shown in Fig. 3. All the EPSC outputs of the synapses are connected to current mirrors (24), as shown in Fig. 3. All isolated EPSC outputs from the current mirrors are connected together to realize linear current addition. The summed current is fed into an RC (resistance-capacitance) element (10 MΩ/500 fF) that constitutes a simplified dendritic arbor and converts the current to a voltage output. The output of the dendritic arbor is then connected to the axon hillock of each output neuron. The pattern in Fig. 5B is fed into the neuronal network, and the simulation results are shown in Fig. 5D. The pattern is 5 × 5 pixels with 1-bit binary values; it is encoded as 0.7-V pulses of 0.1-ns duration and repeated six times with 15-ns intervals between repetitions. These pulses activate the input neurons, and the network then learns the pattern adaptively. After the first epoch, five output neurons fired spikes in response to this pattern. If the input pattern stimulates an output neuron, the synapses that received both pre and post signals will be strengthened according to STDP learning scenario 1 or will keep the same strength according to scenario 2, while the synapses that received only the post signal will be weakened according to scenario 3.

Fig. 5 Neuronal network configuration for pattern recognition. (A) Feed-forward neuronal network example. Each input neuron receives image pattern pulses from one pixel, generating a presynaptic spike for the synapses. The STDP synapses are initialized with random weights and receive pre and post spikes. The output neurons receive and integrate EPSPs using current adders. If the voltage of the capacitor accumulating the EPSPs exceeds 0.4 V, the output neuron generates a post spike. The network has m input neurons, n output neurons, and m × n synapses. In this example, three input neurons, three output neurons, and nine synapses are shown. (B) Pattern example input to the neural network. The pattern is converted to pulses and fed to the input neurons six times successively. (C) Network configuration of 25 input neurons, 500 synapses, and 20 output neurons. This network is simulated with HSPICE at the transistor level. (D) Simulation results of the pattern recognition. Neurons 8, 14, 15, and 17 learned this pattern, while neuron 10 did not. Other nonresponding output neurons are not shown.

On the basis of the randomization of initial synapse strengths, output neurons 8, 15, and 17 have relatively stronger initial states and favorable timing dynamics, so they learn the pattern during the first trial. Output neuron 10 has fair initial synapse states, but the delay between its pre and post signals is too long, triggering weakening rather than strengthening of its synapses. Output neuron 14 weakens its synapses during the first trial like output neuron 10, but the timing is favorable for trials 3 and 4. This is because STDP learning scenario 2 occurs for neuron 14 during the first trial; neuron 14 therefore still has a high enough EPSP to trigger the output neuron, and the STDP then strengthens the synapses so that the pattern is eventually learned. After training with six trials, this pattern is stably recognized by four output neurons. A more detailed simulation result of learning three successive patterns can be found in the Supplementary Materials. Because unsupervised learning is implemented here, multiple neurons may fire for the same pattern.
The trained neurons firing for the same pattern can be clustered into one class using traditional clustering techniques such as the winner-take-all method (25) or the normalization method (26). The pattern capacity of this neuronal network depends on the number of post neurons. For instance, in the above network configuration, the maximum number of recognizable patterns is 20, and each recognizable pattern has a certain variation tolerance. For more patterns, this network can be scaled with more post neurons. For deeper scaling of this type of network, adding more layers can increase its accuracy; however, the extra layers merely implement an averaging function for a set of patterns or features.
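The trial-by-trial dynamics above can be sketched at the event level. This is a deliberate abstraction of the transistor-level HSPICE simulation: the current-mirror adder and dendritic RC are collapsed into a weighted sum, and the weight units, input scale factor, and clamp bounds are illustrative assumptions; only the 0.4-V firing threshold and discrete per-pulse weight step come from the text:

```python
def run_trial(pattern, w, threshold=0.4, eta=1, w_min=1, w_max=10):
    """One presentation of a binary pattern to the feed-forward net.

    pattern: list of 0/1 pixel values (the pre spikes)
    w:       w[i][j] = strength of the synapse from input i to output j
    Returns the list of output-neuron indices that fired.
    """
    n_in, n_out = len(w), len(w[0])
    fired = []
    for j in range(n_out):
        # summed EPSP, with an assumed 0.01 V-per-weight-unit scale
        epsp = sum(pattern[i] * w[i][j] for i in range(n_in)) * 0.01
        if epsp > threshold:
            fired.append(j)
    # STDP: for firing outputs, strengthen synapses that saw a pre
    # spike (scenario 1) and weaken those that saw only the post
    # spike (scenario 3); MAM end pinning clamps the weights.
    for j in fired:
        for i in range(n_in):
            step = eta if pattern[i] else -eta
            w[i][j] = min(max(w[i][j] + step, w_min), w_max)
    return fired
```

Repeating the trial (the six pattern presentations in the text) drives the active synapses of responding neurons toward the clamped maximum, which is the stabilization behavior seen in Fig. 5D.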