The subset sum problem (SSP) is a typical nondeterministic-polynomial-time (NP)–complete problem that is hard to solve efficiently with conventional computers. Photons have the unique features of high propagation speed, strong robustness, and low detectable energy level, and are therefore promising candidates to meet the challenge. Here, we present a scalable, chip built-in photonic computer that efficiently solves the SSP. We map the problem into a three-dimensional waveguide network through a femtosecond laser direct writing technique. We show that the photons sufficiently dissipate into the network and search for solutions in parallel. In the case of successive primes, our approach exhibits a dominant superiority in time consumption even compared with supercomputers. Our results confirm the ability of light to realize computations intractable for conventional computers, and suggest the SSP as a good benchmarking platform for the race between photonic and conventional computers on the way toward “photonic supremacy.”

NP-complete problems (1) are typically defined as the problems solvable in polynomial time on a nondeterministic Turing machine (NTM), which implies that these problems are computationally hard on conventional electronic computers, a general type of deterministic Turing machine. The subset sum problem (SSP), with practical applications in resource allocation (2), is a benchmark NP-complete problem (3), and its intractability has been harnessed in cryptosystems resistant to quantum attacks (4, 5). Given a finite set S of N integers, the SSP asks whether there is a subset of S whose sum is equal to a target T. Apparently, the number of subsets grows exponentially with the problem size N, which leads to an exponential time scaling and thus strongly limits the size of the problems that can be tackled in practice.

Despite the immense difficulty, researchers have attempted to solve NP-complete problems in polynomial time with polynomial resources. A memcomputing machine (6, 7) as powerful as an NTM has been demonstrated, although this ambitious claim does not hold in a realistic environment with inevitable noise (8). Designs of an NTM, in which the magical oracles (1, 9, 10) are realized by simultaneously exploring all computation paths, have been proposed (11, 12). At the cost of space or material, parallel exploration provides an alternative way to decrease time consumption. As time is irreversible, not reusable, and entirely beyond our control, it is reasonable to trade physical resources for it. Besides the above NTM proposals, similar measures have been taken, for instance, in increasingly powerful electronic supercomputers integrating ever more processors (13) and in molecule-based computation using large quantities of DNA or motor molecules (14–18). Furthermore, optimized algorithms have been applied to specific instances (19–21).

Although improvements have been made, conventional electronic computers are ultimately limited by the heat dissipation problem (16), which is also a possible limitation for memcomputing machines consisting of commercial electronic devices (8). Molecule-based computation is limited by slow movement (16–18) or long reaction processes (14, 15). Quantum computation is still hindered by decoherence and scalability (22, 23). Other proposals remain at the stage of theory (11, 12, 24–26). However, we notice that photons have been extensively applied in proof-of-principle demonstrations of supercomputing (27) even without quantum speed-up, including an NP problem such as prime factorization (28) and NP-complete problems such as the traveling salesman problem (29), the Hamiltonian path problem (30–32), and the dominating set problem (33). The #P-complete problem of boson sampling (34–39), other computational functions (40), and algorithms (41, 42) have also been demonstrated in a photonic regime. These successful applications imply that photons are excellent potential candidates for solving the SSP.

Here, we present a photonic computer constructed with chips serving as processing units to solve the SSP in a physically scalable fashion. Like the current signal in electronic computers or the molecule in molecular computers, the photons contained in the optical source are treated as individual computation carriers. They travel in chips along buried waveguide networks to perform parallel computations. Specific instances of the problem are encoded into the networks according to particular rules. The existence of a target sum is judged by the arrival of photons at the corresponding output port of the network. We further investigate the scalability and time consumption of the approach, showing the photon-enabled advantages.
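To make the exponential scaling of the SSP concrete, a brute-force enumeration over all 2^N subsets can be sketched as follows (a minimal illustration; the function names are ours, not part of the experiment):

```python
from itertools import combinations

def subset_sums(s):
    """Enumerate every subset of s and collect the attainable sums.

    Brute force over all 2**len(s) subsets, illustrating the
    exponential scaling that makes the SSP hard for large N.
    """
    sums = set()
    for r in range(len(s) + 1):
        for combo in combinations(s, r):
            sums.add(sum(combo))
    return sums

def has_subset_sum(s, target):
    """Decision version of the SSP: does any subset of s sum to target?"""
    return target in subset_sums(s)

# The instance {3, 7, 11} used later in the experiment:
S = [3, 7, 11]
print(sorted(subset_sums(S)))   # [0, 3, 7, 10, 11, 14, 18, 21]
print(has_subset_sum(S, 14))    # True (3 + 11 = 14)
```

Doubling N squares the number of subsets to examine, which is exactly the blow-up a deterministic machine cannot avoid without additional structure.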

RESULTS

Configuration of the photonic computer for the SSP

The proposed photonic computer solving the SSP can be classified as a non–von Neumann architecture (see the Supplementary Materials for its role in the evolution of computers). As shown in Fig. 1A, the photonic computer consists of an input unit, a processing unit, and an output unit. The input unit generates horizontally polarized photons at 810 nm. The photons are then coupled into the processing unit, where they dissipate into the waveguide network and execute the computation task. After the photons are emitted from the processing unit, the evolution results are read out by the output unit. Here, the processing unit is the analog of the central processing unit (CPU) of an electronic computer and plays the key role in the computation. In the following, we discuss the design of the processing unit from the mathematical and the physical implementation aspects to illustrate its capability of solving the SSP.

Fig. 1 Schematic of the design and setup. (A) A power-adjustable, horizontally polarized optical source is guaranteed by the quarter-wave plate (QWP), half-wave plate (HWP), and polarization beam splitter (PBS) in the input unit. The photons at 810 nm are prepared and coupled into the network in the processing unit and then travel to generate all possible subset sums. The evolution results at the output ports are retrieved by the charge-coupled device (CCD) to testify the existence of the corresponding sums. (B) The abstract network for the specific instance {2, 5, 7, 9} is composed of three different kinds of nodes representing split junctions, pass junctions, and converge junctions. Split junctions (hexagonal nodes) divide the stream of photons into two portions: one portion moves vertically, and the other travels diagonally. Pass junctions (circular white nodes) allow the photons to proceed along their initial directions. Converge junctions (circular yellow nodes) transfer photons from diagonal lines to vertical lines. Although the circular yellow nodes overlap with the hexagonal nodes in the abstract network, they are physically separate, as shown in (A). Photons traveling diagonally from a split junction to the next split junction represent including an element in the summation. The value of the element is equal to the number of junctions between two subsequent rows of split junctions, as denoted by the integers on the left. The generated subset sums are equal to the spatial positions of the output signals, as the port numbers denote. (C) The x-y view of the top left corner of the waveguide network in (A) and the abstract network in (B), composed of the three basic junctions whose x-z views are shown in (D) to (F). The split junction is realized by a modified three-dimensional beam splitter in which a coupling distance of 10 μm, a coupling length of 1.8 mm, and a vertical decoupling distance of 25 μm are deliberately selected, leading to a desirable splitting ratio. The unbalanced output of the split junction, revealed by the intensity distribution in (D), is designed to compensate for the bending loss caused by the subsequent arcs cm and nf in (C). The converge junction is almost a mirror image of the split junction, except with a different coupling length of 3.3 mm. The residual in port g is small enough to be ignored. A vertical decoupling distance of 25 μm guarantees an excellent pass junction with an extinction ratio of around 24 dB, as the intensity distribution in (F) presents.

The processing unit can be represented by an abstract network composed of nodes and lines, primarily based on the proposal of Nicolau et al. (16), while the physical implementation has to be designed to fit integrated photonics. As the network for the specific instance {2, 5, 7, 9} in Fig. 1B shows, there are three different types of nodes, representing split junctions, pass junctions, and converge junctions. Note that although the circular yellow nodes overlap with the hexagonal nodes in the abstract network, they are physically separate, as the waveguide network in Fig. 1A presents. Once the photons enter the network from the top node, the computation process is activated. The photons are split into two portions at the hexagonal nodes (split junctions), traveling vertically and diagonally. When meeting the circular white nodes (pass junctions), the photons proceed along their original directions. Meanwhile, the circular yellow nodes (converge junctions), located at the end of the diagonal routes that start from the former row of hexagonal nodes, are responsible for transferring photons from diagonal lines back to vertical lines before the next splits. The specific SSP is encoded into the network according to particular arithmetical and scalable rules: (i) The vertical distance (measured as the number of nodes) between two subsequent rows of hexagonal nodes is equal to the value of the corresponding element of the set {2, 5, 7, 9}, as denoted by the integers on the left. (ii) The diagonal routing leads to a horizontal displacement of the photons whose magnitude is also equal to the integer on the left. The diagonal movement of photons represents the inclusion of the corresponding element in the summation; conversely, the vertical movement means that the element is excluded from the summation. (iii) The value of the ultimate sum is equal to the spatial position of the output signal, as denoted by the port numbers. For example, the path for port 14, highlighted by a translucent gray band, reveals that only the elements 5 and 9 contribute to the subset sum 14. Owing to this vast parallelism, the photons arrive at the output ports with all possible subset sums generated.
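The encoding rules (i) to (iii) can be mimicked in a few lines: a split-junction row for element e sends every incoming photon stream both straight down (element excluded) and diagonally, shifting it horizontally by e (element included), so the horizontal positions reaching the bottom are exactly the attainable subset sums. A minimal sketch (the function name is ours, for illustration only):

```python
def propagate(elements):
    """Simulate the abstract waveguide network for a list of elements.

    positions holds the horizontal coordinates of all photon streams.
    At each split-junction row for element e, every stream continues
    both vertically (p, element excluded) and diagonally (p + e,
    element included).  The final positions are the output ports hit
    by photons, i.e. all attainable subset sums (rule iii).
    """
    positions = {0}                          # single input at the top node
    for e in elements:
        positions = positions | {p + e for p in positions}
    return positions

# The instance {2, 5, 7, 9} of Fig. 1B:
ports = propagate([2, 5, 7, 9])
print(sorted(ports))   # [0, 2, 5, 7, 9, 11, 12, 14, 16, 18, 21, 23]
print(14 in ports)     # True: the gray path includes 5 and 9 only
```

This set-union recurrence is the standard dynamic-programming view of the SSP; the chip evaluates the same recurrence physically, one split-junction row per element, with all branches explored by the photons in parallel.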
We fabricate the processing unit in Corning Eagle XG glass with the femtosecond laser direct writing technique (see Materials and Methods). The top left corner of the waveguide network in Fig. 1A and the abstract network in Fig. 1B are depicted in detail in Fig. 1 (C to F). As we can see, the split junction is realized by a modified three-dimensional beam splitter, where the two waveguides first couple evanescently (red segment) and then decouple, with one waveguide climbing upward and the other proceeding along its initial direction. To avoid extra loss, a vertical decoupling distance of 25 μm is deliberately selected. Meanwhile, the coupling length and coupling distance are set to 1.8 mm and 10 μm, respectively, to achieve the desirable splitting ratio. As the intensity distribution in Fig. 1D reveals, the modified beam splitter is unbalanced, which is intended to compensate for the bending loss caused by the subsequent arcs cm and nf. The converge junction is almost a mirror image of the split junction, except with a different coupling length of 3.3 mm. The photons in path fg should be completely transferred to path eh in the ideal case, and the residual induced by imperfect fabrication is minimized: the output intensity at port g is less than 3% of that at port h. The pass junction is implemented in the form of one waveguide crossing over the other at a vertical decoupling distance of 25 μm. As the output intensity in Fig. 1F reveals, the three-dimensional architecture ensures an excellent pass junction with an extinction ratio of around 24 dB. Supported by the powerful fabrication capability of the femtosecond laser writing technique (43, 44), we are able to map the abstract network of the SSP into a three-dimensional photonic chip.

Experimental demonstration

We demonstrate the computation of the SSP for the specific cases {3, 7, 11} and {3, 7, 9, 11}. As shown in Fig. 2 (A and C), the evolution results are read out from a one-shot image, in which the photons appear as a line of spots. Every single spot is an accepted witness of the existence of the corresponding sum (denoted by the integer below the spot), provided that the experiments are trusted. Because the involved problem sizes are not too large, we check the reliability of our experimental results by enumeration and conclude that all the observed spots are supposed to appear and that none of the expected results is absent.

Fig. 2 Experimental read-out of computing results. (A) Experimental read-out of the evolution results of the case {3, 7, 11}. Every observable spot certifies the existence of the subset sum denoted by the integer below. (B) Normalized intensity distribution of the case {3, 7, 11} in experiment and theory. Here, an axis break is applied to display data points with a value of zero and the logarithmic coordinate simultaneously. The theoretical results are either 0 or 0.125, while the experimental results have a fluctuating distribution. A reasonable threshold can easily be found to classify the experimental outcomes into appearance (beyond the threshold) and absence (below the threshold, highlighted with a slash filling pattern). A wide tolerance band (filled with slashes) allows a wide range of thresholds, with a lower bound of 0.00209 and an upper bound of 0.05891. (C) Experimental read-out of the evolution results of the case {3, 7, 9, 11}. (D) Normalized intensity distribution of the case {3, 7, 9, 11} in experiment and theory. The theoretical results are either 0 or 0.0625. A wide tolerance band (filled with slashes) allows a wide range of thresholds, with a lower bound of 0.00127 and an upper bound of 0.00661.

The reliability of our experiments is further investigated by a comprehensive analysis of the intensity distributions, as presented in Fig. 2 (B and D). We calculate the theoretical distribution through a lossless model consisting of balanced split junctions, perfect pass junctions, and ideal converge junctions; the theoretical outcomes can therefore be regarded as benchmarks for the SSPs. For the case {3, 7, 11}, the theoretical result is either 0 or 0.125, while it is either 0 or 0.0625 for the case {3, 7, 9, 11}. In this theoretical regime, zero intensity indicates that a sum does not exist; otherwise, it exists. We apply a threshold to analyze the retrieved intensity at every output port: a valid appearance is identified if the intensity goes beyond a reasonable threshold; otherwise, an absence is confirmed (highlighted by the slash pattern). The tolerant intervals of the thresholds applicable in our experiment are presented as slash-filled bands in Fig. 2 (B and D), straightforwardly revealing the lower and upper bounds. Benefiting from the good signal-to-noise ratio obtained in our experiments, a wide tolerance band accepts a large range of thresholds, which implies the high accuracy of our experiments and verifies the feasibility of our approach.
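The lossless model and the threshold test can be sketched compactly: with balanced split junctions, each of the 2^N paths carries 1/2^N of the input power, which accumulates at the port equal to its subset sum; a measured port is then classified as appearance or absence against a threshold. A minimal sketch (function names and the sample "measured" values are ours, for illustration only; only the threshold bounds come from the text):

```python
from collections import Counter
from itertools import combinations

def theoretical_distribution(elements):
    """Lossless model: each of the 2**N paths carries 1/2**N of the
    input power, deposited at the port equal to its subset sum."""
    n = len(elements)
    dist = Counter()
    for r in range(n + 1):
        for combo in combinations(elements, r):
            dist[sum(combo)] += 1 / 2 ** n
    return dict(dist)

def classify(intensities, threshold):
    """Label each measured port as appearance (True) or absence (False)."""
    return {port: i > threshold for port, i in intensities.items()}

dist = theoretical_distribution([3, 7, 11])
print(dist)   # every attainable sum carries weight 0.125, as in Fig. 2B

# Hypothetical measured intensities, with a threshold chosen inside the
# tolerance band [0.00209, 0.05891] reported for {3, 7, 11}:
measured = {0: 0.12, 3: 0.10, 4: 0.001, 14: 0.09}
print(classify(measured, 0.01))
```

Because the eight subset sums of {3, 7, 11} are all distinct, every lit port receives exactly 0.125 in this model, matching the benchmark values quoted above; any threshold inside the tolerance band yields the same appearance/absence verdicts.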