Half dozen of one, six billion of the other: What can small- and large-scale molecular systems biology learn from one another?

Ian A. Mellis 1 and Arjun Raj 2
1 Perelman School of Medicine, Genomics and Computational Biology Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6021, USA
2 Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6321, USA
Corresponding author: arjunraj{at}seas.upenn.edu

Abstract
Small-scale molecular systems biology, by which we mean the understanding of how a few parts work together to control a particular biological process, is predicated on the assumption that cellular regulation is arranged in a circuit-like structure. Results from the omics revolution have upset this vision to varying degrees by revealing a high degree of interconnectivity, making it difficult to develop a simple, circuit-like understanding of regulatory processes. We here outline the limitations of the small-scale systems biology approach with examples from research into genetic algorithms, genetics, transcriptional network analysis, and genomics. We also discuss the difficulties associated with deriving true understanding from the analysis of large data sets and propose that the development of new, intelligent, computational tools may point to a way forward. Throughout, we intentionally oversimplify and talk about things in which we have little expertise, and it is likely that many of our arguments are wrong on one level or another. We do believe, however, that developing a true understanding via molecular systems biology will require a fundamental rethinking of our approach, and our goal is to provoke thought along these lines.

An inconvenient truth illustrated in silicon
A central tenet of systems biology is the idea that regulation in molecular biology is modular, meaning that individual components may operate independently of one another and can be strung together in a rational way to produce higher-level functions. In this context, it is easy to see why the integrated electronic circuit, a triumph of intelligent modular design, has served as a natural conceptual framework for small-scale systems biology. Such circuits are exceedingly complex regulatory devices; yet this complexity arises from the composition of smaller, comprehensible modular components integrated according to a set of design principles. The hope is that we can gain the same detailed level of understanding of biological regulatory circuits as we have with electrical circuits by isolating and understanding regulatory modules and their interconnections. Yet a quick look at the world around us reveals the enormous differences between human-designed objects and those designed by evolution in the natural world. Given these differences, is there any reason we might expect to see similarly modular regulatory behavior in the muddled molecular soup of the cell, shaped by the random forces of evolution rather than a rational agent? This is in general a difficult question to answer, because we typically look at evolution in the biological context and human design in the context of man-made objects like electrical circuits. But what would happen if you designed electrical circuits by evolution? Would they resemble human design? Indeed, would we even expect an evolutionary process to yield modular circuits in silicon? In a very interesting set of experiments from the latter half of the 1990s, Adrian Thompson examined exactly that by attempting to evolve an electronic circuit (Thompson 1997).
Specifically, the goal was to evolve a circuit capable of discriminating electronic “tones” (essentially, a low frequency and high frequency signal), and the evolutionary substrate for his experiments was the field programmable gate array (FPGA), a chip that is itself reprogrammable through software. The idea was to start with a randomly “programmed” chip and then evolve the features on the chip using a genetic algorithm to see if it would eventually be able to perform the tone-discrimination task (Fig. 1A). Amazingly enough, in about 4100 generations, the chip had evolved to the point at which it could perform the task with very high fidelity, which was particularly impressive given the small number of potential circuit elements it was given to work with (i.e., 100). (In fact, Thompson noted that many thought it would be impossible for such a small circuit to perform this task.) How did the circuit achieve this task? It was unfortunately rather hard to say exactly how, but one quick conclusion is that the solution was neither modular nor readily comprehensible through standard digital circuit design principles, with strange waveforms appearing throughout the evolutionary process. Thus, it was the first strike against modularity.
Figure 1. Complex structure of a circuit evolved in silicon. (A) The final evolved circuit for tone discrimination, a 10 × 10 array of cells. All connections that link an input and an output are marked. (Arrow from a cell) connection driven by origin cell's function; (square at arrowhead) connection selected as input for recipient cell function. (B) Minimal necessary components of final evolved circuit for tone discrimination. Missing cells can be fixed at constant value without affecting performance; gray cells cannot be fixed at constant value without affecting performance, although the cell has no path connecting to marked minimal necessary functional cells (Thompson 1997).
The beautiful thing about evolving a circuit in silicon is that it enabled Thompson (1997) to dissect the evolutionary process in ways that would be very difficult to do in the messy world of molecular biology. For instance, he could perform the equivalent of looking for “phenotypes” of “knock outs” with rapidity and precision, and here things got even more interesting. One of the first things Thompson did was prune the network by clamping the output of all circuit elements that did not affect the performance of the circuit. Surprisingly, this minimal network included five elements that were not physically connected to the circuit at all, at least not in the conventional sense (Fig. 1B). Most of these elements had relatively small but quantifiable effects on circuit performance; one of them had a very large effect, even though its output had no connection to the circuit whatsoever. The interpretation is that evolution took advantage of the underlying physics (which we typically ignore in modular design) to arrive at solutions that appear incomprehensible and foreign, even to circuit engineers. Indeed, moving the exact same circuit to a different part of the chip resulted in poorer performance that could be improved by a bit of further evolution, a feature strikingly reminiscent of approaches in synthetic biology (Dougherty and Arnold 2009). Still, although Thompson (1997) should of course not be taken as a literal model for evolution per se, it is worth noting that it produced a strikingly organic result, and one that appears to also hold on to its secrets in much the same way.
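The evolve-then-dissect procedure described above can be made concrete with a toy sketch. It is not Thompson's setup: the 100-bit genome, the stand-in fitness function, the population size, and the mutation rate are all assumptions made for illustration, and the names `make_fitness`, `evolve`, and `knockout_scan` are hypothetical. Because this toy fitness is modular by construction, it cannot reproduce the central surprise (unconnected cells that still matter); it only illustrates the mechanics of a genetic algorithm followed by a clamp-each-element "knock out" scan.

```python
import random

N_CELLS = 100         # matches the 10 x 10 array size; everything else is assumed
POP_SIZE = 50         # assumed population size
GENERATIONS = 200     # toy value, far fewer than the ~4100 generations reported
MUTATION_RATE = 0.01  # assumed per-bit mutation probability


def make_fitness(rng, n_relevant=30):
    """Hypothetical stand-in for the tone-discrimination score.

    Only a hidden subset of cells matters; the score is the number of
    those cells that are switched on.
    """
    relevant = set(rng.sample(range(N_CELLS), n_relevant))

    def fitness(genome):
        return sum(genome[i] for i in relevant)

    return fitness, relevant


def evolve(fitness, rng):
    """Simple genetic algorithm with elitism, crossover, and mutation."""
    pop = [[rng.randint(0, 1) for _ in range(N_CELLS)] for _ in range(POP_SIZE)]
    history = []
    for _ in range(GENERATIONS):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: POP_SIZE // 2]
        children = [pop[0][:]]  # elitism: carry the best genome over unchanged
        while len(children) < POP_SIZE:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, N_CELLS)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability MUTATION_RATE.
            child = [bit ^ (rng.random() < MUTATION_RATE) for bit in child]
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


def knockout_scan(genome, fitness):
    """Clamp each cell to 0 in turn; report cells whose clamping changes the score."""
    base = fitness(genome)
    essential = []
    for i in range(len(genome)):
        clamped = genome[:]
        clamped[i] = 0
        if fitness(clamped) != base:
            essential.append(i)
    return essential


rng = random.Random(0)
fitness, relevant = make_fitness(rng)
best, history = evolve(fitness, rng)
essential = knockout_scan(best, fitness)
print("best score per generation (first, last):", history[0], history[-1])
print("cells that cannot be clamped without changing the score:", len(essential))
```

In this sketch every cell flagged by the scan belongs to the hidden relevant set, which is exactly the tidy outcome that the evolved hardware did not deliver.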

Limitations of the small-scale systems biology approach
What are the implications of Thompson's findings for small-scale systems biology (Thompson 1997)? We believe the primary lesson is that the notion that evolution favors easily discernible and well-isolated regulatory modules may be fundamentally wrong. Although we might be able to make some sense of a small subset of the regulatory network, we believe that in most cases in metazoans, small-scale systems biology has not led to an understanding sufficiently detailed to allow one to clearly define regulatory modules or how they might interact, much like Thompson's evolved circuits. Take, for example, the standard formula for a small-scale systems biology paper: Start with an interesting phenomenon and make some quantitative measurements. Develop a mechanistic model, often mathematical, to explain the data. Mine this model for a prediction. Make a perturbation to test this prediction and verify it experimentally. Although these stories are often very appealing, when examined in detail, one often finds that perturbations seldom yield complete and definitive results, hence leaving us with several holes in the story. These holes typically remain unfilled, and so it is difficult to know to what extent the regulatory “module” identified in the story is truly isolated from other putative modules. Indeed, these holes call into question the very notion that modules exist at all. Take an example from our own work (Raj et al. 2010), which we focus on for the purposes of self-deprecation rather than self-promotion. We tried to explain variability in the gut development pathway in C. elegans in organisms harboring a mutated version of a particular transcription factor. This phenotypic variability, known as incomplete penetrance, is a common fact of life in genetics, with many, if not most, regulatory mutants showing such partial effects.
We made quantitative measurements of transcription factor expression in a series of mutants, showing that variability in expression of a downstream regulator, subjected to a threshold, led to the decision of whether or not to form gut cells in the mutant embryo through the expression of the “gut master regulator” elt-2. A prediction of this model was that reducing the variability would result in more embryos surpassing the threshold. Sure enough, by further removing an inhibitor of the downstream regulator, we were able to reduce the variability, pushing more embryos above the threshold and leading to more gut cell formation. All in all, it was a fairly standard exercise in molecular systems biology. Yet, are we any closer to understanding variability in gut cell development? We would argue not. We know a few factors that seem to both incur this variability and then potentially manipulate the variability, but none of these perturbations give us anywhere close to a complete understanding of what governs this variability—nothing we did could restore wild-type precision fully, nor were the effects limited in scope. Scientists often attribute these untidy results to the relative imprecision of our experimental tools (Lazebnik 2002). Perhaps that is true in some cases, but we feel the evidence points to messiness being a fundamental feature of biological regulation. In the case of gut development, decades of painstaking genetics (Maduro and Rothman 2002; Maduro 2006) provided us with a relatively simple regulatory pathway that served as the framework for our results. Yet even these genetic foundations still contain “mystery factors” that we know must exist (and probably several more that we do not know exist), but have not yet identified after almost 20 years (Maduro and Rothman 2002). Indeed, upon closer inspection, these simple pathways often reveal layers of deep complexity and interconnectedness that are at odds with the notion of modularity. 
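The threshold picture described above (variable expression of a downstream regulator, with gut formation only when expression exceeds a threshold) can be sketched numerically. This is a minimal illustration under stated assumptions, not the model fitted in Raj et al. (2010): expression is assumed to be normally distributed across embryos, and the mean, standard deviations, and threshold are made-up values in arbitrary units. The function name `fraction_above_threshold` is hypothetical.

```python
import math


def fraction_above_threshold(mean, sd, threshold):
    """Fraction of embryos with expression above threshold, assuming
    expression ~ Normal(mean, sd). Uses the Gaussian tail probability."""
    z = (threshold - mean) / (sd * math.sqrt(2))
    return 0.5 * math.erfc(z)


THRESHOLD = 1.0  # assumed threshold, arbitrary units
MEAN = 1.2       # assumed mean expression, placed above the threshold

# Narrowing the distribution while holding the mean fixed.
for sd in (0.8, 0.4, 0.2):
    frac = fraction_above_threshold(MEAN, sd, THRESHOLD)
    print(f"sd={sd:.1f}: fraction above threshold = {frac:.3f}")
```

Under these assumptions, reducing variability raises the fraction of embryos above the threshold only because the assumed mean sits above it; with the mean below the threshold, the same reduction would lower that fraction. The sketch therefore shows how strongly even this simple prediction depends on parameters beyond the variability itself.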
One concept that is most associated with modularity is that of the “master regulator,” in this case, a transcription factor whose expression is necessary and sufficient for a particular phenotype of interest. In the case of the gut, early experiments revealed elt-2 to be a candidate for the master regulator of gut formation (Fukushige et al. 1998). Surprisingly, it then turned out that the elt-2 knockout worm still expressed some downstream factors and had a reasonable approximation of a nascent gut (Fukushige et al. 1998; Sommermann et al. 2010). Attention then turned to the role of the seemingly redundant transcription factor elt-7, with the double knockout of elt-2 and elt-7 showing a far more profound lack of gut phenotype, except for the well-differentiated gut cells interspersed between the cells displaying the mutant phenotype (Sommermann et al. 2010). Perhaps then there is also a role for elt-4 (Maduro and Rothman 2002)? One can get into semantic discussions over necessity and sufficiency and the definition of master regulators (Chan et al. 2013), but we think this example nicely illustrates the fact that redundancy and partial effects are more the rule than the exception with respect to so-called master regulators. Indeed, in many cases, close inspection of the details reveals that many other examples of master regulators are somewhat less clean and simple than previously believed. Some may counter our arguments by pointing out that we are at the very beginning of this field, developing the initial knowledge of the dominant parts and players that we will then refine and add to over time toward a more complete understanding (Brenner 2010). Rob Phillips is a strong proponent of this line of thought, pointing out that many fields of study look messy initially until decades of hard, careful work bring a systematic order to them.
Impressively, his group has shown that careful analysis of transcriptional factor concentration could explain transcriptional regulation in E. coli through a single governing principle (Brewster et al. 2012, 2014). It is also possible that currently mysterious regulatory behavior may have explanations involving other forms of biochemical and physical interactions than typically considered (Frechin et al. 2015). Are these the first steps along the path to a complete explanation of transcriptional regulation? Is “complexity” just a word we trot out whenever we refuse to think harder about the problem? Or are these successes one-offs or limited in scope to simpler prokaryotic systems and utterly useless in the face of metazoan complexity? We think it is hard to say at this point. An oft-repeated truism from George E.P. Box is that all models are wrong, but some are useful. We think the difficulty lies in the definition of useful (Box and Draper 1987). As a means to roughly explain some effects in a particular regulatory system, our current models are useful. As a building block for a larger model to truly explain, for instance, how a developmental gene network attains such high levels of precision, our current models are still largely useless. We wonder whether such higher-order models will ever emerge from the paradigm of combining modular building blocks because biological regulation may be intrinsically nonmodular and thus perhaps not understandable by the framework used by small-scale systems biology.

Genomics and the revelation of dense interconnectedness
The arrival of the genomic era has in many ways laid these facts bare. Take the example of differential gene expression analysis. Now that we have the ability to accurately measure differences in gene expression across the genome, it is clear that the consequences of virtually any perturbation are seldom relegated to one or a few genes, but rather spread across large numbers of genes, often numbering in the thousands. Moreover, these sets of genes almost never fall into clear mechanistic subgroups, but rather only show “enrichment” for various cellular functions that typically overlap significantly with other perturbations. It is certainly possible that the majority of these differences are completely inconsequential, as nicely argued by Atay and Skotheim (2014). In this view, there is a core set of circuits underlying cellular regulation surrounded by a bunch of irrelevant noise. We are not so sure, however, that this view is completely supported by the evidence. We offer another recent example, CellNet, a computational framework developed for using genome-wide expression profile analysis to help understand and manipulate cell types (Cahan et al. 2014; Morris et al. 2014). CellNet takes as input a large number of gene expression profiles spanning several different cell types and several different experimental conditions. Using these data, it attempts to construct a gene regulatory network associated with each cell type using expression variation to infer regulatory links. The authors applied CellNet in two ways, both of which we believe argue against simple models of gene regulation.
First, using CellNet, they show that interconverting cells from one type to another by expressing particular master transcription factors (in this case, fibroblasts to neurons via ectopic expression of Ascl1, Nr4a2, and Lmx1a) led to cells that still had traces of the fibroblast gene expression program. Direct differentiation from embryonic stem cells showed no such defects. This shows that activation of the network by these transcription factors was incomplete, and thus that differentiation depends on more than just the expression of a few major players. Second, in the case of B cell to macrophage conversion, they showed that CellNet could generate a list of candidate interventions to enhance conversion. Experiments showed that performing those interventions worked as predicted; it seems reasonable to assume that the more interventions one could perform simultaneously, the better the results. Again, these results suggest that properties such as cell fate depend on the values of many cellular parameters, and further, that precise manipulation of those properties may require control over all of those parameters. We think that the bias toward isolating single dominant factors stems from an inherent desire to develop scientific stories, which are invariably more satisfying when they have a single or few protagonists. Experimental geneticists typically model variants leading to big phenotypes with high penetrance, ignoring or perhaps not even detecting variants of lesser or partial effect. Molecular biologists apply many of the same experimental approaches that were well suited to working out the basic machinery of the cell but may be less well suited to understanding regulation. A particularly stark example of the limitations of this approach is in the mechanistic study of cancer, which has led to an incredible accumulation of knowledge about the molecular basis for regulating cellular processes such as proliferation, death, and disease (Hanahan and Weinberg 2000). 
Yet, with all this knowledge, we are still unable to cure most actual clinical human cancers, and there is a growing appreciation that detailed mechanistic models have largely failed to capture the full complexity of the disease (Weinberg 2014). Because of these biases, both scientific and methodological, it is still unclear how many forms of biological regulation are built from the sum of many smaller effects.

Unbiased, and often indecipherable
Genomics has also allowed us to remove some of these biases, perhaps most notably through the use of genome-wide association studies of quantitative traits, such as height or blood cholesterol levels. Here, the goal is to start with the quantitative trait, then look for the genetic variants underpinning its variation, many of which are in regulatory noncoding DNA. The results of genome-wide association studies have left us with the impression that the major players we seek in molecular biology exist but are rare, with most studies failing to find large-effect-size variants for most common quantitative traits. This has left us with just a few mechanistic crumbs and the general feeling that most traits are indeed composed of large numbers of R.A. Fisher's variants of small effect, as though phenotypes are composed largely from the little gray squares of Thompson's evolved circuits (Fig. 1B; Fisher 1930; Thompson 1997). Perhaps the most classic example is that of human height. Height has a strongly genetic basis, with a “narrow-sense” heritability of around 0.80 (Silventoinen et al. 2003; Visscher et al. 2006). The approach of the molecular biologist would be to then look for mutants that are abnormally large or small, which would identify pathways associated with gigantism or dwarfism. Yet these results would yield little understanding of the heritability of more common variation in height, which genome-wide association studies have the potential to reveal. However, recent genome-wide association studies show (e.g., Lango Allen et al. 2010) that even the combined effects of 180 identified variants only explain ∼12.5% of the genetic differences in human height. This is not to say that the genetic basis of height is magical: If one includes all possible variants, one can explain a large fraction of the heritability (Yang et al. 2010).
Rather, this points to height being a composition of a very large number of very small effects, and the same story has come up in analyses of many other traits. And what of the molecular basis of normal variation in human height? Several experiments done both before the advent of genome-wide association studies and afterward as follow-ups on identified loci have suggested that several of the SNPs identified in these studies have functional effects in pathways that can plausibly be linked to height, such as mitosis, mesoderm and skeletal development, and a plethora of signaling pathways, including those controlled by various growth factors, among others (Wood et al. 2014). Still, there is no single pathway to point to that provides a simple story explaining height, and as such, no simple therapeutic intervention to enable us to manipulate height. This is not to say that genome-wide association studies have not revealed variants of immediate biomedical interest. There are many examples to choose from, including Musunuru and colleagues’ work taking a hit from a genome-wide association study, showing that the SNP changed expression of a particular gene, which then altered lipid levels in the blood (Musunuru et al. 2010). This example and others like it provide a beautiful arc from discovery to mechanism and are in many ways an ideal that the field aspires to. Yet, this is much more the exception than the rule, with perhaps most genetic variants having a spectrum of effect strengths. There are of course many debates as to exactly why the field of quantitative genetics is filled with more of a murky haze than a set of smoking guns, and we leave it to those more qualified than us to continue those debates. (We do wonder if these very complex mappings from genotype to phenotype may actually reflect the advantages of a distributed manner of encoding.)
However, from our relative outsiders’ perspective, we believe these findings fit well with our general thesis that biological regulation is far less story-like than we would like it to be, and there is a distinct possibility that many if not most regulatory systems have almost no dominant, easily rationalizable stories to be found at all.
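The idea that a trait such as height is a composition of a very large number of very small effects can be illustrated with a toy additive model. Everything numerical here is an assumption made for illustration: the number of causal variants, the uniform allele-frequency range, and the exponential distribution of effect sizes are not taken from the studies cited above, so the output is not expected to reproduce the ∼12.5% figure. The only standard ingredient is the additive variance contributed by a biallelic variant with allele frequency p and per-allele effect β, which is 2p(1 − p)β². The names `variance_contributions` and `fraction_explained` are hypothetical.

```python
import random


def variance_contributions(n_variants, rng):
    """Per-variant additive variance 2*p*(1-p)*beta**2, sorted largest first.

    Allele frequencies and effect sizes are drawn from assumed distributions.
    """
    contribs = []
    for _ in range(n_variants):
        p = rng.uniform(0.05, 0.5)   # assumed allele frequency range
        beta = rng.expovariate(1.0)  # assumed effect-size distribution, arbitrary units
        contribs.append(2 * p * (1 - p) * beta ** 2)
    return sorted(contribs, reverse=True)


def fraction_explained(contribs, k):
    """Share of total additive genetic variance captured by the k largest variants."""
    return sum(contribs[:k]) / sum(contribs)


rng = random.Random(1)
contribs = variance_contributions(5000, rng)  # assumed number of causal variants
for k in (10, 180, 1000, 5000):
    print(f"top {k:>4} variants: {fraction_explained(contribs, k):.3f} of genetic variance")
```

The qualitative point of the sketch is that, when effects are many and individually small, even the strongest few hundred variants capture only part of the genetic variance, and the remainder is spread thinly across thousands of others.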

Now that we know the unknown unknowns, what do we really know?
Should we then dispense with the very notion of small-scale systems biology and plunge headlong into a data-first future? Ultimately, we think this depends on the nature of the question at hand and the type of understanding we hope to derive. At one end of the spectrum is a purely operational level of understanding, one in which we learn just enough to be able to manipulate cells in ways we find useful or amusing, something that may require less in the way of deep understanding. On the other end of the spectrum is the search for universal laws and design principles along the lines of Newton's laws of motion. How close are we to the latter? We think many of us got into basic science to pursue fundamental truths, and it was not uncommon for a time to hear the claim that we are on the cusp of a Newtonian revolution in biology. Perhaps. Certainly, as a purely theoretical matter, if we were able to measure absolutely everything about a cell, the laws of chemistry would likely enable us to produce a completely predictive model of cellular function, and there are promising attempts at simulating relatively simple organisms such as bacteria (Karr et al. 2012). Yet, the development of a complete model incorporating all the complexities of a metazoan cell seems very distant at this point, and so our search for fundamental truths must be for some simplified effective representation. It is, of course, an open question as to whether universal truths such as Newton's laws even exist for cellular regulation and, if they do exist, whether we will be able to understand them. We believe our earlier arguments further support the premise that systems engineered through evolution need not be modular nor follow well-defined design principles.
It is true that computational studies have shown that evolution can potentially favor modular solutions (Variano and Lipson 2004; Kashtan and Alon 2005; Clune et al. 2013), but we wonder whether the constraints imposed by models cannot reflect the ability of natural systems (or even Thompson's circuits) to take advantage of complex underlying chemistry and physics. Either way, whether one wishes to find a few greater truths or a passel of smaller ones, we find we are in a state where we suffer not from a paucity of data, but from a paucity of frameworks—theories, really—by which to understand those data. Currently, most of our approaches to dealing with large amounts of data essentially boil down to statistical methods for extracting associations, often using increasingly sophisticated techniques from machine learning to try and generate hypotheses and insights. Yet, as Gautham Nair, a former postdoc in our laboratory, once quipped, “Would Newton have discovered the theory of gravity through machine learning?” As a related question, does the theory of gravity have a P-value? It is perhaps instructive to look at another example from physics: the Large Hadron Collider's search for the Higgs boson. The Large Hadron Collider produces data at an almost unfathomable rate, and yet the vast majority of it is discarded and deemed irrelevant. This is because the theoretical foundations of the experiment are so strong that we are able to parse this data down to the specific events that are most relevant to proving, for instance, that the Higgs boson exists. Of course, processing this data requires extremely sophisticated statistical treatment of the data, but that is more a matter of analysis than the derivation of scientific truth. We think it very unlikely that one would be able to derive all of particle physics just by drinking directly from the fire hose of particle collider data. 
Closer to home, one of our favorite examples of small-scale systems biology is Cai and colleagues’ lovely result showing that frequency modulation of bursts of nuclear localization can coordinate expression across a very large number of genes (Cai et al. 2008). It seems similarly unlikely that one could arrive at this result simply by combing through reams of high-throughput expression data. Our analogies have flaws, but we find most data-first counterarguments are unsatisfying. One might argue with our point about, for example, Newton's theory of gravitation, saying that it would have been impossible to even conceive of the theory without the huge collection of data on the movement of heavenly bodies. It is true that the data were there first, but it is unclear that all those data were required for the conception of the theory, or rather served as post hoc confirmation. In this case, all the data required would be those showing a discrepancy with the current model for planetary motion; similarly, the discovery of alternative splicing did not require deep RNA-sequencing. Another argument often cited as a benefit of data-first approaches is that they are not biased in favor of any particular outcome. Although we agree that the genomics tools applied in data-first approaches are extremely powerful tools for discovery (much as are genetic screens and biochemical purifications), we believe that an approach not directed toward any particular scientific question is unlikely to provide any conclusive answers (Brenner 2010; Weinberg 2010; Graur et al. 2013). (Although genomics-style research is perhaps most often criticized for data-first approaches, many other areas of biomedical research suffer the same issues, but are perhaps less well known or glamorous, thus attracting less controversy.) We think this underscores our belief that no matter what the technical approach, strong experimental design with a question in mind is still a requirement.

Rise of the machines?
With all of that said, the largely statistical modes of analysis that dominate the analysis of large data sets these days are an easy target for scorn until we are faced with the challenge of actually analyzing said data ourselves. Why has deriving insight from data proven so challenging? Is it perhaps the limitations inherent to our own human brains? For instance, most human brains have limited capacity to reason beyond two (or sometimes three) dimensions. Indeed, it is for this reason that researchers have invested much effort into developing two-dimensional visualization techniques of high dimensional data like t-SNE (Van der Maaten and Hinton 2008) with applications in biology (Amir et al. 2013) in the hopes that our brains’ capacity for deriving insights from 2D presentations may somehow reveal something. However, just as taxonomy is not biology, so too classification is not understanding; and it is important to separate the visualization of data from our quest to understand it. Therein lies the challenge: There is no reason to believe that the biology of, say, gene regulation is inherently understandable in some 2D manifestation. Perhaps, however, there is hope that computation may develop to the point at which computers can actually help us develop insights directly from data. Lest this sound like a Pollyannaish vision of the future, it is worth mentioning that Hod Lipson's group has demonstrated the ability to algorithmically derive mathematical descriptions of physical laws, including Newton's 2nd law (!), directly from motion tracking of pendulums and other such devices (Schmidt and Lipson 2009). (It is a delightful irony that the group used genetic algorithms to make these discoveries.) Applications to biology may yield new biological laws we may never have envisioned otherwise (Schmidt et al. 2011).
Or perhaps we may draw inspiration from advances in computer vision, in which very large data sets coupled with large neural networks have led to stunning advances in the ability of computers to parse natural images (Deng et al. 2009; Russakovsky et al. 2014), with these programs now able to identify objects in images with startling accuracy. Recent iterations are in fact also able to parse semantics from those images. Of course, such “narrow” artificial intelligence often still pales in comparison to the power of the adult human brain in general (although in some instances can outperform even the best human). However, computational architectures are also free from the constraints that our physiology imposes and may be able to “see” patterns in higher dimensions that we simply cannot intuit without help. CellNet (Cahan et al. 2014; Morris et al. 2014) and other network frameworks (Carter et al. 2013; Carvunis and Ideker 2014) may portend the arrival of such aids to intuition. Such methods are still in their infancy, but we believe they may ultimately provide the tools required to help us derive meaning from the highly multidimensional data that is increasingly ubiquitous in molecular biology. Whatever the approach may ultimately be, we believe that the complete reverse engineering of regulation in molecular biology will require fundamentally new computational aids that enable us to extract some order from the seemingly endless complexity we are now faced with. We also think that synthetic biology has the potential to inform our understanding. Currently, some synthetic reconstructions of biology are able to capture aspects of real biology to some degree, allowing us to test biological hypotheses in a rigorous fashion. It also may be that the incorporation of new, computer-based insights can harness complexity to yield a far greater degree of control over biological systems than is currently possible. 
This may reveal, however, the requirement of new forms of manipulations that enable us to produce complex, multifactorial perturbations.

Is there any hope left for small-scale systems biology?
What, then, to make of small-scale systems biology? Is it worth our continued pursuit? The answer for us is yes. We do think, though, that small-scale systems biology will look a bit different in the future. Currently, the most visible differences between small-scale systems biology and large-scale systems biology have been methodological, with a clear dividing line between fluorescent protein reporters, single molecule readouts, and tinkering with the genetics of model organisms on the one hand and large-scale consortium-driven omics approaches on the other. Yet these are just differences in technology, and the differences in style are perhaps driven more by the “design by committee” approach required for what used to be very expensive large-scale experiments. As omics technologies become cheaper, this gap is shrinking, and large-scale data is becoming more accessible to the do-it-yourself style more typically associated with small-scale systems biology. At the same time, developing new quantitative frameworks to understand what these data are telling us will still be critical, and we think it is important to keep an open mind as to what those frameworks may look like. We still believe in the goal of making quantitative models to reveal principles of cellular behavior; and perhaps through the incorporation of omics technology and new computational techniques to augment our intuition, we will be able to synthesize our small-scale models into a more complete picture. Of course, it is also possible that we may never be able to scale and integrate our models. Perhaps this is okay. Science is also about appreciating the beauty of solving puzzles, be they large or small.

Acknowledgments
We thank Jan Skotheim and Rob Phillips for discussions that informed the writing of this manuscript, and Uschi Symmons, Chris Hsiung, Paul Ginart, and other members of the Raj Laboratory for critical feedback. I.A.M. acknowledges support from a National Institutes of Health (NIH) MSTP T32 GM-07170 training grant, and A.R. acknowledges support from an NIH New Innovator grant 1DP2OD008514, a National Science Foundation (NSF) CAREER award, and an R33 EB019767 (NIH–NIBIB).