This study provides unprecedented details of the interactions of biomolecules in a complete cytoplasmic environment. Our model emphasizes that in vivo environments are significantly different from in vitro conditions and illustrates that the inclusion of the full physical environment and the presence of all cellular components exerts more than just the simple volume exclusion effect commonly associated with crowding. We find new evidence for native state perturbations in cellular environments as a result of ‘quinary’ protein-protein interactions consistent with recent NMR studies (Monteith et al., 2015) but also as a result of electrostatic factors due to interactions with ions and metabolites that compete with volume exclusion effects. Our results suggest that native-state perturbations towards functionally compromised states under in vivo conditions may be a more general phenomenon that should be taken into account when interpreting in vitro studies and relying on structures obtained via crystallography.

Another significant insight that is of biological importance is the observation of weak association of glycolytic enzymes in our model. While glycolytic enzymes dominate our cytoplasmic model, both of our explanations, reduced solvation energies and entropic sorting due to size differences, would apply to the majority of metabolic enzymes and we expect that metabolic enzymes as a whole are brought into closer proximity in the cellular environment, thereby generally enhancing metabolic rates. An intriguing question is whether biological systems have evolved to exhibit such characteristics. The large relative size of ribosomes and their large RNA contents may therefore be desirable features outside the immediate context of their function in translation.

Our simulations also allowed us to carry out a detailed analysis of diffusive properties that revealed significant heterogeneity in macromolecular diffusion as a function of the local environment and a surprisingly significant reduction of metabolite diffusion as a result of sticky interactions with macromolecular surfaces. The latter has implications for the number of metabolites actually present in bulk solvent and has consequences for the mechanism of ligand binding in cellular environments.

One recurring aspect in our study is the prominent role of electrostatics that is manifested in different forms, such as the differential solvation between metabolic proteins and RNAs and RNA-containing complexes and metabolite-protein interactions that alter protein structure via electrostatic screening but are also at least in part responsible for reducing the amount of freely-diffusing metabolites. These findings mirror in part ideas by Spitzer and Poolman that hypothesized electrochemical effects to be a major factor in organizing crowded cytoplasms (Spitzer and Poolman, 2005).

The effect of cellular environments on and by metabolites and metabolic enzymes is a major focus of the present study. The present work points at reduced effective ligand concentrations near the macromolecule surface and altered protein-ligand interactions under cellular conditions. Therefore, cell-focused in silico drug design protocols that capture competing interactions and altered kinetic properties under cellular conditions could offer significant advantages over current single-molecule protocols.

All of the results presented here were obtained via computer simulations that, although increasingly reliable and based on experimentally-driven models, still only represent theoretical predictions. One potential concern is that the structural models used here were obtained largely via homology modeling. While the accuracy of structure prediction has increased over the last decade (Kryshtafovych et al., 2014), using such models may affect the accuracy of our results reported here, in particular with respect to the effects of the cellular environment on protein stability. In order to diminish such effects, we focused our analysis on relative comparisons of the same homology models in the cellular environment and in dilute solvent which we believe is meaningful even if the models are only approximations of the real structures. At the same time, the analysis of diffusive properties and non-specific protein-protein interactions is expected to be driven mostly by overall shape, size, and electrostatics rather than specific structural details and, therefore, we expect those results not to be affected significantly by the use of homology models.

The simulations presented here are limited by relatively short simulation lengths due to computational resource constraints. The observed macromolecular diffusion was largely limited to motion within a local environment without enough time to trade places with other macromolecules and certainly without exploring a substantial fraction of the volumes of our simulation systems. The short simulation times affect the estimates of long-time diffusive behavior including the possibility of anomalous diffusion that would not be expected to appear until macromolecules actually exchange with each other. On the other hand, the analysis of native state destabilization and non-specific protein-protein interactions relies essentially on interactions within just the local environment that are being sampled extensively on the time scale of our simulations, while averaging over many different copies in different environments is assumed to provide equivalent statistics to following a single copy that diffuses over much longer time scales. We therefore expect that much longer simulations would primarily reduce the statistical uncertainties without fundamentally altering the results reported here with respect to macromolecular stability and interactions. However, we expect that longer time scales would allow sampling of specific macromolecular and ligand binding equilibria, e.g. tRNA binding to tRNA synthetases, protein oligomerization, or binding of metabolite substrates to their respective target protein active sites, none of which we observed in the course of our simulations.

Force field inaccuracies are another concern and this work is a true test of the transferability and compatibility of the CHARMM force field parameters for close molecular interactions under conditions where force field parameters have not been validated extensively and artifacts, e.g. overstabilization of protein-protein contacts, are possible (Petrov and Zagrovic, 2014). Therefore, additional simulations with other force fields are desirable and follow-up by experiments is essential. The suggested metabolite-induced compaction of PGK should be observable experimentally and it should also be possible to measure weak associations of metabolic enzymes in dense protein solutions with and without nucleic acids and with and without ribosomes using suitable fluorescence probes, for example. Variations of diffusive properties as a function of the local cellular environments and the diffusion of metabolites in cellular environments may be more difficult to determine experimentally but we hope that our work will stimulate experimental efforts to determine such properties.

The next step from the model presented here is a whole-cell model in full physical detail to include the genomic DNA, the cell membrane with embedded proteins, and cytoskeletal elements. Such a model would require in excess of 10 (Tanizaki et al., 2008) particles at an atomistic level of detail. As additional experimental data becomes available and computer platforms continue to increase in scale, this may become possible in the foreseeable future. Such a whole-cell model would bring to bear the tremendous advances in structural biology to make a complete connection between genotypes and phenotypes at the molecular level that is difficult to achieve with the empirical systems biology models in use today.