Pure electronic Stark effect model. Total effective electric fields

Here we assume a one-dimensional chromophore, interacting through its electronic system with the electric field of protein surrounding. We suppose that the chromophore structure, i.e. its bond lengths and angles do not change upon transition from one local surrounding to another. The difference of the chromophore dipole moments Δμ will contain a field-independent (vacuum) part Δμ 0 and a field-induced part, Δμ ind . In the point dipole approximation, we can write Δμ ind = ΔαE Δμ , where E Δμ is the projection of the electric field on the direction of the Δμ 0 vector and Δα is the change of the chromophore polarizability upon excitation. For simplicity, we will call E Δμ the field in what follows. In our model we assume that Δα does not depend on the field (no hyperpolarizability effects). We define a positive molecular x-axis along the direction from the center of the imidazolinone ring to the center of the phenol ring which virtually coincides with the direction of the Δμ 0 vector20,21,22,23,24,25. The total change of the permanent dipole moment (projection onto x-axis) then reads26:

(In our previous paper14 the coefficient ½ was used in the second term of eq. (1) following an erroneous presentation in some previous literature, see e.g. Ref. 26 A correct consideration27 should not contain it.)Equation (1) can be solved for the field to obtain:

This equation establishes a linear correlation between the experimentally measurable permanent dipole difference and the electric field. Since the field can be non-uniform across the length of the chromophore, we call it effective field, i.e. producing the same effect as the uniform field of magnitude E Δμ . This field also shifts the electronic transition frequency relative to that in vacuum via the second-order Stark effect, as follows27:

where h is the Planck’s constant. Substituting the expression for E Δμ (2) into (3) and expressing frequencies in wavenumbers ( = ν/c, where c is the speed of light), we obtain the relation between the experimentally measurable parameters and Δμ:

This equation shows that the transition frequency as a function of Δμ is described by a parabola with the extremum found at Δμ = 0. Its curvature is defined by −(2hcΔα)−1.

Figure 2 presents the experimental dependence of the Boltzmann-broadened 0–0 transition frequency on Δμ, for the anionic GFP chromophore in a variety of local environments (see Methods for the measurement details). Note that only the 0–0 2PA transition can be safely expected to reflect the dipole change, due to vibronic coupling effect28.

Figure 2 Dependence of the 0–0 transition frequency on the change of the permanent dipole moment for a GFP anionic chromophore in a number of different local environments. The blue square represents the model HBDI− chromophore in alkaline D 2 O solution. Standard deviations are shown by bars. Twenty six different FP variants under investigation are sub-divided into three different groups, according to the local structure of their chromophore surrounding: (1) The mutants derived from EGFP and mTFP, where the two chromophore oxygen atoms, i.e. at phenolate and imidazolinone rings, maintain five hydrogen bonds with the surrounding, similar to the original EGFP and mTFP63,64,65,66, are shown by green circles; (2) EGFP-type of mutants containing, among others, the T203I mutation, which causes a reduction of the number of hydrogen bonds from five to four in the chromophore cluster, are shown by orange pentagons; (3) Yellow FP derivatives of GFP containing, among others, the T203Y mutation, which along with the reduction of the number of hydrogen bonds also causes a π-stacking interaction of the chromophore with the Tyr203 phenol38, are shown by yellow circles. The inset shows the same data in the linearizing, ν versus Δμ2 coordinates. Full size image

Fitting of all experimental data to the function (full black line in Fig. 2) provides the coefficients A = −(2hcΔα)−1 = 72 ± 4 cm−1D−2 and = 19300 ± 70 cm−1. From the first coefficient, we obtain Δα = −35 ± 2 Å3. Substituting this value, as well as the transition frequency measured in the gas phase, = 20725 cm−1, Ref. 29, into the expression for C, we obtain Δμ 0 = 4.47 ± 0.16 D. Note that both Δμ 0 and Δα represent some effective values averaged over several different structural variants of the chromophore, e.g. including different bond-length patterns in different classes of proteins or different twisting conformations in water vs. protein environment. Despite this, the experimental values of Δμ 0 and Δα fall in the range predicted by quantum calculations for the HBDI− chromophore in vacuum (see Table 1 and Supplementary Tables S1 and S2).

Table 1 Calculated components and magnitude of Δ μ in vacuum and different HB-clusters of the HBDI− chromophore. Full size table

By knowing the values of the parameters Δα and Δμ 0 , we find the fields E Δμ from (2) for all the systems studied, see Supplementary Table S3. For some of the proteins, crystallographic structures are available and the corresponding fields can therefore be evaluated theoretically using MD simulations and these results can then be compared to experimental values. Note that the expected outcomes rely only on molecular mechanics (MM) treatment of the protein around the chromophore. In particular, the partial charges in the force field used (see Methods) are obtained by fitting the electrostatic potential (calculated at the Hartree-Fock level) on molecular surfaces and therefore this approach is expected to be sufficiently accurate. For such a comparison we have selected four proteins with very different (in magnitude and direction) electric fields, including YFP-variant citrine (pdb files: 3DPW and 1HUY), EGFP (pdb: 4EUL and 2Y0G), mTFP1.0 (pdb: 2HQK) and mTFP0.7 (pdb: 2OTB).

Figure 3(a) shows the changes of electrostatic potential as a function of distance along the straight line connecting the atoms CA2 and CE2. The potentials at the CE2, CD2, CG2, CB2 and CA2 atoms, closest to the line are shown versus the cumulative distance between the projections of the atoms on that line. Selection of these atoms is justified by the theoretical calculations25,30,31 demonstrating that the most significant change of electronic density upon the S 0 → S 1 excitation occurs at the CG2, CB2, CA2, CE2, CD2, C2, CE1, CD1 and OH atoms and that the vector connecting the CA2 and CE2 atoms is virtually co-directed with Δμ 0 . The three central atoms, CG2, CB2 and CA2, making the bridge between the phenolate and imidazolinone rings are known to be the main players in the charge-transfer process32,33,34. The electric field projected onto the CA2-CE2 direction should therefore be responsible for the induced part of Δμ and for the shifts of transition energy. To obtain E Δμ , we took a negative gradient of the calculated electrostatic potential function (i.e. the slopes of the straight lines in Fig. 3(a)). The correlation between the experimentally measured and simulated fields for four proteins is shown in Fig. 3(b). Note that both the magnitude and the sign of the field are reproduced in MD surprisingly well, especially for mTFP0.7, mTFP1.0 and EGFP. The experimental values vary in a wide range, from −8.6 MV/cm in mTFP0.7 to +7.4 MV/cm in EGFP, reaching +25.4 MV/cm in citrine, which represent typical order of magnitude in proteins12,13,35,36,37. The wide range of the fields observed is due to significant changes of local interactions. In particular, the increase of the component of the field directed from phenolate to imidazolinone (negative x) by ~16 MV/cm when going from EGFP to mTFP0.7 (Fig. 3(b)) can be qualitatively explained by the presence of the unique, positively charged H197 and negatively charged E148 residues near the chromophore in mTFP0.7 7,18.

Figure 3 (a) MD-simulated electrostatic potentials on the chromophore atoms as a function of the distance along the straight line connecting CE2 and C2 atoms (see Fig. 4 for atom notations) for six selected FP structures. For EGFP and citrine, the two sets of data points correspond to two different pdb files. The linear regressions based on the potentials of the five atoms are shown by straight lines. (b) Comparison of experimentally measured electric fields with the fields obtained from the MD simulations of potentials shown in (a). Dashed line indicates exact coincidence. (c) Same as in (a) but when calculating potentials, in addition to the chromophore the contribution from five or four (in case of citrine) amino acids or water molecules in close proximity to the chromophore were also excluded. (d) Comparison of electric fields obtained from experimentally measured Δμ values using the parameters Δμ HB and Δα either obtained from experiment (blue symbols) or calculated (green symbols) with the fields obtained from the MD simulations of potentials shown in (c). Standard deviations are shown by bars in each panel. Full size image

Our simple Stark effect model, Fig. 2, also explains the large red shift of yellow FP citrine vs. EGFP by the much smaller Δμ value of the former, i.e. 1.5 vs. 3.6 D. The intermediate red shifts in T203I mutants (orange pentagons in Fig. 2) are also due to a decrease of Δμ. Structurally, in citrine, the key mutation which significantly red-shifts the absorption is T203Y38. This substitution reduces the number of hydrogen bonds at the phenolate oxygen from 3 to 2 and, also, creates a new, π-π stacking interaction of the Tyr203 phenol group with the chromophore38. It is interesting that the main part of the red shift is due to the loss of a hydrogen bond because even aliphatic substitutions at the same position, i.e. T203V and T203I produce a similar, although slightly smaller, red shift (see Ref. 39 and Fig. 2). It has been proposed39 that the extra hydrogen bond, created by the threonine hydroxyl in EGFP, pulls the electron density to the phenolate oxygen, thus stabilizing the ground state and shifting the transition energy up as compared to the EGFP T203I mutant. This interaction is strongly localized near the OH atom and appears to be not fully detectable by averaging the potential changes over the further removed atoms CE2, CD2, CG2, CB2 and CA2, as is done in Fig. 3(a). On the other hand, it significantly changes Δμ of the chromophore and should therefore be observed as a contribution to the effective experimental field. In fact, the difference between the experimental and simulated fields in citrine ~10 MV/cm, see Fig. 3(b), is probably due to this unaccounted short-range interaction. In the next section we refine the model to separate the effects of the short- and long-range fields.

Refined model that includes structural changes of the chromophore. Long-range electric fields

In an attempt to refine the model and elucidate the role of the short-range interactions, which are probably the main cause of discrepancies between the effective experimental and simulated field values (cf. citrine), we now consider a cluster, consisting of the π-conjugated system of the chromophore itself and its closest surrounding. Within the cluster, the chromophore interacts specifically with the inner shell of several chemical groups. In particular, we consider the hydrogen bonding to the OH and O2 atoms of the chromophore as a main source of the short-range interactions. These hydrogen-bonded groups can perturb the initial (vacuum) dipole moments μ g and μ e . As a result, the dipole moment difference of the chromophore in the hydrogen-bonded cluster, Δμ HB , can differ from that observed in vacuum. We will further assume that the Δμ HB is invariant for a series of FP mutants with the same number and character of hydrogen bonds in the cluster.

To obtain the total dipole moment difference, Δμ, we should include, in addition to Δμ HB , a contribution that is induced by the long-range electric field, E’ Δμ , created by the outer parts of the protein. We use the prime sign to distinguish the long-range part of the field from the total field (E Δμ ). Using the same approach, as in (1)–(2) and again assuming that Δα does not depend on the long- and short-range field variations, we now obtain for the long-range field within a class of proteins with particular Δμ HB :

The unknown parameter Δμ HB can be obtained either from quantum mechanical calculations or from comparison of experimental data with the refined model, describing optical transitions in a series of proteins with the same local surrounding.

Quantum mechanical calculations of Δμ HB and Δα

For generality, we performed quantum mechanical calculations of the dipole moment differences in vacuum and in four different hydrogen-bonding clusters by using the same method (see Methods for details).

First, the chromophore with five hydrogen-bonded water molecules represented the cluster encountered in water solution (cluster (1)). The cluster (2) included the chromophore with five hydrogen-bonding contacts (5-HB cluster), corresponding to mTFP0.7 structure. It contained His163 (modeled with neutral imidazole group), Ser146 (modeled with the CH 3 OH group) and an H 2 O molecule, all three bonded to the phenolate oxygen, as well as Arg95 (modeled with positively charged guanidinium group) and another H 2 O molecule, both bound to the imidazolinone oxygen. The next cluster (3) corresponded to the EGFP where His148, Thr203 and a water molecule were H-bonded to the phenolate oxygen and Arg96 and Gln94 were H-bonded to the imidazolinone oxygen. The cluster (4), corresponding to citrine, contained the chromophore with His148 and H 2 O both H-bonded to the phenolate oxygen and R96 (H-bonded to imidazolinone oxygen) and Q94 (probably weakly or not H-bonded, but very close to imidazolinone oxygen). The Q94 is retained in the citrine cluster for easier comparison with EGFP.

We select the molecular frame, such that the x-axis is directed from the geometrical center of the imidazolinone ring to the geometrical center of the phenol ring that is virtually parallel to the (calculated) transition dipole vector. The y-axis is directed orthogonally and lies in the average molecular plane, as shown in the Fig. 4. The z-axis makes a right triad with x and y. The results of the calculations are presented in Table 1.

Figure 4 Schematic of the cluster, using mTFP0.7 structure (pdb file 2OTB) as an example that contains the HBDI− chromophore and five hydrogen-bonded groups. Full size image

The calculation shows that the Δμ HB vector is directed almost perfectly along the x-axis, similarly to what was found in previous calculations for the HBDI− chromophore in vacuum, water and GFP protein (Supplementary Table S1), thus justifying the one-dimensional model of the chromophore.

Note that the calculated value of Δμ in vacuum (4.35 D) is in good agreement with the experiment (4.47 ± 0.16 D). The addition of five hydrogen-bonded water molecules to the chromophore slightly increases the Δμ magnitude (~5.2 D). On the other hand, the Δμ value decreases (compared to vacuum) in the mTFP-derived 5-HB cluster to 3.3 D (entry 3) which, according to our analysis, can be mainly attributed to the positively-charged Arg95. Interestingly, removing all the surrounding groups, but keeping the chromophore structure the same as it was found after optimization in the cluster, produces only minor change of the Δμ, i.e. from 3.3 to 2.9 D, cf. entries 3 and 4 of Table 1. This important result demonstrates that in this case the change of Δμ in cluster vs. vacuum is primarily due to the change of the chromophore structure, namely to a different bond-length alternation pattern (cf. Ref. 17) and that the electronic redistribution is coupled to the structural change. Another interesting result is that in the citrine cluster, where there are only three hydrogen bonds (entry No. 6), the Δμ HB value is further significantly reduced compared to that observed in the 5-HB clusters.

To obtain the Δα value, the dependence of Δμ on E Δμ was calculated by applying a uniform electric field along the x-axis of the chromophore (Methods and Supplementary Fig. S1). The resulting value, Δα = −49 Å3, is in reasonable agreement with the experimental one (−35 Å3). Using the calculated parameters of Δα and Δμ HB (Table 1), as well as experimental values of Δμ, we evaluated the long-range field E′ Δμ by using eq. (5). For four selected proteins the results are presented in Fig. 3d (abscissa, green stars) and for the rest of the proteins – in Supplementary Table S3.

Experimental evaluation of Δμ HB

We also carried out an independent evaluation of the parameter Δμ HB for the subset of 5-HB proteins using experimental measurements. To this end, we considered the dependence of transition frequency on the electric field by separating the effects of structural and pure electronic responses. First, we assume that both short- and long-range electric fields perturb molecular structure such that the alternation of the single and double bond lengths changes accordingly. To obtain the dependence of transition frequency on the field and hence on Δμ, the resonance theory was applied in the simple two-forms two-states (2F2S) implementation40,41,42 adapted to the GFP chromophore in17,18,19,28,43. This model predicts the quadratic dependence of the optical transition frequency on the value of Δμ, Ref. 19. We then include the pure electronic interaction of the chromophore system with the long-range electrostatic field through the second-order Stark effect. As a result, the following dependence of the transition frequency on Δμ was obtained (see Supplementary Discussion):

with

where is the parameter describing the coupling between the two resonating forms19, , η is a dimensionless parameter whose physical meaning consists in the “screening” of the charge-transfer transition via redistribution of those electrons clouds which are not directly involved in the electron transfer related to Δμ19,44, μ is the transition dipole moment between the ground and excited states, Δμ c is a constant, corresponding to the difference between the dipole moments in the excited and ground states in a particular chromophore geometry corresponding to the effective bond length alternation equal to 0 (see Supplementary Fig. S2). The three coefficients, A, B and C were found from the best fit of the vs Δμ dependence for a subset of proteins with 5 HBs in the cluster to a second order polynomial (see Supplementary Fig. S3). Since there are four unknown parameters ( , δ, Δμ c and ΔΔμ HB ) in the system of three equations (7)–(9), we use an additional equation which is obtained by considering the 2F2S model in the zero field (vacuum), i.e. E’ Δμ = 0 and Δμ = Δμ 0 . This equation reads (see Supplementary Discussion):

where = 20725 cm−1 is the transition frequency in vacuum29. The resulting experimental parameters involved in the model were obtained by solving eqs. (7)–(10), (See Supplementary Discussion) and are presented in Table 2. Most of them were also independently calculated quantum-mechanically (Table 2) and agree well with the experiment.

Table 2 Experimental and calculated model parameters, describing the transition frequency as a function of Δμ in a series of FPs where the effective chromophore is presented as a cluster containing the HBDI− molecule and five hydrogen-bonded groups, i.e. representing mTFP- and EGFP-derived mutants (see Fig. 4). Full size table

Combining the values of Δμ HB = 3.4 D and Δα = −35 Å3, both found experimentally, the fields E ′ Δμ were calculated according to (5) (Supplementary Table S3).

Comparison of the long-range fields, obtained from Δμ and from MD simulations

To check these two sets of data (with Δμ HB and Δα either calculated or found from experimental measurements) against the MD simulations, the same group of mutants was selected as in Fig. 3(a)–(b). Now the charges on the chromophore and 5 amino acids/water molecules (4 in case of citrine) around it were disregarded in calculating the potentials. Figure 3(c) shows the variation of potentials along the straight line connecting the CE2-CD2-CG2-CB2-CA2 chain of atoms, which provided the fields for all the proteins.

Figure 3(d) compares the MD simulated fields (y-axis) with the fields obtained from model eq. (5). The correlation between the MD simulations and the model is good (particularly for the calculated Δμ HB and Δα, green stars). For citrine it is improved significantly, as compared to the case of the total effective field, cf. Figure 3(b). The long-range electric fields inside the proteins under study vary in the range from ~1.6 MV/cm in citrine to −13 MV/cm in mTFP 0.7. This variation is smaller than that of the total effective field, Fig. 3(b), which emphasizes the importance of the short-range interactions in the wide spectral tuning of the FP. The direction of the long-range field, found in the four mutants under study is opposite to that of Δμ, i.e. E ′ Δμ is directed from the phenolate to the imidazolinone. Note that in the 5-HB proteins, mTFP0.7, mTFP1.0 and EGFP, the field was shifted to more negative values by ~9 MV/cm, when the cluster was discarded (cf. Figure 3 (b,d), blue stars). This is because the positively charged R96 residue (EGFP notation), being a part of the cluster in the three proteins, created the short-range field in the positive direction, i.e. from imidazolinone to phenolate, see Fig. 4.

In the case of citrine, the long-range field value is much better reproduced by MD simulations and it is also much closer to that of EGFP, as compared to what was observed for the total effective field, cf. Figure 3(d,b). This reflects the fact that the large part of structural changes between the two mutants occur at a short range. In particular, the reduced number of hydrogen bonds in citrine vs. EGFP results in a significant decrease of Δμ, from 3.6 D (EGFP) to 1.5 D (citrine), which can be simulated by a large increase of the short-range contribution to the effective field. Note that the much lower value of Δμ in citrine is consistent with the previous observation of the smaller static first hyperbolarizability in EYFP compared to EGFP45,46. A little upward shift of the long-range field, by ~5 MV/cm, in citrine vs. EGFP, cf. green stars in Fig. 3(d), can be tentatively attributed to the dipole-induced contribution of the Tyr203 phenol group.

Field-induced broadening of the absorption spectrum

Simple inspection of Fig. 1 suggests that the blue shift of absorption in the series of mutants is accompanied by systematic broadening of the spectrum. Since the more blue-shifted spectra correspond to larger Δμ (Fig. 2), we hypothesized that the broadening is due to Stark-induced variations of transition frequencies in response to stochastic fluctuations of local fields. If we assume that the field’s fluctuations obey the normal distribution with the standard deviation δE Δμ and that δE Δμ does not vary much from one protein to another (which is supported by MD simulations, see Supplementary Tables S4 and S5), then larger Δμ should produce a larger spread of transition frequencies. In this case, the field-dependent part of the spectral broadening, , can be obtained by using Eqs. (3) and (2),

If, in addition to , there is also a field-independent broadening, due to e.g. low-frequency vibrations, which is described by a Gaussian envelope with standard deviation γ, then the total linewidth of the 0–0 transition will read:

Figure 5 shows an experimental correlation of w2 and Δμ2. From the slope and intercept of the linear regression, we obtain δE Δμ = (5 ± 0.3) MV/cm and γ = 270 ± 25 cm−1, respectively. The first value compares well with the standard deviation of the fields fluctuations found in MD simulations (~6–7 MV/cm, see Supplementary Tables S4 and S5). The second value is close to the linewidth of the 0–0 transition of the bare chromophore in vacuum (310 ± 45 cm−1)29. This justifies our assumption that the protein-to-protein variations of spectral widths can be explained by the electric field-induced variations of the Stark shifts of individual transitions.

Figure 5 Dependence of the square of the spectral width (standard deviation of the Gaussian envelope, see Methods) of the 0–0 1PA transition on the square of the change of permanent dipole moment for a series of 26 GFPs. The colors of symbols are the same as in Fig. 2. Full size image

Specific case: HBDI− chromophore in water. Solvent reaction field E = E R

The 0–0 transition of the HBDI− in D 2 O solution is significantly shifted to higher frequencies compared to all the proteins and the corresponding Δμ value reaches the largest value, 6.5 D, Fig. 2. Several high-level quantum calculations predict similar Δμ values for HBDI− in water (Supplementary Table S1). The CASPT2/CASSCF method treating solvent molecules explicitly16,22 provides a very good agreement with the experiment. Simulation of the solvent with a polarized continuum model gives a slightly overestimated result, i.e. 8.6 D24. Our own quantum mechanical calculations show that the addition of five hydrogen-bonded water molecules to a bare chromophore (three at the phenolate oxygen and two at the imidazolinone oxygen, cf. Ref. 22) slightly increases Δμ, from 4.4 D (in vacuum, both experimental and theoretical) to 5.2 D. Therefore the increase of Δμ from 4.4 D in vacuum (from 5.2 D in cluster) to Δμ = 6.5 D in bulk water may be attributed to the effect of a polarizable water solution around the chromophore (or cluster). The corresponding part of the dipole moment change, Δμ ind = 2.1 D (1.3 D), induced by the reaction field E R of the polarized solvent, can be presented as follows, cf. (1),

Here both Δμ ind and E R represent the projections of the corresponding vectors on the direction of Δμ 0 (Δμ HB ). Using experimental values of Δα = −35 Å3 and Δμ ind = 2.1 D (1.3 D), we find E R = −18 MV/cm (−11 MV/cm).

For comparison, we also use the classical polarizable dielectric model to estimate the reaction field of the solvent. In this model, the field E R is created by a chromophore with the ground-state dipole moment μ g and polarizability α g , that is placed in the center of a spherical cavity with radius a possessing the internal dielectric constant ε in , which in turn is immersed in a solvent with the dielectric constant ε. The field is assumed to be the same for the ground and excited states of a molecule and is presented as47:

where f is the cavity field factor, described as47,48:

An estimation, using ε = 79 and ε in = 247,48, as well as experimentally determined parameters (see Supplementary Discussion): a = 4.7 Å (for the cluster of the HBDI− chromophore with 6 hydrogen-bonded water molecules), α g = 22 Å3 and literature value μ g = −11 ± 2 D22,31,49, yields E R = −17 MV/cm. The latter value agrees well with those obtained above from the field-induced dipole moment change in cluster (−11 MV/cm) and in vacuum (−18 MV/cm). Due to symmetry reasons, the field E R is directed along the μ g vector. Since in HBDI− chromophore the permanent dipole moment decreases upon excitation20,50, the Δμ is antiparallel to μ g and therefore E R is antiparallel to Δμ. This means that E R found for water has the same direction (in the chromophore frame) and similar amplitude as the long-range field E’ Δμ inside the mTFP0.7 protein.

In conclusion, we have evaluated local electric fields in a series of fluorescent proteins with the same anionic GFP chromophore by using simultaneous measurements of 1PA and 2PA cross sections and the 0–0 transition frequency. In the first approximation, we considered the chromophore interacting with the rest of the protein and obtained the total effective protein field. In the second, - we considered a subset of FPs with similar local surrounding and selected a hydrogen-bonded cluster interacting with further-separated layers of the protein, which made it possible to obtain the long-range part of the field. Independent evaluations of the total and the long-range fields complement each other and provide the local, short-range field. We have shown that both total and long-range fields are well reproduced by MD simulations, which justifies our method of measurement. Our model quantitatively explains the spectral shifts of yellow and teal FPs versus EGFP as well as the spectral position of the isolated chromophore in water solution.