First published Wed Sep 12, 2001; substantive revision Thu Aug 15, 2019

The equation \(E = mc^2\) is, arguably, the most famous equation in 20th-century physics. To appreciate what Einstein’s famous result is about, and what it is not about, we begin in Section 1 with a description of the physics of mass-energy equivalence. In Section 2, we survey six distinct, though related, philosophical interpretations of mass-energy equivalence. We then discuss, in Section 3, the history of derivations of mass-energy equivalence and its philosophical importance. Section 4 is a brief and selective account of empirical confirmation of Einstein’s result that focuses on Cockcroft and Walton’s (1932) first confirmation of mass-energy equivalence and a more recent, and very accurate, confirmation by Rainville et al. (2005).

The two main philosophical questions surrounding Einstein’s equation, which are the focus of this entry, concern how we ought to understand the assertion that mass and energy are in some sense equivalent and how we ought to understand assertions concerning the convertibility of mass into energy (or vice versa).

Over a century after Einstein’s first derivation of mass-energy equivalence, as his famous result is called because one can select units in which one can express it with an equation of the form \(E = m\), the result continues to receive outstanding empirical support. Furthermore, as the physicist Wolfgang Rindler has pointed out, the result “has been found applicable and valid in many branches of physics, from electromagnetism to general relativity” (Rindler 1991, p. 74). Thus, from Rindler’s perspective, which is shared by many physicists, mass-energy equivalence “… is truly a new fundamental principle of physics” (Rindler 1991, p. 74).

Einstein correctly described the equivalence of mass and energy as “the most important upshot of the special theory of relativity” (Einstein 1919), for this result lies at the core of modern physics. Many commentators have observed that in Einstein’s first derivation of this famous result, he did not express it with the equation \(E = mc^2\). Instead, Einstein concluded that if an object, which is at rest relative to an inertial frame, either absorbs or emits an amount of energy \(L\), its inertial mass will correspondingly either increase or decrease by an amount \(L/c^2\). In Newtonian physics, inertial mass is construed as an intrinsic property of an object that measures the extent to which an object resists changes to its state of motion. So, Einstein’s conclusion that the inertial mass of an object changes if the object absorbs or emits energy was revolutionary and transformative. For, as Einstein concluded, “If the theory agrees with the facts, then radiation transmits inertia between emitting and absorbing bodies” (Einstein 1905b). Yet, in Newtonian physics, inertia is not the kind of thing that can be transmitted between bodies.

1. The Physics of Mass-Energy Equivalence

A lot of philosophical attention has been paid to how the meaning of statements concerning lengths and times changes in the context of special relativity. For example, we have learned from Stein (1968) that a statement such as “the length of the table is 2m,” while perfectly meaningful in the context of Newtonian physics, is either elliptical at best or meaningless at worst in the context of special relativity. However, comparatively little philosophical attention has been paid to how special relativity brings about corresponding changes to the third fundamental dimension in basic physics: mass. Yet, assertions concerning mass, and other dynamical concepts in mechanics in which it figures centrally, notably momentum and kinetic energy, also have their meanings changed in special relativity.

To illustrate these changes, Section 1.1 reviews the concepts of mass, momentum and kinetic energy in Newtonian physics. In Section 1.2, we begin exploring the concepts of mass and energy in special relativity and state the notation we use in all further discussions, a step we must make explicit as discussions of the equivalence of mass and energy have historically used different notations, which can lead to conceptual confusion. In Section 1.3, we discuss the physical significance of mass-energy equivalence as it applies to the analysis of bodies (construed as indivisible wholes) and idealized versions of composite systems, while in Section 1.4 we discuss mass and energy in atomic physics. In Section 1.5, we present an elementary derivation of Einstein’s famous result with the hope that our readers will get a sense of “why” mass and energy are equivalent according to special relativity. Finally, in Section 1.6, we consider the relationship between the equivalence of mass and energy codified in the equation \(E_o = mc^2\) and the nearly identical, though conceptually quite different, ubiquitous equation \(E = mc^2\).

1.1 Review of Mass, Momentum, and Kinetic Energy in Newtonian Physics

In Newtonian physics, a typical physical object, such as a billiard ball, has associated with it a positive, real number called its mass, for which the symbol \(m\) is commonly used. When we are focusing on describing mathematically the motion of this object under the action of contact forces, such as when one tries to predict where a billiard ball will go after a collision, the mass in question is also called the inertial mass. In the special case in which one considers how the billiard ball moves in response to gravity, say as it falls toward the earth, the measure of how the billiard ball “responds” to the gravitational field is called the gravitational mass of the object. Using experiments in which he filled wooden boxes with different materials and suspended them from strings to construct pendulums, Newton discovered that inertial mass and gravitational mass are directly proportional. Physicists have since customarily treated inertial and gravitational mass as numerically equal.

The inertial mass in Newtonian physics (and even the gravitational mass) is routinely interpreted as an intrinsic property of an object. The inertial mass of an object is a measure of the body’s inertia, i.e., its tendency to resist changes to its state of motion in response to the action of any kind of force. Since at least the late nineteenth century and Mach’s criticisms of Newton’s physics, physicists have deprecated thinking of mass as the “quantity of matter.”

The physical intuition behind the Newtonian concept of inertial mass is basic if we can avail ourselves of the notion of a Newtonian force. Given two bodies \(B_1\) and \(B_2\) in suitably idealized conditions, if it takes twice the force for \(B_2\) to attain the same final velocity as \(B_1\), \(B_2\) has twice the mass of \(B_1\), i.e., \(B_2\) has twice the inertia of \(B_1\). In Newtonian physics, the inertial mass of an object, and even its gravitational mass, can only change by either physically removing a part of the object or attaching a part to the object to make a bigger whole.

Two important quantities for describing the motion of objects in Newtonian physics are momentum and kinetic energy. Unlike mass, each of these quantities is what we might call a relational, or extrinsic, quantity. Although all physical objects have some value (or amount) of momentum and kinetic energy, the value of each of these quantities depends on the inertial reference frame relative to which one is measuring these quantities.

Newton called momentum the “quantity of motion,” which is an apt label, because, very roughly, it is a measure of the extent to which an object is moving relative to an inertial reference frame. Our colloquial use of the word “momentum” is related to this intuition and to its formal definition as the product of the mass of an object and its velocity relative to some inertial reference frame. If object \(B_2\) has twice the mass of \(B_1\) but moves with the same velocity relative to a reference frame, \(B_2\) has twice the momentum.

Velocity, however, is a directed quantity, because it codifies not just the speed \(v\) with which the object moves, but also its direction. Velocity is formally represented by a vector \(\mathbf{v}\), the magnitude of which is the speed of the object \(v\), which we sometimes also call the “velocity” of the object while allowing the context to indicate that this is elliptical for “the magnitude of the velocity.” Momentum is therefore also a directed quantity, represented formally by a vector \(\mathbf{p}\). To paraphrase Taylor and Wheeler (1992, p. 191), the direction of momentum matters: A glancing blow is never as damaging as one that is head on. By analogy with velocity, the magnitude of the momentum is represented by the letter \(p\).

Finally, in Newtonian mechanics, an object in motion relative to a reference frame also has kinetic energy, or energy of motion. Kinetic energy, like all forms of energy, can be transformed into another kind of energy. So, for example, one could measure the kinetic energy of a moving billiard ball by having it collide with a ball of soft putty or a spring that brings the ball to rest and measuring the energy absorbed by the putty or spring. There is no single standard symbol for kinetic energy: most elementary physics textbooks use \(KE\), while more advanced books tend to use \(T\).

Unlike momentum, kinetic energy is not a directed quantity. It is, like momentum, a function of the speed \(v\) of the object relative to an inertial frame. The faster an object moves relative to some inertial frame, the more momentum and kinetic energy it has. However, for an object that is not accelerating, i.e., an object that covers equal spaces in equal times, one can always find an inertial reference frame in which both the momentum and the kinetic energy of the object are zero.
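The frame dependence of these two quantities can be made concrete with a small numerical sketch. The following Python fragment is only an illustration: the class `Body`, the functions `momentum` and `kinetic_energy`, and the numerical values for the billiard ball are assumptions introduced here, and motion is restricted to one dimension so that velocity reduces to a signed number.

```python
from dataclasses import dataclass


@dataclass
class Body:
    """A hypothetical Newtonian point body moving along one axis."""
    mass: float      # inertial mass in kg (frame independent in Newtonian physics)
    velocity: float  # velocity in m/s relative to a chosen inertial frame


def momentum(body: Body, frame_velocity: float = 0.0) -> float:
    """Newtonian momentum p = m * v as measured in a frame that itself
    moves with frame_velocity relative to the original frame."""
    return body.mass * (body.velocity - frame_velocity)


def kinetic_energy(body: Body, frame_velocity: float = 0.0) -> float:
    """Newtonian kinetic energy KE = (1/2) * m * v**2 in the same frame."""
    relative_velocity = body.velocity - frame_velocity
    return 0.5 * body.mass * relative_velocity ** 2


# Illustrative values only: a 0.17 kg ball moving at 2 m/s.
ball = Body(mass=0.17, velocity=2.0)

# Relative to the original frame, the ball has momentum and kinetic energy.
p_lab = momentum(ball)
ke_lab = kinetic_energy(ball)

# Relative to a frame moving along with the ball, both quantities vanish,
# while the mass assigned to the ball is unchanged.
p_rest = momentum(ball, frame_velocity=ball.velocity)
ke_rest = kinetic_energy(ball, frame_velocity=ball.velocity)
```

On these assumed values, the momentum and kinetic energy are non-zero in the first frame and exactly zero in the co-moving frame, which is the sense in which both quantities are relational while Newtonian mass is not.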

Following Griffiths’ approach, we might say that so far we have merely defined some quantities (Griffiths 1999, p. 509 ff.). The physics really lies in the three corresponding conservation principles associated with these quantities: the principle of conservation of mass, the principle of conservation of energy, and the principle of conservation of (linear) momentum. These principles contain the physics, because each one states that a certain quantity, mass, energy, or momentum, is conserved in all interactions. So, for example, if we consider the collision of two billiard balls, one can use the conservation of momentum to predict, given the initial motions and masses of the billiard balls, how they will move after the collision.
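As a rough illustration of how conservation principles yield predictions, consider a head-on collision of two billiard balls in one dimension. The sketch below assumes, beyond what is stated above, that the collision is perfectly elastic, so that kinetic energy is conserved along with momentum; the function name `elastic_collision_1d` and the numerical values are hypothetical.

```python
def elastic_collision_1d(m1, u1, m2, u2):
    """Final velocities (v1, v2) for a head-on collision of two bodies.

    Assumes a perfectly elastic collision, so that both total momentum
    and total kinetic energy are conserved. Inputs are the masses (kg)
    and initial velocities (m/s) along a single axis.
    """
    total_mass = m1 + m2
    v1 = ((m1 - m2) * u1 + 2.0 * m2 * u2) / total_mass
    v2 = ((m2 - m1) * u2 + 2.0 * m1 * u1) / total_mass
    return v1, v2


# Illustrative values: two equal-mass balls, the second initially at rest.
m1, m2 = 0.17, 0.17
u1, u2 = 2.0, 0.0
v1, v2 = elastic_collision_1d(m1, u1, m2, u2)

# Conservation checks: the totals before and after should agree.
momentum_before = m1 * u1 + m2 * u2
momentum_after = m1 * v1 + m2 * v2
energy_before = 0.5 * m1 * u1 ** 2 + 0.5 * m2 * u2 ** 2
energy_after = 0.5 * m1 * v1 ** 2 + 0.5 * m2 * v2 ** 2
```

With equal masses, the moving ball stops and the struck ball moves off with the original velocity. The definitions of momentum and kinetic energy alone would not fix this outcome; the prediction comes from requiring that the totals be conserved.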

1.2 Mass and Energy in Relativity: Preliminaries and Notation

As Hecht has emphasized, Einstein never wrote down his famous result using the symbols “\(E\)” and “\(m\)” as they appear in the famous equation attributed to him (Hecht 2012). Although part of the reason is merely that Einstein used different letters for energy, mass, and the speed of light in his early papers discussing mass-energy equivalence (in 1905 and 1906), deeper reasons related to how we should understand the result quickly emerged.

In his review article on special relativity from 1907, Einstein shows that a body of mass \(\mu\) that has absorbed an amount of energy \(E_o\) as measured in its rest frame executes motion, in an inertial frame relative to which it moves with some velocity, as if its mass \(M\) was given by the expression (Einstein 1907b, p. 286):

\[ M = \mu + E_o /c^2 \]

In a footnote, Einstein explains the convention, which he had already adopted in an earlier paper (Einstein 1907a, p. 250), of using “the subscript ‘\(o\)’ to indicate that the quantity in question refers to a reference system that is at rest relative to the physical system considered” (Einstein 1907b, p. 286).

After 1907, Einstein’s notation crystallizes so that by 1921, in his Princeton Lectures, Einstein expresses his famous result by writing (Einstein 1922, p. 46):

\[\tag{Einstein's Equation} E_o = mc^2 \]

Our main task in the next section is to explain the physical significance of the equation \(E_o = mc^2\), which we will henceforth call “Einstein’s equation”, and its relationship to its iconic variant without the subscript “\(o\)” appearing beside the letter “\(E\)”.

However, it is important before going much further to note that in Einstein’s equation \(E_o = mc^2\), the symbol \(m\) is the mass of an object as measured in the inertial frame in which that object is at rest. Physicists also call this mass \(m\) the “rest-mass” of the object. The rest-mass of an object is numerically equal to its Newtonian inertial mass, though arguably the symbol \(m\) (or the corresponding term “mass”) has different meanings in Newtonian and relativistic physics (see, e.g., Kuhn 1962, p. 101 ff. and Torretti 1990, p. 65 ff.).

Furthermore, there are conceptual reasons why both Einstein and many contemporary physicists do not add a subscript “\(o\)” to \(m\) when denoting rest-mass. For example, Taylor and Wheeler argue that the phrase “rest-mass” engenders potential confusion as it might lead the reader to ask: What happens to the rest-mass when the object is moving? Answer: nothing (Taylor and Wheeler 1992, p. 251). The rest-mass of an object is an invariant quantity in special relativity; it has the same value for all inertial observers. Taylor and Wheeler quip: “In reality mass is mass is mass” (Taylor and Wheeler 1992, p. 251). From their perspective, which is quite standard now in physics textbooks and articles, there is no need to prefix the term “mass” with “rest,” because there is no other kind of mass worth speaking about in special relativity (see Section 1.6).

Henceforth, we will adopt the accepted convention, though it is admittedly not universally followed, especially in older sources, of using the following symbols with their stated meanings (see Table 1):

| Symbol | Meaning |
|---|---|
| \(E\) | The total energy of a physical system, unless otherwise indicated |
| \(E_o\) | The rest energy of a physical system or an amount of energy as measured in the rest frame of an object (typically energy that is emitted or absorbed by an object) |
| \(m\) | The mass (i.e., rest-mass) of a physical system |

Table 1. Notation used in this entry

We will first focus on the mechanics of idealized point particles in special relativity. Regardless of the macroscopic objects they represent, we shall treat such particles, and by extension the corresponding physical objects, as un-analyzable wholes. We will then separately discuss “composite systems” or systems composed of such particles. Physicists use composite systems to approximate physical objects when they are interested in examining the “inner workings” of such objects, such as when they treat a gas as a collection of idealized particles.

1.3 The Physical Significance of Einstein’s Equation

Following Geroch, we can begin to explain the physical significance of Einstein’s equation \(E_o = mc^2\) by considering a very simple physical system. Imagine, as Geroch suggests, a brick being heated or a battery being charged (Geroch 2005, p. 198). Suppose we consider these objects in the inertial reference frame in which they are at rest. As the brick is heated, or the battery charged, it absorbs an amount of energy \(E_o\) as measured in its rest frame. Einstein’s equation tells us that the mass of the brick, or battery, after it absorbs an amount of energy \(E_o\) is increased exactly by an amount \(E_o /c^2\). The value of the mass of the hot brick or charged battery is greater than it was before it absorbed energy. So, for example, it takes just a little bit more force to move the charged battery than it did to move the uncharged battery. How much is “a little bit more”?

Geroch has a clever way to answer this question. Suppose we use the amount of energy \(E_o\), as measured in the rest-frame of the battery, to accelerate the battery (instead of charging it) so that it eventually reaches a final speed of 670 mph. Since 670 mph seems like something moving pretty fast, at least by the non-relativistic standards of human travel, one might think that one is using quite a bit of energy to accelerate the battery to that speed. Now suppose that instead of using the energy \(E_o\) to accelerate the battery, we use that same amount of energy \(E_o\) to charge the battery. The increase in the mass of the charged battery is \(E_o /c^2\), but because the speed of light is such a large number, approximately 670 million mph, “the mass of the battery would be increased by about 1 part in a million-million (i.e., by a fraction \(10^{-12}\)…)” (Geroch 2005, p. 199).
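Geroch’s estimate can be checked with a few lines of arithmetic. The sketch below assumes that the Newtonian expression \(E_o = \tfrac{1}{2}mv^2\) is an adequate approximation for the energy needed to reach 670 mph, in which case the fractional increase in mass is \(E_o/(mc^2) = \tfrac{1}{2}(v/c)^2\) and the mass of the battery cancels out. The function name and constant names are introduced here for illustration only.

```python
MPH_TO_M_PER_S = 0.44704        # metres per second in one mile per hour
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def fractional_mass_increase(speed_mph):
    """Fractional mass increase E_o / (m c^2) when the energy E_o that
    would accelerate a body to speed_mph is instead absorbed at rest.

    Uses the Newtonian kinetic energy (1/2) m v^2 as an approximation,
    so the result is (1/2) * (v / c)**2, independent of the mass m.
    """
    speed = speed_mph * MPH_TO_M_PER_S
    return 0.5 * (speed / SPEED_OF_LIGHT) ** 2


fraction = fractional_mass_increase(670.0)
```

On these assumptions the fraction comes out at roughly \(5 \times 10^{-13}\), which is of the same order of magnitude as the figure of about 1 part in a million-million that Geroch quotes.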

There is nothing unique about energy absorption in these examples. As the battery loses energy, say by powering a device, or the brick emits thermal energy as it cools, its mass decreases. Consequently, imagine a suitably idealized closed system in which two objects \(B_1\) and \(B_2\) are in a state of relative rest. If \(B_1\) radiates an amount of energy \(E_o\) and \(B_2\) fully absorbs that same amount of energy, the mass of \(B_1\) decreases by \(E_o /c^2\) while the mass of \(B_2\) increases by the same amount. This physical change in the masses of \(B_1\) and \(B_2\) is a novel prediction in special relativity. For in Newtonian physics, there is no relationship at all between the inertial mass of a body and the amount of energy it radiates or absorbs. This is why Einstein was led to conclude that “If the theory agrees with the facts, then radiation transmits inertia between emitting and absorbing bodies” (Einstein 1905b). If special relativity is supported by empirical evidence, the inertial mass of an object can change, not because we have chopped off a piece of the object or attached more stuff to it, but merely because the object has radiated or absorbed energy. To physicists and philosophers trained exclusively in the Newtonian tradition, this result would have seemed extraordinary, and certainly revolutionary.

So far, we have been focusing on what physicists such as Baierlein (2007) call the incremental version of mass-energy equivalence, because we have focused on the strict correlation in special relativity between a change in the rest energy of a body \(E_o\) and a change in its mass \(m\). However, Einstein also emphasized two subtly different “readings” of his equation. First, as early as 1906, Einstein argued that when one considers physical systems in which there are electromagnetic processes, such as a “complex” of light being emitted from the inside wall of a freely-floating box and absorbed by the opposite wall, one could avoid a fundamental conflict with the laws of mechanics “if one ascribes the inertial mass \(E/V^2\) to any energy \(E\)” (Einstein 1906, p. 206. Note that \(V\) represents the speed of light. See also Taylor and Wheeler 1992, p. 254, for a detailed discussion of this example). This insight, sometimes expressed with talk about the “inertia of energy,” was an important step in the development of general relativity, because that theory uses the principle of equivalence, which states very roughly that inertial mass and gravitational mass are directly proportional, as a foundational principle.

At a time when physicists were moving toward regarding energy-carrying fields, such as the electromagnetic field, as entities in their own right, combining the principle of equivalence with the “inertia of energy” led Einstein to the insight that fields themselves could gravitate. So, for example, Einstein claims in 1907 to show that “radiation enclosed in a cavity possesses not only inertia but also weight” (Einstein 1907b, p. 288). A more contemporary and fairly common way of making the same point is to say that a mirror box with perfectly reflecting walls filled with light (conceived here for convenience as an energy-carrying disturbance in the electromagnetic field) is attracted by gravity by an amount that is greater than the rest mass \(M\) of the box itself. Specifically, if the energy of the light in the box is \(E_L\), gravity acts on the box not as if the box has a mass \(M\) but as if the box has a mass \(M + E_L /c^2\). Although the difference is numerically tiny, strictly speaking a balance with an empty mirror box on one side and an identical mirror box filled with light on the other would not be level.

Conversely, Einstein also emphasized reading his famous equation “in the other direction,” as it were. For example, in his review article from 1907, he says, “with respect to inertia, a mass \(\mu\) is equivalent to an energy content of magnitude \(\mu c^2\)” (Einstein 1907b, p. 287). Einstein uses the phrase “energy content” (translated from the original German “Energieinhalt”) to convey a notion that is new in special relativity: A physical object contains within it energy that, at least in principle, could be transformed into other forms of energy such as kinetic energy (as we now know only too well). Because this rest energy is contained within the boundaries of a body when we treat the object as a whole, physicists, such as Rindler, also call it “internal energy” (Rindler 1991, p. 71).

As Rindler suggests, it is not at all an idle or childish question to ask: Where does all that “internal energy” reside (Rindler 1991, p. 75)? Rindler’s answer to this question assumes that we wish to analyze a body all the way down to its subatomic components. He explains:

A very small part of this energy resides in the thermal motions of the molecules constituting the particle, and can be given up as heat; a part resides in the intermolecular and interatomic cohesion forces, and some of that can be given up in chemical explosions; another part may reside in excited atoms and escape in the form of radiation; much more resides in nuclear bonds and can also sometimes be set free, as in the atomic bomb. But by far the largest part of the energy (about 99 per cent) resides simply in the mass of the ultimate particles and cannot be further explained. Nevertheless, it too can be liberated under suitable conditions, e.g., when matter and antimatter annihilate each other (Rindler 1991, p. 75).

However, when developing a theoretical description of a macroscopic object, it is seldom practical to construct that description by treating subatomic quantum objects as the fundamental constituents, as Rindler suggests.

Instead, when one analyzes a physical object theoretically by looking at its component parts, one has to decide the level of granularity at which one wishes to analyze the object. For example, if the object we are analyzing is a chunk of iron, and we are studying magnetism, it may be sufficient to analyze the iron by treating its magnetic domains as fundamental constituents. However, a magnetic domain is a chunk of iron that contains many, many atoms. For other purposes, we may elect to analyze the sample of iron all the way down to the atomic level. Of course, the atoms themselves have parts. So, for yet different purposes, we may elect to analyze the sample of iron at the sub-atomic level of quantum objects.

As Rindler’s description of where the rest energy resides makes clear, at each level of analysis, there are two main contributors to the rest energy of the “macroscopic” sample we are analyzing: (1) the energy equivalent of the sum of the rest-masses of the constituent elements considered as fundamental and (2) the sum of the energy “stored” or “carried” by the constituent elements. To perform this kind of analysis for a concrete object is a rather subtle affair. So physicists and philosophers writing about mass-energy equivalence tend to focus on the highly idealized notion of an ideal gas.

For the purposes of this kind of discussion, an ideal gas is composed of molecules that are treated as idealized point particles to each of which we assign a mass (i.e., a rest-mass). The molecules are treated as moving uniformly, i.e., with constant velocity, and as interacting only in perfectly elastic collisions. Consequently, if we consider a sample of gas contained in a massless vessel, the only energy “carried” by the constituent elements is the kinetic energy of the molecules.

When viewed from a Newtonian perspective, and assuming that the vessel containing the gas is itself massless, the mass of the vessel of gas is simply equal to the sum of the masses of the molecules. From a relativistic point of view, this last assertion states incorrectly that the rest-mass of the vessel of gas is equal to the sum of the rest-masses of the molecules. Yet, from a relativistic point of view, the rest-mass of the vessel of gas is equal to the sum of the rest-masses of the molecules plus the kinetic energy of the molecules divided by \(c^2\). Since, according to the Kinetic Theory of Gases, the temperature of the gas is proportional to the average kinetic energy of its molecules, if the gas temperature increases or decreases, the rest-mass of the vessel of gas increases or decreases accordingly by a tiny amount.

Assuming the principle of equivalence, which entails that we can measure inertial mass by measuring gravitational mass with a balance, we can illustrate the difference between the Newtonian and relativistic understanding of the ideal gas as follows. Imagine that we have two otherwise identical massless vessels filled with exactly the same amount and type of gas. In one vessel, the gas is at a temperature very near absolute zero, so its molecules have very little kinetic energy. In the other vessel, the gas is at a temperature of 500 °C. Place these two vessels of gas on the ends of a balance. According to Newtonian physics, the balance will be level, because both gas samples have exactly the same mass. According to relativity, the balance will not be level and will tip toward the side of the hot gas, because the high kinetic energy of the molecules contributes to the rest energy of the gas, which contributes, through Einstein’s equation, to the rest-mass of the vessel of gas.
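To get a sense of how small the predicted imbalance is, one can estimate it under some additional assumptions that are not part of the thought experiment as stated: that each vessel holds one mole of a monatomic ideal gas, that the total kinetic energy of the molecules is given by the non-relativistic expression \(\tfrac{3}{2}nRT\), and that the cold vessel is at 1 K. The function name below is hypothetical.

```python
GAS_CONSTANT = 8.314462618      # molar gas constant R in J/(mol K)
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def thermal_mass_equivalent(moles, temperature_kelvin):
    """Mass equivalent (kg) of the molecular kinetic energy of a gas.

    Assumes a monatomic ideal gas, whose total translational kinetic
    energy is (3/2) * n * R * T, and divides that energy by c**2.
    """
    kinetic_energy = 1.5 * moles * GAS_CONSTANT * temperature_kelvin
    return kinetic_energy / SPEED_OF_LIGHT ** 2


# 500 degrees Celsius is 773.15 K; the cold sample is assumed to be at 1 K.
hot_contribution = thermal_mass_equivalent(1.0, 773.15)
cold_contribution = thermal_mass_equivalent(1.0, 1.0)
mass_difference = hot_contribution - cold_contribution
```

On these assumptions the hot vessel is heavier by roughly \(10^{-13}\) kg, a difference far too small for an ordinary balance to detect, although it is non-zero in principle.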

1.4 Mass and Energy in Atomic Physics

Perhaps the most common examples used to illustrate Einstein’s equation concern collisions among sub-atomic objects. For our purposes, it is safe to treat atomic and sub-atomic objects as particles involved in collisions where the total number of particles may or may not be conserved.

The bombardment of a Lithium nucleus by protons is a historically significant and useful example for discussing mass-energy equivalence in collisions where the number of particles is conserved. Cockcroft and Walton (1932) were the first to observe the release of two \(\alpha\)-particles when a proton \(p\) collides with a \({^7}\Li\) nucleus. The reaction is symbolized as follows:

\[ p + {^7}\Li \rightarrow \alpha + \alpha \]

That the number of particles is conserved in this reaction becomes clear when we recognize that the \({^7}\Li\) nucleus consists of three protons and four neutrons and that each \(\alpha\)-particle consists of two protons and two neutrons.

In the bombardment of Lithium reaction above, the sum of the rest-masses of the reactants (the proton and the \({^7}\Li\) nucleus) is greater than the sum of the rest-masses of the products (the two \(\alpha\)-particles). However, the total kinetic energy of the reactants is less than the total kinetic energy of the products. Cockcroft and Walton’s experiment is routinely interpreted as demonstrating that the difference in the rest-masses of the products and reactants (times \(c^2\)) is equal to the difference in the kinetic energies of the products and reactants (but see Section 4 for further discussion of this experiment as a confirmation of mass-energy equivalence).
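The size of the energy difference in this reaction can be estimated from tabulated masses. The sketch below uses approximate modern values for the neutral-atom masses of hydrogen-1, lithium-7 and helium-4 in unified atomic mass units; these figures, and the conversion factor of about 931.494 MeV per atomic mass unit, are taken from standard tables rather than from Cockcroft and Walton’s paper, and should be treated as assumptions of the illustration. Atomic rather than nuclear masses are used because the four electrons on each side of the reaction cancel to a good approximation.

```python
U_TO_MEV = 931.494   # approximate energy equivalent of 1 u, in MeV

# Approximate neutral-atom masses in unified atomic mass units (u).
MASS_H1 = 1.007825   # hydrogen-1 (stands in for the proton)
MASS_LI7 = 7.016003  # lithium-7
MASS_HE4 = 4.002603  # helium-4 (stands in for the alpha-particle)

# Rest-mass of the reactants minus rest-mass of the products.
mass_defect_u = (MASS_H1 + MASS_LI7) - 2.0 * MASS_HE4

# Einstein's equation converts the mass difference into the expected
# gain in kinetic energy of the products over the reactants.
energy_released_mev = mass_defect_u * U_TO_MEV
```

With these values the mass difference is a little under 0.019 u, corresponding to a kinetic energy gain of roughly 17.3 MeV shared between the two \(\alpha\)-particles.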

Descriptions of collisions among sub-atomic particles such as the bombardment of Lithium make it seem as though one must admit that mass is converted into energy. However, influenced perhaps by the widely-known discussion of mass-energy equivalence by Bondi and Spurgin (1987) (see Section 2.3.1), physicists now explain such reactions not as cases of mass being converted into energy, but merely as cases where energy has changed forms. Typically, in these types of reactions, the potential energy that “contributes” to the rest-mass of one (or possibly more) of the reactants is transformed in a non-controversial way to the kinetic energy of the products. As Baierlein (2007, p. 322) explains, in the case of the bombardment of \({^7}\Li\) with protons and its subsequent decomposition into two \(\alpha\)-particles, the apparently “excess” kinetic energy of the \(\alpha\)-particles did not simply “appear” out of nowhere. Instead, that energy was there all along as the potential and kinetic energy of the nucleons. In other words, one can explain the change in mass and energy in the bombardment reaction by saying (i) that the potential and kinetic energies of the nucleons that make up the \({^7}\Li\) nucleus contribute to its rest-mass and (ii) that the vast amount of energy of the \(\alpha\)-particles was not “created” in the reaction, or “converted” from mass, but was simply transformed from the various forms of energy the nucleons possess.

Collisions among sub-atomic particles and their corresponding anti-particles are not quite so easily explained as merely involving the re-arrangement of particles and re-distribution of energy. The most extreme example of this sort, and one that is often used in the physics literature, is pair annihilation. Consequently, let us consider a collision between an electron \(e^-\) and a positron \(e^+\), which yields two photons \(\gamma\). Symbolically, this annihilation reaction is written as follows:

\[ e^- + e^+ \rightarrow \gamma + \gamma \]

According to the currently accepted Standard Model of particle physics, electrons and photons are both “fundamental particles,” by which physicists mean that such particles have no structure, i.e., such particles are not composed of other, smaller particles. Furthermore, the photons that are the products in the annihilation reaction have zero rest-mass. Thus, the rest-masses of the incoming electron and positron seem to “disappear” and an equivalent amount of energy “appears” as the energy of the outgoing photons. Of course, Einstein’s famous equation makes all of the correct predictions concerning the relevant masses and energies involved in this reaction. So, for example, the total energy of the two photons is equal to the sum of the kinetic energies of the electron and positron plus the sum of the rest-masses of the electron and positron multiplied by \(c^2\).

Finally, although mass and energy seem to “disappear” and “appear” respectively when we focus on the individual constituents of the physical system containing the incoming electron-positron pair and the outgoing photons, the mass and energy of the entire system remain the same throughout the interaction. Before the collision, the rest-mass of the system is simply the sum of the rest-masses of the electron and positron plus the mass-equivalent of the total kinetic energy of the particles. Consequently, the entire system (if we draw the boundary of the system around the reactants and products, which is, of course, a spatial and temporal boundary) has a non-zero rest-mass prior to the collision. After the collision, the system, which now consists of two photons moving in non-parallel directions, still has a non-zero rest-mass (for a detailed discussion concerning the rest-mass of systems of photons, see Taylor and Wheeler, 1992, p. 232).
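The claim that a system of two photons can have a non-zero rest-mass, even though each photon has zero rest-mass, can be illustrated with the standard relativistic relation between the mass, total energy and total momentum of a system, \((mc^2)^2 = E^2 - (pc)^2\). The sketch below works in units in which \(c = 1\) and energies are in MeV, takes the electron rest energy to be approximately 0.511 MeV, and assumes for simplicity that the electron and positron are at rest before annihilating. The function name and the particular configurations are introduced here for illustration.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.511  # approximate value of m_e c^2


def system_rest_mass(particles):
    """Rest-mass of a system, in MeV with c = 1.

    Each particle is a tuple (energy, px, py, pz). The system mass is
    sqrt(E_total**2 - |p_total|**2), where the energies and momentum
    components are summed over all particles first.
    """
    total_energy = sum(p[0] for p in particles)
    total_px = sum(p[1] for p in particles)
    total_py = sum(p[2] for p in particles)
    total_pz = sum(p[3] for p in particles)
    mass_squared = total_energy ** 2 - (
        total_px ** 2 + total_py ** 2 + total_pz ** 2
    )
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(mass_squared, 0.0))


e = ELECTRON_REST_ENERGY_MEV

# Before: electron and positron at rest (zero momentum, zero kinetic energy).
before = [(e, 0.0, 0.0, 0.0), (e, 0.0, 0.0, 0.0)]

# After: two photons emitted back to back; for a photon, |p| equals E.
after = [(e, e, 0.0, 0.0), (e, -e, 0.0, 0.0)]

# Contrast case: two photons moving in the same direction.
parallel = [(e, e, 0.0, 0.0), (e, e, 0.0, 0.0)]

mass_before = system_rest_mass(before)
mass_after = system_rest_mass(after)
mass_parallel = system_rest_mass(parallel)
```

In this idealized configuration the system has the same rest-mass, about 1.022 MeV, before and after the annihilation, whereas two photons travelling in exactly the same direction would have zero system rest-mass. This is why the qualification that the photons move in non-parallel directions matters.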

1.5 Why Does \(E_o\) Equal \(mc^2\)?

A common way in which the question “Why does \(E_o\) equal \(mc^2\)?” is interpreted by philosophers and physicists is that it is a request for a derivation that shows that given certain physical principles, such as the principle of relativity and the principle of conservation of energy, Einstein’s equation is a logical consequence of those assumptions. We will discuss briefly the history of derivations of Einstein’s equation in Section 3.

However, in this section, we wish to present a rather simplified version of just one of Einstein’s derivations, published in 1946 (Einstein 1946). We will follow closely the simplified version of Einstein’s 1946 derivation developed by Ralph Baierlein (1991), who has used his derivation to teach Einstein’s equation to undergraduates who are not science majors. John Norton has used essentially the same simplified derivation in the Cambridge Companion to Einstein (Norton 2014).

As we will see, if Baierlein’s assessment of his own derivation is correct, the price to pay for this particular simplification is that the derivation ceases to be relativistic, in the sense that none of the core principles at the heart of special relativity seem to be required to derive Einstein’s equation. Nevertheless, it may help readers get a “feel” for why \(E_o = mc^2\) and it does familiarize those not already conversant in the methods of relativistic physics with a style of reasoning that is common in that field. Those interested in a presentation of Einstein’s own 1946 derivation, which explicitly shows all of the relativistic assumptions and key approximation steps Einstein takes, may be interested in consulting the exposition by Fernflores (Fernflores 2018, vol. II, §3.3).

Like Einstein, many physicists and philosophers who wish to derive Einstein’s equation do so by considering an idealized physical configuration. Typically, one considers a physical object \(B\) involved in a symmetric physical interaction. For example, in Einstein’s original derivation from 1905, \(B\) emits two equally energetic pulses of light in opposite directions. Mermin and Feigenbaum have shown how one can also derive Einstein’s equation by considering the case where \(B\) emits two physical bodies, instead of pulses of light (Mermin and Feigenbaum 1990).

In Einstein’s 1946 derivation, instead of emitting light, \(B\) absorbs two equally energetic pulses of light symmetrically. The goal in all of these approaches is to perform a “before” and “after” comparison and to show that “after” the body \(B\) absorbs or emits energy, its mass (i.e., rest-mass) increases or decreases according to Einstein’s equation.

Using a common heuristic in relativistic physics, one considers the physical interaction first in the inertial reference frame in which \(B\) is at rest, which is sometimes called the “rest frame.” One then compares the mathematical description of this interaction to the corresponding description from the perspective of a different inertial reference frame that moves with a constant velocity relative to the rest frame. Finally, using dynamical principles, such as the conservation of energy or conservation of momentum, and the two principles at the core of special relativity (i.e., the relativity principle and the light principle), one shows that the dynamical principles require that after \(B\) suffers the interaction in which it absorbs or emits energy, its mass (i.e., rest-mass) must change by an amount given by Einstein’s equation.

From a very general point of view, the reasoning in these types of derivations can be displayed schematically, if somewhat roughly, like this:

For a particular idealized physical interaction, if:

(i) certain conservation principles are true, such as the principle of conservation of energy and the principle of conservation of momentum,

(ii) the two principles at the core of special relativity are true, i.e., the principle of relativity and the light principle, and

(iii) a body, treated as an un-analyzed whole, absorbs or emits an amount of energy \(E_o\) as measured in its rest frame,

then: its mass (i.e., rest-mass) increases or decreases by an amount \(E_o /c^2\).

Even at this very general level, one can see the limitations of these approaches. For example, since the analysis is based on a particular physical interaction \(\boldsymbol{I}\), one cannot immediately conclude that a body’s mass will change according to Einstein’s equation in a different type of physical interaction \(\boldsymbol{I}'\), for instance one in which electromagnetism plays no role.

Reasoning along these lines, Ohanian (2009, 2012) has argued that Einstein should not be credited with proving his famous equation. On the other hand, the physicist N. David Mermin (2011, 2012) claims that Ohanian’s demands for what counts as a “proof” in physics are too stringent. Even though we are considering a very specific interaction \(\boldsymbol{I}\), Mermin might say, we can understand that it is such a generic situation that we can confidently expect the result in all circumstances. Einstein himself believed that if \(\boldsymbol{I}\) involved a physical process in which there was an interaction between an electromagnetic field (say in the form of light) and an ordinary physical object, then \(\boldsymbol{I}\) was too specific to support the general conclusion that in all cases, a change in the rest energy of a body is accompanied by a change to its mass (i.e., rest-mass).

With all of these caveats in place, we are almost ready to understand Baierlein’s derivation of Einstein’s equation. Because Baierlein’s derivation involves analyzing a physical interaction in which an object emits light, i.e., electromagnetic radiation, we need first to state two dynamical properties of electromagnetic radiation. First, considered as an electromagnetic wave, like all waves, light carries energy. This is quite familiar to us today at an everyday level, especially because of all the “solar” devices we use. Second, light also carries momentum. As Baierlein reports, in the late nineteenth century, James Clerk Maxwell had already determined that a burst of light of energy \(E\) had momentum \(E/c\). So, for example, if a laser beam strikes a freely-floating mirror in outer space, the collision of light against mirror will impart a finite, non-zero momentum to the mirror. One can calculate exactly how the mirror will have its state of motion changed as a result of this interaction by using the principle of conservation of momentum.
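As a rough illustration of the second property, the sketch below estimates the recoil of a freely-floating mirror struck by a burst of light. It assumes a perfectly reflecting mirror at normal incidence, a mirror that is initially at rest and recoils slowly, and it neglects the small amount of energy the light loses to the recoiling mirror, so that the mirror gains approximately twice the momentum \(E/c\) of the incoming burst. The function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def light_momentum(energy_joules):
    """Momentum of a burst of light of the given energy (p = E/c)."""
    return energy_joules / C


def mirror_recoil_speed(energy_joules, mirror_mass_kg):
    """Approximate recoil speed of a free, perfectly reflecting mirror.

    Assumes normal incidence and a slow recoil, so the reflected burst
    carries (almost) the same momentum back and the mirror gains 2E/c.
    """
    momentum_gained = 2 * light_momentum(energy_joules)
    return momentum_gained / mirror_mass_kg


# A 1 kJ pulse striking a 1 g mirror.
print(mirror_recoil_speed(1000.0, 0.001))
```

The resulting speed is tiny for everyday energies, which is why the momentum of light goes unnoticed in ordinary circumstances.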

Baierlein asks us to consider an atom that emits two photons of equal energy “back-to-back” in opposite directions. Although Baierlein uses the Planck-Einstein expression \(E = hf\) for the energy of a photon, where \(f\) is its frequency, and \(h\) is Planck’s constant, as he himself points out the right-hand side of that expression does not figure significantly in his derivation. Baierlein’s derivation, he tells us, works in exactly the same way if we just think of an object that emits two “bursts” of light construed as classical electromagnetic radiation. We will adopt this latter approach and consider a body \(B\) that emits two bursts of light in opposite directions, each with an energy \(E/2\) (this is the approach Norton 2014 uses). The total energy emitted by \(B\) is therefore simply \(E\).

We wish now to examine this emission of light by \(B\) from two different inertial frames. The first inertial frame we consider is simply the inertial frame in which \(B\) is at rest. Since we will not be performing many calculations using quantities defined relative to this inertial frame, we will label it \(K'\). The direction of motion of the second inertial frame, which we will label \(K\), is carefully chosen to simplify the calculations.

First, we choose \(K'\) so that \(B\) emits the two bursts of light along the \(z'\)-axis in opposite directions. We then choose \(K\) so that \(K\) moves with velocity \(v\) in the negative direction along the \(x'\)-axis. Using standard conventions, this means that an observer at rest in \(K'\) (Alice) judges that the light emitted by \(B\) travels up and down the page. For an observer at rest in \(K\) (Bob), the atom \(B\) moves to the right with velocity \(v\) and the light is emitted toward the right making an angle \(\theta\) with the \(x\)-axis.

According to Baierlein, “symmetry alone requires that the atom \([B]\) remain at rest in Alice’s frame” (1991, p. 170). It also follows directly from this that since \(B\) remains at rest in \(K'\), the velocity of \(B\) does not change in \(K\) after the emission of light. However, as we will see shortly, the momentum of \(B\) does change in \(K\) because the light it emits carries away momentum along the \(x\)-direction. If we assume the classical definition for the momentum of \(B\) as the product of its mass \(m\) and its velocity \(v\), and \(v\) does not change, it follows that in order for the law of conservation of momentum to be satisfied, the mass (i.e., rest-mass) of \(B\) must change.

Let us now examine the light emission from the perspective of the inertial frame \(K\). Relative to \(K\), \(B\) moves (to the right) with velocity \(v\). Because relative to \(K'\) the bursts of light travel along the \(z'\)-axis and so have no component of velocity along the \(x'\)-axis, the \(x\)-component of the velocity of the light as measured in \(K\) must be \(v\). However, the velocity of the light also has a vertical component, i.e., a \(z\)-component. This is significant, because we wish to calculate the change in the momentum of \(B\).

Relative to \(K\), the momentum of \(B\) only changes along the \(x\)-direction, because the changes to the momentum along the \(z\)-direction are equal and opposite. Using elementary trigonometry, for one of the bursts of light, the momentum along the \(x\)-direction is:

\[ \frac{E}{2c} \cdot \cos \theta = \frac{E}{2c} \cdot \frac{v}{c}. \]

So, the total momentum of the emitted photons, relative to \(K\), is simply:

\[ \frac{E}{c} \cdot \frac{v}{c}, \]

which, by the principle of conservation of momentum, must be equal to the amount of momentum lost by \(B\).

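Now, since the velocity of \(B\) does not change (in either \(K\) or \(K'\)), and if we assume the classical expression for the momentum of \(B\), the momentum lost by \(B\) is the product of the mass \(m\) that it loses and its velocity \(v\), so we have,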

\[ m \cdot v = \frac{E}{c^2} \cdot v, \]

or simply,

\[ E = mc^2. \]

Finally, since \(E\) is the amount of energy lost by \(B\) as measured in \(B\)’s rest-frame, we could more perspicuously write:

\[ E_o = mc^2. \]

We have thus shown that when \(B\) emits an amount of energy \(E_o\) (while remaining in its current state of inertial motion, which is a state of rest relative to \(K'\)), the mass (i.e., rest-mass) of \(B\) decreases by an amount \(E_o /c^2\).
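The steps of this simplified derivation can be traced numerically. The following sketch is only a check of the arithmetic under the same assumptions used above (light of energy \(E\) carries momentum \(E/c\), the \(x\)-component of the light’s velocity in \(K\) is \(v\), and the momentum of \(B\) is given by the classical expression). The function name and the sample values are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def mass_lost_by_body(energy_joules, v):
    """Mass lost by B when it emits two bursts of total energy E.

    Follows the simplified derivation: each burst has energy E/2 and
    momentum E/(2c); in K its direction makes an angle theta with the
    x-axis such that cos(theta) = v/c.
    """
    cos_theta = v / C
    px_per_burst = (energy_joules / 2) / C * cos_theta
    px_total = 2 * px_per_burst  # x-momentum carried away by the light
    # Classical momentum with unchanged velocity: (mass lost) * v = px_total.
    return px_total / v


# The result does not depend on the relative velocity v of the two frames.
for v in (10.0, 3.0e7):
    print(v, mass_lost_by_body(1.0, v), 1.0 / C**2)
```

For any non-zero choice of \(v\), the mass lost agrees with \(E/c^2\), since the factor of \(v\) cancels exactly as it does in the algebra.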

Regardless of whether Ohanian (2009, 2012) is correct that Einstein’s own derivations do not constitute a “proof” of mass-energy equivalence because they consider physical configurations that are too specific, a new concern arises when one considers “simplified” versions of derivations of \(E_o = mc^2\) such as the one we have just reviewed. Baierlein concludes his derivation by praising, as one of its merits, that it “requires only the ratio of momentum to energy for electromagnetic radiation” (Baierlein 1991, p. 172). We have already used this aspect of the derivation in our presentation above.

However, Baierlein then goes on to state:

Moreover, this derivation—unlike Einstein’s 1905 derivation—makes no use of Lorentz transformations or other results from the special theory of relativity. In short, by 1873, Maxwell knew everything necessary to derive the equation \(\Delta E = (\Delta m)c^2\). All that was missing was a context of inquiry that would have led him to search for a connection between energy and inertia (Baierlein 1991, p. 172).

This is a remarkable conclusion, for, if correct, it suggests that when we use this kind of simplified version of Einstein’s 1946 derivation, we are not displaying how Einstein’s equation is a consequence of special relativity. By contrast, Einstein’s own 1946 derivation is explicitly relativistic (see Fernflores 2018, vol. II, Sec. 3.3).

1.6 Einstein’s Equation and the Iconic Equation

As we have seen, Einstein’s equation \(E_o = mc^2\) states that whenever there is a change in the rest energy of an object, there is a corresponding change in its mass (i.e., rest-mass). Although we have not exactly demonstrated it, Einstein’s equation is also interpreted as stating that any object with a non-zero mass (i.e., rest-mass) possesses a rest energy (also sometimes called “internal energy”).

However, in relativistic mechanics, i.e., in the study of the motions of idealized point particles that move in accordance with the theory of special relativity, an object’s total energy \(E\), which is defined as the sum of its kinetic energy and its rest energy, is given by the equation:

\[ E = m \gamma(v) c^2, \]

where \(\gamma(v)\) is the so-called “Lorentz factor.” This total energy \(E\) differs from the rest energy \(E_o\) for any object that moves with some velocity \(v\) relative to a given inertial frame. In such an inertial frame, an object will have a non-zero relativistic kinetic energy and its total energy \(E\) is given by the equation above. In the inertial frame in which such an object is at rest, however, the value of \(\gamma(v)\) becomes 1 and the total energy is equal to the rest energy \(E_o\), one might say, precisely because in that inertial frame the object’s relativistic kinetic energy is zero.
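The relationship among total energy, rest energy, and kinetic energy can be illustrated with a few lines of code. The sketch assumes the standard explicit form of the Lorentz factor, \(\gamma(v) = 1/\sqrt{1 - v^2/c^2}\), which is not stated above. The function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v):
    """Standard Lorentz factor for speed v (an assumed explicit form)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def total_energy(m, v):
    """Total energy E = m * gamma(v) * c^2 of an object of rest-mass m."""
    return m * lorentz_factor(v) * C**2


def kinetic_energy(m, v):
    """Relativistic kinetic energy: total energy minus rest energy."""
    return total_energy(m, v) - m * C**2


# At rest the total energy is just the rest energy and the kinetic energy is zero.
print(total_energy(1.0, 0.0) == 1.0 * C**2, kinetic_energy(1.0, 0.0))
# In a frame where the object moves, the total energy exceeds the rest energy.
print(total_energy(1.0, 0.6 * C) / (1.0 * C**2))
```

In the rest frame the factor \(\gamma(v)\) is 1, so the total energy reduces to \(E_o = mc^2\).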

In the early and mid twentieth century, some physicists, such as Richard Feynman (1963, Vol. I, Sec. 16-4), defined a new quantity, which they labelled \(m\) and called the “relativistic mass,” as the product of the rest-mass of the object, which they labelled \(m_o\), and the Lorentz factor \(\gamma(v)\) like this:

\[ m = m_o \gamma(v). \]

Using these notational conventions, the iconic equation \(E = mc^2\) is the equation for the total energy \(E\) of a body as a function of its relativistic mass. This equation is not really of interest to us in discussing mass-energy equivalence. Furthermore, physicists today have generally given up this notational convention and deprecate the notion of “relativistic mass,” which, as Griffiths quips, “has gone the way of the two dollar bill” (1999, p. 510 fn. 8).

2. Philosophical interpretations of \(E_o = mc^2\)

There are three main philosophical questions concerning the interpretation of \(E_o = mc^2\) that have occupied philosophers and physicists:

(1) Are mass and energy the same property of physical systems and is that what is meant by asserting that they are “equivalent”?

(2) Is mass “converted” into energy in some physical interactions, and if so, what is the relevant sense of “conversion”?

(3) Does \(E_o = mc^2\) have any ontological consequences, and if so, what are they?

Interpretations of mass-energy equivalence can be organized according to how they answer questions (1) and (2) above (Flores 2005). As we will see (in Section 2.5), interpretations that answer question (3) affirmatively assume that the answer to question (1) is yes.

The only combination of answers to questions (1) and (2) that is inconsistent is to say that mass and energy are the same property of physical systems but that the conversion of mass into energy (or vice versa) is a genuine physical process. All three of the other combinations of answers to questions (1) and (2) are viable options and have been held, at one time or another, by physicists or philosophers, as indicated by the examples given in Table 2.

| | Conversion | No Conversion |
|---|---|---|
| Same Property | X | Torretti (1996), Eddington (1929) |
| Different Properties | Rindler (1977) (conversion is possible) | Bondi & Spurgin (1987) |

Table 2. Interpretations of mass-energy equivalence

In this section, we will describe the merits and demerits of each of the interpretations in Table 2. Beyond these interpretations, we will also discuss two other types of interpretations of mass-energy equivalence that do not fit neatly in Table 2. First, we will discuss Lange’s (2001, 2002) interpretation, which holds that only mass is a real property of physical systems and that we convert mass into energy when we shift the level at which we analyze physical systems. Second, we will discuss two interpretations (one by Einstein and Infeld, 1938 and the other by Zahar, 1989), which we will call ontological interpretations, that attempt to answer question (3) above affirmatively. However, we begin this section by addressing what has formerly been a fairly common misconception concerning mass-energy equivalence.

2.1 Misconceptions about \(E_o = mc^2\)

Although it is far less common today, one still sometimes hears of Einstein’s equation entailing that matter can be converted into energy. Strictly speaking, this constitutes an elementary category mistake. In relativistic physics, as in classical physics, mass and energy are both regarded as properties of physical systems or properties of the constituents of physical systems. If one wishes to talk about the physical stuff that is the bearer of such properties, then one typically talks about either “matter” or “fields.” The distinction between “matter” and “fields” in modern physics is itself rather subtle in no small part because of the equivalence of mass and energy. Philosophically, to think of fields as stuff is also controversial.

Nevertheless, we can assert that whatever sense of “conversion” seems compelling between mass and energy, it will have to be a “conversion” between mass and energy, and not between matter and energy. Finally, our observation obtains even in so-called “annihilation” reactions where the entire mass of the incoming particles seems to “disappear” (see, for example, Baierlein (2007, p. 323)). Of course, the older terminology of “matter” and “anti-matter” in the description of annihilation reactions does not really help our philosophical understanding of mass-energy equivalence and is perhaps partly to blame for some of the misconceptions surrounding \(E_o = mc^2\).

2.2 Same-property interpretations of \(E_o = mc^2\)

The first interpretation we will consider asserts that mass and energy are the same property of physical systems. Consequently, there is no sense in which one of the properties is ever physically converted into the other.

Philosophers such as Torretti (1996) and physicists such as Eddington (1929) have adopted the same-property interpretation. For example, Eddington states that “it seems very probable that mass and energy are two ways of measuring what is essentially the same thing, in the same sense that the parallax and distance of a star are two ways of expressing the same property of location” (1929, p. 146). According to Eddington, the distinction between mass and energy is artificial. We treat mass and energy as different properties of physical systems because we routinely measure them using different units. However, one can measure mass and energy using the same units by choosing units in which \(c = 1\), i.e., units in which distances are measured in units of time (e.g., light-years). Once we do this, Eddington claims, the distinction between mass and energy disappears.

Like Eddington, Torretti points out that mass and energy seem to be different properties because they are measured in different units. Speaking against Bunge’s (1967) view that their numerical equivalence does not entail that mass and energy “are the same thing,” Torretti explains:

If a kitchen refrigerator can extract mass from a given jug of water and transfer it by heat radiation or convection to the kitchen wall behind it, a trenchant metaphysical distinction between the mass and the energy of matter does seem far fetched (1996, p. 307, fn. 13).

For Torretti, the very existence of physical processes in which the emission of energy by an object is correlated with the decrease in the object’s mass in accordance with Einstein’s equation speaks strongly against the view that mass and energy are somehow distinct properties of physical systems. Torretti continues:

Of course, if lengths and times are measured with different, unrelated units, the ‘mass’… differs conceptually from the ‘energy.’ But this difference can be understood as a consequence of the convenient but deceitful act of the mind by which we abstract time and space from nature (1996, p. 307, fn. 13).

Thus, this footnote in his masterly Relativity and Geometry suggests that, for Torretti, we are misled into using different units for mass and energy merely because of how we perceive space and time. As we have seen, one can use the same units for mass and energy by adopting the convention Torretti himself uses of selecting units in which \(c = 1\) (pp. 88–89). However, it may be useful to remember that merely using the same units for spatial and temporal intervals does not entail that space and time are treated “on a par” in special relativity; they are not, as is evident from the signature of the Minkowski metric.

The main merit of Torretti’s view is that it takes very seriously the unification of space and time effected by special relativity and so famously announced in the opening lines of Minkowski (1908). It is also consistent with how mass and energy are treated in general relativity.

Interpretations such as Torretti’s and Eddington’s draw no further ontological conclusions from mass-energy equivalence. For example, neither Eddington nor Torretti makes any explicit claim concerning whether properties are best understood as universals, or whether one ought to be a realist about such properties. Finally, by saying that mass and energy are the same, these thinkers are suggesting that the denotation of the terms “mass” and “energy” is the same, though they recognize that the connotation of these terms is clearly different.

2.3 Different-properties interpretations of \(E_o = mc^2\)

As we have displayed in Table 2, interpretations of mass-energy equivalence that hold that mass and energy are different properties disagree concerning whether there is some physical process by which mass is converted into energy (or vice versa). Although superficially Lange’s (2001, 2002) interpretation seems to fall in this category, as he certainly treats mass and energy as different properties, he differs from others in this category because Lange explicitly argues that only mass is a real property of physical systems. Consequently, we will discuss Lange’s interpretation separately below (in Section 2.3.3).

We will begin with a discussion of Bondi and Spurgin’s interpretation (in Section 2.3.1). They hold that mass and energy are distinct properties and that there is no such thing as the conversion of mass and energy. We will then discuss Rindler’s interpretation (in Section 2.3.2). He maintains that mass and energy are different properties but that genuine conversions of mass and energy are at least permitted by mass-energy equivalence.

2.3.1 Bondi and Spurgin’s Different-Properties, No-Conversion Interpretation

Bondi and Spurgin’s (1987) interpretation of mass-energy equivalence has been influential especially among physicists concerned with physics education. In an article where they complained about how students often misunderstand Einstein’s famous equation, Bondi and Spurgin argued that Einstein’s equation does not entail that mass and energy are the same property any more than the equation \(m = \varrho V\) (where \(m\) is mass, \(V\) is volume, and \(\varrho\) is density) entails that mass and volume are the same. Just as in the case of mass and volume, Bondi and Spurgin argue, mass and energy have different dimensions. Ultimately, this reduces to a disagreement with philosophers such as Torretti who would argue that time, as a dimension, is no different than any one of the spatial dimensions. Note well that this is not an issue about the units we use for measuring mass (or energy).

Everyone agrees that according to special relativity one can measure spatial intervals in units of time. We can do this because of the postulate of special relativity that states that the speed of light has the same value in all inertial frames. If we perform what amounts to a substitution of variables and take our spatial dimensions to be \(x_n^* = x_n /c\), where \(c\) is the speed of light and \(n = 1, 2, 3\), then we select units in which \(c = 1\).

However, one can consistently use units in which \(c = 1\) and hold that there is nevertheless a fundamental distinction between space and time as dimensions. On such a view, which is the view that Bondi and Spurgin seem implicitly to be defending, while time is distinct from any given spatial dimension, the contingent fact that \(c\) has the same value in all inertial frames allows us to perform the relevant substitution of variables. However, it does not follow from this that we ought to treat time on a par with any spatial dimension, or that we ought to treat the spatio-temporal interval as more fundamental (in the way Torretti does).

In their influential article, Bondi and Spurgin then examine a variety of cases of purported conversions of mass and energy. In each case, they show that the purported conversion of mass and energy is best understood merely as a transformation of energy. In general, Bondi and Spurgin argue, whenever we encounter a purported conversion of mass and energy, we can always explain what is taking place by looking at the constituents of the physical system in the reaction and examining how energy is distributed among the constituents before and after the reaction takes place.

Explanations of purported “conversions” along the lines suggested by Bondi and Spurgin are now commonplace in the physics literature. These explanations have the merit of emphasizing that in many cases the mysteries of mass-energy equivalence do not concern one physical property magically being transfigured into another. However, the Bondi-Spurgin interpretation of mass-energy equivalence has the demerit that it fails to address reactions such as the electron-positron annihilation reaction. In such reactions, not only is the number of particles not conserved, but all of the particles involved are, by hypothesis, indivisible wholes. Thus, the energy liberated in such reactions cannot be explained as resulting from a transformation of the energy that was originally possessed by the constituents of the reacting particles. Of course, Bondi and Spurgin may simply be hoping that physics will reveal that particles such as electrons and positrons are not indivisible wholes after all. Indeed, they may even use annihilation reactions combined with their interpretation of mass-energy equivalence to argue that it cannot be the case that such particles are indivisible. Thus, we witness here explicitly just how closely related interpretations concerning mass-energy equivalence can be to views concerning the nature of matter.

The second demerit of the Bondi-Spurgin interpretation, which it shares with all other interpretations of mass-energy equivalence that hold that mass and energy are different properties, is that it remains silent about a central feature of physical systems it uses in explaining apparent conversions of mass and energy. In order to explain purported conversions along the lines suggested by Bondi and Spurgin, one must make the familiar assumption that the energy of the constituents of a system, be it potential energy or kinetic energy, “contributes” to the rest-mass of the system. Thus, for example, in the reaction involving the bombardment of \({^7}\mathrm{Li}\), Bondi and Spurgin must explain the rest-mass of the \({^7}\mathrm{Li}\) nucleus in the familiar way, as arising from both the sum of the rest-masses of the nucleons and the mass-equivalents of their energies. However, the Bondi-Spurgin interpretation offers no explanation concerning why the energy of the constituents of a physical system, be it potential energy or kinetic energy, manifests itself as part of the inertial mass of the system as a whole. Of course, one can always reply that even to ask for this type of explanation is to refuse to accept relativistic thinking fully: the potential and kinetic energies of the constituents contribute to the rest energy of the whole and, because of Einstein’s equation, contribute to the rest-mass of the whole.

As we shall see, Rindler’s interpretation of mass-energy equivalence attempts to address the first demerit of the Bondi-Spurgin interpretation, while Lange’s interpretation brings to the foreground that the energy of the constituents of a physical system “contributes” to that system’s inertial mass.

2.3.2 Rindler’s Different-Properties, Conversion Interpretation

Rindler’s interpretation of mass-energy equivalence is a slightly, though importantly, modified version of the Bondi-Spurgin interpretation. Rindler (for example, in 1977) agrees that there are many purported conversions that are best understood as mere transformations of one kind of energy into a different kind of energy.

However, for Rindler, there is nothing within special relativity itself that rules out the possibility that there exist fundamental, structureless particles (i.e., particles that are “atomic” in the philosophical sense of the term). If such particles exist, it is possible according to Einstein’s equation that some or all of the mass of such particles “disappears” and an equivalent amount of energy “appears” within the relevant physical system. Rindler thus seems to be suggesting that we should confine our interpretation of mass-energy equivalence to what we can deduce from special relativity. On this view, we should hold that Einstein’s equation at least allows for genuine conversions of mass into energy, in the sense that there may be cases where a certain amount of inertial mass “disappears” from within a physical system and a corresponding amount of energy “appears.” Furthermore, in such cases we cannot explain the reaction as merely involving a transformation of one kind of energy into another.

The merit of Rindler’s interpretation is that it confines the interpretation of Einstein’s equation to what we can validly infer from the postulates of special relativity. Unlike the interpretation proposed by Bondi and Spurgin, Rindler’s interpretation makes no assumptions about the constitution of matter but leaves that for future physics to determine.

2.3.3 Lange’s One-Property, No-Conversion Interpretation

Lange (2001, 2002) has suggested a distinctive interpretation of mass-energy equivalence. Lange begins his interpretation by arguing that rest-mass is the only real property of physical systems. This claim by itself suggests that there can be no such thing as a physical process by which mass is converted into energy, for as Lange asks, “in what sense can mass be converted into energy when mass and energy are not on a par in terms of their reality?” (2002, p. 227, emphasis in original). Lange then goes on to argue that a careful analysis of purported conversions of mass into energy reveals that there is no physical process by which mass is ever converted into energy. Instead, Lange argues, the apparent conversion of mass into energy (or vice versa) is an illusion that arises when we shift our level of analysis in examining a physical system.

Lange seems to use a familiar argument from the Lorentz invariance of certain physical quantities to their “reality.” For Lange, if a physical quantity is not Lorentz invariant, then it is not real in the sense that it does not represent “the objective facts, on which all inertial frames agree” (2002, p. 209). Thus Lange uses Lorentz invariance as a necessary condition for the reality of a physical quantity. However, in several other places, for example when Lange argues for the reality of the Minkowski interval (2002, p. 219) or when he argues for the reality of rest-mass (2002, p. 223), Lange implicitly uses Lorentz invariance as a sufficient condition for the reality of a physical quantity. Yet if Lange adopts Lorentz invariance as both a necessary and sufficient condition for the reality of a physical quantity, then he is committed to the view that rest energy is real for the very same reasons he is committed to the view that rest-mass is real. Thus, Lange’s original suggestion that there can be no physical process of conversion between mass and energy because they have different ontological status seems challenged.

As it happens, Lange’s overall position is not seriously challenged by the ontological status of rest energy. Lange could easily grant that rest energy is a real property of physical systems and still argue (i) that there is no such thing as a physical process of conversion between mass and energy and (ii) that purported conversions result from shifting levels of analysis when we examine a physical system. It is his observations concerning (ii) that force us to face again the question of why the energy of the constituents of a physical system manifests itself as the mass of the system, though admittedly the question itself may simply reveal a failure fully to appreciate a relativistic description of composite systems.

One of the main examples that Lange uses to present his interpretation of mass-energy equivalence is the heating of an ideal gas, which we have already considered above (see Section 1.3). He also considers examples involving reactions among sub-atomic particles that, for our purposes, are very similar in the relevant respects to the example we have discussed concerning the bombardment and subsequent decomposition of a \({^7}\Li\) nucleus. In both cases, Lange essentially adopts the minimal interpretation we discussed in Section 1.3. In the case of the ideal gas, as we have seen, when the gas sample is heated and its inertial mass concurrently increases, this increase in rest-mass is not a result of the gas somehow being suddenly (or gradually) composed of molecules that are themselves more massive. The rest-mass of any individual molecule does not change. It is also not a result of the gas suddenly (or gradually) containing more molecules. Instead, the increased kinetic energy of the molecules of the gas constitutes an increase to the rest energy of the gas sample which, through Einstein’s equation, manifests as an increase in the gas sample’s inertial mass. Lange summarizes this feature of the increase in the gas sample’s inertial mass by saying:

… we have just seen that this “conversion” of energy into mass is not a real physical process at all. We “converted” energy into mass simply by changing our perspective on the gas: shifting from initially treating it as many bodies to treating it as a single body (2002, p. 236, emphasis in original).

Unfortunately, Lange’s characterization threatens to leave readers with the impression that if “we” had not shifted our perspective in the analysis of the gas, no change to the inertial mass of the gas sample would have ensued. Of course, it is unlikely that Lange means this. Lange would likely agree that even if no human beings are around to analyze a gas sample, the gas sample will respond in any physical interaction differently as a whole after it has absorbed some energy precisely because its inertial mass will have increased.
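The point that the increase in inertial mass does not depend on anyone's perspective can be illustrated numerically. The following is a minimal sketch, assuming a toy “gas” of six identical, non-interacting molecules and units in which \(c = 1\); the function names and the speeds 0.1 and 0.5 are illustrative assumptions and are not drawn from Lange. It computes the rest-mass of the sample as a whole from the total energy and total momentum of its molecules.

```python
import math

def four_momentum(m, v):
    """Four-momentum (E, px, py, pz) of a particle with rest-mass m and velocity v (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v))
    return (gamma * m,) + tuple(gamma * m * vi for vi in v)

def system_rest_mass(particles):
    """Rest-mass of the composite system: sqrt(E_total^2 - |p_total|^2), with c = 1."""
    momenta = [four_momentum(m, v) for m, v in particles]
    e_tot = sum(p[0] for p in momenta)
    p_tot = [sum(p[i] for p in momenta) for i in (1, 2, 3)]
    return math.sqrt(e_tot**2 - sum(pi * pi for pi in p_tot))

def toy_gas(speed, m=1.0):
    """Hypothetical sample: six identical molecules moving along +/-x, +/-y, +/-z."""
    directions = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    return [(m, tuple(speed * d for d in direction)) for direction in directions]

cold = toy_gas(0.1)  # "cool" sample: slow molecules
hot = toy_gas(0.5)   # "heated" sample: same molecules, moving faster

# The rest-mass of each molecule is the same in both samples.
sum_of_rest_masses = sum(m for m, _ in hot)

print("sum of molecular rest-masses:", sum_of_rest_masses)
print("rest-mass of cold sample:    ", system_rest_mass(cold))
print("rest-mass of hot sample:     ", system_rest_mass(hot))
```

In this sketch the sum of the molecular rest-masses is identical for the two samples, while the rest-mass of the sample as a whole is larger for the faster-moving molecules. The difference is fixed by the molecules' kinetic energy, whichever level of description one adopts.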

The merits of Lange’s view concerning the “conversion” of mass and energy are essentially the same as the merits of both the Bondi-Spurgin interpretation and Rindler’s interpretation. All these interpretations agree that there are important cases where we have now learned enough to assert confidently that purported “conversions” of mass and energy are merely cases where energy of one kind is transformed into energy of another kind. Aside from the comparatively minor issue concerning the “reality” of rest energy, the main demerit of Lange’s view is that it might potentially mislead unsuspecting readers.

2.4 Interpretations of \(E_o = mc^2\) and hypotheses concerning the nature of matter

The relationship between mass-energy equivalence and hypotheses concerning the nature of matter is twofold. First, as we have suggested, some of the interpretations of mass-energy equivalence seem to assume implicitly certain features of matter. Second, some philosophers and physicists, notably Einstein and Infeld (1938) and Zahar (1989), have argued that mass-energy equivalence has consequences concerning the nature of matter. In this section, we will discuss the first of these two relationships between \(E_o = mc^2\) and hypotheses concerning the nature of matter. We discuss the second relationship in the next section (Section 2.5).

To explain how some interpretations of mass-energy equivalence rest on assumptions concerning the nature of matter, we need first to recognize, as several authors have pointed out, e.g., Rindler (1977), Stachel and Torretti (1982), and Mermin and Feigenbaum (1990), that the relation one actually derives from special relativity is:

\[E_o = (m - q)c^2 + K\]

where \(K\) is merely an additive factor that fixes the zero-point of energy and is conventionally set to zero and \(q\) is also routinely set to zero. However, unlike the convention to set \(K\) to zero, setting \(q = 0\) involves a hypothesis concerning the nature of matter, because it rules out the possibility that there exists matter that has mass but which is such that some of its mass can never be “converted” into energy.

The same-property interpretation of mass-energy equivalence rests squarely on the assumption that \(q = 0\). Mass and energy cannot be the same property if there exists matter that has mass some of which cannot ever, under any conditions, be “converted” into energy. However, one could argue that although the same-property interpretation makes this assumption, it is not an unjustified assumption. Currently, physicists do not have any evidence that there exists matter for which \(q\) is not equal to zero. Nevertheless, it seems important, from a philosophical point of view, to recognize that the same-property interpretation depends not only on what one can derive from the postulates of special relativity, but also on evidence from “outside” this theory.

Interpretations of \(E_o = mc^2\) that hold that mass and energy are distinct properties of physical systems need not, of course, assume that \(q\) is different from zero. Such interpretations can simply leave the value of \(q\) to be determined empirically, for as we have seen such interpretations argue for treating mass and energy as distinct properties on different grounds. Nevertheless, the Bondi-Spurgin interpretation does seem to adopt implicitly a hypothesis concerning the nature of matter.

According to Bondi and Spurgin, all purported conversions of mass and energy are cases where one type of energy is transformed into another kind of energy. This in turn assumes that we can, in all cases, understand a reaction by examining the constituents of physical systems. If we focus on reactions involving sub-atomic particles, for example, Bondi and Spurgin seem to assume that we can always explain such reactions by examining the internal structure of sub-atomic particles. However, if we ever find good evidence to support the view that some particles have no internal structure, as it now seems to be the case with electrons for example, then we either have to give up the Bondi-Spurgin interpretation or use the interpretation itself to argue that such seemingly structureless particles actually do contain an internal structure. Thus, it seems that the Bondi-Spurgin interpretation assumes something like the infinite divisibility of matter, which is clearly a hypothesis that lies “outside” special relativity.

2.5 Ontological interpretations of \(E_o = mc^2\)

Einstein and Infeld (1938) and Zahar (1989) have both argued that \(E_o = mc^2\) has ontological consequences. Both of the Einstein-Infeld and Zahar interpretations begin by adopting the same-property interpretation of \(E_o = mc^2\). Thus, according to both interpretations, mass and energy are the same properties of physical systems. Furthermore, both the Einstein-Infeld and Zahar interpretations use a rudimentary distinction between “matter” and “fields.” According to this somewhat dated distinction, classical physics includes two fundamental substances: matter, by which one means ponderable material stuff, and fields, by which one means physical fields such as the electromagnetic field. For both Einstein and Infeld and Zahar, matter and fields in classical physics are distinguished by the properties they bear. Matter has both mass and energy, whereas fields only have energy. However, since the equivalence of mass and energy entails that mass and energy are really the same physical property after all, say Einstein and Infeld and Zahar, one can no longer distinguish between matter and fields, as both now have both mass and energy.

Although both Einstein and Infeld and Zahar use the same basic argument, they reach slightly different conclusions. Zahar argues that mass-energy equivalence entails that the fundamental stuff of physics is a sort of “I-know-not-what” that can manifest itself as either matter or field. Einstein and Infeld, on the other hand, in places seem to argue that we can infer that the fundamental stuff of physics is fields. In other places, however, Einstein and Infeld seem a bit more cautious and suggest only that one can construct a physics with only fields in its ontology.

The main demerit of both ontological interpretations of mass-energy equivalence is that they rest upon the same-property interpretation of \(E_o = mc^2\). As we have discussed above (see Section 2.4), while one can adopt the same-property interpretation, to do so one must make additional assumptions concerning the nature of matter. Furthermore, the ontological interpretations rest on what nowadays seems like a rather crude distinction between “matter” and “fields.” To be sure, mass-energy equivalence has figured prominently in physicists’ conception of matter in no small part because it does open up the door to a description of what we ordinarily regard as ponderable matter in terms of fields, since the energy of the field at one level can manifest itself as mass one level up. However, the inference from mass-energy equivalence to the fundamental ontology of modern physics seems far more subtle than either Einstein and Infeld or Zahar suggest.

3. History of Derivations of Mass-Energy Equivalence

Einstein first derived mass-energy equivalence from the principles of special relativity in a small article titled “Does the Inertia of a Body Depend Upon Its Energy Content?” (1905b). This derivation, along with others that followed soon after (e.g., Planck (1906), Von Laue (1911)), uses Maxwell’s theory of electromagnetism. (See Section 3.1.) However, as Einstein later observed (1935), mass-energy equivalence is a result that should be independent of any theory that describes a specific physical interaction. This is the main reason that led physicists to search for “purely dynamical” derivations, i.e., derivations that invoke only mechanical concepts such as energy and momentum, and the conservation principles that govern them. (See Section 3.2.)

3.1 Derivations of \(E_o = mc^2\) that Use Maxwell’s Theory

Einstein’s original derivation of mass-energy equivalence is the best known in this group. Einstein begins with the following thought-experiment: a body at rest (in some inertial frame) emits two pulses of light of equal energy in opposite directions. Einstein then analyzes this “act of emission” from another inertial frame, which is in a state of uniform motion relative to the first. In this analysis, Einstein uses Maxwell’s theory of electromagnetism to calculate the physical properties of the light pulses (such as their intensity) in the second inertial frame. By comparing the two descriptions of the “act of emission”, Einstein arrives at his celebrated result: “the mass of a body is a measure of its energy-content; if the energy changes by \(L\), the mass changes in the same sense by \(L/9 \times 10^{20}\), the energy being measured in ergs, and the mass in grammes” (1905b, p. 71). A similar derivation using the same thought experiment but appealing to the Doppler effect was given by Langevin (1913) (see the discussion of the inertia of energy in Fox (1965, p. 8)).

Some philosophers and historians of science claim that Einstein’s first derivation is fallacious. For example, in The Concept of Mass, Jammer says: “It is a curious incident in the history of scientific thought that Einstein’s own derivation of the formula \(E = mc^2\), as published in his article in Annalen der Physik, was basically fallacious. . . the result of a petitio principii, the conclusion begging the question” (Jammer, 1961, p. 177). According to Jammer, Einstein implicitly assumes what he is trying to prove, viz., that if a body emits an amount of energy \(L\), its inertial mass will decrease by an amount \(\Delta m = L/c^2\). Jammer also accuses Einstein of assuming the expression for the relativistic kinetic energy of a body. If Einstein made these assumptions, he would be guilty of begging the question. However, Stachel and Torretti (1982) have shown convincingly that Einstein’s (1905b) argument is sound. They note that Einstein indeed derives the expression for the kinetic energy of an “electron” (i.e., a structureless particle with a net charge) in his earlier (1905a) paper. However, Einstein nowhere uses this expression in the (1905b) derivation of mass-energy equivalence. Stachel and Torretti also show that Einstein’s critics overlook two key moves that are sufficient to make Einstein’s derivation sound, since one need not assume that \(\Delta m = L/c^2\).

Einstein’s further conclusion that “the mass of a body is a measure of its energy content” (1905b, p. 71) does not, strictly speaking, follow from his argument. As Torretti (1996) and other philosophers and physicists have observed, Einstein’s (1905b) argument allows for the possibility that once a body’s energy store has been entirely used up (and subtracted from the mass using the mass-energy equivalence relation) the remainder is not zero. In other words, it is only a hypothesis in Einstein’s (1905b) argument, and indeed in all derivations of \(E_o = mc^2\) in special relativity, that no “exotic matter” exists that is not convertible into energy (see Ehlers, Rindler, and Penrose (1965) for a discussion of this point). However, particle-antiparticle annihilation experiments in atomic physics, which were first observed decades after 1905, strongly support “Einstein’s dauntless extrapolation” (Torretti, 1996, p. 112).

In general, derivations in this group use the same style of reasoning. One typically begins by considering an object that either absorbs or emits electromagnetic radiation (typically light) of total energy \(E_o\) in equal and opposite directions. Because light carries both energy and momentum, one then uses the conservation principles for those quantities, together with the standard heuristic in relativity of considering the same physical process from two different inertial frames that are in a state of relative motion, to show that in order for the conservation principles to be satisfied, the mass (i.e., rest-mass) of the emitting or absorbing object must increase or decrease by an amount \(E_o /c^2\). For a more detailed description of a simplified derivation in this group, see Section 1.5.

One of the few exceptions to this approach among derivations that use Maxwell’s theory is Einstein’s 1906 derivation (Einstein 1906). In this derivation, Einstein considers a freely-floating box. A burst of electromagnetic radiation of energy \(E_o\) is emitted inside the box from one wall toward a parallel wall. Einstein shows that the principle of mechanics that says that the motion of the center of mass of a body cannot change merely because of changes inside the body would be violated if one did not attribute an inertial mass \(E_o /c^2\) to the burst of electromagnetic radiation (see Taylor and Wheeler 1992, p. 254 for a detailed discussion of this example).
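The reasoning can be sketched to first order as follows. This is a simplified reconstruction rather than Einstein's own presentation: the symbols \(M\) (mass of the box), \(\ell\) (distance between the walls), \(v\) (recoil speed of the box), \(\Delta t\) (transit time of the burst), \(\Delta x\) (displacement of the box) and \(m\) (inertial mass attributed to the burst) are introduced here for illustration, and we assume \(E_o \ll Mc^2\) so that the recoil is slow. Maxwell's theory assigns the burst a momentum \(E_o/c\), so on emission the box recoils and drifts until the burst is absorbed at the opposite wall:

\[\begin{align} v &\approx \frac{E_o}{Mc}, \qquad \Delta t \approx \frac{\ell}{c} \\ \Delta x &\approx v\,\Delta t = \frac{E_o\,\ell}{Mc^2} \\ m\,\ell &= M\,\Delta x \quad\Longrightarrow\quad m = \frac{E_o}{c^2} \end{align}\]

The last line expresses the requirement that the center of mass of the whole system stays where it was: the displacement of the box must be compensated by the transfer of an inertial mass \(m\) across the distance \(\ell\).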

3.2 Purely Dynamical Derivations of \(E_o = mc^2\)

Purely dynamical derivations of \(E_o = mc^2\) typically proceed by analyzing an inelastic collision from the point of view of two inertial frames in a state of relative motion (the centre-of-mass frame, and an inertial frame moving with a relative velocity \(v\)). One of the first papers to appear following this approach is Perrin’s (1932). According to Rindler and Penrose (1965), Perrin’s derivation was based largely on Langevin’s “elegant” lectures, which were delivered at the Collège de France in Paris around 1922. Einstein himself gave a purely dynamical derivation (Einstein, 1935), though he nowhere mentions either Langevin or Perrin. The most comprehensive derivation of this sort was given by Ehlers, Rindler and Penrose (1965). More recently, a purely dynamical version of Einstein’s original (1905b) thought experiment, where the particles that are emitted are not photons, has been given by Mermin and Feigenbaum (1990) and Mermin (2005).

Derivations in this group are distinctive because they demonstrate that mass-energy equivalence is a consequence of the changes to the structure of spacetime brought about by special relativity. The relationship between mass and energy is independent of Maxwell’s theory or any other theory that describes a specific physical interaction. We can get a glimpse of this by noting that to derive \(E_o = mc^2\) by analyzing a collision, one must first define the four-momentum \(\mathbf{p}\), the “space-part” of which is relativistic momentum \(\mathbf{p}_{\rel}\), and relativistic kinetic energy \(T_{\rel}\), since one cannot use the old Newtonian notions of momentum and kinetic energy.

In Einstein’s own purely dynamical derivation (1935), more than half of the paper is devoted to finding the mathematical expressions that define \(\mathbf{p}\) and \(T_{\rel}\). This much work is required to arrive at these expressions for two reasons. First, the changes to the structure of spacetime must be incorporated into the definitions of the relativistic quantities. Second, \(\mathbf{p}\) and \(T_{\rel}\) must be defined so that they reduce to their Newtonian counterparts in the appropriate limit. This last requirement ensures, in effect, that special relativity will inherit the empirical success of Newtonian physics. Once the definitions of \(\mathbf{p}\) and \(T_{\rel}\) are obtained, one derives mass-energy equivalence in a straight-forward way by analyzing a collision. (For a more detailed discussion of Einstein’s (1935), see Fernflores, 2018.)
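A minimal numerical sketch of the kind of collision analysis involved, assuming one spatial dimension, units in which \(c = 1\), and the standard relativistic expressions for momentum and energy. The values of `m`, `u` and `w` and all function names are illustrative assumptions, not taken from Einstein's paper. Two equal bodies collide head-on in the centre-of-mass frame and stick together. The sketch fixes the rest-mass `M` of the product by energy conservation in that frame and then checks that momentum is conserved in a second inertial frame only for this value of `M`, not for the Newtonian value `2 * m`.

```python
import math

def gamma(v):
    """Lorentz factor for speed v (c = 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def add_velocities(v, w):
    """Relativistic composition of collinear velocities v and w (c = 1)."""
    return (v + w) / (1.0 + v * w)

def momentum(m, v):
    """Relativistic momentum of a body with rest-mass m and velocity v."""
    return gamma(v) * m * v

m = 1.0  # rest-mass of each incoming body (illustrative)
u = 0.6  # speed of each body in the centre-of-mass frame (illustrative)
w = 0.3  # velocity of the centre-of-mass frame as seen from the second frame

# Energy conservation in the centre-of-mass frame, where the product is at rest.
M = 2.0 * gamma(u) * m
kinetic_energy_lost = 2.0 * (gamma(u) - 1.0) * m

# The same collision described in the second frame.
u1 = add_velocities(u, w)
u2 = add_velocities(-u, w)
p_before = momentum(m, u1) + momentum(m, u2)
p_after = momentum(M, w)                      # product carries rest-mass M
p_after_newtonian = momentum(2.0 * m, w)      # product carries rest-mass 2m

print("excess rest-mass M - 2m:", M - 2.0 * m)
print("kinetic energy lost:    ", kinetic_energy_lost)
print("momentum before / after:", p_before, p_after)
print("after, assuming M = 2m: ", p_after_newtonian)
```

In this sketch the excess rest-mass of the product equals the kinetic energy that disappears in the collision (in units with \(c = 1\)), and momentum balances in the second frame only when that excess is included.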

At a very general level, purely dynamical derivations of Einstein’s equation and derivations that appeal to Maxwell’s theory really follow the same approach. In both styles of derivations, although it may not seem like it at first glance, we are dealing with one of the most basic dynamical interactions: a collision. So, for example, we can construe the physical configuration of Einstein’s original 1905 derivation (Einstein 1905b) as a collision in which the total number of objects is not conserved. This is even easier to do if one adopts a “particle” description of light. In both the purely dynamical derivations and the derivations that appeal to an interaction with electromagnetic radiation, one then examines the collision in question and shows that in order for dynamical principles to be satisfied, the relationship among the masses and energies of the objects involved in the collisions must satisfy Einstein’s equation.

The main difference between the two approaches to deriving Einstein’s equation is that in derivations that consider a collision with light, one must use the dynamical properties of light, which are not themselves described by special relativity. For example, as we have seen, in Einstein’s 1946 derivation (see Section 1.5), we must appeal to the expression for the momentum of a burst of light.

4. Experimental Verification of Mass-Energy Equivalence

Cockcroft and Walton (1932) are routinely credited with the first experimental verification of mass-energy equivalence. Cockcroft and Walton examined a variety of reactions where different atomic nuclei are bombarded by protons. They focussed their attention primarily on the bombardment of \({^7}\Li\) by protons (see Section 1.4).

In their famous paper, Cockcroft and Walton noted that the sum of the rest-masses of the proton and the Lithium nucleus (i.e., the reactants) was \(1.0072 + 7.0104 = 8.0176\) amu. However, the sum of the rest-masses of the two \(\alpha\)-particles (i.e., the products) was 8.0022 amu. Thus, it seems as if an amount of mass of 0.0154 amu has “disappeared” from the reactants. Cockcroft and Walton also observed that the total kinetic energy (in the reference frame in which the \({^7}\Li\) nucleus is at rest) for the reactants was 125 keV. However, the total kinetic energy of the \(\alpha\)-particles was observed to be 17.2 MeV. Thus, it seems as if an amount of energy of approximately 17 MeV has “appeared” in the reaction.

Implicitly referring to the equivalence of mass and energy, without explicitly mentioning either the result or Einstein by name, Cockcroft and Walton then simply assert that a mass of 0.0154 amu “is equivalent to an energy liberation of \((14.3 \pm 2.7) \times 10^6\) Volts” (p. 236). They then implicitly suggest that this inferred value for the kinetic energy of the two resulting \(\alpha\)-particles is consistent with the observed value for the kinetic energy of the \(\alpha\)-particles. Cockcroft and Walton conclude that “the observed energies of the \(\alpha\)-particles are consistent with our hypothesis” (pp. 236–237). The hypothesis they set out to test, however, is not mass-energy equivalence, but rather that when a \({^7}\Li\) nucleus is bombarded with protons, the result is two \(\alpha\)-particles.
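The arithmetic behind the figure of roughly 14.3 MeV can be reproduced in a few lines. This is only a sketch: the masses and energies are those quoted above, but the conversion factor of about 931.494 MeV per atomic mass unit is a modern value assumed here for illustration, and it differs slightly from the mass scale in use in 1932. The variable names are likewise illustrative.

```python
# Rest-masses quoted by Cockcroft and Walton, in atomic mass units (amu).
mass_proton = 1.0072
mass_lithium7 = 7.0104
mass_two_alphas = 8.0022

# Assumed modern conversion factor: energy equivalent of 1 amu, in MeV.
MEV_PER_AMU = 931.494

# Mass that "disappears" in the reaction.
mass_deficit = (mass_proton + mass_lithium7) - mass_two_alphas

# Energy equivalent of that mass according to E_o = m c^2.
energy_from_mass_deficit = mass_deficit * MEV_PER_AMU

# Kinetic energies quoted in the text, in MeV.
kinetic_energy_in = 0.125   # reactants (125 keV)
kinetic_energy_out = 17.2   # the two alpha-particles
kinetic_energy_gain = kinetic_energy_out - kinetic_energy_in

print(f"mass deficit:             {mass_deficit:.4f} amu")
print(f"energy from mass deficit: {energy_from_mass_deficit:.1f} MeV")
print(f"observed kinetic gain:    {kinetic_energy_gain:.1f} MeV")
```

The computed energy equivalent agrees with the central value of 14.3 MeV that Cockcroft and Walton report, which can then be compared with the observed gain in kinetic energy in light of the quoted uncertainty of 2.7 MeV.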

Stuewer (1993) has suggested that Cockcroft and Walton use mass-energy equivalence to confirm their hypothesis about what happens when \({^7}\Li\) is bombarded by protons. Hence, it does not seem we ought to regard this experiment as a confirmation of \(E_o = mc^2\). However, if we take some of the other evidence that Cockcroft and Walton provide concerning the identification of the products in the bombardment reaction as sufficient to establish that the products are indeed \(\alpha\)-particles, then we can interpret this experiment as a confirmation of mass-energy equivalence, which is how this experiment is often reported in the physics literature.

More recently, Rainville et al. (2005) have published the results of what they call “A direct test of \(E = mc^2\).” Their experiments test mass-energy equivalence “directly” by comparing the difference in the rest-masses in a neutron capture reaction with the energy of the emitted \(\gamma\)-rays. Specifically, Rainville et al. examine two reactions, one involving neutron capture by Sulphur \((\S)\), the other involving neutron capture by Silicon \((\Si)\):

\[\begin{align} n + {^{32}}\S &\rightarrow {^{33}}\S + \gamma \\ n + {^{28}}\Si &\rightarrow {^{29}}\Si + \gamma \end{align}\]

In these reactions, when the nucleus of an atom (in this case either \({^{32}}\S\) or \({^{28}}\Si\)) captures the neutron, a new isotope is created in an excited state. In returning to its ground state, the isotope emits a \(\gamma\)-ray. According to Einstein’s equation, the difference in the rest-masses of the neutron plus nucleus, on the one hand, and the new isotope in its ground state on the other hand, multiplied by \(c^2\), should be equal to the energy of the emitted photon. Thus, Rainville et al. test \(\Delta E = \Delta mc^2\) by making very accurate measurements of the rest-mass difference and the frequency, and hence energy, of the emitted photon. Rainville et al. report that their measurements show that Einstein’s equation obtains to an accuracy of at least 0.00004%.

5. Conclusion

In this entry, we have presented the physics of mass-energy equivalence as widely understood by both physicists and philosophers. We have also canvassed a variety of philosophical interpretations of mass-energy equivalence. Along the way, we have presented the merits and demerits of each interpretation. We have also presented a brief history of derivations of mass-energy equivalence to emphasize that the equivalence of mass and energy is a direct result of changes to the structure of spacetime imposed by special relativity. Finally, we have briefly and rather selectively discussed the empirical confirmation of mass-energy equivalence.