The Feynman Lectures on Physics Vol. I Ch. 17: Space-Time

17 Space-Time

17–1 The geometry of space-time

The theory of relativity shows us that the relationships of positions and times as measured in one coordinate system and another are not what we would have expected on the basis of our intuitive ideas. It is very important that we thoroughly understand the relations of space and time implied by the Lorentz transformation, and therefore we shall consider this matter more deeply in this chapter.

The Lorentz transformation between the positions and times $(x,y,z,t)$ as measured by an observer “standing still,” and the corresponding coordinates and time $(x',y',z',t')$ measured inside a “moving” space ship, moving with velocity $u$, is \begin{equation} \begin{aligned} x'&=\frac{x-ut}{\sqrt{1-u^2/c^2}},\\ y'&=y,\\[2ex] z'&=z,\\ t'&=\frac{t-ux/c^2}{\sqrt{1-u^2/c^2}}. \end{aligned} \label{Eq:I:17:1} \end{equation} Let us compare these equations with Eq. (11.5), which also relates measurements in two systems, one of which in this instance is rotated relative to the other: \begin{equation} \begin{alignedat}{4} &x'&&=x&&\cos\theta+y&&\sin\theta,\\ &y'&&=y&&\cos\theta-x&&\sin\theta,\\ &z'&&=z&&. \end{alignedat} \label{Eq:I:17:2} \end{equation} In this particular case, Moe and Joe are measuring with axes having an angle $\theta$ between the $x'$- and $x$-axes. In each case, we note that the “primed” quantities are “mixtures” of the “unprimed” ones: the new $x'$ is a mixture of $x$ and $y$, and the new $y'$ is also a mixture of $x$ and $y$.
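
To make the comparison concrete, here is a minimal sketch that evaluates both sets of formulas side by side. The function names and the sample numbers are our own choices for illustration; only the formulas come from Eqs. (17.1) and (17.2).

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz(x, t, u, c=C):
    """Eq. (17.1) for x and t; y and z are unchanged."""
    root = math.sqrt(1.0 - u**2 / c**2)
    return (x - u * t) / root, (t - u * x / c**2) / root

def rotate(x, y, theta):
    """Eq. (17.2) for x and y; z is unchanged."""
    return (x * math.cos(theta) + y * math.sin(theta),
            y * math.cos(theta) - x * math.sin(theta))

# Illustrative values only (not taken from the text).
x_rot, y_rot = rotate(3.0, 4.0, math.radians(30.0))
x_ship, t_ship = lorentz(1000.0, 1.0e-6, 0.6 * C)

print("rotated:", x_rot, y_rot)
print("boosted:", x_ship, t_ship)
```

In both cases each new coordinate is built from two of the old ones, which is the “mixture” referred to above.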

An analogy is useful: When we look at an object, there is an obvious thing we might call the “apparent width,” and another we might call the “depth.” But the two ideas, width and depth, are not fundamental properties of the object, because if we step aside and look at the same thing from a different angle, we get a different width and a different depth, and we may develop some formulas for computing the new ones from the old ones and the angles involved. Equations (17.2) are these formulas. One might say that a given depth is a kind of “mixture” of all depth and all width. If it were impossible ever to move, and we always saw a given object from the same position, then this whole business would be irrelevant—we would always see the “true” width and the “true” depth, and they would appear to have quite different qualities, because one appears as a subtended optical angle and the other involves some focusing of the eyes or even intuition; they would seem to be very different things and would never get mixed up. It is because we can walk around that we realize that depth and width are, somehow or other, just two different aspects of the same thing.

Can we not look at the Lorentz transformations in the same way? Here also we have a mixture—of positions and the time. A difference between a space measurement and a time measurement produces a new space measurement. In other words, in the space measurements of one man there is mixed in a little bit of the time, as seen by the other. Our analogy permits us to generate this idea: The “reality” of an object that we are looking at is somehow greater (speaking crudely and intuitively) than its “width” and its “depth” because they depend upon how we look at it; when we move to a new position, our brain immediately recalculates the width and the depth. But our brain does not immediately recalculate coordinates and time when we move at high speed, because we have had no effective experience of going nearly as fast as light to appreciate the fact that time and space are also of the same nature. It is as though we were always stuck in the position of having to look at just the width of something, not being able to move our heads appreciably one way or the other; if we could, we understand now, we would see some of the other man’s time—we would see “behind,” so to speak, a little bit.

Thus we shall try to think of objects in a new kind of world, of space and time mixed together, in the same sense that the objects in our ordinary space-world are real, and can be looked at from different directions. We shall then consider that objects occupying space and lasting for a certain length of time occupy a kind of a “blob” in a new kind of world, and that we look at this “blob” from different points of view when we are moving at different velocities. This new world, this geometrical entity in which the “blobs” exist by occupying position and taking up a certain amount of time, is called space-time. A given point $(x,y,z,t)$ in space-time is called an event. Imagine, for example, that we plot the $x$-positions horizontally, $y$ and $z$ in two other directions, both mutually at “right angles” and at “right angles” to the paper (!), and time, vertically. Now, how does a moving particle, say, look on such a diagram? If the particle is standing still, then it has a certain $x$, and as time goes on, it has the same $x$, the same $x$, the same $x$; so its “path” is a line that runs parallel to the $t$-axis (Fig. 17–1 a). On the other hand, if it drifts outward, then as the time goes on $x$ increases (Fig. 17–1 b). So a particle, for example, which starts to drift out and then slows up should have a motion something like that shown in Fig. 17–1(c). A particle, in other words, which is permanent and does not disintegrate is represented by a line in space-time. A particle which disintegrates would be represented by a forked line, because it would turn into two other things which would start from that point.

Fig. 17–1. Three particle paths in space-time: (a) a particle at rest at $x = x_0$; (b) a particle which starts at $x = x_0$ and moves with constant speed; (c) a particle which starts at high speed but slows down; (d) a light path.

What about light? Light travels at the speed $c$, and that would be represented by a line having a certain fixed slope (Fig. 17–1 d).

Now according to our new idea, if a given event occurs to a particle, say if it suddenly disintegrates at a certain space-time point into two new ones which follow some new tracks, and this interesting event occurred at a certain value of $x$ and a certain value of $t$, then we would expect that, if this makes any sense, we just have to take a new pair of axes and turn them, and that will give us the new $t$ and the new $x$ in our new system, as shown in Fig. 17–2(a). But this is wrong, because Eq. (17.1) is not exactly the same mathematical transformation as Eq. (17.2). Note, for example, the difference in sign between the two, and the fact that one is written in terms of $\cos\theta$ and $\sin\theta$, while the other is written with algebraic quantities. (Of course, it is not impossible that the algebraic quantities could be written as cosine and sine, but actually they cannot.) But still, the two expressions are very similar. As we shall see, it is not really possible to think of space-time as a real, ordinary geometry because of that difference in sign. In fact, although we shall not emphasize this point, it turns out that a man who is moving has to use a set of axes which are inclined equally to the light ray, using a special kind of projection parallel to the $x'$- and $t'$-axes, for his $x'$ and $t'$, as shown in Fig. 17–2(b). We shall not deal with the geometry, since it does not help much; it is easier to work with the equations.

Fig. 17–2. Two views of a disintegrating particle.

17–2 Space-time intervals

Although the geometry of space-time is not Euclidean in the ordinary sense, there is a geometry which is very similar, but peculiar in certain respects. If this idea of geometry is right, there ought to be some functions of coordinates and time which are independent of the coordinate system. For example, under ordinary rotations, if we take two points, one at the origin, for simplicity, and the other one somewhere else, both systems would have the same origin, and the distance from here to the other point is the same in both. That is one property that is independent of the particular way of measuring it. The square of the distance is $x^2 + y^2 + z^2$. Now what about space-time? It is not hard to demonstrate that we have here, also, something which stays the same, namely, the combination $c^2t^2 - x^2 - y^2 - z^2$ is the same before and after the transformation: \begin{equation} \label{Eq:I:17:3} c^2t'^2\!-x'^2\!-y'^2\!-z'^2\!=c^2t^2\!-x^2\!-y^2\!-z^2\!. \end{equation} This quantity is therefore something which, like the distance, is “real” in some sense; it is called the interval between the two space-time points, one of which is, in this case, at the origin. (Actually, of course, it is the interval squared, just as $x^2 + y^2 + z^2$ is the distance squared.) We give it a different name because it is in a different geometry, but the interesting thing is only that some signs are reversed and there is a $c$ in it.
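
The demonstration takes only a few lines; the following is a sketch of one way to do it. Since $y'=y$ and $z'=z$, we need only look at the $x$ and $t$ terms. Substituting Eq. (17.1), \begin{equation*} \begin{aligned} c^2t'^2-x'^2&=\frac{c^2(t-ux/c^2)^2-(x-ut)^2}{1-u^2/c^2}\\[1ex] &=\frac{c^2t^2-2uxt+u^2x^2/c^2-x^2+2uxt-u^2t^2}{1-u^2/c^2}\\[1ex] &=\frac{(c^2t^2-x^2)(1-u^2/c^2)}{1-u^2/c^2}=c^2t^2-x^2. \end{aligned} \end{equation*} The cross terms $2uxt$ cancel, and what is left factors so that the square root in the transformation drops out.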

Let us get rid of the $c$; that is an absurdity if we are going to have a wonderful space with $x$’s and $y$’s that can be interchanged. One of the confusions that could be caused by someone with no experience would be to measure widths, say, by the angle subtended at the eye, and measure depth in a different way, like the strain on the muscles needed to focus them, so that the depths would be measured in feet and the widths in meters. Then one would get an enormously complicated mess of equations in making transformations such as (17.2), and would not be able to see the clarity and simplicity of the thing for a very simple technical reason, that the same thing is being measured in two different units. Now in Eqs. (17.1) and (17.3) nature is telling us that time and space are equivalent; time becomes space; they should be measured in the same units. What distance is a “second”? It is easy to figure out from (17.3) what it is. It is $3\times10^8$ meters, the distance that light would go in one second. In other words, if we were to measure all distances and times in the same units, seconds, then our unit of distance would be $3\times10^8$ meters, and the equations would be simpler. Or another way that we could make the units equal is to measure time in meters. What is a meter of time? A meter of time is the time it takes for light to go one meter, and is therefore $1/3\times10^{-8}$ sec, or $3.3$ billionths of a second! We would like, in other words, to put all our equations in a system of units in which $c = 1$. If time and space are measured in the same units, as suggested, then the equations are obviously much simplified. They are \begin{gather} \begin{aligned} x'&=\frac{x-ut}{\sqrt{1-u^2}},\\ y'&=y,\\[1.5ex] z'&=z,\\ t'&=\frac{t-ux}{\sqrt{1-u^2}}. \end{aligned} \label{Eq:I:17:4}\\[2.25ex] t'^2\!-x'^2\!-y'^2\!-z'^2\!=t^2\!-x^2\!-y^2\!-z^2\!. 
\label{Eq:I:17:5} \end{gather} If we are ever unsure or “frightened” that after we have this system with $c=1$ we shall never be able to get our equations right again, the answer is quite the opposite. It is much easier to remember them without the $c$’s in them, and it is always easy to put the $c$’s back, by looking after the dimensions. For instance, in $\sqrt{1 - u^2}$, we know that we cannot subtract a velocity squared, which has units, from the pure number $1$, so we know that we must divide $u^2$ by $c^2$ in order to make that unitless, and that is the way it goes.
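
As a small sketch of this bookkeeping, the following converts between the two kinds of units and checks that the factor $1/\sqrt{1-u^2}$ gives the same number whether we work with $c=1$ or put the $c$’s back. The helper names are hypothetical, chosen only for this illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def seconds_to_meters(t_seconds):
    """Distance light travels in the given time."""
    return C * t_seconds

def meters_to_seconds(d_meters):
    """Time light needs to travel the given distance."""
    return d_meters / C

def stretch_factor(u, c=1.0):
    """1/sqrt(1 - u^2/c^2); with the default c = 1 this is 1/sqrt(1 - u^2)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

one_second = seconds_to_meters(1.0)      # a "second" of distance, in meters
one_meter = meters_to_seconds(1.0)       # a "meter" of time, in seconds
natural = stretch_factor(0.6)            # u given as a fraction of c
ordinary = stretch_factor(0.6 * C, c=C)  # the same u in m/s, c's put back

print(one_second, one_meter, natural, ordinary)
```

Dividing $u$ by $c$ before squaring is exactly the dimensional repair described above.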

The difference between space-time and ordinary space, and the character of an interval as related to the distance, is very interesting. According to formula (17.5), if we consider a point which in a given coordinate system had zero time, and only space, then the interval squared would be negative and we would have an imaginary interval, the square root of a negative number. Intervals can be either real or imaginary in the theory. The square of an interval may be either positive or negative, unlike distance, which has a positive square. When an interval is imaginary, we say that the two points have a space-like interval between them (instead of imaginary), because the interval is more like space than like time. On the other hand, if two objects are at the same place in a given coordinate system, but differ only in time, then the square of the time is positive and the distances are zero and the interval squared is positive; this is called a time-like interval. In our diagram of space-time, therefore, we would have a representation something like this: at $45^\circ$ there are two lines (actually, in four dimensions these will be “cones,” called light cones) and points on these lines are all at zero interval from the origin. Where light goes from a given point is always separated from it by a zero interval, as we see from Eq. (17.5). Incidentally, we have just proved that if light travels with speed $c$ in one system, it travels with speed $c$ in another, for if the interval is the same in both systems, i.e., zero in one and zero in the other, then to state that the propagation speed of light is invariant is the same as saying that the interval is zero.
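
The classification just described can be sketched in a few lines, in units with $c=1$ and with one of the two points at the origin. The function names are ours, and the label "zero" for points on the light cone is simply a convenient tag for this sketch.

```python
def interval_squared(t, x, y=0.0, z=0.0):
    """The invariant of Eq. (17.5): t^2 - x^2 - y^2 - z^2, with c = 1."""
    return t * t - x * x - y * y - z * z

def kind_of_interval(t, x, y=0.0, z=0.0):
    """Classify the interval between the origin and the point (t, x, y, z)."""
    s2 = interval_squared(t, x, y, z)
    if s2 > 0:
        return "time-like"
    if s2 < 0:
        return "space-like"
    return "zero"  # the point lies on the light cone

print(kind_of_interval(0.0, 1.0))            # zero time, only space
print(kind_of_interval(1.0, 0.0))            # same place, later time
print(kind_of_interval(5.0, 3.0, 0.0, 4.0))  # reached by light from the origin
```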

17–3 Past, present, and future

Fig. 17–3. The space-time region surrounding a point at the origin.

The space-time region surrounding a given space-time point can be separated into three regions, as shown in Fig. 17–3. In one region we have space-like intervals, and in two regions, time-like intervals. These three regions have an interesting physical relationship to that point: a physical object or a signal can get from a point in region $2$ to the event $O$ by moving along at a speed less than the speed of light. Therefore events in this region can affect the point $O$, can have an influence on it from the past. In fact, of course, an object at $P$ on the negative $t$-axis is precisely in the “past” with respect to $O$; it is the same space-point as $O$, only earlier. What happened there then, affects $O$ now. (Unfortunately, that is the way life is.) Another object at $Q$ can get to $O$ by moving with a certain speed less than $c$, so if this object were in a space ship and moving, it would be, again, the past of the same space-point. That is, in another coordinate system, the axis of time might go through both $O$ and $Q$. So all points of region $2$ are in the “past” of $O$, and anything that happens in this region can affect $O$. Therefore region $2$ is sometimes called the affective past, or affecting past; it is the locus of all events which can affect point $O$ in any way.

Region $3$, on the other hand, is a region which we can affect from $O$, we can “hit” things by shooting “bullets” out at speeds less than $c$. So this is the world whose future can be affected by us, and we may call that the affective future. Now the interesting thing about all the rest of space-time, i.e., region $1$, is that we can neither affect it now from $O$, nor can it affect us now at $O$, because nothing can go faster than the speed of light. Of course, what happens at $R$ can affect us later; that is, if the sun is exploding “right now,” it takes eight minutes before we know about it, and it cannot possibly affect us before then.
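
Here is a rough sketch of how one might sort events into the three regions, with $O$ at the origin and time measured in seconds, so that distances are in light-seconds. The function name, the rounded sun distance, and the treatment of events exactly on the light cone are assumptions made for this illustration.

```python
C = 299_792_458.0        # speed of light in m/s
SUN_DISTANCE = 1.496e11  # mean earth-sun distance in m (approximate)

def region(t, x):
    """Region of Fig. 17-3 containing the event (t, x), with c = 1.

    Events exactly on the light cone are lumped with regions 2 and 3;
    that is a choice made for this sketch only.
    """
    if t == 0 and x == 0:
        return 0  # the event O itself
    if abs(x) > abs(t):
        return 1  # space-like: can neither affect O nor be affected by it
    return 2 if t < 0 else 3  # past of O, or future of O

delay_seconds = SUN_DISTANCE / C  # the sun's distance in light-seconds

sun_now = region(0.0, delay_seconds)
sun_ten_minutes_ago = region(-600.0, delay_seconds)

print(delay_seconds / 60.0, sun_now, sun_ten_minutes_ago)
```

The delay comes out a little over eight minutes, so the sun “right now” falls in region $1$, while the sun of ten minutes ago is already in region $2$.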

What we mean by “right now” is a mysterious thing which we cannot define and we cannot affect, but it can affect us later, or we could have affected it if we had done something far enough in the past. When we look at the star Alpha Centauri, we see it as it was four years ago; we might wonder what it is like “now.” “Now” means at the same time from our special coordinate system. We can only see Alpha Centauri by the light that has come from our past, up to four years ago, but we do not know what it is doing “now”; it will take four years before what it is doing “now” can affect us. Alpha Centauri “now” is an idea or concept of our mind; it is not something that is really definable physically at the moment, because we have to wait to observe it; we cannot even define it right “now.” Furthermore, the “now” depends on the coordinate system. If, for example, Alpha Centauri were moving, an observer there would not agree with us because he would put his axes at an angle, and his “now” would be a different time. We have already talked about the fact that simultaneity is not a unique thing.
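
To see how much the “now” can shift, here is a small sketch using the time equation of (17.4) with $c=1$, so that years and light-years are the same unit. The distance of 4 light-years is the rounded figure used above, and the speed $u=0.6$ is an arbitrary choice for illustration.

```python
import math

def time_in_moving_system(t, x, u):
    """t' from Eq. (17.4), in units with c = 1."""
    return (t - u * x) / math.sqrt(1.0 - u * u)

distance_years = 4.0  # Alpha Centauri, rounded, in light-years
u = 0.6               # assumed speed of the other observer, toward the star

# The event "Alpha Centauri at our t = 0", as dated by the moving observer.
t_prime = time_in_moving_system(0.0, distance_years, u)

print(t_prime)
```

For these numbers $t'$ comes out at about $-3$ years: an event we call “now” is dated three years earlier in the moving system.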

There are fortune tellers, or people who tell us they can know the future, and there are many wonderful stories about the man who suddenly discovers that he has knowledge about the affective future. Well, there are lots of paradoxes produced by that because if we know something is going to happen, then we can make sure we will avoid it by doing the right thing at the right time, and so on. But actually there is no fortune teller who can even tell us the present! There is no one who can tell us what is really happening right now, at any reasonable distance, because that is unobservable. We might ask ourselves this question, which we leave to the student to try to answer: Would any paradox be produced if it were suddenly to become possible to know things that are in the space-like intervals of region $1$?

17–4 More about four-vectors

Let us now return to our consideration of the analogy of the Lorentz transformation and rotations of the space axes. We have learned the utility of collecting together other quantities which have the same transformation properties as the coordinates, to form what we call vectors, directed lines. In the case of ordinary rotations, there are many quantities that transform the same way as $x$, $y$, and $z$ under rotation: for example, the velocity has three components, an $x$, $y$, and $z$-component; when seen in a different coordinate system, none of the components is the same, instead they are all transformed to new values. But, somehow or other, the velocity “itself” has a greater reality than do any of its particular components, and we represent it by a directed line.

We therefore ask: Is it or is it not true that there are quantities which transform, or which are related, in a moving system and in a nonmoving system, in the same way as $x$, $y$, $z$, and $t$? From our experience with vectors, we know that three of the quantities, like $x$, $y$, $z$, would constitute the three components of an ordinary space-vector, but the fourth quantity would look like an ordinary scalar under space rotation, because it does not change so long as we do not go into a moving coordinate system. Is it possible, then, to associate with some of our known “three-vectors” a fourth object, that we could call the “time component,” in such a manner that the four objects together would “rotate” the same way as position and time in space-time? We shall now show that there is, indeed, at least one such thing (there are many of them, in fact): the three components of momentum, and the energy as the time component, transform together to make what we call a “four-vector.” In demonstrating this, since it is quite inconvenient to have to write $c$’s everywhere, we shall use the same trick concerning units of the energy, the mass, and the momentum, that we used in Eq. (17.4). Energy and mass, for example, differ only by a factor $c^2$ which is merely a question of units, so we can say energy is the mass. Instead of having to write the $c^2$, we put $E = m$, and then, of course, if there were any trouble we would put in the right amounts of $c$ so that the units would straighten out in the last equation, but not in the intermediate ones.

Thus our equations for energy and momentum are \begin{equation} \begin{alignedat}{2} &E&&=m=m_0/\sqrt{1-v^2},\\[1ex] &\FLPp&&=m\FLPv=m_0\FLPv/\sqrt{1-v^2}. \end{alignedat} \label{Eq:I:17:6} \end{equation} Also in these units, we have \begin{equation} \label{Eq:I:17:7} E^2-p^2=m_0^2. \end{equation} For example, if we measure energy in electron volts, what does a mass of $1$ electron volt mean? It means the mass whose rest energy is $1$ electron volt, that is, $m_0c^2$ is one electron volt. For example, the rest mass of an electron is $0.511\times10^6$ eV.
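
As a quick check of these units, the following sketch converts the electron’s rest mass from kilograms to electron volts and then verifies Eq. (17.7) for one sample velocity. The constants are rounded values that we supply, and $v=0.8$ is an arbitrary choice.

```python
import math

ELECTRON_MASS = 9.1093837e-31    # kg (rounded)
C = 299_792_458.0                # speed of light in m/s
JOULES_PER_EV = 1.602176634e-19  # joules in one electron volt

# Rest energy m0*c^2, expressed in electron volts.
rest_energy_ev = ELECTRON_MASS * C**2 / JOULES_PER_EV

def energy_and_momentum(m0, v):
    """Eq. (17.6) for motion along one axis, in units with c = 1."""
    root = math.sqrt(1.0 - v * v)
    return m0 / root, m0 * v / root

E, p = energy_and_momentum(rest_energy_ev, 0.8)
invariant = E * E - p * p  # should equal m0^2 by Eq. (17.7)

print(rest_energy_ev, invariant, rest_energy_ev**2)
```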

Now what would the momentum and energy look like in a new coordinate system? To find out, we shall have to transform Eq. (17.6), which we can do because we know how the velocity transforms. Suppose that, as we measure it, an object has a velocity $v$, but we look upon the same object from the point of view of a space ship which itself is moving with a velocity $u$, and in that system we use a prime to designate the corresponding thing. In order to simplify things at first, we shall take the case that the velocity $v$ is in the direction of $u$. (Later, we can do the more general case.) What is $v'$, the velocity as seen from the space ship? It is the composite velocity, the “difference” between $v$ and $u$. By the law which we worked out before, \begin{equation} \label{Eq:I:17:8} v'=\frac{v-u}{1-uv}. \end{equation} Now let us calculate the new energy $E'$, the energy as the fellow in the space ship would see it. He would use the same rest mass, of course, but he would use $v'$ for the velocity. What we have to do is square $v'$, subtract it from one, take the square root, and take the reciprocal: \begin{equation*} \begin{aligned} v'^2&=\frac{v^2-2uv+u^2}{1-2uv+u^2v^2},\\[1.5ex] 1-v'^2&=\frac{1-2uv+u^2v^2-v^2+2uv-u^2}{1-2uv+u^2v^2},\\[1ex] &=\frac{1-v^2-u^2+u^2v^2}{1-2uv+u^2v^2},\\[1.5ex] &=\frac{(1-v^2)(1-u^2)}{(1-uv)^2}. \end{aligned} \end{equation*} Therefore \begin{equation} \label{Eq:I:17:9} \frac{1}{\sqrt{1-v'^2}}=\frac{1-uv}{\sqrt{1-v^2}\sqrt{1-u^2}}. \end{equation}
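
If the algebra seems doubtful, Eq. (17.9) can be checked with numbers. The sketch below does this for one arbitrary pair of velocities, in units with $c=1$; the function names are ours.

```python
import math

def velocity_seen_from_ship(v, u):
    """Eq. (17.8): velocity v as seen from a ship moving with velocity u."""
    return (v - u) / (1.0 - u * v)

def reciprocal_root(v):
    """1/sqrt(1 - v^2)."""
    return 1.0 / math.sqrt(1.0 - v * v)

v, u = 0.8, 0.5  # arbitrary sample values, both less than 1
v_prime = velocity_seen_from_ship(v, u)

left = reciprocal_root(v_prime)                                   # left side of (17.9)
right = (1.0 - u * v) * reciprocal_root(v) * reciprocal_root(u)   # right side of (17.9)

print(v_prime, left, right)
```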

The energy $E'$ is then simply $m_0$ times the above expression. But we want to express the energy in terms of the unprimed energy and momentum, and we note that \begin{align*} E'&=\frac{m_0-m_0uv}{\sqrt{1-v^2}\sqrt{1-u^2}}\\[2ex] &=\frac{(m_0/\sqrt{1-v^2})-(m_0v/\sqrt{1-v^2})u}{\sqrt{1-u^2}}, \end{align*} or \begin{equation} \label{Eq:I:17:10} E'=\frac{E-up_x}{\sqrt{1-u^2}}, \end{equation} which we recognize as being exactly of the same form as \begin{equation*} t'=\frac{t-ux}{\sqrt{1-u^2}}. \end{equation*} Next we must find the new momentum $p_x'$. This is just the energy $E'$ times $v'$, and is also simply expressed in terms of $E$ and $p$: \begin{align*} p_x'=E'v'&=\frac{m_0(1-uv)}{\sqrt{1-v^2}\sqrt{1-u^2}}\cdot \frac{v-u}{(1-uv)}\\[2ex] &=\frac{m_0v-m_0u}{\sqrt{1-v^2}\sqrt{1-u^2}}. \end{align*} Thus \begin{equation} \label{Eq:I:17:11} p_x'=\frac{p_x-uE}{\sqrt{1-u^2}}, \end{equation} which we recognize as being of precisely the same form as \begin{equation*} x'=\frac{x-ut}{\sqrt{1-u^2}}. \end{equation*}

Thus the transformations for the new energy and momentum in terms of the old energy and momentum are exactly the same as the transformations for $t'$ in terms of $t$ and $x$, and $x'$ in terms of $x$ and $t$: all we have to do is, every time we see $t$ in (17.4) substitute $E$, and every time we see $x$ substitute $p_x$, and then the equations (17.4) will become the same as Eqs. (17.10) and (17.11). This would imply, if everything works right, an additional rule that $p_y' = p_y$ and that $p_z' = p_z$. To prove this would require our going back and studying the case of motion up and down. Actually, we did study the case of motion up and down in the last chapter. We analyzed a complicated collision and we noticed that, in fact, the transverse momentum is not changed when viewed from a moving system; so we have already verified that $p_y' = p_y$ and $p_z' = p_z$. The complete transformation, then, is \begin{equation} \begin{aligned} p_x'&=\frac{p_x-uE}{\sqrt{1-u^2}},\\ p_y'&=p_y,\\[1ex] p_z'&=p_z,\\ E'&=\frac{E-up_x}{\sqrt{1-u^2}}. \end{aligned} \label{Eq:I:17:12} \end{equation}
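
The agreement between the two routes can be sketched numerically: compute $E'$ and $p_x'$ once by transforming $E$ and $p_x$, and once directly from the transformed velocity. The names and sample values below are illustrative assumptions, with $c=1$ and motion along $x$ only.

```python
import math

def root(v):
    """sqrt(1 - v^2)."""
    return math.sqrt(1.0 - v * v)

def energy_momentum(m0, v):
    """Eq. (17.6) for motion along x: returns (E, p_x)."""
    return m0 / root(v), m0 * v / root(v)

def transform(E, px, u):
    """Eqs. (17.10) and (17.11): returns (E', p_x')."""
    return (E - u * px) / root(u), (px - u * E) / root(u)

m0, v, u = 1.0, 0.8, 0.5  # arbitrary sample values

# Route 1: transform the energy and momentum like t and x.
E, px = energy_momentum(m0, v)
E_new, px_new = transform(E, px, u)

# Route 2: transform the velocity by Eq. (17.8), then use Eq. (17.6) again.
v_prime = (v - u) / (1.0 - u * v)
E_direct, px_direct = energy_momentum(m0, v_prime)

print(E_new, E_direct)
print(px_new, px_direct)
```

Both routes give the same pair of numbers, and $E'^2 - p_x'^2$ is still $m_0^2$.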

In these transformations, therefore, we have discovered four quantities which transform like $x$, $y$, $z$, and $t$, and which we call the four-vector momentum. Since the momentum is a four-vector, it can be represented on a space-time diagram of a moving particle as an “arrow” tangent to the path, as shown in Fig. 17–4. This arrow has a time component equal to the energy, and its space components represent its three-vector momentum; this arrow is more “real” than either the energy or the momentum, because those just depend on how we look at the diagram.

Fig. 17–4. The four-vector momentum of a particle.

17–5 Four-vector algebra

The notation for four-vectors is different than it is for three-vectors. In the case of three-vectors, if we were to talk about the ordinary three-vector momentum we would write it $\FLPp$. If we wanted to be more specific, we could say it has three components which are, for the axes in question, $p_x$, $p_y$, and $p_z$, or we could simply refer to a general component as $p_i$, and say that $i$ could either be $x$, $y$, or $z$, and that these are the three components; that is, imagine that $i$ is any one of three directions, $x$, $y$, or $z$. The notation that we use for four-vectors is analogous to this: we write $p_\mu$ for the four-vector, and $\mu$ stands for the four possible directions $t$, $x$, $y$, or $z$.

We could, of course, use any notation we want; do not laugh at notations; invent them, they are powerful. In fact, mathematics is, to a large extent, invention of better notations. The whole idea of a four-vector, in fact, is an improvement in notation so that the transformations can be remembered easily. $A_\mu$, then, is a general four-vector, but for the special case of momentum, the $p_t$ is identified as the energy, $p_x$ is the momentum in the $x$-direction, $p_y$ is that in the $y$-direction, and $p_z$ is that in the $z$-direction. To add four-vectors, we add the corresponding components.

If there is an equation among four-vectors, then the equation is true for each component. For instance, if the law of conservation of three-vector momentum is to be true in particle collisions, i.e., if the sum of the momenta for a large number of interacting or colliding particles is to be a constant, that must mean that the sums of all momenta in the $x$-direction, in the $y$-direction, and in the $z$-direction, for all the particles, must each be constant. This law alone would be impossible in relativity because it is incomplete; it is like talking about only two of the components of a three-vector. It is incomplete because if we rotate the axes, we mix the various components, so we must include all three components in our law. Thus, in relativity, we must complete the law of conservation of momentum by extending it to include the time component. This is absolutely necessary to go with the other three, or there cannot be relativistic invariance. The conservation of energy is the fourth equation which goes with the conservation of momentum to make a valid four-vector relationship in the geometry of space and time. Thus the law of conservation of energy and momentum in four-dimensional notation is \begin{equation} \label{Eq:I:17:13} \sum_{\substack{\text{particles}\\\text{in}}}p_\mu= \sum_{\substack{\text{particles}\\\text{out}}}p_\mu \end{equation} or, in a slightly different notation \begin{equation} \label{Eq:I:17:14} \sum_ip_{i\mu}=\sum_jp_{j\mu}, \end{equation} where $i = 1$, $2$, … refers to the particles going into the collision, $j= 1$, $2$, … refers to the particles coming out of the collision, and $\mu = x$, $y$, $z$, or $t$. You say, “In which axes?” It makes no difference. The law is true for each component, using any axes.
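
As an illustration of “any axes,” the sketch below sets up a made-up disintegration, checks Eq. (17.13) component by component, and then checks it again as seen from a moving ship. The particular masses and speeds are hypothetical, and $c=1$ throughout.

```python
import math

def particle(m0, vx, vy=0.0, vz=0.0):
    """Four-momentum (E, px, py, pz) from Eq. (17.6), with c = 1."""
    root = math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz))
    return (m0 / root, m0 * vx / root, m0 * vy / root, m0 * vz / root)

def seen_from_ship(p, u):
    """Eq. (17.12) for a ship moving with velocity u along x."""
    E, px, py, pz = p
    root = math.sqrt(1.0 - u * u)
    return ((E - u * px) / root, (px - u * E) / root, py, pz)

def total(momenta):
    """Add four-vectors by adding corresponding components."""
    return tuple(sum(components) for components in zip(*momenta))

# Hypothetical disintegration: two pieces of rest mass 1 fly apart along y
# at v = 0.6. The parent, at rest, has rest mass equal to their total energy.
pieces = [particle(1.0, 0.0, 0.6), particle(1.0, 0.0, -0.6)]
parent = [particle(total(pieces)[0], 0.0)]

u = 0.5  # arbitrary ship velocity
before_ship = total([seen_from_ship(p, u) for p in parent])
after_ship = total([seen_from_ship(p, u) for p in pieces])

print(total(parent), total(pieces))
print(before_ship, after_ship)
```

Because the transformation acts in the same way on every four-vector in the sum, a balance that holds in one system holds in the other.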

In vector analysis we discussed one other thing, the dot product of two vectors. Let us now consider the corresponding thing in space-time. In ordinary rotation we discovered there was an unchanged quantity $x^2 + y^2 + z^2$. In four dimensions, we find that the corresponding quantity is $t^2 - x^2 - y^2 - z^2$ (Eq. 17.3). How can we write that? One way would be to write some kind of four-dimensional thing with a square dot between, like $A_\mu \boxdot B_\mu$; one of the notations which is actually used is \begin{equation} \label{Eq:I:17:15} \sideset{}{'}\sum_\mu A_\mu A_\mu=A_t^2-A_x^2-A_y^2-A_z^2. \end{equation} The prime on $\sum$ means that the first term, the “time” term, is positive, but the other three terms have minus signs. This quantity, then, will be the same in any coordinate system, and we may call it the square of the length of the four-vector. For instance, what is the square of the length of the four-vector momentum of a single particle? This will be equal to $p_t^2 - p_x^2 - p_y^2 - p_z^2$ or, in other words, $E^2 - p^2$, because we know that $p_t$ is $E$. What is $E^2 - p^2$? It must be something which is the same in every coordinate system. In particular, it must be the same for a coordinate system which is moving right along with the particle, in which the particle is standing still. If the particle is standing still, it would have no momentum. So in that coordinate system, it is purely its energy, which is the same as its rest mass. Thus $E^2 - p^2 = m_0^2$. So we see that the square of the length of this vector, the four-vector momentum, is equal to $m_0^2$.

From the square of a vector, we can go on to invent the “dot product,” or the product which is a scalar: if $a_\mu$ is one four-vector and $b_\mu$ is another four-vector, then the scalar product is \begin{equation} \label{Eq:I:17:16} \sideset{}{'}\sum a_\mu b_\mu=a_tb_t-a_xb_x-a_yb_y-a_zb_z. \end{equation} It is the same in all coordinate systems.
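
One way to see why, sketched here using only the fact that the square of the length of any four-vector is the same in all coordinate systems: the sum $a_\mu+b_\mu$ is itself a four-vector, and \begin{equation*} \sideset{}{'}\sum_\mu(a_\mu+b_\mu)(a_\mu+b_\mu)= \sideset{}{'}\sum_\mu a_\mu a_\mu+ 2\sideset{}{'}\sum_\mu a_\mu b_\mu+ \sideset{}{'}\sum_\mu b_\mu b_\mu. \end{equation*} The left side and the first and last terms on the right are squares of lengths, so they do not change from one system to another; the middle term, which is twice the scalar product, therefore cannot change either.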

Finally, we shall mention certain things whose rest mass $m_0$ is zero. A photon of light, for example. A photon is like a particle, in that it carries an energy and a momentum. The energy of a photon is a certain constant, called Planck’s constant, times the frequency of the photon: $E = h\nu$. Such a photon also carries a momentum, and the momentum of a photon (or of any other particle, in fact) is $h$ divided by the wavelength: $p = h/\lambda$. But, for a photon, there is a definite relationship between the frequency and the wavelength: $\nu= c/\lambda$. (The number of waves per second, times the wavelength of each, is the distance that the light goes in one second, which, of course, is $c$.) Thus we see immediately that the energy of a photon must be the momentum times $c$, or if $c = 1$, the energy and momentum are equal. That is to say, the rest mass is zero. Let us look at that again; that is quite curious. If it is a particle of zero rest mass, what happens when it stops? It never stops! It always goes at the speed $c$. The usual formula for energy is $m_0/\sqrt{1 - v^2}$. Now can we say that $m_0 = 0$ and $v = 1$, so the energy is $0$? We cannot say that it is zero; the photon really can (and does) have energy even though it has no rest mass, but this it possesses by perpetually going at the speed of light!

We also know that the momentum of any particle is equal to its total energy times its velocity: if $c = 1$, $p = vE$ or, in ordinary units, $p = vE/c^2$. For any particle moving at the speed of light, $p = E$ if $c = 1$. The formulas for the energy of a photon as seen from a moving system are, of course, given by Eq. (17.12), but for the momentum we must substitute the energy divided by $c$ (or by $1$ in this case). The different energies after transformation mean that there are different frequencies. This is called the Doppler effect, and one can calculate it easily from Eq. (17.12), using also $E = p$ and $E = h\nu$.
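
As a sketch of that calculation, take a photon travelling in the $x$-direction, so that $p_x = E$ (for the opposite direction, $p_x = -E$, the result is the same with the sign of $u$ reversed). The last of Eqs. (17.12) gives \begin{equation*} E'=\frac{E-uE}{\sqrt{1-u^2}}= E\,\frac{1-u}{\sqrt{(1-u)(1+u)}}= E\sqrt{\frac{1-u}{1+u}}, \end{equation*} and since $E = h\nu$ in both systems, \begin{equation*} \nu'=\nu\sqrt{\frac{1-u}{1+u}}, \end{equation*} or, putting the $c$’s back, $\nu'=\nu\sqrt{(1-u/c)/(1+u/c)}$. An observer moving in the same direction as the light ($u > 0$) sees a lower frequency.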

As Minkowski said, “Space of itself, and time of itself will sink into mere shadows, and only a kind of union between them shall survive.”