The Feynman Lectures on Physics Vol. II Ch. 2: Differential Calculus of Vector Fields

2 Differential Calculus of Vector Fields

Review: Chapter 11, Vol. I, Vectors

2–1 Understanding physics

The physicist needs a facility in looking at problems from several points of view. The exact analysis of real physical problems is usually quite complicated, and any particular physical situation may be too complicated to analyze directly by solving the differential equation. But one can still get a very good idea of the behavior of a system if one has some feel for the character of the solution in different circumstances. Ideas such as the field lines, capacitance, resistance, and inductance are, for such purposes, very useful. So we will spend much of our time analyzing them. In this way we will get a feel as to what should happen in different electromagnetic situations. On the other hand, none of the heuristic models, such as field lines, is really adequate and accurate for all situations. There is only one precise way of presenting the laws, and that is by means of differential equations. They have the advantage of being fundamental and, so far as we know, precise. If you have learned the differential equations you can always go back to them. There is nothing to unlearn.

It will take you some time to understand what should happen in different circumstances. You will have to solve the equations. Each time you solve the equations, you will learn something about the character of the solutions. To keep these solutions in mind, it will be useful also to study their meaning in terms of field lines and of other concepts. This is the way you will really “understand” the equations. That is the difference between mathematics and physics. Mathematicians, or people who have very mathematical minds, are often led astray when “studying” physics because they lose sight of the physics. They say: “Look, these differential equations—the Maxwell equations—are all there is to electrodynamics; it is admitted by the physicists that there is nothing which is not contained in the equations. The equations are complicated, but after all they are only mathematical equations and if I understand them mathematically inside out, I will understand the physics inside out.” Only it doesn’t work that way. Mathematicians who study physics with that point of view—and there have been many of them—usually make little contribution to physics and, in fact, little to mathematics. They fail because the actual physical situations in the real world are so complicated that it is necessary to have a much broader understanding of the equations.

What it means really to understand an equation—that is, in more than a strictly mathematical sense—was described by Dirac. He said: “I understand what an equation means if I have a way of figuring out the characteristics of its solution without actually solving it.” So if we have a way of knowing what should happen in given circumstances without actually solving the equations, then we “understand” the equations, as applied to these circumstances. A physical understanding is a completely unmathematical, imprecise, and inexact thing, but absolutely necessary for a physicist.

Ordinarily, a course like this is given by developing the physical ideas gradually, starting with simple situations and going on to more and more complicated ones. This requires that you continually forget things you previously learned: things that are true in certain situations, but not true in general. For example, the “law” that the electrical force varies inversely as the square of the distance is not always true. We prefer the opposite approach. We prefer to take the complete laws first, and then to step back and apply them to simple situations, developing the physical ideas as we go along. And that is what we are going to do.

Our approach is completely opposite to the historical approach in which one develops the subject in terms of the experiments by which the information was obtained. But the subject of physics has been developed over the past 200 years by some very ingenious people, and as we have only a limited time to acquire our knowledge, we cannot possibly cover everything they did. Unfortunately one of the things that we shall have a tendency to lose in these lectures is the historical, experimental development. It is hoped that in the laboratory some of this lack can be corrected. You can also fill in what we must leave out by reading the Encyclopedia Britannica, which has excellent historical articles on electricity and on other parts of physics. You will also find historical information in many textbooks on electricity and magnetism.

2–2 Scalar and vector fields: $T$ and $\FLPh$

We begin now with the abstract, mathematical view of the theory of electricity and magnetism. The ultimate idea is to explain the meaning of the laws given in Chapter 1. But to do this we must first explain a new and peculiar notation that we want to use. So let us forget electromagnetism for the moment and discuss the mathematics of vector fields. It is of very great importance, not only for electromagnetism, but for all kinds of physical circumstances. Just as ordinary differential and integral calculus is so important to all branches of physics, so also is the differential calculus of vectors. We turn to that subject.

Listed below are a few facts from the algebra of vectors. It is assumed that you already know them. $\begin{align} \label{Eq:II:2:1} &\FLPA\,\cdot\,\FLPB=\text{scalar}=A_xB_x+A_yB_y+A_zB_z\\[1ex] \label{Eq:II:2:2} &\FLPA\times\FLPB=\text{vector}\\[1pt] &\begin{alignedat}{5} &\qquad(\FLPA\times\FLPB)_z&&=A_x&&B_y&&-A_y&&B_x\\[.25ex] &\qquad(\FLPA\times\FLPB)_x&&=A_y&&B_z&&-A_z&&B_y\\[.25ex] &\qquad(\FLPA\times\FLPB)_y&&=A_z&&B_x&&-A_x&&B_z \end{alignedat}\notag\\[1ex] \label{Eq:II:2:3} &\FLPA\times\FLPA=\FLPzero\\[1ex] \label{Eq:II:2:4} &\FLPA\cdot(\FLPA\times\FLPB)=0\\[1ex] \label{Eq:II:2:5} &\FLPA\cdot(\FLPB\times\FLPC)=(\FLPA\times\FLPB)\cdot\FLPC\\[1ex] \label{Eq:II:2:6} &\FLPA\times(\FLPB\times\FLPC)=\FLPB(\FLPA\cdot\FLPC)-\FLPC(\FLPA\cdot\FLPB) \end{align}$
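The identities (2.3) through (2.6) are easy to spot-check numerically. What follows is a minimal sketch in plain Python; the helper names `dot`, `cross`, `scale`, and `sub`, and the sample vectors `A`, `B`, and `C`, are illustrative choices and not part of the text.

```python
# Spot-check of the vector identities (2.3)-(2.6) on arbitrary sample vectors.

def dot(a, b):
    """Scalar product, Eq. (2.1)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Vector product, with the components listed under Eq. (2.2)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(s, a):
    return (s * a[0], s * a[1], s * a[2])

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

A = (1.0, 2.0, 3.0)
B = (-2.0, 0.5, 4.0)
C = (3.0, -1.0, 2.0)

print(cross(A, A))                               # Eq. (2.3): the zero vector
print(dot(A, cross(A, B)))                       # Eq. (2.4): zero
print(dot(A, cross(B, C)), dot(cross(A, B), C))  # Eq. (2.5): two equal numbers

lhs = cross(A, cross(B, C))
rhs = sub(scale(dot(A, C), B), scale(dot(A, B), C))
print(lhs, rhs)                                  # Eq. (2.6): two equal vectors
```

A check on one set of numbers is of course not a proof, but it is a quick way to catch a sign error in a remembered formula.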

Also we will want to use the two following equalities from the calculus: $\begin{gather} \label{Eq:II:2:7} \Delta f(x,y,z)=\ddp{f}{x}\,\Delta x+\ddp{f}{y}\,\Delta y+\ddp{f}{z}\,\Delta z,\\[1ex] \label{Eq:II:2:8} \frac{\partial^2f}{\partial x\,\partial y}= \frac{\partial^2f}{\partial y\,\partial x}. \end{gather}$ The first equation, (2.7), is of course true only in the limit that $\Delta x$, $\Delta y$, and $\Delta z$ go toward zero.
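The sense in which Eq. (2.7) holds "in the limit" can be seen numerically. The sketch below compares the true change in a sample function with the linear estimate for smaller and smaller displacements. The function `f`, the base point, and the helper `partial` are assumptions made for the illustration.

```python
# Eq. (2.7): the linear estimate of the change in f approaches the true
# change as the displacement shrinks. f and the base point are arbitrary.

def f(x, y, z):
    return x * x * y + y * z ** 3

def partial(func, point, axis, step=1e-6):
    """Central-difference estimate of the partial derivative along one axis."""
    plus = list(point)
    minus = list(point)
    plus[axis] += step
    minus[axis] -= step
    return (func(*plus) - func(*minus)) / (2 * step)

point = (1.0, 2.0, 0.5)
derivs = [partial(f, point, axis) for axis in range(3)]

errors = []
for size in (1e-1, 1e-2, 1e-3):
    delta = (size, -2 * size, 0.5 * size)          # (dx, dy, dz)
    moved = [p + d for p, d in zip(point, delta)]
    true_change = f(*moved) - f(*point)
    linear_change = sum(g * d for g, d in zip(derivs, delta))
    errors.append(abs(true_change - linear_change))
    print(size, true_change, linear_change, errors[-1])
```

For this sample the discrepancy falls roughly as the square of the displacement, which is what one expects when the neglected terms are of second order.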

The simplest possible physical field is a scalar field. By a field, you remember, we mean a quantity which depends upon position in space. By a scalar field we merely mean a field which is characterized at each point by a single number, a scalar. Of course the number may change in time, but we need not worry about that for the moment. We will talk about what the field looks like at a given instant. As an example of a scalar field, consider a solid block of material which has been heated at some places and cooled at others, so that the temperature of the body varies from point to point in a complicated way. Then the temperature will be a function of $x$, $y$, and $z$, the position in space measured in a rectangular coordinate system. Temperature is a scalar field.

Fig. 2–1. Temperature $T$ is an example of a scalar field. With each point $(x,y,z)$ in space there is associated a number $T(x,y,z)$. All points on the surface marked $T=20^\circ$ (shown as a curve at $z=0$) are at the same temperature. The arrows are samples of the heat flow vector $\Figh$.

One way of thinking about scalar fields is to imagine “contours,” which are imaginary surfaces drawn through all points for which the field has the same value, just as contour lines on a map connect points with the same height. For a temperature field the contours are called “isothermal surfaces” or isotherms. Figure 2–1 illustrates a temperature field and shows the dependence of $T$ on $x$ and $y$ when $z=0$. Several isotherms are drawn.

Fig. 2–2. The velocity of the atoms in a rotating object is an example of a vector field.

There are also vector fields. The idea is very simple. A vector is given for each point in space. The vector varies from point to point. As an example, consider a rotating body. The velocity of the material of the body at any point is a vector which is a function of position (Fig. 2–2). As a second example, consider the flow of heat in a block of material. If the temperature in the block is high at one place and low at another, there will be a flow of heat from the hotter places to the colder. The heat will be flowing in different directions in different parts of the block. The heat flow is a directional quantity which we call $\FLPh$. Its magnitude is a measure of how much heat is flowing. Examples of the heat flow vector are also shown in Fig. 2–1.

Fig. 2–3. Heat flow is a vector field. The vector $\Figh$ points along the direction of the flow. Its magnitude is the energy transported per unit time across a surface element oriented perpendicular to the flow, divided by the area of the surface element.

Let’s make a more precise definition of $\FLPh$: The magnitude of the vector heat flow at a point is the amount of thermal energy that passes, per unit time and per unit area, through an infinitesimal surface element at right angles to the direction of flow. The vector points in the direction of flow (see Fig. 2–3). In symbols: If $\Delta J$ is the thermal energy that passes per unit time through the surface element $\Delta a$, then $\begin{equation} \label{Eq:II:2:9} \FLPh=\frac{\Delta J}{\Delta a}\,\FLPe_f, \end{equation}$ where $\FLPe_f$ is a unit vector in the direction of flow.

Fig. 2–4. The heat flow through $\Delta a_2$ is the same as through $\Delta a_1$.

The vector $\FLPh$ can be defined in another way, in terms of its components. We ask how much heat flows through a small surface at any angle with respect to the flow. In Fig. 2–4 we show a small surface $\Delta a_2$ inclined with respect to $\Delta a_1$, which is perpendicular to the flow. The unit vector $\FLPn$ is normal to the surface $\Delta a_2$. The angle $\theta$ between $\FLPn$ and $\FLPh$ is the same as the angle between the surfaces (since $\FLPh$ is normal to $\Delta a_1$). Now what is the heat flow per unit area through $\Delta a_2$? The flow through $\Delta a_2$ is the same as through $\Delta a_1$; only the areas are different. In fact, $\Delta a_1=\Delta a_2\cos\theta$. The heat flow per unit area through $\Delta a_2$ is $\begin{equation} \label{Eq:II:2:10} \frac{\Delta J}{\Delta a_2}=\frac{\Delta J}{\Delta a_1}\cos\theta= \FLPh\cdot\FLPn. \end{equation}$ We interpret this equation: the heat flow (per unit time and per unit area) through any surface element whose unit normal is $\FLPn$ is given by $\FLPh\cdot\FLPn$. Equally, we could say: the component of the heat flow perpendicular to the surface element $\Delta a_2$ is $\FLPh\cdot\FLPn$. We can, if we wish, consider that these statements define $\FLPh$. We will be applying the same ideas to other vector fields.
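The geometry behind Eq. (2.10) can be traced step by step in a few lines. In this sketch the heat flow vector `h`, the element area `area_2`, and the list of tilt angles are made-up illustrative values; the code only follows the argument of the paragraph above.

```python
import math

# Heat flow through a tilted surface element, following Eq. (2.10).

h = (5.0, 0.0, 0.0)    # heat flow vector along +x (energy per time per area)
area_2 = 2.0           # area of the inclined element, delta a_2

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

results = []
for degrees in (0, 30, 60, 85):
    theta = math.radians(degrees)
    n = (math.cos(theta), math.sin(theta), 0.0)   # unit normal of delta a_2
    area_1 = area_2 * math.cos(theta)             # delta a_1 = delta a_2 cos(theta)
    delta_J = math.sqrt(dot(h, h)) * area_1       # same energy crosses both elements
    per_area = delta_J / area_2                   # flow per unit area of delta a_2
    results.append((per_area, dot(h, n)))
    print(degrees, per_area, dot(h, n))
```

The two printed columns agree at every angle: dividing the fixed energy flow by the larger tilted area gives the same number as the dot product of `h` with the unit normal.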

2–3 Derivatives of fields: the gradient

When fields vary in time, we can describe the variation by giving their derivatives with respect to $t$. We want to describe the variations with position in a similar way, because we are interested in the relationship between, say, the temperature in one place and the temperature at a nearby place. How shall we take the derivative of the temperature with respect to position? Do we differentiate the temperature with respect to $x$? Or with respect to $y$, or $z$?

Useful physical laws do not depend upon the orientation of the coordinate system. They should, therefore, be written in a form in which either both sides are scalars or both sides are vectors. What is the derivative of a scalar field, say $\ddpl{T}{x}$ ? Is it a scalar, or a vector, or what? It is neither a scalar nor a vector, as you can easily appreciate, because if we took a different $x$ -axis, $\ddpl{T}{x}$ would certainly be different. But notice: We have three possible derivatives: $\ddpl{T}{x}$ , $\ddpl{T}{y}$ , and $\ddpl{T}{z}$ . Since there are three kinds of derivatives and we know that it takes three numbers to form a vector, perhaps these three derivatives are the components of a vector: $\begin{equation} \label{Eq:II:2:11} \biggl(\ddp{T}{x},\ddp{T}{y},\ddp{T}{z}\biggr)\overset{?}{=}\text{a vector}. \end{equation}$

Of course it is not generally true that any three numbers form a vector. It is true only if, when we rotate the coordinate system, the components of the vector transform among themselves in the correct way. So it is necessary to analyze how these derivatives are changed by a rotation of the coordinate system. We shall show that (2.11) is indeed a vector. The derivatives do transform in the correct way when the coordinate system is rotated.

We can see this in several ways. One way is to ask a question whose answer is independent of the coordinate system, and try to express the answer in an “invariant” form. For instance, if $S=\FLPA\cdot\FLPB$ , and if $\FLPA$ and $\FLPB$ are vectors, we know—because we proved it in Chapter 11 of Vol. I—that $S$ is a scalar. We know that $S$ is a scalar without investigating whether it changes with changes in coordinate systems. It can’t, because it’s a dot product of two vectors. Similarly, if we have three numbers $B_1$ , $B_2$ , and $B_3$ and we find out that for every vector $\FLPA$ $\begin{equation} \label{Eq:II:2:12} A_xB_1+A_yB_2+A_zB_3=S, \end{equation}$ where $S$ is the same for any coordinate system, then it must be that the three numbers $B_1$ , $B_2$ , $B_3$ are the components $B_x$ , $B_y$ , $B_z$ of some vector $\FLPB$ .

Now let’s think of the temperature field. Suppose we take two points $P_1$ and $P_2$, separated by the small interval $\Delta\FLPR$. The temperature at $P_1$ is $T_1$ and at $P_2$ is $T_2$, and the difference is $\Delta T=T_2-T_1$. The temperatures at these real, physical points certainly do not depend on what axis we choose for measuring the coordinates. In particular, $\Delta T$ is a number independent of the coordinate system. It is a scalar.

Fig. 2–5. The vector $\Delta\FigR$, whose components are $\Delta x$, $\Delta y$, and $\Delta z$.

If we choose some convenient set of axes, we could write $T_1=T(x,y,z)$ and $T_2=T(x+\Delta x,y+\Delta y,z+\Delta z)$, where $\Delta x$, $\Delta y$, and $\Delta z$ are the components of the vector $\Delta\FLPR$ (Fig. 2–5). Remembering Eq. (2.7), we can write $\begin{equation} \label{Eq:II:2:13} \Delta T=\ddp{T}{x}\,\Delta x+\ddp{T}{y}\,\Delta y+\ddp{T}{z}\,\Delta z. \end{equation}$ The left side of Eq. (2.13) is a scalar. The right side is the sum of three products with $\Delta x$, $\Delta y$, and $\Delta z$, which are the components of a vector. By the argument given for Eq. (2.12), it follows that the three numbers $\begin{equation*} \ddp{T}{x},\ddp{T}{y},\ddp{T}{z} \end{equation*}$ are also the $x$-, $y$-, and $z$-components of a vector. We write this new vector with the symbol $\FLPgrad{T}$. The symbol $\FLPnabla$ (called “del”) is an upside-down $\Delta$, and is supposed to remind us of differentiation. People read $\FLPgrad{T}$ in various ways: “del-$T$,” or “gradient of $T$,” or “$\grad T$;”¹ $\begin{equation} \label{Eq:II:2:14} \grad T=\FLPgrad{T}=\biggl(\ddp{T}{x},\ddp{T}{y},\ddp{T}{z}\biggr). \end{equation}$

Using this notation, we can rewrite Eq. (2.13) in the more compact form $\begin{equation} \label{Eq:II:2:15} \Delta T=\FLPgrad{T}\cdot\Delta\FLPR. \end{equation}$ In words, this equation says that the difference in temperature between two nearby points is the dot product of the gradient of $T$ and the vector displacement between the points. The form of Eq. (2.15) also illustrates clearly our proof above that $\FLPgrad{T}$ is indeed a vector.

Perhaps you are still not convinced? Let’s prove it in a different way. (Although if you look carefully, you may be able to see that it’s really the same proof in a longer-winded form!) We shall show that the components of $\FLPgrad{T}$ transform in just the same way that components of $\FLPR$ do. If they do, $\FLPgrad{T}$ is a vector according to our original definition of a vector in Chapter 11 of Vol. I. We take a new coordinate system $x'$ , $y'$ , $z'$ , and in this new system we calculate $\ddpl{T}{x'}$ , $\ddpl{T}{y'}$ , and $\ddpl{T}{z'}$ . To make things a little simpler, we let $z=z'$ , so that we can forget about the $z$ -coordinate. (You can check out the more general case for yourself.)

Fig. 2–6. (a) Transformation to a rotated coordinate system. (b) Special case of an interval $\Delta\FigR$ parallel to the $x$-axis.

We take an $x'y'$ -system rotated an angle $\theta$ with respect to the $xy$ -system, as in Fig. 2–6(a). For a point $(x,y)$ the coordinates in the prime system are $\begin{alignat}{3} \label{Eq:II:2:16} &x'&&=\phantom{-}x\cos\theta&&+y\sin\theta,\\[1ex] \label{Eq:II:2:17} &y'&&=-x\sin\theta&&+y\cos\theta. \end{alignat}$ Or, solving for $x$ and $y$ , $\begin{alignat}{3} \label{Eq:II:2:18} &x&&=x'\cos\theta&&-y'\sin\theta,\\[1ex] \label{Eq:II:2:19} &y&&=x'\sin\theta&&+y'\cos\theta. \end{alignat}$ If any pair of numbers transforms with these equations in the same way that $x$ and $y$ do, they are the components of a vector.

Now let’s look at the difference in temperature between the two nearby points $P_1$ and $P_2$, chosen as in Fig. 2–6(b). If we calculate with the $x$- and $y$-coordinates, we would write $\begin{equation} \label{Eq:II:2:20} \Delta T=\ddp{T}{x}\,\Delta x \end{equation}$ since $\Delta y$ is zero.

What would a computation in the prime system give? We would have written $\begin{equation} \label{Eq:II:2:21} \Delta T=\ddp{T}{x'}\,\Delta x'+\ddp{T}{y'}\,\Delta y'. \end{equation}$ Looking at Fig. 2–6(b), we see that $\begin{equation} \label{Eq:II:2:22} \Delta x'=\phantom{-}\Delta x\cos\theta \end{equation}$ and $\begin{equation} \label{Eq:II:2:23} \Delta y'=-\Delta x\sin\theta, \end{equation}$ since $\Delta y'$ is negative when $\Delta x$ is positive. Substituting these in Eq. (2.21), we find that $\begin{align} \label{Eq:II:2:24} \Delta T&=\ddp{T}{x'}\,\Delta x\cos\theta-\ddp{T}{y'}\,\Delta x\sin\theta\\[1ex] \label{Eq:II:2:25} &=\biggl(\ddp{T}{x'}\cos\theta-\ddp{T}{y'}\sin\theta\biggr)\Delta x. \end{align}$ Comparing Eq. (2.25) with (2.20), we see that $\begin{equation} \label{Eq:II:2:26} \ddp{T}{x}=\ddp{T}{x'}\cos\theta-\ddp{T}{y'}\sin\theta. \end{equation}$ This equation says that $\ddpl{T}{x}$ is obtained from $\ddpl{T}{x'}$ and $\ddpl{T}{y'}$ , just as $x$ is obtained from $x'$ and $y'$ in Eq. (2.18). So $\ddpl{T}{x}$ is the $x$ -component of a vector. The same kind of arguments would show that $\ddpl{T}{y}$ and $\ddpl{T}{z}$ are $y$ - and $z$ -components. So $\FLPgrad{T}$ is definitely a vector. It is a vector field derived from the scalar field $T$ .
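The transformation law (2.26) can also be checked numerically for a particular field and a particular rotation. In the sketch below, the sample field `T`, the angle of 35 degrees, the test point, and the helper `partial` are all assumptions chosen for illustration. The primed derivatives are computed by differencing the same field expressed through Eqs. (2.18) and (2.19).

```python
import math

theta = math.radians(35.0)            # rotation angle between the two systems
c, s = math.cos(theta), math.sin(theta)

def T(x, y):
    """Sample temperature field in the unprimed coordinates."""
    return x * x * y + math.sin(y)

def T_primed(xp, yp):
    """The same field, reached through the primed coordinates."""
    x = xp * c - yp * s               # Eq. (2.18)
    y = xp * s + yp * c               # Eq. (2.19)
    return T(x, y)

def partial(f, a, b, axis, step=1e-6):
    """Central-difference partial derivative of a two-variable function."""
    if axis == 0:
        return (f(a + step, b) - f(a - step, b)) / (2 * step)
    return (f(a, b + step) - f(a, b - step)) / (2 * step)

x, y = 0.7, -1.2
xp = x * c + y * s                    # Eq. (2.16)
yp = -x * s + y * c                   # Eq. (2.17)

dT_dx = partial(T, x, y, 0)
dT_dy = partial(T, x, y, 1)
dT_dxp = partial(T_primed, xp, yp, 0)
dT_dyp = partial(T_primed, xp, yp, 1)

rhs_x = dT_dxp * c - dT_dyp * s       # right side of Eq. (2.26)
rhs_y = dT_dxp * s + dT_dyp * c       # the analogous relation for the y-component

print(dT_dx, rhs_x)
print(dT_dy, rhs_y)
```

Each printed pair agrees to within the accuracy of the finite differences, so the derivatives combine exactly as $x$ and $y$ do in Eqs. (2.18) and (2.19).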

2–4 The operator $\FLPnabla$

Now we can do something that is extremely amusing and ingenious—and characteristic of the things that make mathematics beautiful. The argument that $\grad T$ , or $\FLPgrad{T}$ , is a vector did not depend upon what scalar field we were differentiating. All the arguments would go the same if $T$ were replaced by any scalar field. Since the transformation equations are the same no matter what we differentiate, we could just as well omit the $T$ and replace Eq. (2.26) by the operator equation $\begin{equation} \label{Eq:II:2:27} \ddp{}{x}=\ddp{}{x'}\cos\theta-\ddp{}{y'}\sin\theta. \end{equation}$ We leave the operators, as Jeans said, “hungry for something to differentiate.”

Since the differential operators themselves transform as the components of a vector should, we can call them components of a vector operator. We can write $\begin{equation} \label{Eq:II:2:28} \FLPnabla=\biggl(\ddp{}{x},\ddp{}{y},\ddp{}{z}\biggr), \end{equation}$ which means, of course, $\begin{equation} \label{Eq:II:2:29} \nabla_x=\ddp{}{x},\quad\nabla_y=\ddp{}{y},\quad\nabla_z=\ddp{}{z}. \end{equation}$ We have abstracted the gradient away from the $T$ —that is the wonderful idea.

You must always remember, of course, that $\FLPnabla$ is an operator. Alone, it means nothing. If $\FLPnabla$ by itself means nothing, what does it mean if we multiply it by a scalar—say $T$ —to get the product $T\FLPnabla$ ? (One can always multiply a vector by a scalar.) It still does not mean anything. Its $x$ -component is $\begin{equation} \label{Eq:II:2:30} T\ddp{}{x}, \end{equation}$ which is not a number, but is still some kind of operator. However, according to the algebra of vectors we would still call $T\FLPnabla$ a vector.

Now let’s multiply $\FLPnabla$ by a scalar on the other side, so that we have the product $(\FLPgrad{T})$ . In ordinary algebra $\begin{equation} \label{Eq:II:2:31} T\FLPA=\FLPA T, \end{equation}$ but we have to remember that operator algebra is a little different from ordinary vector algebra. With operators we must always keep the sequence right, so that the operations make proper sense. You will have no difficulty if you just remember that the operator $\FLPnabla$ obeys the same convention as the derivative notation. What is to be differentiated must be placed on the right of the $\FLPnabla$ . The order is important.

Keeping in mind this problem of order, we understand that $T\FLPnabla$ is an operator, but the product $\FLPgrad{T}$ is no longer a hungry operator; the operator is completely satisfied. It is indeed a physical vector having a meaning. It represents the spatial rate of change of $T$ . The $x$ -component of $\FLPgrad{T}$ is how fast $T$ changes in the $x$ -direction. What is the direction of the vector $\FLPgrad{T}$ ? We know that the rate of change of $T$ in any direction is the component of $\FLPgrad{T}$ in that direction (see Eq. (2.15)). It follows that the direction of $\FLPgrad{T}$ is that in which it has the largest possible component—in other words, the direction in which $T$ changes the fastest. The gradient of $T$ has the direction of the steepest uphill slope (in $T$ ).
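The statement that the gradient points along the direction of fastest increase can be tested by brute force: try many directions, measure the rate of change of $T$ along each, and compare the winner with the direction of the gradient. The field `T`, the point `(x0, y0)`, and the scan resolution of a tenth of a degree are illustrative assumptions.

```python
import math

def T(x, y):
    """Sample two-dimensional temperature field."""
    return 20.0 + 3.0 * x - 4.0 * y + 0.5 * x * y

x0, y0 = 1.0, 2.0
step = 1e-4

def rate_along(angle_deg):
    """Central-difference rate of change of T along the unit vector at this angle."""
    a = math.radians(angle_deg)
    ex, ey = math.cos(a), math.sin(a)
    forward = T(x0 + step * ex, y0 + step * ey)
    backward = T(x0 - step * ex, y0 - step * ey)
    return (forward - backward) / (2 * step)

# Scan all directions in 0.1-degree steps and keep the one with the largest rate.
angles = [0.1 * k for k in range(3600)]
best_angle = max(angles, key=rate_along)

# Gradient components are the rates along the x- and y-axes.
grad_x = rate_along(0.0)
grad_y = rate_along(90.0)
grad_angle = math.degrees(math.atan2(grad_y, grad_x)) % 360.0
grad_size = math.hypot(grad_x, grad_y)

print(best_angle, grad_angle)
print(rate_along(best_angle), grad_size)
```

The scanned direction matches the direction of the gradient to within the scan resolution, and the largest rate found matches the magnitude of the gradient.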

2–5 Operations with $\FLPnabla$

Can we do any other algebra with the vector operator $\FLPnabla$ ? Let us try combining it with a vector. We can combine two vectors by making a dot product. We could make the products $\begin{equation*} (\text{a vector})\cdot\FLPnabla,\quad\text{or}\quad\FLPdiv{(\text{a vector})}. \end{equation*}$ The first one doesn’t mean anything yet, because it is still an operator. What it might ultimately mean would depend on what it is made to operate on. The second product is some scalar field. ( $\FLPA\cdot\FLPB$ is always a scalar.)

Let’s try the dot product of $\FLPnabla$ with a vector field we know, say $\FLPh$ . We write out the components: $\begin{equation} \label{Eq:II:2:32} \FLPdiv{\FLPh}=\nabla_xh_x+\nabla_yh_y+\nabla_zh_z \end{equation}$ or $\begin{equation} \label{Eq:II:2:33} \FLPdiv{\FLPh}=\ddp{h_x}{x}+\ddp{h_y}{y}+\ddp{h_z}{z}. \end{equation}$ The sum is invariant under a coordinate transformation. If we were to choose a different system (indicated by primes), we would have² $\begin{equation} \label{Eq:II:2:34} \FLPnabla'\cdot\FLPh=\ddp{h_{x'}}{x'}+\ddp{h_{y'}}{y'}+\ddp{h_{z'}}{z'}, \end{equation}$ which is the same number as would be gotten from Eq. (2.33), even though it looks different. That is, $\begin{equation} \label{Eq:II:2:35} \FLPnabla'\cdot\FLPh=\FLPdiv{\FLPh} \end{equation}$ for every point in space. So $\FLPdiv{\FLPh}$ is a scalar field, which must represent some physical quantity. You should realize that the combination of derivatives in $\FLPdiv{\FLPh}$ is rather special. There are all sorts of other combinations like $\ddpl{h_y}{x}$ , which are neither scalars nor components of vectors.
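The invariance claimed in Eq. (2.35) can be illustrated numerically in two dimensions. The sketch below evaluates the divergence of a sample field in the original axes and in axes rotated by 25 degrees, and does the same for the single derivative $\ddpl{h_y}{x}$, which is not invariant. The field `h_field`, the angle, and the test point are assumptions made for the example.

```python
import math

theta = math.radians(25.0)
c, s = math.cos(theta), math.sin(theta)

def h_field(x, y):
    """Sample vector field (h_x, h_y) in the unprimed system."""
    return (x * x * y, math.sin(x) + y ** 3)

def h_field_primed(xp, yp):
    """The same field: primed components as functions of primed coordinates."""
    x = xp * c - yp * s
    y = xp * s + yp * c
    hx, hy = h_field(x, y)
    return (hx * c + hy * s, -hx * s + hy * c)

def partial(field, comp, axis, a, b, step=1e-5):
    """Central-difference derivative of one component along one axis."""
    if axis == 0:
        return (field(a + step, b)[comp] - field(a - step, b)[comp]) / (2 * step)
    return (field(a, b + step)[comp] - field(a, b - step)[comp]) / (2 * step)

x, y = 0.4, 1.1
xp, yp = x * c + y * s, -x * s + y * c

div = partial(h_field, 0, 0, x, y) + partial(h_field, 1, 1, x, y)
div_primed = (partial(h_field_primed, 0, 0, xp, yp)
              + partial(h_field_primed, 1, 1, xp, yp))

dhy_dx = partial(h_field, 1, 0, x, y)
dhyp_dxp = partial(h_field_primed, 1, 0, xp, yp)

print(div, div_primed)       # the same number in both systems
print(dhy_dx, dhyp_dxp)      # different numbers: not a scalar
```

Only the particular sum of derivatives that makes up the divergence survives the rotation unchanged.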

The scalar quantity $\FLPdiv{(\text{a vector})}$ is extremely useful in physics. It has been given the name the divergence. For example, $\begin{equation} \label{Eq:II:2:36} \FLPdiv{\FLPh}=\ndiv\FLPh=\text{“divergence of $\FLPh$.”} \end{equation}$ As we did for $\FLPgrad{T}$, we can ascribe a physical significance to $\FLPdiv{\FLPh}$. We shall, however, postpone that until later.

First, we wish to see what else we can cook up with the vector operator $\FLPnabla$ . What about a cross product? We must expect that $\begin{equation} \label{Eq:II:2:37} \FLPcurl{\FLPh}=\text{a vector}. \end{equation}$ It is a vector whose components we can write by the usual rule for cross products (see Eq. (2.2)): $\begin{equation} \label{Eq:II:2:38} (\FLPcurl{\FLPh})_z= \nabla_x h_y -\nabla_y h_x =\ddp{h_y}{x} -\ddp{h_x}{y}. \end{equation}$ Similarly, $\begin{equation} \label{Eq:II:2:39} (\FLPcurl{\FLPh})_x = \nabla_y h_z -\nabla_z h_y =\ddp{h_z}{y} -\ddp{h_y}{z}\phantom{.} \end{equation}$ and $\begin{equation} \label{Eq:II:2:40} (\FLPcurl{\FLPh})_y = \nabla_z h_x -\nabla_x h_z =\ddp{h_x}{z} -\ddp{h_z}{x}. \end{equation}$

The combination $\FLPcurl{\FLPh}$ is called “the curl of $\FLPh$.” The reason for the name and the physical meaning of the combination will be discussed later.
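As a concrete use of Eqs. (2.38) to (2.40), the sketch below computes the curl of the velocity field of a rigidly rotating body, the example of Fig. 2–2. It assumes the standard result that the velocity of such a body is $\boldsymbol{\omega}\times\boldsymbol{r}$. The angular velocity `omega`, the test point, and the helper names are illustrative choices.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

omega = (0.5, -1.0, 2.0)   # assumed angular velocity of the body

def v(x, y, z):
    """Velocity of the material at (x, y, z) for rigid rotation about the origin."""
    return cross(omega, (x, y, z))

def partial(field, comp, axis, point, step=1e-4):
    """Central-difference derivative of one field component along one axis."""
    plus = list(point)
    minus = list(point)
    plus[axis] += step
    minus[axis] -= step
    return (field(*plus)[comp] - field(*minus)[comp]) / (2 * step)

def curl(field, point):
    return (partial(field, 2, 1, point) - partial(field, 1, 2, point),   # Eq. (2.39)
            partial(field, 0, 2, point) - partial(field, 2, 0, point),   # Eq. (2.40)
            partial(field, 1, 0, point) - partial(field, 0, 1, point))   # Eq. (2.38)

curl_v = curl(v, (0.3, -0.8, 1.5))
print(curl_v)
```

For this field the result is twice the angular velocity vector at every point, which gives a first hint of why the combination is named as it is.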

Summarizing, we have three kinds of combinations with $\FLPnabla$ : $\begin{alignat*}{3} &\FLPgrad{T}&&=\grad T\;&&=\text{a vector},\\[1ex] &\FLPdiv{\FLPh}&&=\ndiv\FLPh&&=\text{a scalar},\\[1ex] &\FLPcurl{\FLPh}\;&&=\curl\FLPh&&=\text{a vector}. \end{alignat*}$ Using these combinations, we can write about the spatial variations of fields in a convenient way—in a way that is general, in that it doesn’t depend on any particular set of axes.

As an example of the use of our vector differential operator $\FLPnabla$ , we write a set of vector equations which contain the same laws of electromagnetism that we gave in words in Chapter 1. They are called Maxwell’s equations. $\begin{gather} \notag \textit{Maxwell’s Equations}\\[1ex] \label{Eq:II:2:41} \begin{alignedat}{2} &(1)&\quad \FLPdiv{\FLPE}\;&=\frac{\rho}{\epsO}\\[.5ex] &(2)&\quad\FLPcurl{\FLPE}\;&=-\ddp{\FLPB}{t}\\[.5ex] &(3)&\quad\FLPdiv{\FLPB}\;&=0\\[.5ex] &(4)&\quad c^2\,\FLPcurl{\FLPB}\;&=\ddp{\FLPE}{t}+\frac{\FLPj}{\epsO} \end{alignedat} \end{gather}$ where $\rho$ (rho), the “electric charge density,” is the amount of charge per unit volume, and $\FLPj$ , the “electric current density,” is the rate at which charge flows through a unit area per second. These four equations contain the complete classical theory of the electromagnetic field. You see what an elegantly simple form we can get with our new notation!

2–6 The differential equation of heat flow

Let us give another example of a law of physics written in vector notation. The law is not a precise one, but for many metals and a number of other substances that conduct heat it is quite accurate. You know that if you take a slab of material and heat one face to temperature $T_2$ and cool the other to a different temperature $T_1$ the heat will flow through the material from $T_2$ to $T_1$ [Fig. 2–7(a)]. The heat flow is proportional to the area $A$ of the faces, and to the temperature difference. It is also inversely proportional to $d$ , the distance between the plates. (For a given temperature difference, the thinner the slab the greater the heat flow.) Letting $J$ be the thermal energy that passes per unit time through the slab, we write $\begin{equation} \label{Eq:II:2:42} J=\kappa(T_2-T_1)\,\frac{A}{d}. \end{equation}$ The constant of proportionality $\kappa$ (kappa) is called the thermal conductivity.

Fig. 2–7. (a) Heat flow through a slab. (b) An infinitesimal slab parallel to an isothermal surface in a large block.

What will happen in a more complicated case? Say in an odd-shaped block of material in which the temperature varies in peculiar ways? Suppose we look at a tiny piece of the block and imagine a slab like that of Fig. 2–7(a) on a miniature scale. We orient the faces parallel to the isothermal surfaces, as in Fig. 2–7(b), so that Eq. (2.42) is correct for the small slab.

If the area of the small slab is $\Delta A$ , the heat flow per unit time is $\begin{equation} \label{Eq:II:2:43} \Delta J=\kappa\,\Delta T\,\frac{\Delta A}{\Delta s}, \end{equation}$ where $\Delta s$ is the thickness of the slab. Now $\Delta J/\Delta A$ we have defined earlier as the magnitude of $\FLPh$ , whose direction is the heat flow. The heat flow will be from $T_1+\Delta T$ toward $T_1$ and so it will be perpendicular to the isotherms, as drawn in Fig. 2–7(b). Also, $\Delta T/\Delta s$ is just the rate of change of $T$ with position. And since the position change is perpendicular to the isotherms, our $\Delta T/\Delta s$ is the maximum rate of change. It is, therefore, just the magnitude of $\FLPgrad{T}$ . Now since the direction of $\FLPgrad{T}$ is opposite to that of $\FLPh$ , we can write (2.43) as a vector equation: $\begin{equation} \label{Eq:II:2:44} \FLPh=-\kappa\,\FLPgrad{T}. \end{equation}$ (The minus sign is necessary because heat flows “downhill” in temperature.) Equation (2.44) is the differential equation of heat conduction in bulk materials. You see that it is a proper vector equation. Each side is a vector if $\kappa$ is just a number. It is the generalization to arbitrary cases of the special relation (2.42) for rectangular slabs. Later we should learn to write all sorts of elementary physics relations like (2.42) in the more sophisticated vector notation. This notation is useful not only because it makes the equations look simpler. It also shows most clearly the physical content of the equations without reference to any arbitrarily chosen coordinate system.
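To see that Eq. (2.44) contains the slab formula (2.42) as a special case, one can take a temperature that falls linearly across a slab, compute $\FLPh$ from the gradient, and multiply by the face area. All numbers in this sketch are made up for illustration: the conductivity `kappa` is merely of the order of that of copper, and the helper `grad` is a hypothetical finite-difference routine.

```python
kappa = 400.0                   # W/(m K), illustrative, of the order of copper's
T_hot, T_cold = 350.0, 300.0    # T_2 and T_1 in kelvin
d = 0.02                        # slab thickness in metres
area = 0.5                      # face area A in square metres

def T(x, y, z):
    """Temperature falling linearly from T_hot at x = 0 to T_cold at x = d."""
    return T_hot + (T_cold - T_hot) * x / d

def grad(f, point, step=1e-4):
    """Central-difference gradient of a scalar field at a point."""
    components = []
    for axis in range(3):
        plus = list(point)
        minus = list(point)
        plus[axis] += step
        minus[axis] -= step
        components.append((f(*plus) - f(*minus)) / (2 * step))
    return tuple(components)

g = grad(T, (0.01, 0.0, 0.0))                       # evaluated mid-slab
h = tuple(-kappa * component for component in g)    # Eq. (2.44)

J_from_h = h[0] * area                              # flow through the whole face
J_from_slab = kappa * (T_hot - T_cold) * area / d   # Eq. (2.42)

print(h)
print(J_from_h, J_from_slab)
```

The heat flow vector points along $+x$, from the hot face toward the cold one, and the total flow through the face reproduces Eq. (2.42).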

2–7 Second derivatives of vector fields

So far we have had only first derivatives. Why not second derivatives? We could have several combinations: $\begin{equation} \begin{alignedat}{2} &(\text{a})&\quad&\FLPdiv{(\FLPgrad{T})}\\[.5ex] &(\text{b})&&\FLPcurl{(\FLPgrad{T})}\\[.5ex] &(\text{c})&&\FLPgrad{(\FLPdiv{\FLPh})}\\[.5ex] &(\text{d})&&\FLPdiv{(\FLPcurl{\FLPh})}\\[.5ex] &(\text{e})&&\FLPcurl{(\FLPcurl{\FLPh})} \end{alignedat} \label{Eq:II:2:45} \end{equation}$ You can check that these are all the possible combinations.

Let’s look first at the second one, (b). It has the same form as $\begin{equation*} \FLPA\times(\FLPA T)=(\FLPA\times\FLPA)T=\FLPzero, \end{equation*}$ since $\FLPA\times\FLPA$ is always zero. So we should have $\begin{equation} \label{Eq:II:2:46} \curl(\grad T)=\FLPcurl{(\FLPgrad{T})}=\FLPzero. \end{equation}$ We can see how this equation comes about if we go through once with the components: $\begin{align} [\FLPcurl{(\FLPgrad{T})}]_z&= \nabla_x(\FLPgrad{T})_y-\nabla_y(\FLPgrad{T})_x\notag\\[1ex] \label{Eq:II:2:47} &=\ddp{}{x}\biggl(\ddp{T}{y}\biggr)-\ddp{}{y}\biggl(\ddp{T}{x}\biggr), \end{align}$ which is zero (by Eq. (2.8)). It goes the same for the other components. So $\FLPcurl{(\FLPgrad{T})}=\FLPzero$ , for any temperature distribution—in fact, for any scalar function.

Now let us take another example. Let us see whether we can find another zero. The dot product of a vector with a cross product which contains that vector is zero: $\begin{equation} \label{Eq:II:2:48} \FLPA\cdot(\FLPA\times\FLPB)=0, \end{equation}$ because $\FLPA\times\FLPB$ is perpendicular to $\FLPA$ , and so has no components in the direction $\FLPA$ . The same combination appears in (d) of (2.45), so we have $\begin{equation} \label{Eq:II:2:49} \FLPdiv{(\FLPcurl{\FLPh})}=\ndiv(\curl\FLPh)=0. \end{equation}$ Again, it is easy to show that it is zero by carrying through the operations with components.
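Both zeros, Eq. (2.46) and Eq. (2.49), can be checked by carrying through the operations numerically. The sketch below builds the derivative operators from central differences and applies them twice. The sample fields `T` and `h`, the test point, and the two step sizes are assumptions; different inner and outer steps are used so that the cancellation is not built into the difference formula itself.

```python
import math

def partial(f, axis, step):
    """Return a function that estimates the partial derivative of f along axis."""
    def derivative(x, y, z):
        plus = [x, y, z]
        minus = [x, y, z]
        plus[axis] += step
        minus[axis] -= step
        return (f(*plus) - f(*minus)) / (2 * step)
    return derivative

def grad(f, step):
    return [partial(f, axis, step) for axis in range(3)]

def curl(F, step):
    """F is a list of three component functions; so is the result."""
    def component(j, k):
        d_jk = partial(F[k], j, step)     # dF_k/dx_j
        d_kj = partial(F[j], k, step)     # dF_j/dx_k
        return lambda x, y, z: d_jk(x, y, z) - d_kj(x, y, z)
    return [component(1, 2), component(2, 0), component(0, 1)]

def div(F, step):
    parts = [partial(F[axis], axis, step) for axis in range(3)]
    return lambda x, y, z: sum(p(x, y, z) for p in parts)

def T(x, y, z):
    return x * x * y + math.sin(y * z)

h = [lambda x, y, z: y * z * z,
     lambda x, y, z: math.cos(x) * z,
     lambda x, y, z: x * y * y]

inner, outer = 1e-4, 1e-3
point = (0.3, 0.7, -0.4)

curl_of_grad = [comp(*point) for comp in curl(grad(T, inner), outer)]
div_of_curl = div(curl(h, inner), outer)(*point)
mixed = partial(partial(T, 1, inner), 0, outer)(*point)   # a typical second derivative

print(curl_of_grad)
print(div_of_curl)
print(mixed)
```

The individual second derivatives are of ordinary size (the last printed number), while the two special combinations vanish to within the error of the finite differences.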

Now we are going to state two mathematical theorems that we will not prove. They are very interesting and useful theorems for physicists to know.

In a physical problem we frequently find that the curl of some quantity—say of the vector field $\FLPA$ —is zero. Now we have seen (Eq. (2.46)) that the curl of a gradient is zero, which is easy to remember because of the way the vectors work. It could certainly be, then, that $\FLPA$ is the gradient of some quantity, because then its curl would necessarily be zero. The interesting theorem is that if the $\curl\FLPA$ is zero, then $\FLPA$ is always the gradient of something—there is some scalar field $\psi$ (psi) such that $\FLPA$ is equal to $\grad\psi$ . In other words, we have the $\begin{alignat}{2} \kern-6em\text{Theorem:}\notag\\[3pt] &\text{If}&\FLPnabla\times\FLPA&=\FLPzero\notag\\[3pt] &\text{there is a}&\psi&\notag\\[3pt] \label{Eq:II:2:50} &\text{such that}\quad&\FLPA=\FLPnabla\psi&. \end{alignat}$
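The theorem is stated here without proof, but for a field that is smooth throughout space one standard way to exhibit $\psi$ is to integrate $\FLPA$ along a path from a fixed reference point. The sketch below does this for a sample curl-free field and then checks that the gradient of the result reproduces the field. The field `A`, the straight-line path from the origin, and the use of Simpson's rule are choices made for the illustration, not a construction given in the text.

```python
def A(x, y, z):
    """A sample curl-free field; it equals the gradient of x*x*y + x*z."""
    return (2 * x * y + z, x * x, x)

def psi(x, y, z, n=200):
    """Line integral of A from the origin to (x, y, z) along a straight path,
    evaluated with Simpson's rule on n (even) subintervals."""
    def integrand(t):
        ax, ay, az = A(t * x, t * y, t * z)
        return ax * x + ay * y + az * z
    width = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * width)
    return total * width / 3

def grad_psi(x, y, z, step=1e-5):
    """Central-difference gradient of the constructed potential."""
    return ((psi(x + step, y, z) - psi(x - step, y, z)) / (2 * step),
            (psi(x, y + step, z) - psi(x, y - step, z)) / (2 * step),
            (psi(x, y, z + step) - psi(x, y, z - step)) / (2 * step))

point = (0.8, -1.3, 2.1)
recovered = grad_psi(*point)

print(psi(*point))
print(recovered, A(*point))
```

If the field had a nonzero curl, the integral would depend on the path chosen and no single function $\psi$ could reproduce all three components.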

There is a similar theorem if the divergence of $\FLPA$ is zero. We have seen in Eq. (2.49) that the divergence of a curl of something is always zero. If you come across a vector field $\FLPD$ for which $\ndiv\FLPD$ is zero, then you can conclude that $\FLPD$ is the curl of some vector field $\FLPC$ . $\begin{alignat}{2} \kern-6em\text{Theorem:}\notag\\[3pt] &\text{If}&\FLPnabla\cdot\FLPD&=0\notag\\[3pt] &\text{there is a}&\FLPC&\notag\\[3pt] \label{Eq:II:2:51} &\text{such that}\quad&\FLPD=\FLPnabla&\times\FLPC. \end{alignat}$
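
Here is a small illustration of the second theorem, using the simplest divergence-free field we can think of: a uniform field along $z$. Both the field $\mathbf{D}=(0,0,1)$ and the candidate $\mathbf{C}$ are assumptions chosen for this sketch:

```latex
% Illustrative field (assumed for this example): D = (0, 0, 1),
% so div D = 0 trivially.
% Candidate vector field: C = (-y/2, x/2, 0)
\begin{align*}
(\nabla\times\mathbf{C})_x
  &= \frac{\partial C_z}{\partial y} - \frac{\partial C_y}{\partial z} = 0 - 0 = 0,\\
(\nabla\times\mathbf{C})_y
  &= \frac{\partial C_x}{\partial z} - \frac{\partial C_z}{\partial x} = 0 - 0 = 0,\\
(\nabla\times\mathbf{C})_z
  &= \frac{\partial C_y}{\partial x} - \frac{\partial C_x}{\partial y}
   = \tfrac{1}{2} + \tfrac{1}{2} = 1,
\end{align*}
% so curl C = (0, 0, 1) = D.
% Another choice works equally well: C' = (-y, 0, 0), since
\begin{equation*}
(\nabla\times\mathbf{C}')_z = 0 - \frac{\partial(-y)}{\partial y} = 1,
\qquad
\mathbf{C}' - \mathbf{C} = \Bigl(-\frac{y}{2},\,-\frac{x}{2},\,0\Bigr)
  = \nabla\Bigl(-\frac{xy}{2}\Bigr).
\end{equation*}
```

The two choices differ by a gradient, and the curl of a gradient is zero by Eq. (2.46), so the field $\FLPC$ in the theorem is not unique.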

In looking at the possible combinations of two $\FLPnabla$ operators, we have found that two of them always give zero. Now we look at the ones that are not zero. Take the combination $\FLPdiv{(\FLPgrad{T})}$ , which was first on our list. It is not, in general, zero. We write out the components: $\begin{equation*} \FLPgrad{T}=\FLPi\nabla_xT+\FLPj\nabla_yT+\FLPk\nabla_zT. \end{equation*}$ Then $\begin{align} \FLPdiv{(\FLPgrad{T})}&= \nabla_x(\nabla_xT)+\nabla_y(\nabla_yT)+\nabla_z(\nabla_zT)\notag\\[1ex] \label{Eq:II:2:52} &=\frac{\partial^2T}{\partial x^2}+\frac{\partial^2T}{\partial y^2}+ \frac{\partial^2T}{\partial z^2}, \end{align}$ which would, in general, come out to be some number. It is a scalar field.

You see that we do not need to keep the parentheses, but can write, without any chance of confusion, $\begin{equation} \label{Eq:II:2:53} \FLPdiv{(\FLPgrad{T})}=\FLPdiv{\FLPgrad{T}}=(\FLPdiv{\FLPnabla})T=\nabla^2T. \end{equation}$ We look at $\nabla^2$ as a new operator. It is a scalar operator. Because it appears often in physics, it has been given a special name—the Laplacian. $\begin{equation} \label{Eq:II:2:54} \text{Laplacian}=\nabla^2=\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}+ \frac{\partial^2}{\partial z^2}. \end{equation}$
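
Two quick examples may help fix the idea. The fields below are assumed purely for illustration. The first has a constant, nonzero Laplacian; the second happens to have a Laplacian that is zero everywhere:

```latex
% Example 1 (assumed): T = x^2 + y^2 + z^2
\begin{align*}
\nabla T &= (2x,\; 2y,\; 2z),\\
\nabla^2 T &= \frac{\partial(2x)}{\partial x}
            + \frac{\partial(2y)}{\partial y}
            + \frac{\partial(2z)}{\partial z} = 6.
\end{align*}
% Example 2 (assumed): T = x^2 - y^2
\begin{align*}
\nabla T &= (2x,\; -2y,\; 0),\\
\nabla^2 T &= 2 - 2 + 0 = 0.
\end{align*}
```

So $\nabla^2T$ is not zero in general, although it can be for particular fields.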

Since the Laplacian is a scalar operator, we may operate with it on a vector—by which we mean the same operation on each component in rectangular coordinates: $\begin{equation*} \nabla^2\FLPh=(\nabla^2h_x,\nabla^2h_y,\nabla^2h_z). \end{equation*}$

Let’s look at one more possibility: $\FLPcurl{(\FLPcurl{\FLPh})}$ , which was (e) in the list (2.45). Now the curl of the curl can be written differently if we use the vector equality (2.6): $\begin{equation} \label{Eq:II:2:55} \FLPA\times(\FLPB\times\FLPC)=\FLPB(\FLPA\cdot\FLPC)-\FLPC(\FLPA\cdot\FLPB). \end{equation}$ In order to use this formula, we should replace $\FLPA$ and $\FLPB$ by the operator $\FLPnabla$ and put $\FLPC=\FLPh$ . If we do that, we get $\begin{equation*} \FLPcurl{(\FLPcurl{\FLPh})}=\FLPgrad{(\FLPdiv{\FLPh})}- \FLPh(\FLPdiv{\FLPnabla})\ldots\text{???} \end{equation*}$ Wait a minute! Something is wrong. The first two terms are vectors all right (the operators are satisfied), but the last term doesn’t come out to anything. It’s still an operator. The trouble is that we haven’t been careful enough about keeping the order of our terms straight. If you look again at Eq. (2.55), however, you see that we could equally well have written it as $\begin{equation} \label{Eq:II:2:56} \FLPA\times(\FLPB\times\FLPC)= \FLPB(\FLPA\cdot\FLPC)-(\FLPA\cdot\FLPB)\FLPC. \end{equation}$ The order of terms looks better. Now let’s make our substitution in (2.56). We get $\begin{equation} \label{Eq:II:2:57} \FLPcurl{(\FLPcurl{\FLPh})}=\FLPgrad{(\FLPdiv{\FLPh})}- (\FLPdiv{\FLPnabla})\FLPh. \end{equation}$ This form looks all right. It is, in fact, correct, as you can verify by computing the components. The last term is the Laplacian, so we can equally well write $\begin{equation} \label{Eq:II:2:58} \FLPcurl{(\FLPcurl{\FLPh})}=\FLPgrad{(\FLPdiv{\FLPh})}-\nabla^2\FLPh. \end{equation}$
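
The suggestion to verify the result by computing the components can be tried on a simple case. The field $\mathbf{h}=(x^2y,\,0,\,0)$ is a hypothetical choice made only for this check; we compute the two sides of Eq. (2.58) separately:

```latex
% Illustrative field (assumed for this example): h = (x^2 y, 0, 0)
% Left-hand side: curl of the curl.
\begin{align*}
\nabla\times\mathbf{h}
  &= \Bigl(0,\;\frac{\partial h_x}{\partial z},\;-\frac{\partial h_x}{\partial y}\Bigr)
   = (0,\; 0,\; -x^2),\\[1ex]
\nabla\times(\nabla\times\mathbf{h})
  &= \Bigl(\frac{\partial(-x^2)}{\partial y},\;
          -\frac{\partial(-x^2)}{\partial x},\; 0\Bigr)
   = (0,\; 2x,\; 0).
\end{align*}
% Right-hand side: grad(div h) minus the Laplacian of h.
\begin{align*}
\nabla\cdot\mathbf{h} &= \frac{\partial(x^2y)}{\partial x} = 2xy,
  \qquad
  \nabla(\nabla\cdot\mathbf{h}) = (2y,\; 2x,\; 0),\\[1ex]
\nabla^2\mathbf{h} &= \bigl(\nabla^2(x^2y),\; 0,\; 0\bigr) = (2y,\; 0,\; 0),\\[1ex]
\nabla(\nabla\cdot\mathbf{h}) - \nabla^2\mathbf{h}
  &= (2y,\; 2x,\; 0) - (2y,\; 0,\; 0) = (0,\; 2x,\; 0).
\end{align*}
```

The two sides agree for this field, as they should.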

We have had something to say about all of the combinations in our list of double $\FLPnabla$'s, except for (c), $\FLPgrad{(\FLPdiv{\FLPh})}$. It is a possible vector field, but there is nothing special to say about it. It’s just some vector field which may occasionally come up.

It will be convenient to have a table of our conclusions: $\begin{equation} \begin{alignedat}{2} &(\text{a})&&\FLPdiv{(\FLPgrad{T})}=\nabla^2T=\text{a scalar field}\\[.5ex] &(\text{b})&&\FLPcurl{(\FLPgrad{T})}=\FLPzero\\[.5ex] &(\text{c})&&\FLPgrad{(\FLPdiv{\FLPh})}=\text{a vector field}\\[.5ex] &(\text{d})&&\FLPdiv{(\FLPcurl{\FLPh})}=0\\[.5ex] &(\text{e})&&\FLPcurl{(\FLPcurl{\FLPh})}= \FLPgrad{(\FLPdiv{\FLPh})}-\nabla^2\FLPh\\[.5ex] &(\text{f})&\quad(&\FLPdiv{\FLPnabla})\FLPh=\nabla^2\FLPh=\text{a vector field} \end{alignedat} \label{Eq:II:2:59} \end{equation}$ You may notice that we haven’t tried to invent a new vector operator $(\FLPcurl{\FLPnabla})$ . Do you see why?

2–8Pitfalls

We have been applying our knowledge of ordinary vector algebra to the algebra of the operator $\FLPnabla$. We have to be careful, though, because it is possible to go astray. There are two pitfalls which we will mention, although they will not come up in this course. What would you say about the following expression, which involves the two scalar functions $\psi$ and $\phi$ (phi): $\begin{equation*} (\FLPgrad{\psi})\times(\FLPgrad{\phi})? \end{equation*}$ You might want to say: it must be zero because it’s just like $\begin{equation*} (\FLPA a)\times(\FLPA b), \end{equation*}$ which is zero because the cross product of two equal vectors, $\FLPA\times\FLPA$, is always zero. But in our example the two operators $\FLPnabla$ are not equal! The first one operates on one function, $\psi$; the other operates on a different function, $\phi$. So although we represent them by the same symbol $\FLPnabla$, they must be considered as different operators. Clearly, the direction of $\FLPgrad{\psi}$ depends on the function $\psi$, so it is not likely to be parallel to $\FLPgrad{\phi}$: $\begin{equation*} (\FLPgrad{\psi})\times(\FLPgrad{\phi})\neq\FLPzero\quad(\text{generally}). \end{equation*}$ Fortunately, we won’t have to use such expressions. (What we have said doesn’t change the fact that $\FLPcurl{\FLPgrad{\psi}}=\FLPzero$ for any scalar field, because here both $\FLPnabla$’s operate on the same function.)
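
The simplest possible choice of functions already shows the point. The functions below are assumed only for illustration:

```latex
% Case 1 (assumed): psi = x, phi = y. The gradients are perpendicular.
\begin{equation*}
\nabla\psi = (1,0,0), \qquad \nabla\phi = (0,1,0), \qquad
(\nabla\psi)\times(\nabla\phi) = (0,0,1) \neq \mathbf{0}.
\end{equation*}
% Case 2 (assumed): psi = x, phi = x^2. The gradients are parallel.
\begin{equation*}
\nabla\psi = (1,0,0), \qquad \nabla\phi = (2x,0,0), \qquad
(\nabla\psi)\times(\nabla\phi) = \mathbf{0}.
\end{equation*}
```

The product vanishes only in special cases like the second one, where the two gradients point along the same line everywhere. That is why the statement carries the qualification "generally."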

Pitfall number two (which, again, we need not get into in our course) is the following: The rules that we have outlined here are simple and nice when we use rectangular coordinates. For example, if we have $\nabla^2\FLPh$ and we want the $x$ -component, it is $\begin{equation} \label{Eq:II:2:60} (\nabla^2\FLPh)_x=\biggl(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}+\frac{\partial^2}{\partial z^2}\biggr)h_x=\nabla^2h_x. \end{equation}$ The same expression would not work if we were to ask for the radial component of $\nabla^2\FLPh$ . The radial component of $\nabla^2\FLPh$ is not equal to $\nabla^2h_r$ . The reason is that when we are dealing with the algebra of vectors, the directions of the vectors are all quite definite. But when we are dealing with vector fields, their directions are different at different places. If we try to describe a vector field in, say, polar coordinates, what we call the “radial” direction varies from point to point. So we can get into a lot of trouble when we start to differentiate the components. For example, even for a constant vector field, the radial component changes from point to point.
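
The constant-field remark can be made quantitative with a short sketch. We assume, for illustration only, a uniform field $\mathbf{h}=(1,0,0)$ restricted to the $xy$-plane, with polar coordinates $x=r\cos\theta$, $y=r\sin\theta$, and we use the standard form of the scalar Laplacian in plane polar coordinates:

```latex
% Illustrative field (assumed for this example): h = (1, 0, 0), uniform.
% Every rectangular component is constant, so
\begin{equation*}
\nabla^2\mathbf{h} = (\nabla^2 h_x,\; \nabla^2 h_y,\; \nabla^2 h_z) = (0,0,0),
\qquad\text{hence}\qquad (\nabla^2\mathbf{h})_r = 0.
\end{equation*}
% The radial component of h, however, depends on position:
\begin{equation*}
h_r = \mathbf{h}\cdot\mathbf{e}_r = \cos\theta.
\end{equation*}
% Scalar Laplacian in plane polar coordinates applied to h_r:
\begin{align*}
\nabla^2 h_r
  &= \frac{1}{r}\frac{\partial}{\partial r}\Bigl(r\frac{\partial h_r}{\partial r}\Bigr)
   + \frac{1}{r^2}\frac{\partial^2 h_r}{\partial\theta^2}\\[1ex]
  &= 0 - \frac{\cos\theta}{r^2} \;\neq\; 0 = (\nabla^2\mathbf{h})_r .
\end{align*}
```

The mismatch comes entirely from the unit vector $\mathbf{e}_r$ changing direction from point to point, not from any variation in the field itself.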

It is usually safest and simplest just to stick to rectangular coordinates and avoid trouble, but there is one exception worth mentioning: Since the Laplacian, $\nabla^2$, is a scalar operator, we can write it in any coordinate system we want to (for example, in polar coordinates). But since it is a differential operator, we should use it only on vectors whose components are in a fixed direction; that means rectangular coordinates. So we shall express all of our vector fields in terms of their $x$-, $y$-, and $z$-components when we write our vector differential equations out in components.

In our notation, the expression $(a,b,c)$ represents a vector with components $a$ , $b$ , and $c$ . If you like to use the unit vectors $\FLPi$ , $\FLPj$ , and $\FLPk$ , you may write $\begin{equation*} \FLPgrad{T}=\FLPi\ddp{T}{x}+\FLPj\ddp{T}{y}+\FLPk\ddp{T}{z}. \end{equation*}$
We think of $\FLPh$ as a physical quantity that depends on position in space, and not strictly as a mathematical function of three variables. When $\FLPh$ is “differentiated” with respect to $x$ , $y$ , and $z$ , or with respect to $x'$ , $y'$ , and $z'$ , the mathematical expression for $\FLPh$ must first be expressed as a function of the appropriate variables.