Einstein equation in the Newtonian limit

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 21; Problem 21.1.

The Einstein equation is

\displaystyle  R^{ij}=\kappa\left(T^{ij}-\frac{1}{2}g^{ij}T\right)+\Lambda g^{ij} \ \ \ \ \ (1)


where we have yet to determine the constant {\kappa}. To do this, we need to show that the Einstein equation reduces to Newton’s law of gravity for weak gravitational fields. Actually, there are three conditions that should hold in the Newtonian limit. First, as we’ve said, the gravitational field is weak, meaning that spacetime is nearly flat. Second, objects should travel with a speed much less than the speed of light (the spatial four-velocity components {u^{i}\ll1} for {i=x,y,z}). The second condition implies that the only non-negligible component of the stress-energy tensor {T^{ij}} is {T^{tt}}. For example, for a perfect fluid, we can assume that it’s effectively at rest, so the off-diagonal elements are all zero. For the diagonal spatial components, we have (for {T^{zz}}; the other two components have analogous formulas):

\displaystyle  T^{zz}=\frac{1}{L^{3}}\int dp^{x}\int dp^{y}\int\left(p^{z}\right)^{2}\frac{N\left(p\right)}{p^{t}}dp^{z} \ \ \ \ \ (2)

This equation is for a cubic volume of side length {L} containing {N\left(p\right)} particles of momentum {p}. Since {p^{z}=mu^{z}} and {u^{z}\ll1}, {T^{zz}\approx0} (the requirement of a weak gravitational field means that {N\left(p\right)} can’t be very large, since we can’t have that much mass). Since the spatial diagonal elements are {T^{ii}=P} (the pressure) and {T^{tt}=\rho} (the energy density), this translates to the third condition: {\rho\gg P}.

With these assumptions, we can try to show that the relativistic equation of geodesic deviation reduces to the Newtonian version. That is, we want to show that

\displaystyle  \ddot{n}^{i}=-R_{\; j\ell m}^{i}u^{m}u^{j}n^{\ell} \ \ \ \ \ (3)


reduces to

\displaystyle  \ddot{n}^{i}=-\eta^{ij}\left(\partial_{k}\partial_{j}\Phi\right)n^{k}\mbox{ (for }i,j,k=x,y,z\mbox{)} \ \ \ \ \ (4)


where {n^{i}} is the separation of two infinitesimally close geodesics (so {\ddot{n}^{i}} is the tidal acceleration) and {\Phi} is the Newtonian gravitational potential.

Starting with 3, we can eliminate terms where {m\ne t} or {j\ne t} (because {u^{i}\approx0} for {i\ne t}) to get

\displaystyle  \ddot{n}^{i}=-R_{\; t\ell t}^{i}n^{\ell} \ \ \ \ \ (5)

where we’re now considering only the spatial components: {i,\ell=x,y,z}. Note that summing {\ell} over {x,y,z} is the same as summing it over {t,x,y,z}, because the anti-symmetry of the Riemann tensor under interchange of its last two indices gives {R_{\; ttt}^{i}=-R_{\; ttt}^{i}=0}. Also, because we’re in the non-relativistic limit, the proper time and coordinate time are essentially the same thing: {\tau\approx t}, so the time derivative is with respect to {t}.

Comparing this to 4, we have (renaming {\ell} to {k} in the last equation):

\displaystyle  R_{\; tkt}^{i}\approx\eta^{ij}\left(\partial_{k}\partial_{j}\Phi\right) \ \ \ \ \ (6)
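As a concrete illustration of 4 and 6, here is a minimal sketch that evaluates the tidal matrix {\partial_{k}\partial_{j}\Phi} for a point mass. The potential {\Phi=-GM/r}, the function names and the sample position are assumptions made for this example only; they are not part of the problem.

```python
import math

def potential_hessian(pos, gm=1.0):
    """Second derivatives of Phi = -gm/r at pos (assumed point-mass potential).

    Analytically: d_j d_k Phi = gm * (delta_jk / r^3 - 3 x_j x_k / r^5).
    """
    x = list(pos)
    r = math.sqrt(sum(c * c for c in x))
    hess = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        for k in range(3):
            delta = 1.0 if j == k else 0.0
            hess[j][k] = gm * (delta / r**3 - 3.0 * x[j] * x[k] / r**5)
    return hess

def tidal_acceleration(pos, n, gm=1.0):
    """Newtonian geodesic deviation: a^i = -(d_i d_k Phi) n^k (spatial eta^ij = delta^ij)."""
    hess = potential_hessian(pos, gm)
    return [-sum(hess[i][k] * n[k] for k in range(3)) for i in range(3)]

# Sample point on the z axis at r = 2 (units with GM = 1).
position = (0.0, 0.0, 2.0)
hessian = potential_hessian(position)
laplacian = hessian[0][0] + hessian[1][1] + hessian[2][2]
print("Laplacian of Phi away from the mass:", laplacian)
print("radial separation:", tidal_acceleration(position, (0.0, 0.0, 1.0)))
print("transverse separation:", tidal_acceleration(position, (1.0, 0.0, 0.0)))
```

The trace of the matrix vanishes away from the mass, which is the vacuum case {\rho=0} of Newton’s law, while a radial separation is stretched and a transverse one is squeezed.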

Newton’s law of gravity in differential form is

\displaystyle   \nabla^{2}\Phi \displaystyle  = \displaystyle  4\pi G\rho\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  \eta^{ij}\partial_{i}\partial_{j}\Phi\ \ \ \ \ (8)
\displaystyle  \displaystyle  \approx \displaystyle  R_{\; tit}^{i}\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  R_{tt} \ \ \ \ \ (10)

(Again, the contraction over index {i} in the second to last line can be taken over 3 or 4 coordinates, since {R_{\; ttt}^{i}=0}.) We can raise both indices on the Ricci tensor in the usual way:

\displaystyle   R^{tt} \displaystyle  = \displaystyle  g^{ti}g^{tj}R_{ij}\ \ \ \ \ (11)
\displaystyle  \displaystyle  \approx \displaystyle  \eta^{ti}\eta^{tj}R_{ij}\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  \left(-1\right)^{2}R_{tt}\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  R_{tt} \ \ \ \ \ (14)

so we can write 10 in upper index form as

\displaystyle  \nabla^{2}\Phi\approx R^{tt} \ \ \ \ \ (15)

Now looking back at 1 and using the condition that {T^{tt}=\rho} is the only significant entry in the stress-energy tensor, we have

\displaystyle  T=g_{ij}T^{ij}=\eta_{ij}T^{ij}=-\rho \ \ \ \ \ (16)

so

\displaystyle   R^{tt} \displaystyle  = \displaystyle  \kappa\left(T^{tt}-\frac{1}{2}\eta^{tt}T\right)+\Lambda\eta^{tt}\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  \frac{\kappa}{2}\rho-\Lambda\ \ \ \ \ (18)
\displaystyle  \displaystyle  \approx \displaystyle  \nabla^{2}\Phi\ \ \ \ \ (19)
\displaystyle  \displaystyle  = \displaystyle  4\pi G\rho \ \ \ \ \ (20)

Comparing the second and fourth lines, we see that {\Lambda\approx0} and

\displaystyle  \kappa=8\pi G \ \ \ \ \ (21)

and the Einstein equation becomes

\displaystyle  R^{ij}=8\pi G\left(T^{ij}-\frac{1}{2}g^{ij}T\right) \ \ \ \ \ (22)


or in its original form

\displaystyle  G^{ij}=8\pi GT^{ij} \ \ \ \ \ (23)

Actually, we can’t take {\Lambda=0}; all we can say is that for gravitational systems on the scale of the solar system (where Newtonian theory works well, except in the case of the orbit of Mercury) {\Lambda\ll4\pi G\rho}. To get a feel for how small {\Lambda} needs to be, suppose we have a spherical gravitational potential in empty space ({\rho=0}). Then in spherical coordinates

\displaystyle   \nabla^{2}\Phi \displaystyle  = \displaystyle  \frac{1}{r^{2}}\frac{d}{dr}\left(r^{2}\frac{d\Phi}{dr}\right)\ \ \ \ \ (24)
\displaystyle  \displaystyle  \approx \displaystyle  -\Lambda\ \ \ \ \ (25)
\displaystyle  \frac{d}{dr}\left(r^{2}\frac{d\Phi}{dr}\right) \displaystyle  \approx \displaystyle  -\Lambda r^{2}\ \ \ \ \ (26)
\displaystyle  \frac{d\Phi}{dr} \displaystyle  \approx \displaystyle  -\frac{\Lambda}{3}r \ \ \ \ \ (27)

This is the radial component (the only non-zero component) of the gradient of the potential, and the negative gradient of the gravitational potential is the gravitational field, which is the acceleration of gravity. In the solar system, Newton’s theory says that the acceleration due to the sun is

\displaystyle  g=\frac{GM_{s}}{r^{2}} \ \ \ \ \ (28)

so if {\Lambda\ne0} but its effect is not felt in Newton’s theory, we must have

\displaystyle  \frac{\Lambda}{3}r\ll\frac{GM_{s}}{r^{2}} \ \ \ \ \ (29)

in order for Newton’s theory to be valid in the solar system. Distances in the solar system are around {r\approx10^{12}\mbox{ m}}, {GM_{s}\approx1500\mbox{ m}} and {G} in general relativistic units is {7.426\times10^{-28}\mbox{ m kg}^{-1}} so this means

\displaystyle  \frac{\Lambda}{8\pi G}\ll\frac{3\left(1500\right)}{8\pi\left(7.426\times10^{-28}\right)\left(10^{12}\right)^{3}}=2.4\times10^{-7}\mbox{ kg m}^{-3} \ \ \ \ \ (30)
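The arithmetic behind this bound can be checked with a few lines. This is only a sketch using the rounded values quoted above; the constant and function names are made up for the example.

```python
import math

G_GEOM = 7.426e-28   # G in m/kg (general relativistic units, value quoted in the text)
GM_SUN = 1500.0      # G * M_s in metres (rounded value from the text)
R_SOLAR = 1.0e12     # representative solar-system distance in metres

def lambda_bound(gm, r):
    """Scale that Lambda must stay well below, from (Lambda/3) r << GM / r^2."""
    return 3.0 * gm / r**3

def equivalent_density(lam, g=G_GEOM):
    """Express Lambda as a mass density, Lambda / (8 pi G), in kg/m^3."""
    return lam / (8.0 * math.pi * g)

lam_max = lambda_bound(GM_SUN, R_SOLAR)   # in m^-2
rho_max = equivalent_density(lam_max)     # roughly 2.4e-7 kg/m^3
print(f"Lambda << {lam_max:.2e} m^-2")
print(f"Lambda/(8 pi G) << {rho_max:.2e} kg/m^3")
```

Note that the factor of 3 from {\frac{\Lambda}{3}r} carries through to the final number.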

Gravity can’t exist in 3 spacetime dimensions either

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 21; Problem 21.6.

We’ve seen that the Einstein equation doesn’t allow gravity to exist in 2 spacetime dimensions. Here we’ll look at a demonstration that gravity also cannot exist in 3 spacetime dimensions of {t}, {x} and {y}.

Because of the symmetries of the Riemann tensor, there are six independent, (possibly) non-zero components in 3 dimensions. We can take these components to be {R_{xtxt}}, {R_{ytyt}}, {R_{xyxy}}, {R_{xtyt}}, {R_{txxy}} and {R_{tyxy}}. As before, we use the Einstein equation

\displaystyle  R^{ij}=\kappa\left(T^{ij}-\frac{1}{2}g^{ij}T\right)+\Lambda g^{ij} \ \ \ \ \ (1)

with {\Lambda=0} and consider a vacuum so that {T^{ij}=T=0}, meaning that {R^{ij}=0}. We’ll look at a local inertial frame (LIF), where the metric is {g^{ij}=\eta^{ij}}. Then we get

\displaystyle   R_{ij} \displaystyle  = \displaystyle  R_{\; iaj}^{a}\ \ \ \ \ (2)
\displaystyle  \displaystyle  = \displaystyle  \eta^{ak}R_{kiaj}\ \ \ \ \ (3)
\displaystyle  \displaystyle  = \displaystyle  -R_{titj}+R_{xixj}+R_{yiyj} \ \ \ \ \ (4)

Now we look at the 6 independent components of {R_{ij}}. Because {R_{ijkm}=-R_{jikm}=-R_{ijmk}}, any component with either the first two indices or last two indices equal is zero, so we get

\displaystyle   R_{tt} \displaystyle  = \displaystyle  R_{xtxt}+R_{ytyt}=0\ \ \ \ \ (5)
\displaystyle  R_{xx} \displaystyle  = \displaystyle  -R_{txtx}+R_{yxyx}=0\ \ \ \ \ (6)
\displaystyle  R_{yy} \displaystyle  = \displaystyle  -R_{tyty}+R_{xyxy}=0\ \ \ \ \ (7)
\displaystyle  R_{tx} \displaystyle  = \displaystyle  R_{ytyx}=0\ \ \ \ \ (8)
\displaystyle  R_{ty} \displaystyle  = \displaystyle  R_{xtxy}=0\ \ \ \ \ (9)
\displaystyle  R_{xy} \displaystyle  = \displaystyle  -R_{txty}=0 \ \ \ \ \ (10)

The last 3 equations show that 3 of the Riemann components are zero. The first 3 equations can be rewritten using the symmetries of the Riemann tensor:

\displaystyle   R_{tt} \displaystyle  = \displaystyle  R_{xtxt}+R_{ytyt}=0\ \ \ \ \ (11)
\displaystyle  R_{xx} \displaystyle  = \displaystyle  -R_{xtxt}+R_{xyxy}=0\ \ \ \ \ (12)
\displaystyle  R_{yy} \displaystyle  = \displaystyle  -R_{ytyt}+R_{xyxy}=0 \ \ \ \ \ (13)

Solving these equations gives

\displaystyle  R_{xtxt}=R_{ytyt}=R_{xyxy}=0 \ \ \ \ \ (14)

Thus all 6 components of the Riemann tensor are zero, showing that 3d spacetime must be flat and gravity cannot exist in 3 spacetime dimensions. (As usual, a tensor equation valid in a LIF is valid in all coordinate systems, so the conclusion is general.)
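The step from 11, 12 and 13 to 14 amounts to saying that a homogeneous linear system has only the trivial solution. A small sketch of that check follows; the matrix layout and names are chosen for this example.

```python
# Rows are the equations R_tt = 0, R_xx = 0, R_yy = 0 (11 to 13);
# columns are the unknowns R_xtxt, R_ytyt, R_xyxy.
COEFFS = [
    [1, 1, 0],
    [-1, 0, 1],
    [0, -1, 1],
]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )

determinant = det3(COEFFS)
# A non-zero determinant means the only solution is all three components equal to zero.
has_only_trivial_solution = determinant != 0
print("determinant:", determinant, "| trivial solution only:", has_only_trivial_solution)
```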

Gravity can’t exist in 2 spacetime dimensions

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 21; Problem 21.5.

One consequence of the Einstein equation is that gravity cannot exist in a vacuum in a universe with fewer than 4 dimensions (3 space and 1 time). Here we’ll look at a demonstration of this for 2 dimensions of {t} and {x}.

Because of the symmetries of the Riemann tensor, there is only one independent, (possibly) non-zero component in 2 dimensions. We can take this component to be {R_{xtxt}}. We start with the Einstein equation

\displaystyle  R^{ij}=\kappa\left(T^{ij}-\frac{1}{2}g^{ij}T\right)+\Lambda g^{ij} \ \ \ \ \ (1)

For this argument, we’ll assume {\Lambda=0} (whether or not it is actually zero is still a subject of debate, but we do know it’s very small). In a vacuum, the stress-energy tensor is zero (since there is no matter or energy), so in a vacuum {R^{ij}=0}. We’ll now see what this implies about the Riemann tensor (which is the only thing we can use to show conclusively whether spacetime is flat or curved). If {R^{ij}=0} then its lowered form is also zero:

\displaystyle  R_{ij}=g_{ia}g_{jb}R^{ab}=0 \ \ \ \ \ (2)

From the definition of the Ricci tensor

\displaystyle   R_{ij} \displaystyle  = \displaystyle  R_{\; iaj}^{a}\ \ \ \ \ (3)
\displaystyle  \displaystyle  = \displaystyle  g^{ak}R_{kiaj}\ \ \ \ \ (4)
\displaystyle  \displaystyle  = \displaystyle  g^{tt}R_{titj}+g^{tx}R_{tixj}+g^{xt}R_{xitj}+g^{xx}R_{xixj} \ \ \ \ \ (5)

Because {R_{ijkm}=-R_{jikm}=-R_{ijmk}}, any component with either the first two indices or last two indices equal is zero. Therefore

\displaystyle   R_{tt} \displaystyle  = \displaystyle  0+0+0+g^{xx}R_{xtxt}=0\ \ \ \ \ (6)
\displaystyle  R_{tx} \displaystyle  = \displaystyle  0+0+g^{xt}R_{xttx}+0=0\ \ \ \ \ (7)
\displaystyle  R_{xt} \displaystyle  = \displaystyle  0+g^{tx}R_{txxt}+0+0=0\ \ \ \ \ (8)
\displaystyle  R_{xx} \displaystyle  = \displaystyle  g^{tt}R_{txtx}+0+0+0=0 \ \ \ \ \ (9)

All four of the Riemann components in these equations are either equal to {R_{xtxt}} or to {-R_{xtxt}} so we see that this component must be zero (since all four components of the metric can’t be zero). Therefore, the Riemann tensor is identically zero and spacetime is flat in the vacuum. Flat spacetime means no gravitational field, so gravity can’t exist in two dimensions.
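A quick way to see why 2 and 3 dimensions are special is to count components. The sketch below uses the standard counting formulas {n^{2}\left(n^{2}-1\right)/12} for the Riemann tensor and {n\left(n+1\right)/2} for a symmetric rank-2 tensor; these formulas are not derived in this post and are quoted here as an assumption of the example.

```python
def riemann_components(n):
    """Independent components of the Riemann tensor in n dimensions (standard count)."""
    return n * n * (n * n - 1) // 12

def ricci_components(n):
    """Independent components of a symmetric rank-2 tensor such as the Ricci tensor."""
    return n * (n + 1) // 2

for dim in (2, 3, 4):
    print(dim, "dimensions:",
          riemann_components(dim), "Riemann components,",
          ricci_components(dim), "Ricci components")
```

In 2 and 3 dimensions the vacuum condition {R_{ij}=0} supplies at least as many equations as there are independent Riemann components, which is consistent with the results found in these two cases. In 4 dimensions there are more Riemann components than Ricci equations, so vacuum curvature is no longer ruled out by this counting.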

Einstein tensor of zero implies a zero Ricci tensor

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 21; Problem 21.2.

We wish to prove that {G^{ij}=0} if and only if {R^{ij}=0}. The Einstein tensor is defined as

\displaystyle G^{ij}\equiv R^{ij}-\frac{1}{2}g^{ij}R \ \ \ \ \ (1)

 

Clearly if the Ricci tensor {R^{ij}=0} then {G^{ij}=0} (since the curvature scalar is the contraction of the Ricci tensor: {R=g_{ij}R^{ij}}). To prove the converse, suppose {G^{ij}=0}. Then we can multiply both sides by {g_{ij}} to get

\displaystyle 0 \displaystyle = \displaystyle g_{ij}R^{ij}-\frac{1}{2}g_{ij}g^{ij}R\ \ \ \ \ (2)
\displaystyle \displaystyle = \displaystyle R-2R\ \ \ \ \ (3)
\displaystyle R \displaystyle = \displaystyle 0 \ \ \ \ \ (4)

since {g_{ij}g^{ij}=4}. With {G^{ij}=0} and {R=0}, 1 tells us that {R^{ij}=0}. QED.
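The same argument can be tried numerically. In this sketch we work in a local inertial frame so that the metric is {\eta^{ij}}, and the sample Ricci components are arbitrary numbers picked for the example; the function names are likewise hypothetical.

```python
# Minkowski metric in a local inertial frame; eta_ij and eta^ij have the same components.
ETA = [[0.0] * 4 for _ in range(4)]
ETA[0][0] = -1.0
for i in range(1, 4):
    ETA[i][i] = 1.0

def contract(tensor_up):
    """Scalar g_ij A^ij for a tensor with upper indices."""
    return sum(ETA[i][j] * tensor_up[i][j] for i in range(4) for j in range(4))

def einstein_from_ricci(ricci_up):
    """G^ij = R^ij - (1/2) g^ij R."""
    r = contract(ricci_up)
    return [[ricci_up[i][j] - 0.5 * ETA[i][j] * r for j in range(4)] for i in range(4)]

def ricci_from_einstein(einstein_up):
    """Invert the definition using g_ij G^ij = -R (valid in 4 dimensions)."""
    g = contract(einstein_up)
    return [[einstein_up[i][j] - 0.5 * ETA[i][j] * g for j in range(4)] for i in range(4)]

# Arbitrary symmetric sample values, for illustration only.
sample_ricci = [
    [2.0, 0.5, 0.0, 1.0],
    [0.5, 1.0, 0.3, 0.0],
    [0.0, 0.3, -1.0, 0.2],
    [1.0, 0.0, 0.2, 4.0],
]
sample_einstein = einstein_from_ricci(sample_ricci)
recovered = ricci_from_einstein(sample_einstein)
print("g_ij G^ij =", contract(sample_einstein), " -R =", -contract(sample_ricci))
```

Because the map from {R^{ij}} to {G^{ij}} can be inverted, one tensor vanishes exactly when the other does.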

Einstein equation for a perfect fluid

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 21; Problem 21.3.

The Einstein equation can be written as

\displaystyle  R^{ij}=\kappa\left(T^{ij}-\frac{1}{2}g^{ij}T\right)+\Lambda g^{ij} \ \ \ \ \ (1)

For a perfect fluid the stress-energy tensor is

\displaystyle  T^{ij}=\left(\rho_{0}+P_{0}\right)u^{i}u^{j}+P_{0}g^{ij} \ \ \ \ \ (2)


where {u^{i}} is the four-velocity, {\rho_{0}} is the energy density and {P_{0}} is the pressure.
We have for the stress-energy scalar:

\displaystyle   T \displaystyle  = \displaystyle  \left(\rho_{0}+P_{0}\right)g_{ij}u^{i}u^{j}+P_{0}g_{ij}g^{ij}\ \ \ \ \ (3)
\displaystyle  \displaystyle  = \displaystyle  -\left(\rho_{0}+P_{0}\right)+4P_{0}\ \ \ \ \ (4)
\displaystyle  \displaystyle  = \displaystyle  3P_{0}-\rho_{0} \ \ \ \ \ (5)

since {g_{ij}g^{ij}=4} and {g_{ij}u^{i}u^{j}=-1} in any coordinate system.

The stress energy term in 1 is therefore

\displaystyle   T^{ij}-\frac{1}{2}g^{ij}T \displaystyle  = \displaystyle  \left(\rho_{0}+P_{0}\right)u^{i}u^{j}+P_{0}g^{ij}-\frac{1}{2}g^{ij}\left(3P_{0}-\rho_{0}\right)\ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  \left(\rho_{0}+P_{0}\right)u^{i}u^{j}+\frac{1}{2}g^{ij}\left(\rho_{0}-P_{0}\right) \ \ \ \ \ (7)

In a locally orthonormal frame (LOF) in which the fluid is at rest, {u^{t}=1}, {u^{i}=0} for {i=x,y,z} and {g^{ij}=\eta^{ij}}, so

\displaystyle   T^{tt}-\frac{1}{2}g^{tt}T \displaystyle  = \displaystyle  \rho_{0}+P_{0}-\frac{1}{2}\left(\rho_{0}-P_{0}\right)\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\left(\rho_{0}+3P_{0}\right)\ \ \ \ \ (9)
\displaystyle  T^{xx}-\frac{1}{2}g^{xx}T \displaystyle  = \displaystyle  \frac{1}{2}\left(\rho_{0}-P_{0}\right)\ \ \ \ \ (10)
\displaystyle  T^{yy}-\frac{1}{2}g^{yy}T \displaystyle  = \displaystyle  \frac{1}{2}\left(\rho_{0}-P_{0}\right)\ \ \ \ \ (11)
\displaystyle  T^{zz}-\frac{1}{2}g^{zz}T \displaystyle  = \displaystyle  \frac{1}{2}\left(\rho_{0}-P_{0}\right) \ \ \ \ \ (12)
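These four components can be checked by building {T^{ij}} directly from 2 in the rest frame. The sketch below assumes sample values for {\rho_{0}} and {P_{0}} and uses names invented for the example.

```python
ETA_DIAG = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta^ij (same as eta_ij)

def source_term_rest_frame(rho0, p0):
    """Return T^ij - (1/2) g^ij T for a perfect fluid at rest in a LOF."""
    u = [1.0, 0.0, 0.0, 0.0]       # four-velocity of the fluid at rest
    t_up = [
        [(rho0 + p0) * u[i] * u[j] + (p0 * ETA_DIAG[i] if i == j else 0.0)
         for j in range(4)]
        for i in range(4)
    ]
    # Stress-energy scalar T = g_ij T^ij; only diagonal terms contribute here.
    t_scalar = sum(ETA_DIAG[i] * t_up[i][i] for i in range(4))
    return [
        [t_up[i][j] - (0.5 * ETA_DIAG[i] * t_scalar if i == j else 0.0)
         for j in range(4)]
        for i in range(4)
    ]

# Sample values chosen for illustration only.
rho_sample, p_sample = 1.0, 0.2
source = source_term_rest_frame(rho_sample, p_sample)
print("tt:", source[0][0], " expected (rho0 + 3 P0)/2 =", 0.5 * (rho_sample + 3 * p_sample))
print("xx:", source[1][1], " expected (rho0 - P0)/2 =", 0.5 * (rho_sample - p_sample))
```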

Einstein equation: alternative form

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 21; Box 21.3.

The general relativistic generalization of Newton’s law of gravity is

\displaystyle  G^{ij}+\Lambda g^{ij}=\kappa T^{ij} \ \ \ \ \ (1)

where the Einstein tensor is defined in terms of the Ricci tensor and the curvature scalar as

\displaystyle  G^{ij}\equiv R^{ij}-\frac{1}{2}g^{ij}R \ \ \ \ \ (2)

We can write this in a different form that is sometimes easier to use in calculations. Eliminating {G^{ij}} we have

\displaystyle  R^{ij}-\frac{1}{2}g^{ij}R+\Lambda g^{ij}=\kappa T^{ij} \ \ \ \ \ (3)

Multiplying both sides by {g_{ij}} we get

\displaystyle  g_{ij}R^{ij}-\frac{1}{2}g_{ij}g^{ij}R+\Lambda g_{ij}g^{ij}=\kappa g_{ij}T^{ij} \ \ \ \ \ (4)

Because the tensor {g_{ij}} is the inverse of {g^{ij}}, their product gives the identity matrix of rank 4 (this can be seen by doing the calculation in a local inertial frame where {g_{ij}=\eta_{ij}} and noting that since it’s a tensor equation, it’s valid in all coordinate systems). That is

\displaystyle  g_{ij}g^{jk}=\delta_{\; i}^{k} \ \ \ \ \ (5)

so if we contract the {\delta_{\; i}^{k}} tensor we just sum up its diagonal elements, and since these are all 1 (and there are four rows), we get

\displaystyle  \delta_{\; k}^{k}=4 \ \ \ \ \ (6)

Returning to 4 we get

\displaystyle   g_{ij}R^{ij}-2R+4\Lambda \displaystyle  = \displaystyle  \kappa g_{ij}T^{ij} \ \ \ \ \ (7)

Since the curvature scalar is given by

\displaystyle  R\equiv g_{ij}R^{ij} \ \ \ \ \ (8)

and the stress-energy scalar is

\displaystyle  T\equiv g_{ij}T^{ij} \ \ \ \ \ (9)

we get

\displaystyle  -R+4\Lambda=\kappa T \ \ \ \ \ (10)

Multiplying this by {\frac{1}{2}g^{ij}} and subtracting from 3 we have

\displaystyle   R^{ij}-\Lambda g^{ij} \displaystyle  = \displaystyle  \kappa\left(T^{ij}-\frac{1}{2}g^{ij}T\right) \ \ \ \ \ (11)

Isolating the Ricci tensor gives us the alternative form of the Einstein equation:

\displaystyle  \boxed{R^{ij}=\kappa\left(T^{ij}-\frac{1}{2}g^{ij}T\right)+\Lambda g^{ij}} \ \ \ \ \ (12)
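To see that 12 really is equivalent to 1, we can start from a stress-energy tensor, build {R^{ij}} from 12, and check that {G^{ij}+\Lambda g^{ij}} comes back as {\kappa T^{ij}}. The sketch below does this in a local inertial frame for pressureless matter; the numerical values of {\kappa}, {\Lambda} and {\rho} are arbitrary choices for the example.

```python
import math

# Minkowski metric in a local inertial frame; eta_ij and eta^ij have the same components.
ETA = [[0.0] * 4 for _ in range(4)]
ETA[0][0] = -1.0
for i in range(1, 4):
    ETA[i][i] = 1.0

def contract(tensor_up):
    """Scalar g_ij A^ij."""
    return sum(ETA[i][j] * tensor_up[i][j] for i in range(4) for j in range(4))

def ricci_from_source(t_up, kappa, lam):
    """Alternative form: R^ij = kappa (T^ij - (1/2) g^ij T) + Lambda g^ij."""
    t = contract(t_up)
    return [[kappa * (t_up[i][j] - 0.5 * ETA[i][j] * t) + lam * ETA[i][j]
             for j in range(4)] for i in range(4)]

def einstein_lhs(ricci_up, lam):
    """Original form, left-hand side: G^ij + Lambda g^ij."""
    r = contract(ricci_up)
    return [[ricci_up[i][j] - 0.5 * ETA[i][j] * r + lam * ETA[i][j]
             for j in range(4)] for i in range(4)]

# Illustrative inputs: dust with T^tt = rho, units with G = 1.
kappa, lam, rho = 8.0 * math.pi, 0.1, 2.0
t_dust = [[0.0] * 4 for _ in range(4)]
t_dust[0][0] = rho

ricci = ricci_from_source(t_dust, kappa, lam)
lhs = einstein_lhs(ricci, lam)
print("R^tt =", ricci[0][0], " kappa*rho/2 - Lambda =", 0.5 * kappa * rho - lam)
print("G^tt + Lambda g^tt =", lhs[0][0], " kappa T^tt =", kappa * rho)
```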

Einstein tensor and Einstein equation

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 21; Box 21.2.

Our first attempt at constructing a tensor generalization of Newton’s law of gravitation failed because the total derivative of the Ricci tensor is not, in general, zero. Recall that the general idea is to look for an equation of form

\displaystyle  G^{ij}=\kappa T^{ij} \ \ \ \ \ (1)


where {G^{ij}} depends on the Riemann tensor and the metric. At this stage, there’s no particular reason to select any special form for this tensor, but if possible, we’d like it to be linear in the Riemann tensor. Also remember that {G^{ij}} must satisfy the conditions:

  1. It must be symmetric: ({G^{ij}=G^{ji}}).
  2. It must be of rank 2.
  3. Its total derivative must be zero: {\nabla_{j}G^{ij}=0}.

To that end, let’s try a formula as follows:

\displaystyle  R^{ij}+bg^{ij}R+\Lambda g^{ij} \ \ \ \ \ (2)


where {R} is the curvature scalar

\displaystyle  R=g^{ab}R_{ab} \ \ \ \ \ (3)

and {b} and {\Lambda} are constants. Note that although {R} is a scalar, it is not a constant, so we can’t just merge the last two terms.

This form for {G^{ij}} satisfies the first two conditions above, since everything on the RHS is of rank 2 and is also symmetric. So it just remains to show that the total derivative is zero. Since {\nabla_{j}g^{ij}=0} always, the problem reduces to showing that

\displaystyle  \nabla_{j}\left(R^{ij}+bg^{ij}R\right)=0 \ \ \ \ \ (4)

To do this, we start with the Bianchi identity

\displaystyle  \nabla_{s}R_{abmn}+\nabla_{n}R_{absm}+\nabla_{m}R_{abns}=0 \ \ \ \ \ (5)

Multiplying through by {g^{gs}g^{am}g^{bn}} we get (the metrics can be taken inside the derivatives since their derivatives are zero, so they act as constants):

\displaystyle  \nabla_{s}g^{gs}g^{am}g^{bn}R_{abmn}+\nabla_{n}g^{gs}g^{am}g^{bn}R_{absm}+\nabla_{m}g^{gs}g^{am}g^{bn}R_{abns}=0 \ \ \ \ \ (6)

In the first term, we have

\displaystyle   g^{am}g^{bn}R_{abmn} \displaystyle  = \displaystyle  g^{bn}R_{\; bmn}^{m}\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  g^{bn}R_{bn}\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  R \ \ \ \ \ (9)

so we get

\displaystyle  \nabla_{s}g^{gs}R+\nabla_{n}g^{gs}g^{am}g^{bn}R_{absm}+\nabla_{m}g^{gs}g^{am}g^{bn}R_{abns}=0 \ \ \ \ \ (10)

We can now use the symmetries of the Riemann tensor to simplify the last two terms. In the second term, we use {R_{absm}=R_{smab}=-R_{msab}} and in the third term we use {R_{abns}=R_{nsab}=-R_{nsba}}:

\displaystyle  \nabla_{s}g^{gs}R-\nabla_{n}g^{gs}g^{am}g^{bn}R_{msab}-\nabla_{m}g^{gs}g^{am}g^{bn}R_{nsba}=0 \ \ \ \ \ (11)

The Ricci tensor with lowered indices is

\displaystyle  R_{ab}=g^{cd}R_{dacb} \ \ \ \ \ (12)

so to raise both its indices we have

\displaystyle  R^{gs}=g^{ga}g^{sb}g^{mn}R_{namb} \ \ \ \ \ (13)

Comparing this with the second term in 11 we can map the indices in that term as follows: {m\rightarrow n}, {s\rightarrow a}, {a\rightarrow m}, {n\rightarrow s} and {b\rightarrow b}. Thus the second term is the same as

\displaystyle   \nabla_{n}g^{gs}g^{am}g^{bn}R_{msab} \displaystyle  = \displaystyle  \nabla_{s}g^{ga}g^{sb}g^{mn}R_{namb}\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  \nabla_{s}R^{gs} \ \ \ \ \ (15)

Similarly, in the third term we can map {n\rightarrow n}, {s\rightarrow a}, {b\rightarrow m}, {a\rightarrow b} and {m\rightarrow s} to get

\displaystyle   \nabla_{m}g^{gs}g^{am}g^{bn}R_{nsba} \displaystyle  = \displaystyle  \nabla_{s}g^{ga}g^{sb}g^{mn}R_{namb}\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  \nabla_{s}R^{gs} \ \ \ \ \ (17)

Thus 11 becomes

\displaystyle  \nabla_{s}g^{gs}R-2\nabla_{s}R^{gs}=0 \ \ \ \ \ (18)

or

\displaystyle  \nabla_{s}\left(R^{gs}-\frac{1}{2}g^{gs}R\right)=0 \ \ \ \ \ (19)

Thus from 2 we can satisfy {\nabla_{j}\left(R^{ij}+bg^{ij}R+\Lambda g^{ij}\right)=0} if {b=-\frac{1}{2}}. In practice, {G^{ij}} is defined as just the first two terms:

\displaystyle  \boxed{G^{ij}\equiv R^{ij}-\frac{1}{2}g^{ij}R} \ \ \ \ \ (20)

This is known as the Einstein tensor and the equation

\displaystyle  \boxed{G^{ij}+\Lambda g^{ij}=\kappa T^{ij}} \ \ \ \ \ (21)

is the Einstein equation. This is the general relativistic replacement for Newton’s law of gravity.

Einstein equation: trying the Ricci tensor as a solution

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 21; Box 21.1.

We can now start looking at a derivation of the Einstein equation, which is the generalization of Newton’s formula for the gravitational force. In Newtonian theory, gravity is an attractive, conservative, inverse-square force so (apart from the sign) it is mathematically identical to the electrostatic force, which means we can write a differential form of Newton’s gravitational theory using Gauss’s law. That is

\displaystyle \nabla\cdot\mathbf{g}=-4\pi G\rho \ \ \ \ \ (1)

where {\mathbf{g}} is the gravitational field, {\rho} is the mass density and {G} is the gravitational constant. The minus sign occurs because gravity is attractive, whereas the electric force for like charges is repulsive. Because the force is conservative, {\mathbf{g}} can be written as the gradient of a potential so an alternative form of the equation is

\displaystyle \nabla\cdot\left(-\nabla\Phi\right)=-\nabla^{2}\Phi=-4\pi G\rho \ \ \ \ \ (2)

or

\displaystyle \boxed{\nabla^{2}\Phi=4\pi G\rho} \ \ \ \ \ (3)

 

The derivation of the Einstein equation is, like so many derivations in relativity, based on a plausibility argument. We want to find a tensor equation that generalizes Newton’s equation, and we want this tensor equation to reduce to Newton’s equation in the weak field limit.

First, the generalization of the Newtonian mass density {\rho} is the stress-energy tensor {T^{ij}}, so we can try replacing the RHS of 3 by {\kappa T^{ij}}, where {\kappa} is a scalar constant. Since the RHS is now a rank-2 tensor, the LHS must also be a rank-2 tensor, so we must have an equation like

\displaystyle G^{ij}=\kappa T^{ij} \ \ \ \ \ (4)

 


where the form of {G^{ij}} needs to be determined. To do this, think about what we want the theory to do. The idea behind general relativity is that the energy density in a region of space should determine the curvature of the space in that region. The Riemann tensor and the metric tensor describe the curvature of space-time, so it makes sense that {G^{ij}} could depend on these two tensors.

Suppose we try to express {G^{ij}} solely in terms of the Riemann tensor. What other constraints can we impose to narrow things down? First, since the Riemann tensor is rank 4 and {G^{ij}} is rank 2, we’ll need to contract the Riemann tensor to get rid of 2 of its indices. One candidate is the Ricci tensor, defined as the contraction of the Riemann tensor over its first and third indices:

\displaystyle R_{\; bac}^{a} \displaystyle = \displaystyle g^{ad}R_{dbac}\ \ \ \ \ (5)
\displaystyle \displaystyle \equiv \displaystyle R_{bc} \ \ \ \ \ (6)

To use {R_{bc}}, we need to raise both its indices, so we get

\displaystyle R^{ij} \displaystyle = \displaystyle g^{ib}g^{jc}R_{bc}\ \ \ \ \ (7)
\displaystyle \displaystyle = \displaystyle g^{ib}g^{jc}R_{cb}\ \ \ \ \ (8)
\displaystyle \displaystyle = \displaystyle g^{ic}g^{jb}R_{bc}\ \ \ \ \ (9)
\displaystyle \displaystyle = \displaystyle R^{ji} \ \ \ \ \ (10)

where in the second line we’ve used the symmetry of the Ricci tensor ({R_{bc}=R_{cb}}) and in the third line we’ve swapped the dummy indices {b} and {c}. Since {T^{ij}} is symmetric, {G^{ij}} must also be symmetric, and since {R^{ij}=R^{ji}}, this condition is satisfied.

From conservation of energy and momentum, we know that {\nabla_{i}T^{ij}=0}, so we must also have {\nabla_{i}G^{ij}=0} (since {\kappa} is a constant). This is where we run into a snag. The condition {\nabla_{i}G^{ij}=0} must apply everywhere, in every reference frame, so it must apply at the origin of a locally inertial frame (LIF). In a LIF, the Riemann tensor reduces to

\displaystyle R_{nj\ell m}=\frac{1}{2}\left(\partial_{\ell}\partial_{j}g_{mn}+\partial_{m}\partial_{n}g_{j\ell}-\partial_{\ell}\partial_{n}g_{jm}-\partial_{m}\partial_{j}g_{\ell n}\right) \ \ \ \ \ (11)

The Ricci tensor is then

\displaystyle R^{ik} \displaystyle = \displaystyle g^{ij}g^{km}R_{jm}\ \ \ \ \ (12)
\displaystyle \displaystyle = \displaystyle g^{ij}g^{km}g^{\ell n}R_{nj\ell m}\ \ \ \ \ (13)
\displaystyle \displaystyle = \displaystyle \frac{1}{2}g^{ij}g^{km}g^{\ell n}\left(\partial_{\ell}\partial_{j}g_{mn}+\partial_{m}\partial_{n}g_{j\ell}-\partial_{\ell}\partial_{n}g_{jm}-\partial_{m}\partial_{j}g_{\ell n}\right) \ \ \ \ \ (14)

In a LIF, the first derivatives of {g^{ij}} are all zero (by definition of the LIF), so

\displaystyle \nabla_{i}R^{ik}=\frac{1}{2}g^{ij}g^{km}g^{\ell n}\nabla_{i}\left(\partial_{\ell}\partial_{j}g_{mn}+\partial_{m}\partial_{n}g_{j\ell}-\partial_{\ell}\partial_{n}g_{jm}-\partial_{m}\partial_{j}g_{\ell n}\right) \ \ \ \ \ (15)

In a LIF, the total derivative {\nabla_{i}} reduces to the ordinary derivative {\partial_{i}} so we get

\displaystyle \nabla_{i}R^{ik}=\frac{1}{2}g^{ij}g^{km}g^{\ell n}\left(\partial_{i}\partial_{\ell}\partial_{j}g_{mn}+\partial_{i}\partial_{m}\partial_{n}g_{j\ell}-\partial_{i}\partial_{\ell}\partial_{n}g_{jm}-\partial_{i}\partial_{m}\partial_{j}g_{\ell n}\right) \ \ \ \ \ (16)

The indexes in the first term can be relabelled by swapping {i} with {\ell} and {j} with {n} to give

\displaystyle \frac{1}{2}g^{ij}g^{km}g^{\ell n}\partial_{i}\partial_{\ell}\partial_{j}g_{mn}=\frac{1}{2}g^{\ell n}g^{km}g^{ij}\partial_{\ell}\partial_{i}\partial_{n}g_{mj} \ \ \ \ \ (17)

which is the negative of the third term (since {g_{mj}=g_{jm}} and the order of the partial derivatives doesn’t matter), so these two terms cancel and we’re left with

\displaystyle \nabla_{i}R^{ik}=\frac{1}{2}g^{ij}g^{km}g^{\ell n}\left(\partial_{i}\partial_{m}\partial_{n}g_{j\ell}-\partial_{i}\partial_{m}\partial_{j}g_{\ell n}\right) \ \ \ \ \ (18)

The only way this can be identically zero is if we could swap {n} with {j} in the first term and have it equal the negative of the second term. However, if we try this, we get

\displaystyle \frac{1}{2}g^{ij}g^{km}g^{\ell n}\partial_{i}\partial_{m}\partial_{n}g_{j\ell}=\frac{1}{2}g^{in}g^{km}g^{\ell j}\partial_{i}\partial_{m}\partial_{j}g_{n\ell} \ \ \ \ \ (19)

The partial derivatives match up with those in the second term, but the product of the three metric tensors doesn’t, so in general this isn’t zero, meaning that

\displaystyle \nabla_{i}R^{ik}\ne0 \ \ \ \ \ (20)

Thus setting {G^{ij}=R^{ij}} won’t work, and we’ll need to try something else.

Rectangular resonant cavity

References: Griffiths, David J. (2007), Introduction to Electrodynamics, 3rd Edition; Pearson Education – Problem 9.38.

A wave guide consisting of a completely enclosed volume is known as a resonant cavity. The simplest resonant cavity is created by taking a rectangular wave guide and closing off the ends to form a rectangular box with dimensions of {a} in the {x} direction, {b} in the {y} direction and {d} in the {z} direction. To find the fields, we can’t just assume that the wave propagates in the {z} direction with the same {z} dependence for all three components of each field, that is, we can’t take the waves to be

\displaystyle   \tilde{\mathbf{E}} \displaystyle  = \displaystyle  \tilde{\mathbf{E}}_{0}\left(x,y\right)e^{i\left(kz-\omega t\right)}\ \ \ \ \ (1)
\displaystyle  \tilde{\mathbf{B}} \displaystyle  = \displaystyle  \tilde{\mathbf{B}}_{0}\left(x,y\right)e^{i\left(kz-\omega t\right)} \ \ \ \ \ (2)

This time, there has to be an explicit dependence on {z} in {\tilde{\mathbf{E}}_{0}} and {\tilde{\mathbf{B}}_{0}}, so we take the waves to be

\displaystyle   \tilde{\mathbf{E}} \displaystyle  = \displaystyle  \tilde{\mathbf{E}}_{0}\left(x,y,z\right)e^{-i\omega t}\ \ \ \ \ (3)
\displaystyle  \tilde{\mathbf{B}} \displaystyle  = \displaystyle  \tilde{\mathbf{B}}_{0}\left(x,y,z\right)e^{-i\omega t} \ \ \ \ \ (4)

We can apply the two curl Maxwell equations to get

\displaystyle   \nabla\times\tilde{\mathbf{E}} \displaystyle  = \displaystyle  -\frac{\partial\tilde{\mathbf{B}}}{\partial t}\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  i\omega\tilde{\mathbf{B}}\ \ \ \ \ (6)
\displaystyle  \nabla\times\tilde{\mathbf{B}} \displaystyle  = \displaystyle  \frac{1}{c^{2}}\frac{\partial\tilde{\mathbf{E}}}{\partial t}\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  -i\frac{\omega}{c^{2}}\tilde{\mathbf{E}} \ \ \ \ \ (8)

As the curl affects only the spatial coordinates, we can cancel off {e^{-i\omega t}} from both sides of these equations to get

\displaystyle   \nabla\times\tilde{\mathbf{E}}_{0} \displaystyle  = \displaystyle  i\omega\tilde{\mathbf{B}}_{0}\ \ \ \ \ (9)
\displaystyle  \nabla\times\tilde{\mathbf{B}}_{0} \displaystyle  = \displaystyle  -i\frac{\omega}{c^{2}}\tilde{\mathbf{E}}_{0} \ \ \ \ \ (10)

Taking the curl of the first of these equations, we get

\displaystyle   \nabla\times\left(\nabla\times\tilde{\mathbf{E}}_{0}\right) \displaystyle  = \displaystyle  \nabla\left(\nabla\cdot\tilde{\mathbf{E}}_{0}\right)-\nabla^{2}\tilde{\mathbf{E}}_{0}\ \ \ \ \ (11)
\displaystyle  \displaystyle  = \displaystyle  -\nabla^{2}\tilde{\mathbf{E}}_{0}\ \ \ \ \ (12)
\displaystyle  \displaystyle  = \displaystyle  i\omega\nabla\times\tilde{\mathbf{B}}_{0}\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  \frac{\omega^{2}}{c^{2}}\tilde{\mathbf{E}}_{0} \ \ \ \ \ (14)

Therefore we get

\displaystyle  \nabla^{2}\tilde{\mathbf{E}}_{0}=-\frac{\omega^{2}}{c^{2}}\tilde{\mathbf{E}}_{0} \ \ \ \ \ (15)

giving the same differential equation for each component. We can use separation of variables to solve them. For the {x} component, let {E_{x}=X\left(x\right)Y\left(y\right)Z\left(z\right)}. Then

\displaystyle   X^{\prime\prime}YZ+XY^{\prime\prime}Z+XYZ^{\prime\prime} \displaystyle  = \displaystyle  -\frac{\omega^{2}}{c^{2}}XYZ\ \ \ \ \ (16)
\displaystyle  \frac{X^{\prime\prime}}{X}+\frac{Y^{\prime\prime}}{Y}+\frac{Z^{\prime\prime}}{Z} \displaystyle  = \displaystyle  -\frac{\omega^{2}}{c^{2}} \ \ \ \ \ (17)

As usual, each term on the LHS must be equal to a constant, and the solution of the resulting 3 equations is a sum of a sine and a cosine, so we have

\displaystyle   X\left(x\right) \displaystyle  = \displaystyle  A\sin k_{xx}x+H\cos k_{xx}x\ \ \ \ \ (18)
\displaystyle  Y\left(y\right) \displaystyle  = \displaystyle  C\sin k_{xy}y+D\cos k_{xy}y\ \ \ \ \ (19)
\displaystyle  Z\left(z\right) \displaystyle  = \displaystyle  F\sin k_{xz}z+G\cos k_{xz}z \ \ \ \ \ (20)

with similar equations for the other components of {\mathbf{E}}. The double subscript such as {k_{xy}} means that this {k} belongs to {E_{x}} in the function {Y\left(y\right)}.

The boundary conditions are

\displaystyle   \mathbf{E}^{\parallel} \displaystyle  = \displaystyle  0\ \ \ \ \ (21)
\displaystyle  B^{\perp} \displaystyle  = \displaystyle  0 \ \ \ \ \ (22)

at all boundaries, so we can use this to constrain the {k_{i}}s. {E_{x}} must be zero at {y=0,b} and {z=0,d} which means

\displaystyle   D \displaystyle  = \displaystyle  G=0\ \ \ \ \ (23)
\displaystyle  k_{xy} \displaystyle  = \displaystyle  \frac{n\pi}{b}\ \ \ \ \ (24)
\displaystyle  k_{xz} \displaystyle  = \displaystyle  \frac{\ell\pi}{d} \ \ \ \ \ (25)

We can’t, at this stage, put any constraints on {k_{xx}} since it doesn’t figure in any of the boundary conditions. Putting it all together, and condensing the constants into {X\left(x\right)}, we get

\displaystyle  E_{x}=\left(A\sin k_{xx}x+H\cos k_{xx}x\right)\sin\frac{n\pi}{b}y\sin\frac{\ell\pi}{d}z \ \ \ \ \ (26)

For {E_{y}} we get solutions of the same form as 18, 19 and 20 above, with {k_{yx}}, {k_{yy}} and {k_{yz}} in place of {k_{xx}}, {k_{xy}} and {k_{xz}}. {E_{y}} must be zero at {x=0,a} and {z=0,d} so we get

\displaystyle   H \displaystyle  = \displaystyle  G=0\ \ \ \ \ (27)
\displaystyle  k_{yx} \displaystyle  = \displaystyle  \frac{m\pi}{a}\ \ \ \ \ (28)
\displaystyle  k_{yz} \displaystyle  = \displaystyle  \frac{\ell\pi}{d} \ \ \ \ \ (29)

This gives

\displaystyle  E_{y}=\left(C\sin k_{yy}y+D\cos k_{yy}y\right)\sin\frac{m\pi}{a}x\sin\frac{\ell\pi}{d}z \ \ \ \ \ (30)

Finally, for {E_{z}} the boundary conditions require it to be zero at {x=0,a} and {y=0,b} so we get

\displaystyle   H \displaystyle  = \displaystyle  D=0\ \ \ \ \ (31)
\displaystyle  k_{zx} \displaystyle  = \displaystyle  \frac{m\pi}{a}\ \ \ \ \ (32)
\displaystyle  k_{zy} \displaystyle  = \displaystyle  \frac{n\pi}{b} \ \ \ \ \ (33)

This gives

\displaystyle  E_{z}=\left(F\sin k_{zz}z+G\cos k_{zz}z\right)\sin\frac{m\pi}{a}x\sin\frac{n\pi}{b}y \ \ \ \ \ (34)

Now we can invoke Gauss’s law in vacuum, which states that {\nabla\cdot\mathbf{E}=0}. This gives

\displaystyle   \nabla\cdot\mathbf{E} \displaystyle  = \displaystyle  k_{xx}\left(A\cos k_{xx}x-H\sin k_{xx}x\right)\sin\frac{n\pi}{b}y\sin\frac{\ell\pi}{d}z+\ \ \ \ \ (35)
\displaystyle  \displaystyle  \displaystyle  k_{yy}\left(C\cos k_{yy}y-D\sin k_{yy}y\right)\sin\frac{m\pi}{a}x\sin\frac{\ell\pi}{d}z+\nonumber
\displaystyle  \displaystyle  \displaystyle  k_{zz}\left(F\cos k_{zz}z-G\sin k_{zz}z\right)\sin\frac{m\pi}{a}x\sin\frac{n\pi}{b}y\nonumber
\displaystyle  \displaystyle  = \displaystyle  0 \ \ \ \ \ (36)

This equation must be true for all values of {\left(x,y,z\right)}. If we choose {x=0}, the last two terms vanish and we get {A=0}. Similarly, choosing {y=0} gives {C=0} and choosing {z=0} gives {F=0}, so we have

\displaystyle   \nabla\cdot\mathbf{E} \displaystyle  = \displaystyle  -k_{xx}H\sin k_{xx}x\sin\frac{n\pi}{b}y\sin\frac{\ell\pi}{d}z-\ \ \ \ \ (37)
\displaystyle  \displaystyle  \displaystyle  k_{yy}D\sin k_{yy}y\sin\frac{m\pi}{a}x\sin\frac{\ell\pi}{d}z-\nonumber
\displaystyle  \displaystyle  \displaystyle  k_{zz}G\sin k_{zz}z\sin\frac{m\pi}{a}x\sin\frac{n\pi}{b}y\nonumber
\displaystyle  \displaystyle  = \displaystyle  0 \ \ \ \ \ (38)

We should now be able to conclude that arguments of the sine functions for each variable are equal, that is, that {k_{xx}=\frac{m\pi}{a}} and so on. I haven’t been able to find a proof of this, although it seems to have something to do with Fourier analysis. The argument is along the lines of: the only way an expansion of sines can be zero for all points is if they are all the same sine and they add up to zero identically. Given that, we get

\displaystyle   -\left(\frac{m\pi}{a}H+\frac{n\pi}{b}D+\frac{\ell\pi}{d}G\right)\sin\frac{m\pi}{a}x\sin\frac{n\pi}{b}y\sin\frac{\ell\pi}{d}z \displaystyle  = \displaystyle  0\ \ \ \ \ (39)
\displaystyle  \frac{m\pi}{a}H+\frac{n\pi}{b}D+\frac{\ell\pi}{d}G \displaystyle  = \displaystyle  0 \ \ \ \ \ (40)
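
One way to justify matching the sine arguments is the following sketch, which is offered as a plausibility argument rather than a rigorous proof. Fix {y} and {z}, and regard the divergence condition as a function of {x} alone. The symbols {p\equiv k_{xx}} and {q\equiv m\pi/a} are shorthand introduced here, {c_{1}} stands for {-k_{xx}H\sin\frac{n\pi}{b}y\sin\frac{\ell\pi}{d}z} and {c_{2}} collects the {y} and {z} dependence of the other two terms. Differentiating twice and combining with the original equation gives

```latex
\begin{align}
f\left(x\right) & \equiv c_{1}\sin px+c_{2}\sin qx=0\\
f''\left(x\right) & =-p^{2}c_{1}\sin px-q^{2}c_{2}\sin qx=0\\
q^{2}f\left(x\right)+f''\left(x\right) & =\left(q^{2}-p^{2}\right)c_{1}\sin px=0
\end{align}
```

If {p^{2}\ne q^{2}} and {p\ne0}, the last line can hold for all {x} only if {c_{1}=0} for every {y} and {z}, that is, only if {H=0}. A non-zero {H} therefore needs {k_{xx}=\pm\frac{m\pi}{a}}, and the sign makes no difference since the {x} dependence of {E_{x}} is a cosine. The same argument applied to {y} and {z} handles {k_{yy}} and {k_{zz}}.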

We can therefore simplify the notation by defining

\displaystyle   k_{x} \displaystyle  = \displaystyle  \frac{m\pi}{a}\ \ \ \ \ (41)
\displaystyle  k_{y} \displaystyle  = \displaystyle  \frac{n\pi}{b}\ \ \ \ \ (42)
\displaystyle  k_{z} \displaystyle  = \displaystyle  \frac{\ell\pi}{d} \ \ \ \ \ (43)

So the electric field is

\displaystyle   \tilde{\mathbf{E}} \displaystyle  = \displaystyle  He^{-i\omega t}\cos k_{x}x\sin k_{y}y\sin k_{z}z\hat{\mathbf{x}}+De^{-i\omega t}\sin k_{x}x\cos k_{y}y\sin k_{z}z\hat{\mathbf{y}}+\ \ \ \ \ (44)
\displaystyle  \displaystyle  \displaystyle  Ge^{-i\omega t}\sin k_{x}x\sin k_{y}y\cos k_{z}z\hat{\mathbf{z}}\nonumber
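
As a sanity check on the field above, here is a minimal numerical sketch that estimates {\nabla\cdot\mathbf{E}} by central differences for the spatial part of 44. The cavity dimensions, mode numbers, amplitudes and sample point are hypothetical values chosen only for illustration, and the function names are not from the text. With {G} fixed by 40 the divergence should vanish to within the finite-difference error, while an arbitrary {G} should give a non-zero result.

```python
import math


def cavity_field(H, D, G, kx, ky, kz):
    """Spatial part of the electric field in equation (44), time factor dropped."""
    def E(x, y, z):
        return (
            H * math.cos(kx * x) * math.sin(ky * y) * math.sin(kz * z),
            D * math.sin(kx * x) * math.cos(ky * y) * math.sin(kz * z),
            G * math.sin(kx * x) * math.sin(ky * y) * math.cos(kz * z),
        )
    return E


def divergence(E, x, y, z, h=1e-5):
    """Central-difference estimate of div E at the point (x, y, z)."""
    dEx_dx = (E(x + h, y, z)[0] - E(x - h, y, z)[0]) / (2 * h)
    dEy_dy = (E(x, y + h, z)[1] - E(x, y - h, z)[1]) / (2 * h)
    dEz_dz = (E(x, y, z + h)[2] - E(x, y, z - h)[2]) / (2 * h)
    return dEx_dx + dEy_dy + dEz_dz


# Hypothetical cavity dimensions and mode numbers (not from the text).
a, b, d = 1.0, 2.0, 3.0
m, n, l = 1, 1, 1
kx, ky, kz = m * math.pi / a, n * math.pi / b, l * math.pi / d

H, D = 1.0, 2.0                  # arbitrary amplitudes
G = -(kx * H + ky * D) / kz      # third amplitude fixed by equation (40)

point = (0.3, 0.7, 1.1)          # arbitrary interior point
div_constrained = divergence(cavity_field(H, D, G, kx, ky, kz), *point)
div_unconstrained = divergence(cavity_field(H, D, 1.0, kx, ky, kz), *point)
print(div_constrained, div_unconstrained)
```

The first number printed should be tiny compared with the second, which is the behaviour expected if only two of the three amplitudes are independent.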

The magnetic field can be found from the Maxwell equation

\displaystyle   \nabla\times\tilde{\mathbf{E}} \displaystyle  = \displaystyle  -\frac{\partial\tilde{\mathbf{B}}}{\partial t}\ \ \ \ \ (45)
\displaystyle  \displaystyle  = \displaystyle  \left(Gk_{y}-Dk_{z}\right)e^{-i\omega t}\sin k_{x}x\cos k_{y}y\cos k_{z}z\hat{\mathbf{x}}+\ \ \ \ \ (46)
\displaystyle  \displaystyle  \displaystyle  \left(Hk_{z}-Gk_{x}\right)e^{-i\omega t}\cos k_{x}x\sin k_{y}y\cos k_{z}z\hat{\mathbf{y}}+\nonumber
\displaystyle  \displaystyle  \displaystyle  \left(Dk_{x}-Hk_{y}\right)e^{-i\omega t}\cos k_{x}x\cos k_{y}y\sin k_{z}z\hat{\mathbf{z}}\nonumber

Integrating with respect to time gives

\displaystyle   \tilde{\mathbf{B}} \displaystyle  = \displaystyle  -\frac{i}{\omega}\left(Gk_{y}-Dk_{z}\right)e^{-i\omega t}\sin k_{x}x\cos k_{y}y\cos k_{z}z\hat{\mathbf{x}}-\ \ \ \ \ (47)
\displaystyle  \displaystyle  \displaystyle  \frac{i}{\omega}\left(Hk_{z}-Gk_{x}\right)e^{-i\omega t}\cos k_{x}x\sin k_{y}y\cos k_{z}z\hat{\mathbf{y}}-\nonumber
\displaystyle  \displaystyle  \displaystyle  \frac{i}{\omega}\left(Dk_{x}-Hk_{y}\right)e^{-i\omega t}\cos k_{x}x\cos k_{y}y\sin k_{z}z\hat{\mathbf{z}}\nonumber

In all cases, 17 requires that

\displaystyle  k_{x}^{2}+k_{y}^{2}+k_{z}^{2}=\frac{\omega^{2}}{c^{2}} \ \ \ \ \ (48)

so the resonant frequency for mode {mn\ell} is

\displaystyle   \omega_{mn\ell} \displaystyle  = \displaystyle  c\sqrt{\left(\frac{m\pi}{a}\right)^{2}+\left(\frac{n\pi}{b}\right)^{2}+\left(\frac{\ell\pi}{d}\right)^{2}}\ \ \ \ \ (49)
\displaystyle  \displaystyle  = \displaystyle  c\pi\sqrt{\left(\frac{m}{a}\right)^{2}+\left(\frac{n}{b}\right)^{2}+\left(\frac{\ell}{d}\right)^{2}} \ \ \ \ \ (50)
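
To get a feel for the numbers, here is a short sketch that evaluates 50 directly. The cavity dimensions are hypothetical values picked for illustration, and `resonant_omega` is a name introduced here. Note from 44 that at least two of the three mode numbers must be non-zero, otherwise every component of the field vanishes.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def resonant_omega(m, n, l, a, b, d):
    """Angular frequency omega_{mnl} from equation (50) for an a x b x d cavity."""
    return C * math.pi * math.sqrt((m / a) ** 2 + (n / b) ** 2 + (l / d) ** 2)


# Hypothetical cavity dimensions in metres (not from the text).
a, b, d = 0.03, 0.02, 0.04

omega_110 = resonant_omega(1, 1, 0, a, b, d)
freq_110_ghz = omega_110 / (2 * math.pi) / 1e9
print(f"omega_110 = {omega_110:.4e} rad/s, f_110 = {freq_110_ghz:.3f} GHz")
```

For a cavity a few centimetres on a side, the lowest modes come out in the gigahertz (microwave) range.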

Total internal reflection

References: Griffiths, David J. (2007), Introduction to Electrodynamics, 3rd Edition; Pearson Education – Problem 9.37.

When we looked at the behaviour of waves passing from one medium to another at an angle, one of the consequences was Snell’s law of refraction which says

\displaystyle \frac{\sin\theta_{I}}{\sin\theta_{T}}=\frac{n_{2}}{n_{1}} \ \ \ \ \ (1)

where {\theta_{I}} is the angle of incidence (angle between the wave vector and the normal to the surface) in medium 1 with index of refraction {n_{1}}, and {\theta_{T}} is the angle of the refracted wave in medium 2. If {n_{1}>n_{2}}, that is, the wave is incident from a medium (such as water) with a higher index of refraction than medium 2 (such as air), then we can reach a critical incident angle {\theta_{c}} where the refracted angle is {\pi/2}, so that the refracted wave moves parallel to the interface. This happens when

\displaystyle \sin\theta_{c}=\frac{n_{2}}{n_{1}} \ \ \ \ \ (2)

If {\theta_{I}>\theta_{c}}, no wave is transmitted through the interface and the entire wave is reflected back into medium 1. This is known as total internal reflection.
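
As a quick numerical illustration of 2, the sketch below computes the critical angle for light going from water into air. The indices 1.33 and 1.00 are assumed round values for water and air, not numbers taken from the text, and `critical_angle` is a name introduced here.

```python
import math


def critical_angle(n1, n2):
    """Critical angle in radians from sin(theta_c) = n2/n1 (requires n1 > n2)."""
    if n1 <= n2:
        raise ValueError("total internal reflection needs n1 > n2")
    return math.asin(n2 / n1)


# Assumed approximate indices of refraction for water (n1) and air (n2).
theta_c = critical_angle(1.33, 1.00)
theta_c_degrees = math.degrees(theta_c)
print(f"critical angle = {theta_c_degrees:.2f} degrees")
```

With these assumed indices the critical angle is a little under 49 degrees.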

To see what happens in this case, we can follow through the same derivation as before, except we allow the wave vector {\mathbf{k}_{T}} of the transmitted wave to be complex. That is, for a given {\theta_{T}} we say that (assuming that the plane of incidence is the {xz} plane):

\displaystyle \mathbf{k}_{T}=k_{T}\left(\sin\theta_{T}\hat{\mathbf{x}}+\cos\theta_{T}\hat{\mathbf{z}}\right) \ \ \ \ \ (3)

The wave vector has, as usual, the magnitude of

\displaystyle k_{T}=\frac{\omega}{v_{2}}=\frac{\omega n_{2}}{c} \ \ \ \ \ (4)

Now suppose {\theta_{I}>\theta_{c}}. In that case,

\displaystyle \sin\theta_{T} \displaystyle = \displaystyle \frac{n_{1}}{n_{2}}\sin\theta_{I}>\frac{n_{1}}{n_{2}}\frac{n_{2}}{n_{1}}=1\ \ \ \ \ (5)
\displaystyle \cos\theta_{T} \displaystyle = \displaystyle \sqrt{1-\sin^{2}\theta_{T}}\ \ \ \ \ (6)
\displaystyle \displaystyle = \displaystyle i\sqrt{\sin^{2}\theta_{T}-1}\ \ \ \ \ (7)
\displaystyle \displaystyle = \displaystyle i\sqrt{\frac{n_{1}^{2}}{n_{2}^{2}}\sin^{2}\theta_{I}-1}\ \ \ \ \ (8)
\displaystyle \mathbf{k}_{T} \displaystyle = \displaystyle k_{T}\left(\frac{n_{1}}{n_{2}}\sin\theta_{I}\hat{\mathbf{x}}+i\sqrt{\frac{n_{1}^{2}}{n_{2}^{2}}\sin^{2}\theta_{I}-1}\hat{\mathbf{z}}\right)\ \ \ \ \ (9)
\displaystyle \displaystyle = \displaystyle \frac{\omega n_{2}}{c}\left(\frac{n_{1}}{n_{2}}\sin\theta_{I}\hat{\mathbf{x}}+i\sqrt{\frac{n_{1}^{2}}{n_{2}^{2}}\sin^{2}\theta_{I}-1}\hat{\mathbf{z}}\right)\ \ \ \ \ (10)
\displaystyle \displaystyle = \displaystyle k\hat{\mathbf{x}}+i\kappa\hat{\mathbf{z}} \ \ \ \ \ (11)

where

\displaystyle k \displaystyle \equiv \displaystyle \frac{\omega n_{1}}{c}\sin\theta_{I}\ \ \ \ \ (12)
\displaystyle \kappa \displaystyle \equiv \displaystyle \frac{\omega}{c}\sqrt{n_{1}^{2}\sin^{2}\theta_{I}-n_{2}^{2}} \ \ \ \ \ (13)

That is, we venture into the realm of complex variables, and {\theta_{T}} can no longer be interpreted as a geometric angle. However, let’s proceed with the analysis and see what happens.

First, we look at the electric field, which has the general form

\displaystyle \tilde{\mathbf{E}}_{T}\left(\mathbf{r},t\right)=\tilde{\mathbf{E}}_{0_{T}}e^{i\left(\mathbf{k}_{T}\cdot\mathbf{r}-\omega t\right)} \ \ \ \ \ (14)

Plugging in 11, we get

\displaystyle \tilde{\mathbf{E}}_{T}\left(\mathbf{r},t\right)=\tilde{\mathbf{E}}_{0_{T}}e^{-\kappa z}e^{i\left(kx-\omega t\right)} \ \ \ \ \ (15)

That is, the transmitted wave propagates in the {x} direction (parallel to the interface) and is attenuated in the {z} direction.
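
To see how quickly the wave dies off, the sketch below evaluates {k} and {\kappa} from 12 and 13 and the decay length {1/\kappa} over which the amplitude falls by a factor of {e}. The vacuum wavelength, indices of refraction and incident angle are assumed illustrative values, and the function name is introduced here.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def evanescent_wave_numbers(omega, n1, n2, theta_i):
    """Return (k, kappa) from equations (12) and (13); theta_i in radians."""
    s = n1 * math.sin(theta_i)
    if s <= n2:
        raise ValueError("theta_i does not exceed the critical angle")
    k = omega * s / C
    kappa = (omega / C) * math.sqrt(s ** 2 - n2 ** 2)
    return k, kappa


# Assumed values: 500 nm vacuum wavelength, glass-like n1 = 1.5, n2 = 1.0,
# incident angle of 60 degrees (beyond the critical angle for these indices).
wavelength = 500e-9
omega = 2 * math.pi * C / wavelength
n1, n2 = 1.5, 1.0

k, kappa = evanescent_wave_numbers(omega, n1, n2, math.radians(60))
decay_length = 1 / kappa
print(f"k = {k:.4e} 1/m, kappa = {kappa:.4e} 1/m, 1/kappa = {decay_length:.3e} m")
```

For these values the decay length is a fraction of the wavelength, so the transmitted field is confined to a thin layer next to the interface.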

How much of the wave is reflected in this case? For a wave polarized parallel to the plane of incidence (that is, {\mathbf{E}} lies in the {xz} plane), we found earlier that the reflected amplitude is

\displaystyle E_{R}=\frac{\alpha-\beta}{\alpha+\beta}E_{I} \ \ \ \ \ (16)

where

\displaystyle \alpha \displaystyle \equiv \displaystyle \frac{\cos\theta_{T}}{\cos\theta_{I}}\ \ \ \ \ (17)
\displaystyle \beta \displaystyle \equiv \displaystyle \frac{\mu_{1}v_{1}}{\mu_{2}v_{2}}=\frac{\mu_{1}n_{2}}{\mu_{2}n_{1}} \ \ \ \ \ (18)

In this case, {\alpha} is purely imaginary and {\beta} is real, so the reflection coefficient is

\displaystyle R=\left|\frac{E_{R}}{E_{I}}\right|^{2}=\left|\frac{\alpha-\beta}{\alpha+\beta}\right|^{2}=1 \ \ \ \ \ (19)

since

\displaystyle \left|\alpha-\beta\right|^{2}=\left|\alpha\right|^{2}+\beta^{2}=\left|\alpha+\beta\right|^{2} \ \ \ \ \ (20)

For perpendicular polarization, we have

\displaystyle E_{R}=\frac{1-\alpha\beta}{1+\alpha\beta}E_{I} \ \ \ \ \ (21)

and again, since 1 is real and {\alpha\beta} is purely imaginary

\displaystyle R=\left|\frac{1-\alpha\beta}{1+\alpha\beta}\right|^{2}=1 \ \ \ \ \ (22)

Thus the reflection is indeed total for both polarizations.
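
The same conclusion can be checked numerically by letting {\cos\theta_{T}} go complex, as in this minimal sketch. The indices and angles are assumed illustrative values, both permeabilities are set to 1 (only their ratio enters {\beta}), and the function name is introduced here. Python's `cmath.sqrt` returns the root with positive imaginary part for a negative real argument, which matches the sign choice in 7.

```python
import cmath
import math


def reflection_coefficients(n1, n2, theta_i, mu1=1.0, mu2=1.0):
    """Return (R_parallel, R_perpendicular) using equations (16)-(18) and (21)."""
    sin_t = n1 / n2 * math.sin(theta_i)
    cos_t = cmath.sqrt(1 - sin_t ** 2)   # purely imaginary beyond the critical angle
    alpha = cos_t / math.cos(theta_i)
    beta = mu1 * n2 / (mu2 * n1)
    r_parallel = (alpha - beta) / (alpha + beta)
    r_perpendicular = (1 - alpha * beta) / (1 + alpha * beta)
    return abs(r_parallel) ** 2, abs(r_perpendicular) ** 2


# Assumed indices: n1 = 1.5, n2 = 1.0 (critical angle roughly 41.8 degrees).
R_par_above, R_perp_above = reflection_coefficients(1.5, 1.0, math.radians(60))
R_par_below, R_perp_below = reflection_coefficients(1.5, 1.0, math.radians(30))
print(R_par_above, R_perp_above)   # beyond the critical angle
print(R_par_below, R_perp_below)   # below the critical angle
```

Beyond the critical angle both coefficients come out equal to 1 to within rounding, while below it they are less than 1.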

Still with perpendicular polarization, the electric field is entirely in the {y} direction so

\displaystyle \tilde{\mathbf{E}}_{T}=\tilde{E}_{0}e^{-\kappa z}e^{i\left(kx-\omega t\right)}\hat{\mathbf{y}} \ \ \ \ \ (23)

The magnetic field is given by (using 11 and {\mathbf{k}_{T}=\frac{\omega n_{2}}{c}\hat{\mathbf{k}}_{T}} and {v_{2}=c/n_{2}})

\displaystyle \tilde{\mathbf{B}}_{T} \displaystyle = \displaystyle \frac{1}{v_{2}}\hat{\mathbf{k}}_{T}\times\tilde{\mathbf{E}}_{T}\ \ \ \ \ (24)
\displaystyle \displaystyle = \displaystyle \frac{1}{v_{2}}\frac{c}{\omega n_{2}}\tilde{E}_{0}e^{-\kappa z}e^{i\left(kx-\omega t\right)}\left(k\hat{\mathbf{z}}-i\kappa\hat{\mathbf{x}}\right)\ \ \ \ \ (25)
\displaystyle \displaystyle = \displaystyle \frac{\tilde{E}_{0}}{\omega}e^{-\kappa z}e^{i\left(kx-\omega t\right)}\left(k\hat{\mathbf{z}}-i\kappa\hat{\mathbf{x}}\right) \ \ \ \ \ (26)

Taking the real parts to get the actual fields, we have

\displaystyle \mathbf{E}_{T} \displaystyle = \displaystyle E_{0}e^{-\kappa z}\cos\left(kx-\omega t\right)\hat{\mathbf{y}}\ \ \ \ \ (27)
\displaystyle \mathbf{B}_{T} \displaystyle = \displaystyle \frac{E_{0}}{\omega}e^{-\kappa z}\left(k\cos\left(kx-\omega t\right)\hat{\mathbf{z}}+\kappa\sin\left(kx-\omega t\right)\hat{\mathbf{x}}\right) \ \ \ \ \ (28)

To check that these fields satisfy Maxwell’s equations requires grinding away at some derivatives, which we can do in Maple. The divergences are fairly easy and we get

\displaystyle \nabla\cdot\mathbf{E}_{T}=\nabla\cdot\mathbf{B}_{T}=0 \ \ \ \ \ (29)

The curls give

\displaystyle \nabla\times\mathbf{E}_{T} \displaystyle = \displaystyle E_{0}e^{-\kappa z}\left(-k\sin\left(kx-\omega t\right)\hat{\mathbf{z}}+\kappa\cos\left(kx-\omega t\right)\hat{\mathbf{x}}\right)\ \ \ \ \ (30)
\displaystyle \displaystyle = \displaystyle -\frac{\partial\mathbf{B}_{T}}{\partial t}\ \ \ \ \ (31)
\displaystyle \nabla\times\mathbf{B}_{T} \displaystyle = \displaystyle \frac{E_{0}}{\omega}e^{-\kappa z}\sin\left(kx-\omega t\right)\hat{\mathbf{y}}\left(k^{2}-\kappa^{2}\right)\ \ \ \ \ (32)
\displaystyle \displaystyle = \displaystyle \frac{n_{2}^{2}}{c^{2}}\omega E_{0}e^{-\kappa z}\sin\left(kx-\omega t\right)\hat{\mathbf{y}}\ \ \ \ \ (33)
\displaystyle \displaystyle = \displaystyle \frac{1}{v_{2}^{2}}\frac{\partial\mathbf{E}_{T}}{\partial t} \ \ \ \ \ (34)

Thus all four of Maxwell’s equations are satisfied.

The Poynting vector is

\displaystyle \mathbf{S} \displaystyle = \displaystyle \frac{1}{\mu_{2}}\mathbf{E}_{T}\times\mathbf{B}_{T}\ \ \ \ \ (35)
\displaystyle \displaystyle = \displaystyle \frac{E_{0}^{2}e^{-2\kappa z}}{\mu_{2}\omega}\left(k\cos^{2}\left(kx-\omega t\right)\hat{\mathbf{x}}-\kappa\sin\left(kx-\omega t\right)\cos\left(kx-\omega t\right)\hat{\mathbf{z}}\right) \ \ \ \ \ (36)

Averaging this over one cycle ({t=0} to {2\pi/\omega}) gives zero for the {z} component, since {\sin\left(kx-\omega t\right)\cos\left(kx-\omega t\right)} averages to zero. Thus, on average, no energy is transmitted perpendicular to the interface, and the net energy flow in medium 2 is in the {x} direction, parallel to the interface.
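
The cycle average can also be checked numerically, as in the sketch below, which samples the two components of 36 uniformly over one period. All parameter values are hypothetical and in arbitrary units, chosen only for illustration, and the function names are introduced here. The {x} component should average to half its peak value and the {z} component to zero.

```python
import math


def poynting_components(x, z, t, E0, k, kappa, omega, mu):
    """x and z components of the Poynting vector in equation (36)."""
    phase = k * x - omega * t
    prefactor = E0 ** 2 * math.exp(-2 * kappa * z) / (mu * omega)
    s_x = prefactor * k * math.cos(phase) ** 2
    s_z = -prefactor * kappa * math.sin(phase) * math.cos(phase)
    return s_x, s_z


def cycle_average(x, z, E0, k, kappa, omega, mu, samples=1000):
    """Average S_x and S_z over one period using uniformly spaced times."""
    period = 2 * math.pi / omega
    total_x = total_z = 0.0
    for i in range(samples):
        t = i * period / samples
        s_x, s_z = poynting_components(x, z, t, E0, k, kappa, omega, mu)
        total_x += s_x
        total_z += s_z
    return total_x / samples, total_z / samples


# Hypothetical parameters in arbitrary units (not from the text).
E0, k, kappa, omega, mu = 1.0, 1.3, 0.8, 1.0, 1.0
x, z = 0.4, 0.25

avg_x, avg_z = cycle_average(x, z, E0, k, kappa, omega, mu)
expected_x = E0 ** 2 * math.exp(-2 * kappa * z) * k / (2 * mu * omega)
print(avg_x, expected_x, avg_z)
```

Uniform sampling over a full period is used because it averages products of sines and cosines essentially exactly, so no finer integration scheme is needed here.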
