Apparent size of a black hole to a moving observer

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 12; Problem 12.7.

We’ve seen what the view of a black hole looks like for a stationary observer at various distances from the black hole. We can do similar calculations for an observer falling in radially from infinity. For such an observer, we’ve already worked out the four-velocity components:

\displaystyle \frac{dt}{d\tau} \displaystyle = \displaystyle e\left(1-\frac{2GM}{r}\right)^{-1}
\displaystyle \frac{d\theta}{d\tau} \displaystyle = \displaystyle 0
\displaystyle \frac{d\phi}{d\tau} \displaystyle = \displaystyle \frac{l}{r^{2}\sin^{2}\theta}=\frac{l}{r^{2}}
\displaystyle \frac{dr}{d\tau} \displaystyle = \displaystyle \pm\sqrt{e^{2}-\left(1-\frac{2GM}{r}\right)\left(1+\frac{l^{2}}{r^{2}}\right)}

For a particle starting at rest at infinity and moving radially inwards, {dr/d\tau<0}, {e=1} and {l=0}, so these equations reduce to

\displaystyle \frac{dt}{d\tau} \displaystyle = \displaystyle \left(1-\frac{2GM}{r}\right)^{-1}
\displaystyle \frac{d\theta}{d\tau} \displaystyle = \displaystyle 0
\displaystyle \frac{d\phi}{d\tau} \displaystyle = \displaystyle 0
\displaystyle \frac{dr}{d\tau} \displaystyle = \displaystyle -\sqrt{\frac{2GM}{r}}

These are the components of the basis vector {\mathbf{o}_{t}} in the Schwarzschild frame:

\displaystyle \mathbf{o}_{t}=\left[\left(1-\frac{2GM}{r}\right)^{-1},-\sqrt{\frac{2GM}{r}},0,0\right]

Note that this vector is already normalized, since

\displaystyle \mathbf{o}_{t}\cdot\mathbf{o}_{t} \displaystyle = \displaystyle -\left(1-\frac{2GM}{r}\right)\left(1-\frac{2GM}{r}\right)^{-2}+\left(1-\frac{2GM}{r}\right)^{-1}\frac{2GM}{r}
\displaystyle \displaystyle = \displaystyle -1

From this, we can work out the three spatial basis vectors. For {\mathbf{o}_{z}}, we know that {\mathbf{o}_{t}\cdot\mathbf{o}_{z}=0} and, since the {z} axis is aligned (by definition) with the {r} direction, {o_{z}^{\phi}=o_{z}^{\theta}=0}, so we get

\displaystyle \mathbf{o}_{t}\cdot\mathbf{o}_{z} \displaystyle = \displaystyle g_{ij}o_{t}^{i}o_{z}^{j}
\displaystyle \displaystyle = \displaystyle -\left(1-\frac{2GM}{r}\right)\left(1-\frac{2GM}{r}\right)^{-1}o_{z}^{t}-\left(1-\frac{2GM}{r}\right)^{-1}\sqrt{\frac{2GM}{r}}o_{z}^{r}=0
\displaystyle o_{z}^{t} \displaystyle = \displaystyle -\left(1-\frac{2GM}{r}\right)^{-1}\sqrt{\frac{2GM}{r}}o_{z}^{r}

We also have the normalization condition {\mathbf{o}_{z}\cdot\mathbf{o}_{z}=1} which gives us

\displaystyle \mathbf{o}_{z}\cdot\mathbf{o}_{z} \displaystyle = \displaystyle \left[-\left(1-\frac{2GM}{r}\right)\left(1-\frac{2GM}{r}\right)^{-2}\frac{2GM}{r}+\left(1-\frac{2GM}{r}\right)^{-1}\right]\left(o_{z}^{r}\right)^{2}
\displaystyle \displaystyle = \displaystyle \left(o_{z}^{r}\right)^{2}=1
\displaystyle o_{z}^{r} \displaystyle = \displaystyle \pm1

We choose {+1} for {o_{z}^{r}} since the {z} axis is aligned with the {+r} direction. Thus:

\displaystyle \mathbf{o}_{z}=\left[-\left(1-\frac{2GM}{r}\right)^{-1}\sqrt{\frac{2GM}{r}},1,0,0\right]

Since {o_{t}^{\phi}=o_{t}^{\theta}=0}, the condition {\mathbf{o}_{x}\cdot\mathbf{o}_{t}=\mathbf{o}_{y}\cdot\mathbf{o}_{t}=0} tells us that {o_{x}^{t}=o_{y}^{t}=0}, so the normalization condition then says that {\mathbf{o}_{x}} and {\mathbf{o}_{y}} are the same as before, namely

\displaystyle \mathbf{o}_{x} \displaystyle = \displaystyle \left[0,0,0,\frac{1}{r\sin\theta}\right]=\left[0,0,0,\frac{1}{r}\right]
\displaystyle \mathbf{o}_{y} \displaystyle = \displaystyle \left[0,0,-\frac{1}{r},0\right]
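
As a sanity check on the four basis vectors above, here is a minimal numerical sketch that evaluates all sixteen scalar products with the Schwarzschild metric and compares them with {\eta_{ij}}. It assumes units with {G=c=1}, the equatorial plane, and component order {t}, {r}, {\theta}, {\phi}; the function names are just illustrative.

```python
import math

def metric_diag(r, M=1.0, theta=math.pi / 2):
    """Diagonal Schwarzschild metric components (t, r, theta, phi), with G = c = 1."""
    f = 1.0 - 2.0 * M / r
    return (-f, 1.0 / f, r**2, (r * math.sin(theta))**2)

def dot(u, v, g):
    """Scalar product g_ij u^i v^j for a diagonal metric."""
    return sum(gi * ui * vi for gi, ui, vi in zip(g, u, v))

def falling_tetrad(r, M=1.0):
    """Basis vectors (o_t, o_x, o_y, o_z) of the radially infalling observer (e = 1, l = 0)."""
    f = 1.0 - 2.0 * M / r
    a = math.sqrt(2.0 * M / r)
    o_t = (1.0 / f, -a, 0.0, 0.0)
    o_x = (0.0, 0.0, 0.0, 1.0 / r)
    o_y = (0.0, 0.0, -1.0 / r, 0.0)
    o_z = (-a / f, 1.0, 0.0, 0.0)
    return o_t, o_x, o_y, o_z

def gram_matrix(r, M=1.0):
    """Matrix of scalar products o_i . o_j; it should equal diag(-1, 1, 1, 1)."""
    g = metric_diag(r, M)
    basis = falling_tetrad(r, M)
    return [[dot(u, v, g) for v in basis] for u in basis]

if __name__ == "__main__":
    for row in gram_matrix(5.0):
        print(["%+.3f" % x for x in row])
```

The check is purely algebraic, so it should go through for any {r\ne2GM}, including radii inside {r=2GM}.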

We can now follow the same procedure as before to calculate the critical angle at which a photon emitted by the observer is absorbed by the black hole. We have the photon’s four-momentum:

\displaystyle p^{t} \displaystyle = \displaystyle E\left(1-\frac{2GM}{r}\right)^{-1}\frac{dt}{dt}=E\left(1-\frac{2GM}{r}\right)^{-1}
\displaystyle p^{r} \displaystyle = \displaystyle E\left(1-\frac{2GM}{r}\right)^{-1}\frac{dr}{dt}
\displaystyle \displaystyle = \displaystyle \pm E\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}
\displaystyle p^{\theta} \displaystyle = \displaystyle 0
\displaystyle p^{\phi} \displaystyle = \displaystyle E\left(1-\frac{2GM}{r}\right)^{-1}\frac{d\phi}{dt}
\displaystyle \displaystyle = \displaystyle E\frac{b}{r^{2}}

The 3-velocity components as measured by the observer are

\displaystyle v^{i} \displaystyle = \displaystyle \frac{p^{i}}{p^{t}}
\displaystyle \displaystyle = \displaystyle \frac{\mathbf{o}_{i}\cdot\mathbf{p}}{-\mathbf{o}_{t}\cdot\mathbf{p}}

Using our basis vectors from above, we get

\displaystyle -\mathbf{o}_{t}\cdot\mathbf{p} \displaystyle = \displaystyle E\left(1-\frac{2GM}{r}\right)\left(1-\frac{2GM}{r}\right)^{-2}\pm E\left(1-\frac{2GM}{r}\right)^{-1}\sqrt{\frac{2GM}{r}}\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}
\displaystyle \displaystyle = \displaystyle E\left(1-\frac{2GM}{r}\right)^{-1}\left[1\pm\sqrt{\frac{2GM}{r}}\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}\right]
\displaystyle \mathbf{o}_{x}\cdot\mathbf{p} \displaystyle = \displaystyle E\frac{b}{r}

The sine of the emitted angle is, as before

\displaystyle \sin\psi \displaystyle = \displaystyle \frac{v_{x}}{1}=\frac{\mathbf{o}_{x}\cdot\mathbf{p}}{-\mathbf{o}_{t}\cdot\mathbf{p}}
\displaystyle \displaystyle = \displaystyle \frac{b}{r}\left(1-\frac{2GM}{r}\right)\left[1\pm\sqrt{\frac{2GM}{r}}\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}\right]^{-1}

As a check, we can also calculate the cosine:

\displaystyle \cos\psi \displaystyle = \displaystyle v_{z}=\frac{\mathbf{o}_{z}\cdot\mathbf{p}}{-\mathbf{o}_{t}\cdot\mathbf{p}}
\displaystyle \displaystyle = \displaystyle \left[\sqrt{\frac{2GM}{r}}\pm\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}\right]\left[1\pm\sqrt{\frac{2GM}{r}}\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}\right]^{-1}

After a bit of algebra, it can be confirmed that {\sin^{2}\psi+\cos^{2}\psi=1} which is reassuring.
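
For anyone who wants the algebra spelled out, here is one way of organizing it. The shorthand {a} and {s} is introduced only for this check and doesn’t appear elsewhere in the post.

```latex
a \equiv \sqrt{\frac{2GM}{r}}, \qquad
s \equiv \sqrt{1-\left(1-a^{2}\right)\frac{b^{2}}{r^{2}}}
\quad\Rightarrow\quad
\left(1-a^{2}\right)\frac{b^{2}}{r^{2}} = 1-s^{2}

\sin^{2}\psi = \frac{b^{2}}{r^{2}}\frac{\left(1-a^{2}\right)^{2}}{\left(1\pm as\right)^{2}}
             = \frac{\left(1-a^{2}\right)\left(1-s^{2}\right)}{\left(1\pm as\right)^{2}}, \qquad
\cos^{2}\psi = \frac{\left(a\pm s\right)^{2}}{\left(1\pm as\right)^{2}}

\left(1-a^{2}\right)\left(1-s^{2}\right)+\left(a\pm s\right)^{2}
  = 1\pm 2as+a^{2}s^{2}
  = \left(1\pm as\right)^{2}
```

The sum of the two numerators equals the common denominator, so {\sin^{2}\psi+\cos^{2}\psi=1} for either choice of sign.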

The critical angle occurs when the impact parameter {b=\sqrt{27}GM}, so

\displaystyle \sin\psi_{c}=\frac{\sqrt{27}GM}{r}\left(1-\frac{2GM}{r}\right)\left[1\pm\sqrt{\frac{2GM}{r}}\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{27G^{2}M^{2}}{r^{2}}}\right]^{-1}

Unlike the case for a stationary observer which is defined only for {r>2GM}, this formula is actually well-defined for all {r>0}, although {\sin\psi_{c}<0} for {r<2GM}. We’ll plot {\sin\psi_{c}} versus {r} (in units of {GM}) for both signs, with the plus sign in red and the minus sign in blue. We see that something odd happens at {r=3GM}:

[Figure: Moore 12.07a, {\sin\psi_{c}} versus {r} in units of {GM}; plus sign in red, minus sign in blue]

I’m not totally sure of the interpretation, but if we look at the analysis of the stationary observer that we did earlier, we see that at {r=3GM}, the critical emission angle is {\psi_{c}=90^{\circ}}. That is, for {r>3GM}, the photons at the critical angle are emitted such that {dr/dt<0}, while for {r<3GM}, they are emitted with {dr/dt>0}. This means that we should take the minus sign for {p^{r}} in the former case, and the plus sign in the latter. Since the observer is stationary, its local frame is at rest in the global Schwarzschild frame.

When dealing with a moving observer, we are still doing the calculations in the global frame (that is, the frame of the central mass {M}); it is only the local basis vectors that have changed due to the motion of the observer. Therefore, the switch from positive to negative {p^{r}} still happens at {r=3GM}, since we are using {\mathbf{p}} as calculated in the global frame. In the plot, therefore, we should use the red curve for {r<3GM} and the blue curve for {r>3GM}.

To the moving observer, however, the angle subtended by the black hole is different from that for a stationary observer at the same distance from the black hole. For example, the maximum of {\sin\psi_{c}=1} occurs at {r=\sqrt{27}GM\approx5.196GM}, so it is at that radius that the critical angle relative to the moving observer is {90^{\circ}}.

The plot of {\cos\psi_{c}} allows us to determine the quadrant of {\psi_{c}}:

[Figure: Moore 12.07b, {\cos\psi_{c}} versus {r}]

At other radii, we have:

  • {r=4GM}; {\psi_{c}=64.355^{\circ}}
  • {r=3GM}; {\psi_{c}=35.264^{\circ}}
  • {r=2GM}; {\psi_{c}=0^{\circ}}
  • {r=GM}; {\psi_{c}=-37.771^{\circ}}

I’m not sure what to make of the last result, since {r=GM} lies inside the event horizon ({r<2GM}), where {t} and {r} exchange their timelike and spacelike roles and it’s not obvious that the interpretation above carries over.
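
The angles in the list above can be reproduced with a short numerical sketch of the {\sin\psi} and {\cos\psi} formulas. It assumes radii and impact parameters measured in units of {GM}, and uses the sign rule argued for above (minus sign for {r>3GM}, plus sign otherwise). The function names are illustrative, and the `max(0.0, ...)` guard is only there to absorb floating-point rounding at {r=3GM}.

```python
import math

B_CRIT = math.sqrt(27.0)  # critical impact parameter, in units of GM

def falling_sin_cos(r, sign, b=B_CRIT):
    """Return (sin psi, cos psi) for the infalling observer.

    r and b are in units of GM; sign is +1 or -1 and selects the sign of p^r.
    """
    a = math.sqrt(2.0 / r)
    # Clamp tiny negative values caused by rounding (the root vanishes at r = 3).
    s = math.sqrt(max(0.0, 1.0 - (1.0 - 2.0 / r) * b**2 / r**2))
    denom = 1.0 + sign * a * s
    sin_psi = (b / r) * (1.0 - 2.0 / r) / denom
    cos_psi = (a + sign * s) / denom
    return sin_psi, cos_psi

def critical_angle_deg(r):
    """Critical angle in degrees: minus sign for r > 3GM, plus sign otherwise."""
    sign = -1 if r > 3.0 else +1
    sin_psi, cos_psi = falling_sin_cos(r, sign)
    # atan2 uses both sine and cosine, so the quadrant comes out automatically.
    return math.degrees(math.atan2(sin_psi, cos_psi))

if __name__ == "__main__":
    for r in (B_CRIT, 4.0, 3.0, 2.0, 1.0):
        print("r = %.3f GM   psi_c = %.3f deg" % (r, critical_angle_deg(r)))
```

Using `atan2` rather than `asin` plays the same role as the {\cos\psi_{c}} plot: it fixes the quadrant of {\psi_{c}}.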

From the formula {E_{obs}=-\mathbf{o}_{t}\cdot\mathbf{p}} we can work out the observed energy of a photon fired radially from infinity at the observer. If the photon is coming in on a radial line, the impact parameter is {b=0} and {p^{r}=-E_{\infty}}, so we get

\displaystyle E_{obs} \displaystyle = \displaystyle E_{\infty}\left(1-\sqrt{\frac{2GM}{r}}\right)\left(1-\frac{2GM}{r}\right)^{-1}
\displaystyle \displaystyle = \displaystyle \frac{E_{\infty}}{1+\sqrt{\frac{2GM}{r}}}

where {E_{\infty}} is the photon’s energy as observed at infinity. Since {E_{obs}<E_{\infty}} for all {r}, the light is always red-shifted. With {\lambda\propto1/E} and {\Delta\lambda=\lambda_{obs}-\lambda_{\infty}}, the fractional change in wavelength is

\displaystyle \frac{\Delta\lambda}{\lambda_{\infty}} \displaystyle = \displaystyle \left(\frac{1}{E_{obs}}-\frac{1}{E_{\infty}}\right)E_{\infty}
\displaystyle \displaystyle = \displaystyle \sqrt{\frac{2GM}{r}}

Photon path in flat space

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 12; Problem 12.6.

The equations of motion for a photon in the Schwarzschild metric are:

\displaystyle \frac{d\phi}{dt} \displaystyle = \displaystyle \frac{1}{r^{2}}\left(1-\frac{2GM}{r}\right)b
\displaystyle \left[\left(1-\frac{2GM}{r}\right)^{-1}\frac{dr}{dt}\right]^{2}+\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}} \displaystyle = \displaystyle 1

In flat space, {M=0} so these equations reduce to

\displaystyle \frac{d\phi}{dt} \displaystyle = \displaystyle \frac{b}{r^{2}}
\displaystyle \frac{dr}{dt} \displaystyle = \displaystyle \pm\sqrt{1-\frac{b^{2}}{r^{2}}}

Dividing the first by the second, we get

\displaystyle \frac{d\phi}{dr}=\pm\frac{b}{r^{2}\sqrt{1-\frac{b^{2}}{r^{2}}}}

We can integrate this directly to get

\displaystyle \phi\left(r\right)=\mp\arctan\frac{b}{\sqrt{r^{2}-b^{2}}}+k

where {k} is a constant of integration.

We can convert this into a more meaningful form by setting {k=\alpha+\frac{\pi}{2}}. Looking at the minus sign option first, we then get

\displaystyle \tan\left(\frac{\pi}{2}-\phi+\alpha\right)=\frac{b}{\sqrt{r^{2}-b^{2}}}

or, using the identity {\tan\left(\frac{\pi}{2}-x\right)=1/\tan x}

\displaystyle \tan\left(\phi-\alpha\right)=\frac{\sqrt{r^{2}-b^{2}}}{b}

We can interpret this geometrically by looking at a plot of a straight line path in polar coordinates:

[Figure: Moore 12.06a, a straight-line path in polar coordinates]

In this diagram, the line is in red, and the segment {b} (in blue) is the closest approach to the origin (so {b} is perpendicular to the line). A general point on the line is indicated by the green line {r}. The red side of the triangle has length {\sqrt{r^{2}-b^{2}}} so if we label the angles as shown, we have (taking {\alpha<0} since it’s below the {x} axis) the same equation for the tangent as above. That is, the relation for the tangent describes a straight line in polar coordinates, where {b} is the perpendicular distance from the origin to the line, and {\alpha} is the angle from the {x} axis to the segment {b}.

Taking the plus sign in the above derivation we get

\displaystyle \tan\left(-\frac{\pi}{2}+\phi-\alpha\right) \displaystyle = \displaystyle \frac{b}{\sqrt{r^{2}-b^{2}}}
\displaystyle \displaystyle = \displaystyle -\tan\left(\frac{\pi}{2}-\phi+\alpha\right)
\displaystyle \tan\left(\phi-\alpha\right) \displaystyle = \displaystyle -\frac{\sqrt{r^{2}-b^{2}}}{b}

This would apply if {\phi-\alpha<0}. Thus the path of a photon in flat space is a straight line.

Incidentally, to get the same form as given by Moore in his question, we can use the trig identity {1+\tan^{2}x=\sec^{2}x}:

\displaystyle 1+\tan^{2}\left(\phi-\alpha\right) \displaystyle = \displaystyle \frac{r^{2}}{b^{2}}
\displaystyle \cos\left(\phi-\alpha\right) \displaystyle = \displaystyle \frac{b}{r}
\displaystyle \phi \displaystyle = \displaystyle \arccos\frac{b}{r}+\alpha
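
A quick numerical way to convince ourselves that {\phi=\arccos\frac{b}{r}+\alpha} really is a straight line: convert points on the curve to Cartesian coordinates and project them onto the unit vector at angle {\alpha}. For a straight line whose closest approach to the origin is {b}, that projection should equal {b} everywhere. This is only a sketch, and the function names are made up for illustration.

```python
import math

def phi_of_r(r, b, alpha):
    """Polar angle on the photon path: phi = arccos(b / r) + alpha (needs r >= b)."""
    return math.acos(b / r) + alpha

def distance_along_normal(r, b, alpha):
    """Project the point (r, phi) onto the unit vector at angle alpha.

    For a straight line with perpendicular distance b from the origin,
    the result is b at every point on the line.
    """
    phi = phi_of_r(r, b, alpha)
    x, y = r * math.cos(phi), r * math.sin(phi)
    return x * math.cos(alpha) + y * math.sin(alpha)

if __name__ == "__main__":
    b, alpha = 2.0, -0.4  # alpha < 0, as in the diagram
    for r in (2.0, 3.5, 10.0, 250.0):
        print(r, distance_along_normal(r, b, alpha))
```

The projection is just {r\cos\left(\phi-\alpha\right)}, so this is the relation {\cos\left(\phi-\alpha\right)=b/r} in disguise.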

Red-shifts and blue-shifts

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 12; Problem 12.5.

We’ve seen that the fact that the Schwarzschild time coordinate {t} and the proper time {\tau} are not the same leads to the gravitational redshift. The relation between the wavelength of a photon at two distances {r_{R}} and {r_{E}} from a mass {M} is

\displaystyle  \frac{\lambda_{R}}{\lambda_{E}}=\sqrt{\frac{1-2GM/r_{R}}{1-2GM/r_{E}}}

We can derive the same formula from the photon’s four-momentum. In an observer’s locally flat frame, the energy of any object (including a photon) is

\displaystyle  E_{o}=-\mathbf{o}_{t}\cdot\mathbf{p}

since {\mathbf{o}_{t}=\left[1,0,0,0\right]} in that local frame, and {p^{t}=E_{o}}. Using the expressions for these two four-vectors in the global Schwarzschild frame, we get

\displaystyle   \mathbf{o}_{t}^{\prime} \displaystyle  = \displaystyle  \left[\left(1-\frac{2GM}{r}\right)^{-1/2},0,0,0\right]
\displaystyle  p^{\prime t} \displaystyle  = \displaystyle  E_{\infty}\left(1-\frac{2GM}{r}\right)^{-1}

where {E_{\infty}} is the energy of the photon as measured by an observer at infinity. The scalar product gives

\displaystyle   E_{o} \displaystyle  = \displaystyle  -g_{tt}o_{t}^{\prime t}p^{\prime t}
\displaystyle  \displaystyle  = \displaystyle  \left(1-\frac{2GM}{r}\right)\left(1-\frac{2GM}{r}\right)^{-1/2}E_{\infty}\left(1-\frac{2GM}{r}\right)^{-1}
\displaystyle  \displaystyle  = \displaystyle  \frac{E_{\infty}}{\sqrt{1-\frac{2GM}{r}}}

That is, the energy of a photon increases the closer it gets to the mass. In terms of the wavelength, we can use the relation from quantum mechanics: {E=h\nu=hc/\lambda=h/\lambda} (taking {c=1} as usual). Thus

\displaystyle  \lambda_{o}=\lambda_{\infty}\sqrt{1-\frac{2GM}{r}}

or, if we take the ratio of the wavelengths at two finite radii as at the start:

\displaystyle  \frac{\lambda_{R}}{\lambda_{E}}=\sqrt{\frac{1-2GM/r_{R}}{1-2GM/r_{E}}}

which is the same as the original formula.

Another way of saying this is that an observer far from a mass sees light red-shifted compared to an observer near the mass or, conversely, the near observer sees incoming light blue-shifted compared to an observer far from the mass.
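
To put numbers on this, here is a small sketch of the wavelength ratio for a static emitter and a static receiver. It assumes {G=c=1} and that both radii lie outside {r=2GM}; the function name is illustrative.

```python
import math

def wavelength_ratio(r_receive, r_emit, M=1.0):
    """lambda_R / lambda_E for static emitter and receiver (G = c = 1, both radii > 2M)."""
    return math.sqrt((1.0 - 2.0 * M / r_receive) / (1.0 - 2.0 * M / r_emit))

if __name__ == "__main__":
    # Light emitted at r = 4M and received far away is red-shifted (ratio > 1).
    print(wavelength_ratio(1.0e12, 4.0))
    # Light emitted at r = 8M and received at r = 4M is blue-shifted (ratio < 1).
    print(wavelength_ratio(4.0, 8.0))
```

Swapping the emitter and receiver inverts the ratio, which is the red-shift/blue-shift symmetry described above.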

Apparent size of a black hole

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 12; Problem 12.4.

Suppose we are sitting comfortably at a distance {r} from a black hole. Since a black hole absorbs light, it will appear as a black patch on the background of stars. However, we know that light travelling near to a gravitating object will spiral into that object if its path is close enough, so a black hole will appear larger than it actually is. We’ll find out how much larger here.

It’s actually easier to begin with the inverse problem. That is, suppose our observer at {r} emits a photon at an angle {\psi} relative to the ray projecting radially outwards from the black hole. If {\psi=0}, the photon is emitted directly away from the black hole, while if {\psi=\pi}, it is emitted directly towards the black hole. At other angles, the photon will either curve round the black hole and head off to infinity in some direction, or it will be captured by the black hole and spiral in towards it. We want to find the critical angle {\psi_{c}} which is the smallest angle at which the photon gets captured.

We’ve already worked out expressions for the components of the photon’s velocity as viewed by the observer. The transverse component is

\displaystyle v_{x}=\left(1-\frac{2GM}{r}\right)^{1/2}\frac{b}{r}

where {b} is the impact parameter, originally defined as {b=l/e}. We found that if {b<\sqrt{27}GM}, the photon will be captured by the mass.

If we align the {x} direction in the observer’s local frame with the {\phi} direction in the global frame, and the local {z} direction with the global {r} direction, then we found that {v_{x}^{2}+v_{z}^{2}=1}, so that the initial angle of travel of the photon is {\sin\psi=v_{x}/1=v_{x}}. Therefore the condition for the emitted photon to be captured is

\displaystyle \sin\psi<\frac{\sqrt{27}GM}{r}\left(1-\frac{2GM}{r}\right)^{1/2}

and the critical angle is

\displaystyle \psi_{c}=\arcsin\left[\frac{\sqrt{27}GM}{r}\left(1-\frac{2GM}{r}\right)^{1/2}\right]

We need to be a bit careful in calculating the angle, since the sine is symmetric about the point {\psi=\pi/2}. To work out which angle is correct, we observe that if we emitted a photon radially outward, {\psi=0} and we would expect the photon to escape to infinity unless we’re inside the black hole’s event horizon. However, if {\psi=\pi}, the photon is emitted directly at the black hole and will always be absorbed. As we gradually decrease {\psi} from {\pi}, we’d expect the photon to be absorbed until we hit the critical angle, so the range of {\psi} that allows the photon to be absorbed starts at {\pi} and decreases until we hit {\psi=\psi_{c}}.

For example, suppose we require

\displaystyle \frac{\sqrt{27}GM}{r}\left(1-\frac{2GM}{r}\right)^{1/2}=\frac{\sqrt{2}}{2}

Then {\psi_{c}=\frac{\pi}{4},\frac{3\pi}{4}}. Solving for {r}, we get

\displaystyle r=6GM,\;3\left(\sqrt{3}-1\right)GM
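
In case the roots look like they came from nowhere, here is a sketch of the intermediate algebra: square both sides, clear the denominators, and factor the resulting cubic.

```latex
\frac{27G^{2}M^{2}}{r^{2}}\left(1-\frac{2GM}{r}\right) = \frac{1}{2}

r^{3}-54G^{2}M^{2}r+108G^{3}M^{3} = 0

\left(r-6GM\right)\left(r^{2}+6GMr-18G^{2}M^{2}\right) = 0

r = 6GM \quad\text{or}\quad r = \left(-3\pm3\sqrt{3}\right)GM
```

The root with the lower sign is negative, which leaves the two positive values quoted above.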

Since {6>3\left(\sqrt{3}-1\right)\approx2.2}, {r=6GM} corresponds to {\psi_{c}=\frac{3\pi}{4}} and {r=3\left(\sqrt{3}-1\right)GM} to {\psi_{c}=\frac{\pi}{4}}. That is, if we’re at {r=6GM}, then all photons emitted at an angle {\frac{3\pi}{4}<\psi<\pi} will be absorbed.

Because the Schwarzschild metric is invariant under time-reversal (replacing {dt} by {-dt} doesn’t change it), we can invert the argument to show that any photon that arrives at the observer must do so from an angle less than {\frac{3\pi}{4}}, which means that the black hole will subtend an angle of {\pi/2} to the observer at {r=6GM}. To an observer at {r=3\left(\sqrt{3}-1\right)GM}, the black hole subtends {\frac{3\pi}{2}}, so appears to almost wrap around the observer, with only a {\pi/2} circle of the outside universe visible.

We can work out the values for other values of {r}. I’ll give the angles in degrees, since most people find them easier to visualize than radians.

  • {r=2.5GM}; {\psi_{c}=68.4^{\circ}}. The black hole occupies more than half the sky, subtending an angle of {223.3^{\circ}}.
  • {r=3GM}; {\psi_{c}=90^{\circ}}, so at this distance, the black hole appears to fill exactly half the sky with an angle of {180^{\circ}}.
  • {r=4GM}; {\psi_{c}=113.3^{\circ}}. We’re now in the regime where the black hole occupies less than half the sky, with an angle of {133.4^{\circ}}.
  • {r=5GM}; {\psi_{c}=126.4^{\circ}}, black hole subtends {107.2^{\circ}}.
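
The values in this list can be generated with a short sketch of the formula for {\psi_{c}}, together with the quadrant rule discussed above ({\psi_{c}>90^{\circ}} for {r>3GM}). Radii are assumed to be in units of {GM} with {r>2}; the function names are illustrative, and the `min(1.0, ...)` guard only protects against rounding at {r=3GM}.

```python
import math

def static_critical_angle_deg(r):
    """Critical emission angle psi_c in degrees for a static observer at r (units of GM)."""
    sin_psi = math.sqrt(27.0) / r * math.sqrt(1.0 - 2.0 / r)
    sin_psi = min(1.0, sin_psi)  # rounding can push this just above 1 at r = 3
    base = math.degrees(math.asin(sin_psi))
    # Outside r = 3GM the critical photon is aimed inward: second quadrant.
    return 180.0 - base if r > 3.0 else base

def subtended_angle_deg(r):
    """Full angular width of the black hole in the static observer's sky."""
    return 2.0 * (180.0 - static_critical_angle_deg(r))

if __name__ == "__main__":
    for r in (2.5, 3.0, 4.0, 5.0, 6.0):
        print("r = %.1f GM   psi_c = %.1f   subtends %.1f"
              % (r, static_critical_angle_deg(r), subtended_angle_deg(r)))
```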

Local flat coordinate systems; four-momentum of photons

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 12; Problem 12.3.

Sometimes it’s useful to use a local coordinate system to do calculations. If the spacetime metric (such as the Schwarzschild metric) is smooth over a region of space (that is, it has no singularities such as division by zero and no sudden jumps), then we can define a local metric that is essentially flat. A simple example is that of the surface of the Earth. Although the Earth is spherical, a locally flat 2-d coordinate system works well for distances of, say, a few miles.

In 4-d spacetime, we can define a locally flat coordinate system with the four mutually orthogonal basis vectors

\displaystyle   \mathbf{o}_{t} \displaystyle  = \displaystyle  \left[1,0,0,0\right]
\displaystyle  \mathbf{o}_{x} \displaystyle  = \displaystyle  \left[0,1,0,0\right]
\displaystyle  \mathbf{o}_{y} \displaystyle  = \displaystyle  \left[0,0,1,0\right]
\displaystyle  \mathbf{o}_{z} \displaystyle  = \displaystyle  \left[0,0,0,1\right]

and the usual flat space metric from special relativity {\eta_{ij}}.

If we know some four-vector {\mathbf{A}} in this flat coordinate system, we can find its components by taking the scalar product with each of the basis vectors. That is

\displaystyle   \mathbf{o}_{x}\cdot\mathbf{A} \displaystyle  = \displaystyle  \eta_{ij}o_{x}^{i}A^{j}
\displaystyle  \displaystyle  = \displaystyle  \eta_{xx}A^{x}
\displaystyle  \displaystyle  = \displaystyle  A^{x}

The second line follows because the only non-zero component of {\mathbf{o}_{x}} is {o_{x}^{x}} so {i=x}, and then the only non-zero component of the metric {\eta_{xj}} is {\eta_{xx}=+1}. By the same argument, we get the other components:

\displaystyle   \mathbf{o}_{t}\cdot\mathbf{A} \displaystyle  = \displaystyle  -A^{t}
\displaystyle  \mathbf{o}_{y}\cdot\mathbf{A} \displaystyle  = \displaystyle  A^{y}
\displaystyle  \mathbf{o}_{z}\cdot\mathbf{A} \displaystyle  = \displaystyle  A^{z}

This might seem trivial but the important point is that since the scalar product is invariant, if we work out the components in one coordinate system, then if we can find the basis vectors {\mathbf{o}_{i}} in another coordinate system, their scalar products with {\mathbf{A}} in that coordinate system must yield the same numerical results. This is often an easier way of finding the components of {\mathbf{A}} in other coordinate systems (as opposed to using the general tensor transformation formula).

The problem then is to find {\mathbf{o}_{i}} in the other coordinate system. If we take this system to be the global system with the Schwarzschild metric, we can work out this transformation. First, we need to align the axes in the two systems. We’ll take the {x}, {y} and {z} axes in the local system to be aligned with the {\phi}, {\theta} and {r} axes in the general system. To get started, suppose the observer (who has the local, flat system) is at rest in the general system. Then his four-velocity {\mathbf{u}^{\prime}} as measured in the general system must have all its spatial components equal to zero. Using the Schwarzschild metric, we must have

\displaystyle   \mathbf{u}^{\prime}\cdot\mathbf{u}^{\prime} \displaystyle  = \displaystyle  g_{tt}u^{\prime t}u^{\prime t}
\displaystyle  \displaystyle  = \displaystyle  -\left(1-\frac{2GM}{r}\right)\left(u^{\prime t}\right)^{2}
\displaystyle  \displaystyle  = \displaystyle  -1
\displaystyle  u^{\prime t} \displaystyle  = \displaystyle  \left(1-\frac{2GM}{r}\right)^{-1/2}

In the observer’s own local frame, because the metric is flat and the observer is not moving relative to himself, {\mathbf{u}=\left[1,0,0,0\right]}. That is, in the local frame, {\mathbf{u}=\mathbf{o}_{t}}. Therefore, {\mathbf{u}^{\prime}} is the transformed version {\mathbf{o}_{t}^{\prime}} of the time basis vector, and

\displaystyle  \mathbf{o}_{t}^{\prime}=\left[\left(1-\frac{2GM}{r}\right)^{-1/2},0,0,0\right]

What about transforming the spatial basis vectors? We know that in the flat system {\mathbf{o}_{i}\cdot\mathbf{o}_{j}=\eta_{ij}} so the same must be true in the general system (since these are scalar products). That is, it must also be true that {\mathbf{o}_{i}^{\prime}\cdot\mathbf{o}_{j}^{\prime}=\eta_{ij}}. If we’ve aligned the axes in the two systems as stated above, then {\mathbf{o}_{x}^{\prime}} must have zero components along the {\theta} and {r} directions, so we must have (where the components are listed in the order {t}, {r}, {\theta}, {\phi}):

\displaystyle  \mathbf{o}_{x}^{\prime}=\left[o_{x}^{\prime t},0,0,o_{x}^{\prime\phi}\right]

We must also have

\displaystyle   \mathbf{o}_{t}^{\prime}\cdot\mathbf{o}_{x}^{\prime} \displaystyle  = \displaystyle  g_{tt}\left(1-\frac{2GM}{r}\right)^{-1/2}o_{x}^{\prime t}+g_{\phi\phi}\times0\times o_{x}^{\prime\phi}
\displaystyle  \displaystyle  = \displaystyle  0

from which we get {o_{x}^{\prime t}=0}. Finally, we must also have

\displaystyle   \mathbf{o}_{x}^{\prime}\cdot\mathbf{o}_{x}^{\prime} \displaystyle  = \displaystyle  g_{\phi\phi}\left(o_{x}^{\prime\phi}\right)^{2}=1
\displaystyle  o_{x}^{\prime\phi} \displaystyle  = \displaystyle  \frac{1}{r\sin\theta}

Therefore

\displaystyle  \mathbf{o}_{x}^{\prime}=\left[0,0,0,\frac{1}{r\sin\theta}\right]

By the same process, we can work out the other two vectors:

\displaystyle   \mathbf{o}_{y}^{\prime} \displaystyle  = \displaystyle  \left[0,0,-\frac{1}{r},0\right]
\displaystyle  \mathbf{o}_{z}^{\prime} \displaystyle  = \displaystyle  \left[0,\sqrt{1-\frac{2GM}{r}},0,0\right]

where the minus sign for {\mathbf{o}_{y}^{\prime}} is because we’re taking {y} to be in the {-\theta} direction in order to get a right-handed coordinate system.

Having worked out the basis vectors in the general system, if we’ve calculated the components of {\mathbf{A}} in the local, flat system as above, we can use the invariance of the scalar product to work out the components of {\mathbf{A}} in the general system.

We can now apply this method to the specific four-vector which is {\mathbf{p}}, the four-momentum of a photon. As usual, we start with the four-momentum of a particle with rest mass and seek a form that makes no mention of {m} or {\tau}, the proper time. We get

\displaystyle   p^{i} \displaystyle  = \displaystyle  m\frac{dx^{i}}{d\tau}
\displaystyle  \displaystyle  = \displaystyle  m\frac{dx^{i}}{dt}\frac{dt}{d\tau}
\displaystyle  \displaystyle  = \displaystyle  me\left(1-\frac{2GM}{r}\right)^{-1}\frac{dx^{i}}{dt}

where we’ve used the definition of {e}.

We still need to get rid of {m}, but when defining {e} we discovered that it is the energy per unit mass at {r=\infty}, so {me} is the total energy {E} of the object at infinity. This is carried over to photons by just defining their four-momentum as

\displaystyle  p^{i}=E\left(1-\frac{2GM}{r}\right)^{-1}\frac{dx^{i}}{dt}

For a photon moving in the equatorial plane, we therefore get

\displaystyle   p^{t} \displaystyle  = \displaystyle  E\left(1-\frac{2GM}{r}\right)^{-1}\frac{dt}{dt}=E\left(1-\frac{2GM}{r}\right)^{-1}
\displaystyle  p^{r} \displaystyle  = \displaystyle  E\left(1-\frac{2GM}{r}\right)^{-1}\frac{dr}{dt}
\displaystyle  \displaystyle  = \displaystyle  \pm E\left(1-\frac{2GM}{r}\right)^{-1}\left(1-\frac{2GM}{r}\right)\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}
\displaystyle  \displaystyle  = \displaystyle  \pm E\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}
\displaystyle  p^{\theta} \displaystyle  = \displaystyle  0
\displaystyle  p^{\phi} \displaystyle  = \displaystyle  E\left(1-\frac{2GM}{r}\right)^{-1}\frac{d\phi}{dt}
\displaystyle  \displaystyle  = \displaystyle  E\left(1-\frac{2GM}{r}\right)^{-1}\frac{1}{r^{2}}\left(1-\frac{2GM}{r}\right)b
\displaystyle  \displaystyle  = \displaystyle  E\frac{b}{r^{2}}

For {p^{r}} and {p^{\phi}} we’ve used the photon equations of motion.

Now suppose we want the velocity components of the photon as measured by our observer in his local, flat frame. These are:

\displaystyle   v^{i} \displaystyle  = \displaystyle  \frac{p^{i}}{p^{t}}
\displaystyle  \displaystyle  = \displaystyle  \frac{\mathbf{o}_{i}\cdot\mathbf{p}}{-\mathbf{o}_{t}\cdot\mathbf{p}}

Because we’ve managed to express the components in terms of scalar products, we can evaluate them in any frame, so since we know the components of the {\mathbf{o}_{i}}s and {\mathbf{p}} in the general frame, we can do the scalar product in that frame. We must remember to use the Schwarzschild metric in calculating the scalar products in the general frame, of course! So we get (remember {\theta=\frac{\pi}{2}})

\displaystyle   -\mathbf{o}_{t}\cdot\mathbf{p} \displaystyle  = \displaystyle  -g_{tt}\left(1-\frac{2GM}{r}\right)^{-1/2}E\left(1-\frac{2GM}{r}\right)^{-1}
\displaystyle  \displaystyle  = \displaystyle  E\left(1-\frac{2GM}{r}\right)\left(1-\frac{2GM}{r}\right)^{-1/2}\left(1-\frac{2GM}{r}\right)^{-1}
\displaystyle  \displaystyle  = \displaystyle  E\left(1-\frac{2GM}{r}\right)^{-1/2}
\displaystyle  \mathbf{o}_{x}\cdot\mathbf{p} \displaystyle  = \displaystyle  g_{\phi\phi}\frac{1}{r\sin\theta}E\frac{b}{r^{2}}
\displaystyle  \displaystyle  = \displaystyle  r^{2}\sin^{2}\theta\frac{1}{r\sin\theta}E\frac{b}{r^{2}}
\displaystyle  \displaystyle  = \displaystyle  E\frac{b}{r}
\displaystyle  \mathbf{o}_{y}\cdot\mathbf{p} \displaystyle  = \displaystyle  0
\displaystyle  \mathbf{o}_{z}\cdot\mathbf{p} \displaystyle  = \displaystyle  \pm g_{rr}\left(1-\frac{2GM}{r}\right)^{1/2}E\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}
\displaystyle  \displaystyle  = \displaystyle  \pm\left(1-\frac{2GM}{r}\right)^{-1}\left(1-\frac{2GM}{r}\right)^{1/2}E\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}
\displaystyle  \displaystyle  = \displaystyle  \pm E\left(1-\frac{2GM}{r}\right)^{-1/2}\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}

We therefore get

\displaystyle   v_{x} \displaystyle  = \displaystyle  \left(1-\frac{2GM}{r}\right)^{1/2}\frac{b}{r}
\displaystyle  v_{y} \displaystyle  = \displaystyle  0
\displaystyle  v_{z} \displaystyle  = \displaystyle  \pm\sqrt{1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}}

The squared magnitude of the velocity is then

\displaystyle   v_{x}^{2}+v_{z}^{2} \displaystyle  = \displaystyle  \left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}+1-\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}}
\displaystyle  \displaystyle  = \displaystyle  1

Thus the photon’s speed is always 1 to the observer in the local frame, which is a relief, since photons must always have speed 1.
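
The whole procedure (write {\mathbf{p}} and the {\mathbf{o}_{i}} in the global frame, take scalar products with the Schwarzschild metric, divide) is easy to mirror numerically. The following is a minimal sketch for the static observer, assuming units of {GM} for {r} and {b}, the equatorial plane, and {E=1} (the energy cancels in the ratios). The function names are illustrative.

```python
import math

def dot(u, v, g):
    """Scalar product g_ij u^i v^j for a diagonal metric."""
    return sum(gi * ui * vi for gi, ui, vi in zip(g, u, v))

def photon_velocity_static(r, b, sign=+1):
    """Return (v_x, v_y, v_z) of a photon as measured by a static observer.

    r and b are in units of GM (r > 2); sign selects the sign of p^r.
    Components are ordered (t, r, theta, phi).
    """
    f = 1.0 - 2.0 / r
    g = (-f, 1.0 / f, r**2, r**2)  # Schwarzschild metric at theta = pi/2
    p = (1.0 / f, sign * math.sqrt(1.0 - f * b**2 / r**2), 0.0, b / r**2)
    o_t = (f**-0.5, 0.0, 0.0, 0.0)
    o_x = (0.0, 0.0, 0.0, 1.0 / r)
    o_y = (0.0, 0.0, -1.0 / r, 0.0)
    o_z = (0.0, math.sqrt(f), 0.0, 0.0)
    energy = -dot(o_t, p, g)  # photon energy in the observer's frame
    return tuple(dot(o, p, g) / energy for o in (o_x, o_y, o_z))

if __name__ == "__main__":
    vx, vy, vz = photon_velocity_static(10.0, 3.0, sign=-1)
    print(vx, vy, vz, vx**2 + vy**2 + vz**2)
```

The sum of squares should come out as 1 for any {r>2GM} and any {b} for which the square root in {p^{r}} is real.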

Happy 182nd birthday, Professor Maxwell

A slight diversion in this post, since today (June 13) is the birthday of one of my physics heroes: James Clerk Maxwell. Any physics student should be familiar with his name, as it is attached to some of the most important equations in physics such as Maxwell’s equations in electromagnetism, of course, but also the Maxwell-Boltzmann distribution in statistical mechanics. He must surely be the greatest physicist of the 19th century, and it’s arguable that his contributions to science are more important to our everyday lives than those of Einstein. Maxwell’s equations are at the foundation of pretty well everything electrical, from computers to radio waves.

I won’t go into a biography of him here (Wikipedia has a decent entry on him), though I would like to mention that even in his homeland (Scotland, where I happen to live), he is largely unknown, even by many in technical professions such as science and engineering. There is a statue of him that was erected recently in Edinburgh, but if you mention his name to most people, you will draw a blank.

So to anyone interested in physics on any level, please take a moment to ponder Maxwell’s contributions and importance to your subject, and do your part to get him better known!

Photon equations of motion

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 12; Problems 12.1, 12.2.

The main problem in applying the geodesic equation to photons is that for photons, {d\tau} is always zero (that is, no proper time elapses along a photon’s path). The geodesic equation makes explicit reference to {\tau}:

\displaystyle  \frac{d}{d\tau}\left(g_{aj}\frac{dx^{j}}{d\tau}\right)-\frac{1}{2}\frac{\partial g_{ij}}{\partial x^{a}}\frac{dx^{i}}{d\tau}\frac{dx^{j}}{d\tau}=0 \ \ \ \ \ (1)

The resolution of this problem is typical of the approach to relativity: find some equations for ordinary particles (that is, particles with rest mass) that don’t depend on {\tau} or {m} (since photons have no rest mass, we can’t have mass showing up in the equations either), and then assume that these equations are valid for photons too.

We can start with the conserved quantities in the Schwarzschild metric:

\displaystyle   e \displaystyle  = \displaystyle  \left(1-\frac{2GM}{r}\right)\frac{dt}{d\tau}
\displaystyle  l \displaystyle  = \displaystyle  r^{2}\sin^{2}\theta\frac{d\phi}{d\tau}

These quantities both make reference to {\tau} but their ratio doesn’t. If we further assume that motion is in the equatorial plane so that {\theta=\pi/2}, then we get

\displaystyle   \frac{l}{e} \displaystyle  = \displaystyle  \frac{r^{2}\frac{d\phi}{d\tau}}{\left(1-\frac{2GM}{r}\right)\frac{dt}{d\tau}}
\displaystyle  \displaystyle  = \displaystyle  \left(1-\frac{2GM}{r}\right)^{-1}r^{2}\frac{d\phi}{dt}

where {t} is now the Schwarzschild time coordinate. Thus one equation of motion is

\displaystyle  \frac{d\phi}{dt}=\frac{1}{r^{2}}\left(1-\frac{2GM}{r}\right)b \ \ \ \ \ (2)


where {b\equiv l/e}.

A second equation can be derived starting from the Schwarzschild metric definition:

\displaystyle  ds^{2}=-\left(1-\frac{2GM}{r}\right)dt^{2}+\left(1-\frac{2GM}{r}\right)^{-1}dr^{2}+r^{2}d\theta^{2}+r^{2}\sin^{2}\theta d\phi^{2}

For a photon, {ds^{2}=0}, and since we’re restricting motion to the equatorial plane, {\theta=\pi/2} and {d\theta=0}, so

\displaystyle  -\left(1-\frac{2GM}{r}\right)dt^{2}+\left(1-\frac{2GM}{r}\right)^{-1}dr^{2}+r^{2}d\phi^{2}=0

Dividing through by {\left(1-\frac{2GM}{r}\right)dt^{2}} and using 2, we get

\displaystyle   \left[\left(1-\frac{2GM}{r}\right)^{-1}\frac{dr}{dt}\right]^{2}+\left(1-\frac{2GM}{r}\right)^{-1}r^{2}\left(\frac{d\phi}{dt}\right)^{2} \displaystyle  = \displaystyle  1
\displaystyle  \left[\left(1-\frac{2GM}{r}\right)^{-1}\frac{dr}{dt}\right]^{2}+\left(1-\frac{2GM}{r}\right)\frac{b^{2}}{r^{2}} \displaystyle  = \displaystyle  1
\displaystyle  \left[\frac{1}{b}\left(1-\frac{2GM}{r}\right)^{-1}\frac{dr}{dt}\right]^{2}+\frac{1}{r^{2}}\left(1-\frac{2GM}{r}\right) \displaystyle  = \displaystyle  \frac{1}{b^{2}}

This last equation is the equation of motion for {r}.
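As a quick numerical sanity check, we can evaluate the two equations of motion at some radius and substitute the results back into the equatorial metric to confirm that the interval vanishes. The following is only a minimal Python sketch: the function names `photon_rates` and `interval`, and the choice of units in which {GM=1}, are my own illustrative assumptions and do not come from Moore's text.

```python
import math

def photon_rates(r, b, GM=1.0):
    """Return (|dr/dt|, dphi/dt) for a photon at radius r with ratio b = l/e."""
    f = 1.0 - 2.0 * GM / r
    dphi_dt = f * b / r**2                       # equation (2)
    radial_sq = f**2 * (1.0 - f * b**2 / r**2)   # r equation solved for (dr/dt)^2
    if radial_sq < 0.0:
        raise ValueError("a photon with this b cannot reach this r")
    return math.sqrt(radial_sq), dphi_dt

def interval(r, dr_dt, dphi_dt, GM=1.0):
    """ds^2/dt^2 in the equatorial plane; it should vanish for a photon."""
    f = 1.0 - 2.0 * GM / r
    return -f + dr_dt**2 / f + r**2 * dphi_dt**2

dr_dt, dphi_dt = photon_rates(10.0, 6.0)
print(interval(10.0, dr_dt, dphi_dt))  # zero up to floating-point rounding
```

The sign of {dr/dt} is not fixed by the equation (it only determines the square), so the function returns the magnitude and leaves the choice of incoming or outgoing photon to the caller.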

If we interpret this as a sort of kinetic energy plus potential energy equation, we can define the pseudo-kinetic energy as

\displaystyle  K\equiv\left[\frac{1}{b}\left(1-\frac{2GM}{r}\right)^{-1}\frac{dr}{dt}\right]^{2}

and the pseudo-potential energy as

\displaystyle  U\left(r\right)\equiv\frac{1}{r^{2}}\left(1-\frac{2GM}{r}\right)

The potential energy has a zero at {r=2GM} and a maximum found from its derivative

\displaystyle   \frac{dU}{dr} \displaystyle  = \displaystyle  -\frac{2}{r^{3}}+\frac{6GM}{r^{4}}=0
\displaystyle  r \displaystyle  = \displaystyle  3GM
\displaystyle  U_{max} \displaystyle  = \displaystyle  \frac{1}{27G^{2}M^{2}}

The ratio {l/e=b} must be specified in order for the equations of motion to be solved. We can put a geometric interpretation on {b}, which helps us to understand what the equations of motion are saying. First, draw a ray from {r=0} out to infinity, and suppose a photon starts at infinity and follows an initial path that is parallel to this ray, but at a perpendicular distance {d} from it. Now draw a triangle with one side being the line from {r=0} to the photon, a second side being along the ray we drew at the start, and the third side being the perpendicular line of length {d} that goes from the photon’s path to the ray. For large {r}, the first two sides of the triangle will be almost the same length, and the angle {\phi} subtended by the side {d} will be very small, so that {\sin\phi\approx\phi\approx d/r}. Taking the derivative, we get

\displaystyle  \frac{d\phi}{dt}=-\frac{d}{r^{2}}\frac{dr}{dt}

From the equation of motion for {r}, at very large {r}

\displaystyle   \frac{1}{b^{2}}\left(\frac{dr}{dt}\right)^{2} \displaystyle  \approx \displaystyle  \frac{1}{b^{2}}
\displaystyle  \frac{dr}{dt} \displaystyle  \approx \displaystyle  \pm1

This isn’t terribly surprising, since at very large {r}, space is essentially flat, so the speed of the photon is just {dr/dt} which is the speed of light. Putting this into the previous equation, we get (taking {dr/dt=-1} for an incoming photon):

\displaystyle  \frac{d\phi}{dt}=-\frac{d}{r^{2}}\frac{dr}{dt}=\frac{d}{r^{2}}

However, from 2, at very large {r} we have {d\phi/dt\approx b/r^{2}}, so {b=d}; that is, {b} is what is known as the impact parameter. In flat space, {b} is the distance of closest approach to {r=0}.

Returning to the Schwarzschild metric, in the equation of motion for {r} above, we see that if {b^{2}<27G^{2}M^{2}} then {dr/dt} can never be zero, since {U_{max}} is always less than {1/b^{2}}. Thus if the impact parameter is {b<\sqrt{27}GM} and {dr/dt<0} initially (that is, the photon is approaching {r=0}), then it must continue to approach {r=0} forever. The only way it can do this is to spiral in towards the origin.

If {b>\sqrt{27}GM}, there will be two positive values of {r} for which {U\left(r\right)=1/b^{2}} (since this equation is actually a cubic equation in {r}, there are 3 solutions, but one of them is always negative, which we ignore). The values of {r} between these solutions make {U>1/b^{2}} which is not allowed, since that would require the first term, which is a square, to be negative. Thus a photon coming in with {dr/dt<0} approaches until {r} is the higher of the two solutions, at which point {dr/dt=0}. After that, {dr/dt} becomes positive, and the photon recedes in some other direction.

For example, if {b=6GM}, the photon will approach until it reaches the larger root, which is {r=4.453GM} and then recede to infinity.
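The turning points can be found numerically without solving the cubic explicitly. The sketch below uses plain bisection, relying on the facts derived above: {U\left(2GM\right)=0}, {U} peaks at {r=3GM}, and {U\left(b\right)<1/b^{2}}. The function names and the units with {GM=1} are illustrative assumptions of mine.

```python
import math

def U(r, GM=1.0):
    """Pseudo-potential U(r) = (1 - 2GM/r) / r^2."""
    return (1.0 - 2.0 * GM / r) / r**2

def bisect(f, lo, hi, tol=1e-12):
    """Find a root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

def turning_points(b, GM=1.0):
    """Return (inner, outer) radii where U = 1/b^2, or None if b <= sqrt(27) GM."""
    if b <= math.sqrt(27.0) * GM:
        return None  # no barrier: an incoming photon is captured
    f = lambda r: U(r, GM) - 1.0 / b**2
    inner = bisect(f, 2.0 * GM, 3.0 * GM)  # U rises from 0 to U_max here
    outer = bisect(f, 3.0 * GM, b)         # U falls below 1/b^2 before r = b
    return inner, outer

print(turning_points(6.0))  # outer root is close to 4.453 (in units of GM)
print(turning_points(5.0))  # None, since 5 < sqrt(27)
```

A photon arriving from infinity only ever sees the outer root; the inner root would matter for a photon emitted from between {2GM} and {3GM}.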

If we consider a cylinder of light heading in towards the central mass, then all photons with {b<\sqrt{27}GM} will be absorbed by the mass, thus all light within a cylinder of radius {R=\sqrt{27}GM} will be absorbed. Photons with impact parameters outside this value will not be absorbed, but will be deflected by the mass and continue on back out to infinity.

If {b=\sqrt{27}GM}, then there is only one positive value of {r} for which {U=1/b^{2}}, and that is {r=3GM}. When the photon reaches that radius, it can continue either by spiralling inwards or receding to infinity, or by entering an unstable circular orbit.

Statistical mechanics in quantum theory: 3-d harmonic oscillator

References: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 5.37.

Here’s another example of working out the energy of a collection of particles. This time we’ll look at the 3-d harmonic oscillator and consider distinguishable particles only. In this case, the number of particles {n_{j}} in energy level {j} is

\displaystyle  n_{j}=d_{j}e^{-\alpha-\beta E_{j}}

where {\alpha} and {\beta} are the Lagrange multipliers and {E_{j}} is the energy of that level. For the 3-d harmonic oscillator

\displaystyle  E_{j}=\left(j+\frac{3}{2}\right)\hbar\omega

and {j} is the sum of the three quantum numbers {j=j_{x}+j_{y}+j_{z}} in the three rectangular coordinates. The degeneracy of state {j} was worked out to be

\displaystyle  d_{j}=\frac{1}{2}\left(j+1\right)\left(j+2\right)

The total number of particles is then

\displaystyle  N=\frac{e^{-\alpha-3\beta\hbar\omega/2}}{2}\sum_{j=0}^{\infty}\left(j+1\right)\left(j+2\right)e^{-\beta j\hbar\omega} \ \ \ \ \ (1)

The sum can be evaluated directly in Maple, giving the result

\displaystyle   N \displaystyle  = \displaystyle  e^{-\alpha-3\beta\hbar\omega/2}\frac{e^{3\beta\hbar\omega}}{\left(e^{\beta\hbar\omega}-1\right)^{3}}
\displaystyle  \displaystyle  = \displaystyle  e^{-\alpha}\frac{e^{3\beta\hbar\omega/2}}{\left(e^{\beta\hbar\omega}-1\right)^{3}}
\displaystyle  \displaystyle  = \displaystyle  e^{-\alpha}\frac{e^{-3\beta\hbar\omega/2}}{\left(1-e^{-\beta\hbar\omega}\right)^{3}}

However, to see how this is done using the method suggested by Griffiths, we start with the geometric series

\displaystyle  \sum_{j=0}^{\infty}x^{j}=\frac{1}{1-x}

Taking the first derivative:

\displaystyle   \frac{1}{\left(1-x\right)^{2}} \displaystyle  = \displaystyle  \sum_{j=0}^{\infty}jx^{j-1}
\displaystyle  \frac{x}{\left(1-x\right)^{2}} \displaystyle  = \displaystyle  \sum_{j=0}^{\infty}jx^{j}

And the second derivative:

\displaystyle   \frac{2}{\left(1-x\right)^{3}} \displaystyle  = \displaystyle  \sum_{j=0}^{\infty}j\left(j-1\right)x^{j-2}
\displaystyle  \frac{2x^{2}}{\left(1-x\right)^{3}} \displaystyle  = \displaystyle  \sum_{j=0}^{\infty}j\left(j-1\right)x^{j}

We can write the coefficient in 1 as

\displaystyle  \left(j+1\right)\left(j+2\right)=j\left(j-1\right)+4j+2

and define

\displaystyle  x\equiv e^{-\beta\hbar\omega}

Then we have

\displaystyle   \sum_{j=0}^{\infty}\left(j+1\right)\left(j+2\right)e^{-\beta j\hbar\omega} \displaystyle  = \displaystyle  \sum_{j=0}^{\infty}j\left(j-1\right)x^{j}+4\sum_{j=0}^{\infty}jx^{j}+2\sum_{j=0}^{\infty}x^{j}
\displaystyle  \displaystyle  = \displaystyle  \frac{2x^{2}}{\left(1-x\right)^{3}}+4\frac{x}{\left(1-x\right)^{2}}+\frac{2}{1-x}
\displaystyle  \displaystyle  = \displaystyle  \frac{2}{\left(1-x\right)^{3}}\left[x^{2}+2x\left(1-x\right)+\left(1-x\right)^{2}\right]
\displaystyle  \displaystyle  = \displaystyle  \frac{2}{\left(1-x\right)^{3}}

Plugging this back into 1 gives the result above for {N}.
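For anyone without Maple to hand, the closed form {2/\left(1-x\right)^{3}} is easy to check by brute force. This is just a minimal Python sketch; the function names and the truncation at 2000 terms are my own assumptions, chosen so that the neglected tail is negligible for the values of {x} tried.

```python
def level_sum(x, terms=2000):
    """Partial sum of (j+1)(j+2) x^j for j = 0 .. terms-1, with 0 <= x < 1."""
    return sum((j + 1) * (j + 2) * x**j for j in range(terms))

def closed_form(x):
    """The result obtained from the derivatives of the geometric series."""
    return 2.0 / (1.0 - x)**3

for x in (0.1, 0.5, 0.9):
    print(x, level_sum(x), closed_form(x))
```

The two columns should agree to rounding error; for {x} very close to 1 (very high temperature) more terms would be needed.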

Using {\beta=1/k_{B}T}, we can solve for {\alpha} to get

\displaystyle  e^{-\alpha}=N\frac{\left(1-e^{-\hbar\omega/k_{B}T}\right)^{3}}{e^{-3\hbar\omega/2k_{B}T}}

In terms of the chemical potential, we have {\mu=-\alpha k_{B}T} so

\displaystyle  \mu=k_{B}T\left[\ln N+3\ln\left(1-e^{-\hbar\omega/k_{B}T}\right)+\frac{3\hbar\omega}{2k_{B}T}\right]

For the total energy, we have

\displaystyle   E \displaystyle  = \displaystyle  \sum_{j=0}^{\infty}n_{j}E_{j}
\displaystyle  \displaystyle  = \displaystyle  \frac{\hbar\omega}{2}e^{-\alpha}\sum_{j=0}^{\infty}\left(j+1\right)\left(j+2\right)\left(j+\frac{3}{2}\right)e^{-\beta E_{j}}

This sum can be done in a similar way to the previous one (though we need another derivative of the geometric series). However, I can be lazy and get Maple to do the work, giving the result

\displaystyle   E \displaystyle  = \displaystyle  \frac{3\hbar\omega}{2}e^{-\alpha-3\beta\hbar\omega/2}\frac{e^{3\beta\hbar\omega}\left(e^{\beta\hbar\omega}+1\right)}{\left(e^{\beta\hbar\omega}-1\right)^{4}}
\displaystyle  \displaystyle  = \displaystyle  \frac{3\hbar\omega}{2}e^{-\alpha-3\beta\hbar\omega/2}\frac{\left(1+e^{-\beta\hbar\omega}\right)}{\left(1-e^{-\beta\hbar\omega}\right)^{3}\left(1-e^{-\beta\hbar\omega}\right)}

Substituting for {e^{-\alpha}} and {\beta} from above, we get

\displaystyle  E=\frac{3\hbar\omega N}{2}\frac{\left(1+e^{-\hbar\omega/k_{B}T}\right)}{\left(1-e^{-\hbar\omega/k_{B}T}\right)}

In the limit of very low temperatures {k_{B}T\ll\hbar\omega} and

\displaystyle  \mu\left(T\right)\rightarrow k_{B}T\ln N+\frac{3}{2}\hbar\omega\rightarrow\frac{3}{2}\hbar\omega

and

\displaystyle  E\rightarrow\frac{3}{2}\hbar\omega N

Thus all particles settle into the ground state, although even at absolute zero the energy is not zero.

At the other extreme, where {k_{B}T\gg\hbar\omega}

\displaystyle   1+e^{-\hbar\omega/k_{B}T} \displaystyle  \rightarrow \displaystyle  2
\displaystyle  1-e^{-\hbar\omega/k_{B}T} \displaystyle  \rightarrow \displaystyle  \frac{\hbar\omega}{k_{B}T}
\displaystyle  E \displaystyle  \rightarrow \displaystyle  3Nk_{B}T

The equipartition theorem from classical statistical mechanics says that at thermal equilibrium, each quadratic term in the energy contributes {\frac{1}{2}k_{B}T} to the total energy. In the 3-d harmonic oscillator, each particle has 3 quadratic kinetic terms (one per momentum component) and 3 quadratic potential terms (one per coordinate), making a total of 6, giving the energy above.
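The closed form for {E} and its two limits can also be checked by summing over the levels directly, which avoids needing the extra derivative of the geometric series. In the sketch below (Python, with names of my own choosing), the energy per particle is measured in units of {\hbar\omega} and the temperature enters only through the assumed dimensionless variable {t\equiv k_{B}T/\hbar\omega}. The factor {e^{-\alpha}} and the zero-point Boltzmann factor cancel in the ratio {E/N}.

```python
import math

def energy_direct(t, levels=20000):
    """E/(N hbar omega) from explicit sums over levels j = 0 .. levels-1."""
    x = math.exp(-1.0 / t)
    num = den = 0.0
    weight = 1.0                       # running value of x**j
    for j in range(levels):
        d = 0.5 * (j + 1) * (j + 2)    # degeneracy of level j
        num += d * (j + 1.5) * weight  # contribution to the energy sum
        den += d * weight              # contribution to the particle count
        weight *= x
    return num / den

def energy_closed(t):
    """Closed form (3/2)(1 + x)/(1 - x) with x = exp(-1/t)."""
    x = math.exp(-1.0 / t)
    return 1.5 * (1.0 + x) / (1.0 - x)

for t in (0.05, 1.0, 50.0):
    print(t, energy_direct(t), energy_closed(t), 3.0 * t)
```

At {t=0.05} both results sit at the zero-point value of 3/2, and at {t=50} they are within a small fraction of a percent of the classical value {3t}.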

Electron pressure in a neutron star

References: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 5.36.

We can get a rough idea of how electrons fill the available states in the relativistic regime by using the relativistic kinetic energy

\displaystyle  E=\sqrt{p^{2}c^{2}+m^{2}c^{4}}-mc^{2}

in place of the Newtonian formula {E=p^{2}/2m}. This presumably isn’t a true relativistic quantum theory, since the rest of the calculations are still based on the Schrödinger equation, but it’s interesting to see where it leads.

If we take the momentum to be {\mathbf{p}=\hbar\mathbf{k}}, then in the extreme relativistic case, {pc\gg mc^{2}} so {E\approx pc=\hbar ck}. If we use this energy in our previous calculation, the energy of the states in a shell in the first octant in {k}-space is

\displaystyle   dE \displaystyle  = \displaystyle  2\hbar ck\frac{V}{\pi^{3}}\left(\frac{1}{8}4\pi k^{2}\right)dk
\displaystyle  \displaystyle  = \displaystyle  \hbar c\frac{V}{\pi^{2}}k^{3}dk

where the first factor of 2 accounts for the spin degeneracy, {V/\pi^{3}} is the number of states per unit volume of {k}-space (each state occupies a volume {\pi^{3}/V}) and the last factor is the volume of a spherical shell in the first octant.

The total energy is then

\displaystyle   E_{tot} \displaystyle  = \displaystyle  \hbar c\frac{V}{\pi^{2}}\int_{0}^{k_{F}}k^{3}dk
\displaystyle  \displaystyle  = \displaystyle  \frac{\hbar ck_{F}^{4}V}{4\pi^{2}}

where {k_{F}} is the Fermi radius (the maximum value of {k} in the ground state). The Fermi radius depends only on counting the occupied states, so it is the same as in the previous calculation, and we have

\displaystyle   k_{F} \displaystyle  = \displaystyle  \left(\frac{3\pi^{2}Nq}{V}\right)^{1/3}
\displaystyle  E_{tot} \displaystyle  = \displaystyle  \frac{\hbar c}{4\pi^{2}V^{1/3}}\left(3\pi^{2}Nq\right)^{4/3}

Using {V=4\pi R^{3}/3} for a sphere, we get

\displaystyle  E_{tot}=\frac{\hbar c}{4\pi^{2}}\left(3\pi^{2}Nq\right)^{4/3}\left(\frac{3}{4\pi}\right)^{1/3}\frac{1}{R}

We can now apply this to the case of a star as we did with the white dwarf. The gravitational potential energy of the star is the same:

\displaystyle  U=-\frac{3}{5R}GN^{2}m_{n}^{2}

where {G} is the gravitational constant and {m_{n}} is the mass of a nucleon.

At this point, we would like to find the minimum of {E_{tot}+U}, but unlike the non-relativistic case, both terms now have a {1/R} dependence, so there is no minimum. If {E_{tot}+U>0}, the electron pressure is greater than the gravitational force and the star will expand; in the opposite case, gravity wins out and the star collapses. The critical point occurs when {E_{tot}+U=0}, that is when

\displaystyle  E=\left[\frac{\hbar c}{4\pi^{2}}\left(3\pi^{2}Nq\right)^{4/3}\left(\frac{3}{4\pi}\right)^{1/3}-\frac{3}{5}GN^{2}m_{n}^{2}\right]\frac{1}{R}=0

This doesn’t give us a condition on {R}, but the only variable quantity within the brackets is the number of nucleons {N}, which can be translated into the mass of the star. The condition is

\displaystyle   \frac{\hbar c}{4\pi^{2}}\left(3\pi^{2}N_{c}q\right)^{4/3}\left(\frac{3}{4\pi}\right)^{1/3} \displaystyle  = \displaystyle  \frac{3}{5}GN_{c}^{2}m_{n}^{2}
\displaystyle  N_{c} \displaystyle  = \displaystyle  \frac{15\sqrt{5\pi}\left(\hbar c\right)^{3/2}q^{2}}{16G^{3/2}m_{n}^{3}}
\displaystyle  \displaystyle  = \displaystyle  2.047\times10^{57}

(This doesn’t agree with the answer given in Griffiths’s question, but I think he’s made a mistake, since this value gives a Chandrasekhar limit that is closer to the accepted value of 1.4 solar masses.)

Given the solar mass of {1.98892\times10^{30}\mbox{ kg}} and a proton mass of {1.6726\times10^{-27}\mbox{ kg}}, this number of nucleons gives a critical mass of around 1.72 solar masses. This stellar mass is known as the Chandrasekhar limit, since stars more massive than this will collapse beyond the white dwarf stage, possibly forming neutron stars or black holes.
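The arithmetic for {N_{c}} and the critical mass is easy to reproduce. The sketch below is in Python; the values of {\hbar}, {c} and {G} are standard constants that I have supplied, the nucleon and solar masses are the ones quoted above, and {q=1/2} is the white dwarf value assumed for the number of electrons per nucleon. The names are illustrative only.

```python
import math

HBAR = 1.054571817e-34   # J s (assumed standard value)
C = 2.99792458e8         # m/s (assumed standard value)
G = 6.67430e-11          # m^3 kg^-1 s^-2 (assumed standard value)
M_NUCLEON = 1.6726e-27   # kg, proton mass as quoted in the text
M_SUN = 1.98892e30       # kg, solar mass as quoted in the text

def critical_nucleon_number(q=0.5):
    """N_c = 15 sqrt(5 pi) (hbar c)^(3/2) q^2 / (16 G^(3/2) m_n^3)."""
    prefactor = 15.0 * math.sqrt(5.0 * math.pi) / 16.0
    return prefactor * q**2 * (HBAR * C / G)**1.5 / M_NUCLEON**3

n_c = critical_nucleon_number()
print(n_c)                      # about 2.05e57 nucleons
print(n_c * M_NUCLEON / M_SUN)  # about 1.72 solar masses
```

Since {N_{c}\propto q^{2}}, the result is quite sensitive to the assumed composition of the star.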

In a neutron star, we can use the white dwarf calculations to work out the star’s radius, since neutrons are fermions. The only differences in the calculation are: (1) use the neutron mass {m_{n}} in place of the electron mass and (2) take {q=1} instead of {\frac{1}{2}} since there is only one fermion (the neutron itself) per nucleon. The original formula for the radius of a white dwarf is

\displaystyle  R=\left(\frac{9\pi}{4}\right)^{2/3}\frac{\hbar^{2}q^{5/3}}{Gmm_{n}^{2}N^{1/3}}

so we can take this result and multiply it by {2^{5/3}m/m_{n}}. The radius of a solar-mass white dwarf is 7162 km, so the radius of a solar-mass neutron star is

\displaystyle  R=7162\frac{2^{5/3}m}{m_{n}}=12.38\mbox{ km}

For the Fermi energy, we make the same two replacements

\displaystyle   E_{F} \displaystyle  = \displaystyle  \frac{\hbar^{2}k_{F}^{2}}{2m_{n}}
\displaystyle  \displaystyle  = \displaystyle  \frac{\hbar^{2}}{2m_{n}}\left(\frac{3\pi^{2}Nq}{V}\right)^{2/3}
\displaystyle  \displaystyle  = \displaystyle  \frac{\hbar^{2}}{2m_{n}}\left(\frac{9\pi N}{4R^{3}}\right)^{2/3}
\displaystyle  \displaystyle  = \displaystyle  55.7\mbox{ MeV}

where we used {N=1.189\times10^{57}} for the sun (as we found in the white dwarf calculation). The rest mass of a nucleon is about 940 MeV, so we’re not really in the relativistic zone.

Incidentally, the density of a neutron star is about {2.5\times10^{17}\mbox{kg m}^{-3}} which is more than {10^{14}} times the density of water.
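The neutron star numbers can be reproduced in a few lines. This is a rough Python sketch using the white dwarf radius and the value of {N} quoted above; the electron mass, {\hbar} and the MeV conversion factor are standard values that I have supplied, and the function names are my own. Depending on the constants used, the Fermi energy comes out within about a percent of the figure quoted.

```python
import math

HBAR = 1.054571817e-34   # J s (assumed standard value)
M_E = 9.1093837e-31      # kg, electron mass (assumed standard value)
M_N = 1.6726e-27         # kg, nucleon mass as quoted in the text
M_SUN = 1.98892e30       # kg, solar mass as quoted in the text
MEV = 1.602176634e-13    # joules per MeV (assumed standard value)
N_SUN = 1.189e57         # nucleons in the sun, from the white dwarf post
R_WD_KM = 7162.0         # km, radius of a solar-mass white dwarf

def neutron_star_radius_km():
    """Scale the white dwarf radius by 2^(5/3) m / m_n."""
    return R_WD_KM * 2.0**(5.0 / 3.0) * M_E / M_N

def fermi_energy_mev(radius_m, n=N_SUN):
    """E_F = hbar^2 / (2 m_n) * (9 pi N / (4 R^3))^(2/3), in MeV."""
    k_f_squared = (9.0 * math.pi * n / (4.0 * radius_m**3))**(2.0 / 3.0)
    return HBAR**2 * k_f_squared / (2.0 * M_N) / MEV

def mean_density(radius_m, mass=M_SUN):
    """Mean density in kg per cubic metre of a uniform sphere."""
    return mass / (4.0 / 3.0 * math.pi * radius_m**3)

r_km = neutron_star_radius_km()
print(r_km)                            # about 12.4 km
print(fermi_energy_mev(r_km * 1e3))    # roughly 56 MeV
print(mean_density(r_km * 1e3))        # about 2.5e17 kg m^-3
```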

Electron gas in a 2-d infinite square well

References: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 5.34.

By analogy with the ideal gas in the 3-d infinite square well, we can work out the Fermi energy for a 2-d infinite square well. In this case, {k}-space occupies the first quadrant of the plane, and each cell in {k}-space occupies an area of {\pi^{2}/l_{x}l_{y}=\pi^{2}/A}, where {A} is the area of the square well and {l_{x}} and {l_{y}} are its dimensions.

If the space is filled with electrons in the ground state, they will fill the first quadrant of a circle with radius {k_{F}}. As before, the number of states required if we have {N} atoms with {q} electrons per atom is {Nq/2} (divided by 2 because of the spin degeneracy) so the total area required is

\displaystyle   \frac{Nq}{2}\frac{\pi^{2}}{A} \displaystyle  = \displaystyle  \frac{\pi k_{F}^{2}}{4}
\displaystyle  k_{F} \displaystyle  = \displaystyle  \sqrt{2\sigma\pi}
\displaystyle  \sigma \displaystyle  \equiv \displaystyle  \frac{Nq}{A}

The Fermi energy is then

\displaystyle   E_{F} \displaystyle  = \displaystyle  \frac{\hbar^{2}k_{F}^{2}}{2m}
\displaystyle  \displaystyle  = \displaystyle  \frac{\hbar^{2}\sigma\pi}{m}
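The quarter-circle area used above is a continuum approximation to a count of discrete states {\left(n_{x},n_{y}\right)} with {k_{x}=n_{x}\pi/l_{x}} and {k_{y}=n_{y}\pi/l_{y}}. To see how good the approximation is, we can count the states inside the Fermi circle directly. In this Python sketch, `n_max` stands for the assumed dimensionless radius {k_{F}l/\pi} for a square well of side {l}; the names are illustrative.

```python
import math

def count_states(n_max):
    """Count pairs (n_x, n_y) of positive integers inside the quarter circle."""
    limit = n_max * n_max
    count = 0
    for nx in range(1, int(n_max) + 1):
        for ny in range(1, int(n_max) + 1):
            if nx * nx + ny * ny <= limit:
                count += 1
    return count

def quarter_circle_estimate(n_max):
    """Continuum estimate: area of a quarter circle of radius n_max."""
    return math.pi * n_max**2 / 4.0

for n_max in (10, 50, 200):
    exact = count_states(n_max)
    print(n_max, exact, quarter_circle_estimate(n_max))
```

The exact count falls slightly below the area estimate because states with {n_{x}=0} or {n_{y}=0} are excluded, but the relative difference shrinks as the circle grows, so the formula for {E_{F}} is accurate for a macroscopic number of electrons.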
