Helium atom: electron-electron interaction

Required math: calculus

Required physics: Schrödinger equation

Reference: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 5.11.

The crude model of the helium atom ignored the interaction between the two electrons. As a result, the ground state energy of ${-109\mbox{ eV}}$ as predicted by the model was quite far off the measured value of ${-78.975\mbox{ eV}}$. One way of computing a correction to the model is to take the interaction-free wave function and use it to calculate the mean value of the interaction term. That is, we take the ground state wave function to be

$\displaystyle \psi_{0}\left(\mathbf{r}_{1},\mathbf{r}_{2}\right)=\psi_{100}\left(\mathbf{r}_{1}\right)\psi_{100}\left(\mathbf{r}_{2}\right)$

The two functions on the RHS aren’t quite the same as the hydrogen ground state functions, however. Recall that the ground state of hydrogen is

$\displaystyle \psi_{100H}=\frac{1}{\sqrt{\pi a^{3}}}e^{-r/a}$

where ${a}$ is the Bohr radius:

$\displaystyle a=\frac{4\pi\epsilon_{0}\hbar^{2}}{me^{2}}$

The ${e^{2}}$ in the denominator comes from the interaction energy between the single electron in hydrogen and the single proton in the nucleus. For helium, each electron interacts with the two protons in the nucleus, so the ‘Bohr radius for helium’ has a factor of ${2e^{2}}$ in place of the ${e^{2}}$ for hydrogen. Thus we must replace ${e^{2}}$ by ${2e^{2}}$ to get the helium wave function:

$\displaystyle \psi_{100He}=\frac{1}{\sqrt{\pi\left(\frac{a}{2}\right)^{3}}}e^{-2r/a}$

giving:

$\displaystyle \psi_{0}\left(\mathbf{r}_{1},\mathbf{r}_{2}\right)=\frac{8}{\pi a^{3}}e^{-2\left(r_{1}+r_{2}\right)/a}$

We can now use this function to calculate the mean value of the interaction term ${\frac{1}{4\pi\epsilon_{0}}\frac{e^{2}}{\left|\mathbf{r}_{1}-\mathbf{r}_{2}\right|}}$. That is, we want the integral:

$\displaystyle \frac{e^{2}}{4\pi\epsilon_{0}}\left\langle \frac{1}{\left|\mathbf{r}_{1}-\mathbf{r}_{2}\right|}\right\rangle =\frac{e^{2}}{4\pi\epsilon_{0}}\left(\frac{8}{\pi a^{3}}\right)^{2}\int\int e^{-4\left(r_{1}+r_{2}\right)/a}\frac{d^{3}\mathbf{r}_{1}d^{3}\mathbf{r}_{2}}{\left|\mathbf{r}_{1}-\mathbf{r}_{2}\right|}$

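Isolating the mean value factor and expanding the modulus in the denominator with the law of cosines, we get, if we take the ${z}$ axis of the ${\mathbf{r}_{2}}$ coordinates to lie along ${\mathbf{r}_{1}}$ (so that ${\theta_{2}}$ is the angle between the two vectors):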

$\displaystyle \left\langle \frac{1}{\left|\mathbf{r}_{1}-\mathbf{r}_{2}\right|}\right\rangle =\left(\frac{8}{\pi a^{3}}\right)^{2}\int\int e^{-4\left(r_{1}+r_{2}\right)/a}\frac{d^{3}\mathbf{r}_{1}d^{3}\mathbf{r}_{2}}{\sqrt{r_{1}^{2}+r_{2}^{2}-2r_{1}r_{2}\cos\theta_{2}}}$

Each integration element has the form

$\displaystyle d^{3}\mathbf{r}_{i}=r_{i}^{2}\sin\theta_{i}dr_{i}d\theta_{i}d\phi_{i}$

so we get

$\displaystyle \left\langle \frac{1}{\left|\mathbf{r}_{1}-\mathbf{r}_{2}\right|}\right\rangle =\left(\frac{8}{\pi a^{3}}\right)^{2}\int e^{-4\left(r_{1}+r_{2}\right)/a}\frac{r_{1}^{2}\sin\theta_{1}r_{2}^{2}\sin\theta_{2}dr_{1}d\theta_{1}d\phi_{1}dr_{2}d\theta_{2}d\phi_{2}}{\sqrt{r_{1}^{2}+r_{2}^{2}-2r_{1}r_{2}\cos\theta_{2}}}$

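The two ${\phi_{i}}$ integrals just give a factor of ${4\pi^{2}}$, so we can do those and then the integral over ${\theta_{2}}$. The ${\theta_{2}}$ integral evaluates to ${\left(r_{1}+r_{2}-\left|r_{1}-r_{2}\right|\right)/r_{1}r_{2}}$, which is ${2/r_{1}}$ for ${r_{2}<r_{1}}$ and ${2/r_{2}}$ for ${r_{2}>r_{1}}$, so the ${r_{2}}$ integral splits into two ranges. This gives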

$\displaystyle \begin{aligned}\left\langle \frac{1}{\left|\mathbf{r}_{1}-\mathbf{r}_{2}\right|}\right\rangle  & =\frac{512}{a^{6}}\int_{0}^{r_{1}}r_{1}r_{2}^{2}\sin\theta_{1}e^{-4\left(r_{1}+r_{2}\right)/a}dr_{2}d\theta_{1}dr_{1}+\\  & \quad\frac{512}{a^{6}}\int_{r_{1}}^{\infty}r_{1}^{2}r_{2}\sin\theta_{1}e^{-4\left(r_{1}+r_{2}\right)/a}dr_{2}d\theta_{1}dr_{1}\end{aligned}$

We can now do the remaining integrals in the order ${r_{2}}$, ${\theta_{1}}$ and ${r_{1}}$, where the ${r_{2}}$ integral is done in two parts: the first from ${0}$ to ${r_{1}}$ and the second from ${r_{1}}$ to ${\infty}$. The result is

$\displaystyle \left\langle \frac{1}{\left|\mathbf{r}_{1}-\mathbf{r}_{2}\right|}\right\rangle =\frac{5}{4a}$
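As a numerical cross-check of this result, the remaining radial integrals can be evaluated with a simple trapezoid rule. The sketch below is only an illustration: the function name, the cutoff radius and the grid size are my own choices, and lengths are measured in units where ${a=1}$ by default.

```python
import math

def mean_inverse_separation(a=1.0, r_max_in_a=12.0, n=6000):
    """Trapezoid-rule estimate of <1/|r1 - r2|> for the helium product state."""
    h = r_max_in_a * a / n
    r = [i * h for i in range(n + 1)]
    inner = [x * x * math.exp(-4 * x / a) for x in r]  # r2^2 e^{-4 r2/a}, for r2 < r1
    outer = [x * math.exp(-4 * x / a) for x in r]      # r2 e^{-4 r2/a}, for r2 > r1

    # Running integral of the inner integrand from 0 up to each r1.
    cum_in = [0.0] * (n + 1)
    for i in range(1, n + 1):
        cum_in[i] = cum_in[i - 1] + 0.5 * h * (inner[i - 1] + inner[i])

    # Integral of the outer integrand from each r1 up to the cutoff.
    cum_out = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        cum_out[i] = cum_out[i + 1] + 0.5 * h * (outer[i] + outer[i + 1])

    # Integrand of the final r1 integral (both ranges of r2 combined).
    g = [r[i] * math.exp(-4 * r[i] / a) * cum_in[i]
         + r[i] ** 2 * math.exp(-4 * r[i] / a) * cum_out[i]
         for i in range(n + 1)]
    total = sum(0.5 * h * (g[i] + g[i + 1]) for i in range(n))

    # 512/a^6 from the text, times 2 from integrating sin(theta_1) over 0..pi.
    return 1024.0 / a ** 6 * total

print(mean_inverse_separation())       # close to 5/4 when a = 1
print(mean_inverse_separation(a=2.0))  # close to 5/8 when a = 2
```

The cutoff at twelve Bohr radii is safe because the integrand falls off as ${e^{-4r/a}}$.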

The correction to the energy is then

$\displaystyle \Delta E=\frac{5}{4a}\frac{e^{2}}{4\pi\epsilon_{0}}=5.4497\times10^{-18}\mbox{ Joules}=34\mbox{ eV}$

This adjusts the model energy to ${-109+34=-75\mbox{ eV}}$, which is much closer to the experimental value of ${-78.975\mbox{ eV}}$.
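The arithmetic in the energy correction can be reproduced in a few lines. This is a sketch only: the SI values for the electron charge, ${\epsilon_{0}}$ and the Bohr radius are standard tabulated values that I have supplied, and the variable names are arbitrary.

```python
import math

E_CHARGE = 1.602176634e-19       # electron charge in C (assumed tabulated value)
EPSILON_0 = 8.8541878128e-12     # vacuum permittivity in F/m (assumed tabulated value)
BOHR_RADIUS = 5.29177210903e-11  # Bohr radius a in m (assumed tabulated value)

def interaction_correction():
    """Return Delta E = (5/4a) e^2/(4 pi eps0) in joules and in eV."""
    joules = 1.25 * E_CHARGE ** 2 / (4 * math.pi * EPSILON_0 * BOHR_RADIUS)
    return joules, joules / E_CHARGE

delta_j, delta_ev = interaction_correction()
print(f"Delta E = {delta_j:.4e} J = {delta_ev:.1f} eV")
print(f"corrected ground state energy = {-108.8 + delta_ev:.1f} eV")
```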

Helium atom: parahelium and orthohelium

Required math: calculus

Required physics: Schrödinger equation

Reference: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 5.10.

We looked at a very crude model of the helium atom, in which we ignored the interaction between the two electrons. In that model, the spatial wave function for helium is just the product of two hydrogen-like functions. In all but the ground state, we can construct totally symmetric and totally antisymmetric combinations of the wave functions. Since a system composed of two fermions (which the electrons are) must have an overall antisymmetric wave function, we need to combine symmetric spatial functions with antisymmetric spin functions, and vice versa.

States with an antisymmetric spin function are known as parahelium, and states with a symmetric spin function are known as orthohelium. Since the ground state always has a symmetric spatial function, it is always parahelium, but the excited states all come in both forms.

Because of the exchange force, the average distance between two identical particles in a symmetric spatial state is less than for an antisymmetric spatial state. Since two electrons have a higher interaction energy if they are closer together, we’d expect parahelium (symmetric spatial and antisymmetric spin) energies to be higher than the corresponding orthohelium (antisymmetric spatial and symmetric spin) states, and this is, in fact, observed experimentally.

If electrons were bosons, the situation would be reversed and the ground state would be orthohelium, since we must now combine a symmetric spatial state with a symmetric spin state. When we add the two spins, the possible states are the symmetric triplet state (with total spin 1) and the antisymmetric singlet state (with total spin 0). Thus the ground state is a triplet. In the excited states, every symmetric spatial state must likewise pair with a symmetric spin state, so all such states are triplets, while the antisymmetric spatial states pair with the singlet. The orthohelium states would now have higher energy, since they contain the symmetric spatial states.

If electrons were distinguishable, then there are no constraints on combinations of spatial and spin functions, so every energy level is four-fold degenerate, and there should be, on average, no difference in energy levels between ortho and parahelium. (I say ‘on average’, because atoms with a symmetric spatial function will still have higher energy, but in a large collection of atoms, there would be roughly equal numbers of ortho and parahelium atoms with any given spatial function, so the average energy of each group would be the same.)

Helium atom

Required math: calculus

Required physics: Schrödinger equation

Reference: Griffiths, David J. (2005), Introduction to Quantum Mechanics, 2nd Edition; Pearson Education – Problem 5.9.

So far, we’ve looked at identical particles only in the non-interacting case. In real life, of course, most particles interact with each other, so the Schrödinger equation must take this into account. For an atom with ${Z}$ protons and ${Z}$ electrons, each electron experiences an electric interaction with the nucleus, and with all the other electrons. The Schrödinger equation in this case is therefore

$\displaystyle H\psi=E\psi$

where

$\displaystyle H=\sum_{j=1}^{Z}\left[-\frac{\hbar^{2}}{2m}\nabla_{j}^{2}-\frac{Ze^{2}}{4\pi\epsilon_{0}r_{j}}\right]+\frac{1}{2}\frac{1}{4\pi\epsilon_{0}}\sum_{j\ne k}\frac{e^{2}}{\left|\mathbf{r}_{j}-\mathbf{r}_{k}\right|}$

The first term in the first sum is the kinetic energy of the electrons (we’re assuming the atom as a whole is at rest, so there is no contribution from the kinetic energy of the nucleus), the second term gives the interaction between the electrons and the nucleus, and the last sum gives the electron-electron interactions.

Needless to say, solving this equation is very difficult and in fact, there is no known exact solution except in the case of hydrogen, where ${Z=1}$ and the last term vanishes.

If we could find a solution, however, we’d need to form a completely anti-symmetric function from it, since electrons are fermions. The general solution would have the form

$\displaystyle \psi=\psi(\mathbf{r}_{1},\mathbf{r}_{2},\mathbf{r}_{3},...,\mathbf{r}_{Z})$

Since the hamiltonian is completely symmetric with respect to the ${Z}$ vectors ${\mathbf{r}_{i}}$, any permutation of the vectors in the solution is also a solution, so any linear combination of solutions is also a solution, and will have the same energy.

Since the anti-symmetry results from interchanging position vectors, we can apply the same process as that used to find anti-symmetric wave functions from stationary states. This time, however, we apply the anti-symmetrization to the order of the vectors in the argument list, rather than to individual stationary states. Thus, we’d get

$\displaystyle \psi_{f}=A\left[\sum_{even}\psi(\mathbf{r}_{1},\mathbf{r}_{2},\mathbf{r}_{3},...,\mathbf{r}_{Z})-\sum_{odd}\psi(\mathbf{r}_{1},\mathbf{r}_{2},\mathbf{r}_{3},...,\mathbf{r}_{Z})\right] \ \ \ \ \ (1)$

where the first sum is over all even permutations of the vectors and the second is over all odd permutations. The constant ${A}$ is determined by normalization.

If we formed an anti-symmetric function of the position vectors, then the spin portion of the wave function would have to be symmetric. Conversely, if we formed a symmetric function of position, then the spin would have to be anti-symmetric.

In the case of bosons we just replace the minus sign by a plus sign, which results in a sum over all permutations

$\displaystyle \psi_{b}=A\sum_{all}\psi(\mathbf{r}_{1},\mathbf{r}_{2},\mathbf{r}_{3},...,\mathbf{r}_{Z})$

This would, of course, not apply to the hamiltonian above, since electrons are not bosons, but if we did have a hamiltonian that applied to a collection of bosons, we could use the same procedure to generate a symmetric wave function. In the boson case, we would need to pair a symmetric spatial wave function with a symmetric spin function, and an anti-symmetric spatial function with an anti-symmetric spin function.
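The permutation sums for ${\psi_{f}}$ and ${\psi_{b}}$ are easy to try out numerically. In the sketch below the helper names and the trial functions are hypothetical, the positions are plain numbers rather than vectors, and the normalization constant is simply set to ${A=1}$.

```python
import itertools
import math

def permutation_sign(perm):
    """Return +1 for an even permutation of range(n) and -1 for an odd one."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:  # each inversion flips the parity
                sign = -sign
    return sign

def symmetrize(psi, antisymmetric=True):
    """Build the fermion sum (antisymmetric=True) or the boson sum from psi, with A = 1."""
    def combined(*positions):
        total = 0.0
        for perm in itertools.permutations(range(len(positions))):
            weight = permutation_sign(perm) if antisymmetric else 1
            total += weight * psi(*(positions[k] for k in perm))
        return total
    return combined

def psi(x1, x2, x3):
    """Hypothetical trial function with no particular symmetry."""
    return x2 * x3 ** 2 * math.exp(-(x1 ** 2 + x2 ** 2 + x3 ** 2))

def ground(x1, x2):
    """Product of two identical one-particle functions."""
    return math.exp(-x1) * math.exp(-x2)

psi_f = symmetrize(psi)
psi_b = symmetrize(psi, antisymmetric=False)
print(psi_f(0.3, 1.1, -0.7), psi_f(1.1, 0.3, -0.7))  # equal magnitude, opposite sign
print(psi_b(0.3, 1.1, -0.7), psi_b(1.1, 0.3, -0.7))  # equal
print(symmetrize(ground)(0.4, 1.3))                  # antisymmetrized product vanishes
```

The last line shows that antisymmetrizing a product of two identical functions gives zero.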

The simplest atom larger than hydrogen is helium, with 2 electrons. In this case, the hamiltonian is

$\displaystyle H=-\frac{\hbar^{2}}{2m}\nabla_{1}^{2}-\frac{2e^{2}}{4\pi\epsilon_{0}r_{1}}-\frac{\hbar^{2}}{2m}\nabla_{2}^{2}-\frac{2e^{2}}{4\pi\epsilon_{0}r_{2}}+\frac{1}{4\pi\epsilon_{0}}\frac{e^{2}}{\left|\mathbf{r}_{1}-\mathbf{r}_{2}\right|}$

A very crude approximation is to ignore the electron-electron interaction. Although this doesn’t give very accurate results, it does at least allow us to solve the Schrödinger equation exactly, since ${r_{1}}$ and ${r_{2}}$ are separated in the hamiltonian. The solution is just the product of hydrogen-like wave functions, so the ground state would be

$\displaystyle \psi_{0}\left(\mathbf{r}_{1},\mathbf{r}_{2}\right)=\psi_{100}\left(\mathbf{r}_{1}\right)\psi_{100}\left(\mathbf{r}_{2}\right)$

where the wave functions on the RHS now each have an energy of

$\displaystyle E_{1}=Z^{2}E_{1H}=4\times\left(-13.6\mbox{ eV}\right)=-54.4\mbox{ eV}$

The total energy is just the sum of the two energies for each electron, so

$\displaystyle E_{1He}=-108.8\mbox{ eV}$

The actual energy is measured to be ${-78.975\mbox{ eV}}$ so this crude model isn’t very good.

Since this ground state consists of the product of two identical functions, we can’t anti-symmetrize it (${\psi_{f}}$ as calculated from equation (1) just gives zero), so to get an anti-symmetric total wave function, we have to multiply the spatial function by an anti-symmetric spin function.

Using this crude model, we can investigate the behaviour of a helium atom with both electrons in the ${n=2}$ state. Experimentally, what happens in this case is that one electron decays back down to the ground state and instead of emitting a photon, it imparts the energy from this decay to the other electron. The ${n=2}$ state has an energy of ${E_{2}=Z^{2}E_{2H}=4\times\left(-13.6/4\right)=-13.6\mbox{ eV}}$ so the energy emitted by the decaying electron is ${-13.6-\left(-54.4\right)=+40.8\mbox{ eV}}$. Transferring this energy to the other electron gives it an energy of ${40.8-13.6=+27.2\mbox{ eV}}$. Since this energy is positive, the electron leaves the atom, resulting in a helium ion.
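The energy bookkeeping in this paragraph can be written out as a short calculation. This is just a sketch in the crude non-interacting model, with a helper name of my own choosing.

```python
E1_HYDROGEN = -13.6  # hydrogen ground state energy in eV, as used in the text

def hydrogen_like_energy(n, Z):
    """Energy in eV of level n for a single electron bound to a nucleus of charge Z."""
    return Z ** 2 * E1_HYDROGEN / n ** 2

Z = 2
e1 = hydrogen_like_energy(1, Z)   # ground state of one electron
e2 = hydrogen_like_energy(2, Z)   # n = 2 state of one electron
released = e2 - e1                # energy given up by the decaying electron
final = e2 + released             # energy of the electron that receives it
print(f"E1 = {e1:.1f} eV, E2 = {e2:.1f} eV")
print(f"released = {released:+.1f} eV, final = {final:+.1f} eV")
```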

The spectrum of the helium ion can be calculated from the Rydberg formula:

$\displaystyle \frac{1}{\lambda}=R\left(\frac{1}{n_{f}^{2}}-\frac{1}{n_{i}^{2}}\right)$

where ${n_{f}}$ is the final state and ${n_{i}}$ is the initial state of the electron, and ${R}$ is the Rydberg constant, which for helium is ${Z^{2}R_{H}=4R_{H}}$, or 4 times the Rydberg constant for hydrogen. As a result, the spectrum of the helium ion is the same as that of hydrogen, except all the wavelengths are a quarter of those in hydrogen.
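To see the factor of a quarter explicitly, we can evaluate the Rydberg formula for the same transition in hydrogen and in the helium ion. This sketch uses the infinite-mass Rydberg constant as an approximation to ${R_{H}}$ (an assumption on my part, since it ignores the small reduced-mass correction), and the function name is arbitrary.

```python
RYDBERG = 1.0973731568e7  # 1/m; infinite-mass value, used here as an approximation to R_H

def wavelength_nm(n_i, n_f, Z=1):
    """Wavelength in nm of the n_i -> n_f transition for a one-electron ion of charge Z."""
    inverse_wavelength = Z ** 2 * RYDBERG * (1.0 / n_f ** 2 - 1.0 / n_i ** 2)
    return 1e9 / inverse_wavelength

hydrogen = wavelength_nm(3, 2)          # Balmer-alpha line of hydrogen
helium_ion = wavelength_nm(3, 2, Z=2)   # same transition in the helium ion
print(f"H: {hydrogen:.1f} nm, He+: {helium_ion:.1f} nm, ratio = {helium_ion / hydrogen:.2f}")
```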

Perihelion shift: numerical solution

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 11; Problem 11.11.

The general case of an object orbiting a much larger mass is treated by the equations for radius and angle as functions of proper time:

$\displaystyle \begin{aligned}\frac{1}{2}\left(\frac{dr}{d\tau}\right)^{2}+\frac{1}{2}\frac{l^{2}}{r^{2}}-GM\left(\frac{1}{r}+\frac{l^{2}}{r^{3}}\right) & =\frac{1}{2}\left(e^{2}-1\right)\\ r^{2}\frac{d\phi}{d\tau} & =l\end{aligned}$

Taking the derivative of the first equation with respect to ${\tau}$ and cancelling off the common factor of ${dr/d\tau}$ that results, gives us

$\displaystyle \frac{d^{2}r}{d\tau^{2}}=-\frac{GM}{r^{2}}+\frac{l^{2}}{r^{3}}-\frac{3GMl^{2}}{r^{4}}$

This equation, together with the one for ${\phi}$ above, make up a coupled system of two ODEs. In general, there is no analytic solution to them, but they can be solved numerically to give parametric equations for ${r}$ and ${\phi}$ as functions of ${\tau}$, which can then be combined to give ${r\left(\phi\right)}$ which, when plotted, gives us a diagram of the orbit.

Since the ${r}$ equation is second order, we need to specify ${r}$ and ${dr/d\tau}$ at ${\tau=0}$, while for the first order ${\phi}$ equation, we need to specify only ${\phi\left(0\right)}$. We can also specify ${d\phi/d\tau}$ at ${\tau=0}$ and use this, together with ${r\left(0\right)}$ to determine ${l=r^{2}\left(0\right)\phi^{\prime}\left(0\right)}$. To specify a value for ${\phi^{\prime}\left(0\right)}$, we can take a certain fraction ${\phi_{0}}$ of the angular speed ${\phi^{\prime}}$ for a circular orbit of radius ${r\left(0\right)}$. For a circular orbit

$\displaystyle \begin{aligned}l^{2}=r^{4}\left(\phi^{\prime}\right)^{2} & =\frac{r^{2}GM}{r-3GM}\\ \phi^{\prime} & =\frac{\sqrt{GM}}{r\sqrt{r-3GM}}=\frac{1}{r\sqrt{r-3}}\end{aligned}$

where at the end, we’ve written ${r}$ in units of ${GM}$. Thus taking a fraction ${\phi_{0}}$ of this angular speed gives

$\displaystyle \begin{aligned}l & =r^{2}\left(0\right)\phi^{\prime}\left(0\right)\\  & =\phi_{0}\frac{r\left(0\right)}{\sqrt{r\left(0\right)-3}}\end{aligned}$

If we write ${r}$, ${l}$ and ${\tau}$ in units of ${GM}$, then the equations become

$\displaystyle \begin{aligned}r^{2}\frac{d\phi}{d\tau} & =l\\ \frac{d^{2}r}{d\tau^{2}} & =-\frac{1}{r^{2}}+\frac{l^{2}}{r^{3}}-\frac{3l^{2}}{r^{4}}\end{aligned}$

There are various ways of integrating these equations numerically, but if you have access to mathematical software such as Maple or Mathematica, the easiest way is to get them to do the hard work for you. A Maple procedure that solves the equations and then draws a plot of the result is as follows:

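# Integrate the orbit equations (r, l and tau in units of GM) and plot r(phi).
# r0 = r(0), rp0 = r'(0), phi0 = fraction of the circular angular speed,
# tup = upper limit of tau.
path := proc (r0, rp0, phi0, tup)
  local sys, l, sol, R, Phi;
  l := r0*phi0/sqrt(r0-3);
  sys := {phi(0) = 0, r(0) = r0, (D(r))(0) = rp0,
    diff(phi(t), t) = l/r(t)^2,
    diff(r(t), t, t) = -1/r(t)^2+l^2/r(t)^3-3*l^2/r(t)^4};
  sol := dsolve(sys, {phi(t), r(t)}, type = numeric,
    output = listprocedure);
  # Extract the numerical procedures for phi(tau) and r(tau).
  Phi := subs(sol, phi(t));
  R := subs(sol, r(t));
  plot([R(t), Phi(t), t = 0 .. tup], coords = polar,
    axiscoordinates = polar);
end proc: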
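For readers without Maple, the same system can be integrated with a hand-written fourth-order Runge-Kutta step. The Python sketch below is not a translation of what Maple's dsolve does internally. The function names, the fixed step size and the way perihelia are detected are all my own assumptions, and it returns the path as a list of ${\left(r,r^{\prime},\phi\right)}$ values instead of plotting.

```python
import math

def integrate_orbit(r0, phi0, tau_max, dtau=0.5, rp0=0.0):
    """Fixed-step RK4 integration of the orbit equations, with r, l and tau in units of GM."""
    l = phi0 * r0 / math.sqrt(r0 - 3.0)

    def deriv(state):
        r, v, phi = state  # v stands for dr/dtau
        return (v, -1.0 / r ** 2 + l ** 2 / r ** 3 - 3.0 * l ** 2 / r ** 4, l / r ** 2)

    state = (r0, rp0, 0.0)
    path = [state]
    for _ in range(int(tau_max / dtau)):
        k1 = deriv(state)
        k2 = deriv(tuple(s + 0.5 * dtau * k for s, k in zip(state, k1)))
        k3 = deriv(tuple(s + 0.5 * dtau * k for s, k in zip(state, k2)))
        k4 = deriv(tuple(s + dtau * k for s, k in zip(state, k3)))
        state = tuple(s + dtau / 6.0 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        path.append(state)
    return path

def perihelion_angles(path):
    """Angles phi at which the sampled r passes through a local minimum."""
    return [path[i][2] for i in range(1, len(path) - 1)
            if path[i][0] < path[i - 1][0] and path[i][0] <= path[i + 1][0]]

path = integrate_orbit(50.0, 0.75, 2000.0)
print("minimum r:", min(s[0] for s in path))
print("perihelion angles:", perihelion_angles(path))
```

If the orbit were a closed ellipse, the first perihelion would fall at ${\phi=\pi}$; the amount by which the printed angle exceeds ${\pi}$ is the precession over half an orbit.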


We can choose the starting point at ${\tau=0}$ to be a stationary point (where ${r^{\prime}=0}$) and label this angle as ${\phi=0}$, so ${r^{\prime}\left(0\right)=0}$. These two initial conditions will always be true, so the analysis involves changing ${r\left(0\right)}$ and ${\phi_{0}}$. Here are a few plots of the results. In each plot, the red curve shows ${r\left(\phi\right)}$, with ${r}$ in units of ${GM}$.

First, we look at ${r\left(0\right)=50}$ and ${\phi_{0}=0.75}$. Here are 3 orbits:

This displays the precession quite nicely. The starting point is an aphelion (greatest distance) point. The perihelion occurs at about ${r_{min}\approx18}$.

Now suppose we decrease ${\phi_{0}}$ to 0.65. The result is:

Two effects are noticeable here. First, the object gets closer to the central mass (with ${r_{min}\approx11}$). This might be expected, since the object has a smaller tangential velocity so would be pulled further towards the centre. Second, the perihelion shift per orbit is larger. Interpreting this is a bit trickier, since the formula we got earlier for almost-circular orbits (${\Delta\phi=6\pi GM/r_{c}}$) can’t be used for such non-circular orbits. Probably a simple explanation is that as the object gets closer to the centre, it gets a bigger kick from the highly curved space there which distorts its orbit more, causing a larger perihelion shift.
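The values of ${r_{min}}$ read off the plots can be checked without integrating anything. Setting ${dr/d\tau=0}$ in the energy equation (in units of ${GM}$) and multiplying by ${r^{3}}$ gives a cubic in ${r}$, and since ${r\left(0\right)}$ is one root, the other two follow from a quadratic. The sketch below does this; the function name is hypothetical and it assumes the orbit starts with ${r^{\prime}\left(0\right)=0}$, as in the plots.

```python
import math

def turning_points(r0, phi0):
    """Other two roots of dr/dtau = 0 (units of GM) for an orbit started at r0 with r'(0) = 0.

    Returns an empty list if there are no other real roots (no second turning point).
    """
    l2 = phi0 ** 2 * r0 ** 2 / (r0 - 3.0)  # l^2 from the circular-orbit fraction phi0
    # The constant (e^2 - 1)/2, evaluated at the starting point where dr/dtau = 0.
    energy = l2 / (2 * r0 ** 2) - 1.0 / r0 - l2 / r0 ** 3
    # Cubic: energy*r^3 + r^2 - (l2/2)*r + l2 = 0. Dividing out (r - r0)
    # leaves energy*r^2 + b*r + c = 0 with the coefficients below.
    b = 1.0 + energy * r0
    c = -l2 / r0
    disc = b * b - 4 * energy * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted([(-b + root) / (2 * energy), (-b - root) / (2 * energy)])

for phi0 in (0.75, 0.65, 1.0):
    print(phi0, turning_points(50.0, phi0))
```

For the bound orbits here, the larger of the two returned roots is the second turning point of the orbit (the perihelion when ${\phi_{0}<1}$). The smaller root lies inside the potential barrier and is not reached.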

If we decrease ${\phi_{0}}$ further, we soon come to a point where the object spirals in to ${r=0}$. Increasing ${\phi_{0}}$ from 0.75 makes the orbit more nearly circular with smaller ${\Delta\phi}$ until, at ${\phi_{0}=1}$, we get an exactly circular orbit. Increasing ${\phi_{0}}$ beyond 1 results in the object’s starting point becoming its perihelion instead of aphelion, with the aphelion getting further away as ${\phi_{0}}$ is increased. Here’s an example with ${\phi_{0}=1.35}$:

Changing ${r\left(0\right)}$ has similar effects. Decreasing ${r\left(0\right)}$ causes an increase in perihelion shift, since the object is closer to the centre. This is similar to the effect of decreasing ${\phi_{0}}$ above. Here’s ${r\left(0\right)=25}$ and ${\phi_{0}=0.75}$:

Embedding a 2-d curved surface into 3-d: inverse cosh

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 11; Problem 11.09.

A final example embedding a curved-space 2-d metric in 3-d flat space. This time, our 2-d metric is

$\displaystyle ds^{2}=d\rho^{2}+\left(\rho^{2}+b^{2}\right)d\phi^{2}$

where ${b}$ is a constant.

To convert the ${\phi}$ component to the required form of ${r^{2}d\phi^{2}}$, we define ${r=\sqrt{\rho^{2}+b^{2}}}$. Then

$\displaystyle \begin{aligned}dr & =\frac{\rho}{\sqrt{\rho^{2}+b^{2}}}d\rho\\  & =\frac{\sqrt{r^{2}-b^{2}}d\rho}{r}\\ d\rho & =\frac{r}{\sqrt{r^{2}-b^{2}}}dr\end{aligned}$

The metric becomes:

$\displaystyle ds^{2}=\frac{r^{2}dr^{2}}{r^{2}-b^{2}}+r^{2}d\phi^{2}$

We now equate this to the cylindrical metric:

$\displaystyle \begin{aligned}dz^{2}+r^{2}d\phi^{2}+dr^{2} & =\frac{r^{2}dr^{2}}{r^{2}-b^{2}}+r^{2}d\phi^{2}\\ \frac{dz}{dr} & =\frac{b}{\sqrt{r^{2}-b^{2}}}\end{aligned}$

This integral can be done with software or looked up, and is

$\displaystyle \begin{aligned}z & =b\ln\left(r+\sqrt{r^{2}-b^{2}}\right)\\  & =b\ln\left[b\left(\frac{r}{b}+\sqrt{\frac{r^{2}}{b^{2}}-1}\right)\right]\\  & =b\ln\left(\frac{r}{b}+\sqrt{\frac{r^{2}}{b^{2}}-1}\right)+b\ln b\end{aligned}$

From tables of inverse hyperbolic functions, we see that this is equivalent to

$\displaystyle z=b\mbox{arcosh}\left(\frac{r}{b}\right)+b\ln b$

We can ignore the last term as it is just a constant and serves only to raise or lower the surface as a whole.
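A quick numerical check confirms that the logarithmic and inverse cosh forms agree, and that both have the slope ${b/\sqrt{r^{2}-b^{2}}}$ required by the metric. This is a sketch only: the function names and the value of ${b}$ are arbitrary, and the slope is estimated by a central difference.

```python
import math

def z_log_form(r, b):
    """z = b ln(r + sqrt(r^2 - b^2)), the form produced by the integral."""
    return b * math.log(r + math.sqrt(r * r - b * b))

def z_arcosh_form(r, b):
    """z = b arcosh(r/b) + b ln b, the form taken from the tables."""
    return b * math.acosh(r / b) + b * math.log(b)

def slope(f, r, b, h=1e-6):
    """Central-difference estimate of dz/dr."""
    return (f(r + h, b) - f(r - h, b)) / (2 * h)

b = 2.0  # arbitrary illustrative value; the formulas need r >= b
for r in (2.5, 4.0, 10.0):
    print(r, z_log_form(r, b), z_arcosh_form(r, b),
          slope(z_arcosh_form, r, b), b / math.sqrt(r * r - b * b))
```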

The surface is similar to Flamm’s paraboloid, although since it is derived from hyperbolic functions and not parabolas, it’s not a paraboloid.

Embedding a 2-d curved surface in 3-d: the cosine

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 11; Problem 11.08.

A third example embedding a curved-space 2-d metric in 3-d flat space. This time, our 2-d metric is

$\displaystyle ds^{2}=\frac{dr^{2}}{\cos^{2}\left(r/R\right)}+r^{2}d\phi^{2}$

Here, the ${\phi}$ component is already in the required form, so we can equate this to the cylindrical metric directly:

$\displaystyle \begin{aligned}dz^{2}+r^{2}d\phi^{2}+dr^{2} & =\frac{dr^{2}}{\cos^{2}\left(r/R\right)}+r^{2}d\phi^{2}\\ \frac{dz}{dr} & =\pm\sqrt{\frac{1-\cos^{2}\left(r/R\right)}{\cos^{2}\left(r/R\right)}}\\  & =\pm\tan\left(\frac{r}{R}\right)\end{aligned}$

This integrates directly to give

$\displaystyle z=\pm R\ln\left|\cos\left(\frac{r}{R}\right)\right|$

This integral is a bit problematic, as the logarithm is defined only for positive arguments, which is why we’ve put the absolute value in the answer. If the limits include a region where the cosine is zero, the log goes to infinity, so in our case here, ${0\le r/R<\pi/2}$. (This also follows from the original metric, since the cosine is in the denominator.) Since the cosine is never greater than 1, the log is never positive.
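The behaviour near ${r/R=\pi/2}$ is easy to see numerically. The sketch below takes the upper lobe, ${z=-R\ln\cos\left(r/R\right)}$ (the − sign in the formula above), and compares a central-difference slope with ${\tan\left(r/R\right)}$. The function names and sample radii are my own choices.

```python
import math

def z_cosine(r, R):
    """Upper lobe z = -R ln(cos(r/R)), valid for 0 <= r/R < pi/2."""
    return -R * math.log(math.cos(r / R))

def numerical_slope(r, R, h=1e-6):
    """Central-difference estimate of dz/dr."""
    return (z_cosine(r + h, R) - z_cosine(r - h, R)) / (2 * h)

R = 1.0  # arbitrary illustrative value
for r in (0.2, 0.8, 1.4, 1.55):
    print(f"r = {r:4.2f}  z = {z_cosine(r, R):7.4f}  "
          f"dz/dr = {numerical_slope(r, R):8.4f}  tan(r/R) = {math.tan(r / R):8.4f}")
```

Both ${z}$ and its slope grow without bound as ${r/R}$ approaches ${\pi/2}$, which is why the surface cannot be continued past that radius.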

The lobe of this surface obtained from taking the + sign above looks like this (there is an upper lobe which is a mirror image of the lower one):

Embedding a 2-d curved surface in 3-d: the cosh

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 11; Problem 11.07.

A second example embedding a curved-space 2-d metric in 3-d flat space. This time, our 2-d metric is

$\displaystyle ds^{2}=\cosh^{2}\left(\frac{r}{R}\right)dr^{2}+r^{2}d\phi^{2}$

Here, the ${\phi}$ component is already in the required form, so we can equate this to the cylindrical metric directly:

$\displaystyle \begin{aligned}dz^{2}+r^{2}d\phi^{2}+dr^{2} & =\cosh^{2}\left(\frac{r}{R}\right)dr^{2}+r^{2}d\phi^{2}\\ \frac{dz}{dr} & =\pm\left[\cosh^{2}\left(\frac{r}{R}\right)-1\right]^{1/2}\\  & =\pm\sinh\left(\frac{r}{R}\right)\end{aligned}$

This integrates directly to give

$\displaystyle z=\pm R\cosh\left(\frac{r}{R}\right)$

The upper lobe of this surface looks like this (there is a lower lobe which is a mirror image of the upper one):

Embedding 2-d curved space in 3-d: the sphere

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 11; Problem 11.06.

We can now look at some examples of embedding a curved-space 2-d metric in 3-d flat space. We’ll begin with the familiar case of the spherical surface. However, the goal here is to start with a metric defined in terms of some unknown coordinates and then discover the nature of the surface by embedding it in 3-d space.

The metric is, not surprisingly

$\displaystyle ds^{2}=R^{2}d\theta^{2}+R^{2}\sin^{2}\theta d\phi^{2}$

To get this in the form where we can embed it using cylindrical coordinates, we need the ${\phi}$ term to be ${r^{2}d\phi^{2}}$ for some coordinate ${r}$. Therefore, we can try just defining ${r=R\sin\theta}$ and see what this does to the ${\theta}$ component. We get

$\displaystyle \begin{aligned}dr & =R\cos\theta d\theta\\ d\theta & =\frac{dr}{R\cos\theta}\\  & =\frac{dr}{R\sqrt{1-\sin^{2}\theta}}\\  & =\frac{dr}{R\sqrt{1-\left(\frac{r}{R}\right)^{2}}}\end{aligned}$

Thus the metric can be rewritten in terms of ${r}$ and ${\phi}$ as follows:

$\displaystyle ds^{2}=\frac{dr^{2}}{1-\left(\frac{r}{R}\right)^{2}}+r^{2}d\phi^{2}$

From here, we equate this to the 3-d cylindrical metric:

$\displaystyle \begin{aligned}dz^{2}+r^{2}d\phi^{2}+dr^{2} & =\frac{dr^{2}}{1-\left(\frac{r}{R}\right)^{2}}+r^{2}d\phi^{2}\\ \frac{dz}{dr} & =\pm\frac{r/R}{\sqrt{1-\left(\frac{r}{R}\right)^{2}}}\\ z & =\pm R\sqrt{1-\left(\frac{r}{R}\right)^{2}}\\  & =\pm\sqrt{R^{2}-r^{2}}\end{aligned}$

This is the equation of the two halves (upper and lower) of a sphere, so we see that the embedding of the 2-d metric in 3-d space does indeed give a sphere.

Perihelion shift – contribution of the time coordinate

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 11; Problem 11.05.

We’ve seen that a third of the precession of an object’s closest approach in its orbit around a central mass comes from the curvature of space caused by the radial component of the Schwarzschild metric. We can now investigate the contribution from the time coordinate. For nearly circular orbits where the mean radius ${r_{c}}$ of the orbit is much greater than ${GM}$ (${M}$ is the central mass), the angle of closest approach advances by

$\displaystyle \Delta\phi=\frac{6\pi GM}{r_{c}}$

on each orbit.

By analogy with the radial coordinate, we might think that we can analyze the time component by setting ${dr=0}$. However, since we’re now concerned with changes in the way time is perceived at different locations, rather than merely with the curvature of space at a particular time, we have to consider the change in position with time, so our approach has to be quite different from that for the radial coordinate. The analysis is, in fact, much the same as in the original derivation of the full precession using the Schwarzschild metric. Since we’re interested in looking only at the time coordinate, we consider a modified metric in which space is flat, which means that the metric component ${g_{rr}=1}$ rather than the ${\left(1-\frac{2GM}{r}\right)^{-1}}$ in the Schwarzschild metric. That is, the metric is (since we’re still considering motion in the equatorial plane):

$\displaystyle ds^{2}=-\left(1-\frac{2GM}{r}\right)dt^{2}+dr^{2}+r^{2}d\phi^{2}$

Since the ${t}$ and ${\phi}$ components are unchanged from the original metric, the conserved quantities ${e}$ and ${l}$ are also unchanged:

$\displaystyle \begin{aligned}e & =\left(1-\frac{2GM}{r}\right)\frac{dt}{d\tau}\\ l & =r^{2}\frac{d\phi}{d\tau}\end{aligned}$

Using the invariant ${\mathbf{u}\cdot\mathbf{u}=-1}$ we get

$\displaystyle \begin{aligned}-1 & =-\left(1-\frac{2GM}{r}\right)\left(\frac{dt}{d\tau}\right)^{2}+\left(\frac{dr}{d\tau}\right)^{2}+r^{2}\left(\frac{d\phi}{d\tau}\right)^{2}\\ 1 & =\left(1-\frac{2GM}{r}\right)^{-1}e^{2}-\left(\frac{dr}{d\tau}\right)^{2}-\frac{l^{2}}{r^{2}}\end{aligned}$

Now we use

$\displaystyle \begin{aligned}\frac{dr}{d\tau} & =\frac{dr}{d\phi}\frac{d\phi}{d\tau}=\frac{l}{r^{2}}\frac{dr}{d\phi}\\ 1 & =\left(1-\frac{2GM}{r}\right)^{-1}e^{2}-\frac{l^{2}}{r^{4}}\left(\frac{dr}{d\phi}\right)^{2}-\frac{l^{2}}{r^{2}}\end{aligned}$

Substituting ${u=1/r}$, we get

$\displaystyle 1=\left(1-2GMu\right)^{-1}e^{2}-l^{2}\left(\frac{du}{d\phi}\right)^{2}-l^{2}u^{2}$

Taking the derivative with respect to ${\phi}$ and using the notation ${u^{\prime}}$ to denote a derivative, we get

$\displaystyle \begin{aligned}2GM\left(1-2GMu\right)^{-2}e^{2}u^{\prime}-2l^{2}u^{\prime}u^{\prime\prime}-2l^{2}uu^{\prime} & =0\\ u^{\prime\prime}+u & =\frac{GMe^{2}}{l^{2}\left(1-2GMu\right)^{2}}\end{aligned}$

If we now assume that ${r\gg GM}$ as usual, we can approximate the RHS term:

$\displaystyle \begin{aligned}u^{\prime\prime}+u & \approx\frac{GMe^{2}}{l^{2}}\left(1+4GMu\right)\\ u^{\prime\prime}+u\left(1-4\left(\frac{GMe}{l}\right)^{2}\right) & =\frac{GMe^{2}}{l^{2}}\end{aligned}$

Now we want to treat a nearly circular orbit as a purely circular orbit plus a small perturbation, as we did in the original perihelion calculation. If ${u=u_{c}=\mbox{constant}}$, then the above equation becomes

$\displaystyle \begin{aligned}u_{c}\left(1-4\left(\frac{GMe}{l}\right)^{2}\right) & =\frac{GMe^{2}}{l^{2}}\\  & =\frac{1}{GM}\left(\frac{GMe}{l}\right)^{2}\\ \left(\frac{GMe}{l}\right)^{2} & =\frac{GMu_{c}}{4GMu_{c}+1}\end{aligned}$

Substituting this into the ODE, we get

$\displaystyle u^{\prime\prime}+u\left(1-\frac{4GMu_{c}}{4GMu_{c}+1}\right)=\frac{u_{c}}{4GMu_{c}+1}$

Now we define ${u=u_{c}+u_{c}w}$, where ${w}$ is the small perturbation on the circular orbit. Plugging this into the differential equation above, we get, after cancelling ${u_{c}}$ off both sides:

$\displaystyle \begin{aligned}w^{\prime\prime}+\left(1+w\right)\frac{1}{4GMu_{c}+1} & =\frac{1}{4GMu_{c}+1}\\ w^{\prime\prime} & =-\frac{1}{4GMu_{c}+1}w\end{aligned}$

This is the familiar harmonic oscillator equation again, so we can write the solution in which ${\phi=0}$ is the maximum value of ${w}$:

$\displaystyle w\left(\phi\right)=A\cos\left(\frac{\phi}{\sqrt{4GMu_{c}+1}}\right)$

A complete radial oscillation, from one perihelion to the next, will bring the argument of the cosine to ${2\pi}$, so

$\displaystyle \begin{aligned}\frac{\phi}{\sqrt{4GMu_{c}+1}} & =2\pi\\ \phi & =2\pi\sqrt{4GMu_{c}+1}\\  & \approx2\pi\left(1+2GMu_{c}\right)\end{aligned}$

The perihelion shift due to the ${t}$ coordinate is then

$\displaystyle \Delta\phi=4\pi GMu_{c}=\frac{4\pi GM}{r_{c}}$

which is ${\frac{2}{3}}$ of the total amount stated at the start. Thus the radial plus time coordinates together account for the total perihelion shift (in the approximation of large radii and nearly circular orbits, anyway).
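To get a feel for how good the large-radius approximation is, we can compare the exact period of the oscillator equation ${w^{\prime\prime}=-w/\left(4GMu_{c}+1\right)}$, which is ${2\pi\sqrt{4GMu_{c}+1}}$, with the first-order result. In this sketch the function name and the sample values of ${GM/r_{c}}$ are arbitrary choices of mine.

```python
import math

def shift_from_time_coordinate(gm_over_rc):
    """Angle between successive perihelia minus 2*pi, from the exact oscillator period."""
    return 2 * math.pi * (math.sqrt(1 + 4 * gm_over_rc) - 1)

for x in (1e-2, 1e-4, 1e-6):  # illustrative values of GM/r_c
    exact = shift_from_time_coordinate(x)
    first_order = 4 * math.pi * x   # the approximate shift derived in the text
    total = 6 * math.pi * x         # full Schwarzschild shift for the same orbit
    print(f"GM/r_c = {x:.0e}: exact = {exact:.6e}, 4 pi GM/r_c = {first_order:.6e}, "
          f"fraction of total = {exact / total:.4f}")
```

As ${GM/r_{c}}$ shrinks, the fraction tends to ${\frac{2}{3}}$.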

Perihelion shift – contribution from the radial coordinate

Reference: Moore, Thomas A., A General Relativity Workbook, University Science Books (2013) – Chapter 11; Exercise 11.6.1.

In analyzing the precession of an object’s closest approach in its orbit around a central mass we found that, for nearly circular orbits where the mean radius ${r_{c}}$ of the orbit is much greater than ${GM}$ (${M}$ is the central mass), the angle of closest approach advances by

$\displaystyle \Delta\phi=\frac{6\pi GM}{r_{c}}$

on each orbit.

Since the only two components in the Schwarzschild metric that differ from flat space are the radial and time components, this shift must be due to contributions from each of these coordinates. To see how much arises from each coordinate, we can hold the other one constant and see what perihelion shift arises.

First, consider the radial coordinate ${r}$. In this case, ${dt=d\theta=0}$ (the latter because we’re moving in the equatorial plane, as usual), so the metric becomes

$\displaystyle ds^{2}=\left(1-\frac{2GM}{r}\right)^{-1}dr^{2}+r^{2}d\phi^{2}$

This is a 2-d curved space but, just as we can visualize the 2-d curved space of a sphere by embedding it in 3-d flat space, we can do the same here. Since the metric is independent of ${\phi}$, the natural coordinate system to use in the flat space is the cylindrical system, with coordinates ${z}$, ${r}$ and ${\phi}$. In the cylindrical system

$\displaystyle ds^{2}=dr^{2}+r^{2}d\phi^{2}+dz^{2}$

so we can equate the two metrics to get

$\displaystyle \begin{aligned}\left(1-\frac{2GM}{r}\right)^{-1}dr^{2}+r^{2}d\phi^{2} & =dr^{2}+r^{2}d\phi^{2}+dz^{2}\\ \frac{dz}{dr} & =\sqrt{\left(1-\frac{2GM}{r}\right)^{-1}-1}\\ & =\sqrt{\frac{2GM/r}{1-2GM/r}}\\ & =\frac{\sqrt{2GM}}{\sqrt{r-2GM}}\end{aligned}$

This is quite easy to integrate, and we get

$\displaystyle \begin{aligned}z & =2\sqrt{2GM\left(r-2GM\right)}\\ & =\sqrt{8GM\left(r-2GM\right)}\end{aligned}$
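As a quick sanity check on the integration, the sketch below compares the slope ${dz/dr}$ from the metric with a finite-difference derivative of ${z\left(r\right)}$. The choice ${GM=1}$ (geometric units), the step size `h` and the function names are assumptions made only for this illustration.

```python
import math

GM = 1.0  # assumed: geometric units with GM = 1, purely for illustration


def z(r):
    """Height of Flamm's paraboloid at radius r >= 2GM."""
    return math.sqrt(8.0 * GM * (r - 2.0 * GM))


def dz_dr(r):
    """Slope obtained by equating the two metrics."""
    return math.sqrt(2.0 * GM / (r - 2.0 * GM))


def numerical_slope(r, h=1e-5):
    """Central finite-difference estimate of dz/dr (h is an assumed step)."""
    return (z(r + h) - z(r - h)) / (2.0 * h)


if __name__ == "__main__":
    for r in (3.0, 10.0, 100.0, 1000.0):
        print(r, dz_dr(r), numerical_slope(r))
```

The two columns should agree closely at every radius, and the surface meets ${z=0}$ at ${r=2GM}$.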

The 3-d plot of this surface looks like this:

The surface is known as Flamm’s paraboloid, after the Austrian physicist Ludwig Flamm, who described it in 1916. It has become something of an icon in discussions of black holes, since it seems to indicate that there is a ‘gravity well’ around the black hole into which objects fall. This is misleading, since the paraboloid is a snapshot of the space at one particular instant in time (remember we took ${dt=0}$), so an object near a black hole does not move along the surface of the paraboloid. We’ll look at this in more detail later.

To use this paraboloid to determine the amount of the perihelion shift due to the radial component, we can start by looking at an orbit in flat space. Suppose we consider a nearly circular orbit of radius ${R}$. In flat space, we can draw this as a nearly-circular ellipse on a flat sheet of paper. Also, in the absence of a central mass, there is no perihelion shift, so the object goes through exactly ${2\pi}$ radians from one perihelion to the next.

Now if we introduce a central mass, the space becomes curved into the shape of Flamm’s paraboloid. For ${R\gg GM}$ (our usual assumption), we can approximate the curvature of space by imagining that the plane in flat space is deformed into a cone that is tangent to the paraboloid. That is, if you imagine the paraboloid as a funnel, we put a conical filter paper into the funnel, and this paper touches the funnel in an almost-circle. As you may have done in high school chemistry, you can make a cone out of a circular piece of paper by cutting the circle from its centre along a radius out to the edge, then overlapping the cut edges by a certain amount. The angle ${\delta}$ subtended by the overlapping portions of the cut circle determines how steep the sides of the cone are.

To introduce some quantities, we know that the distance from the vertex of the cone to the rim is just the mean radius ${R}$ of the original ellipse. Now suppose that this line from vertex to rim makes an angle ${\alpha}$ with a radius ${r}$ on the base of the cone. That is, we draw a line from the centre of the base along a radius out to the edge of the base, then from that point up the side of the cone to the vertex. The angle ${\alpha}$ is the angle between the two lines where they meet at the rim of the base. The radius of the base is then ${r=R\cos\alpha}$ and the circumference of the base is ${2\pi r=2\pi R\cos\alpha}$.

However, the circumference of the base is also the circumference of the original circle minus the overlapping bit. That is, we must also have

$\displaystyle \begin{aligned}2\pi r=2\pi R\cos\alpha & =\left(2\pi-\delta\right)R\\ 2\pi\cos\alpha & =2\pi-\delta\end{aligned}$

For ${R\gg GM}$, the angle ${\alpha}$ will be very small, so ${\cos\alpha\approx1-\frac{1}{2}\alpha^{2}}$. Also, ${\tan\alpha}$ is the slope of the paraboloid at the point where the cone is tangent to it, so

$\displaystyle \begin{aligned}\tan\alpha & =\left.\frac{dz}{dr}\right|_{r=R\cos\alpha}\\ & =\frac{\sqrt{2GM}}{\sqrt{R\cos\alpha-2GM}}\\ & \approx\frac{\sqrt{2GM}}{\sqrt{R\left(1-\frac{1}{2}\alpha^{2}\right)-2GM}}\\ & \approx\alpha\end{aligned}$

We can now square both sides to get a quadratic equation for ${\alpha^{2}}$:

$\displaystyle \begin{aligned}\frac{R}{2}\alpha^{4}-\left(R-2GM\right)\alpha^{2}+2GM & =0\\ \alpha^{2} & =\frac{1}{R}\left[R-2GM\pm\sqrt{\left(R-2GM\right)^{2}-4RGM}\right]\\ & =\frac{1}{R}\left(R-2GM\right)\left[1\pm\sqrt{1-\frac{4RGM}{\left(R-2GM\right)^{2}}}\right]\\ & \approx\left(1-\frac{2GM}{R}\right)\left[1\pm\left(1-\frac{2RGM}{\left(R-2GM\right)^{2}}\right)\right]\\ & \approx\left(1-\frac{2GM}{R}\right)\left(1\pm\left(1-\frac{2GM}{R}\right)\right)\end{aligned}$

where in the last two lines we used the condition ${R\gg GM}$. Since ${\alpha}$ is small, we need to take the minus sign in the last line. Keeping terms up to first order in ${2GM/R}$, we get

$\displaystyle \alpha^{2}\approx\frac{2GM}{R}$
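To see how good this approximation is, here is a minimal sketch that evaluates the smaller root of the quadratic exactly and compares it with ${2GM/R}$. The function names and the sample values of ${R}$ (in units where ${GM=1}$) are assumptions made for illustration.

```python
import math


def alpha_squared_exact(R, GM=1.0):
    """Smaller root x = alpha^2 of (R/2) x^2 - (R - 2GM) x + 2GM = 0."""
    b = R - 2.0 * GM
    disc = b * b - 4.0 * R * GM  # positive once R is large enough compared to GM
    return (b - math.sqrt(disc)) / R


def alpha_squared_approx(R, GM=1.0):
    """First-order result alpha^2 ~ 2GM/R."""
    return 2.0 * GM / R


if __name__ == "__main__":
    for R in (1e2, 1e3, 1e4):
        exact = alpha_squared_exact(R)
        approx = alpha_squared_approx(R)
        print(R, exact, approx, exact / approx - 1.0)
```

The last column is the relative difference between the two, which should shrink as ${R/GM}$ grows.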

Plugging this back into the equation above for the overlap angle ${\delta}$ (which is the perihelion shift due to the radial coordinate), we find

$\displaystyle \begin{aligned}2\pi\left(1-\frac{1}{2}\alpha^{2}\right) & =2\pi-\delta\\ \delta & =\frac{2\pi GM}{R}\end{aligned}$

This is ${\frac{1}{3}}$ of the total perihelion shift as given at the start.
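The whole cone construction can also be checked numerically without the small-angle approximations. The sketch below solves the tangency condition ${\tan\alpha=\left.dz/dr\right|_{r=R\cos\alpha}}$ by simple fixed-point iteration, then computes ${\delta=2\pi\left(1-\cos\alpha\right)}$ and its ratio to ${6\pi GM/R}$. The iteration scheme, the iteration count and the function names are assumptions of this sketch, not part of the original derivation.

```python
import math


def cone_angle(R, GM=1.0, iterations=50):
    """Solve tan(alpha) = sqrt(2GM / (R cos(alpha) - 2GM)) by fixed-point iteration."""
    alpha = math.sqrt(2.0 * GM / R)  # start from the first-order estimate
    for _ in range(iterations):
        r_base = R * math.cos(alpha)  # radius of the cone's base
        alpha = math.atan(math.sqrt(2.0 * GM / (r_base - 2.0 * GM)))
    return alpha


def radial_shift(R, GM=1.0):
    """Overlap angle delta = 2 pi (1 - cos(alpha)) of the cut circle."""
    return 2.0 * math.pi * (1.0 - math.cos(cone_angle(R, GM)))


def total_shift(R, GM=1.0):
    """Total perihelion shift 6 pi GM / R quoted at the start."""
    return 6.0 * math.pi * GM / R


if __name__ == "__main__":
    for R in (1e2, 1e3, 1e4):
        print(R, radial_shift(R) / total_shift(R))
```

For ${R\gg GM}$ the printed ratio should approach ${\frac{1}{3}}$.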