Orbits in the centre of mass frame: energy and angular momentum

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 2, Problems 2.4-2.5.

For two masses {m_{1}} and {m_{2}} interacting via gravity, it’s easiest to do calculations in the centre of mass frame. The position of the centre of mass is defined as

\displaystyle  \mathbf{R}\equiv\frac{m_{1}\mathbf{r}_{1}'+m_{2}\mathbf{r}_{2}'}{m_{1}+m_{2}} \ \ \ \ \ (1)

where {\mathbf{r}_{i}'} is the position of {m_{i}} in a coordinate system where the origin could be anywhere.

In the centre of mass frame, we take the centre of mass to be at the origin, so that {\mathbf{R}=0}, giving

\displaystyle  \frac{m_{1}\mathbf{r}_{1}+m_{2}\mathbf{r}_{2}}{m_{1}+m_{2}}=0 \ \ \ \ \ (2)

where now {\mathbf{r}_{i}} (without the prime) indicates the position relative to the centre of mass. If we introduce the relative position

\displaystyle  \mathbf{r}\equiv\mathbf{r}_{2}-\mathbf{r}_{1} \ \ \ \ \ (3)

then

\displaystyle   \frac{m_{1}\mathbf{r}_{1}+m_{2}\left(\mathbf{r}+\mathbf{r}_{1}\right)}{m_{1}+m_{2}} \displaystyle  = \displaystyle  0\ \ \ \ \ (4)
\displaystyle  \mathbf{r}_{1} \displaystyle  = \displaystyle  -\frac{m_{2}}{m_{1}+m_{2}}\mathbf{r}\ \ \ \ \ (5)
\displaystyle  \frac{m_{1}\left(\mathbf{r}_{2}-\mathbf{r}\right)+m_{2}\mathbf{r}_{2}}{m_{1}+m_{2}} \displaystyle  = \displaystyle  0\ \ \ \ \ (6)
\displaystyle  \mathbf{r}_{2} \displaystyle  = \displaystyle  \frac{m_{1}}{m_{1}+m_{2}}\mathbf{r} \ \ \ \ \ (7)

In terms of the reduced mass

\displaystyle  \mu\equiv\frac{m_{1}m_{2}}{m_{1}+m_{2}} \ \ \ \ \ (8)

we get

\displaystyle   \mathbf{r}_{1} \displaystyle  = \displaystyle  -\frac{\mu}{m_{1}}\mathbf{r}\ \ \ \ \ (9)
\displaystyle  \mathbf{r}_{2} \displaystyle  = \displaystyle  \frac{\mu}{m_{2}}\mathbf{r} \ \ \ \ \ (10)

The velocities of the two masses are just the derivatives of the positions, so that

\displaystyle  \mathbf{v}_{i}=\frac{d\mathbf{r}_{i}}{dt} \ \ \ \ \ (11)

and the rate of change of the relative position is

\displaystyle  \mathbf{v}=\frac{d\mathbf{r}}{dt} \ \ \ \ \ (12)

The total energy of the two mass system is the sum of the two kinetic energy terms and the gravitational potential energy term, so

\displaystyle  E=\frac{1}{2}m_{1}\left|\mathbf{v}_{1}\right|^{2}+\frac{1}{2}m_{2}\left|\mathbf{v}_{2}\right|^{2}-G\frac{m_{1}m_{2}}{r} \ \ \ \ \ (13)

where

\displaystyle  r=\left|\mathbf{r}_{2}-\mathbf{r}_{1}\right| \ \ \ \ \ (14)

The gravitational potential energy can be written in terms of the reduced mass and the total mass {M\equiv m_{1}+m_{2}} as

\displaystyle  -G\frac{m_{1}m_{2}}{r}=-G\left(m_{1}+m_{2}\right)\frac{\mu}{r}=-G\frac{M\mu}{r} \ \ \ \ \ (15)

The kinetic energy can be rewritten by using 9 and 10

\displaystyle   \mathbf{v}_{1} \displaystyle  = \displaystyle  \frac{d\mathbf{r}_{1}}{dt}\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  -\frac{\mu}{m_{1}}\mathbf{v}\ \ \ \ \ (17)
\displaystyle  \mathbf{v}_{2} \displaystyle  = \displaystyle  \frac{\mu}{m_{2}}\mathbf{v}\ \ \ \ \ (18)
\displaystyle  \frac{1}{2}m_{1}\left|\mathbf{v}_{1}\right|^{2}+\frac{1}{2}m_{2}\left|\mathbf{v}_{2}\right|^{2} \displaystyle  = \displaystyle  \frac{1}{2}\frac{\mu^{2}}{m_{1}}v^{2}+\frac{1}{2}\frac{\mu^{2}}{m_{2}}v^{2}\ \ \ \ \ (19)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\mu^{2}\frac{m_{1}+m_{2}}{m_{1}m_{2}}v^{2}\ \ \ \ \ (20)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{2}\mu v^{2} \ \ \ \ \ (21)

Thus the total energy is

\displaystyle  E=\frac{1}{2}\mu v^{2}-G\frac{M\mu}{r} \ \ \ \ \ (22)

which is the energy of a mass {\mu} moving around another mass {M}, the latter of which is fixed at the origin.

The orbital angular momentum (that is, the angular momentum due to the masses orbiting about each other, not including any angular momentum due to each mass rotating on its own axis) is

\displaystyle   \mathbf{L} \displaystyle  = \displaystyle  m_{1}\mathbf{r}_{1}\times\mathbf{v}_{1}+m_{2}\mathbf{r}_{2}\times\mathbf{v}_{2}\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  -m_{1}\frac{\mu}{m_{1}}\mathbf{r}\times\left(-\frac{\mu}{m_{1}}\mathbf{v}\right)+m_{2}\frac{\mu}{m_{2}}\mathbf{r}\times\left(\frac{\mu}{m_{2}}\mathbf{v}\right)\ \ \ \ \ (24)
\displaystyle  \displaystyle  = \displaystyle  \mu^{2}\mathbf{r}\times\mathbf{v}\left(\frac{1}{m_{1}}+\frac{1}{m_{2}}\right)\ \ \ \ \ (25)
\displaystyle  \displaystyle  = \displaystyle  \frac{m_{1}+m_{2}}{m_{1}m_{2}}\mu^{2}\mathbf{r}\times\mathbf{v}\ \ \ \ \ (26)
\displaystyle  \displaystyle  = \displaystyle  \mu\mathbf{r}\times\mathbf{v} \ \ \ \ \ (27)

Thus the angular momentum is due to the reduced mass alone, which is consistent with this mass orbiting about a fixed mass at the origin.

Elliptical orbits

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 2, Problems 2.1-2.2.

In a two-body solar system (that is, where we have only the Sun and a single planet or other object orbiting it), it turns out that the planet’s orbit is a conic section (ellipse, parabola or hyperbola) with the centre of mass of the system at one of the foci (or the single focus, in the case of a parabola) of the orbit. For any bound orbit (that is, one in which the planet never escapes to infinity), the orbital curve is an ellipse, so it’s useful to review a few properties of ellipses.

To define an ellipse, first specify two points which serve as the foci (see Fig. 2.4 in Carroll & Ostlie). The ellipse is then the set of points such that the sum of the distances of a given point from the two foci is a constant, namely {2a}, where {a} is called the semimajor axis (and {2a} is the major axis). If the distance from a point {P} on the ellipse to one focus is {r'} and to the other focus is {r}, then

\displaystyle  r'+r=2a \ \ \ \ \ (1)

If we place the ellipse in a rectangular coordinate system with the line between the foci on the {x} axis and the origin at the centre of the ellipse, then the distance along the {y} axis from the origin to the ellipse is known as {b}, the semiminor axis. The eccentricity {e} of the ellipse is defined as the distance between the foci divided by the major axis, so that the distance between the foci is {2ae}. If the two foci coincide, the distance between them is zero so {e=0}, {r'=r=a} and we get a circle. If the foci coincide with the ends of the major axis, then {ae=a} and {e=1}, giving just a line segment along the {x} axis extending from {x=-a} to {x=+a}. In the general case, if we draw a right angled triangle with vertices at one of the foci, the centre, and the point {\left(0,b\right)}, then the hypotenuse of this triangle is {a} so by Pythagoras we have

\displaystyle   a^{2}e^{2}+b^{2} \displaystyle  = \displaystyle  a^{2}\ \ \ \ \ (2)
\displaystyle  b^{2} \displaystyle  = \displaystyle  a^{2}\left(1-e^{2}\right)\ \ \ \ \ (3)
\displaystyle  e^{2} \displaystyle  = \displaystyle  1-\frac{b^{2}}{a^{2}} \ \ \ \ \ (4)

If we use a polar coordinate system with the origin at the right-hand focus, then the angle {\theta} between the {x} axis and the line {r} from that focus to the ellipse is the polar angle. To get the equation of an ellipse in polar coordinates, we can consider the triangle with its vertices at the polar origin, the point on the ellipse with coordinates {\left(r,\theta\right)} and the left-hand focus. The triangle’s angle at the origin is then {\pi-\theta} so by the law of cosines we have

\displaystyle   \left(r'\right)^{2} \displaystyle  = \displaystyle  4a^{2}e^{2}+r^{2}-4rae\cos\left(\pi-\theta\right)\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  4a^{2}e^{2}+r^{2}+4rae\cos\theta \ \ \ \ \ (6)

From 1 we have

\displaystyle  \left(r'\right)^{2}=4a^{2}+r^{2}-4ar \ \ \ \ \ (7)

so

\displaystyle   4a^{2}-4ar \displaystyle  = \displaystyle  4a^{2}e^{2}+4rae\cos\theta\ \ \ \ \ (8)
\displaystyle  r \displaystyle  = \displaystyle  \frac{a\left(1-e^{2}\right)}{1+e\cos\theta}\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \frac{b^{2}}{a\left(1+e\cos\theta\right)} \ \ \ \ \ (10)

We can derive a rectangular coordinate version of the equation for an ellipse by two methods. We can start with 9 and shift the origin to the centre of the ellipse, so that a point on the ellipse has rectangular coordinates

\displaystyle   x \displaystyle  = \displaystyle  ae+r\cos\theta\ \ \ \ \ (11)
\displaystyle  y \displaystyle  = \displaystyle  r\sin\theta \ \ \ \ \ (12)

Now (since we know the answer we’re looking to prove) we can form the following quantity, using 4 and 10:

\displaystyle   \frac{x^{2}}{a^{2}}+\frac{y^{2}}{b^{2}} \displaystyle  = \displaystyle  \frac{1}{a^{2}}\left(a\sqrt{1-\frac{b^{2}}{a^{2}}}+r\cos\theta\right)^{2}+\frac{r^{2}\sin^{2}\theta}{b^{2}}\ \ \ \ \ (13)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{a^{2}}\left(a\sqrt{1-\frac{b^{2}}{a^{2}}}+\frac{b^{2}\cos\theta}{a\left(1+\sqrt{1-\frac{b^{2}}{a^{2}}}\cos\theta\right)}\right)^{2}+\frac{b^{2}\sin^{2}\theta}{a^{2}\left(1+\sqrt{1-\frac{b^{2}}{a^{2}}}\cos\theta\right)^{2}}\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{a^{4}\left(1+\sqrt{1-\frac{b^{2}}{a^{2}}}\cos\theta\right)^{2}}\left[\left(a^{2}\sqrt{1-\frac{b^{2}}{a^{2}}}\left(1+\sqrt{1-\frac{b^{2}}{a^{2}}}\cos\theta\right)+b^{2}\cos\theta\right)^{2}+a^{2}b^{2}\sin^{2}\theta\right]\ \ \ \ \ (15)
\displaystyle  \displaystyle  = \displaystyle  \frac{a^{4}}{a^{4}\left(1+\sqrt{1-\frac{b^{2}}{a^{2}}}\cos\theta\right)^{2}}\left[\left(1-\frac{b^{2}}{a^{2}}\right)\cos^{2}\theta+2\sqrt{1-\frac{b^{2}}{a^{2}}}\cos\theta+1\right]\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  \frac{\left(1+\sqrt{1-\frac{b^{2}}{a^{2}}}\cos\theta\right)^{2}}{\left(1+\sqrt{1-\frac{b^{2}}{a^{2}}}\cos\theta\right)^{2}}\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  1 \ \ \ \ \ (18)

The other method is to use the definition 1 and Pythagoras again. We have

\displaystyle   \left(r'\right)^{2} \displaystyle  = \displaystyle  \left(x+ae\right)^{2}+y^{2}\ \ \ \ \ (19)
\displaystyle  r^{2} \displaystyle  = \displaystyle  \left(x-ae\right)^{2}+y^{2} \ \ \ \ \ (20)

so

\displaystyle  \sqrt{\left(x+ae\right)^{2}+y^{2}}+\sqrt{\left(x-ae\right)^{2}+y^{2}}=2a \ \ \ \ \ (21)

Squaring both sides, we get

\displaystyle   \left(x+ae\right)^{2}+\left(x-ae\right)^{2}+2y^{2}+2\sqrt{\left(x+ae\right)^{2}+y^{2}}\sqrt{\left(x-ae\right)^{2}+y^{2}} \displaystyle  = \displaystyle  4a^{2}\ \ \ \ \ (22)
\displaystyle  x^{2}+y^{2}+a^{2}e^{2}+\sqrt{\left(x+ae\right)^{2}+y^{2}}\sqrt{\left(x-ae\right)^{2}+y^{2}} \displaystyle  = \displaystyle  2a^{2}\ \ \ \ \ (23)
\displaystyle  \sqrt{\left(x^{2}-a^{2}e^{2}\right)^{2}+y^{4}+y^{2}\left(\left(x+ae\right)^{2}+\left(x-ae\right)^{2}\right)} \displaystyle  = \displaystyle  a^{2}+b^{2}-x^{2}-y^{2}\ \ \ \ \ (24)
\displaystyle  \sqrt{y^{4}+2x^{2}y^{2}+x^{4}+2a^{2}e^{2}\left(y^{2}-x^{2}\right)+a^{4}e^{4}} \displaystyle  = \displaystyle  a^{2}+b^{2}-x^{2}-y^{2}\ \ \ \ \ (25)
\displaystyle  y^{4}+2x^{2}y^{2}+x^{4}+2a^{2}e^{2}\left(y^{2}-x^{2}\right)+a^{4}e^{4} \displaystyle  = \displaystyle  \left(a^{2}+b^{2}-x^{2}-y^{2}\right)^{2}\ \ \ \ \ (26)
\displaystyle  2x^{2}y^{2}+2\left(a^{2}-b^{2}\right)\left(y^{2}-x^{2}\right)+\left(a^{2}-b^{2}\right)^{2} \displaystyle  = \displaystyle  \left(a^{2}+b^{2}\right)^{2}-2\left(a^{2}+b^{2}\right)\left(x^{2}+y^{2}\right)+2x^{2}y^{2}\ \ \ \ \ (27)
\displaystyle  4b^{2}x^{2}+4a^{2}y^{2} \displaystyle  = \displaystyle  4a^{2}b^{2}\ \ \ \ \ (28)
\displaystyle  \frac{x^{2}}{a^{2}}+\frac{y^{2}}{b^{2}} \displaystyle  = \displaystyle  1 \ \ \ \ \ (29)
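The equivalence of the three descriptions of the ellipse (the focal definition 1, the polar form 9 and the rectangular form 29) can also be checked numerically. The following Python sketch is only an illustration; the function names and the sample values of {a} and {e} are my own choices.

```python
import math


def ellipse_point(a, e, theta):
    """Return (r, x, y) for polar angle theta measured at the right-hand focus.

    r comes from the polar equation (9); x and y are rectangular coordinates
    with the origin shifted to the centre of the ellipse (11, 12).
    """
    r = a * (1 - e**2) / (1 + e * math.cos(theta))
    x = a * e + r * math.cos(theta)
    y = r * math.sin(theta)
    return r, x, y


def check_ellipse(a, e, n=12):
    """For n sample angles, return pairs (r + r', x^2/a^2 + y^2/b^2)."""
    b = a * math.sqrt(1 - e**2)          # semiminor axis from (3)
    results = []
    for k in range(n):
        theta = 2 * math.pi * k / n
        r, x, y = ellipse_point(a, e, theta)
        r_prime = math.hypot(x + a * e, y)   # distance to the left-hand focus
        results.append((r + r_prime, x**2 / a**2 + y**2 / b**2))
    return results


for total, quad in check_ellipse(2.0, 0.6):
    print(round(total, 12), round(quad, 12))
```

Every row should show {r+r'=2a} (here 4) in the first column and 1 in the second, up to rounding.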

To find the area of an ellipse, we can again use either polar or rectangular coordinates. To find the area in polar coordinates, consider the area of a thin wedge of radius {r} and angular extent {d\theta}. If {r} were constant over the range {\theta\in\left[0,2\pi\right]}, we’d have the area of a circle, or {\pi r^{2}}. The wedge has a fraction {d\theta/2\pi} of this area, so the area of the wedge is {\frac{1}{2}r^{2}d\theta}, and the area of the ellipse is

\displaystyle  A=\int_{0}^{2\pi}\frac{1}{2}r^{2}d\theta \ \ \ \ \ (30)

Using 10 we get

\displaystyle  A=\frac{b^{4}}{2a^{2}}\int_{0}^{2\pi}\frac{d\theta}{\left(1+e\cos\theta\right)^{2}} \ \ \ \ \ (31)

I’m not sure how you would do this integral by hand, but for the indefinite integral Maple gives a complicated result involving arctans and tangents. Putting in the limits, however, and restricting {e} to {0\le e<1} gives a simple answer (using 3):

\displaystyle  A=\frac{\pi b^{4}}{a^{2}\left(1-e^{2}\right)^{3/2}}=\pi\frac{b^{2}}{\sqrt{1-e^{2}}}=\pi ab \ \ \ \ \ (32)
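If you don't have Maple to hand, the integral in 31 is easy to check numerically. The sketch below uses a simple midpoint rule, which is my choice rather than anything Maple does internally; the function name and the step count are likewise assumptions.

```python
import math


def ellipse_area_polar(a, e, n=10000):
    """Approximate A = integral of (1/2) r^2 dtheta over [0, 2*pi]
    using the midpoint rule, with r taken from the polar equation (10)."""
    b2 = a * a * (1 - e * e)             # b^2 from (3)
    h = 2 * math.pi / n                  # angular width of each wedge
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        r = b2 / (a * (1 + e * math.cos(theta)))
        total += 0.5 * r * r * h
    return total


a, e = 3.0, 0.5
b = a * math.sqrt(1 - e * e)
print(ellipse_area_polar(a, e), math.pi * a * b)
```

The two printed numbers should agree closely, consistent with {A=\pi ab}.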

Using rectangular coordinates is somewhat easier. We can work out the area for the first quadrant and multiply by 4:

\displaystyle   A \displaystyle  = \displaystyle  4b\int_{0}^{a}\sqrt{1-\frac{x^{2}}{a^{2}}}dx\ \ \ \ \ (33)
\displaystyle  \displaystyle  = \displaystyle  2ab\left[\arctan\left(\frac{x}{\sqrt{a^{2}-x^{2}}}\right)+\frac{x\sqrt{a^{2}-x^{2}}}{a^{2}}\right]_{0}^{a}\ \ \ \ \ (34)
\displaystyle  \displaystyle  = \displaystyle  \pi ab \ \ \ \ \ (35)

Angular distances on the celestial sphere

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 1, Problems 1.8 – 1.11.

Sometimes we need to know the angular distance between two points on the celestial sphere. By angular distance, I mean the angle subtended by lines drawn from the centre of the Earth to each of the two points. For example, the angular distance between the north pole and any point on the equator is {90^{\circ}}. The angular diameters of both the moon and the sun are around {30'} or half a degree. One practical application of such a calculation is to determine if two stars that are close to each other in the sky will both be visible within the field of view of a telescope using a particular eyepiece, since an eyepiece’s field of view is typically given as an angle.

Since the coordinates of objects are typically given in right ascension (RA) and declination (Dec), we’d like a formula that gives the angular distance between two sets of such coordinates. The derivation of the formula uses spherical trigonometry and is given in full in Carroll & Ostlie’s Section 1.3 (in particular Fig. 1.17 and surrounding text), so we won’t go through the details here. The exact formulas relating the angular distance {\Delta\theta} between a point {P} with coordinates {\left(RA,Dec\right)=\left(\alpha,\delta\right)} and an adjacent point {Q} with coordinates {\left(\alpha+\Delta\alpha,\delta+\Delta\delta\right)} are

\displaystyle   \sin\left(\Delta\alpha\right)\cos\left(\delta+\Delta\delta\right) \displaystyle  = \displaystyle  \sin\left(\Delta\theta\right)\sin\phi\ \ \ \ \ (1)
\displaystyle  \cos\left[90^{\circ}-\left(\delta+\Delta\delta\right)\right] \displaystyle  = \displaystyle  \cos\left(90^{\circ}-\delta\right)\cos\left(\Delta\theta\right)+\sin\left(90^{\circ}-\delta\right)\sin\left(\Delta\theta\right)\cos\phi \ \ \ \ \ (2)

where {\phi} is the angle, measured counterclockwise as we look at the sky, between a line due north from {P} and the line {PQ} (by ‘line’ I mean a segment of a great circle, since these lines are drawn on a sphere). In principle, {\phi} is of course determined by {P} and {Q} so we should be able to eliminate it from these equations, but it turns out that if {P} and {Q} are quite close together, there is an easier way to eliminate {\phi}, as we’ll see now.

If we assume that {\Delta\delta} and {\Delta\alpha} are both small enough, we can use the approximations {\sin x\approx x} and {\cos x\approx1} together with {\cos\left(90^{\circ}-x\right)=\sin x}, {\sin\left(90^{\circ}-x\right)=\cos x} and the trig formulas for the sum and difference of angles to simplify the above formulas. From 1 we get, by keeping only first order terms in {\Delta\delta} and {\Delta\alpha}:

\displaystyle   \Delta\alpha\left(\cos\delta-\sin\Delta\delta\sin\delta\right) \displaystyle  = \displaystyle  \Delta\theta\sin\phi\ \ \ \ \ (3)
\displaystyle  \Delta\alpha\cos\delta \displaystyle  = \displaystyle  \Delta\theta\sin\phi \ \ \ \ \ (4)

From 2 we get

\displaystyle   \sin\left(\delta+\Delta\delta\right) \displaystyle  = \displaystyle  \sin\delta+\Delta\theta\cos\delta\cos\phi\ \ \ \ \ (5)
\displaystyle  \sin\delta+\Delta\delta\cos\delta \displaystyle  = \displaystyle  \sin\delta+\Delta\theta\cos\delta\cos\phi\ \ \ \ \ (6)
\displaystyle  \Delta\delta\cos\delta \displaystyle  = \displaystyle  \Delta\theta\cos\delta\cos\phi\ \ \ \ \ (7)
\displaystyle  \Delta\delta \displaystyle  = \displaystyle  \Delta\theta\cos\phi \ \ \ \ \ (8)

We can now square 4 and 8 and add them to get

\displaystyle  \boxed{\left(\Delta\theta\right)^{2}=\left(\Delta\alpha\cos\delta\right)^{2}+\left(\Delta\delta\right)^{2}} \ \ \ \ \ (9)

Note that the approximation {\sin x\approx x} assumes that {x} is in radians, not degrees. However, because each term in this equation contains the square of an angle, the conversion factor from degrees to radians cancels out, so the formula is valid whether we specify angles in degrees or radians (as long as you remember to take {\delta} in the correct units when taking the cosine).

Example 1 The closest star system to the Sun is the {\alpha} Centauri triple star system. Of the three stars in this system, Proxima Centauri ({\alpha} Centauri C) is the closest to us. The coordinates of Proxima Centauri in the year 2000 (known as the epoch J2000.0) were {\left(\alpha,\delta\right)_{C}=\left(14^{h}29^{m}42.95^{s},-62^{\circ}40'46.1^{\prime\prime}\right)} (unfortunately for those of us in the far northern hemisphere, the {\alpha} Centauri system’s extreme southern declination means we can never see it). The brightest star in the system ({\alpha} Centauri A) has J2000.0 coordinates of {\left(\alpha,\delta\right)_{A}=\left(14^{h}39^{m}36.50^{s},-60^{\circ}50'02.3^{\prime\prime}\right)}. Since these two points are quite close to each other, we can use 9 to see how far apart they appear in the sky. We can first convert the coordinates into degrees in decimal form. The conversion for {\alpha} is {1^{h}=15^{\circ}}, so {1^{m}} is 1/60 of {15^{\circ}} or {15'=0.25^{\circ}}. Likewise, {1^{s}=15^{\prime\prime}=15/3600^{\circ}}. Therefore {14^{h}29^{m}42.95^{s}=14\times15+29\times0.25+42.95\times15/3600=217.42896^{\circ}}.

\displaystyle   \left(\alpha,\delta\right)_{C} \displaystyle  = \displaystyle  \left(217.42896^{\circ},-62.67947^{\circ}\right)\ \ \ \ \ (10)
\displaystyle  \left(\alpha,\delta\right)_{A} \displaystyle  = \displaystyle  \left(219.90208^{\circ},-60.83397^{\circ}\right) \ \ \ \ \ (11)

We therefore have

\displaystyle   \Delta\alpha \displaystyle  = \displaystyle  2.47312^{\circ}\ \ \ \ \ (12)
\displaystyle  \Delta\delta \displaystyle  = \displaystyle  1.8455^{\circ}\ \ \ \ \ (13)
\displaystyle  \cos\delta \displaystyle  = \displaystyle  \cos\left(-62.67947^{\circ}\right)\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  0.45897\ \ \ \ \ (15)
\displaystyle  \Delta\theta \displaystyle  = \displaystyle  \sqrt{\left(\Delta\alpha\cos\delta\right)^{2}+\left(\Delta\delta\right)^{2}}\ \ \ \ \ (16)
\displaystyle  \displaystyle  = \displaystyle  2.1666^{\circ} \ \ \ \ \ (17)

If the distance to the {\alpha} Centauri system is {r=4.0\times10^{16}\mbox{ m}} (and we assume that all members of the system are the same distance from us, or at least that the radial distances between the three stars are negligible as a fraction of their distance from us), then Proxima Centauri and {\alpha} Centauri A are separated by

\displaystyle  d_{C\rightarrow A}=r\Delta\theta \ \ \ \ \ (18)

This time, we do have to use radians, so we get

\displaystyle  d_{C\rightarrow A}=\left(4.0\times10^{16}\right)\frac{\pi}{180}\left(2.1666\right)=1.512\times10^{15}\mbox{ m} \ \ \ \ \ (19)

This is about 3.78% of the system’s distance from us.
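The arithmetic in Example 1 is easy to get wrong by hand, so here is a short Python sketch of it. The function names are my own, and I've followed the example in evaluating {\cos\delta} at Proxima's declination; the formula 9 itself is only valid for small separations.

```python
import math


def hms_to_deg(h, m, s):
    """Convert right ascension in hours, minutes, seconds to degrees (1h = 15 deg)."""
    return 15.0 * (h + m / 60.0 + s / 3600.0)


def dms_to_deg(sign, d, m, s):
    """Convert declination in degrees, arcminutes, arcseconds to decimal degrees.
    sign is +1 or -1 and is passed separately to avoid trouble with -0 degrees."""
    return sign * (d + m / 60.0 + s / 3600.0)


def small_angle_separation(ra1, dec1, ra2, dec2):
    """Angular distance in degrees from equation 9, using cos(dec1)."""
    d_alpha = ra2 - ra1
    d_delta = dec2 - dec1
    return math.hypot(d_alpha * math.cos(math.radians(dec1)), d_delta)


# J2000.0 coordinates quoted in the example
ra_c, dec_c = hms_to_deg(14, 29, 42.95), dms_to_deg(-1, 62, 40, 46.1)
ra_a, dec_a = hms_to_deg(14, 39, 36.50), dms_to_deg(-1, 60, 50, 2.3)

sep = small_angle_separation(ra_c, dec_c, ra_a, dec_a)
distance_m = 4.0e16 * math.radians(sep)   # d = r * delta_theta, with the angle in radians

print(round(sep, 4), "degrees")
print(distance_m, "m")
```

Note that the conversion to radians is needed only in the last step, as discussed above.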

Example 2 Proper motion and precession. The position of a star as seen from Earth can change due to two effects: the star’s actual motion through space, known as proper motion, and the precession of the Earth’s axis. The latter effect is due to the direction of the Earth’s axis slowly changing so that the north celestial pole moves through a circle in the sky with a period of about 26,000 years. Both these effects cause a star’s coordinates to change. In the previous example, we gave the coordinates of the {\alpha} Centauri system for epoch J2000.0. Astronomical coordinates are typically updated for the effects of precession every 50 years, but of course it is sometimes necessary to know the precise coordinates of a star for a time between updates. Calculating these positions exactly is a complicated business, but there are approximate formulas that are usually good enough. The changes in coordinates due to precession relative to J2000.0 are given by

\displaystyle   \Delta\alpha \displaystyle  = \displaystyle  M+N\sin\alpha\tan\delta\ \ \ \ \ (20)
\displaystyle  \Delta\delta \displaystyle  = \displaystyle  N\cos\alpha \ \ \ \ \ (21)

where

\displaystyle   M \displaystyle  = \displaystyle  1^{\circ}.2812323T+0^{\circ}.0003879T^{2}+0^{\circ}.0000101T^{3}\ \ \ \ \ (22)
\displaystyle  N \displaystyle  = \displaystyle  0^{\circ}.5567530T-0^{\circ}.0001185T^{2}-0^{\circ}.0000116T^{3}\ \ \ \ \ (23)
\displaystyle  T \displaystyle  = \displaystyle  \left(t-2000.0\right)/100 \ \ \ \ \ (24)

and {t} is the current date, specified as the current year plus a fraction for the current date.

For Proxima Centauri in the year 2010, using these formulas gives

\displaystyle   \Delta\alpha \displaystyle  = \displaystyle  0^{\circ}.1936\ \ \ \ \ (25)
\displaystyle  \Delta\delta \displaystyle  = \displaystyle  -0^{\circ}.0442 \ \ \ \ \ (26)

Substituting into 9 we get

\displaystyle   \Delta\theta_{precession} \displaystyle  = \displaystyle  0^{\circ}.0992\ \ \ \ \ (27)
\displaystyle  \displaystyle  = \displaystyle  357^{\prime\prime}\approx6' \ \ \ \ \ (28)
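The precession formulas 20 to 24 translate directly into code. This is a minimal Python sketch under the same approximations as the text; the function name `precession_shift` is an assumption of mine, and the inputs are the J2000.0 coordinates of Proxima Centauri in decimal degrees from 10.

```python
import math


def precession_shift(ra_deg, dec_deg, year):
    """Approximate change in (RA, Dec), in degrees, due to precession
    since J2000.0, using equations 20-24."""
    T = (year - 2000.0) / 100.0
    M = 1.2812323 * T + 0.0003879 * T**2 + 0.0000101 * T**3
    N = 0.5567530 * T - 0.0001185 * T**2 - 0.0000116 * T**3
    alpha = math.radians(ra_deg)
    delta = math.radians(dec_deg)
    d_alpha = M + N * math.sin(alpha) * math.tan(delta)
    d_delta = N * math.cos(alpha)
    return d_alpha, d_delta


ra, dec = 217.42896, -62.67947
d_alpha, d_delta = precession_shift(ra, dec, 2010.0)

# Total angular shift from equation 9
d_theta = math.hypot(d_alpha * math.cos(math.radians(dec)), d_delta)

print(round(d_alpha, 4), round(d_delta, 4))
print(round(d_theta * 3600.0), "arcseconds")
```

The printed shifts should reproduce 25, 26 and 28 to within rounding.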

The proper motion of Proxima Centauri is measured to be {3.84^{\prime\prime}} per year with a position angle of {\phi=282^{\circ}}. Over 10 years, proper motion results in a shift of {\Delta\theta_{pm}=38.4^{\prime\prime}}, so the precession effect is larger. We can get the shifts in RA and Dec from 4 and 8:

\displaystyle   \Delta\alpha_{pm} \displaystyle  = \displaystyle  \Delta\theta_{pm}\frac{\sin\phi}{\cos\delta}\ \ \ \ \ (29)
\displaystyle  \displaystyle  = \displaystyle  -81.84^{\prime\prime}=-5.46^{s}\ \ \ \ \ (30)
\displaystyle  \Delta\delta_{pm} \displaystyle  = \displaystyle  \Delta\theta_{pm}\cos\phi\ \ \ \ \ (31)
\displaystyle  \displaystyle  = \displaystyle  7.98^{\prime\prime} \ \ \ \ \ (32)
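The same kind of check works for the proper motion. The sketch below inverts 4 and 8 as in 29 and 31; the function and parameter names are hypothetical.

```python
import math


def proper_motion_shift(rate_arcsec_per_yr, position_angle_deg, dec_deg, years):
    """Return (d_alpha, d_delta) in arcseconds after the given number of years,
    from equations 29 and 31."""
    d_theta = rate_arcsec_per_yr * years
    phi = math.radians(position_angle_deg)
    d_alpha = d_theta * math.sin(phi) / math.cos(math.radians(dec_deg))
    d_delta = d_theta * math.cos(phi)
    return d_alpha, d_delta


d_alpha_pm, d_delta_pm = proper_motion_shift(3.84, 282.0, -62.67947, 10.0)

print(round(d_alpha_pm, 2), "arcsec =", round(d_alpha_pm / 15.0, 2), "s of RA")
print(round(d_delta_pm, 2), "arcsec")
```

Dividing the RA shift by 15 converts arcseconds to seconds of right ascension, since {1^{s}=15^{\prime\prime}}.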

Julian dates

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 1, Problem 1.7.

The various units commonly used to measure the passage of time, such as days, weeks, months and years, are not conducive to simple calculations, since quantities such as the number of days between two dates are not easy to work out. In astronomy, most time calculations refer to the time difference between two events (for example, the orbital period of a planet or the period of variability of a star), so a method of specifying time that makes such differences easy to calculate is preferred.

The most common such system is the Julian date, in which each day is numbered sequentially, starting from noon Universal Time (UT, or Greenwich Mean Time (GMT), which is the time zone used in the UK during the winter months) on January 1, 4713 BC. The historic reason for this choice of starting date relies on choosing some rather obscure periods (one of which is a 15 year cycle the Romans used for calculating land tax), and isn’t important in astronomy, but if you’re interested, I’ll refer you to the Wikipedia article.

The advantage of using Julian dates is, of course, that if the JD values for two events are known, the time between these events is just the difference in JDs. Of course, if we know the calendar date of an event, it’s a bit of a pain to calculate the corresponding JD, but online converters exist, such as this one provided by the US Naval Observatory.

In order to calculate a JD for a date near to the present, it’s useful to know the JD for a reference date in our current era. The JD of noon UT January 1, 2000 is

\displaystyle  JD2000=2451545.0 \ \ \ \ \ (1)

The JD of a time of day other than noon is calculated by adding on the fraction of a day since noon that has elapsed, so that midnight on January 2, 2000 (that is, 0:00 hours on January 2) is 2451545.5.

For a particular date such as July 14, 2006, 16:15 hours UT, we can work out the JD starting with JD2000 by counting the number of days since JD2000 (remembering to take leap years into account). The period from Jan 1, 2000 to Jan 1, 2006 includes 2 leap years (2000 and 2004) so there are {6\times365+2=2192} days between these dates. Between noon Jan 1, 2006 and noon July 14, 2006, there are {31+28+31+30+31+30+13=194} full days, so we’re now up to {2192+194=2386} complete days since Jan 1, 2000. The time from 12:00 to 16:15 is {4.25/24=0.177083} days, so the JD of July 14, 2006, 16:15 hours UT is

\displaystyle  JD=JD2000+2386+0.177083=2453931.177083 \ \ \ \ \ (2)

There are several variants of the Julian date that are sometimes used. The Modified Julian Date (MJD) is defined as

\displaystyle  MJD\equiv JD-2400000.5 \ \ \ \ \ (3)

which is equivalent to setting the starting date of {MJD=0.0} at 0h November 17, 1858 (thus the MJD starts at midnight UT rather than noon). Historically, the MJD was introduced in 1957 to allow the computers of the day (with very limited memory) to track the orbit of the first artificial satellite, Sputnik. Reducing the size of the numbers allowed the computer to handle the calculations. Thus the MJD of July 14, 2006, 16:15 hours UT is

\displaystyle  MJD=2453931.177083-2400000.5=53930.677083 \ \ \ \ \ (4)
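Counting days by hand is error-prone, so here is a minimal Python sketch that does the same calculation with the standard `datetime` module. It assumes the input is already in UT, uses naive `datetime` objects (no time zones) and ignores leap seconds; the function names are my own.

```python
from datetime import datetime

JD2000 = 2451545.0                          # JD of noon UT, January 1, 2000 (eq 1)
EPOCH_2000 = datetime(2000, 1, 1, 12, 0, 0)


def julian_date(dt):
    """Julian date of a UT datetime, counted in days from the JD2000 epoch."""
    elapsed = dt - EPOCH_2000               # a timedelta; leap years handled by datetime
    return JD2000 + elapsed.total_seconds() / 86400.0


def modified_julian_date(dt):
    """Modified Julian Date as defined in equation 3."""
    return julian_date(dt) - 2400000.5


event = datetime(2006, 7, 14, 16, 15)
print(round(julian_date(event), 6))
print(round(modified_julian_date(event), 6))
```

Because both functions number days sequentially, the time between two events is just the difference of their return values.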

Other variants include the Reduced JD, with starting date of 12:00 on November 16, 1858, Truncated JD (0:00, May 24, 1968), Dublin JD (12:00, December 31, 1899) and Chronological JD (0:00, January 1, 4713 BC, but adjusted for local time zone). All these variants, however, still number days sequentially from their starting date so they are all equally useful for calculating time differences between events.

Astronomical coordinates: declination and right ascension

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 1, Problems 1.4 – 1.6.

Although objects visible in the sky are all at different distances from Earth, for the purposes of observing them from Earth-bound telescopes we can treat the objects in the sky as if they were all on the surface of a sphere whose centre is at the centre of the Earth. This sphere is known as the celestial sphere. The coordinates of an object in the sky can then be specified using only two coordinates, analogous to the latitude and longitude used to identify places on Earth. We can get celestial latitudes by simply projecting Earth’s latitudes onto the celestial sphere. The celestial latitude is called declination (Dec for short) and is measured in degrees ranging from {-90^{\circ}} at the south celestial pole (the point in the sky that is a projection of Earth’s south pole onto the celestial sphere) to {+90^{\circ}} at the north celestial pole.

When we try to specify celestial longitude we can’t just project Earth’s longitude onto the celestial sphere since Earth rotates relative to the stars so the longitude of a point on the Earth’s surface that is directly under some particular star changes continuously over the course of a day. A set of longitude lines is therefore specified that is fixed relative to the stars’ positions on the celestial sphere. This ‘celestial longitude’ is called right ascension and, rather than being measured in degrees as terrestrial longitude is, it is measured in hours, minutes and seconds, with values ranging from 0 hours around (in an easterly direction) to 24 hours.

Although right ascension (or RA as it is more commonly abbreviated) is measured in units of time, we need to be careful in relating RA time to terrestrial time. The 24 hours of RA correspond to one complete rotation of Earth relative to the stars, NOT the Sun! To see the difference, picture the Earth in its orbit about the Sun. Over the course of one complete orbit about the Sun, Earth goes through {360^{\circ}}. As the year is 365.25 days, Earth travels just under {1^{\circ}} of its orbit each day. That means that Earth must rotate about {361^{\circ}} between successive times at which the Sun is on the meridian (that is, at its highest point in the sky for a given day). However, Earth needs to rotate only {360^{\circ}} between successive times at which some particular distant star is on the meridian. The former time interval (which is what we think of as an ordinary day) is called a solar day, while the latter time interval is a sidereal day. A solar day is about 4 minutes longer than a sidereal day, as that’s how long it takes Earth to rotate through {1^{\circ}}. The units of RA measure sidereal time, not solar time, so the time it takes for one complete rotation of the celestial sphere (that is, the time taken to run through one complete cycle of RA from 0 hours up to the next 0 hours) is about 4 minutes less than 24 hours measured by your wristwatch. Another way of putting it is that any given star rises about 4 minutes earlier each day. It is this mismatch between sidereal and solar time that causes the constellations we see in the night sky to shift slowly over the course of a year.
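The "about 4 minutes" figure follows from simple arithmetic, sketched here in Python. This uses only the rough model in the paragraph above (a circular orbit and a year of 365.25 days), so the result is an estimate; the variable names are my own.

```python
SOLAR_DAY_S = 86400.0       # seconds in a 24-hour solar day
DAYS_PER_YEAR = 365.25

# Angle Earth moves along its orbit in one solar day (just under 1 degree)
orbit_deg_per_day = 360.0 / DAYS_PER_YEAR

# Time for Earth to rotate through 1 degree, taking 24 hours per 360 degrees
minutes_per_degree = 24 * 60 / 360.0

# Approximate extra rotation time needed to bring the Sun back to the meridian
extra_minutes = orbit_deg_per_day * minutes_per_degree

# Sidereal day: time to rotate 360 degrees when a solar day covers 360 + orbit_deg_per_day
sidereal_day_s = SOLAR_DAY_S * 360.0 / (360.0 + orbit_deg_per_day)

hours = int(sidereal_day_s // 3600)
minutes = int((sidereal_day_s % 3600) // 60)
seconds = sidereal_day_s % 60
print(round(extra_minutes, 2), "minutes")
print(hours, "h", minutes, "m", round(seconds, 1), "s")
```

This gives a sidereal day of roughly 23 hours 56 minutes, in line with the 4 minute difference quoted above.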

The zero hour of RA is defined as the RA coordinate at the point where the Sun crosses the celestial equator heading north, which is the spring or vernal equinox in the northern hemisphere, and the autumnal equinox in the southern hemisphere and occurs around March 21. This point (0 hours RA and {0^{\circ}} Dec) is called the first point of Aries although confusingly it is actually located in Pisces. [The shift is due to Earth’s precession, or wobble in its axis, which we’ll get to eventually. A long time ago, this point actually was in Aries. Incidentally, if you believe in astrology, you should know that the dates of the various star signs are based on the positions of the constellations several thousand years ago. For example, the star sign of Aries runs from March 21 to April 19, during most of which time the Sun is actually in Pisces. The Sun doesn’t actually enter Aries until around April 17. Yet another reason, if one were needed, not to take horoscopes seriously.]

Due to the tilt of the Earth’s axis of around {23.5^{\circ}}, the declination of the Sun along its path across the sky (known as the ecliptic) varies from a maximum of {+23.5^{\circ}} at the summer (winter) solstice in the northern (southern) hemisphere around June 21 (at which point the Sun’s coordinates are {RA=6^{h};\; Dec=+23.5^{\circ}}), through the equinox around September 21 ({RA=12^{h};\; Dec=0^{\circ}}), to a minimum of {-23.5^{\circ}} at the winter (summer) solstice in the northern (southern) hemisphere around December 21 ({RA=18^{h};\; Dec=-23.5^{\circ}}).

The altitude (angle relative to the horizon) of an object depends on the latitude from which it is observed. At the north pole, the north celestial pole is directly overhead, so an object with {Dec=90^{\circ}} is directly overhead. In general, an object with Dec {\delta} has altitude {\delta} when seen from the north pole. At the north pole, the altitude of any given object is a constant; it just rotates around the sky in a circle around the zenith (the point directly overhead).

As we move south, the celestial north pole moves downwards, so that at latitude {L} the altitude of the celestial north pole is also {L}. Since objects rotate around the celestial north pole, an object with Dec {\delta} has an altitude that depends on where in the sky it is. When it’s on the meridian, its altitude is highest and at that point, the altitude is {90-L+\delta}. This means that an object with {L=\delta} is directly overhead on the meridian. The highest point in the sky reached by the Sun (on the summer solstice) at a given latitude is thus

\displaystyle A=90-L+23.5=113.5-L \ \ \ \ \ (1)

For latitudes less than 23.5, this gives a value greater than 90 which just indicates that the Sun appears past the zenith at an angle with the zenith of {A-90}. For my latitude of +56.5 north, the highest the Sun gets is {57^{\circ}}.

At the other extreme (the winter solstice), the Sun’s Dec is {-23.5} so the altitude is

\displaystyle A=90-L-23.5=66.5-L \ \ \ \ \ (2)

For my latitude, the winter sun thus never gets more than {10^{\circ}} above the horizon on December 21.
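As a quick check of equations 1 and 2, here is a minimal Python sketch of the meridian altitude formula {90-L+\delta}. The function name and the folding of values above {90^{\circ}} back towards the opposite horizon are my own choices for illustration; it assumes a northern-hemisphere observer and the rounded tilt of {23.5^{\circ}} used above.

```python
def noon_altitude(latitude_deg, declination_deg):
    """Altitude in degrees of an object on the meridian: 90 - L + delta."""
    altitude = 90.0 - latitude_deg + declination_deg
    # A value above 90 means the object culminates past the zenith, at an
    # angle (altitude - 90) from it, so measure from the opposite horizon.
    if altitude > 90.0:
        altitude = 180.0 - altitude
    return altitude


OBLIQUITY_DEG = 23.5  # rounded tilt of the Earth's axis used in the text

latitude = 56.5
print(noon_altitude(latitude, +OBLIQUITY_DEG))  # summer solstice
print(noon_altitude(latitude, -OBLIQUITY_DEG))  # winter solstice
```

For a latitude of 56.5 this reproduces the {57^{\circ}} and {10^{\circ}} quoted above.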

Since the north celestial pole has an altitude {L} at latitude {L}, any objects with declinations in the range {90-L<\delta<90} will never set as seen from that latitude and are known as circumpolar. Similarly, stars with declinations {-90<\delta<L-90} will never rise and are permanently invisible from that latitude. At the north pole, all stars with {\delta>0} are circumpolar, and only the northern half of the sky is ever visible. At the equator, no stars are circumpolar, but all stars are visible at some point.

Midnight sun at the summer solstice therefore occurs when the Sun, at declination {\delta=23.5}, is circumpolar, which occurs for latitudes {66.5<L<90}. At the vernal equinox the Sun’s declination is {\delta=0}, so it will never set only at the north and south poles, where it will just skim the horizon all the way round.
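The declination ranges above translate directly into a small classifier. This is only a sketch for a northern-hemisphere observer ({0\le L\le90}); the function name and the returned labels are made up for illustration.

```python
def visibility(latitude_deg, declination_deg):
    """Classify an object by declination for an observer at latitude L (north)."""
    if declination_deg > 90.0 - latitude_deg:
        return "circumpolar"      # never sets
    if declination_deg < latitude_deg - 90.0:
        return "never rises"      # permanently below the horizon
    return "rises and sets"


# The Sun at the summer solstice (declination +23.5) seen from two latitudes.
print(visibility(70.0, 23.5))  # north of 66.5: midnight sun
print(visibility(56.5, 23.5))  # south of 66.5: the Sun still sets
```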

Sidereal and synodic periods of planets

Reference: Carroll, Bradley W. & Ostlie, Dale A. (2007), An Introduction to Modern Astrophysics, 2nd Edition; Pearson Education – Chapter 1, Problems 1.1 – 1.3.

As I recently got a shiny new telescope (for looking at the night sky, not the neighbours!), I thought it would be a good idea to revisit some astrophysics since it’s been something like 40 years since I studied it properly. The book I’ve chosen for this is the one referenced above, and it will take a while for me to work through it, since it has 30 chapters and comes close to 1500 pages. However, since I live in Scotland where clear nights are rare and the sky never gets truly dark between May and July (due to being so far north; the town of Monifieth where I live is almost as far north as Juneau, Alaska), having a form of astronomy that doesn’t depend on actually looking at the sky is very useful.

So let’s begin with some basics about planetary orbits. For most of history, people thought that the Earth was the centre of the universe. Looking at the sky, particularly at night, reinforced this view since all objects visible in the sky appear to rotate around the Earth, and since the Earth felt solid and immovable (except during earthquakes), it was just common sense that the Earth was fixed and everything else revolved around it. This form of common sense is the same kind that got Aristotle in trouble by stating that the natural state of objects was at rest (since everything on Earth seemed to come to rest, given long enough) and that heavier objects fell faster than lighter ones (well, a cannonball falls faster than a feather anyway).

In 1543, Polish astronomer Nicolaus Copernicus published On the Revolutions of the Celestial Spheres, in which he proposed that the Sun, not the Earth, was the true centre of the universe. Copernicus was still a bit of a mystic, however, so he assumed that the orbits of everything around the Sun were all geometrically ‘perfect’ circles. Since the planets’ orbits are all elliptical to some extent, this severely limited the ability of Copernicus’s theory to predict planetary positions, but it was still a giant leap forward in astronomical thought.

One consequence of the Copernican theory is that it is relatively easy to calculate the synodic period of a planet if we know its sidereal period. First, we need to define these two terms.

The sidereal period of a planet is simply the time it takes to make one complete orbit around the Sun. Thus Earth’s sidereal period is 1 year.

To understand the concept of a synodic period, we need to think about some observational astronomy. First, consider a superior planet (that is, a planet whose orbit is further from the Sun than Earth’s, such as Mars). When the positions of Earth and Mars are just right, Mars is directly opposite the Sun in the sky (that is, the angle between a line from Earth to Sun and a line from Earth to Mars is {\pi} or {180^{\circ}}). When this happens, Mars is said to be in opposition. After opposition, both planets continue in their orbits until some future time at which they line up again, giving another opposition. The time between two successive oppositions is the synodic period.

For inferior planets (planets whose orbits are closer to the Sun than Earth’s, of which there are only two: Venus and Mercury), opposition can never be achieved since it’s impossible for the Earth to get between the Sun and the planet. However, the Earth, the inferior planet and the Sun can still line up in two different ways. When the planet is directly between the Earth and the Sun, it is in inferior conjunction, at which point it’s at its closest position to Earth. When the planet is directly on the other side of the Sun, it is in superior conjunction, at which point it is at its furthest position from Earth. (Superior planets can be in superior conjunction as well, of course, but never in inferior conjunction. As a result, superior conjunction for a superior planet is usually called just ‘conjunction’.)

When Venus or Mercury is in inferior conjunction with the Earth, the Earth is in opposition as seen from Venus or Mercury.

We can work out the relation between the sidereal and synodic periods using a bit of geometry. Consider a superior planet such as Mars, and let’s suppose that at {t=0}, Mars is in opposition. How long will it be before the next opposition? Since Mars takes longer to orbit the Sun, its angular speed {\omega_{M}} is smaller than that of Earth {\omega_{E}}. After a synodic period {S} has elapsed, the position angles of Earth and Mars must be equal again, but since Earth is moving faster, it will have made one more complete orbit around the Sun than Mars has. The angle swept out by Earth in time {S} is {\omega_{E}S}, so

\displaystyle  \omega_{E}S=\omega_{M}S+2\pi \ \ \ \ \ (1)

In terms of the sidereal period of Earth, {\omega_{E}=2\pi/P_{E}} (with a similar relation for Mars), so

\displaystyle   \frac{2\pi}{P_{E}}S \displaystyle  = \displaystyle  \frac{2\pi}{P_{M}}S+2\pi\ \ \ \ \ (2)
\displaystyle  \frac{1}{S} \displaystyle  = \displaystyle  \frac{1}{P_{E}}-\frac{1}{P_{M}} \ \ \ \ \ (3)

If all times are in years, then for any superior planet with period {P}

\displaystyle  \frac{1}{S}=1-\frac{1}{P} \ \ \ \ \ (4)

For an inferior planet, the roles of Earth and the planet are reversed, so

\displaystyle  \frac{1}{S}=\frac{1}{P}-1 \ \ \ \ \ (5)

All of this, of course, assumes perfectly circular orbits with planets moving at constant speeds, so the formulas aren’t exact. However, we can get an idea of how good they are by plugging in some actual numbers.

Planet     {P} (years)   {S} (calc, years)   {S} (actual, years)
Mercury    0.241         0.3175              0.3176
Venus      0.616         1.604               1.599
Mars       1.9           2.111               2.136
Jupiter    11.9          1.092               1.092
Saturn     29.5          1.035               1.035
Uranus     84.0          1.012               1.013
Neptune    164.8         1.0061              1.0075
Pluto      248.5         1.0040              1.0048
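The calculated column can be reproduced with a few lines of Python. This is just a sketch of equations 4 and 5 under the same circular-orbit assumption; the function name is mine, and the periods are the rounded values from the table.

```python
def synodic_period(sidereal_years):
    """Synodic period in years for a planet with sidereal period P (Earth = 1)."""
    if sidereal_years == 1.0:
        raise ValueError("a planet with Earth's period never laps or is lapped")
    # abs() covers both cases: 1 - 1/P (superior) and 1/P - 1 (inferior).
    return 1.0 / abs(1.0 - 1.0 / sidereal_years)


sidereal_periods = {
    "Mercury": 0.241, "Venus": 0.616, "Mars": 1.9, "Jupiter": 11.9,
    "Saturn": 29.5, "Uranus": 84.0, "Neptune": 164.8, "Pluto": 248.5,
}

for planet, period in sidereal_periods.items():
    print(f"{planet:8s} {period:7.3f} {synodic_period(period):7.4f}")
```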

The agreement is surprisingly good for such a simple formula. As you might expect, the further out the planet is, the closer {S} gets to 1 Earth year, since the outer planets don’t move that much in their own orbits over the course of a year. Thus the shortest synodic period for a superior planet is for the planet that is furthest away, namely Pluto (or, if you insist on degrading Pluto to the status of a dwarf planet, Neptune). For amateur astronomers, this means that each superior planet (Jupiter and beyond) is best placed for observing (which happens when it’s at opposition) roughly once per year. Mars is at closest approach to Earth about every two years.

Historically, the synodic period of a planet would be the available datum for the planet (since it’s easy to measure the time from one opposition or conjunction to the next), and the formula above would be inverted to give the planet’s actual period as

\displaystyle  P=\begin{cases} \left(1-\frac{1}{S}\right)^{-1} & \mbox{superior planet}\\ \left(1+\frac{1}{S}\right)^{-1} & \mbox{inferior planet} \end{cases} \ \ \ \ \ (6)
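A sketch of equation 6 in Python, fed with the observed synodic periods from the table above. The function name and the boolean flag are illustrative only; as before, circular orbits are assumed, so the recovered periods are approximate.

```python
def sidereal_from_synodic(synodic_years, superior):
    """Invert the synodic period relation (all times in years)."""
    if superior:
        if synodic_years <= 1.0:
            raise ValueError("a superior planet must have S > 1 year")
        return 1.0 / (1.0 - 1.0 / synodic_years)
    return 1.0 / (1.0 + 1.0 / synodic_years)


observed = [("Mercury", 0.3176, False), ("Venus", 1.599, False),
            ("Mars", 2.136, True), ("Jupiter", 1.092, True)]

for planet, synodic, is_superior in observed:
    period = sidereal_from_synodic(synodic, is_superior)
    print(f"{planet:8s} S = {synodic:6.4f}  P = {period:7.3f}")
```

Note how sensitive the result becomes for the outer planets: since {S} is close to 1, a small error in {S} produces a large error in {P}.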

Given the observational data available to someone, such as Copernicus, in the pre-telescope age, we can work out the relative ordering of the planets from the Sun. Since Mercury and Venus are never observed in opposition, they must have orbits closer to the Sun than Earth, and since Mercury’s greatest elongation (angular separation from the Sun) is less than that of Venus, we can deduce that Mercury must be closer to the Sun than Venus.

For the superior planets, we can order them according to decreasing synodic period, which gives the correct ordering of Mars through Pluto (or up to Saturn, since Uranus, Neptune and Pluto were discovered only after the invention of the telescope).

Relativistic electromagnetic potentials

Reference: Griffiths, David J. (2007), Introduction to Electrodynamics, 3rd Edition; Pearson Education – Chapter 12, Problem 12.56.

Maxwell’s equations can be written in terms of the electromagnetic field tensor, its dual and the current four-vector as

\displaystyle   \partial_{j}F^{ij} \displaystyle  = \displaystyle  \mu_{0}J^{i}\ \ \ \ \ (1)
\displaystyle  \partial_{j}G^{ij} \displaystyle  = \displaystyle  0 \ \ \ \ \ (2)

where

\displaystyle   F^{ij} \displaystyle  = \displaystyle  \left[\begin{array}{cccc} 0 & E_{x} & E_{y} & E_{z}\\ -E_{x} & 0 & B_{z} & -B_{y}\\ -E_{y} & -B_{z} & 0 & B_{x}\\ -E_{z} & B_{y} & -B_{x} & 0 \end{array}\right]\ \ \ \ \ (3)
\displaystyle  G^{ij} \displaystyle  = \displaystyle  \left[\begin{array}{cccc} 0 & B_{x} & B_{y} & B_{z}\\ -B_{x} & 0 & -E_{z} & E_{y}\\ -B_{y} & E_{z} & 0 & -E_{x}\\ -B_{z} & -E_{y} & E_{x} & 0 \end{array}\right]\ \ \ \ \ (4)
\displaystyle  J^{i} \displaystyle  = \displaystyle  \left[\rho,J_{x},J_{y},J_{z}\right]\ \ \ \ \ (5)
\displaystyle  \displaystyle  = \displaystyle  \frac{\rho_{0}}{\sqrt{1-\beta^{2}}}\left[1,u_{x},u_{y},u_{z}\right] \ \ \ \ \ (6)

It turns out that an even more compact form for Maxwell’s equations can be written using the 4-vector potential

\displaystyle   A^{i} \displaystyle  = \displaystyle  \left(\frac{V}{c},A_{x},A_{y},A_{z}\right)\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  \left(V,A_{x},A_{y},A_{z}\right) \ \ \ \ \ (8)

where the last line uses relativistic units where {c=1}.

Griffiths shows in section 12.3.5 that the field tensor can be written in terms of the potentials as

\displaystyle  F^{ij}=\partial^{i}A^{j}-\partial^{j}A^{i} \ \ \ \ \ (9)

Note that we’re using the contravariant gradient operator here, in order to get the signs right on the time components. Because of the form of {F^{ij}}, the gauge invariance shows up naturally, since if we replace {A^{i}} by

\displaystyle  A^{i}\rightarrow A^{i}+\partial^{i}\lambda \ \ \ \ \ (10)

where {\lambda} is any scalar function, {F^{ij}} is unchanged, as the order in which the partial derivatives of {\lambda} are taken doesn’t matter, so {\lambda} drops out of the equation for {F^{ij}}. The Lorentz gauge condition is

\displaystyle  \nabla\cdot\mathbf{A}=-\frac{\partial V}{\partial t}=-\partial_{0}V \ \ \ \ \ (11)

[Notice we’re back to using the covariant gradient operator.] This can be condensed to read

\displaystyle  \partial_{i}A^{i}=0 \ \ \ \ \ (12)

[Incidentally, Griffiths’s equation 12.135 is wrong; it should read {\partial A^{\mu}/\partial x^{\mu}=0}.]

Combining 1 with 9 gives

\displaystyle  \partial_{j}\partial^{i}A^{j}-\partial_{j}\partial^{j}A^{i}=\mu_{0}J^{i} \ \ \ \ \ (13)

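If we use the Lorentz gauge, the first term is zero, since the partial derivatives commute and {\partial_{j}\partial^{i}A^{j}=\partial^{i}\left(\partial_{j}A^{j}\right)=0}, so we get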

\displaystyle   \partial_{j}\partial^{j}A^{i} \displaystyle  = \displaystyle  -\mu_{0}J^{i}\ \ \ \ \ (14)
\displaystyle  \Box^{2}A^{i} \displaystyle  = \displaystyle  -\mu_{0}J^{i} \ \ \ \ \ (15)

where the symbol {\Box^{2}} is the d’Alembertian operator, defined as

\displaystyle  \Box^{2}\equiv\partial_{j}\partial^{j} \ \ \ \ \ (16)
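To connect this with the more familiar three-dimensional form, we can write the operator out in components. This is a sketch assuming the sign convention used in these posts (raising a time index flips its sign, so {\partial^{0}=-\partial_{0}}) and units where {c=1}:

\displaystyle  \Box^{2}=\partial_{0}\partial^{0}+\partial_{1}\partial^{1}+\partial_{2}\partial^{2}+\partial_{3}\partial^{3}=\nabla^{2}-\frac{\partial^{2}}{\partial t^{2}}

With {A^{i}=\left(V,\mathbf{A}\right)} and {J^{i}=\left(\rho,\mathbf{J}\right)}, the time and space components of the wave equation for {A^{i}} are then

\displaystyle  \nabla^{2}V-\frac{\partial^{2}V}{\partial t^{2}}=-\mu_{0}\rho

\displaystyle  \nabla^{2}\mathbf{A}-\frac{\partial^{2}\mathbf{A}}{\partial t^{2}}=-\mu_{0}\mathbf{J}

which are the usual inhomogeneous wave equations for the potentials in the Lorentz gauge (recall that {\mu_{0}=1/\epsilon_{0}} when {c=1}).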

We can verify that the other Maxwell equation 2 is also satisfied by the potential formulation by using the earlier result

\displaystyle  \partial_{j}G^{ij}=\partial_{a}F_{bc}+\partial_{b}F_{ca}+\partial_{c}F_{ab} \ \ \ \ \ (17)


where {i}, {a}, {b} and {c} are all different.

Lowering the indexes on 9 we get

\displaystyle  F_{ij}=\partial_{i}A_{j}-\partial_{j}A_{i} \ \ \ \ \ (18)

Substituting 18 into 17 we get

\displaystyle  \partial_{j}G^{ij}=\partial_{a}\partial_{b}A_{c}-\partial_{a}\partial_{c}A_{b}+\partial_{b}\partial_{c}A_{a}-\partial_{b}\partial_{a}A_{c}+\partial_{c}\partial_{a}A_{b}-\partial_{c}\partial_{b}A_{a}=0 \ \ \ \ \ (19)

Contravariant gradient operator

Reference: Griffiths, David J. (2007), Introduction to Electrodynamics, 3rd Edition; Pearson Education – Chapter 12, Problem 12.55.

The gradient of a scalar function {\phi} is a covariant vector since it transforms as

\displaystyle  \frac{\partial\phi}{\partial\bar{x}{}^{a}}=\frac{\partial\phi}{\partial x^{i}}\frac{\partial x^{i}}{\partial\bar{x}{}^{a}} \ \ \ \ \ (1)

We can therefore regard the gradient operator {\partial_{a}} on its own as a covariant vector, so it should have a contravariant counterpart. In flat space, the only change in switching from covariant to contravariant is that the time component changes sign. Given that the Lorentz transformation for a contravariant four-vector is

\displaystyle   \bar{x}^{0} \displaystyle  = \displaystyle  \gamma\left(x^{0}-\beta x^{1}\right)\ \ \ \ \ (2)
\displaystyle  \bar{x}^{1} \displaystyle  = \displaystyle  \gamma\left(x^{1}-\beta x^{0}\right)\ \ \ \ \ (3)
\displaystyle  \bar{x}^{2} \displaystyle  = \displaystyle  x^{2}\ \ \ \ \ (4)
\displaystyle  \bar{x}^{3} \displaystyle  = \displaystyle  x^{3} \ \ \ \ \ (5)

the transformations for the covariant four-vector are obtained by lowering all indices and replacing the time components by their negatives:

\displaystyle   \bar{x}_{0} \displaystyle  = \displaystyle  \gamma\left(x_{0}+\beta x_{1}\right)\ \ \ \ \ (6)
\displaystyle  \bar{x}_{1} \displaystyle  = \displaystyle  \gamma\left(x_{1}+\beta x_{0}\right)\ \ \ \ \ (7)
\displaystyle  \bar{x}_{2} \displaystyle  = \displaystyle  x_{2}\ \ \ \ \ (8)
\displaystyle  \bar{x}_{3} \displaystyle  = \displaystyle  x_{3} \ \ \ \ \ (9)

where we multiplied the {\bar{x}_{0}} equation through by {-1}. The corresponding inverse transformations are obtained by replacing {\beta} by {-\beta}:

\displaystyle   x_{0} \displaystyle  = \displaystyle  \gamma\left(\bar{x}_{0}-\beta\bar{x}_{1}\right)\ \ \ \ \ (10)
\displaystyle  x_{1} \displaystyle  = \displaystyle  \gamma\left(\bar{x}_{1}-\beta\bar{x}_{0}\right)\ \ \ \ \ (11)
\displaystyle  x_{2} \displaystyle  = \displaystyle  \bar{x}_{2}\ \ \ \ \ (12)
\displaystyle  x_{3} \displaystyle  = \displaystyle  \bar{x}_{3} \ \ \ \ \ (13)

Thus the inverse covariant transformations are the same as the forward contravariant transformations.

The contravariant gradient is {\partial^{i}\phi=\frac{\partial\phi}{\partial x_{i}}} so

\displaystyle   \overline{\partial^{i}\phi} \displaystyle  = \displaystyle  \frac{\partial\phi}{\partial\bar{x}_{i}}\ \ \ \ \ (14)
\displaystyle  \displaystyle  = \displaystyle  \frac{\partial\phi}{\partial x_{k}}\frac{\partial x_{k}}{\partial\bar{x}_{i}}\ \ \ \ \ (15)
\displaystyle  \displaystyle  = \displaystyle  \partial^{k}\phi\frac{\partial x_{k}}{\partial\bar{x}_{i}} \ \ \ \ \ (16)

The transformations for each value of {i} are then

\displaystyle   \overline{\partial^{0}\phi} \displaystyle  = \displaystyle  \gamma\partial^{0}\phi-\beta\gamma\partial^{1}\phi\ \ \ \ \ (17)
\displaystyle  \overline{\partial^{1}\phi} \displaystyle  = \displaystyle  \gamma\partial^{1}\phi-\beta\gamma\partial^{0}\phi\ \ \ \ \ (18)
\displaystyle  \overline{\partial^{2}\phi} \displaystyle  = \displaystyle  \partial^{2}\phi\ \ \ \ \ (19)
\displaystyle  \overline{\partial^{3}\phi} \displaystyle  = \displaystyle  \partial^{3}\phi \ \ \ \ \ (20)

Thus {\partial^{i}\phi} transforms like a contravariant vector.
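The relation between the contravariant transformation 2 to 5 and the covariant transformation 6 to 9 can be spot-checked numerically. The sketch below (function names are my own) boosts a sample four-vector and then lowers its index, and compares that with lowering first and applying the covariant transformation. It also checks that the contraction {x_{i}x^{i}} is unchanged by the boost.

```python
import math


def boost_contravariant(x, beta):
    """Equations 2-5 applied to contravariant components (x0, x1, x2, x3)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    x0, x1, x2, x3 = x
    return (gamma * (x0 - beta * x1), gamma * (x1 - beta * x0), x2, x3)


def boost_covariant(x_low, beta):
    """Equations 6-9 applied to covariant components."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    x0, x1, x2, x3 = x_low
    return (gamma * (x0 + beta * x1), gamma * (x1 + beta * x0), x2, x3)


def lower(x):
    """Lower the index in flat space: only the time component changes sign."""
    return (-x[0], x[1], x[2], x[3])


def contract(x_low, x_up):
    """The invariant sum x_i x^i."""
    return sum(a * b for a, b in zip(x_low, x_up))


beta = 0.6
x_up = (2.0, 1.0, -3.0, 0.5)

via_lowering = lower(boost_contravariant(x_up, beta))
direct = boost_covariant(lower(x_up), beta)
print(via_lowering)
print(direct)
```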

Electromagnetic Minkowski force

Reference: Griffiths, David J. (2007), Introduction to Electrodynamics, 3rd Edition; Pearson Education – Chapter 12, Problem 12.54.

The Minkowski force is defined in general by

\displaystyle  \mathbf{K}=\frac{d\mathbf{p}}{d\tau}=\frac{1}{\sqrt{1-\beta^{2}}}\frac{d\mathbf{p}}{dt}=\frac{1}{\sqrt{1-\beta^{2}}}\mathbf{F} \ \ \ \ \ (1)


where {\mathbf{p}} is the spatial part of the four-momentum. Griffiths shows in his section 12.3.4 that the Minkowski force due to electromagnetic fields is

\displaystyle  K^{i}=q\eta_{j}F^{ij} \ \ \ \ \ (2)


where the electromagnetic field tensor is (with {c=1}):

\displaystyle  F^{ij}=\left[\begin{array}{cccc} 0 & E_{x} & E_{y} & E_{z}\\ -E_{x} & 0 & B_{z} & -B_{y}\\ -E_{y} & -B_{z} & 0 & B_{x}\\ -E_{z} & B_{y} & -B_{x} & 0 \end{array}\right] \ \ \ \ \ (3)

and the proper velocity is

\displaystyle  \eta_{i}=\frac{dx_{i}}{d\tau}=\frac{u_{i}}{\sqrt{1-\beta^{2}}} \ \ \ \ \ (4)

Griffiths shows that the spatial components of {K^{i}} work out to

\displaystyle  \mathbf{K}=\frac{q}{\sqrt{1-\beta^{2}}}\left(\mathbf{E}+\mathbf{u}\times\mathbf{B}\right)=\frac{1}{\sqrt{1-\beta^{2}}}\mathbf{F} \ \ \ \ \ (5)

so the relation 1 between {\mathbf{K}} and {\mathbf{F}} is correct.

To see what the time component of 2 gives us, we can just work it out by reading off the first row of {F^{ij}}:

\displaystyle   K^{0} \displaystyle  = \displaystyle  q\eta_{j}F^{0j}\ \ \ \ \ (6)
\displaystyle  \displaystyle  = \displaystyle  \frac{q}{\sqrt{1-\beta^{2}}}\left(u_{x}E_{x}+u_{y}E_{y}+u_{z}E_{z}\right)\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  \frac{q}{\sqrt{1-\beta^{2}}}\mathbf{u}\cdot\mathbf{E}\ \ \ \ \ (8)
\displaystyle  \displaystyle  = \displaystyle  \frac{1}{\sqrt{1-\beta^{2}}}\mathbf{u}\cdot\mathbf{F}_{E} \ \ \ \ \ (9)

The term {\mathbf{u}\cdot\mathbf{F}_{E}} is the rate at which the electric field does work on the charge as it moves along (remember that because the direction of motion is always perpendicular to the force exerted by a magnetic field, {\mathbf{B}} never does any work, so this is the total work done by the electromagnetic field).
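Equations 5 and 8 can be checked by carrying out the contraction in 2 numerically. The following is a sketch with {c=1}; the function names and sample values are arbitrary. It assumes the sign convention used in these posts, in which the covariant proper velocity has time component {\eta_{0}=-\gamma}.

```python
import math


def field_tensor(E, B):
    """The matrix F^{ij} of equation 3 (c = 1)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, Ex, Ey, Ez],
            [-Ex, 0.0, Bz, -By],
            [-Ey, -Bz, 0.0, Bx],
            [-Ez, By, -Bx, 0.0]]


def minkowski_force(q, u, E, B):
    """K^i = q eta_j F^{ij}, summed over j."""
    gamma = 1.0 / math.sqrt(1.0 - sum(ui * ui for ui in u))
    # Assumed convention: covariant components, so the time part is -gamma.
    eta_low = [-gamma] + [gamma * ui for ui in u]
    F = field_tensor(E, B)
    return [q * sum(eta_low[j] * F[i][j] for j in range(4)) for i in range(4)]


def expected_force(q, u, E, B):
    """gamma * q * (u.E, E + u x B), from equations 5 and 8."""
    gamma = 1.0 / math.sqrt(1.0 - sum(ui * ui for ui in u))
    cross = (u[1] * B[2] - u[2] * B[1],
             u[2] * B[0] - u[0] * B[2],
             u[0] * B[1] - u[1] * B[0])
    power = sum(ui * Ei for ui, Ei in zip(u, E))
    return [gamma * q * power] + [gamma * q * (E[k] + cross[k]) for k in range(3)]


q, u = 2.0, (0.3, -0.2, 0.5)
E, B = (1.0, 2.0, -1.0), (0.5, -1.5, 2.0)
print(minkowski_force(q, u, E, B))
print(expected_force(q, u, E, B))
```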

Self-force on a dipole in hyperbolic motion

Reference: Griffiths, David J. (2007), Introduction to Electrodynamics, 3rd Edition; Pearson Education – Chapter 12, Problem 12.61.

Here we’ll revisit the problem of calculating the self-force of a moving charge. In our earlier derivation, we considered a single charge {q} split into two equal charges {q/2} separated by a distance {d} and moving perpendicular to the line joining the two charges. If the motion is in the {x} direction and the line joining the charges is parallel to the {z} axis, the only net electric field felt by one charge due to the field generated by the other charge at a retarded time {t_{r}} is

\displaystyle  E_{x}=\frac{q}{8\pi\epsilon_{0}c^{3}}\frac{r}{\left(r-\frac{lv}{c}\right)^{3}}\left[\left(\frac{cl}{r}-v\right)\left(\frac{c^{2}}{\gamma^{2}}+la\right)-ac\left(r-\frac{lv}{c}\right)\right] \ \ \ \ \ (1)


where {r} is the distance from the other charge at the retarded time to the current charge at the current time, {l} is the distance moved in the {x} direction in the time {t-t_{r}} and {v} and {a} are the velocity and acceleration at the retarded time. {\gamma=1/\sqrt{1-v^{2}/c^{2}}} as usual. The calculation proceeded from here by assuming that the separation distance {d} was small and deriving the self-force in that limiting case.

We’ll now consider an electric dipole consisting of charges {+q} and {-q} separated by a distance {d}, but without assuming {d} to be small. The electric field due to {+q} at the retarded time {t_{r}} felt by {-q} at the current time {t} is therefore

\displaystyle   E_{x} \displaystyle  = \displaystyle  \frac{q}{4\pi\epsilon_{0}c^{3}}\frac{r}{\left(r-\frac{lv}{c}\right)^{3}}\left[\left(\frac{cl}{r}-v\right)\left(\frac{c^{2}}{\gamma^{2}}+la\right)-ac\left(r-\frac{lv}{c}\right)\right]\ \ \ \ \ (2)
\displaystyle  \displaystyle  = \displaystyle  \frac{q}{4\pi\epsilon_{0}c^{2}}\frac{1}{\left(r-\frac{lv}{c}\right)^{3}}\left[\frac{c^{2}l}{\gamma^{2}}+al^{2}-\frac{cvr}{\gamma^{2}}-ar^{2}\right] \ \ \ \ \ (3)

where we’ve changed the 8 in the denominator to a 4 because we’ve replaced {q/2} by {q}. To go any further, we need to make some assumptions about the motion of the dipole, so we’ll suppose that it is moving under hyperbolic motion. Griffiths wants us to consider the position of the charge to be given by

\displaystyle  x\left(t\right)=\frac{mc^{2}}{F}\left[\sqrt{1+\left(Ft/mc\right)^{2}}-1\right] \ \ \ \ \ (4)


However, to simplify the notation a bit, we’ll consider the more general form

\displaystyle  x\left(t\right)=\sqrt{b^{2}+\left(ct\right)^{2}} \ \ \ \ \ (5)


where {b} is a constant with dimensions of length. This is equivalent to 4 except that {x=b} at {t=0} rather than {x=0}. Since all calculations depend only on how much the dipole moves over a time interval and not on its absolute position, this change won’t affect anything that follows.

Using this form, we can calculate the velocity and acceleration by taking derivatives:

\displaystyle   v\left(t\right) \displaystyle  = \displaystyle  \frac{c^{2}t}{\sqrt{b^{2}+\left(ct\right)^{2}}}=\frac{c^{2}t}{x}\ \ \ \ \ (6)
\displaystyle  a\left(t\right) \displaystyle  = \displaystyle  \frac{c^{2}}{\sqrt{b^{2}+\left(ct\right)^{2}}}-\frac{c^{4}t^{2}}{\left(b^{2}+\left(ct\right)^{2}\right)^{3/2}}\ \ \ \ \ (7)
\displaystyle  \displaystyle  = \displaystyle  \frac{b^{2}c^{2}}{\left(b^{2}+\left(ct\right)^{2}\right)^{3/2}}=\frac{b^{2}c^{2}}{x^{3}}\ \ \ \ \ (8)
\displaystyle  \frac{1}{\gamma^{2}} \displaystyle  = \displaystyle  1-\frac{v^{2}}{c^{2}}\ \ \ \ \ (9)
\displaystyle  \displaystyle  = \displaystyle  \frac{b^{2}}{x^{2}} \ \ \ \ \ (10)
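These derivatives are easy to get wrong, so here is a small numerical check of 6, 8 and 10 using central differences. The function names, the step size and the sample values of {b} and {t} are arbitrary choices for this sketch.

```python
import math


def position(t, b, c=1.0):
    """Equation 5: hyperbolic motion with x = b at t = 0."""
    return math.sqrt(b * b + (c * t) ** 2)


def velocity(t, b, c=1.0):
    """Equation 6."""
    return c * c * t / position(t, b, c)


def acceleration(t, b, c=1.0):
    """Equation 8."""
    return b * b * c * c / position(t, b, c) ** 3


def central_difference(f, t, h=1e-5):
    """Numerical derivative of f at t."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


b, c, t = 2.0, 1.0, 1.5
v_numeric = central_difference(lambda s: position(s, b, c), t)
a_numeric = central_difference(lambda s: velocity(s, b, c), t)
inverse_gamma_sq = 1.0 - (velocity(t, b, c) / c) ** 2

print(v_numeric, velocity(t, b, c))
print(a_numeric, acceleration(t, b, c))
print(inverse_gamma_sq, (b / position(t, b, c)) ** 2)
```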

Adding a subscript {r} to indicate a quantity evaluated at {t_{r}} and plugging these equations into 3, we get

\displaystyle  \frac{4\pi\epsilon_{0}c^{2}}{q}E_{x}=\left(r-\frac{lct_{r}}{x_{r}}\right)^{-3}\frac{c^{2}b^{2}}{x_{r}^{2}}\left[l+\frac{l^{2}}{x_{r}}-\frac{crt_{r}}{x_{r}}-\frac{r^{2}}{x_{r}}\right] \ \ \ \ \ (11)

We now need to express {r}, {l} and {x_{r}} in terms of {t_{r}}. From the geometry of the setup

\displaystyle  r^{2}=l^{2}+d^{2} \ \ \ \ \ (12)

and since a signal travels from {+q} at time {t_{r}} to {-q} at time {t} over a distance {r}, we have

\displaystyle   r \displaystyle  = \displaystyle  c\left(t-t_{r}\right)\ \ \ \ \ (13)
\displaystyle  l \displaystyle  = \displaystyle  \sqrt{c^{2}\left(t-t_{r}\right)^{2}-d^{2}} \ \ \ \ \ (14)

Substituting these into 11 and simplifying gives

\displaystyle   \frac{4\pi\epsilon_{0}}{qb^{2}}E_{x} \displaystyle  = \displaystyle  \frac{c^{2}t_{r}\left(t-t_{r}\right)+d^{2}-x_{r}\sqrt{c^{2}\left(t-t_{r}\right)^{2}-d^{2}}}{c^{3}\left(t_{r}\sqrt{c^{2}\left(t-t_{r}\right)^{2}-d^{2}}-x_{r}t+x_{r}t_{r}\right)^{3}}\ \ \ \ \ (15)
\displaystyle  \displaystyle  = \displaystyle  \frac{c^{2}t_{r}\left(t-t_{r}\right)+d^{2}-\sqrt{b^{2}+\left(ct_{r}\right)^{2}}\sqrt{c^{2}\left(t-t_{r}\right)^{2}-d^{2}}}{c^{3}\left(t_{r}\sqrt{c^{2}\left(t-t_{r}\right)^{2}-d^{2}}-\sqrt{b^{2}+\left(ct_{r}\right)^{2}}\left(t-t_{r}\right)\right)^{3}} \ \ \ \ \ (16)

where we used 5 to get rid of {x_{r}} in the last line.

Also, {l} is the distance moved in time {t-t_{r}}, so

\displaystyle   l \displaystyle  = \displaystyle  x\left(t\right)-x\left(t_{r}\right)\ \ \ \ \ (17)
\displaystyle  \displaystyle  = \displaystyle  \sqrt{b^{2}+\left(ct\right)^{2}}-\sqrt{b^{2}+\left(ct_{r}\right)^{2}} \ \ \ \ \ (18)

[Note that we can’t take {l=v\left(t-t_{r}\right)}, since the velocity is changing over that time interval.] We can therefore find {t} in terms of {t_{r}} by solving

\displaystyle  \sqrt{c^{2}\left(t-t_{r}\right)^{2}-d^{2}}=\sqrt{b^{2}+\left(ct\right)^{2}}-\sqrt{b^{2}+\left(ct_{r}\right)^{2}} \ \ \ \ \ (19)

This turns out to be a quadratic equation, with solutions

\displaystyle  t=t_{r}+\frac{1}{2cb^{2}t_{r}}\left[ct_{r}^{2}d^{2}\pm dt_{r}\sqrt{\left(4b^{2}+d^{2}\right)\left(c^{2}t_{r}^{2}+b^{2}\right)}\right] \ \ \ \ \ (20)

To decide which sign to take, we note that for very small {d}, the square root term dominates since the first term is of order {d^{2}}. Since we must have {t>t_{r}}, we’ll need to take the + sign. We then get for {l}:

\displaystyle   l \displaystyle  = \displaystyle  \sqrt{c^{2}\left(t-t_{r}\right)^{2}-d^{2}}\ \ \ \ \ (21)
\displaystyle  \displaystyle  = \displaystyle  \frac{d}{2b^{2}}\sqrt{\left(4c^{2}b^{2}t_{r}^{2}+2c^{2}d^{2}t_{r}^{2}+d^{2}b^{2}+2cdt_{r}\sqrt{\left(4b^{2}+d^{2}\right)\left(c^{2}t_{r}^{2}+b^{2}\right)}\right)}\ \ \ \ \ (22)
\displaystyle  \displaystyle  = \displaystyle  \frac{d}{2b^{2}}\sqrt{\left(ct_{r}\sqrt{\left(4b^{2}+d^{2}\right)}+d\sqrt{c^{2}t_{r}^{2}+b^{2}}\right)^{2}}\ \ \ \ \ (23)
\displaystyle  \displaystyle  = \displaystyle  \frac{d}{2b^{2}}\left(ct_{r}\sqrt{\left(4b^{2}+d^{2}\right)}+d\sqrt{c^{2}t_{r}^{2}+b^{2}}\right) \ \ \ \ \ (24)
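Since the algebra leading to 20 and 24 is fairly messy, a numerical sanity check is reassuring. The sketch below takes the + root of 20 (with the common factor of {t_{r}} cancelled, so that {t_{r}=0} causes no trouble) and confirms that it satisfies 19 and agrees with 24. The function names are my own, and the check is restricted to {t_{r}\ge0}, where {l} is non-negative.

```python
import math


def position(t, b, c=1.0):
    """Equation 5."""
    return math.sqrt(b * b + (c * t) ** 2)


def reception_delay(t_r, b, d, c=1.0):
    """t - t_r from equation 20 with the + sign."""
    x_r = position(t_r, b, c)
    return (c * t_r * d * d + d * math.sqrt(4 * b * b + d * d) * x_r) / (2 * c * b * b)


def displacement(t_r, b, d, c=1.0):
    """l from equation 24."""
    x_r = position(t_r, b, c)
    return d / (2 * b * b) * (c * t_r * math.sqrt(4 * b * b + d * d) + d * x_r)


b, d, c = 1.0, 1.0, 1.0
for t_r in (0.0, 0.5, 2.0):
    delay = reception_delay(t_r, b, d, c)
    lhs = math.sqrt((c * delay) ** 2 - d * d)                   # left side of 19
    rhs = position(t_r + delay, b, c) - position(t_r, b, c)     # right side of 19
    print(t_r, lhs, rhs, displacement(t_r, b, d, c))
```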

Plugging this into 16 we get

\displaystyle   \frac{4\pi\epsilon_{0}}{qb^{2}}E_{x} \displaystyle  = \displaystyle  \frac{c^{2}t_{r}\left(t-t_{r}\right)+d^{2}-\frac{d}{2b^{2}}\left(ct_{r}\sqrt{\left(4b^{2}+d^{2}\right)}+d\sqrt{c^{2}t_{r}^{2}+b^{2}}\right)\sqrt{b^{2}+\left(ct_{r}\right)^{2}}}{c^{3}\left(t_{r}\frac{d}{2b^{2}}\left(ct_{r}\sqrt{\left(4b^{2}+d^{2}\right)}+d\sqrt{c^{2}t_{r}^{2}+b^{2}}\right)-\sqrt{b^{2}+\left(ct_{r}\right)^{2}}\left(t-t_{r}\right)\right)^{3}}\ \ \ \ \ (25)
\displaystyle  \displaystyle  = \displaystyle  \frac{4b^{4}}{c^{3}}\frac{t_{r}^{2}c^{2}d^{2}-2c^{2}b^{2}t_{r}\left(t-t_{r}\right)-d^{2}b^{2}+ct_{r}d\sqrt{\left(4b^{2}+d^{2}\right)}\sqrt{c^{2}t_{r}^{2}+b^{2}}}{\left[\sqrt{c^{2}t_{r}^{2}+b^{2}}\left(2b^{2}\left(t-t_{r}\right)-d^{2}t_{r}\right)-cdt_{r}^{2}\sqrt{\left(4b^{2}+d^{2}\right)}\right]^{3}} \ \ \ \ \ (26)

We can now substitute from 20

\displaystyle  t-t_{r}=\frac{1}{2cb^{2}t_{r}}\left[ct_{r}^{2}d^{2}+dt_{r}\sqrt{\left(4b^{2}+d^{2}\right)\left(c^{2}t_{r}^{2}+b^{2}\right)}\right] \ \ \ \ \ (27)

The numerator in 26 is now

\displaystyle   t_{r}^{2}c^{2}d^{2}-c\left[ct_{r}^{2}d^{2}+dt_{r}\sqrt{\left(4b^{2}+d^{2}\right)\left(c^{2}t_{r}^{2}+b^{2}\right)}\right]-d^{2}b^{2}+ct_{r}d\sqrt{\left(4b^{2}+d^{2}\right)}\sqrt{c^{2}t_{r}^{2}+b^{2}} \displaystyle  = \displaystyle  -d^{2}b^{2} \ \ \ \ \ (28)

and the denominator is

\displaystyle   \left[\sqrt{c^{2}t_{r}^{2}+b^{2}}\left(\frac{1}{ct_{r}}\left[ct_{r}^{2}d^{2}+dt_{r}\sqrt{\left(4b^{2}+d^{2}\right)\left(c^{2}t_{r}^{2}+b^{2}\right)}\right]-d^{2}t_{r}\right)-cdt_{r}^{2}\sqrt{\left(4b^{2}+d^{2}\right)}\right]^{3} \displaystyle  = \displaystyle  \left[\sqrt{\left(4b^{2}+d^{2}\right)}\left(\frac{d}{c}\left(c^{2}t_{r}^{2}+b^{2}\right)-cdt_{r}^{2}\right)\right]^{3}\ \ \ \ \ (29)
\displaystyle  \displaystyle  = \displaystyle  \frac{b^{6}d^{3}}{c^{3}}\left(4b^{2}+d^{2}\right)^{3/2} \ \ \ \ \ (30)

Putting this back into 26 we get

\displaystyle   \frac{4\pi\epsilon_{0}}{qb^{2}}E_{x} \displaystyle  = \displaystyle  \frac{4b^{4}}{c^{3}}\left(-\frac{d^{2}b^{2}}{\frac{b^{6}d^{3}}{c^{3}}\left(4b^{2}+d^{2}\right)^{3/2}}\right)\ \ \ \ \ (31)
\displaystyle  \displaystyle  = \displaystyle  -\frac{4}{d\left(4b^{2}+d^{2}\right)^{3/2}}\ \ \ \ \ (32)
\displaystyle  E_{x} \displaystyle  = \displaystyle  -\frac{qb^{2}}{\pi\epsilon_{0}d\left(4b^{2}+d^{2}\right)^{3/2}}\ \ \ \ \ (33)
\displaystyle  \displaystyle  = \displaystyle  -\frac{q}{8\pi\epsilon_{0}db\left(1+\left(d/2b\right)^{2}\right)^{3/2}} \ \ \ \ \ (34)
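As a spot check on the simplification from 15 to 32, we can evaluate the right-hand side of 15 directly at several retarded times and compare it with the closed form 32. This is only a sketch: the function names are mine, {x_{r}} is taken from 5 evaluated at {t_{r}}, the delay {t-t_{r}} is taken from 20 with the + sign, and only {t_{r}\ge0} is tested.

```python
import math


def scaled_field_direct(t_r, b, d, c=1.0):
    """Right-hand side of equation 15, i.e. 4*pi*eps0*E_x / (q*b^2)."""
    x_r = math.sqrt(b * b + (c * t_r) ** 2)                        # equation 5 at t_r
    delay = (c * t_r * d * d
             + d * math.sqrt(4 * b * b + d * d) * x_r) / (2 * c * b * b)  # equation 20
    l = math.sqrt((c * delay) ** 2 - d * d)                         # equation 14
    numerator = c * c * t_r * delay + d * d - x_r * l
    denominator = c ** 3 * (t_r * l - x_r * delay) ** 3
    return numerator / denominator


def scaled_field_closed_form(b, d):
    """Equation 32."""
    return -4.0 / (d * (4 * b * b + d * d) ** 1.5)


b, d = 1.0, 1.0
for t_r in (0.0, 0.5, 1.0, 2.0):
    print(t_r, scaled_field_direct(t_r, b, d), scaled_field_closed_form(b, d))
```

Every row should show the same pair of values, whatever {t_{r}} is.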

Miraculously, the dependence on both {t} and {t_{r}} has disappeared, showing that the field (and therefore the self-force) is constant. To get this answer back into the form given by Griffiths, we compare 4 and 5.

\displaystyle   x \displaystyle  = \displaystyle  \sqrt{b^{2}+c^{2}t^{2}}\ \ \ \ \ (35)
\displaystyle  \displaystyle  = \displaystyle  b\sqrt{1+\left(\frac{ct}{b}\right)^{2}}\ \ \ \ \ (36)
\displaystyle  \displaystyle  = \displaystyle  \frac{mc^{2}}{F}\sqrt{1+\left(Ft/mc\right)^{2}}\ \ \ \ \ (37)
\displaystyle  b \displaystyle  = \displaystyle  \frac{mc^{2}}{F} \ \ \ \ \ (38)

[We can ignore the {-1} in 4 since that just changes the origin of {x}, and all the calculations above depend only on {x_{r}-x} so the {-1} cancels out.] With this value for {b} we have

\displaystyle  E_{x}=-\frac{qF}{8\pi\epsilon_{0}mc^{2}d\left(1+\left(Fd/2mc^{2}\right)^{2}\right)^{3/2}} \ \ \ \ \ (39)

The force felt by {-q} is therefore

\displaystyle  F_{x}=-qE_{x}=\frac{q^{2}F}{8\pi\epsilon_{0}mc^{2}d\left(1+\left(Fd/2mc^{2}\right)^{2}\right)^{3/2}} \ \ \ \ \ (40)

The total self-force on the dipole is twice this, since there is an equal force on {+q} due to {-q}, so

\displaystyle  F_{tot}=\frac{q^{2}F}{4\pi\epsilon_{0}mc^{2}d\left(1+\left(Fd/2mc^{2}\right)^{2}\right)^{3/2}} \ \ \ \ \ (41)

[However, it would seem to me that there should also be forces on each charge due to their own fields at the retarded time. If these forces are equal in magnitude to the cross-self-force calculated here, they should also be opposite in direction since the force is between the same charge at different times, causing repulsion. Thus it would seem that the total force is zero, which actually makes more sense than having the dipole constantly accelerating without any external force. What am I missing?]

In any case, if we set {F_{tot}=F} and solve for {F}, we get

\displaystyle   F \displaystyle  = \displaystyle  \frac{2mc^{2}}{d}\sqrt{\left(\frac{q^{2}}{4\pi\epsilon_{0}mc^{2}d}\right)^{2/3}-1}\ \ \ \ \ (42)
\displaystyle  \displaystyle  = \displaystyle  \frac{2mc^{2}}{d}\sqrt{\left(\frac{\mu_{0}q^{2}}{4\pi md}\right)^{2/3}-1} \ \ \ \ \ (43)
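Equation 42 only gives a real force when {q^{2}/4\pi\epsilon_{0}mc^{2}d>1}, that is, when the separation {d} is small enough. A minimal sketch (function names and the choice of units are mine) that evaluates 42 and confirms that the result satisfies {F_{tot}=F} in 41:

```python
import math


def total_self_force(F, q, m, d, eps0, c):
    """Equation 41: total self-force on the dipole for a given F."""
    scale = q * q / (4 * math.pi * eps0 * m * c * c * d)
    return scale * F / (1.0 + (F * d / (2 * m * c * c)) ** 2) ** 1.5


def self_sustaining_force(q, m, d, eps0, c):
    """Equation 42: the F for which the self-force alone equals F."""
    scale = q * q / (4 * math.pi * eps0 * m * c * c * d)
    if scale <= 1.0:
        raise ValueError("no real solution: need q^2/(4 pi eps0 m c^2 d) > 1")
    return 2 * m * c * c / d * math.sqrt(scale ** (2.0 / 3.0) - 1.0)


# Illustrative units with 4*pi*eps0 = 1 and c = 1 (not SI values).
eps0, c = 1.0 / (4 * math.pi), 1.0
q, m, d = 1.0, 1.0, 0.5

F = self_sustaining_force(q, m, d, eps0, c)
print(F, total_self_force(F, q, m, d, eps0, c))
```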
