
3rd Year Honours Mathematical Physics
Special Relativity
Lorentz Transformations

Brian Dolan

Let two inertial co-ordinate systems (ICR's), S and $S^\prime$, be in standard configuration. This means that $S^\prime$ is moving with constant velocity ${ \underline {\hbox{v}} }$ relative to S in the x-direction as in the following figure



\begin{figure}
\special{psfile=/home/bdolan/usr/texfiles/handouts/SR/html/LT1.eps
hscale=100 vscale=100 voffset=-200}
\end{figure}


and that the spatial origins, O and $O^\prime$, and the three spatial axes, $({ \underline {\hbox{e}} }_1,{ \underline {\hbox{e}} }_2,{ \underline {\hbox{e}} }_3)$ and $({ \underline {\hbox{e}} }^\prime_1,{ \underline {\hbox{e}} }^\prime_2,{ \underline {\hbox{e}} }^\prime_3)$, co-incide at time $t=t^\prime=0$.

We shall convert units of time (e.g. seconds) to a length (e.g. meters) by multiplying time by the speed of light c; thus our measure of time will be ct, which has units of length (this assumes implicitly that c is the same in both reference frames, one of the postulates of relativity). We wish to determine Cartesian co-ordinates in $S^\prime$, $(ct^\prime,x^\prime,y^\prime,z^\prime)$, as functions of Cartesian co-ordinates in S, (ct,x,y,z), using reasonable assumptions. In other words $x^\prime$ will be a function of ct, x, y and z, i.e. $x^\prime(ct,x,y,z)$, etc. It will sometimes be convenient to adopt an index notation where the four co-ordinates (ct,x,y,z) are labelled by an index a taking on four possible values 0,1,2,3 with $x^0=ct$, $x^1=x$, $x^2=y$ and $x^3=z$ so that $(ct,x,y,z)=(x^a)$. Similarly for primed co-ordinates an index notation is sometimes useful, $(ct^\prime,x^\prime,y^\prime,z^\prime)=(x^{{a^\prime}})$, where the prime is placed on the index so that the index itself can be used to distinguish between the two co-ordinate systems. The change from $(ct,x,y,z)=(x^a)$ to $(ct^\prime,x^\prime,y^\prime,z^\prime)=(x^{{a^\prime}})$ is called a co-ordinate transformation.

The derivation of the explicit form of the co-ordinate transformation proceeds in five steps:


Step 1): The transformations are linear


Consider a clock moving with constant velocity, showing a time $\tau$. The path of the clock in S can be described by four functions $x^a(\tau)$. Since the clock is moving with constant velocity in S, equal increments of $\tau$ must correspond to equal increments of the co-ordinates $(x^a)$ labeling the position of the clock in S. Thus $dx^a/d\tau$ is constant and $d^2x^a/d\tau^2=0$. Since $S^\prime$ is moving with constant velocity relative to S, the clock must also be moving with constant velocity in $S^\prime$, hence the same argument implies that $dx^{{a^\prime} }/d\tau$ is constant and $d^2x^{{a^\prime}}/d\tau^2=0$. Now treating $x^{a^\prime}$ as functions of $x^a$, the chain rule for differentiation implies

\begin{displaymath}{dx^{{a^\prime}}\over d\tau}=\sum^3_{b=0}{\partial x^{{a^\prime}}\over\partial x^b}
{dx^b\over d\tau}\eqno(1)\end{displaymath}

\begin{displaymath}{d^2x^{{a^\prime}}\over d\tau^2}=\sum^3_{b=0}{\partial x^{{a^\prime}}\over\partial x^b}
{d^2x^b\over d\tau^2}
+\sum^3_{b,c=0}{\partial^2 x^{{a^\prime}}\over\partial x^b\partial x^c}{dx^b\over d\tau}{dx^c\over d\tau}.\eqno(2)\end{displaymath}

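Thus $d^2x^a/d\tau^2=0$ and $d^2x^{{a^\prime}}/d\tau^2=0$ can only both be true, for every constant velocity $dx^b/d\tau$, if ${\partial^2 x^{{a^\prime}}\over\partial x^b\partial x^c}=0$; in other words the transformations must be linear in $x^a$. In mathematical symbols this means

\begin{displaymath}\begin{array}{rcl}
ct^\prime &=& {L^{0^\prime}}_0\, ct + {L^{0^\prime}}_1\, x + {L^{0^\prime}}_2\, y + {L^{0^\prime}}_3\, z + C^{0^\prime}\\
x^\prime &=& {L^{1^\prime}}_0\, ct + {L^{1^\prime}}_1\, x + {L^{1^\prime}}_2\, y + {L^{1^\prime}}_3\, z + C^{1^\prime}\\
y^\prime &=& {L^{2^\prime}}_0\, ct + {L^{2^\prime}}_1\, x + {L^{2^\prime}}_2\, y + {L^{2^\prime}}_3\, z + C^{2^\prime}\\
z^\prime &=& {L^{3^\prime}}_0\, ct + {L^{3^\prime}}_1\, x + {L^{3^\prime}}_2\, y + {L^{3^\prime}}_3\, z + C^{3^\prime}
\end{array}\eqno(3)\end{displaymath}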

where $C^{a^\prime}$ are constants and ${L^{a^\prime}}_b(v)$ are sixteen functions, independent of $x^a$ but possibly depending on v, the velocity of $S^\prime$ relative to S. If S is in standard configuration relative to $S^\prime$ then the origins co-incide at $t=t^\prime=0$, so $C^{a^\prime}=0$ for all four values of a=0,1,2,3.

These conditions can be summarised in the single formula

\begin{displaymath}x^{{a^\prime}}=\sum^3_{b=0}{L^{a^\prime}}_b(v)x^b,\end{displaymath}

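which can be thought of as a matrix formula with ${L^{a^\prime}}_b(v)$ a $4\times 4$ matrix and $x^{a^\prime}$ and $x^a$ column vectors,

\begin{displaymath}\left(\matrix{ct^\prime\cr x^\prime\cr y^\prime\cr z^\prime}\right)=
\left(\matrix{
{L^{0^\prime}}_0&{L^{0^\prime}}_1&{L^{0^\prime}}_2&{L^{0^\prime}}_3\cr
{L^{1^\prime}}_0&{L^{1^\prime}}_1&{L^{1^\prime}}_2&{L^{1^\prime}}_3\cr
{L^{2^\prime}}_0&{L^{2^\prime}}_1&{L^{2^\prime}}_2&{L^{2^\prime}}_3\cr
{L^{3^\prime}}_0&{L^{3^\prime}}_1&{L^{3^\prime}}_2&{L^{3^\prime}}_3}\right)
\left(\matrix{ ct\cr x\cr y\cr z}\right)=
L(v)\left(\matrix{ ct\cr x\cr y\cr z}\right).\end{displaymath}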

The matrix with components ${L^{a^\prime}}_b$ can be inverted to give $x^a$ in terms of $x^{a^\prime}$,

\begin{displaymath}\left(\matrix{ ct\cr x\cr y\cr z}\right)
=L^{-1}(v)
\left(\matrix{ct^\prime\cr x^\prime\cr y^\prime\cr z^\prime}\right),\end{displaymath}

where $L^{-1}(v)$ is the inverse matrix to L(v), i.e. $L(v){L^{-1}}(v)={\bf 1}$ with ${\bf 1}$ the identity matrix. Since the $(x^a)$ co-ordinate system is moving in the negative $x^\prime$ direction relative to the $(x^{a^\prime})$ system with speed v, it is clear that $(x^{a^\prime})$ bear the same relation to $(x^a)$ as $(x^a)$ do to $(x^{a^\prime})$, except that the sign of v is reversed. Mathematically this means that $L^{-1}(v)=L(-v)$.

We shall now determine the sixteen functions ${L^{a^\prime}}_b(v)$.

Step 2): $ \bf {L^{0^\prime}}_2= {L^{0^\prime}}_3= {L^{1^\prime}}_2= {L^{1^\prime}}_3=0 $


At time $t=t^\prime=0$ the two planes $x=x^\prime=0$ co-incide for all y and z, as in the following figure



\begin{figure}\special{psfile=/home/bdolan/usr/texfiles/handouts/SR/html/LT3.eps
hscale=100 vscale=100 voffset=-150}
\end{figure}
thus, setting $ct=x=0$ and $ct^\prime=x^\prime=0$,

\begin{displaymath}\hbox{the $ct^\prime$ equation of (3)}\quad\Rightarrow\quad 0={L^{0^\prime}}_2 \;y +{L^{0^\prime}}_3 \;z\qquad\forall \;y,z\eqno(4)\end{displaymath}

\begin{displaymath}\hbox{the $x^\prime$ equation of (3)}\quad\Rightarrow\quad 0={L^{1^\prime}}_2 \;y +{L^{1^\prime}}_3 \;z\qquad\forall \;y,z\eqno(5)\end{displaymath}

\begin{displaymath}\hbox{therefore}\qquad {L^{0^\prime}}_2= {L^{0^\prime}}_3=
{L^{1^\prime}}_2= {L^{1^\prime}}_3=0.\eqno(6)\end{displaymath}

Step 3): $ \bf z^\prime = z$ and $\bf y^\prime = y$


At time $t=t^\prime=0$ the two planes $z^\prime=z=0$ co-incide. Since the relative motion is in the x-direction and there is no rotation (by assumption), the planes $z^\prime=z=0$ co-incide $\forall t$ as in the following figure


\begin{figure}\special{psfile=/home/bdolan/usr/texfiles/handouts/SR/html/LT2.eps
voffset=-200 hscale=100 vscale=100}
\end{figure}


thus, setting $z=z^\prime=0$,

\begin{displaymath}\hbox{the $z^\prime$ equation of (3)}\quad\Rightarrow\quad 0={L^{3^\prime}}_0\; ct + {L^{3^\prime}}_1 \;x+
{L^{3^\prime}}_2 \;y \qquad\forall \;t,x,y\eqno(7)\end{displaymath}

\begin{displaymath}\Leftrightarrow\quad {L^{3^\prime}}_0= {L^{3^\prime}}_1=
{L^{3^\prime}}_2=0\eqno(8)\end{displaymath}

\begin{displaymath}\hbox{therefore}\quad z^\prime = {L^{3^\prime}}_3(v)z.\eqno(9)\end{displaymath}

We can apply the same argument with S and $S^\prime$ interchanged, which requires that we replace L(v) by $L^{-1}(v)=L(-v)$, to deduce that $z = {L^{3^\prime}}_{3}(-v)z^\prime$. Hence ${L^{3^\prime}}_3(v){L^{3^\prime}}_{3}(-v)=1$.

Now if we reflect $x\rightarrow -x$, without changing the other co-ordinates in S, it should be clear that z and $z^\prime$ do not change (since we have just proven that $z^\prime$ is independent of x). But changing the sign of x changes the sign of v, since the relative motion is in the x-direction. Hence ${L^{3^\prime}}_{3}(-v)={L^{3^\prime}}_{3}(v)$, thus ${L^{3^\prime}}_3(v)^2=1$, so ${L^{3^\prime}}_3(v)=\pm 1$. The sign can be determined by the trivial observation that v=0 should give the identity transformation, thus ${L^{3^\prime}}_3(v)=1$.

A similar argument applied to the two planes $y=y^\prime=0$ allows us to conclude that

\begin{displaymath}{L^{2^\prime}}_0= {L^{2^\prime}}_1=
{L^{2^\prime}}_3=0\end{displaymath}

and ${L^{2^\prime}}_2(v)=1$.


In summary, we have now that the transformation matrix must be of the form

\begin{displaymath}{L^{a^\prime}}_b(v)=\left(\matrix{
{L^{0^\prime}}_0(v)&{L^{0^\prime}}_1(v)&0&0\cr
{L^{1^\prime}}_0(v)&{L^{1^\prime}}_1(v)&0&0\cr
0&0&1&0\cr
0&0&0&1}\right).\end{displaymath}

Step 4): The functional form of $\bf {L^{0^\prime}}_0(v)$, $\bf {L^{0^\prime}}_1(v)$, $\bf {L^{1^\prime}}_0(v)$ and $\bf {L^{1^\prime}}_1(v)$


Up until now we have only really used the postulates of relativity to streamline the notation. Now they will be used to full effect. First suppose a flash of light is emitted from the origin O=(0,0,0) of S at t=0 (and so also from the origin $O^\prime=(0,0,0)$ of $S^\prime$ at $t^\prime=0$). The flash expands with the speed of light c, which is the same in both reference frames, as a spherical shell whose radius at time t is given by $x^2+y^2+z^2=c^2t^2$ in S and by $x^{\prime\; 2}+y^{\prime\; 2}+z^{\prime\; 2}=c^2t^{\prime\; 2}$ in $S^\prime$. Now we already know that $y^\prime =y$ and $z^\prime = z$ so

\begin{displaymath}(ct^\prime)^2-(x^\prime)^2 =(ct)^2-x^2=y^2+z^2.\end{displaymath}

Also

\begin{displaymath}x^\prime = {L^{1^\prime}}_0(v)\;ct+{L^{1^\prime}}_1(v)\;x \qquad\hbox{and}\qquad
ct^\prime= {L^{0^\prime}}_0(v)\; ct+{L^{0^\prime}}_1(v)\;x\end{displaymath}

so

\begin{displaymath}\bigl({L^{0^\prime}}_0\;ct+{L^{0^\prime}}_1\;x\bigr)^2-
\bigl({L^{1^\prime}}_0\;ct+{L^{1^\prime}}_1\;x\bigr)^2=
(ct)^2-x^2.
\end{displaymath}
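
As an intermediate step (a sketch of the algebra, added here for convenience), expanding the two squares and collecting the terms in $(ct)^2$, $ct\,x$ and $x^2$ puts the same relation in the form

\begin{displaymath}\Bigl[\bigl({L^{0^\prime}}_0\bigr)^2-\bigl({L^{1^\prime}}_0\bigr)^2\Bigr](ct)^2
+2\Bigl[{L^{0^\prime}}_0{L^{0^\prime}}_1-{L^{1^\prime}}_0{L^{1^\prime}}_1\Bigr]ct\,x
-\Bigl[\bigl({L^{1^\prime}}_1\bigr)^2-\bigl({L^{0^\prime}}_1\bigr)^2\Bigr]x^2
=(ct)^2-x^2.\end{displaymath}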

Demanding that this hold true for all t and any x with |x|<ct gives three conditions

\begin{displaymath}\bigl({L^{1^\prime}}_1\bigr)^2- \bigl({L^{0^\prime}}_1\bigr)^2=1\eqno(10)\end{displaymath}

\begin{displaymath}\bigl({L^{0^\prime}}_0\bigr)^2-\bigl({L^{1^\prime}}_0\bigr)^2=1\eqno(11)\end{displaymath}

\begin{displaymath}{L^{1^\prime}}_0 {L^{1^\prime}}_1 - {L^{0^\prime}}_0 {L^{0^\prime}}_1=0\eqno(12)\end{displaymath}

on four unknowns. We can express these as four functions of a single parameter by using the identity $\cosh^2\alpha - \sinh^2\alpha=1$ for any real $\alpha$ to write

\begin{displaymath}{L^{1^\prime}}_1=\cosh\alpha \qquad {L^{0^\prime}}_0=\cosh\alpha\eqno(13)\end{displaymath}

\begin{displaymath}{L^{0^\prime}}_1=-\sinh\alpha \qquad {L^{1^\prime}}_0=-\sinh\alpha,\eqno(14)\end{displaymath}

where $\alpha(v)$ is a function of v which is yet to be determined (the minus sign is for later convenience). We have now arrived at the following form for the transformation matrix:

\begin{displaymath}{L^{a^\prime}}_b(v)=\left(\matrix{
\cosh\alpha(v)&-\sinh\alpha(v)&0&0\cr
-\sinh\alpha(v)&\cosh\alpha(v)&0&0\cr
0&0&1&0\cr
0&0&0&1}\right).\eqno(5)\end{displaymath}
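
As a quick check (a sketch, not part of the original derivation), substituting the hyperbolic parametrisation of the four entries into conditions (10), (11) and (12) gives

\begin{displaymath}\cosh^2\alpha-(-\sinh\alpha)^2=1,\qquad
\cosh^2\alpha-(-\sinh\alpha)^2=1,\qquad
(-\sinh\alpha)\cosh\alpha-\cosh\alpha\,(-\sinh\alpha)=0,\end{displaymath}

so all three are satisfied for any real $\alpha$. The three conditions alone do not fix every sign: reversing the sign of both ${L^{1^\prime}}_0$ and ${L^{1^\prime}}_1$, for example, also satisfies them, but that choice would give $x^\prime=-x$ at $\alpha=0$ instead of the identity transformation.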

Step 5): The functional form of $\bf \alpha (v)$


The spatial origin $O^\prime$ of $S^\prime$ is determined by $x^\prime=y^\prime=z^\prime=0$. In S the point $x^\prime=y^\prime=z^\prime=0$ moves with speed v in the x-direction, i.e. it has x co-ordinate x=vt. Thus

\begin{displaymath}x^\prime=-ct\sinh\alpha + vt\cosh\alpha =0 \qquad\Rightarrow\qquad
\tanh\alpha = {v\over c}\end{displaymath}

so $\alpha$ can be written as an inverse hyperbolic function

\begin{displaymath}\alpha(v)=\tanh^{-1}(v/c).\end{displaymath}

Note that the properties of the $\tanh$ function now imply that $-c<v<c$ (see the figure below). $\alpha$ is called the rapidity of the transformation.



\begin{figure}\special{psfile=/home/bdolan/usr/texfiles/handouts/SR/html/tanh.ps
hscale=60 vscale=60 angle=-90}
\end{figure}


A plot of $v/c=\tanh\alpha$ as a function of the rapidity $\alpha$.
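
As an illustration (the value $v=0.6\,c$ is chosen here purely as an example), the standard identity $\tanh^{-1}u={1\over 2}\ln{1+u\over 1-u}$ gives

\begin{displaymath}\alpha=\tanh^{-1}(0.6)={1\over 2}\ln{1.6\over 0.4}=\ln 2\approx 0.693,\end{displaymath}

and conversely, with $e^\alpha=2$,

\begin{displaymath}\cosh\alpha={2+{1\over 2}\over 2}={5\over 4},\qquad
\sinh\alpha={2-{1\over 2}\over 2}={3\over 4},\qquad
\tanh\alpha={3\over 5}={v\over c}.\end{displaymath}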


Using $\cosh^2\alpha - \sinh^2\alpha=1$ we have

\begin{displaymath}\cosh\alpha={1\over\sqrt{1-\tanh^2\alpha}}={1\over\sqrt{1-(v/c)^2}},
\qquad\sinh\alpha={1\over\sqrt{1-(v/c)^2}}\left({v\over c}\right).\end{displaymath}

It is conventional to define

\begin{displaymath}\gamma(v)={1\over\sqrt{1-(v/c)^2}}\end{displaymath}

and then the transformation matrix can be written as

\begin{displaymath}{L^{a^\prime}}_b(v)=\left(\matrix{
\gamma(v)&-\gamma(v)v/c&0&0\cr
-\gamma(v)v/c&\gamma(v)&0&0\cr
0&0&1&0\cr
0&0&0&1}\right).\end{displaymath}

Thus we have finally arrived at the following form for the transformation

\begin{displaymath}t^\prime=\gamma(v)(t-xv/c^2)\eqno(15)\end{displaymath}

\begin{displaymath}x^\prime=\gamma(v)(x-vt)\eqno(16)\end{displaymath}

\begin{displaymath}y^\prime=y\eqno(17)\end{displaymath}

\begin{displaymath}z^\prime=z.\eqno(18)\end{displaymath}

These are called Lorentz Transformations or sometimes Lorentz Boosts, to distinguish them from rotations - the name ``boost'' is unfortunate as there is no acceleration involved.
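
A short numerical illustration may help (the speed and the event are chosen here as an example and are not taken from the original notes). For $v=0.6\,c$ we have $\gamma=1/\sqrt{1-0.36}=1.25$. Take the event with $ct=5$ m, $x=3$ m, $y=z=0$ in S. Multiplying (15) by c and using (16),

\begin{displaymath}ct^\prime=\gamma\Bigl(ct-{v\over c}\,x\Bigr)=1.25\,(5-1.8)\ \hbox{m}=4\ \hbox{m},\qquad
x^\prime=\gamma\Bigl(x-{v\over c}\,ct\Bigr)=1.25\,(3-3)\ \hbox{m}=0.\end{displaymath}

The event lies on the path $x=vt$ of the origin $O^\prime$, which is why $x^\prime=0$. The combination $(ct)^2-x^2$ is unchanged, as the derivation requires:

\begin{displaymath}(ct^\prime)^2-(x^\prime)^2=16\ \hbox{m}^2=25\ \hbox{m}^2-9\ \hbox{m}^2=(ct)^2-x^2.\end{displaymath}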

In matrix notation the Lorentz Transformations can be represented as

\begin{displaymath}\left(\matrix{ct^\prime\cr x^\prime\cr y^\prime \cr z^\prime}\right)=
\left(\matrix{
\gamma(v)&-\gamma(v)v/c&0&0\cr
-\gamma(v)v/c&\gamma(v)&0&0\cr
0&0&1&0\cr
0&0&0&1}\right)
\left(\matrix{ct\cr x\cr y \cr z}
\right).\end{displaymath}
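
As a consistency check on the earlier statement that the inverse transformation is obtained by reversing the sign of v (a sketch, restricted to the non-trivial $2\times 2$ block acting on ct and x), multiplying the matrices for v and -v and using $\gamma(-v)=\gamma(v)$ gives

\begin{displaymath}\left(\matrix{\gamma&-\gamma v/c\cr -\gamma v/c&\gamma}\right)
\left(\matrix{\gamma&\gamma v/c\cr \gamma v/c&\gamma}\right)
=\gamma^2\Bigl(1-{v^2\over c^2}\Bigr)\left(\matrix{1&0\cr 0&1}\right)
=\left(\matrix{1&0\cr 0&1}\right),\end{displaymath}

since $\gamma^2=1/(1-v^2/c^2)$. The off-diagonal entries cancel identically.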

The rapidity has the useful property that it is additive under successive transformations in the same direction with $v_1=c\tanh\alpha_1$ and $v_2=c\tanh\alpha_2$. This is most easily established using matrix multiplication to show that

\begin{displaymath}L(\alpha_1)L(\alpha_2)=L(\alpha_1+\alpha_2)\end{displaymath}

(use the hyperbolic trigonometric identities

\begin{displaymath}\cosh\alpha_1\cosh\alpha_2+\sinh\alpha_1\sinh\alpha_2=\cosh(\alpha_1 + \alpha_2)\eqno(19)\end{displaymath}

\begin{displaymath}\cosh\alpha_1\sinh\alpha_2+\sinh\alpha_1\cosh\alpha_2=\sinh(\alpha_1 + \alpha_2)\eqno(20)\end{displaymath}

). Thus these two transformations are equivalent to a single transformation with rapidity $\alpha_3 = \alpha_1 + \alpha_2$.
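
One consequence can be sketched directly (the symbol $v_3=c\tanh\alpha_3$ for the combined speed is introduced here for illustration). Dividing (20) by (19), and then dividing numerator and denominator by $\cosh\alpha_1\cosh\alpha_2$,

\begin{displaymath}\tanh(\alpha_1+\alpha_2)={\tanh\alpha_1+\tanh\alpha_2\over 1+\tanh\alpha_1\tanh\alpha_2}
\qquad\Rightarrow\qquad
v_3={v_1+v_2\over 1+v_1v_2/c^2}.\end{displaymath}

For example, $v_1=v_2=0.5\,c$ gives $v_3=c/1.25=0.8\,c$, so the rapidities add while the speeds do not, and the combined speed stays below c.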

 
Brian Dolan
1998-11-27