Computed tomography (the CT or CAT scan) has become an absolutely fundamental tool in the modern diagnosis of disease. Unlike the X-Ray scan, which presents a two dimensional view of the general tissue density in a scanned region, a CT scan allows doctors to see a detailed 3D scan of a patient’s body, essentially peering into the body without exploratory surgery. Take a look at the image below to see the dramatic difference between a normal 2D X-ray (left) and a CT scan (right).

Image courtesy of EmergencyMD

The Principles Behind Computed Tomography

CT scans work by taking multiple X-rays and using the “overlapping” information to combine them into a composite 3D image. From a single X-Ray scan, you really only get the equivalent of a “shadow” of the insides.

Notice how the array of detectors in the image to the left only sees a one-dimensional line of intensity measurements from the object being scanned. This is essentially what is seen in a regular X-ray, like the scans returned when a portable chest X-ray is ordered. As more X-ray scans are taken from different angles, the internal structure of the object being scanned becomes more apparent.

Notice how the “shadows” of each scan start to overlap and form a darker area in the center, displaying details of the inside of the scanned object, which would otherwise be hidden away by the surface. The principles of computed tomography are relatively simple to grasp, but implementing the computation and reconstruction of the collected data is a much more math-y and challenging task.

The Mathematics

Defining the problem

The first mathematical tool we need to discuss is the “Radon transform.” On Wikipedia, the Radon transform is defined as “the integral transform which takes a function f defined on the plane to a function Rf defined on the (two-dimensional) space of lines in the plane, whose value at a particular line is equal to the line integral of the function over that line.” If you aren’t well versed in multivariable calculus, this definition is a little unapproachable, but the concept is really quite simple.

The Radon transform of an object is basically the raw* data that the CT scanner collects during the scan, which is later transformed into a reconstruction of the slice of the patient later on.

Source. Imagine that the linear array of detectors is on the left side of the object in the GIF. The right square shows a plot of the intensity values of the detectors as a function of the angle of rotation of the object.

In fancier, more mathematical terminology, the Radon transform is defined as:

Where the bottom line is the parameterization of the line integral in terms of a line at angle α. Here’s an example of the Radon transform (right image) of a slice of the patient, and the reconstructed slice (left image).

Source: Buzug, Einführung in die Computertomography, Springer Verlag , 2004

CT-scanners basically “collect” the Radon transforms of slices of patients’ bodies and then reconstruct those transforms back into the slices that generated them.

Thus, the main mathematical challenge presents itself: how do we calculate the inverse of the Radon transform?

Approaches

There exist both iterative and “exact” solutions to the Radon transform which we will go through, although only a handful are really used in practice.

First, let’s take the most basic approach. Imagine a slice of your patient as a two dimensional grid, where each pixel of the image can be represented by cartesian coordinates (x, y). The intensity of an X-ray passing through the slice of your patient is basically** the sum of the intensities of all the pixels that the ray passes through. So really, the radon transform represents a huge*** system of simultaneous linear equations. Luckily, these equations are extremely sparse, since simple geometry tells us that a ray will pass through about √n pixels on average, where n is the total number of pixels in the image.

In a perfect world, there exist exact solutions to these massive systems of equations which will yield exactly the image of the slice of the patient. In practice, iterative methods and approximations are used to solve such huge systems, since it’s computationally impractical to directly solve them. This method of solution is know as the “Simultaneous Algebraic Reconstruction Technique.”

Other methods take a sort of enhanced “guess and check” process, where an initial image of the slice is guessed randomly and transformed into what the CT would see if that guess was correct. That transformation is then compared to what the CT actually saw. The guess is then modified slightly, and the process is repeated until the predicted transformation matches the observed data within a certain error bound.

Modern methods rely on the Fourier slice theorem, which involves mathematics a little beyond the expected scope of the target reader of this blog. The Fourier slice theorem basically says that taking a special transformation (called the Fourier transformation) of an image and then taking a slice of that transformation is equivalent to first taking a slice of the image, then taking the Fourier transform of that slice.

Additional Information

For those more math inclined, I highly recommend you take a look at a more in-depth presentation on the mathematics and physics behind the CT scan, which this piece is heavily based off.

*A ton of preprocessing is done to the raw data collected, but unfortunately it’s far beyond my knowledge of signal processing

**(absorption of electromagnetic radiation is a little complicated, see Beer’s Law)

Source. This equation shows the actual relation of a detected X-ray’s intensity (δ) based on the stuff the ray travels through (left side of the equation) and a bunch material/setup constants (right side).

***equivalent to the number of pixels in the image, e.g. 2,073,600 equations for a 1080p image