The main objective of this project is to develop an odometry system that is easily installable and non-invasive to the vehicle. The system should provide good estimates of the velocity, distance traveled and orientation of the car. This chapter contains a study of possible solutions, including the final proposed solution.
As stated in the State of the Art chapter, there are three kinds of odometry solutions to this problem: classical odometry using wheel sensors, visual odometry, and inertial sensors.
Visual odometry and inertial sensors have great advantages over classical odometry, mostly because these sensors are small, compact and can be mounted in a non-invasive way. When using wheel sensors, an apparatus must be developed to support them, and this apparatus needs to be easily installable.
Using only inertial sensors is not a good option: to obtain the displacement, the acceleration from the accelerometers must be integrated twice, and to obtain the orientation, the angular velocity from the gyroscopes must be integrated once. Integrating acceleration accumulates errors due to noise, which makes inertial sensors the least suitable for this objective.
Current GPS equipment has become affordable and accurate enough for speed measurement. However, it depends on satellite visibility, which makes it ineffective in environments with obstacles and in indoor locations.
In the following sections, two possible solutions are presented: a mechanical solution using encoders to measure wheel velocity and orientation, and a visual odometry solution using a line scan camera to compute velocity plus a gyroscope to compute orientation.
\section{Mechanical Solution}
For a solution using encoders, there are several possible sensor placements and combinations. The following are the options considered:
\begin{itemize}
\item One encoder in each of the rear wheels
\begin{itemize}
\item With this configuration it is possible to calculate the velocity and orientation from the difference in velocity between the two wheels. This is a simple configuration, with low setup complexity and low cost. Although the vehicle orientation angle can be calculated, this system does not provide the front wheel orientation, which is a very important piece of information;
\end{itemize}
\item Encoder on one of the wheels and angle monitoring of the steering wheel
\begin{itemize}
\item This solution implies the existence of a device inside the vehicle cabin; it is very hard to install a device on the steering wheel without it becoming an obstacle to the driver;
\end{itemize}
\item Encoder and angle monitoring in a single front wheel
\begin{itemize}
\item This is a practical solution because a single device monitors all the desired quantities, making it good in terms of compactness and ease of setup. The major downside is the complexity of the apparatus needed to support the sensors while accommodating all the front wheel's degrees of freedom;
\end{itemize}
\item Encoder in one of the rear wheels and angle monitoring of one of the front wheels
\begin{itemize}
\item With this configuration both the wheel velocity and the steering angle can be monitored. The downside is the need for two separate devices, one for the front and one for the rear wheel. However, this is a low-complexity system to produce and set up.
\end{itemize}
\end{itemize}
\begin{figure}[htbp]
\centering
\includegraphics[width=12cm]{./Imagens/esquema.png}
\caption{Illustration of a possible mechanical solution and its main components.}
\label{f:mech_solution}
\end{figure}
A mechanical solution is proposed, as in Figure \ref{f:mech_solution}. The solution consists of an apparatus mounted on one of the front wheels, serving as a support for an encoder for wheel velocity measurement and another encoder or potentiometer for angle measurement. As shown in Figure \ref{f:montagem}, the equipment is fixed to the car's body with suction cups, and several adjustments are provided to accommodate the geometric variability between vehicles.
\begin{figure}[htbp]
\centering
\includegraphics[width=10cm]{./Imagens/montagem2.png}
\caption{Illustration of the device assembly.}
\label{f:montagem}
\end{figure}
This solution provides all the necessary information from only one wheel. A high-resolution encoder can be used to provide a high sampling rate for the velocity. For the wheel angle, a multi-turn potentiometer or an absolute encoder can be used to provide an absolute measurement of the steering angle. The system can be removed and installed on a variety of vehicles thanks to the adjustments it provides. However, the rigidity of the system and its constraints can make it fragile in more dynamic and aggressive driving situations. Also, the system extends beyond the bounds of the car's body, which may limit its use due to being too exposed to the environment. Given these limitations, the optical solution will be further detailed and studied.
\section{Optical Solution}
Horn et al. \cite{horn} used two cameras on a robot: one looking forward, to estimate yaw rate and forward velocity, and the other facing down, to estimate two-dimensional velocity. They found that the downward-facing camera gave the best results for longitudinal and lateral velocity. The tests were made at low speed, below 1 m/s.
Nourani-Vatani et al. \cite{Nourani} used a similar method with a ground-facing matrix camera mounted on a passenger car. The experiments were performed at around 1 m/s. The velocity measurements had errors comparable to the wheel speed sensor, and the cumulative odometry errors were around 5\% after 100 m traveled.
It is clear that common matrix cameras are not suitable for odometry on vehicles moving at car-like speeds. Even with high-speed matrix cameras, the amount of data would be extremely difficult to process in real time.
In his PhD thesis, K\'{a}lm\'{a}n \cite{Kalman} suggests the use of line scan cameras to estimate a robot's longitudinal velocity. His method computes the correlation between successive lines and, from that information, the pixel displacement between them. Line scan cameras provide high resolution and a high frame rate at a lower cost than matrix cameras, which makes them suitable for higher speeds. K\'{a}lm\'{a}n created a simulator in which he tested the ability of line sensors to estimate a robot's velocity; he found that a sensor such as the S3901 from Hamamatsu, which works at 15600 fps, can measure velocities up to 120 m/s (432 km/h).
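The way the line rate bounds the measurable speed can be sketched numerically. In the sketch below, the ground footprint of one line and the minimum required overlap are illustrative assumptions, not specifications of the S3901:

```python
# Back-of-envelope sketch: the top speed a line-scan correlation sensor
# can measure is bounded by the line rate times the largest ground
# displacement per frame that still leaves enough overlap between
# consecutive lines for correlation.

def max_velocity(line_rate_hz, footprint_m, min_overlap=0.6):
    """Maximum measurable speed in m/s (hypothetical parameters)."""
    max_shift_m = footprint_m * (1.0 - min_overlap)  # largest usable shift
    return line_rate_hz * max_shift_m

# 15600 lines/s with an assumed 20 mm ground footprint and 60% overlap
print(round(max_velocity(15600, 0.020), 1))  # about 125 m/s
```

With these assumed values the bound lands close to the 120 m/s figure reported above, which suggests the orders of magnitude are consistent.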
Intuitively, one might think that image overlap is impossible when only a single line is captured from the ground. Wide-pixel sensors or the integrating effect of the lens can be used to capture more information from the ground. Figure \ref{f:integrating_effect}, from \cite{Kalman}, shows the integrating effect of the lens on a line sensor: on the left, an area is projected onto the pixels of a matrix camera; on the right, the same area is projected onto the single line of a line scan camera, providing an integrating effect.
\begin{figure}[htbp]
\centering
\includegraphics[width=10cm]{./Imagens/integrating_effect.jpg}
\caption[Illustration of the imaging of a matrix camera and a line-scan camera with
cylindrical lens]{Illustration of the imaging of a matrix camera and a line-scan camera with
cylindrical lens \cite{Kalman}.}
\label{f:integrating_effect}
\end{figure}
A line scan camera can only detect movements parallel to its axis, so it provides only one-dimensional information. The sensor therefore needs to ensure good measurements even when movements perpendicular to its axis occur. This happens when the vehicle performs a circular or sideways movement, which can induce measurement errors, as each line may show a pattern change with respect to the previous one. This is illustrated in Figure \ref{f:side_motion}, from \cite{Kalman}. On the left, a parallel movement is shown: the successive snapshots overlap and share a common area over which correlation can be computed. On the right, a sideways movement reduces the overlap, which weakens the correlation and induces error. This error can be reduced by the integrating effect of the lens, a wider-pixel sensor, or a larger field of view. The sampling rate can also be high enough that successive snapshots have only a small displacement and thus capture the same texture elements.
\begin{figure}[htbp]
\centering
\includegraphics[width=10cm]{./Imagens/side_motion.jpg}
\caption[ Illustration of the problem of sideways motion]{ Illustration of the problem of sideways motion \cite{Kalman}.}
\label{f:side_motion}
\end{figure}
\section{Solution Description}
The velocity of vehicles and mobile robots can be measured in many ways. The best solution depends mostly on the application. The type of sensor to be used may also depend on the platform kinematics.
Optical sensors provide the most information and, with current processing capabilities, are widely used today. Their main advantage is that they provide motion estimation information that is independent of the locomotion system. A measuring system using optical sensors can also be compact and self-contained, making it easier to install; this is important, as one of the objectives of this work is to develop a removable system.
For the reasons stated, the optical solution is the best suited to the objectives of this thesis. The mechanical solution has constraints which may limit its use on some vehicles, has a complicated setup, and causes a visual impact. Given the potential of optical sensors, this work will focus on the study of an optical solution.
\begin{figure}[htbp]
\centering
\includegraphics[width=10cm]{./Imagens/esquema_sensor.eps}
\caption{Optical solution proposed.}
\label{f:esquema_sensor}
\end{figure}
The proposed optical solution is based on a line scan sensor. These sensors are one-dimensional and thus only provide longitudinal velocity. Another sensor is needed to estimate the orientation; for that purpose, the use of gyroscopes is proposed. Figure \ref{f:esquema_sensor} shows a schematic of the proposed solution. The main components are as follows:
\textbf{Line scan sensor:} the main component is the line scan sensor; as said before, it should have a high line rate and, if possible, wide pixels to provide an integrating effect and thus capture more information from the ground. Several line sensors are available: the LIS-770i from Panavision Imaging \cite{panavision} and the S3901 from Hamamatsu \cite{hamamatsu} both provide wide pixels. The RPLIS2K-EX from Dynamax Imaging \cite{dynamax} is a high-resolution sensor ($2048\times1$); they also provide a $2048\times4$ resolution sensor, the DLIS2K, which can be an interesting solution to the problem of sideways movements.
\textbf{Gyroscopes:} gyroscopes are used for the orientation measurements. For more reliable readings, the use of two gyroscopes is proposed. As stated by Rafeiro \cite{rafeiro}, using the redundancy of two gyroscopes decreases the errors caused by gyroscope drift.
\textbf{Optics:} this unit should provide the adequate field of view as well as the integrating effect. The distance to the ground changes constantly due to suspension motion and unevenness of the ground, resulting in a variable field of view and miscalibration, which can induce measurement errors. To solve this problem, a telecentric lens can be used; these lenses have a magnification that is invariant to distance \cite{telecentric}. The magnification of a conventional lens can also be made invariant to defocus by simply adding an aperture at an analytically derived location \cite{telecentric2}.
\textbf{Illumination:} the proposed illumination is a high-power LED array. Because of the short exposure time, high-speed camera sensors need a large amount of light to produce a good-quality image. Infrared illumination is the best option, especially if the sensor is to be used on a car, where visible light might be a distraction.
\textbf{Processing Unit:} the processing unit needs to acquire the data from the high-speed sensor and process it. Microprocessors might not be suitable for this, as line scan sensors usually have a pixel clock in the MHz range, and common microprocessors cannot perform acquisition at that rate. Thus, a unit based on an FPGA or CPLD is proposed; these devices are commonly used for high-speed data processing and support parallel computation, which speeds up the process.
\section{Principles of Linear Image Sensors}
\label{section:image_sensors}
An image sensor is a device that converts an optical image into an electric signal. Image sensors are used mostly in digital cameras and other imaging devices. Early sensors were video camera tubes; modern sensors are semiconductor charge-coupled devices (CCD) or active pixel sensors in complementary metal-oxide-semiconductor (CMOS) technology, Figure \ref{f:image_sensors}. Both share the same basic principle of converting light to an electric signal \cite{dalsa_sensor}.
\begin{figure}[htbp]
\centering
\subfloat{\includegraphics[width=4cm]{./Imagens/image_sensors.jpg}}
\quad
\subfloat{\includegraphics[width=4cm]{./Imagens/image_cmos.jpg}}
\caption[Image sensors from Teledyne DALSA - CCD (left) and CMOS (right)]{Image sensors from Teledyne DALSA - CCD (left) and CMOS (right) \cite{dalsa_sensor}.}
\label{f:image_sensors}
\end{figure}
A CCD is a device in which every pixel's charge is converted to voltage through a very limited number of outputs, often just one. After the conversion, the output is sent as an analog signal. CCDs have the advantage of high uniformity, which contributes to image quality. In a CMOS sensor, each pixel has its own charge-to-voltage conversion hardware and other circuitry, so the chip output is completely digital. This has the disadvantages of increased complexity and reduced area for light capture. The uniformity of the pixels is lower, because each pixel has its own voltage conversion, but the benefit is high speed \cite{dalsa_sensor}.
\begin{figure}[htbp]
\centering
\subfloat{\includegraphics[width=4cm]{./Imagens/ccd_linear.jpg}}
\quad
\subfloat{\includegraphics[width=4cm]{./Imagens/cmos_linear.jpg}}
\caption[Linear image sensors from Teledyne DALSA - CCD (left) and CMOS (right)]{Linear image sensors from Teledyne DALSA - CCD (left) and CMOS (right) \cite{dalsa_sensor}.}
\label{f:linear_sensors}
\end{figure}
The solution proposed in this work is based on line scan image sensors. These sensors are also available as CCD or CMOS, like the ones shown in Figure \ref{f:linear_sensors}. Most machine-vision applications of linear sensors require high speed and high frame rates; for this reason, CCDs are a technology in decline. Figure \ref{f:linear_scheme} shows how a linear image sensor works. A CCD usually has one charge-to-voltage converter that operates for all the pixels, as opposed to the individual converters of a CMOS. In a CMOS, the conversions are made in parallel, so image acquisition is extremely fast.
Line scan cameras usually use multiple-output methods to increase image sampling rates. Typical line scan cameras use the following output methods: single tap, dual tap, triple tap, quad tap, and octal tap \cite{taps}.
\begin{figure}[htbp]
\centering
\includegraphics{./Imagens/cmos_line.jpg}
\caption[Illustration of the working principles of a CCD vs CMOS.]{Illustration of the working principles of a CCD vs CMOS \cite{dalsa_sensor}.}
\label{f:linear_scheme}
\end{figure}
\FloatBarrier
The most common output methods in line scan cameras are the single tap, dual tap, and quad tap \cite{taps}. In the single tap method, the photodiodes of a linear image sensor convert the captured light into an electrical signal and transmit it through a single output; this method is illustrated in Figure \ref{f:single}.
\begin{figure}[htbp]
\centering
\includegraphics{./Imagens/single.png}
\caption[Illustration of the single taps data output method.]{Illustration of the single taps data output method \cite{taps}.}
\label{f:single}
\end{figure}
In the dual tap method, the photodiodes of a linear image sensor convert the captured light into an electrical signal and transmit it through two outputs. Dual taps can be of two different types. The first is \textit{Even/Odd Output}: this method separates the captured light into even and odd pixel components to be converted to electrical signals, as illustrated in Figure \ref{f:dual_even_odd}. The other dual tap method is the \textit{Front and Rear Output}; its main difference from the previous one is that it separates the linear image sensor into front and rear components, as illustrated in Figure \ref{f:dual_front_rear}.
\begin{figure}[htbp]
\centering
\includegraphics{./Imagens/dual_even_odd.png}
\caption[Illustration of the dual taps - even/odd data output method.]{Illustration of the dual taps - even/odd data output method \cite{taps}.}
\label{f:dual_even_odd}
\end{figure}
\begin{figure}[htbp]
\centering
\includegraphics{./Imagens/dual_front_rear.png}
\caption[Illustration of the dual taps - front/rear data output method.]{Illustration of the dual taps - front/rear data output method \cite{taps}.}
\label{f:dual_front_rear}
\end{figure}
The quad tap method is a combination of the two dual tap methods: the even and odd components are each divided into front and rear to generate four outputs. This process is illustrated in Figure \ref{f:quad}.
\begin{figure}[htbp]
\centering
\includegraphics{./Imagens/quad.png}
\caption[Illustration of the quad taps data output method.]{Illustration of the quad taps data output method \cite{taps}.}
\label{f:quad}
\end{figure}
\FloatBarrier
\section{Ackermann Steering Geometry}
The purpose of the Ackermann steering geometry is to avoid the need for tires to slip sideways when following a curved path. This geometry ensures that the axles of all wheels are arranged so that, when describing a curve, they point to the same center point. As the rear wheels are fixed together, this center point must lie on the extension of the rear axle. For the axles of the front wheels to intersect this center point, the inside wheel must be turned at a greater angle than the outside one \cite{acker}.
\begin{figure}[htbp]
\centering
\includegraphics[width=10cm]{./Imagens/esquema.JPG}
\caption[In an Ackerman-steered vehicle, the extended axes for all wheels intersect in a common point.]{In an Ackerman-steered vehicle, the extended axes for all wheels intersect in a common point \cite{whereami}.}
\label{f:esquema_acker}
\end{figure}
The model of the Ackermann steering geometry is illustrated in Figure \ref{f:esquema_acker}. Such a geometry satisfies the Ackermann equations \cite{vehicle_dyna}:
\begin{equation}
\label{eq:acker}
\cot (\theta_{o}) - \cot (\theta_{i}) = \frac{d}{l}
\end{equation}
where:
$\theta_{i}$ is the relative steering angle of the inner wheel; $\theta_{o}$ is the relative steering angle of the outer wheel; $l$ is the longitudinal wheel separation and $d$ is the lateral wheel separation.
The vehicle steering angle $\theta_{SA}$ can be simplified as the angle of an imaginary wheel located at the center, $P_{2}$, shown in Figure \ref{f:esquema_acker}. $\theta_{SA}$ can be expressed in terms of either the inside or outside steering angle ($\theta_{i}$ or $\theta_{o}$) as follows \cite{vehicle_dyna}:
\begin{equation}
\label{eq:acker_inner}
\cot (\theta_{SA}) = \frac{d}{2l} + \cot (\theta_{i})
\end{equation}
or,
\begin{equation}
\label{eq:acker_outer}
\cot (\theta_{SA}) = \cot (\theta_{o}) - \frac{d}{2l}
\end{equation}
Modern cars use an Ackermann-derived steering geometry rather than a \textit{pure} Ackermann one. This is due to dynamic concerns, for which the traditional geometry has disadvantages, but for low-speed maneuvers the principle is the same \cite{vehicle_dyna}. Ackermann-like steering provides good traction and ground clearance for all-terrain operation, and is thus the method of choice for outdoor autonomous vehicles.
The characteristics of this steering geometry are important because, if the image sensor is mounted on one of the car's axles, the captured line will always be tangent to the arcs described by the wheels. This means that Ackermann steering, under normal conditions, ensures that no sideways movement occurs. This assumption is of great importance to the proposed solution, as it is essential for obtaining correlation between consecutive lines. The proposed solution is not suitable for vehicles with high lateral displacement, such as omnidirectional vehicles; for these, a different combination of sensors is required.
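As a numerical check of the Ackermann relations above, the following sketch computes the outer and central steering angles from the inner angle; the wheel angle and vehicle dimensions are arbitrary illustrative values:

```python
import math

# Sketch of the Ackermann relations: given the inner wheel angle theta_i,
# the lateral separation d and the longitudinal separation l, compute the
# outer wheel angle theta_o and the central steering angle theta_SA.
# cot(a) = 1/tan(a); all angles in radians.

def cot(a):
    return 1.0 / math.tan(a)

def outer_angle(theta_i, d, l):
    # cot(theta_o) - cot(theta_i) = d / l
    return math.atan(1.0 / (cot(theta_i) + d / l))

def central_angle(theta_i, d, l):
    # cot(theta_SA) = d / (2 l) + cot(theta_i)
    return math.atan(1.0 / (d / (2.0 * l) + cot(theta_i)))

# Illustrative values: 30 degree inner wheel angle, d = 1.5 m, l = 2.5 m
theta_i = math.radians(30.0)
theta_o = outer_angle(theta_i, 1.5, 2.5)
theta_sa = central_angle(theta_i, 1.5, 2.5)
# As expected, the inner wheel turns more than the central angle, which
# in turn exceeds the outer wheel angle: theta_i > theta_SA > theta_o.
print(round(math.degrees(theta_o), 1), round(math.degrees(theta_sa), 1))
```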
\section{Correlation Methods}\label{sec:algorithms}
\label{sec:correlation}
Two successive images from a line sensor are simply two vectors of dimension $1 \times n$, where $n$ is the resolution of the sensor. This makes it possible to use simpler algorithms than the ones needed to process 2-D images from matrix sensors. Correlation between 2-D images is based on optical flow algorithms and template matching techniques. When dealing with 1-D samples, there are several methods to compute their similarity. K\'{a}lm\'{a}n \cite{Kalman} suggests the use of similarity coefficients such as Pearson's correlation, cosine similarity and others, described in the following subsection. An obvious and effective method to compute the similarity between two samples is cross-correlation, widely used in signal processing and described in subsection \ref{subsection:cross}.
\subsection{Similarity Coefficients}
{\bf Euclidean distance} - is the distance between two points in a straight line. For the points \textit{P}($p_{1}$, $p_{2}$, ..., $p_{n}$) and \textit{Q}($q_{1}$, $q_{2}$, ..., $q_{n}$) the distance is calculated as follows:
\begin{equation}
\label{eq:euclidian}
d(p,q) = \sqrt{\sum_{i=1}^n \left(q_{i} - p_{i}\right)^{2}}
\end{equation}
{\bf Manhattan distance} - is the distance between two points in a grid-like path, with a strictly horizontal and/or vertical path, as opposed to the diagonal or straight line distance:
\begin{equation}
\label{eq:manhattam}
d(p,q) = \sum_{i=1}^n \vert q_{i} - p_{i}\vert
\end{equation}
{\bf Pearson correlation} - Pearson's correlation coefficient measures the linear correlation between two variables \textit{X}($x_{1}$, $x_{2}$, ..., $x_{n}$) and \textit{Y}($y_{1}$, $y_{2}$, ..., $y_{n}$). The coefficient can take a value between +1 and -1, where 1 is total positive correlation, 0 is no correlation, and -1 is total negative correlation. For samples $X_{i}$ and $Y_{i}$, the Pearson correlation coefficient is:
\begin{equation}
\label{eq:pearson}
r = \frac{1}{n-1} \times \sum_{i=1}^n \left( \frac{X_{i}-\bar{X}}{s_{X}} \right) \left( \frac{Y_{i}-\bar{Y}}{s_{Y}} \right)
\end{equation}
where
\begin{equation}
\label{eq:zscore}
\frac{X_{i}-\bar{X}}{s_{X}}
\end{equation}
\begin{equation}
\label{eq:mean}
\bar{X}=\frac{1}{n}\sum_{i=1}^n X_{i}
\end{equation}
\begin{equation}
\label{eq:std}
s_{X}=\sqrt{\frac{1}{n-1} \sum_{i=1}^n \left( X_{i}-\bar{X} \right)^2}
\end{equation}
are the standard score, sample mean, and sample standard deviation, respectively.
{\bf Spearman correlation} - the Spearman correlation coefficient is the nonparametric version of the Pearson correlation coefficient; it measures the strength of association between two ranked variables. Spearman's rank correlation coefficient, $\rho$, is computed by applying the Pearson formula to the rank values $x_{i}$ and $y_{i}$ of the variables:
\begin{equation}
\label{eq:spearman}
\rho = \frac{\sum_{i=1}^n(x_{i}-\bar{x})(y_{i}-\bar{y})}{\sqrt{\sum_{i=1}^n (x_{i}-\bar{x})^2 \sum_{i=1}^n (y_{i}-\bar{y})^2}}
\end{equation}
Like Pearson's correlation coefficient, Spearman's correlation coefficient can also take values between +1 and -1.
{\bf Cosine similarity} - is a measure of similarity between two vectors obtained by calculating the cosine of the angle between them: two vectors with the same orientation have a cosine similarity of 1, the value decreases towards 0 as the angle between the vectors grows, and two opposite vectors have a similarity of -1, independently of their magnitudes. Given two vectors $X$ and $Y$, the cosine similarity, $\cos(\theta)$, is derived from the dot product and the magnitudes as follows:
\begin{equation}
\label{eq:cosine}
\mathrm{similarity} = \cos(\theta) = \frac{X \cdot Y}{\Vert X\Vert \Vert Y\Vert} = \frac{\sum_{i=1}^n X_{i} Y_{i}}{\sqrt{\sum_{i=1}^n (X_{i})^2} \sqrt{\sum_{i=1}^n (Y_{i})^2}}
\end{equation}
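For reference, the coefficients above can be implemented directly. The sketch below is a minimal illustration for two 1-D samples (e.g. consecutive line scans); the Spearman rank computation is simplified and does not average tied ranks:

```python
import numpy as np

# Reference implementations of the similarity measures described above.

def euclidean(p, q):
    return float(np.sqrt(np.sum((q - p) ** 2)))

def manhattan(p, q):
    return float(np.sum(np.abs(q - p)))

def pearson(x, y):
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2)))

def spearman(x, y):
    # Pearson correlation computed on the ranks (ties not handled)
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(x), rank(y))

def cosine(x, y):
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0            # perfectly linearly related to x
print(pearson(x, y))         # 1.0 (total positive correlation)
```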
\subsection{Cross-Correlation}
\label{subsection:cross}
An obvious algorithm to compute the similarity between two vectors is cross-correlation. Cross-correlation is mostly used in signal processing and measures the similarity of two waveforms as a function of a time lag applied to one of them. This method has applications in pattern recognition, digital signal processing and other fields \cite{cross}.
Cross-correlation is mostly used to compute the similarity between signals in the time domain. The lag $l$ is the time shift between the two signals: a sequence $y(n)$ is said to be shifted by $l$ samples with respect to the reference $x(n)$. Cross-correlation is defined as follows:
\begin{equation}
\label{eq:correlation}
r_{xy}(l) = (x\star y)(l) = \sum_{n=-\infty}^\infty x(n)y(n-l) = \sum_{m=-\infty}^\infty x(m + l)y(m) = r_{yx}(-l)
\end{equation}
For convenience, it can be normalized as follows:
\begin{equation}
\label{eq:correlation_normalized}
\rho_{xy}(l) = \frac{r_{xy}(l)}{\sqrt{r_{xx}(0)r_{yy}(0)}}
\end{equation}
Consider the sequences $x$ and $y$ to be similar, differing only by a shift. Cross-correlation can then be used to compute how much $y$ must be shifted to become identical to $x$. The method slides the sequences along each other and calculates the sum of the products at each shift. The value of $(x\star y)$ reaches a maximum when the two sequences are most similar; this happens because the product of the two functions is largest when the peaks are aligned.
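The definitions above translate directly into code. The following is a didactic time-domain sketch (quadratic cost, unlike the frequency-domain approach discussed below), treating samples outside the finite sequences as zero:

```python
import numpy as np

# Direct time-domain evaluation of r_xy(l) and its normalized form
# rho_xy(l) for finite sequences, zero outside their support.

def r_xy(x, y, l):
    # r_xy(l) = sum_n x(n) * y(n - l)
    total = 0.0
    for n in range(len(x)):
        if 0 <= n - l < len(y):
            total += x[n] * y[n - l]
    return total

def rho_xy(x, y, l):
    # normalized by sqrt(r_xx(0) * r_yy(0)); |rho| <= 1 by Cauchy-Schwarz
    return r_xy(x, y, l) / np.sqrt(r_xy(x, x, 0) * r_xy(y, y, 0))

x = np.array([1.0, 2.0, 3.0])
print(rho_xy(x, x, 0))  # 1.0: a sequence is perfectly correlated with itself
```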
For machine vision applications, several authors \cite{lewis2, keane} suggest the use of cross-correlation in the frequency domain. For this, the discrete Fourier transform (DFT) is used. A sequence of $N$ complex numbers $x_{0}, x_{1}, \ldots, x_{N-1}$ is transformed into an $N$-periodic sequence of complex numbers:
\begin{equation}
\label{eq:DFT}
X(k) = \sum_{n=0}^{N-1} x(n)e^{-j2 \pi kn/N}, 0\leq k \leq N-1.
\end{equation}
The inverse discrete Fourier transform is given by:
\begin{equation}
\label{eq:IDFT}
x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)e^{j2 \pi kn/N}, 0\leq n \leq N-1.
\end{equation}
Let $\mathcal{F}$ denote the Fourier transform operator, and let $\mathcal{F}\{x\}$ and $\mathcal{F}\{y\}$ be the Fourier transforms of $x$ and $y$, respectively. Then, analogously to the convolution theorem (the complex conjugate appears because correlation, unlike convolution, does not time-reverse the second sequence):
\begin{equation}
\label{eq:cross1}
\mathcal{F}\{x\star y\} = \mathcal{F}\{x\} \cdot \overline{\mathcal{F}\{y\}}
\end{equation}
where $\cdot$ denotes point-wise multiplication and the bar denotes complex conjugation. By applying the inverse Fourier transform $\mathcal{F}^{-1}$, we can write:
\begin{equation}
\label{eq:cross2}
x\star y= \mathcal{F}^{-1}\big\{\mathcal{F}\{x\}\cdot\overline{\mathcal{F}\{y\}}\big\}
\end{equation}
The fast Fourier transform (FFT) is an algorithm to compute the discrete Fourier transform (DFT) and its inverse. An FFT computes those transforms by factorizing the DFT matrix into a product of sparse (mostly zero) factors, which can result in significant savings in computational complexity. Using Fourier transforms, the property in Equation \ref{eq:cross2} reduces the cross-correlation to simple point-wise products, which significantly lowers the computational complexity of the whole cross-correlation process. Several FFT algorithms are described in detail in \cite{cross}.
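As an illustrative sketch (not the implementation used in this work), the frequency-domain cross-correlation can be realized with NumPy's FFT, zero-padding both sequences so that the circular correlation matches the linear one:

```python
import numpy as np

# Cross-correlation via the FFT: r_xy = IFFT(FFT(x) * conj(FFT(y))),
# zero-padded to the full correlation length.

def xcorr_fft(x, y):
    n = len(x) + len(y) - 1                 # full correlation length
    r = np.fft.ifft(np.fft.fft(x, n) * np.conj(np.fft.fft(y, n)))
    # reorder so the lags run from -(len(y)-1) up to len(x)-1
    return np.real(np.roll(r, len(y) - 1))

x = np.array([0.0, 0.0, 1.0, 0.0])          # impulse delayed by 2 samples
y = np.array([1.0, 0.0, 0.0, 0.0])
lags = np.arange(-(len(y) - 1), len(x))
r = xcorr_fft(x, y)
print(lags[np.argmax(r)])                   # 2: x lags y by two samples
```

For real inputs this matches the direct sliding-product correlation, at a cost of $O(N \log N)$ instead of $O(N^2)$.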
\subsection{Algorithms} \label{sec:algorithm}
To use the similarity coefficient methods to compute the displacement between two snapshots, an algorithmic procedure is necessary. The algorithm used is illustrated in Figure \ref{f:algo}. First, one line scan is captured, then a second line; the correlation coefficient is computed; a shift is applied to the samples and the correlation is computed again; this step is repeated $k$ times, until the lines reach 60\% overlap, as suggested by K\'{a}lm\'{a}n \cite{Kalman}; finally, the maximum correlation is found, and its index corresponds to the displacement between the samples. This procedure is repeated for each new line scan.
For example, Figure \ref{f:pearson_corr_lines} shows two successive snapshots; the plots represent the pixel intensity of each image. After applying the algorithm, there is a correlation coefficient for each shift, as shown in Figure \ref{f:pearson_corr}, and the displacement between the two snapshots can be found by locating the maximum of the plot. In this example, the displacement was 80 pixels and the correlation coefficient at the peak was 0.9028. The procedure is then repeated for the following lines.
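The procedure just described can be sketched as follows; the function names are illustrative, the 60\% minimum overlap follows the bound above, and the synthetic 80-pixel shift mirrors the example:

```python
import numpy as np

# Shift-and-correlate sketch: Pearson's coefficient is evaluated for each
# candidate shift k while keeping at least 60% overlap between the two
# lines; the best-scoring shift is taken as the pixel displacement.

def pearson(x, y):
    xc, yc = x - x.mean(), y - y.mean()
    return np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2))

def displacement(line1, line2, min_overlap=0.6):
    n = len(line1)
    max_shift = int(n * (1.0 - min_overlap))
    coeffs = [pearson(line1[k:], line2[:n - k]) for k in range(max_shift + 1)]
    return int(np.argmax(coeffs))

# Synthetic example: the second line sees the ground 80 pixels ahead
rng = np.random.default_rng(1)
ground = rng.standard_normal(2048 + 80)
line1, line2 = ground[:2048], ground[80:]
print(displacement(line1, line2))  # 80
```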
\begin{figure}[htbp]
\centering
\includegraphics[width=12cm]{./Imagens/algoritmo.eps}
\caption[Illustration of the algorithm using the similarity coefficients.]{Illustration of the algorithm using the similarity coefficients. (Adapted from \cite{Kalman}.)}
\label{f:algo}
\end{figure}
\begin{figure}[htbp]
\centering
\includegraphics[width=12cm, height=4cm]{./Imagens/pearson_corr.eps}
\caption{Two successive line scans with 80 pixels displacement.}
\label{f:pearson_corr_lines}
\end{figure}
\begin{figure}[htbp]
\centering
\includegraphics[width=12cm, height=4cm]{./Imagens/pearson_corr_lines.eps}
\caption{Pearson's correlation coefficients between the two line scans.}
\label{f:pearson_corr}
\end{figure}
For cross-correlation the process is simpler: the algorithm consists of performing a \textit{sliding dot product} between the two image samples. This method is already implemented in MATLAB, using the FFT, by the \texttt{xcorr()} function.
Figure \ref{f:cross_corr} shows an example of applying cross-correlation to the line scans from Figure \ref{f:pearson_corr_lines}. If the lines have $N$ pixels, the result is a vector of $2N-1$ correlation coefficients; in this case, the lines have 2048 pixels and the cross-correlation result is a vector of length 4095 with the correlation coefficient for each shift. In this example, the displacement calculated was also 80 pixels, with a correlation value of 0.9305.
\begin{figure}[htbp]
\centering
\includegraphics[width=12cm, height=4cm]{./Imagens/cross_corr_lines.eps}
\caption{Cross-correlation coefficients between the two line scans.}
\label{f:cross_corr}
\end{figure}