In [16], an approach for collision avoidance of a UAV using optical flow is discussed, but it was evaluated in simulation only. In contrast, this paper presents reliable empirical data from autonomous experimental flights under real indoor conditions, evaluated with an independent optical tracking system.

Though approaches using vision-based position sensors for autonomous navigation exist [10-16], they are either not suitable or not adaptable to our system, since a low-cost, fast, accurate, reliable and simple solution was required. Complex solutions suffer from a high computational and implementation burden as well as a higher risk of failure due to unknown system behaviour. A vision-based SLAM (simultaneous localization and mapping) algorithm is presented in [12], but its heavy computations are performed on an external CPU, which violates our definition of autonomy. The same holds for the optical flow computations of [14] and [15], which must be performed on an external computer because of the complexity of the design and the high computational burden. Therefore, this paper presents a simple design and a fast solution for autonomous quadrocopter navigation using only the principle of optical flow for positioning. The presented solution can easily be fused with other approaches, is adaptable to any system and is extendable. This paper explains in detail a realization of this approach for autonomous flight, its capabilities and its drawbacks.

Due to the progress in sensor, actuator and processor technology and the associated reduction in cost, combined with the improved performance of such parts, the construction of semi-autonomous and fully autonomous quadrocopters is possible today [1-4]. As there are different definitions of an autonomous system, the term is used here for a system that can operate completely independently of any external devices for all of its functionality. In comparison to systems using GPS [2-4] or optical tracking cameras [5] for positioning, which are better called semi-autonomous, autonomous systems are capable of operating in unknown and GPS-denied environments such as houses, caves, tunnels or any other place where no accurate GPS signal is available, since they do not depend on an external device or signal. Many different approaches for an autonomous system exist, using ultrasonic sensors [6], infrared sensors [3], laser scanners [7, 8], stereo cameras or the Microsoft Kinect camera [9]. Each suffers from its own drawbacks in terms of reliability, price, weight and size. For reliability reasons a multi-sensor system is mandatory, and video-camera-based systems benefit from low weight, price and size.

One advantage of this algorithm is that it can easily be implemented using a single loop over all pixels. Furthermore, the inversion of a 2×2 matrix presents no difficulty, so the algorithm is suitable for real-time execution. A speed-up version of the Lucas-Kanade method is presented in [21], and an open-source implementation of the algorithm is provided in [22].
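As an illustration, a minimal Python/NumPy sketch of this single-window computation (one pass over the pixels followed by a 2×2 inversion) might look as follows; all names are ours and the code is a simplified sketch, not the paper's implementation:

import numpy as np

def lucas_kanade_window(prev, curr):
    """Estimate one (u, v) optical-flow vector for a whole image window,
    following the least-squares form of Equation 2.

    prev, curr: 2-D float arrays of equal shape (two consecutive frames).
    Returns (u, v) in pixels per frame.
    """
    # Spatial and temporal intensity differences (cf. Equation 3); the
    # single loop over all pixels is hidden in the array slicing.
    Px = (prev[1:-1, 2:] - prev[1:-1, :-2]) / 2.0   # difference along x
    Py = (prev[2:, 1:-1] - prev[:-2, 1:-1]) / 2.0   # difference along y
    Pt = curr[1:-1, 1:-1] - prev[1:-1, 1:-1]        # difference over time

    # Accumulate the 2x2 normal-equation matrix and the right-hand side.
    A = np.array([[np.sum(Px * Px), np.sum(Px * Py)],
                  [np.sum(Px * Py), np.sum(Py * Py)]])
    b = -np.array([np.sum(Px * Pt), np.sum(Py * Pt)])

    # Inverting a 2x2 matrix is trivial; guard against a textureless window.
    if abs(np.linalg.det(A)) < 1e-9:
        return 0.0, 0.0
    u, v = np.linalg.solve(A, b)
    return u, v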

A derivation of Equation 2 can also be found in [20], but another explanation is that it follows from the simple substitution of the partial derivatives (Equation 1) by finite intensity differences (Equation 3) with respect to the axes or to time.
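One common discretization of this substitution (our notation; $P^t$ denotes the frame at time $t$) is:

\[
P_x(i,j) \approx P^t(i{+}1,j) - P^t(i,j), \qquad
P_y(i,j) \approx P^t(i,j{+}1) - P^t(i,j), \qquad
P_t(i,j) \approx P^t(i,j) - P^{t-1}(i,j)
\]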

This optimization can be implemented by applying the algorithm of Srinivasan [19]. It can be simplified by comparing the intensities of pixels [Figure 1] and reducing M to 1. The current pixel Pt is compared with the previous pixel Pt-1 and is scaled within the left and right neighbours P2 and P1 as well as the upper and lower neighbours P4 and P3. With Equation 2 the optical flow can then be computed.

Here Px(i,j), Py(i,j) and Pt(i,j) are the partial intensity derivatives of point P(i,j) with respect to the x-axis, the y-axis and time t, and u and v are the sought optical flow values along the x- and y-axes.

Lucas and Kanade [17] assume that the change between two pictures is small and constant. This means that the transformation is valid within the neighbourhood M and that a pure translation in the x-y-plane takes place; rotations and translations along the z-axis are not taken into account. Comparing two consecutive pictures, this leads to the following least-squares solution [19]:
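Written out (this is the standard textbook form of the Lucas-Kanade solution, summing over the neighbourhood M, consistent with the symbol definitions and the 2×2 inversion discussed in this section), Equation 2 reads:

\[
\begin{pmatrix} u \\ v \end{pmatrix} =
\begin{pmatrix}
\sum_{M} P_x^2 & \sum_{M} P_x P_y \\
\sum_{M} P_x P_y & \sum_{M} P_y^2
\end{pmatrix}^{-1}
\begin{pmatrix} -\sum_{M} P_x P_t \\ -\sum_{M} P_y P_t \end{pmatrix}
\tag{2}
\]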

The most common methods for optical flow computation are differential, matching, energy-based and phase-based. Matching and phase-based methods carry a high computational burden. Since the differential method of Lucas and Kanade [17] is widespread and shows an acceptable computational burden with good performance [18], it was decided to implement this method on different hardware and compare the results with the ADNS-3080 optical flow sensor.

The implemented differential method assumes that the transformation between two pictures is no larger than one pixel. Therefore, only speeds up to one pixel per frame are measurable. For the OV7670 this means that at most 30 pixel translations per second are fully detectable. This could explain the poor performance of the OV7670, since the ADNS-3080 provides 6400 fps.
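In symbols, with at most one pixel of displacement per frame, the maximum measurable image velocity is simply the frame rate:

\[
v_{\max} = 1\,\tfrac{\text{px}}{\text{frame}} \cdot f_{\text{frame}}
\;\Rightarrow\;
v_{\max}^{\text{OV7670}} = 30\,\tfrac{\text{px}}{\text{s}}, \qquad
v_{\max}^{\text{ADNS-3080}} = 6400\,\tfrac{\text{px}}{\text{s}}
\]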

Only the OV7670 and the ADNS-3080 seemed suitable for our requirements, so a closer evaluation of both systems was made. Both sensors were moved about 20 cm back and forth in one direction at different speeds and the results were compared [Figures 2-3]. Both systems are capable of detecting the movement and its direction with the principle of optical flow. Furthermore, it clearly showed that the measurement of the OV7670 implementation at 30 fps depends on the speed, while this is not the case for the ADNS-3080.

The ADNS-3080 is an optical mouse sensor, available on a breakout board with an SPI interface as a ready-made optical flow sensor for two degrees of freedom. The sensor has a resolution of 30×30 pixels and achieves 6400 fps. To allow a longer shutter (exposure) time and better results under bad lighting conditions, the frame rate was reduced to 2000 fps. The board costs about $40 and is small (2×3 cm). Its biggest disadvantage is that the software is not open source; its functionality is therefore unknown, not adaptable and not extendable. Also, dust and dirt are a big problem. The lens is replaceable and was exchanged for one with a focal length of f = 16 mm for a better focus on objects at a distance of about 1 m.
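The register map (Motion, Delta_X, Delta_Y) is documented in the Avago datasheet. The paper's implementation runs on the quadrocopter's microcontroller; purely for illustration, a minimal Python sketch using the Linux spidev module might read the sensor like this:

import spidev

# ADNS-3080 register addresses (from the Avago datasheet)
REG_MOTION, REG_DELTA_X, REG_DELTA_Y = 0x02, 0x03, 0x04

def read_register(spi, addr):
    # Read: address byte with MSB clear, then clock out one data byte.
    # (A production driver also honours the datasheet's t_SRAD delay
    # between address and data, which requires manual chip-select control.)
    return spi.xfer2([addr & 0x7F, 0x00])[1]

def to_signed8(v):
    # Delta registers are 8-bit two's complement values.
    return v - 256 if v > 127 else v

def read_motion(spi):
    """Return (dx, dy) pixel displacement since the last read, or (0, 0)."""
    if read_register(spi, REG_MOTION) & 0x80:      # motion flag set?
        return (to_signed8(read_register(spi, REG_DELTA_X)),
                to_signed8(read_register(spi, REG_DELTA_Y)))
    return (0, 0)

spi = spidev.SpiDev()
spi.open(0, 0)               # SPI bus/device numbers are platform dependent
spi.max_speed_hz = 500_000
print(read_motion(spi))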

The OV7670 is a VGA colour camera chip with a resolution of 640×480. For data processing, the STM32 F4 Discovery board [27] was used, and its 8-bit parallel data interface was connected to the sensor for reading pictures. The SCCB, an I²C-like interface, is used only for sending configuration commands to the camera. Because of the limited memory of the microprocessor, QVGA instead of VGA format was used, which the camera also supports. From this 320×240 QVGA picture only a 64×64 pixel sub-window is processed, again because of the limited memory (192 kB RAM) of the microprocessor. With this system 30 fps was achieved.
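A rough memory budget (assuming one byte per pixel and double buffering of two consecutive frames; the paper does not state the pixel format) shows why the sub-window is necessary:

\[
320 \times 240 \times 2 \approx 150\,\text{kB} \;\;\text{(close to the 192 kB RAM)}, \qquad
64 \times 64 \times 2 = 8\,\text{kB}
\]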

The CMUCam4 is a microcontroller board for computer vision with open-source firmware and 32 kB memory. Its processor has eight cores and is programmed in SPIN; it therefore requires a high initial training effort, and the small memory is a problem. Furthermore, the price and size of this system are not suitable for our application.

The µCAM is easy to connect via USART, but we achieved a maximum frame rate of only 13 fps, which is too slow for the differential method and our application. The reasons were the slow USART communication interface and a processing delay inside the camera.

The Tam2 has a resolution of 16×16 (black and white) at 25 fps. Centeye provides a breakout board with an implementation of the optical flow algorithm. This is a good starting point, and the system showed good results in detecting the movement of nearby objects, but because of the fixed lens and low resolution, more distant objects were blurred.

The goal was to implement an optical flow sensor pointing at the ground in order to compute the relative position of a quadrocopter flying at a height of about 1 m. To add this capability to our autonomous quadrocopter for navigation, different solutions were investigated. Because of the high data volume and rate of visual systems, as well as the limited memory and computational power on board the quadrocopter, picking the right camera sensor is not trivial. Constraints on the camera are a simple interface to a microprocessor, frame rate, resolution, price and weight. A closer look was taken at the Centeye vision chip Tam2 [22], the 4D Systems µCAM (USART) [23], the CMUCam4 [24], the OV7670 from OmniVision [25] and the ADNS-3080 [26].

For debugging and evaluation purposes, as well as to control the quadrocopter, a control program was developed using Qt [30]. The program was used to display pictures from the OV7670 and the ADNS-3080 and to trace and change parameters. The position of the quadrocopter, scaled in metres, can be tracked on a 2D map [Figure 7], using either the optical tracking system as a reference or the optical flow sensor. The former corresponds to the true position and the latter to the assumed position used by the quadrocopter for position control and autonomous flight. Using mouse clicks, position set points can be changed in real time and sent to the quadrocopter remotely via Bluetooth.

The optical flow estimate is the input (measurement) of the two position controllers (forward, sideward). The second input of the forward or sideward controller is the position set point, x or y respectively, which can be changed remotely. Together with the adjustable height, this enables autonomous 3D flight. The differences between the set points and the optical estimates are the errors, which are the inputs of the two PID position controllers (x- and y-axis). The outputs of the position controllers are the set points of the roll and pitch attitude controllers.
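A minimal sketch of one such position loop follows; the gains, names and clamping limit are hypothetical illustrations, not the paper's values:

class PID:
    """Textbook PID controller with output clamping."""
    def __init__(self, kp, ki, kd, limit):
        self.kp, self.ki, self.kd, self.limit = kp, ki, kd, limit
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error, dt):
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        out = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Clamp to the maximum allowed attitude set point (rad).
        return max(-self.limit, min(self.limit, out))

# x-error (set point minus optical-flow position estimate) -> pitch set point
x_pid = PID(kp=0.4, ki=0.05, kd=0.2, limit=0.2)    # hypothetical gains
pitch_setpoint = x_pid.update(error=2.0 - 0.0, dt=0.01)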

For each degree of freedom an empirically optimized PID controller was implemented. For a six-DOF system this amounts to six cascaded PID controllers [Figure 6]. The height estimate and the set height are the inputs of the height controller, which determines the total voltage of all four motors and thereby regulates the lift of the system. One fourth of the voltage corresponds to an 8-bit value, and the total voltage is distributed to the four motors according to the outputs of the attitude controllers. This fusion is done by superposition, and the restrictions of the motors have to be considered here.
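A minimal sketch of this superposition and clamping; the '+' motor layout and all signs are our assumption, since the paper does not specify the mixing matrix:

def mix_motors(thrust, roll_out, pitch_out, yaw_out):
    """Distribute total thrust to four motors by superposition ('+' layout).

    thrust is the height controller's output (one fourth per motor);
    roll/pitch/yaw are the attitude controllers' outputs. Each command
    is clamped to the 8-bit motor range (the motor restrictions).
    """
    m = [
        thrust + pitch_out + yaw_out,   # front
        thrust - pitch_out + yaw_out,   # rear
        thrust + roll_out  - yaw_out,   # left
        thrust - roll_out  - yaw_out,   # right
    ]
    return [min(255, max(0, int(v))) for v in m]

print(mix_motors(thrust=128, roll_out=10, pitch_out=-5, yaw_out=3))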

The resulting scaling factors F_s are 259.51, 257.64 and 259.21. Because the minima-maxima method is the simplest and showed the same results as MLS, it can be used for scaling the sensor output into metres.

The result needs to be scaled according to height. To find the correct scaling factor F_s, the sensor was moved several times 2 m along one direction at a height of 1 m, and its translation was tracked with the optical tracking system PPT X4 [29] as the ground-truth reference [Figure 5].
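Assuming the ground displacement per flow count grows linearly with height, the calibration at $h_{\text{cal}} = 1\,$m then gives (our notation):

\[
F_s = \frac{\text{counts}}{\text{distance}\,[\text{m}]}\bigg|_{h = h_{\text{cal}}}, \qquad
x\,[\text{m}] = \frac{\Delta\text{counts}}{F_s} \cdot \frac{h}{h_{\text{cal}}}
\]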

The position estimate is obtained by integrating the optical flow measurements. The ADNS-3080 parameters used are 2000 fps, 30×30 pixel resolution, 400 CPI and automatic shutter time. These are the default parameters of the sensor. A higher frame rate achieved no better results under normal lighting conditions.
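A minimal sketch of this integration, reusing the hypothetical read_motion() from above and the height scaling from the calibration:

F_S = 259.0          # counts per metre, calibrated at 1 m height (see above)
pos_x = pos_y = 0.0  # accumulated position estimate in metres

def update_position(dx_counts, dy_counts, height_m):
    """Integrate one (dx, dy) optical-flow sample into the position estimate,
    scaling linearly with the current height above ground."""
    global pos_x, pos_y
    pos_x += dx_counts * height_m / F_S
    pos_y += dy_counts * height_m / F_S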

The hardware design of the quadrocopter is shown in Figure 4. The total price of all hardware components is about €500. The fusion of the infrared, ultrasonic, pressure and inertial sensors for height estimation is described in [28].

5. Evaluation (Autonomous UAV)

The capabilities of the system were evaluated extensively [20]. The evaluation consists of static position hold, a simple position change as an easy dynamic test case, and two complex dynamic test cases performing a fully autonomous flight.

5.2 Dynamic Control

To investigate the control behaviour of the system, an experiment was performed in which the quadrocopter had to react to a step input. The quadrocopter was in position hold at p0 = (x = 0, y = 0) and a new set point p1 = (x = 2 m, y = 0) was set remotely. Figure 11 shows the reaction of the system.

The figure demonstrates the stability of the controller. The new set point is reached within about 3 s (rise time). The overshoot is about 15% (ca. 0.3 m) and the settling time is between 7 s and 9 s. The final position error is about 0.15 m. If this were a scaling error, a linear transformation would exist that maps one curve onto the other; a closer look at the graph shows that this is not the case [Figure 11], so the source of the error must lie elsewhere. Since the quadrocopter rotates around its pitch axis to reach a new position on the x-axis, and pitch rotations also cause incorrectly measured position changes through the optical flow sensor, this is a plausible explanation. It is the most likely one, since Figure 11 shows that this offset corresponds to a similar difference between both curves at 85 s, the moment the quadrocopter received the new set point and pitched. This pitching (pitch-axis rotation) causes an incorrect optical flow measurement, which can be seen in the figure: although the optical tracking shows that the quadrocopter is moving forward, the odometry incorrectly indicates the opposite at the beginning of the manoeuvre.
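A plausible small-angle model of this effect (our simplification; $f_{\text{px}}$ is the focal length expressed in pixels) is that a pure pitch rotation by $\Delta\theta$ shifts the ground image by

\[
\Delta x_{\text{rot}} \approx f_{\text{px}} \cdot \Delta\theta ,
\]

independent of any translation, so a forward pitch initially produces flow opposite to the actual motion, consistent with what is observed at 85 s.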