Figure: The optical flow vector of a moving object in a video sequence.
In robotics and computer vision, visual odometry is the process of determining the position and orientation of a robot by analyzing the associated camera images. It has been used in a wide variety of robotic applications, such as on the Mars Exploration Rovers.[1]
In navigation, odometry is the use of data from the movement of actuators to estimate change in position over time, through devices such as rotary encoders that measure wheel rotations. While useful for many wheeled or tracked vehicles, traditional odometry techniques cannot be applied to mobile robots with non-standard locomotion methods, such as legged robots. In addition, odometry universally suffers from precision problems, since wheels tend to slip and slide on the floor, so the distance actually traveled does not correspond exactly to the measured wheel rotations. The error is compounded when the vehicle operates on non-smooth surfaces. Odometry readings become increasingly unreliable as these errors accumulate and compound over time.
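As an illustration of why slip corrupts the estimate, the following is a minimal sketch of dead-reckoning odometry for an assumed differential-drive robot; the constants and the function name are hypothetical, not taken from any cited system.

```python
import math

TICKS_PER_REV = 4096   # assumed encoder resolution
WHEEL_RADIUS = 0.05    # metres, assumed
WHEEL_BASE = 0.30      # metres between the two wheels, assumed


def update_pose(x, y, theta, left_ticks, right_ticks):
    """Dead-reckon the robot pose from encoder ticks accumulated since the last update."""
    d_left = 2 * math.pi * WHEEL_RADIUS * left_ticks / TICKS_PER_REV
    d_right = 2 * math.pi * WHEEL_RADIUS * right_ticks / TICKS_PER_REV
    d_center = (d_left + d_right) / 2.0
    d_theta = (d_right - d_left) / WHEEL_BASE
    # Any wheel slip changes d_left/d_right without the robot actually moving,
    # and that error is integrated into every subsequent pose estimate.
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    theta += d_theta
    return x, y, theta
```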
Visual odometry is the process of determining equivalent odometry information using sequential camera images to estimate the distance traveled. Visual odometry allows for enhanced navigational accuracy in robots or vehicles using any type of locomotion on any[citation needed] surface.
In traditional VO, visual information is obtained with feature-based methods, which extract image feature points and track them across the image sequence. More recent VO research has introduced an alternative, the direct method, which uses the pixel intensities of the image sequence directly as visual input. There are also hybrid methods.
Typical stages of a feature-based pipeline include the following (a minimal code sketch is given after the list):
- Check the flow field vectors for potential tracking errors and remove outliers.[7]
- Estimate the camera motion from the optical flow.[8][9][10][11]
  - Choice 1: a Kalman filter for state estimate distribution maintenance.
  - Choice 2: find the geometric and 3D properties of the features that minimize a cost function based on the re-projection error between two adjacent images. This can be done by mathematical minimization or random sampling.
- Periodically repopulate the trackpoints to maintain coverage across the image.
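The sketch below illustrates a single step of such a pipeline under simplifying assumptions, using OpenCV: features are tracked with pyramidal Lucas-Kanade optical flow, outliers are rejected while estimating the essential matrix with RANSAC, and the relative pose is recovered. The function and parameter values are illustrative only, and with a single camera the translation is known only up to an unknown scale.

```python
import numpy as np
import cv2


def vo_step(prev_gray, curr_gray, K):
    """Estimate relative rotation R and unit-scale translation t between two
    consecutive grayscale frames, given the camera intrinsics matrix K."""
    # Feature detection in the previous frame.
    prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=2000,
                                       qualityLevel=0.01, minDistance=7)
    # Track the features into the current frame (sparse optical flow).
    curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                   prev_pts, None)
    good_prev = prev_pts[status.ravel() == 1]
    good_curr = curr_pts[status.ravel() == 1]

    # Essential-matrix estimation with RANSAC also removes flow outliers.
    E, inliers = cv2.findEssentialMat(good_prev, good_curr, K,
                                      method=cv2.RANSAC, prob=0.999,
                                      threshold=1.0)
    # Recover the relative pose of the second view with respect to the first;
    # t has unit norm because monocular VO cannot observe metric scale.
    _, R, t, _ = cv2.recoverPose(E, good_prev, good_curr, K, mask=inliers)
    return R, t
```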
An alternative to feature-based methods is the "direct" or appearance-based visual odometry technique, which minimizes an error directly in sensor space and thereby avoids feature extraction and matching.[4][12][13]
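As a toy illustration of minimizing a photometric error directly on pixel intensities (not a reproduction of any specific published method), the sketch below estimates a pure 2D image translation by Gauss-Newton; real direct VO instead optimizes a full 6-DoF camera pose together with a depth model, but the structure of the update is analogous.

```python
import numpy as np
import cv2


def estimate_translation(I_ref, I_cur, iterations=20):
    """Estimate the 2D shift p = (dx, dy) that best aligns I_cur with I_ref
    by minimizing the sum of squared intensity differences."""
    I_ref = I_ref.astype(np.float32)
    I_cur = I_cur.astype(np.float32)
    # Image gradients of the current frame (scaled 3x3 Sobel).
    gx = cv2.Sobel(I_cur, cv2.CV_32F, 1, 0, ksize=3) / 8.0
    gy = cv2.Sobel(I_cur, cv2.CV_32F, 0, 1, ksize=3) / 8.0
    h, w = I_ref.shape
    p = np.zeros(2)
    M = np.float32([[1, 0, 0], [0, 1, 0]])
    for _ in range(iterations):
        M[0, 2], M[1, 2] = p[0], p[1]
        warp = cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP   # dst(x) = src(x + p)
        warped = cv2.warpAffine(I_cur, M, (w, h), flags=warp)
        gxw = cv2.warpAffine(gx, M, (w, h), flags=warp)
        gyw = cv2.warpAffine(gy, M, (w, h), flags=warp)
        r = (warped - I_ref).ravel()                      # photometric residual
        J = np.stack([gxw.ravel(), gyw.ravel()], axis=1)  # d(residual)/dp
        dp = np.linalg.solve(J.T @ J, -J.T @ r)           # Gauss-Newton step
        p += dp
        if np.linalg.norm(dp) < 1e-3:
            break
    return p
```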
Another method, coined 'visiodometry', estimates the planar roto-translations between images using phase correlation instead of extracting features.[14][15]
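A minimal sketch of the underlying idea, assuming two grayscale frames of equal size, is to recover the inter-frame shift with OpenCV's phase correlation; handling the rotational component would additionally require, for example, a log-polar resampling step, which is omitted here. The file names are placeholders.

```python
import numpy as np
import cv2

prev_gray = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
curr_gray = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# A Hanning window reduces edge effects in the FFT-based correlation.
window = cv2.createHanningWindow(prev_gray.shape[::-1], cv2.CV_32F)
(shift_x, shift_y), response = cv2.phaseCorrelate(prev_gray, curr_gray, window)
print(f"estimated shift: ({shift_x:.2f}, {shift_y:.2f}) px, confidence {response:.2f}")
```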
Egomotion is defined as the 3D motion of a camera within an environment.[16] In the field of computer vision, egomotion refers to estimating a camera's motion relative to a rigid scene.[17] An example of egomotion estimation would be estimating a car's moving position relative to lines on the road or street signs being observed from the car itself. The estimation of egomotion is important in autonomous robot navigation applications.[18]
The goal of estimating the egomotion of a camera is to determine the 3D motion of that camera within the environment using a sequence of images taken by the camera.[19] The process of estimating a camera's motion within an environment involves the use of visual odometry techniques on a sequence of images captured by the moving camera.[20] This is typically done using feature detection to construct an optical flow from two image frames in a sequence[16] generated from either single cameras or stereo cameras.[20] Using stereo image pairs for each frame helps reduce error and provides additional depth and scale information.[21][22]
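A brief sketch of how a calibrated stereo pair supplies metric scale: matched features are triangulated with the known projection matrices (assumed here to come from a prior stereo calibration), so 3D points, and hence the inter-frame translation, are obtained in metric units rather than up to an unknown factor.

```python
import numpy as np
import cv2


def triangulate(P_left, P_right, pts_left, pts_right):
    """pts_left, pts_right: (N, 2) float arrays of matched pixel coordinates.
    P_left, P_right: 3x4 projection matrices from stereo calibration.
    Returns (N, 3) points in the left-camera frame, in calibration units."""
    pts4d = cv2.triangulatePoints(P_left, P_right, pts_left.T, pts_right.T)
    return (pts4d[:3] / pts4d[3]).T   # convert from homogeneous coordinates
```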
Features are detected in the first frame, and then matched in the second frame. This information is then used to construct the optical flow field for the detected features in those two images. The optical flow field illustrates how features diverge from a single point, the focus of expansion. The focus of expansion can be detected from the optical flow field, indicating the direction of the motion of the camera, and thus providing an estimate of the camera motion.
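A minimal sketch of locating the focus of expansion (an assumed formulation, not taken from the cited sources): for a translating camera each flow vector points away from the FOE, so the FOE can be found as the least-squares intersection of the lines through each feature along its flow direction.

```python
import numpy as np


def focus_of_expansion(points, flows):
    """points: (N, 2) feature positions; flows: (N, 2) optical flow vectors.
    Returns the (x, y) image location from which the flow field diverges."""
    # Collinearity of (foe - p) with the flow v gives one linear equation per feature:
    #   v_y * foe_x - v_x * foe_y = v_y * p_x - v_x * p_y
    A = np.stack([flows[:, 1], -flows[:, 0]], axis=1)
    b = flows[:, 1] * points[:, 0] - flows[:, 0] * points[:, 1]
    foe, *_ = np.linalg.lstsq(A, b, rcond=None)
    return foe
```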
There are other methods of extracting egomotion information from images as well, including a method that avoids feature detection and optical flow fields and directly uses the image intensities.[16]
Scaramuzza, D.; Siegwart, R. (October 2008). "Appearance-Guided Monocular Omnidirectional Visual Odometry for Outdoor Ground Vehicles". IEEE Transactions on Robotics. 24 (5): 1015–1026. doi:10.1109/TRO.2008.2004490. hdl:20.500.11850/14362. S2CID 13894940.
Corke, P.; Strelow, D.; Singh, S. "Omnidirectional visual odometry for a planetary rover". Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2004). Vol. 4. doi:10.1109/IROS.2004.1390041.
Campbell, J.; Sukthankar, R.; Nourbakhsh, I.; Pittsburgh, I.R. "Techniques for evaluating optical flow for visual odometry in extreme terrain". Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2004). Vol. 4. doi:10.1109/IROS.2004.1389991.
Sunderhauf, N.; Konolige, K.; Lacroix, S.; Protzel, P. (2005). "Visual odometry using sparse bundle adjustment on an autonomous outdoor vehicle". In Levi; Schanz; Lafrenz; Avrutin (eds.). Tagungsband Autonome Mobile Systeme 2005 (PDF). Reihe Informatik aktuell. Springer Verlag. pp. 157–163. Archived from the original (PDF) on 2009-02-11. Retrieved 2008-07-10.
Konolige, K.; Agrawal, M.; Bolles, R.C.; Cowan, C.; Fischler, M.; Gerkey, B.P. (2008). "Outdoor Mapping and Navigation Using Stereo Vision". Experimental Robotics. Springer Tracts in Advanced Robotics. Vol. 39. pp. 179–190. doi:10.1007/978-3-540-77457-0_17. ISBN 978-3-540-77456-3.
Zaman, M. (2007). "High Precision Relative Localization Using a Single Camera". Proceedings of the 2007 IEEE International Conference on Robotics and Automation (ICRA 2007). doi:10.1109/ROBOT.2007.364078.
Zaman, M. (2007). "High resolution relative localisation using two cameras". Journal of Robotics and Autonomous Systems. 55 (9): 685–692. doi:10.1016/j.robot.2007.05.008.
Burger, W.; Bhanu, B. (November 1990). "Estimating 3D egomotion from perspective image sequence". IEEE Transactions on Pattern Analysis and Machine Intelligence. 12 (11): 1040–1058. doi:10.1109/34.61704. S2CID 206418830.