Abstract
State estimation errors arise from the uncertainty of sensor measurements. By fusing measurements from multiple sensors, this error can be bounded and reduced. The multi-sensor fusion system (MSFS) is a core technology for developing navigation systems, and simultaneous localization and mapping (SLAM) based on the MSFS is an essential solution for autonomous mobile robots. In this paper, we present a concise study of the MSFS with a focus on the visual-inertial navigation system (VINS), which imitates the human localization system by combining inertial sensors and cameras. Firstly, this paper introduces the fundamentals of the MSFS and the inertial sensor-based kinetic model. Secondly, state-of-the-art methodologies and a concise review of VINS are presented. Then, modern VINS frameworks are summarized with the aim of arriving at a robust SLAM structure. Finally, the challenges of the MSFS are discussed.