Thanks for stopping by! Please note that this page will not be updated anymore and all content has been moved into an open-source textbook. The books is available open-source on github, compiled as PDF, and in print on Amazon.com.
Simultaneous Localization and Mapping (SLAM) is the “hen-egg” problem of localizing both a robot as well features on a map at the same time. Usually, a robot obtains an initial estimate of where it is using some onboard sensors (odometry, optical flow, etc.) and uses this estimate to localize features (walls, corners, graphical patterns) in the environment. As soon as a robot revisits the same feature twice, it can update the estimate on its location. This is because the variance of an estimate based on two independent measurements will always be smaller than any of the variances of the individual measurements. As consecutive observations are not independent, but rather closely correlated, the refined estimate can then be propagated along the robot’s path. This is formalized in EKF-based SLAM. A more intuitive understanding is provided by a spring-mass analogy: each possible pose (mass) is constrained to its neighboring pose by a spring. The higher the uncertainty of the relative transformation between two poses (e.g., obtained using odometry), the weaker the spring. Every time a robot gains confidence on a relative pose, the spring is stiffened instead. Eventually, all poses will be pulled in place. This approach is known as Graph-Based SLAM. This lecture has two parts:
- Part I: Formulating SLAM as a maximum-likelihood estimation (MLE) problem.
- Part II: Efficient numerical techniques to approximate solutions to the SLAM MLE.
Part I: SLAM as a Maximum-Likelihood Estimation Problem
The classical formulation of SLAM describes the problem as maximizing the posterior probability of all points on the robot’s trajectory given the odometry input and the observations. Formally,
where are all discrete positions from time 1 to time , are the observations, and are the odometry measurements. This formulation makes heavily use of the temporal structure of the problem. In practice, solving the SLAM problem requires
- A motion update model, i.e., the probability to be at location given an odometry measurement and being at location .
- A sensor model, i.e., the probability to make observation given the robot is at location and the map .
A possible solution to this problem is provided by the Extended Kalman Filter, which maintains a probability density function for the robot pose as well as the positions of all features on the map. Being able to uniquely identify features in the environment is of outmost importance and is known as the data association problem. Graph-based SLAM does not solve this problem and will fail if features are confused.
In graph-based SLAM, a robot’s trajectory forms the nodes of a graph whose edges are transformations (translation and rotation) that have a variance associated with it. An alternative view is the spring-mass analogy mentioned above. Instead of having each spring wiggle a node into place, graph-based SLAM aims at finding those locations that maximize the joint likelihood of all observations. As such, graph-based SLAM is a maximum likelihood estimation problem.
Lets revisit the normal distribution:
It provides the probability for a measurement to have value given that this measurement is normal distributed with mean and variance . We can now associate such a distribution with every node-to-node transformation, aka constraint. This can be pairs of distance and angle, e.g. In the literature, e.g., [Gris2010], the measurement of a transformation between node i and a node j is denoted . Its expected value is denoted . This value is expected for example based on a map of the environment that consists of previous observations.
Formulating a normal distribution of measurements with mean and a covariance matrix (containing all variances of the components of in its diagonal) is now straightforward. As graph-based SLAM is most often formulated as information filter context, usually the inverse of the covariance matrix (aka information matrix) is used, which we denote by .
As we are interested in maximizing the joint probability of all measurements over all edge pairings following the maximum likelihood estimation framework, it is customary to express the PDF using the log-likelihood. By taking the natural logarithm on both sides of the PDF expression, the exponential function vanishes and becomes or , where is the log-likelihood distribution for .
Again, the log-likelihood for observation is directly derived from the definition of the normal distribution, but using the information matrix instead of the covariance matrix and is ridden of the exponential function by taking the logarithm on both sides.
The optimization problem can now be formulated by
with the error between measurement and expected value. Note that the sum actually needs to be minimized as the individual terms are technically the negative log-likelihood.
Part II: Numerical Techniques for Graph-based SLAM
Solving the MLE problem is non-trivial, especially if the number of constraints provided is large. A classical approach is to linearize the problem at the current configuration and reducing it to a problem of the form . The intuition here is to calculate the impact of small changes in the positions of all nodes on all , refer to [Gris2010, Section IV.A] for details. After performing this motion, linearization and optimization can be repeated until convergence. More recently, more powerful numerical methods have been presented in [Ols06] and [Gris07], e.g., with source code available from the authors websites.
Instead of solving the MLE, [Ols06] proposes a stochastic gradient descent algorithm. A gradient descent algorithm is an iterative approach to find the optimum of a function by moving along its gradient. Whereas a gradient descent algorithm would calculate the gradient on a fitness landscape from all available constraints, a stochastic gradient descent picks only a (non-necessarily random) subset. Intuitive examples are fitting a line to a set of n points, but taking only a subset of these points when calculating the next best guess. As gradient descent works iteratively, the hope is that the algorithm takes a large part of the constraints into account. For solving Graph-based SLAM, a stochastic gradient descent algorithm would not take into account all constraints available to the robot, but iteratively work on one constraint after the other. Here, constraints are observations on the mutual pose of nodes i and j. Optimizing these constraints now requires moving both nodes i and j so that the error between where the robot thinks the nodes should be and what it actually sees gets reduced. As this is a trade-off between multiple, maybe conflicting observations, the result will approximate a Maximum Likelihood estimate.
More specifically, with the error between an observation and what the robot expects to see, based on its previous observation and sensor model, the approach described in [Ols06] distributes the error along the entire trajectory between both features that are involved in the constraint. That is, if the constraint involves features i and j, not only i and j‘s pose will be updated but all points inbetween will be moved a tiny bit.
This approach is cumbersome and quickly gets out of control if a robot is mapping an environment over multiple hours – leading to millions of nodes in the graph and constraints. To overcome this problem, [Gris07] propose to (1) merge nodes of a graph as it is build up by relying on accurate localization of the robot within the existing map and (2) to chose a different graph representation.
In Graph-based SLAM, edges encode the relative translation and rotation from one node to the other. Thus, altering a relationship between two nodes will automatically propagate to all nodes in the network. In [Ols06] the graph is essentially a chain of nodes whose edges consist of odometry measurements. This chain then becomes a graph whenever observations (using any sensor) introduce additional constraints. Whenever such a “loop-closure” occurs, the resulting error will be distributed over the entire trajectory that connects the two nodes. This is not always necessary, for example when considering the robot driving a figure-8 pattern. If a loop-closure occurs in one half of the 8, the nodes in the other half of the 8 are probably not involved.
[Gris07] addresses this problem by constructing a minimum spanning-tree (MST) of the constraint graph. The MST is constructed by doing a Depth-First Search (DFS) on the constraint graph following odometry constraints. At a loop-closure, i.e., an edge in the graph that imposes a constraint to a previously seen pose, the DFS backtracks to this node and continues from there to construct the spanning tree. Updating all poses affected by this new constraint still requires modifying all nodes along the path between the two features that are involved, but inserting additional constraints is greatly simplified. Whenever a robot observes new relationships between any two nodes, only the nodes on the shortest path between the two features on the MST need to be updated. Example graphs illustrating this are shown in Figures 2a and 2b in [Gris07].
[Gris2010] G. Grisetti, R. Kuemmerle, C. Stachniss and W. Burgard. A Tutorial on Graph-Based SLAM. IEEE Intelligent Transportation Systems Magazine, 2(4):31-43, 2010.
[Ols06] E. Olson, J. Leonard and S. Teller. Fast Iterative Alignment of Pose Graphs with Poor Initial Estimates. Proc. of ICRA, pp 2262-2269, Orlando, FL, 2006. Source code.
[Gris07] G. Grisetti, C. Stachniss, S. Grzonka and W. Burgard. A Tree Parameterization for Efficiently Computing Maximum Likelihood Maps using Gradient Descent. Robotics: Science and Systems (RSS), Atlanta, GA, USA, 2007. Source code.