Robots are systems that combine sensing, actuation, computation, and communication. Except for computation, all of these sub-systems are subject to a high degree of uncertainty. This can be observed in daily life: phone calls are often of poor quality, making it hard to understand the other party; characters are difficult to read from far away; the front wheels of your car slip when accelerating on a rainy road from a red light; or your wireless device has a hard time getting a connection. In robotics, measurements taken by on-board sensors are sensitive to changing environmental conditions and subject to electrical and mechanical limitations. Similarly, actuators are not accurate, as joints and gears have backlash and wheels slip. Finally, communication, in particular wireless communication via radio or infrared, is notoriously unreliable.

The goals of this lecture are to understand

  • how to treat uncertainty mathematically using probability theory
  • how measurements with different uncertainty can be combined

A brief review on probability theory

As quantities such as “distance to a wall”, “position on the plane” or “I can see a blue cross” are uncertain, we can consider them random variables. A random variable can be thought of as the outcome of a “random” experiment, such as rolling a die. (In this example, the random variable holds the outcome of the experiment.) Random variables can describe either discrete quantities, such as the result of rolling a die, or continuous quantities, such as a measured distance. In order to learn about the likelihood that a random variable has a certain outcome, we can repeat the experiment many times and record the resulting random variates, that is, the actual values the random variable takes, and the number of times they occur. For a perfectly cubic die we will see that the random variable can take the natural numbers from 1 to 6, each with the same likelihood of 1/6.
The function that describes the probability of a random variable to take certain values is called a probability distribution.
As the likelihood of all possible random variates in the die experiment is the same, the die follows what we call a uniform distribution. More accurately, as the outcomes of rolling a die are discrete numbers, it is a discrete uniform distribution. Most random variables are not uniformly distributed; some variates are more likely than others. For example, when considering a random variable that describes the sum of two simultaneously thrown dice, we can see that the distribution is anything but uniform:
2: 1+1
3: 1+2, 2+1
4: 1+3, 2+2, 3+1
5: 1+4, 2+3, 3+2, 4+1
6: 1+5, 2+4, 3+3, 4+2, 5+1
7: 1+6, 2+5, 3+4, 4+3, 5+2, 6+1
8: 2+6, 3+5, 4+4, 5+3, 6+2
9: 3+6, 4+5, 5+4, 6+3
10: 4+6, 5+5, 6+4
11: 5+6, 6+5
12: 6+6
As one can see, there are many more possibilities to sum up to a 7 than, for example, to a 3. While it is possible to store a probability distribution such as this one as a look-up table to predict the outcome of an experiment (or that of a measurement), it is hard to reason about the random process analytically. We therefore aim to describe the probability distributions of the random processes we need to deal with using one of the “standard” probability distribution functions (or, more accurately, probability density functions). These functions have been well studied and come with a broad set of tools for their analysis. Consequently, more often than not we use a well studied probability density function even if the underlying data doesn’t fully support its use, for the simple reason that we can work with its associated mathematical tools.
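The look-up table above is easy to reproduce programmatically; here is a quick sketch in plain Python (standard library only) that enumerates all 36 equally likely outcomes:

```python
from collections import Counter
from itertools import product

# Enumerate all 36 equally likely (a, b) outcomes of rolling two dice
# and count how often each sum occurs.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in sorted(counts):
    prob = counts[total] / 36
    print(f"{total:2d}: {counts[total]} ways, probability {prob:.3f}")
```

Running this confirms that a 7 (6 ways, probability 1/6) is six times more likely than a 2 or a 12 (1 way each).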
One of the most prominent distributions is the Gaussian or Normal Distribution. The Normal distribution is characterized by a mean and a variance. Here, the mean corresponds to the average value of a random variable (or the peak of the distribution) and the variance is a measure of how broadly variates are spread around the mean (or the width of the distribution).

Normal Distribution with different means and variances

The Normal distribution is defined by the following probability density function
f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}
where \mu is the mean and \sigma^2 the variance. (\sigma on its own is known as the standard deviation.) Then, f(x) is the probability density for a random variable X to take the value x.
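This density can be evaluated with a few lines of plain Python; the function name normal_pdf below is ours, not from any library:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of N(mu, sigma^2) at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# The density peaks at the mean and falls off symmetrically:
print(normal_pdf(0.0))  # standard Normal at the mean: ~0.3989
print(normal_pdf(1.0))  # one standard deviation away: ~0.2420
```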
The mean is calculated by
\mu=E[X]=\sum_i x_i\,p(x_i)
or in other words, each possible value x_i is weighted by its likelihood p(x_i) and added up.
Exercise: do this for the double dice experiment. Calculate the probability of each outcome from the table above, multiply it by its value, and add them all up.
The variance is calculated as follows
\sigma^2=\sum_i (x_i-\mu)^2\,p(x_i)
or in other words, we calculate the deviation of each variate from the mean, square it, and weight it by its likelihood. Although it is tantalizing to perform this calculation also for the double dice experiment, the resulting value is questionable, as the double dice experiment does not follow a Normal distribution. We know this because we actually enumerated all possible outcomes. For other experiments, such as grades in this class, we don’t know what the real distribution is.
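The exercise above can be checked numerically. A short sketch in plain Python that applies both sums to the double dice experiment (as noted, the variance comes out regardless, but interpreting it as the parameter of a Normal distribution would be questionable):

```python
from itertools import product

# All 36 equally likely sums of two dice.
outcomes = [a + b for a, b in product(range(1, 7), repeat=2)]

mean = sum(outcomes) / len(outcomes)
variance = sum((x - mean) ** 2 for x in outcomes) / len(outcomes)

print(mean)      # 7.0
print(variance)  # 35/6, roughly 5.833
```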
The Normal distribution is not limited to random processes with only one random variable. For example, the x/y position of a robot in the plane is a random process with two dimensions. In the case of a multivariate distribution with k dimensions, the random variable X is a k-dimensional vector of random variables, \mu is a k-dimensional vector of means, and \sigma^2 gets replaced by \Sigma, a k-by-k covariance matrix (a matrix that carries the variances of the individual random variables on its diagonal).

Error Propagation

We will see that the Gaussian distribution is actually very appropriate for modeling the prominent random processes in robotics: the robot’s position and distance measurements. Readings from a laser scanner or GPS are more often than not close to the “real” value; in other words, readings that are far off are much less likely than readings that are close. A robot that drives along a straight line and is subject to slip will actually increase its uncertainty the farther it drives. Starting from a known location, the expected value (or mean) of its position becomes increasingly uncertain, corresponding to an increasing variance. This variance is obviously somehow related to the variance of the underlying mechanisms, namely the slipping wheels and the (comparably small) encoder noise. Similarly, when estimating distance and angle to a line feature, the uncertainty of these random variables is somewhat related to the uncertainty of each point measured on the line. These relationships are formally captured by the error propagation law.
The key intuition behind the error propagation law is that the variance of each component that contributes to a random variable should be weighted as a function of how strongly this component influences that random variable. Measurements that have little effect on the aggregated random variable should also have little effect on its variance, and vice versa. How strongly something affects something else can be expressed by the ratio of how small changes in the one relate to small changes in the other. This is nothing else than the partial derivative of the one with respect to the other.

Line Fitting

Let’s consider the example of estimating the angle \alpha and distance r of a line from a set of points given by (\rho_i,\theta_i). We can express how changes in a variable such as \rho_i relate to changes in \alpha by
\frac{\partial \alpha}{\partial \rho_i}
Similarly, we can calculate \frac{\partial \alpha}{\partial \theta_i}, \frac{\partial r}{\partial \rho_i} and \frac{\partial r}{\partial \theta_i}. We can actually do this because we derived analytical expressions for \alpha and r as functions of \theta_i and \rho_i just last week.
We are now interested in deriving equations for calculating the variances of \alpha and r as a function of the variances of the distance measurements. Let’s assume each distance measurement \rho_i has variance \sigma^2_{\rho_i} and each angular measurement \theta_i has variance \sigma^2_{\theta_i}. We now want to calculate \sigma^2_{\alpha} as the weighted sum of \sigma^2_{\rho_i} and \sigma^2_{\theta_i}, each weighted by its influence on \alpha.
More generally, if we have I input variables X_i and J output variables Y_j, the covariance matrix of the output variables C_Y can be expressed as
C_Y=F_X C_X F_X^T
where C_X is the covariance matrix of the input variables and F_X is the Jacobian matrix of the function f that calculates Y from X and has the form

F_X=\left[\begin{array}{ccc}\frac{\partial f_1}{\partial X_1} & \ldots & \frac{\partial f_1}{\partial X_I}\\\vdots & \ddots & \vdots\\\frac{\partial f_J}{\partial X_1} & \ldots & \frac{\partial f_J}{\partial X_I}\end{array}\right]

In the line fitting example, F_X would contain the partial derivatives of \alpha with respect to all \rho_i (I entries) followed by the partial derivatives of \alpha with respect to all \theta_i in the first row. In the second row, F_X would hold the partial derivatives of r with respect to all \rho_i followed by the partial derivatives of r with respect to all \theta_i. As there are two output variables, \alpha and r, and 2I input variables (each measurement consists of an angle and a distance), F_X is a 2 x (2I) matrix.

The result is therefore a 2×2 covariance matrix that holds the variances of \alpha and r on its diagonal.
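As a concrete instance of the error propagation law, consider propagating the uncertainty of a single polar measurement (\rho, \theta) into Cartesian coordinates. This is a sketch using NumPy; the measurement values and the input variances are made up for illustration:

```python
import numpy as np

rho, theta = 2.0, np.pi / 4     # measured distance and angle (made up)
C_X = np.diag([0.01, 0.004])    # assumed variances of rho and theta

# Jacobian of (x, y) = (rho*cos(theta), rho*sin(theta))
# with respect to (rho, theta).
F = np.array([
    [np.cos(theta), -rho * np.sin(theta)],
    [np.sin(theta),  rho * np.cos(theta)],
])

# Error propagation law: C_Y = F C_X F^T
C_Y = F @ C_X @ F.T
print(C_Y)  # 2x2 covariance of the Cartesian position
```

Note how the angular variance enters scaled by \rho^2: the farther away the measured point, the more a small angular error matters.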


We are now interested in applying error propagation to a robot’s odometry model. With that we can not only express its position, but also the variance of this estimate. This becomes extremely useful when reasoning about the robot’s next action. What do we need?

  1. What are the input variables and what are the output variables?
  2. What are the functions that calculate output from input?
  3. What are the variances of the input variables?
As usual, we describe the robot’s pose by a tuple (x,y,\theta). These are the three output variables. We can measure the distance each wheel travels, \Delta s_r and \Delta s_l, based on the encoder ticks and the known wheel radius. These are the two input variables. We can now calculate the change in the robot’s position by calculating
\Delta s=\frac{\Delta s_r + \Delta s_l}{2}
\Delta \theta = \frac{\Delta s_r-\Delta s_l}{b}
\Delta x = \Delta s \cos(\theta+\Delta \theta /2)
\Delta y= \Delta s \sin(\theta+\Delta \theta/2)
Here, b is the distance between the two wheels (the wheelbase).
The new robot’s position is then given by
f(x,y,\theta,\Delta s_r, \Delta s_l)=[x,y,\theta]^T + [\Delta x \qquad \Delta y \qquad \Delta \theta]^T
We now have a function that relates our measurements to our output variables. What complicates things here is that the output variables are a function of their previous values. Their variance therefore depends not only on the variance of the input variables, but also on the previous variance of the output variables. We can therefore write
\Sigma_{p'}=\nabla_p f \,\Sigma_p\, \nabla_p f^T + \nabla_{\Delta_{r,l}}f \,\Sigma_{\Delta}\,\nabla_{\Delta_{r,l}}f^T
The first term is the error propagation from a position p=[x,y,\theta] to a new position p'. For this we need to calculate the partial derivatives of f with respect to x, y and \theta. This is a 3×3 matrix
\nabla_p f=\left[\frac{\partial f}{\partial x} \quad \frac{\partial f}{\partial y} \quad \frac{\partial f}{\partial \theta}\right]=\left[\begin{array}{ccc}1 & 0 & -\Delta s \sin(\theta +\Delta \theta /2)\\0 & 1 & \Delta s \cos(\theta + \Delta \theta/2)\\0 & 0 & 1\end{array}\right]
The second term is the error propagation of the actual wheel slip. This requires calculating the partial derivatives of f with respect to \Delta s_r and \Delta s_l, which is a 3×2 matrix. The first column contains the partial derivatives of x,y,\theta with respect to \Delta s_r. The second column contains the partial derivatives of x,y,\theta with respect to \Delta s_l.
Finally, we need to define the covariance matrix for the measurement noise. As the error is proportional to the distance travelled, we can define \Sigma_{\Delta} by
\Sigma_{\Delta}=\left[\begin{array}{cc}k_r|\Delta s_r| & 0\\0 & k_l|\Delta s_l|\end{array}\right]
Here, k_r and k_l are constants that need to be determined experimentally, and |\cdot| denotes the absolute value of the distance traveled. We also assume that the errors of the two wheels are independent, which is expressed by the zeros in the matrix.
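Putting the pieces together, one full covariance update step might look like the following sketch (NumPy; the wheelbase b and the constants k_r and k_l are made-up values that would have to be identified experimentally):

```python
import numpy as np

def propagate_pose_covariance(pose, Sigma_p, ds_r, ds_l, b=0.1, k_r=0.01, k_l=0.01):
    """One odometry step: returns the new pose and its covariance."""
    x, y, theta = pose
    ds = (ds_r + ds_l) / 2.0
    dtheta = (ds_r - ds_l) / b

    # New pose from the odometry model.
    new_pose = np.array([
        x + ds * np.cos(theta + dtheta / 2.0),
        y + ds * np.sin(theta + dtheta / 2.0),
        theta + dtheta,
    ])

    # 3x3 Jacobian of f with respect to the previous pose (x, y, theta).
    Fp = np.array([
        [1.0, 0.0, -ds * np.sin(theta + dtheta / 2.0)],
        [0.0, 1.0,  ds * np.cos(theta + dtheta / 2.0)],
        [0.0, 0.0,  1.0],
    ])

    # 3x2 Jacobian of f with respect to (ds_r, ds_l), obtained by
    # differentiating through ds and dtheta.
    c = np.cos(theta + dtheta / 2.0)
    s = np.sin(theta + dtheta / 2.0)
    Fd = np.array([
        [c / 2.0 - ds * s / (2.0 * b), c / 2.0 + ds * s / (2.0 * b)],
        [s / 2.0 + ds * c / (2.0 * b), s / 2.0 - ds * c / (2.0 * b)],
        [1.0 / b,                      -1.0 / b],
    ])

    # Wheel-slip covariance grows with the distance each wheel travels.
    Sigma_d = np.diag([k_r * abs(ds_r), k_l * abs(ds_l)])

    Sigma_new = Fp @ Sigma_p @ Fp.T + Fd @ Sigma_d @ Fd.T
    return new_pose, Sigma_new
```

Calling this repeatedly with equal wheel distances drives the robot straight while its position variance grows with every step, matching the intuition above.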

Take-home lessons

  • Uncertainty can be expressed by means of a probability density function
  • More often than not, the Gaussian distribution is chosen as it allows treating error with powerful analytical tools
  • In order to calculate the uncertainty of a variable that is derived from a series of measurements, we need to calculate a weighted sum in which each measurement’s variance is weighted by its impact on the output variable. This impact is expressed by the partial derivative of the function relating input to output.
