Robots are systems that combine sensing, actuation, computation, and communication. Except for computation, all of these sub-systems are subject to a high degree of uncertainty. This can be observed in daily life: phone calls are often of poor quality, making it hard to understand the other party; characters are difficult to read from far away; the front wheels of your car slip when accelerating from a red light on a rainy road; or your wireless device has a hard time getting a connection. In robotics, measurements taken by on-board sensors are sensitive to changing environmental conditions and subject to electrical and mechanical limitations. Similarly, actuators are not accurate, as joints and gears have backlash and wheels slip. Finally, communication, in particular wireless communication via radio or infrared, is notoriously unreliable.
The goals of this lecture are to understand
- how to treat uncertainty mathematically using probability theory
- how measurements with different uncertainty can be combined
A brief review on probability theory
As quantities such as “distance to a wall”, “position on the plane” or “I can see a blue cross” are uncertain, we can consider them random variables. A random variable can be thought of as the outcome of a “random” experiment, such as throwing a die. (In this example, the random variable holds the outcome of the experiment.) Random variables can describe either discrete variables, such as the result of throwing a die, or continuous variables, such as a measured distance. In order to learn about the likelihood that a random variable has a certain outcome, we can repeat the experiment many times and record the resulting random variates, that is, the actual values of the random variable, and the number of times they occurred. For a perfectly cubic die we will see that the random variable can hold natural numbers from 1 to 6, each with the same likelihood of 1/6.
The function that describes the probability of a random variable to take certain values is called a probability distribution.
As the likelihood of all possible random variates in the die experiment is the same, the die follows what we call a uniform distribution. More accurately, as the outcomes of rolling a die are discrete numbers, it is actually a discrete uniform distribution. Most random variables are not uniformly distributed; instead, some variates are more likely than others. For example, when considering a random variable that describes the sum of two simultaneously thrown dice, we can see that the distribution is anything but uniform:
2: 1+1
3: 1+2, 2+1
4: 1+3, 2+2, 3+1
5: 1+4, 2+3, 3+2, 4+1
6: 1+5, 2+4, 3+3, 4+2, 5+1
7: 1+6, 2+5, 3+4, 4+3, 5+2, 6+1
8: 2+6, 3+5, 4+4, 5+3, 6+2
9: 3+6, 4+5, 5+4, 6+3
10: 4+6, 5+5, 6+4
11: 5+6, 6+5
12: 6+6
As one can see, there are many more ways to sum up to a 7 than there are to sum up to a 3, for example. While it is possible to store probability distributions such as this one as a look-up table to predict the outcome of an experiment (or that of a measurement), it is hard to reason about the random process analytically. We therefore aim to describe the probability distributions of the random processes that we need to deal with using one of the “standard” probability distribution functions (or, more accurately, probability density functions). These functions have been well studied and come with a broad set of tools for their analysis. Consequently, more often than not we use a well-studied probability density function even if the underlying data doesn’t fully support its use, for the simple reason that we can work with its associated mathematical tools.
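The look-up table for the two-dice experiment can also be generated by enumeration. The following is a minimal sketch; the function name `two_dice_distribution` is chosen here for illustration and is not part of the lecture.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution():
    """Return {sum: probability} for two fair six-sided dice."""
    # Enumerate all 36 equally likely ordered pairs and count each sum.
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    total = sum(counts.values())
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


dist = two_dice_distribution()
for value, prob in dist.items():
    print(value, prob)
```

Exact fractions are used instead of floats so that the probabilities can be compared directly with the counts in the table.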
One of the most prominent distributions is the Gaussian or Normal Distribution. The Normal distribution is characterized by a mean and a variance. Here, the mean corresponds to the average value of a random variable (or the peak of the distribution) and the variance is a measure of how broadly variates are spread around the mean (or the width of the distribution).
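The Normal distribution is defined by the following function
$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
where $\mu$ is the mean and $\sigma^2$ the variance. ($\sigma$ on its own is known as the standard deviation.) Then, $p(x)$ is the probability density for a random variable $X$ to have value $x$.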
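As a quick sanity check, the standard Normal density with mean `mu` and variance `sigma2` can be evaluated numerically. This is a minimal sketch; the function name `normal_pdf` and the integration range are illustrative choices.

```python
import math


def normal_pdf(x, mu, sigma2):
    """Density of a Normal distribution with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


# The density peaks at the mean and integrates to (approximately) one.
peak = normal_pdf(0.0, mu=0.0, sigma2=1.0)
step = 0.001
area = sum(normal_pdf(-8.0 + i * step, 0.0, 1.0) * step for i in range(16000))
print(peak, area)
```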
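The mean is calculated by
$$\mu = \int_{-\infty}^{\infty} x\,p(x)\,dx$$
or in other words, each possible value is weighted by its likelihood and added up. (For a discrete random variable, the integral becomes a sum over all possible values.)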
Exercise: do this for the double dice experiment. Calculate the probability of each outcome from the table above, multiply it with its value and add them all up.
The variance is calculated as follows
$$\sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2\,p(x)\,dx$$
or in other words, we calculate the deviation of each possible value from the mean, square it, and weigh it by its likelihood. Although it is tempting to perform this calculation for the double dice experiment as well, the resulting value is questionable, as the double dice experiment does not follow a Normal distribution. We know this because we actually enumerated all possible outcomes. For other experiments, such as grades in this class, we don’t know what the real distribution is.
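The weighted sums for mean and variance can be carried out for the double dice experiment in a few lines. This sketch uses the discrete form (sums instead of integrals). The numbers describe the enumerated distribution itself; whether a Normal distribution with these parameters models it well is a separate question.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of throwing two fair dice.
outcomes = [a + b for a, b in product(range(1, 7), repeat=2)]
p = Fraction(1, len(outcomes))

# Mean: each value weighted by its likelihood, then added up.
mean = sum(x * p for x in outcomes)
# Variance: squared deviation from the mean, weighted by likelihood.
variance = sum((x - mean) ** 2 * p for x in outcomes)

print(mean, variance)  # 7 35/6
```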
The Normal Distribution is not limited to random processes with only one random variable. For example, the X/Y position of a robot in the plane is a random process with two dimensions. In the case of a multi-variate distribution with $k$ dimensions, the random variable $X$ is a $k$-dimensional vector of random variables, $\mu$ is a $k$-dimensional vector of means, and $\sigma^2$ gets replaced with $\Sigma$, a $k \times k$ covariance matrix (a matrix that carries the variances of each random variable on its diagonal).
We will see that the Gaussian Distribution is actually very appropriate to model the prominent random processes in robotics: the robot’s position and distance measurements. Results of a laser-scanner or GPS are more often than not close to the “real” value, or, in other words, readings that are far off are much less likely than readings that are very close. A robot that drives along a straight line, and is subject to slip, will actually increase its uncertainty the farther it drives. Initially at a known location, the expected value (or mean) of its position will be increasingly uncertain, corresponding to an increasing variance. This variance is obviously somehow related to the variance of the underlying mechanism, namely the slipping wheel and (comparably small) encoder noise. Similarly, when estimating distance and angle to a line feature, uncertainty of these random variables is somewhat related to the uncertainty of each point measured on the line. These relationships are formally captured by the error propagation law.
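The growth of position uncertainty with distance can be illustrated with a small Monte Carlo simulation. This is only a sketch under simplifying assumptions that are not from the lecture: a one-dimensional robot, a fixed step length, and independent zero-mean Gaussian slip per step. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_drive(steps, step_len=0.1, slip_std=0.01, runs=2000, seed=0):
    """Return (mean, variance) of the final position over many simulated runs."""
    rng = random.Random(seed)
    finals = []
    for _ in range(runs):
        x = 0.0  # the robot starts at a known location
        for _ in range(steps):
            # Each step adds the commanded distance plus random slip.
            x += step_len + rng.gauss(0.0, slip_std)
        finals.append(x)
    return statistics.mean(finals), statistics.variance(finals)


mean_short, var_short = simulate_drive(steps=10)
mean_long, var_long = simulate_drive(steps=100)
print(mean_short, var_short)
print(mean_long, var_long)
```

Under these assumptions the variance of the final position grows roughly in proportion to the number of steps, which matches the intuition that the robot becomes less certain the farther it drives.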
The key intuition behind the error propagation law is that the variance of each component that contributes to a random variable should be weighted as a function of how strongly this component influences this random variable. Measurements that have little effect on the aggregated random variable should also have little effect on its variance and vice versa. “How strongly” something affects something else can be expressed by the ratio of how small changes in one quantity relate to small changes in the other. This is nothing other than the partial derivative of one quantity with respect to the other.
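Let’s consider an example of estimating angle $\alpha$ and distance $r$ of a line from a set of points given by $(\rho_i, \theta_i)$. We can now express the relationship of changes of a variable such as $\rho_i$ to changes in $\alpha$ by
$$\frac{\partial \alpha}{\partial \rho_i}$$
Similarly, we can calculate $\frac{\partial \alpha}{\partial \theta_i}$, $\frac{\partial r}{\partial \rho_i}$ and $\frac{\partial r}{\partial \theta_i}$. We can actually do this, because we have derived analytical expressions for $\alpha$ and $r$ as a function of $\theta_i$ and $\rho_i$ just last week.
We are now interested in deriving equations for calculating the variance of $\alpha$ and $r$ as a function of the variances of the distance measurements. Let’s assume each distance measurement $\rho_i$ has variance $\sigma_{\rho_i}^2$ and each angular measurement $\theta_i$ has variance $\sigma_{\theta_i}^2$. We now want to calculate $\sigma_\alpha^2$ as the weighted sum of $\sigma_{\rho_i}^2$ and $\sigma_{\theta_i}^2$, each weighted by its influence on $\alpha$.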
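Written out, and assuming that the individual measurements are independent of each other (an assumption added here for illustration), the weighted sum for the variance of the line angle $\alpha$ given points $(\rho_i, \theta_i)$ takes the following form. The partial derivatives appear squared because scaling a random variable by a factor scales its variance by the square of that factor.

```latex
\sigma_\alpha^2 = \sum_{i=1}^{I} \left(\frac{\partial \alpha}{\partial \rho_i}\right)^2 \sigma_{\rho_i}^2
                + \sum_{i=1}^{I} \left(\frac{\partial \alpha}{\partial \theta_i}\right)^2 \sigma_{\theta_i}^2
```

The same expression with $r$ in place of $\alpha$ gives the variance of the line distance.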
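More generally, if we have $I$ input variables $X_i$ and $K$ output variables $Y_j$, the covariance matrix of the output variables $\Sigma_Y$ can be expressed as
$$\Sigma_Y = J\,\Sigma_X\,J^T$$
where $\Sigma_X$ is the covariance matrix of input variables and $J$ is a Jacobian matrix of a function $f$ that calculates $Y$ from $X$ and has the form
$$J = \begin{bmatrix} \frac{\partial f_1}{\partial X_1} & \cdots & \frac{\partial f_1}{\partial X_I} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_K}{\partial X_1} & \cdots & \frac{\partial f_K}{\partial X_I} \end{bmatrix}$$
In the line fitting example, the first row of $J$ would contain the partial derivatives of $\alpha$ with respect to all $\rho_i$ ($I$ entries) followed by the partial derivatives of $\alpha$ with respect to all $\theta_i$. The second row of $J$ would hold the partial derivatives of $r$ with respect to all $\rho_i$ followed by the partial derivatives of $r$ with respect to all $\theta_i$. As there are two output variables, $\alpha$ and $r$, and $2I$ input variables (each measurement consists of an angle and a distance), $J$ is a $2 \times 2I$ matrix.
The result is therefore a 2×2 covariance matrix that holds the variances of $\alpha$ and $r$ on its diagonal.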
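The general error propagation law can be sketched in code with a numerically estimated Jacobian. Because the line fitting equations are not repeated in this lecture, the example below substitutes a simpler function, the conversion of a single polar measurement (distance, angle) to Cartesian coordinates. All function names and noise values are illustrative assumptions.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def jacobian(f, x, eps=1e-6):
    """Estimate the Jacobian of f at x by central differences."""
    n_out = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        y_plus, y_minus = f(x_plus), f(x_minus)
        for i in range(n_out):
            jac[i][j] = (y_plus[i] - y_minus[i]) / (2.0 * eps)
    return jac


def propagate(f, x, cov_x):
    """Error propagation law: covariance of f(x) is J * cov_x * J^T."""
    jac = jacobian(f, x)
    return matmul(matmul(jac, cov_x), transpose(jac))


def polar_to_cartesian(v):
    rho, theta = v
    return [rho * math.cos(theta), rho * math.sin(theta)]


# Hypothetical sensor noise: 1 cm std in distance, 0.02 rad std in angle.
cov_in = [[0.01 ** 2, 0.0],
          [0.0, 0.02 ** 2]]
cov_out = propagate(polar_to_cartesian, [2.0, 0.0], cov_in)
print(cov_out)
```

At a distance of 2 m and an angle of zero, the angular noise is scaled by the distance before it shows up as uncertainty in the y direction, which is the weighting by influence described above.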
We are now interested in applying error propagation to a robot’s odometry model. With that we can express not only its position, but also the variance of this estimate. This becomes extremely useful when reasoning about the robot’s next action. What do we need?
- What are the input variables and what are the output variables?
- What are the functions that calculate output from input?
- What is the variance of the input variables?
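As usual, we describe the robot by a tuple $(x, y, \theta)$. These are the output variables (3). We can measure the distance each wheel travels, $\Delta s_r$ and $\Delta s_l$, based on the encoder ticks and the known wheel radius. These are the input variables (2). We can now calculate the change in the robot’s position by calculating
$$\Delta x = \Delta s \cos(\theta + \Delta\theta/2)$$
$$\Delta y = \Delta s \sin(\theta + \Delta\theta/2)$$
$$\Delta\theta = \frac{\Delta s_r - \Delta s_l}{b}$$
with $\Delta s = \frac{\Delta s_r + \Delta s_l}{2}$ and $b$ the distance between the two wheels.
The new robot’s position is then given by
$$f(x, y, \theta, \Delta s_r, \Delta s_l) = [x, y, \theta]^T + [\Delta x, \Delta y, \Delta\theta]^T$$
We thus now have a function that relates our measurements to our output variables. What makes things complicated here is that the output variables are a function of their previous values. Therefore, their variance depends not only on the variance of the input variables, but also on the previous variance of the output variables. We can therefore write
$$\Sigma_{p'} = \nabla_p f\,\Sigma_p\,\nabla_p f^T + \nabla_{\Delta_{r,l}} f\,\Sigma_\Delta\,\nabla_{\Delta_{r,l}} f^T$$
The first term is the error propagation from a position $p = [x, y, \theta]^T$ to a new position $p'$. For this we need to calculate the partial derivatives of $f$ with respect to $x$, $y$ and $\theta$. This is a 3×3 matrix
$$\nabla_p f = \begin{bmatrix} 1 & 0 & -\Delta s \sin(\theta + \Delta\theta/2) \\ 0 & 1 & \Delta s \cos(\theta + \Delta\theta/2) \\ 0 & 0 & 1 \end{bmatrix}$$
The second term is the error propagation of the actual wheel slip. This requires calculating the partial derivatives of $f$ with respect to $\Delta s_r$ and $\Delta s_l$, which is a 3×2 matrix $\nabla_{\Delta_{r,l}} f$. The first column contains the partial derivatives of $x$, $y$ and $\theta$ with respect to $\Delta s_r$. The second column contains the partial derivatives of $x$, $y$ and $\theta$ with respect to $\Delta s_l$.
Finally, we need to define the covariance matrix for the measurement noise. As the error is proportional to the distance travelled, we can define $\Sigma_\Delta$ by
$$\Sigma_\Delta = \begin{bmatrix} k_r|\Delta s_r| & 0 \\ 0 & k_l|\Delta s_l| \end{bmatrix}$$
Here $k_r$ and $k_l$ are constants that need to be found experimentally and $|\cdot|$ indicates the absolute value of the distance traveled. We also assume that the error of the two wheels is independent, which is expressed by the zeros in the matrix.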
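The odometry covariance update can be sketched as follows, assuming the usual differential-drive model: the forward distance is the average of the two wheel distances, and the heading change is their difference divided by the wheel distance `b`. The Jacobians are derived by hand from that model. The function name `odometry_step` and the values chosen for `b`, `k_r` and `k_l` are hypothetical; in practice the constants would have to be determined experimentally.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def odometry_step(pose, cov, ds_r, ds_l, b, k_r, k_l):
    """Advance pose (x, y, theta) and its 3x3 covariance by one odometry step."""
    x, y, theta = pose
    ds = (ds_r + ds_l) / 2.0      # distance travelled by the robot centre
    dtheta = (ds_r - ds_l) / b    # change in heading
    phi = theta + dtheta / 2.0
    c, s = math.cos(phi), math.sin(phi)
    new_pose = (x + ds * c, y + ds * s, theta + dtheta)

    # Jacobian of f with respect to the previous pose (3x3).
    f_p = [[1.0, 0.0, -ds * s],
           [0.0, 1.0, ds * c],
           [0.0, 0.0, 1.0]]
    # Jacobian of f with respect to the wheel distances (3x2).
    f_d = [[0.5 * c - ds * s / (2 * b), 0.5 * c + ds * s / (2 * b)],
           [0.5 * s + ds * c / (2 * b), 0.5 * s - ds * c / (2 * b)],
           [1.0 / b, -1.0 / b]]
    # Wheel noise grows with the distance travelled; wheels are independent.
    cov_d = [[k_r * abs(ds_r), 0.0],
             [0.0, k_l * abs(ds_l)]]

    new_cov = add(matmul(matmul(f_p, cov), transpose(f_p)),
                  matmul(matmul(f_d, cov_d), transpose(f_d)))
    return new_pose, new_cov


pose = (0.0, 0.0, 0.0)
cov = [[0.0] * 3 for _ in range(3)]  # start at a perfectly known location
for _ in range(2):
    pose, cov = odometry_step(pose, cov, ds_r=1.0, ds_l=1.0,
                              b=0.5, k_r=0.01, k_l=0.01)
    print(pose, [round(cov[i][i], 4) for i in range(3)])
```

Even when driving straight, the variance perpendicular to the direction of travel grows faster than the variance along it, because heading uncertainty from earlier steps is carried forward through the first term.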
Take-home lessons
- Uncertainty can be expressed by means of a probability density function
- More often than not, the Gaussian distribution is chosen as it allows treating error with powerful analytical tools
- In order to calculate the uncertainty of a variable that is derived from a series of measurements, we need to calculate a weighted sum in which each measurement’s variance is weighted by its impact on the output variable. This impact is expressed by the partial derivative of the function relating input to output.