Robots are systems that combine sensing, actuation, computation, and communication. Except for computation, all of these sub-systems are subject to a high degree of uncertainty. This can be observed in daily life: phone calls are often of poor quality, making it hard to understand the other party; characters are difficult to read from far away; the front wheels of your car slip when accelerating on a rainy road from a red light; or your wireless device has a hard time getting a connection. In robotics, measurements taken by on-board sensors are sensitive to changing environmental conditions and subject to electrical and mechanical limitations. Similarly, actuators are not accurate, as joints and gears have backlash and wheels slip. Finally, communication, in particular wireless communication via radio or infrared, is notoriously unreliable.
The goals of this lecture are
- to understand how to treat uncertainty mathematically using probability theory
- to introduce how measurements with different uncertainty can be combined
A brief review of probability theory
As quantities such as “distance to a wall”, “position on the plane” or “I can see a blue cross” are uncertain, we can consider them random variables. A random variable can be thought of as the outcome of a “random” experiment, such as throwing a die. (In this example, the random variable holds the outcome of the experiment.) Random variables can describe either discrete quantities, such as the result of throwing a die, or continuous quantities, such as a measured distance. In order to learn about the likelihood that a random variable has a certain outcome, we can repeat the experiment many times and record the resulting random variates, that is, the actual values of the random variable, and the number of times they occurred. For a perfectly cubic die we will see that the random variable can take the natural numbers from 1 to 6, each with the same likelihood of 1/6.
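This can be checked empirically. The following sketch (plain Python; the seed and sample count are arbitrary choices) simulates repeated throws of a fair die and records the relative frequency of each random variate:

```python
import random
from collections import Counter

random.seed(0)          # fixed seed so the experiment is repeatable
N = 60_000              # number of times we repeat the experiment

# Each call to randint(1, 6) produces one random variate of the die.
counts = Counter(random.randint(1, 6) for _ in range(N))

for face in range(1, 7):
    # Relative frequency of each face; all close to 1/6 for a fair die
    print(face, counts[face] / N)
```

The more often the experiment is repeated, the closer the relative frequencies approach the likelihood of 1/6.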
The function that describes the probability of a random variable taking certain values is called a probability distribution. As the likelihood of all possible random variates in the die experiment is the same, the die follows what we call a uniform distribution. More accurately, as the outcomes of rolling a die are discrete numbers, it is actually a discrete uniform distribution. Most random variables are not uniformly distributed; rather, some variates are more likely than others. For example, when considering a random variable that describes the sum of two simultaneously thrown dice, we can see that the distribution is anything but uniform:
2: 1+1
3: 1+2, 2+1
4: 1+3, 2+2, 3+1
5: 1+4, 2+3, 3+2, 4+1
6: 1+5, 2+4, 3+3, 4+2, 5+1
7: 1+6, 2+5, 3+4, 4+3, 5+2, 6+1
8: 2+6, 3+5, 4+4, 5+3, 6+2
9: 3+6, 4+5, 5+4, 6+3
10: 4+6, 5+5, 6+4
11: 5+6, 6+5
12: 6+6
As one can see, there are many more combinations that sum to 7 than, e.g., to 3. While it is possible to store a probability distribution such as this one as a look-up table to predict the outcome of an experiment (or that of a measurement), it is hard to reason about the random process analytically. We therefore aim to describe the probability distributions of the random processes we need to deal with using one of the “standard” probability distribution functions (or, more accurately, probability density functions). These functions have been well studied and come with a broad set of tools for their analysis. Consequently, more often than not we use a well-studied probability density function even if the underlying data doesn’t fully support its use, for the simple reason that we can work with its associated mathematical tools.
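Such a look-up table can be generated by exhaustive enumeration. The sketch below (plain Python) counts all 36 equally likely outcomes of the two-dice experiment and prints the exact probability of each sum:

```python
from collections import Counter
from fractions import Fraction

# Enumerate all 36 equally likely outcomes of throwing two dice
sums = Counter(a + b for a in range(1, 7) for b in range(1, 7))

for s in range(2, 13):
    p = Fraction(sums[s], 36)      # exact probability of each sum
    print(s, sums[s], p)           # e.g. 7 occurs 6 times, p = 1/6
```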
One of the most prominent distributions is the Gaussian or Normal Distribution. The Normal distribution is characterized by a mean and a variance. Here, the mean corresponds to the average value of a random variable (or the peak of the distribution) and the variance is a measure of how broadly variates are spread around the mean (or the width of the distribution).
Figure: Normal distributions with different means and variances.
The Normal distribution is defined by the following function

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

where $\mu$ is the mean and $\sigma^2$ the variance. ($\sigma$ on its own is known as the standard deviation.) $f(x)$ is the probability for a random variable $X$ to have the value $x$.

The mean is calculated by

$$\mu = \sum_x x\, P(X=x)$$

or, in other words, each possible value $x$ is weighted by its likelihood $P(X=x)$ and added up.
Exercise: do this for the double dice experiment. Calculate the probability of each outcome from the table above, multiply it with its value and add them all up.
The variance is calculated as follows

$$\sigma^2 = \sum_x (x-\mu)^2\, P(X=x)$$

or, in other words, we calculate the deviation of each random variate from the mean, square it, and weigh it by its likelihood. Although it is tantalizing to perform this calculation also for the double dice experiment, the resulting value is questionable, as the double dice experiment does not follow a Normal distribution. We know this because we actually enumerated all possible outcomes. For other experiments, such as grades in this class, we don’t know what the real distribution is.
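The mean and variance of the two-dice distribution can be computed directly from the enumerated outcomes; the sketch below uses exact fractions to avoid rounding, and confirms the result of the exercise above:

```python
from collections import Counter
from fractions import Fraction

# Probability distribution of the sum of two dice (from the table above)
sums = Counter(a + b for a in range(1, 7) for b in range(1, 7))
p = {s: Fraction(n, 36) for s, n in sums.items()}

# Mean: each value weighted by its likelihood, summed up
mean = sum(s * p_s for s, p_s in p.items())

# Variance: squared deviation from the mean, weighted by likelihood
var = sum((s - mean) ** 2 * p_s for s, p_s in p.items())

print(mean)  # 7
print(var)   # 35/6
```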
The Normal Distribution is not limited to random processes with only one random variable. For example, the X/Y position of a robot in the plane is a random process with two dimensions. In the case of a multi-variate distribution with $k$ dimensions, the random variable $X$ is a $k$-dimensional vector of random variables, the mean $\mu$ is a $k$-dimensional vector of means, and the variance $\sigma^2$ gets replaced with $\Sigma$, a $k \times k$ covariance matrix (a matrix that carries the variances of each random variable on its diagonal).
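For intuition, here is a toy two-dimensional example (my own construction, not from the lecture): we sample a robot’s X/Y position from two independent Normal distributions and estimate the 2×2 covariance matrix from the variates. The off-diagonal entries come out near zero because x and y are independent here:

```python
import random

random.seed(2)

# Draw N samples of a 2D position; x and y are independent in this toy case
N = 100_000
xs = [random.gauss(1.0, 0.3) for _ in range(N)]   # std 0.3 -> variance 0.09
ys = [random.gauss(2.0, 0.1) for _ in range(N)]   # std 0.1 -> variance 0.01

mx = sum(xs) / N
my = sum(ys) / N

# Sample covariance matrix: variances on the diagonal,
# cross-covariance off the diagonal
cxx = sum((x - mx) ** 2 for x in xs) / N
cyy = sum((y - my) ** 2 for y in ys) / N
cxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / N

print([[cxx, cxy], [cxy, cyy]])   # close to [[0.09, 0], [0, 0.01]]
```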
We will see that the Gaussian Distribution is actually very appropriate to model the prominent random processes in robotics: the robot’s position and distance measurements. Results of a laser-scanner or GPS are more often than not close to the “real” value, or, in other words, readings that are far off are much less likely than readings that are very close. A robot that drives along a straight line, and is subject to slip, will actually increase its uncertainty the farther it drives. Initially at a known location, the expected value (or mean) of its position will be increasingly uncertain, corresponding to an increasing variance. This variance is obviously somehow related to the variance of the underlying mechanism, namely the slipping wheel and (comparably small) encoder noise. Similarly, when estimating distance and angle to a line feature, uncertainty of these random variables is somewhat related to the uncertainty of each point measured on the line. These relationships are formally captured by the error propagation law.
The key intuition behind the error propagation law is that the variance of each component that contributes to a random variable should be weighted as a function of how strongly this component influences this random variable. Measurements that have little effect on the aggregated random variable should also have little effect on its variance, and vice versa. “How strongly” something affects something else can be expressed by the ratio of how small changes in one quantity relate to small changes in the other. This is nothing else than the partial derivative of the one quantity with respect to the other.
Let’s consider the example of estimating the angle $\alpha$ of a line from a set of points given by $(\rho_i, \theta_i)$. We can now express the relationship of changes of a variable such as $\rho_i$ to changes in $\alpha$ by the partial derivative $\frac{\partial \alpha}{\partial \rho_i}$. Similarly, we can calculate $\frac{\partial \alpha}{\partial \theta_i}$. We can actually do this, because we have derived analytical expressions for $\alpha$ as a function of $\rho_i$ and $\theta_i$ just last week.
We are now interested in deriving equations for calculating the variance of $\alpha$ as a function of the variances of the distance measurements. Let’s assume each distance measurement $\rho_i$ has variance $\sigma_{\rho_i}^2$ and each angular measurement $\theta_i$ has variance $\sigma_{\theta_i}^2$. We now want to calculate $\sigma_\alpha^2$ as the weighted sum of the $\sigma_{\rho_i}^2$ and $\sigma_{\theta_i}^2$, each weighted by its influence on $\alpha$:

$$\sigma_\alpha^2 = \sum_{i=1}^{I} \left(\frac{\partial \alpha}{\partial \rho_i}\right)^2 \sigma_{\rho_i}^2 + \sum_{i=1}^{I} \left(\frac{\partial \alpha}{\partial \theta_i}\right)^2 \sigma_{\theta_i}^2$$
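This weighted-sum rule can be verified numerically. The sketch below uses a made-up function of two noisy inputs (a stand-in, not the line-fit equations) and compares the first-order prediction against a Monte Carlo estimate of the output variance:

```python
import math
import random

random.seed(1)

# Toy function of two noisy inputs (a stand-in for alpha(rho_i, theta_i))
def f(u, v):
    return u * math.cos(v)

u0, v0 = 2.0, 0.5              # means of the inputs
su, sv = 0.01, 0.02            # standard deviations of the inputs

# Analytic partial derivatives of f, evaluated at the mean
df_du = math.cos(v0)
df_dv = -u0 * math.sin(v0)

# First-order error propagation: weighted sum of the input variances
var_analytic = df_du**2 * su**2 + df_dv**2 * sv**2

# Monte Carlo check: sample noisy inputs, measure the output variance
samples = [f(random.gauss(u0, su), random.gauss(v0, sv))
           for _ in range(200_000)]
mean = sum(samples) / len(samples)
var_mc = sum((s - mean) ** 2 for s in samples) / len(samples)

print(var_analytic, var_mc)    # the two estimates agree closely
```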
More generally, if we have a function $Y = f(X)$ mapping $n$ input variables to $m$ output variables, the covariance matrix $\Sigma_Y$ of the output variables can be expressed as

$$\Sigma_Y = J\, \Sigma_X\, J^T$$

where $\Sigma_X$ is the covariance matrix of the input variables and $J$ is the Jacobian matrix of the function $f$, which has the form

$$J = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n} \end{bmatrix}$$
In the line fitting example, the first row of $J$ would contain the partial derivatives of $\alpha$ with respect to all $\rho_i$ ($I$ entries) followed by the partial derivatives of $\alpha$ with respect to all $\theta_i$. The second row would hold the partial derivatives of $r$ with respect to all $\rho_i$ followed by the partial derivatives of $r$ with respect to all $\theta_i$. As there are two output variables, the line’s angle $\alpha$ and its distance $r$, and $2I$ input variables (each measurement consists of an angle and a distance), $J$ is a $2 \times 2I$ matrix.

The result $J\, \Sigma_X\, J^T$ is therefore a 2×2 covariance matrix that holds the variances of $\alpha$ and $r$ on its diagonal.
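The matrix form $\Sigma_Y = J \Sigma_X J^T$ is straightforward to implement. A minimal sketch with plain Python lists and a made-up 2×3 Jacobian (the numbers are for illustration only):

```python
# Minimal sketch of Sigma_Y = J * Sigma_X * J^T with plain Python lists.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

J = [[1.0, 0.5, 0.0],          # d f1 / d x1 .. x3
     [0.0, 2.0, 1.0]]          # d f2 / d x1 .. x3

Sigma_X = [[0.04, 0.0, 0.0],   # independent inputs: diagonal covariance
           [0.0, 0.01, 0.0],
           [0.0, 0.0, 0.09]]

Sigma_Y = matmul(matmul(J, Sigma_X), transpose(J))
print(Sigma_Y)                 # 2x2: output variances on the diagonal
```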
We are now interested in applying error propagation to a robot’s odometry model. With that, we can not only express the robot’s position, but also the variance of this estimate. This becomes extremely useful when reasoning about the robot’s next action. What do we need?
- What are the input variables and what are the output variables?
- What are the functions that calculate the output from the input?
- What is the variance of the input variables?
As usual, we describe the robot’s pose by a tuple $(x, y, \theta)$. These are the output variables (3). We can measure the distance each wheel travels, $\Delta s_r$ and $\Delta s_l$, based on the encoder ticks and the known wheel radius. These are the input variables (2). We can now calculate the change in the robot’s position by calculating

$$\Delta x = \Delta s \cos(\theta + \Delta\theta/2)$$
$$\Delta y = \Delta s \sin(\theta + \Delta\theta/2)$$
$$\Delta\theta = \frac{\Delta s_r - \Delta s_l}{b}$$

with $\Delta s = \frac{\Delta s_r + \Delta s_l}{2}$ and $b$ the distance between the two wheels. The new robot’s position is then given by

$$f(x, y, \theta, \Delta s_r, \Delta s_l) = \begin{bmatrix} x \\ y \\ \theta \end{bmatrix} + \begin{bmatrix} \Delta x \\ \Delta y \\ \Delta\theta \end{bmatrix}$$
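The odometry equations translate directly into code. A minimal sketch (the wheelbase $b$ and the wheel travel distances are assumed to be given):

```python
import math

def odometry_update(x, y, theta, ds_r, ds_l, b):
    """Update a differential-drive pose from wheel travel distances.

    x, y, theta : previous pose
    ds_r, ds_l  : distance travelled by the right/left wheel
    b           : distance between the two wheels (wheelbase)
    """
    ds = (ds_r + ds_l) / 2.0        # distance of the robot's center
    dtheta = (ds_r - ds_l) / b      # change in heading
    x += ds * math.cos(theta + dtheta / 2.0)
    y += ds * math.sin(theta + dtheta / 2.0)
    return x, y, theta + dtheta

# Driving straight: both wheels travel the same distance, heading unchanged
print(odometry_update(0.0, 0.0, 0.0, 1.0, 1.0, 0.1))  # -> (1.0, 0.0, 0.0)
```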
We thus now have a function that relates our measurements to our output variables. What makes things complicated here is that the output variables are a function of their previous values. Therefore, their variance does not only depend on the variance of the input variables, but also on the previous variance of the output variables. We can therefore write

$$\Sigma_{p'} = F_p\, \Sigma_p\, F_p^T + F_{\Delta s}\, \Sigma_\Delta\, F_{\Delta s}^T$$

The first term is the error propagation from a position $p = (x, y, \theta)$ to a new position $p'$. For this we need to calculate the partial derivatives of $f$ with respect to $x$, $y$ and $\theta$. This is the 3×3 matrix

$$F_p = \begin{bmatrix} 1 & 0 & -\Delta s \sin(\theta + \Delta\theta/2) \\ 0 & 1 & \Delta s \cos(\theta + \Delta\theta/2) \\ 0 & 0 & 1 \end{bmatrix}$$
The second term is the error propagation of the actual wheel slip. This requires calculating the partial derivatives of $f$ with respect to $\Delta s_r$ and $\Delta s_l$, which yields the 3×2 matrix $F_{\Delta s}$. The first column contains the partial derivatives of $\Delta x$, $\Delta y$ and $\Delta\theta$ with respect to $\Delta s_r$. The second column contains the partial derivatives of $\Delta x$, $\Delta y$ and $\Delta\theta$ with respect to $\Delta s_l$.
Finally, we need to define the covariance matrix for the measurement noise. As the error is proportional to the distance travelled, we can define

$$\Sigma_\Delta = \begin{bmatrix} k_r |\Delta s_r| & 0 \\ 0 & k_l |\Delta s_l| \end{bmatrix}$$

where $k_r$ and $k_l$ are constants that need to be found experimentally and $|\cdot|$ indicates the absolute value of the distance traveled. We also assume that the errors of the two wheels are independent, which is expressed by the zeros in the matrix.
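Putting the pieces together, one error-propagation step for the odometry model can be sketched in pure Python. This is a minimal sketch under the assumptions above: the entries of the wheel-slip Jacobian follow from differentiating $\Delta x$, $\Delta y$, $\Delta\theta$ with respect to the wheel distances, and `kr`, `kl` are the experimental noise constants:

```python
import math

def pose_covariance_update(theta, P, ds_r, ds_l, b, kr, kl):
    """One error-propagation step for differential-drive odometry.

    theta  : current heading
    P      : 3x3 pose covariance (list of lists)
    ds_r/l : wheel travel distances, b: wheelbase
    kr, kl : experimentally determined noise constants
    """
    ds = (ds_r + ds_l) / 2.0
    dth = (ds_r - ds_l) / b
    t = theta + dth / 2.0
    s, c = math.sin(t), math.cos(t)

    # F_p: partial derivatives of f with respect to x, y, theta (3x3)
    Fp = [[1.0, 0.0, -ds * s],
          [0.0, 1.0,  ds * c],
          [0.0, 0.0,  1.0]]

    # F_ds: partial derivatives of f w.r.t. ds_r (col 1) and ds_l (col 2),
    # obtained by differentiating dx, dy, dtheta (a 3x2 matrix)
    Fd = [[c / 2 - ds * s / (2 * b), c / 2 + ds * s / (2 * b)],
          [s / 2 + ds * c / (2 * b), s / 2 - ds * c / (2 * b)],
          [1.0 / b,                 -1.0 / b]]

    # Wheel-slip covariance: proportional to |distance travelled|
    Sd = [[kr * abs(ds_r), 0.0],
          [0.0, kl * abs(ds_l)]]

    def mm(A, B):
        return [[sum(a * x for a, x in zip(row, col)) for col in zip(*B)]
                for row in A]

    def T(A):
        return [list(col) for col in zip(*A)]

    # Sigma_p' = F_p Sigma_p F_p^T + F_ds Sigma_d F_ds^T
    t1 = mm(mm(Fp, P), T(Fp))
    t2 = mm(mm(Fd, Sd), T(Fd))
    return [[t1[i][j] + t2[i][j] for j in range(3)] for i in range(3)]

# Driving straight from a perfectly known position: uncertainty grows
P = [[0.0] * 3 for _ in range(3)]
for _ in range(10):
    P = pose_covariance_update(0.0, P, 0.1, 0.1, 0.1, 0.01, 0.01)
print(P[0][0], P[1][1], P[2][2])   # variances keep increasing
```

Driving repeatedly along a straight line from a perfectly known position, the pose variances grow with every step, matching the intuition described earlier.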
- Uncertainty can be expressed by means of a probability density function
- More often than not, the Gaussian distribution is chosen as it allows treating error with powerful analytical tools
- In order to calculate the uncertainty of a variable that is derived from a series of measurements, we need to calculate a weighted sum in which each measurement’s variance is weighted by its impact on the output variable. This impact is expressed by the partial derivative of the function relating input to output.