Error Propagation
- the main topic of this lecture will be a more formal, but still quite practical, approach to localization than we covered in L8
- we are still assuming that there is a world map with known landmarks that the robot can potentially measure using an exteroceptive sensor like a camera or laser rangefinder
- this concerns the differential drive mobility subsystem (the wheels), but we start with a case study of our robot’s arm
- the idea will be to understand how small errors in the setting of the arm joint angles propagate to (hopefully) small errors in the gripper position
- the math we develop to do this will critically depend on the manipulator Jacobian which we introduced in L9
- later we will show that the mobility system in fact also has its own Jacobian(s), so our case study for the arm will translate to the wheels
- recall that we often command the arm joint angles $\theta_0$ (shoulder), $\theta_1$ (elbow), and $\theta_2$ (wrist) maintaining the constraint $\theta_2 = -(\theta_0 + \theta_1)$
- specifically, we slave the wrist joint $\theta_2$ and only command the shoulder and elbow joints $\theta_0$ and $\theta_1$
- in this way the gripper always remains horizontal and the arm forward kinematics reduces to

(1) $x_g = x_s + L_0 c_0 + L_1 c_{01} + g_x$, $\quad z_g = z_s + L_0 s_0 + L_1 s_{01} + g_z$

where
- $(x_g, z_g)$ is the grip point in the vertical plane of robot frame
- $(g_x, g_z)$ are constants giving the grip point in link 2 frame
- $(x_s, z_s)$ are constants giving the shoulder joint location in robot frame
- $L_0, L_1$ are the lengths of arm links 0 and 1
- $c_0, s_0$ are $\cos\theta_0$ and $\sin\theta_0$
- $c_{01}, s_{01}$ are $\cos(\theta_0 + \theta_1)$ and $\sin(\theta_0 + \theta_1)$
- equations (1) can be collected together into a single forward kinematic function for the arm:

(2) $x = f(\Theta)$

where $\Theta = (\theta_0, \theta_1)^T$ and $x = (x_g, z_g)^T$
- this is a nonlinear function that transforms a point $\Theta$ in joint space to a point $x$ in task space
- in this example joint space and task space are both two-dimensional
- TBD diagram
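- to make this concrete, here is a small numeric sketch of the forward kinematics (1)–(2); the link lengths and offsets are made-up placeholder values, not our robot's actual dimensions:

```python
import math

# Hypothetical arm dimensions (meters) -- placeholders, not the real robot's.
L0, L1 = 0.1, 0.1          # lengths of links 0 and 1
XS, ZS = 0.05, 0.08        # shoulder joint location in robot frame
GX, GZ = 0.04, -0.01       # grip point offset in link 2 frame

def arm_fk(t0, t1):
    """Forward kinematics (1)-(2): joint angles (rad) -> grip point (x, z)."""
    x = XS + L0 * math.cos(t0) + L1 * math.cos(t0 + t1) + GX
    z = ZS + L0 * math.sin(t0) + L1 * math.sin(t0 + t1) + GZ
    return (x, z)

# with both joints at zero the arm is fully extended horizontally
x, z = arm_fk(0.0, 0.0)
```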
- in the remainder of this analysis we will make no assumptions about the workings of the nonlinear function $f$ except that it is differentiable
- this analysis can be applied to any differentiable function with an $n$-dimensional input vector and an $m$-dimensional output vector, even for $m \neq n$
- say the arm is nominally set to joint angles $\bar{\Theta}$, but these settings have been perturbed by a small error $\Delta\Theta$ in joint space
- what is the corresponding error $\Delta x$ in task space?
- $\Delta x$ can be estimated by taking a local linear approximation of (2):

(3) $f(\bar{\Theta} + \Delta\Theta) \approx f(\bar{\Theta}) + J \Delta\Theta$

where $J$ is the derivative of $f$ evaluated at $\bar{\Theta}$:

$J = \dfrac{\partial f}{\partial \Theta}\bigg|_{\bar{\Theta}} = \begin{bmatrix} -L_0 s_0 - L_1 s_{01} & -L_1 s_{01} \\ L_0 c_0 + L_1 c_{01} & L_1 c_{01} \end{bmatrix}$
- this is the same equation we used in L9 as a basis for incremental numeric inverse kinematics
- but here we are using it for another purpose, specifically to estimate $\Delta x$ given $\bar{\Theta}$ and $\Delta\Theta$
- subtract $\bar{x} = f(\bar{\Theta})$ from both sides of (3) to get

(4) $\Delta x \approx J \Delta\Theta$
- the Jacobian $J$ is a $2 \times 2$ matrix in this example, or in general $m \times n$ where task space is $m$ dimensional and joint space is $n$ dimensional
- in this example $m = n = 2$
- remember $J$ is not constant, it must be computed based on the current nominal joint angles $\bar{\Theta}$
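- a quick numeric check of the linear error model (3)–(4): compute $J$ analytically for the arm, then compare $J \Delta\Theta$ against the true change in $f$; the arm dimensions here are placeholder values:

```python
import math

# Hypothetical arm dimensions (placeholders).
L0, L1 = 0.1, 0.1
XS, ZS, GX, GZ = 0.05, 0.08, 0.04, -0.01

def arm_fk(t0, t1):
    x = XS + L0 * math.cos(t0) + L1 * math.cos(t0 + t1) + GX
    z = ZS + L0 * math.sin(t0) + L1 * math.sin(t0 + t1) + GZ
    return (x, z)

def arm_jacobian(t0, t1):
    """2x2 Jacobian of arm_fk evaluated at (t0, t1)."""
    s0, c0 = math.sin(t0), math.cos(t0)
    s01, c01 = math.sin(t0 + t1), math.cos(t0 + t1)
    return [[-L0 * s0 - L1 * s01, -L1 * s01],
            [ L0 * c0 + L1 * c01,  L1 * c01]]

# first-order estimate dx ~= J dTheta (4) vs the true change in f
t = (0.4, 0.7)            # nominal joint angles
d = (0.01, -0.02)         # small joint-space error
J = arm_jacobian(*t)
dx_lin = (J[0][0]*d[0] + J[0][1]*d[1], J[1][0]*d[0] + J[1][1]*d[1])
x0, x1 = arm_fk(*t), arm_fk(t[0] + d[0], t[1] + d[1])
dx_true = (x1[0] - x0[0], x1[1] - x0[1])
# dx_lin and dx_true agree to second order in the perturbation
```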
Gaussian Uncertainty Modeling
- why would the arm joint angles be perturbed?
- we measure them in the robot with potentiometers
- but these measurements are subject to some amount of error
- this is similar to measuring distance with a ruler
- if your ruler is marked in mm then you may be able to read it off to within, for example, $\pm 0.5$ mm
- or more generally within $\pm e$ mm, where $e$ is an uncertainty interval
- let’s attempt to model the uncertainty in the arm’s joint angles with a similar approach using uncertainty intervals $\pm e_0, \pm e_1$ for the two joint angles
- we could represent this graphically by drawing an axis-aligned rectangle centered at the nominal joint angle setting $\bar{\Theta}$ in joint space
- but does this imply some kind of corresponding axis-aligned rectangle representing uncertainty in task space, centered on $\bar{x}$?
- unfortunately it does not, but a related idea will work
- instead of using an axis-aligned rectangle, let’s model uncertainty using a possibly rotated ellipse centered at $\bar{\Theta}$
- it does turn out that in this model there is a corresponding uncertainty ellipse in task space centered at $\bar{x}$
- we will see how to estimate this latter ellipse below by first-order propagation of uncertainty
- what does such an uncertainty ellipse mean, and how should we parameterize it?
- the generally accepted approach is based on Gaussian uncertainty modeling
- start by recalling the shape of a 1-dimensional Gaussian distribution
- TBD diagram
- this curve is parameterized by two numbers: $\mu$ and $\sigma$
- $\mu$ is the mean value of the Gaussian, i.e. the location of its peak
- if we had to give a single best estimate for the measured quantity, it would be $\mu$
- the standard deviation $\sigma$ defines the spread of the Gaussian
- $\sigma$ is a compact way to quantify our level of certainty that the actual value is near $\mu$
- sometimes we use the notation $N(\mu, \sigma^2)$ to refer to the Gaussian (aka normal) distribution with mean $\mu$ and standard deviation $\sigma$
- the corresponding thing in 1D to an uncertainty ellipse is actually an uncertainty interval
- think of “cutting” the Gaussian curve with a horizontal line at some elevation
- the elevation is inversely related to a desired confidence level $c$ (a percentage that we can select, for example $c = 95\%$) so that the perturbed value of the measurement is expected to be within the resulting interval with that level of confidence
- higher values of $c$ correspond to cutting at lower elevations, making larger intervals
- because the Gaussian has infinite tails, $c = 100\%$ implies an infinite interval
- for $c = 0$ the interval degenerates to a single point at $\mu$
- this can be made precise with a few equations, but we don’t really need that level of detail here
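- for the curious, the 1D interval half-width for a given confidence level can be computed from the inverse Gaussian CDF; a sketch using Python's standard library:

```python
from statistics import NormalDist

def interval_half_width(sigma, c):
    """Half-width of the symmetric interval containing a fraction c of a
    zero-mean Gaussian with standard deviation sigma."""
    # two-sided interval: leave (1 - c)/2 probability in each tail
    return sigma * NormalDist().inv_cdf(0.5 + c / 2.0)

# a 95% confidence interval is about +/- 1.96 standard deviations
w = interval_half_width(1.0, 0.95)
```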
- now extend this approach to two dimensions with mean $\mu = (\mu_1, \mu_2)^T$ and first with independent uncertainties $\sigma_1, \sigma_2$
- the 2D Gaussian distribution looks like a mountain
- TBD diagram
- cutting it with a vertical plane along each axis gives 1D Gaussians $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$
- TBD diagram
- cutting it with a horizontal plane at an elevation inversely related to a confidence level $c$ gives an axis aligned (i.e. un-rotated) uncertainty ellipse
- TBD diagram
- it will be useful to collect the independent uncertainties $\sigma_1, \sigma_2$ into a single $2 \times 2$ diagonal covariance matrix

(5) $D = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}$

- $D$ encapsulates all necessary information about the relative uncertainties in each axis
- when a confidence level $c$ is additionally given, $D$ can be used to calculate the ellipse major and minor diameters
- this can be made precise with a few equations, but we don’t really need that level of detail here
- $D$ is usually positive definite because there is usually some non-zero uncertainty in each axis, i.e. $\sigma_1 > 0$ and $\sigma_2 > 0$
- to allow rotated ellipses, introduce a local frame with origin at the center of the ellipse and with axes in the directions of the ellipse major and minor diameters (which are mutually perpendicular by definition)
- TBD diagram
- a $2 \times 2$ basis $R$ for this local frame is given by the familiar 2D rotation matrix

$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$

where $\theta$ is the CCW rotation from the global $x$ axis to the local frame $x$ axis (the ellipse major diameter)
- then a more general covariance matrix

(6) $C = R D R^T$

can be defined where $D$ is the diagonal covariance matrix from (5)
- read (6) from right to left as
- rotate from global to local frame ($R^T$)
- model uncertainty independently along the local frame axes ($D$)
- rotate back to global frame ($R$)
- in that way, the general covariance matrix $C$ encapsulates both the rotation of the ellipse $\theta$ and the relative lengths of the ellipse diameters
- for particular ellipse diameters a confidence level $c$ must also be specified, as usual
- by construction $C$ is symmetric and (usually) positive definite
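- a small numeric sketch of constructing (6), with made-up values for the rotation and the per-axis standard deviations:

```python
import math

def covariance_from_ellipse(theta, s1, s2):
    """Build the 2x2 covariance C = R D R^T from the ellipse rotation
    theta (rad) and per-axis standard deviations s1, s2 (local frame)."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, -s], [s, c]]
    D = [[s1 * s1, 0.0], [0.0, s2 * s2]]
    # RD = R D (D is diagonal, so just scale the columns of R)
    RD = [[R[0][0] * D[0][0], R[0][1] * D[1][1]],
          [R[1][0] * D[0][0], R[1][1] * D[1][1]]]
    # C = RD R^T, written out elementwise for 2x2 matrices
    C = [[RD[0][0]*R[0][0] + RD[0][1]*R[0][1], RD[0][0]*R[1][0] + RD[0][1]*R[1][1]],
         [RD[1][0]*R[0][0] + RD[1][1]*R[0][1], RD[1][0]*R[1][0] + RD[1][1]*R[1][1]]]
    return C

C = covariance_from_ellipse(math.radians(30), 2.0, 1.0)
# C is symmetric; at theta = 0 it reduces to the diagonal matrix D
```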
- all of this generalizes to higher dimensions
- summing up, we can define a Gaussian (aka normal) distribution in $n$ dimensions as $N(\mu, C)$
- $\mu$ is the nominal (mean) value of the distribution
- $C$ is a symmetric and (usually) positive definite $n \times n$ matrix that defines the orientation of uncertainty ellipses (ellipsoids for $n > 2$) for the distribution, as well as the ratios of their principal diameters
- to actually draw a particular uncertainty ellipse (or ellipsoid, or interval for $n = 1$) you also need to supply a confidence level $c$, which combined with $C$ gives the specific diameters of that ellipse/ellipsoid/interval
Propagation of Uncertainty
- now we are ready to see how a nominal setting of the robot arm joint angles $\bar{\Theta}$ and a corresponding covariance matrix $C_\Theta$ propagates to a nominal gripper position $\bar{x}$ and its covariance matrix $C_x$
- TBD figure
- in fact, we have already done half of this in (2):

(7) $\bar{x} = f(\bar{\Theta})$

- the other half is based on (4):

(8) $C_x = J C_\Theta J^T$

- to understand (8), think of the Jacobian $J$ (and remember this is the particular $J$ evaluated at $\bar{\Theta}$) as a mapping from errors in joint space to errors in task space
- then read (8) from right to left (note the similarity to (6))
- transform from task to joint space ($J^T$)
- apply the joint space uncertainty ($C_\Theta$)
- transform from joint space back to task space ($J$)
- while (4) motivates the idea that $J$ transforms errors from joint to task space, it is not necessarily obvious that $J^T$ would do the opposite, and not, for example, $J^{-1}$ or a pseudoinverse of $J$ when it is not invertible
- we will not cover the proof, but in fact $J^T$ is the right thing here
- equations (7) and (8) constitute first-order propagation of uncertainty
- they work for any general differentiable function $f$
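- a minimal numeric sketch of (8), with a made-up Jacobian and independent joint uncertainties:

```python
def propagate_2x2(J, C):
    """First-order propagation of uncertainty: C_x = J C J^T (2x2 case)."""
    def matmul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]
    Jt = [[J[0][0], J[1][0]], [J[0][1], J[1][1]]]
    return matmul(matmul(J, C), Jt)

# Hypothetical numbers: an arm Jacobian at some nominal pose, and
# independent joint-angle standard deviations of 0.01 rad on each joint.
J = [[-0.12, -0.05], [0.15, 0.08]]
C_theta = [[0.01**2, 0.0], [0.0, 0.01**2]]
C_x = propagate_2x2(J, C_theta)
# C_x is symmetric even though J is not
```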
Differential Drive Odometry with Uncertainty
- in L3 we developed equations for incremental forward position kinematics, i.e. odometry, for the differential drive mobility system on our robot
- at each time step we calculate a new robot pose $(x', y', \theta')$ given the prior pose $(x, y, \theta)$ and the incremental wheel rotations $(\Delta\phi_l, \Delta\phi_r)$ that occurred in this timestep:

(9) $\Delta s = \dfrac{r(\Delta\phi_r + \Delta\phi_l)}{2}, \quad \Delta\theta = \dfrac{r(\Delta\phi_r - \Delta\phi_l)}{b}$

(10) $x' = x + \Delta s \cos(\theta + \Delta\theta/2), \quad y' = y + \Delta s \sin(\theta + \Delta\theta/2), \quad \theta' = \theta + \Delta\theta$

where $r$ is the wheel radius and $b$ is the robot baseline (the distance between the wheels)
- equations (9) and (10) are another nonlinear but differentiable mapping
- here there are two inputs: the prior pose $x = (x, y, \theta)^T$ and the incremental wheel rotations $u = (\Delta\phi_l, \Delta\phi_r)^T$
- and one output: the new pose $x' = (x', y', \theta')^T$
- because we will use $f$ for a different purpose below, we will call this mapping simply $g$ (instead of $f$ as we did for the arm):

(11) $x' = g(x, u)$

- equation (11) can be considered to have two Jacobians: $G_x = \partial g / \partial x$, $G_u = \partial g / \partial u$ ($G_x$ is $3 \times 3$ and $G_u$ is $3 \times 2$)
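- a sketch of the odometry update (9)–(11), with placeholder values for the wheel radius and baseline:

```python
import math

# Hypothetical robot dimensions (meters) -- placeholders.
R_WHEEL = 0.05     # wheel radius r
BASELINE = 0.3     # distance b between the wheels

def odom_step(x, y, th, dphi_l, dphi_r):
    """One incremental odometry update, equations (9)-(10):
    prior pose (x, y, th) plus wheel rotations (rad) -> new pose."""
    ds = R_WHEEL * (dphi_r + dphi_l) / 2.0          # distance traveled
    dth = R_WHEEL * (dphi_r - dphi_l) / BASELINE    # change in heading
    return (x + ds * math.cos(th + dth / 2.0),
            y + ds * math.sin(th + dth / 2.0),
            th + dth)

# driving straight: equal wheel rotations move the robot along its heading
pose = odom_step(0.0, 0.0, 0.0, 1.0, 1.0)
# turning in place: opposite wheel rotations change only the heading
turn = odom_step(0.0, 0.0, 0.0, -1.0, 1.0)
```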
- for incremental odometry we repeatedly apply (11) to get a sequence of nominal robot poses $\bar{x}_0, \bar{x}_1, \ldots, \bar{x}_k$
- $\bar{x}_0$ is the original pose when the robot was turned on or when odometry was most recently reset
- $\bar{x}_k$ is the current pose
- now we can use first-order propagation of uncertainty to also maintain a sequence of $3 \times 3$ covariance matrices $p_0, p_1, \ldots, p_k$ modeling uncertainty in the pose at each step
- apply the uncertainty propagation (8) to $g$ like this:

(12) $p_k = G_x p_{k-1} G_x^T + G_u q_k G_u^T$
- $p_{k-1}$ is the uncertainty of the prior pose
- in the typical case that we define the first robot pose $\bar{x}_0 = (0, 0, 0)^T$ then $p_0$ is all zeros (no initial uncertainty)
- $q_k$ models the process uncertainty for the mobility system given that the wheel encoders reported an incremental wheel rotation of $u_k = (\Delta\phi_l, \Delta\phi_r)^T$
- it is a bit of an art to come up with a useful model for $q_k$
- one possibility is to make it a constant diagonal matrix $q = \begin{bmatrix} \sigma_\phi^2 & 0 \\ 0 & \sigma_\phi^2 \end{bmatrix}$ where $\sigma_\phi$ is a standard deviation modeling uncertainty in knowing the effective wheel rotation, which may vary from the actual wheel rotation due to slip between the wheel and the ground
- another possibility is to make $q_k$ depend on $u_k$, for example $q_k = \begin{bmatrix} (\alpha \Delta\phi_l)^2 & 0 \\ 0 & (\alpha \Delta\phi_r)^2 \end{bmatrix}$
- this models the uncertainty for each wheel to be proportional to the amount it rotated
- in any case the parameters of the uncertainty model (such as $\sigma_\phi$ or $\alpha$) may need to be determined by experiments
- or you could make a reasonable guess
- because (12) not only projects the prior pose uncertainty $p_{k-1}$ forward, but also adds on some additional uncertainty from $q_k$, the pose uncertainty will increase at each timestep
- this matches our intuition that the longer the robot operates “blindly” estimating its pose only with odometry, the more error will accumulate
- TBD figure
- if the robot has no additional exteroceptive sensors, this is arguably the best estimate of its pose that it can make
- but of course we can add sensors like cameras and laser scanners, and shortly we will see how to use them to attempt to periodically reduce pose uncertainty
- note that since the pose is three dimensional, technically we would have to draw a 3D plot and show the robot’s trajectory in $(x, y, \theta)$ state space
- uncertainty at each step could be visualized as an ellipsoid in 3D
- but it is common to plot only the $(x, y)$ (position) part of the state
- the 2D error ellipse for the $(x, y)$ part of state $\bar{x}_k$ is described by the upper left $2 \times 2$ submatrix of the corresponding $p_k$
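- the covariance recursion (12) can be sketched as follows; the Jacobians of the odometry update are written out analytically, and the robot dimensions and noise level are placeholder values:

```python
import math

# Hypothetical robot dimensions and noise level (placeholders).
R_WHEEL, BASELINE = 0.05, 0.3
SIGMA = 0.01   # std dev of effective wheel rotation (rad), constant-q model

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def odom_jacobians(th, dphi_l, dphi_r):
    """Jacobians G_x (3x3) and G_u (3x2) of the odometry update."""
    r, b = R_WHEEL, BASELINE
    ds = r * (dphi_r + dphi_l) / 2.0
    dth = r * (dphi_r - dphi_l) / b
    c, s = math.cos(th + dth / 2.0), math.sin(th + dth / 2.0)
    Gx = [[1.0, 0.0, -ds * s],
          [0.0, 1.0,  ds * c],
          [0.0, 0.0,  1.0]]
    Gu = [[r * c / 2 + ds * r * s / (2 * b), r * c / 2 - ds * r * s / (2 * b)],
          [r * s / 2 - ds * r * c / (2 * b), r * s / 2 + ds * r * c / (2 * b)],
          [-r / b, r / b]]
    return Gx, Gu

def propagate(p, th, dphi_l, dphi_r):
    """Covariance update (12): p' = Gx p Gx^T + Gu q Gu^T."""
    Gx, Gu = odom_jacobians(th, dphi_l, dphi_r)
    q = [[SIGMA**2, 0.0], [0.0, SIGMA**2]]
    a = matmul(matmul(Gx, p), transpose(Gx))
    c = matmul(matmul(Gu, q), transpose(Gu))
    return [[a[i][j] + c[i][j] for j in range(3)] for i in range(3)]

# start with zero uncertainty and drive straight: uncertainty grows each step
p = [[0.0] * 3 for _ in range(3)]
for _ in range(10):
    p = propagate(p, 0.0, 1.0, 1.0)
```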
EKF Localization
- the incremental odometry update equations (11) and (12) can be considered the prediction phase of an extended Kalman filter (EKF)
- the original Kalman filter was designed only to apply to linear process functions
- the extended Kalman filter handles nonlinear but differentiable functions by taking locally linear approximations as we did in equation (3)
- this is an approximation, but it often works well in practice
- in general, a Kalman prediction transforms a prior state estimate $\bar{x}$ and its uncertainty modeled as a covariance matrix $p$ into a new state estimate $\bar{x}'$ and its (generally larger, possibly equal, but never smaller) uncertainty $p'$, accounting for a known differentiable process function $g$ and process update $u$ with uncertainty $q$:

(13 Kalman prediction) $\bar{x}' = g(\bar{x}, u), \quad p' = G_x p G_x^T + G_u q G_u^T$

where $G_x = \partial g / \partial x$, $G_u = \partial g / \partial u$, both evaluated at $(\bar{x}, u)$
- the other phase in an EKF is called a correction phase
- for mobile robot localization with a known map, Kalman correction is the process by which the exteroceptive sensor(s) on the robot (camera, laser rangefinder, etc) are used to
- observe the environment
- look for features that might correspond to known landmarks in the map
- propose a particular correspondence between (some of) the observed features and known landmarks
- infer from these correspondences some information about the actual pose called the innovation $a$ with uncertainty $v$
- use the innovation to reduce the pose uncertainty $p$ and to refine the nominal pose estimate $\bar{x}$
- these will be detailed below
- the EKF is a recursive filter in the sense that the only “memory” it keeps is the most recent state estimate $\bar{x}$ and its covariance $p$
- each update makes a new $\bar{x}'$ and $p'$ to replace the old ones
- in many applications every update has both a prediction phase and a correction phase
- but for mobile robot localization often there are many updates where only the prediction phase is applied, hence growing the pose uncertainty
- correction phases which reduce pose uncertainty are often applied at a lower frequency due to their computational cost, and in some cases to the limited speed and/or relatively high power consumption of exteroceptive sensors
- to develop the equations for Kalman correction, first consider a special case where we abstract away feature observation and correspondence, and assume that the “innovation” from the exteroceptive sensor is simply a full state estimate $\bar{z}$ and its $3 \times 3$ uncertainty matrix $q$
- here the innovation is

(14) $a = \bar{z} - \bar{x}$

but in general this need not be the case because the data from sensing need not actually be a pose
- this special case could occur in practice if the robot has something like a GPS sensor to estimate $(x, y)$ and something like a compass to estimate $\theta$
- if GPS were available you may wonder why we would need to do anything further
- in practice, GPS is not always available, particularly indoors
- also the resolution of GPS, at least for economical equipment, is on the order of meters
- and while GPS reports position in a geo-referenced coordinate frame, what may be more important to a robot is its position relative to nearby obstacles, and the transformation from a map of those obstacles to the geo-referenced frame may not be precisely known
- in this case the correction phase boils down to combining the prior pose estimate $\bar{x}$ and the estimate $\bar{z}$ from sensing
- the main idea is to make the refined nominal pose estimate $\bar{x}'$ a weighted average of the prior estimate $\bar{x}$ and the estimate from sensing $\bar{z}$:

(15) $\bar{x}' = q v^{-1} \bar{x} + p v^{-1} \bar{z}$

- where we have used the fact that the covariance of the innovation $a$ is $v$, which can be found by propagation of uncertainty on (14):

(16) $v = p + q$

- equation (15) gives more weight to $\bar{z}$ if the uncertainty for $\bar{x}$ is high and vice-versa
- to understand equation (15), consider a simplified case where the state is one-dimensional, i.e. $\bar{x}$ and $\bar{z}$ are just scalars
- then the covariance matrices reduce to just variances $p$, $q$, $v = p + q$ and (15) can be written like this:

(17) $\bar{x}' = \dfrac{q \bar{x} + p \bar{z}}{p + q}$

- equation (17) makes it a bit more clear that when the variance $p$ for $\bar{x}$ is high more weight is given to $\bar{z}$ and vice-versa, but the same logic still holds for the multidimensional version in (15)
- it is typical to rearrange (15) like this

(18) $\bar{x}' = \bar{x} + K(\bar{z} - \bar{x}) = \bar{x} + K a$

which is more explicitly recursive
- $K = p v^{-1}$ is called the Kalman gain
- the covariance of the refined estimate $p'$ is found by yet another application of uncertainty propagation, here on (18):
$\begin{aligned} p' &= (I-K)\,p\,(I-K)^T + K q K^T \\ &= p - pK^T - Kp + KpK^T + KqK^T \\ &= p - p(pv^{-1})^T - (pv^{-1})p + K(p+q)K^T \\ &= p - pv^{-T}p^T - pv^{-1}p + KvK^T \\ &= p - pv^{-1}p - pv^{-1}p + (pv^{-1})v(pv^{-1})^T \\ &= p - 2pv^{-1}p + pv^{-1}vv^{-1}p \\ &= p - 2pv^{-1}p + pv^{-1}p \\ &= p - pv^{-1}p \\ &= p - (pv^{-1})v(v^{-1}p) \\ &= p - KvK^T \end{aligned}$

where the simplifications use the symmetry of $p$ and $v$, i.e. $p^T = p$ and $v^{-T} = v^{-1}$
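- summing up the scalar case, (16)–(18) plus the variance update $p' = p - p v^{-1} p$ derived above give a complete 1D Kalman correction:

```python
def kalman_correct_1d(xbar, p, zbar, q):
    """Scalar Kalman correction: gain K = p/v, refined mean (18),
    refined variance p' = p - p v^-1 p."""
    v = p + q                      # innovation variance (16)
    K = p / v                      # Kalman gain
    x_new = xbar + K * (zbar - xbar)
    p_new = p - p * p / v          # equivalently (1 - K) * p
    return x_new, p_new

# prior is 4x less certain than the sensor, so the result lands
# closer to the sensed value, and the refined variance shrinks
x_new, p_new = kalman_correct_1d(0.0, 4.0, 10.0, 1.0)
```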