In this article, I would like to explain the localisation concepts I learned through Udacity, in the hope that they will be helpful to others. The goal of this article is to give you a general idea of localisation; it can also serve as a foundation for my Particle Filters article.

Localisation overview

Self-driving cars use maps because these maps help them figure out what the world is supposed to look like. Determining the vehicle’s precise position on the map is called “localization”. By localizing itself, the vehicle can determine its precise relationship to all of the elements on the map. Is the vehicle in the middle lane or the right lane? How far away is the curb? What about the next intersection?

Broadly speaking, localizing the vehicle involves determining where on the map the vehicle is most likely to be by matching what the vehicle sees to the map. We might identify specific landmarks — poles and mailboxes and curbs — and measure the vehicle’s distance from each, in order to estimate the vehicle’s position.

Markov Localization, or the Bayes Filter for Localization, is a generalized filter for localization; all other localization approaches are realizations of this approach.

We generally think of our vehicle location as a probability distribution. Each time we move, the distribution becomes more diffuse (wider). At each time step, we pass our variables (map data, observation data, and control data) into the filter to concentrate (narrow) this distribution. Each state prior to applying the filter represents our prior, and the narrowed distribution represents our Bayes' posterior.

Bayes Rule

Bayes' Rule is the foundation of Markov localisation. It enables us to determine the conditional probability of a state given evidence, P(a|b), by relating it to the conditional probability of the evidence given the state, P(b|a), in the form of:

P(a|b) * P(b) = P(b|a) * P(a)

Which can be rearranged to:

P(a|b) = P(b|a) * P(a) / P(b)

In other words, the probability of state a, given evidence b, is the probability of evidence b, given state a, multiplied by the probability of state a, normalized by the total probability of b over all states.

We can apply Bayes’ Rule to vehicle localization by passing variables through Bayes’ Rule for each time step, as our vehicle moves. This is known as a Bayes’ Filter for Localization. You may recognize this as being similar to a Kalman filter. In fact, many localization filters, including the Kalman filter are special cases of Bayes’ Filter.

With respect to localization, these terms are:

P(location|observation): This is P(a|b), the normalized probability of a position given an observation (posterior).
P(observation|location): This is P(b|a), the probability of an observation given a position (likelihood).
P(location): This is P(a), the prior probability of a position.
P(observation): This is P(b), the total probability of an observation.

Without going into detail yet, be aware that P(location) is determined by the motion model.

Localisation Posterior

We aim to estimate the state belief bel(xt) without the need to carry our entire observation history. We will accomplish this by manipulating our posterior p(xt | z1:t, u1:t, m), obtaining a recursive state estimator. For this to work, we must demonstrate that our current belief bel(xt) can be expressed by the belief one step earlier, bel(xt−1), then use new data to update only the current belief. This recursive filter is known as the Bayes Localization Filter or Markov Localization, and it enables us to avoid carrying historical observation and motion data. We will achieve this recursive state estimator using Bayes' Rule, the Law of Total Probability, and the Markov Assumption.

Formal definition of variables:

z1:t represents the observation vector from time 1 to t (range measurements, bearings, images, etc.).
u1:t represents the control vector from time 1 to t (yaw/pitch/roll rates and velocities).
m represents the map (grid maps, feature maps, landmarks).
xt represents the pose (position (x, y) plus orientation θ).

We take the first step towards our recursive structure by splitting our observation vector z1:t into the current observation zt and the previous information z1:t−1. The posterior can then be rewritten as:

p(xt | zt, z1:t−1, u1:t, m)

Now we apply Bayes' rule. An additional challenge here is the presence of multiple distributions on the right-hand side (likelihood, prior, normalizing constant).

Bayes rule

Applying Bayes' rule to the belief state gives:

p(xt | z1:t, u1:t, m) = p(zt | xt, z1:t−1, u1:t, m) * p(xt | z1:t−1, u1:t, m) / p(zt | z1:t−1, u1:t, m)

Here the first term on the right is the observation model (likelihood), the second is the motion model (prior), and the denominator is the normalizing constant.

Motion Model

It's the probability distribution of the position xt given the observations z1:t−1, the controls u1:t, and the map m:

p(xt | z1:t−1, u1:t, m)

Motion model

The probability returned by the motion model is the product of the transition model probability (the probability of moving from xt−1 to xt) and the probability of the state xt−1.

The next step is to simplify the motion model using the Markov assumption.

A Markov assumption is one in which the conditional probability distribution of future states (i.e., the next state) depends only on the current state and not on other preceding states. This can be expressed mathematically as:

P(xt | xt−1, xt−2, …, x0) = P(xt | xt−1)

It is important to note that the current state may contain all information from preceding states. That is the case discussed in this lesson.

Graphical representation before the Markov assumption

In this case we can make two assumptions for the motion model:

(a) Since we (hypothetically) know in which state the system is at time step t-1, the past observations z1:t−1​and controls u1:t−1​ would not provide us additional information to estimate the posterior for xt​, because they were already used to estimate xt−1​. This means, we can simplify p(xt​∣xt−1​,z1:t−1​,u1:t​,m) to p(xt​∣xt−1​,ut​,m).

(b) Since ut is "in the future" with reference to xt−1, ut does not tell us much about xt−1. This means the term p(xt−1 | z1:t−1, u1:t, m) can be simplified to p(xt−1 | z1:t−1, u1:t−1, m).

After applying the Markov Assumption, the term p(xt−1​∣z1:t−1​,u1:t−1​,m) describes exactly the belief at xt−1​ ! This means we achieved a recursive structure!

Graphical representation after the Markov assumption

By the law of total probability, we can write the motion model as an integral over all possible previous states xt−1, with both terms inside the integral already simplified by the Markov assumption as described above:

p(xt | z1:t−1, u1:t, m) = ∫ p(xt | xt−1, ut, m) * p(xt−1 | z1:t−1, u1:t−1, m) dxt−1

The second term in our integral, p(xt−1 | z1:t−1, u1:t−1, m), is exactly the belief from the previous time step, namely bel(xt−1).

The amazing thing is that we have a recursive update formula and can now use the estimated state from the previous time step to predict the current state at t. This is a critical step in a recursive Bayesian filter because it renders us independent from the entire observation and control history.

Finally, we replace the integral with a sum over all xt−1(i), because we have a discrete localization scenario in this case:

p(xt | z1:t−1, u1:t, m) = Σi p(xt | xt−1(i), ut, m) * bel(xt−1(i))

Discrete formula

If you look at Bayes' rule, in order to calculate the prior we need to apply the law of total probability.

Bayes rule

Similarly, the motion model is our prior, and to calculate it we apply the law of total probability, as shown in the discrete formula above.

Let's remember what the summation above is doing. It's calculating the probability that the vehicle wound up at a given location, xt. How? It looks at each prior location where the vehicle could have been, xt−1, iterating over every possible prior location xt−1(1) … xt−1(n). For each possible prior location xt−1(i) in that list, the summation yields the total probability that the vehicle really did start at that prior location and that it wound up at xt. How do we calculate that individual probability for each possible starting position xt−1? That's where each element of the summation contributes: the likelihood of starting at xt−1 and arriving at xt is simply p(xt | xt−1) * p(xt−1). We can say the same thing, using different notation and incorporating all of our knowledge about the world, by writing the discrete formula above.

To give an intuition of what the code looks like, sample C++ code is given below. Just assume we have only two time steps.

Before getting into this code, remember: the probability returned by the motion model is the product of the transition model probability (the probability of moving from xt−1 to xt) and the probability of the state xt−1.

#include <iostream>
#include <algorithm>
#include <vector>

#include "helpers.h"

using namespace std;

std::vector<float> initialize_priors(int map_size, std::vector<float> landmark_positions,
                                     float control_stdev);

float motion_model(float pseudo_position, float movement, std::vector<float> priors,
                   int map_size, float control_stdev);

int main() {
  //set standard deviation of control:
  float control_stdev = 1.0f;
  //meters vehicle moves per time step
  float movement_per_timestep = 1.0f;
  //number of x positions on map
  int map_size = 25;
  //initialize landmarks
  std::vector<float> landmark_positions {5, 10, 20};

  //initialize priors
  std::vector<float> priors = initialize_priors(map_size, landmark_positions,
                                                control_stdev);

  //step through each pseudo position x (i)
  for (int i = 0; i < map_size; ++i) {
    float pseudo_position = float(i);
    //get the motion model probability for each x position
    float motion_prob = motion_model(pseudo_position, movement_per_timestep,
                                     priors, map_size, control_stdev);
    //print to stdout
    std::cout << pseudo_position << "\t" << motion_prob << endl;
  }
  return 0;
}

//motion model: calculates prob of being at an estimated position at time t
float motion_model(float pseudo_position, float movement, std::vector<float> priors,
                   int map_size, float control_stdev) {
  //initialize probability
  float position_prob = 0.0f;
  //loop over state space for all possible positions x (convolution):
  for (int j = 0; j < map_size; ++j) {
    float next_pseudo_position = float(j);
    //distance from i to j
    float distance_ij = pseudo_position - next_pseudo_position;
    //transition probabilities:
    float transition_prob = Helpers::normpdf(distance_ij, movement,
                                             control_stdev);
    //estimate probability for the motion model, this is our prior
    position_prob += transition_prob * priors[j];
  }
  return position_prob;
}

//initialize priors assuming vehicle at landmark +/- 1.0 meters position stdev
std::vector<float> initialize_priors(int map_size, std::vector<float> landmark_positions,
                                     float control_stdev) {
  //set all priors to 0.0
  std::vector<float> priors(map_size, 0.0);
  //set each landmark position +/- 1 to 1.0/9.0 (9 possible positions)
  float normalization_term = landmark_positions.size() * (control_stdev * 2 + 1);
  for (unsigned int i = 0; i < landmark_positions.size(); i++) {
    int landmark_center = landmark_positions[i];
    priors[landmark_center] = 1.0f / normalization_term;
    priors[landmark_center - 1] = 1.0f / normalization_term;
    priors[landmark_center + 1] = 1.0f / normalization_term;
  }
  return priors;
}
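The sample above relies on Helpers::normpdf from helpers.h, which is not shown here. Judging from its usage, it evaluates a 1D Gaussian density; a minimal sketch under that assumption (the actual Udacity helper may differ) could look like:

```cpp
#include <cmath>

namespace Helpers {

// 1D Gaussian probability density evaluated at x,
// with mean mu and standard deviation sigma
inline float normpdf(float x, float mu, float sigma) {
  static const float kSqrt2Pi = 2.5066282746f;  // sqrt(2 * pi)
  float z = (x - mu) / sigma;
  return std::exp(-0.5f * z * z) / (sigma * kSqrt2Pi);
}

}  // namespace Helpers
```

For example, normpdf(0, 0, 1) evaluates the standard normal density at its mean, roughly 0.3989.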

Observation Model

It's the probability distribution of the observation set zt, given the position xt, the previous observations z1:t−1, the controls u1:t, and the map m:

p(zt | xt, z1:t−1, u1:t, m)

Observation model

The Markov assumption can help us simplify the observation model. Recall that the Markov assumption is that the next state depends only on the preceding state, and that preceding state information has already been used in our state estimation. As such, we can ignore terms in our observation model prior to xt, since these values have already been accounted for in our current state, and assume that zt is independent of previous observations and controls.

Graphical representation

With these assumptions we simplify our posterior distribution such that the observations at t are dependent only on x at time t and the map.

After Markov assumption applied

Since zt can be a vector of multiple observations, we rewrite our observation model to account for the observation model of each single range measurement. We assume that the noise behavior of the individual range values zt(1) to zt(k) is independent, allowing us to represent the observation model as a product over the individual probability distributions of each single range measurement. Now we must determine how to define the observation model for a single range measurement.

In general there exists a variety of observation models, due to different sensors, sensor-specific noise behavior and performance, and map types. For our 1D example we assume that our sensor measures the n closest objects in the driving direction, which represent the landmarks on our map. We also assume that observation noise can be modeled as a Gaussian with a standard deviation of 1 meter, and that our sensor can measure in a range of 0 to 100 meters.

To implement the observation model we use the given state xt and the given map to estimate pseudo ranges, which represent the true range values under the assumption that our car is standing at the specific position xt on the map. For example, if our car is standing at position 20, it would use xt and m to make pseudo range (zt*) observations, ordered from the first landmark to the last: 5, 11, 39, and 57 meters. Compared to our real observations (zt = [19, 37]), the position xt = 20 seems unlikely; our observations would fit a position around 40 much better.

The observation model uses pseudo range estimates and observation measurements as inputs.

The observation model will be implemented by performing the following at each time step:

Measure the range to landmarks up to 100m from the vehicle, in the driving direction (forward)

Estimate a pseudo range for each landmark by subtracting the pseudo position from the landmark position

Match each pseudo range estimate to its closest observation measurement

For each pseudo range and observation measurement pair, calculate a probability by passing relevant values to norm_pdf: norm_pdf(observation_measurement, pseudo_range_estimate, observation_stdev)

Return the product of all probabilities

A C++ implementation of the observation model follows:

#include <iostream>
#include <algorithm>
#include <vector>

#include "helpers.h"

using namespace std;

//function to get pseudo ranges
std::vector<float> pseudo_range_estimator(std::vector<float> landmark_positions,
                                          float pseudo_position);

//observation model: calculates likelihood prob term based on landmark proximity
float observation_model(std::vector<float> landmark_positions, std::vector<float> observations,
                        std::vector<float> pseudo_ranges, float distance_max,
                        float observation_stdev);

int main() {
  //set observation standard deviation:
  float observation_stdev = 1.0f;
  //number of x positions on map
  int map_size = 25;
  //set distance max
  float distance_max = map_size;
  //define landmarks
  std::vector<float> landmark_positions {5, 10, 12, 20};
  //define observations
  std::vector<float> observations {5.5, 13, 15};

  //step through each pseudo position x (i)
  for (int i = 0; i < map_size; ++i) {
    float pseudo_position = float(i);
    //get pseudo ranges
    std::vector<float> pseudo_ranges = pseudo_range_estimator(landmark_positions,
                                                              pseudo_position);
    //get observation probability
    float observation_prob = observation_model(landmark_positions, observations,
                                               pseudo_ranges, distance_max,
                                               observation_stdev);
    //print to stdout
    std::cout << observation_prob << endl;
  }
  return 0;
}

//observation model: calculates likelihood prob term based on landmark proximity
float observation_model(std::vector<float> landmark_positions, std::vector<float> observations,
                        std::vector<float> pseudo_ranges, float distance_max,
                        float observation_stdev) {
  //initialize observation probability:
  float distance_prob = 1.0f;
  //run over current observation vector:
  for (unsigned int z = 0; z < observations.size(); ++z) {
    //define min distance:
    float pseudo_range_min;
    //check, if distance vector exists:
    if (pseudo_ranges.size() > 0) {
      //set min distance:
      pseudo_range_min = pseudo_ranges[0];
      //remove this entry from pseudo_ranges-vector:
      pseudo_ranges.erase(pseudo_ranges.begin());
    }
    //no or negative distances: set min distance to maximum distance:
    else {
      pseudo_range_min = distance_max;
    }
    //estimate the probability for the observation model, this is our likelihood:
    distance_prob *= Helpers::normpdf(observations[z], pseudo_range_min,
                                      observation_stdev);
  }
  return distance_prob;
}

std::vector<float> pseudo_range_estimator(std::vector<float> landmark_positions,
                                          float pseudo_position) {
  //define pseudo observation vector:
  std::vector<float> pseudo_ranges;
  //loop over number of landmarks and estimate pseudo ranges:
  for (unsigned int l = 0; l < landmark_positions.size(); ++l) {
    //estimate pseudo range for each single landmark
    //and the current state position pose_i:
    float range_l = landmark_positions[l] - pseudo_position;
    //check if distances are positive:
    if (range_l > 0.0f) {
      pseudo_ranges.push_back(range_l);
    }
  }
  //sort pseudo range vector:
  sort(pseudo_ranges.begin(), pseudo_ranges.end());
  return pseudo_ranges;
}

Bayes Filter for Localisation, or Markov Localization: the full formula

The filter formula can be written as:

bel(xt) = p(xt | z1:t, u1:t, m) = η * p(zt | xt, m) * Σi p(xt | xt−1(i), ut, m) * bel(xt−1(i))

where η is the normalizing constant from Bayes' rule. Recognizing the sum as the motion model (our prediction), this can be simplified to:

bel(xt) = η * observation model * motion model

This concept can be represented graphically as below:

Graphical representation

Or the above graphical representation can also be represented as:

Implement the Bayes’ localization filter by first initializing priors, then doing the following within each time step:

Algorithm for filter

A C++ implementation of the full filter follows:

#include <iostream>
#include <algorithm>
#include <vector>

#include "helpers.h"

using namespace std;

std::vector<float> initialize_priors(int map_size, std::vector<float> landmark_positions,
                                     float control_stdev);

float motion_model(float pseudo_position, float movement, std::vector<float> priors,
                   int map_size, float control_stdev);

//function to get pseudo ranges
std::vector<float> pseudo_range_estimator(std::vector<float> landmark_positions,
                                          float pseudo_position);

//observation model: calculates likelihood prob term based on landmark proximity
float observation_model(std::vector<float> landmark_positions, std::vector<float> observations,
                        std::vector<float> pseudo_ranges, float distance_max,
                        float observation_stdev);

int main() {
  //set standard deviation of control:
  float control_stdev = 1.0f;
  //meters vehicle moves per time step
  float movement_per_timestep = 1.0f;
  //set observation standard deviation:
  float observation_stdev = 1.0f;
  //number of x positions on map
  int map_size = 25;
  //set distance max
  float distance_max = map_size;
  //define landmarks
  std::vector<float> landmark_positions {3, 9, 14, 23};
  //define observations vector, each inner vector represents a set of observations
  //for a time step
  std::vector<std::vector<float> > sensor_obs {{1,7,12,21}, {0,6,11,20}, {5,10,19}, {4,9,18},
                                               {3,8,17}, {2,7,16}, {1,6,15}, {0,5,14}, {4,13},
                                               {3,12}, {2,11}, {1,10}, {0,9}, {8}, {7}, {6},
                                               {5}, {4}, {3}, {2}, {1}, {0}, {}, {}, {}};

  //initialize priors
  std::vector<float> priors = initialize_priors(map_size, landmark_positions,
                                                control_stdev);

  //initialize posteriors
  std::vector<float> posteriors(map_size, 0.0);
  //specify time steps
  int time_steps = sensor_obs.size();

  //declare observations vector
  std::vector<float> observations;

  //cycle through time steps
  for (int t = 0; t < time_steps; t++) {
    if (!sensor_obs[t].empty()) {
      observations = sensor_obs[t];
    } else {
      observations = {float(distance_max)};
    }
    //step through each pseudo position x (i)
    for (int i = 0; i < map_size; ++i) {
      float pseudo_position = float(i);
      //get the motion model probability for each x position
      float motion_prob = motion_model(pseudo_position, movement_per_timestep,
                                       priors, map_size, control_stdev);
      //get pseudo ranges
      std::vector<float> pseudo_ranges = pseudo_range_estimator(landmark_positions,
                                                                pseudo_position);
      //get observation probability
      float observation_prob = observation_model(landmark_positions, observations,
                                                 pseudo_ranges, distance_max,
                                                 observation_stdev);
      //calculate the ith posterior
      posteriors[i] = motion_prob * observation_prob;
    }
    //normalize
    posteriors = Helpers::normalize_vector(posteriors);
    //update
    priors = posteriors;
  }
  //print final posteriors vector to stdout
  for (unsigned int p = 0; p < posteriors.size(); p++) {
    std::cout << posteriors[p] << endl;
  }
  return 0;
}

//observation model: calculates likelihood prob term based on landmark proximity
float observation_model(std::vector<float> landmark_positions, std::vector<float> observations,
                        std::vector<float> pseudo_ranges, float distance_max,
                        float observation_stdev) {
  //initialize observation probability:
  float distance_prob = 1.0f;
  //run over current observation vector:
  for (unsigned int z = 0; z < observations.size(); ++z) {
    //define min distance:
    float pseudo_range_min;
    //check, if distance vector exists:
    if (pseudo_ranges.size() > 0) {
      //set min distance:
      pseudo_range_min = pseudo_ranges[0];
      //remove this entry from pseudo_ranges-vector:
      pseudo_ranges.erase(pseudo_ranges.begin());
    }
    //no or negative distances: set min distance to maximum distance:
    else {
      pseudo_range_min = distance_max;
    }
    //estimate the probability for the observation model, this is our likelihood:
    distance_prob *= Helpers::normpdf(observations[z], pseudo_range_min,
                                      observation_stdev);
  }
  return distance_prob;
}

std::vector<float> pseudo_range_estimator(std::vector<float> landmark_positions,
                                          float pseudo_position) {
  //define pseudo observation vector:
  std::vector<float> pseudo_ranges;
  //loop over number of landmarks and estimate pseudo ranges:
  for (unsigned int l = 0; l < landmark_positions.size(); ++l) {
    //estimate pseudo range for each single landmark
    //and the current state position pose_i:
    float range_l = landmark_positions[l] - pseudo_position;
    //check if distances are positive:
    if (range_l > 0.0f) {
      pseudo_ranges.push_back(range_l);
    }
  }
  //sort pseudo range vector:
  sort(pseudo_ranges.begin(), pseudo_ranges.end());
  return pseudo_ranges;
}

//motion model: calculates prob of being at an estimated position at time t
float motion_model(float pseudo_position, float movement, std::vector<float> priors,
                   int map_size, float control_stdev) {
  //initialize probability
  float position_prob = 0.0f;
  //step over state space for all possible positions x (convolution):
  for (int j = 0; j < map_size; ++j) {
    float next_pseudo_position = float(j);
    //distance from i to j
    float distance_ij = pseudo_position - next_pseudo_position;
    //transition probabilities:
    float transition_prob = Helpers::normpdf(distance_ij, movement,
                                             control_stdev);
    //estimate probability for the motion model, this is our prior
    position_prob += transition_prob * priors[j];
  }
  return position_prob;
}

//initialize priors assuming vehicle at landmark +/- 1.0 meters position stdev
std::vector<float> initialize_priors(int map_size, std::vector<float> landmark_positions,
                                     float control_stdev) {
  //set all priors to 0.0
  std::vector<float> priors(map_size, 0.0);
  //set each landmark position +/- 1 to 1.0/9.0 (9 possible positions)
  float normalization_term = landmark_positions.size() * (control_stdev * 2 + 1);
  for (unsigned int i = 0; i < landmark_positions.size(); i++) {
    int landmark_center = landmark_positions[i];
    priors[landmark_center] = 1.0f / normalization_term;
    priors[landmark_center - 1] = 1.0f / normalization_term;
    priors[landmark_center + 1] = 1.0f / normalization_term;
  }
  return priors;
}

Summary

The Bayes Localization Filter or Markov Localization is a general framework for recursive state estimation.

That means this framework allows us to use the previous state (state at t-1) to estimate a new state (state at t) using only current observations and controls (observations and control at t), rather than the entire data history (data from 0:t).

The motion model describes the prediction step of the filter, while the observation model describes the update step.