Math - MAP (Maximum A Posteriori) and MLE (Maximum Likelihood Estimate)

Likelihood vs Probabilities

Posted by Rico's Nerd Cluster on February 13, 2017

Likelihood vs Probabilities

Probabilities describe the chances of discrete, mutually exclusive possible states, and these chances must sum to 1. Likelihoods describe the plausibility of non-mutually-exclusive, potentially infinite and continuous hypotheses. For example, suppose a robot can only be at positions {1, 2, 3} and observes a landmark at some distance. The robot has a probability distribution over its own position, P(X) = {p(x=1), p(x=2), p(x=3)}, and these probabilities sum to 1. Meanwhile, the robot has a likelihood function of the landmark observation, L(z|x). A slice of the likelihood function such as L(z|x=1), viewed as a function of z, could look like a bell curve: the landmark could be highly likely at multiple positions, and the likelihoods need not sum to 1.
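Below is a minimal sketch of this distinction. All numbers, the predicted distances, and the `gaussian_likelihood` helper are made up for illustration; only the contrast (probabilities sum to 1, likelihoods over hypotheses do not have to) comes from the discussion above.

```python
import numpy as np

# Probabilities over the robot's discrete positions {1, 2, 3}: must sum to 1.
position_probs = {1: 0.2, 2: 0.5, 3: 0.3}
assert abs(sum(position_probs.values()) - 1.0) < 1e-9

# Likelihood of one fixed landmark observation z under each position hypothesis x,
# assuming a Gaussian measurement model with standard deviation sigma.
def gaussian_likelihood(z, predicted_z, sigma=1.0):
    return np.exp(-(z - predicted_z) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

z_measured = 2.3                      # the observed landmark distance (illustrative)
predicted = {1: 1.0, 2: 2.0, 3: 3.0}  # predicted distance if the robot were at x
likelihoods = {x: gaussian_likelihood(z_measured, d) for x, d in predicted.items()}

# Likelihoods are evaluated across hypotheses, so their sum is not constrained to 1.
print(likelihoods, sum(likelihoods.values()))
```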

MLE (Maximum Likelihood Estimation) vs MAP (Maximum A Posteriori Estimation)

MLE (Maximum Likelihood Estimate) finds the state $x$ that maximizes the likelihood of the observed data $z$. It is purely data-driven and does NOT consider prior information:

\[\begin{gather*} \begin{aligned} & \underset{x}{\operatorname{argmax}} \ L(z|x) \end{aligned} \end{gather*}\]
  • In MLE, $x$ is unknown but fixed (so it is not a random variable)

MAP (Maximum A Posteriori) not only maximizes the likelihood, but also incorporates a prior over the state $x$. It is more stable than MLE and fits naturally in a Bayesian filter framework. When data is limited, MAP can be the better choice; when observations are abundant, the prior's influence diminishes (a small numerical comparison follows below):

\[\begin{gather*} \begin{aligned} & \underset{x}{\operatorname{argmax}} \ P(x|z) = \underset{x}{\operatorname{argmax}} \ L(z|x) P(x), \quad \text{since } P(x|z) \propto L(z|x) P(x) \end{aligned} \end{gather*}\]
  • In MAP, $x$ is treated as a random variable with a prior distribution $P(x)$
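Here is a minimal sketch contrasting the two estimates for a 1-D state with Gaussian measurement noise and a Gaussian prior. The measurement values, noise level, and prior parameters are illustrative assumptions, not taken from the post; the closed-form expressions used are the standard ones for the Gaussian-Gaussian case.

```python
import numpy as np

z = np.array([5.0, 8.0, 9.0])     # measurements (illustrative)
sigma = 1.0                       # measurement noise std dev (assumed)
prior_mean, prior_sigma = 6.0, 2.0  # Gaussian prior on x (assumed)

# MLE: maximize L(z|x). For i.i.d. Gaussian noise this is the sample mean.
x_mle = z.mean()

# MAP: maximize L(z|x) P(x). With a Gaussian prior and Gaussian likelihood the
# posterior is Gaussian, and its mode is the precision-weighted average.
precision_prior = 1.0 / prior_sigma**2
precision_data = len(z) / sigma**2
x_map = (precision_prior * prior_mean + precision_data * x_mle) / (precision_prior + precision_data)

print(f"MLE: {x_mle:.3f}, MAP: {x_map:.3f}")
# With few measurements the prior pulls MAP toward prior_mean; as more
# measurements arrive, precision_data grows and MAP approaches the MLE.
```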

The Log Trick While Working With Joint Multivariate Gaussian Distributions In MLE

\[\begin{gather*} \begin{aligned} & \underset{x, y}{\operatorname{argmax}} \ L(x) L(y) = \underset{x, y}{\operatorname{argmax}} \ \left( \log L(x) + \log L(y) \right) \end{aligned} \end{gather*}\]
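Since log is monotonic, taking logs does not change the argmax, but it matters numerically: a product of many small likelihoods underflows in floating point, while the sum of logs stays well-behaved. A minimal sketch with made-up values:

```python
import numpy as np

likelihoods = np.full(1000, 1e-5)      # 1000 factors, each 1e-5 (illustrative)

product = np.prod(likelihoods)         # underflows to 0.0
log_sum = np.sum(np.log(likelihoods))  # about -11513, perfectly representable

print(product, log_sum)
```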

Example of MLE

This example is inspired by this post (see References).

Assume now my robot is at an unknown location $\mu$. A landmark is at $x = 0$. The robot has 3 distance measurements to the landmark: 5m, 8m, 9m. We assume that the likelihood of each measurement follows a Gaussian noise model: $P(z|x) = \frac{1}{\sigma\sqrt{2 \pi}} e^{-\frac{(z - \mu)^2}{2 \sigma^2}}$

Therefore, the joint likelihood of having these measurements at the location $\mu$ is:

\[\begin{gather*} \begin{aligned} & P(z_1, z_2, z_3|x) = \frac{1}{\sigma\sqrt{2 \pi}} e^{-\frac{(5 - \mu)^2}{2 \sigma^2}} \cdot \frac{1}{\sigma\sqrt{2 \pi}} e^{-\frac{(8 - \mu)^2}{2 \sigma^2}} \cdot \frac{1}{\sigma\sqrt{2 \pi}} e^{-\frac{(9 - \mu)^2}{2 \sigma^2}} \end{aligned} \end{gather*}\]

Now, we are going to find the $\mu$ that maximizes this joint likelihood. We can do that by taking its partial derivative w.r.t. $\mu$ and setting it to 0. For ease of computation, we apply the log trick:

\[\begin{gather*} \begin{aligned} & f = \ln(P(z_1, z_2, z_3|x)) = 3 \ln\left(\frac{1}{\sigma\sqrt{2 \pi}}\right) - \left( \frac{(5 - \mu)^2}{2 \sigma^2} + \frac{(8 - \mu)^2}{2 \sigma^2} + \frac{(9 - \mu)^2}{2 \sigma^2} \right) \\ & \rightarrow \frac{\partial f}{\partial \mu} = \frac{(5 - \mu) + (8 - \mu) + (9 - \mu)}{\sigma^2} = 0 \\ & \rightarrow \mu = \frac{22}{3} \end{aligned} \end{gather*}\]
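A quick numerical check of this result, under the same assumptions (known $\sigma$, Gaussian noise): a brute-force search over candidate locations should land on the sample mean $22/3 \approx 7.33$. The search range and grid resolution below are arbitrary choices.

```python
import numpy as np

z = np.array([5.0, 8.0, 9.0])  # the three measurements from the example
sigma = 1.0                    # assumed known noise std dev

def log_likelihood(mu):
    # Joint Gaussian log-likelihood of all measurements at candidate location mu
    return np.sum(-0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi)))

# Brute-force search over candidate positions
mus = np.linspace(0.0, 15.0, 100001)
mu_hat = mus[np.argmax([log_likelihood(m) for m in mus])]

print(mu_hat, z.mean())  # both ~7.333 = 22/3
```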

References

https://blog.csdn.net/Leon_winter/article/details/86557024