Quantified MaxEnt

Taking into account the noise $ \eta$ that is always present in experimental data, we have to rewrite (8) as

$\displaystyle A_i = \sum_{j=1}^{N} M_{ij} \hat{\rho}_j + \eta_i$ (9)
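As an illustration of (9), the following minimal sketch generates synthetic absorbance data from a known density. The kernel `M`, the density `rho_true` and the noise levels `sigma` are hypothetical toy values chosen for this example, not quantities from the experiment.

```python
import random

def forward(M, rho):
    """Noise-free absorbances: A_i = sum_j M_ij * rho_j."""
    return [sum(m_ij * r_j for m_ij, r_j in zip(row, rho)) for row in M]

def simulate(M, rho, sigma, rng):
    """Add independent zero-mean Gaussian noise eta_i with std. dev. sigma_i."""
    return [a + rng.gauss(0.0, s) for a, s in zip(forward(M, rho), sigma)]

# Hypothetical toy problem: N_d = 3 data points, N = 2 density values.
M = [[1.0, 0.0],
     [0.5, 0.5],
     [0.0, 1.0]]
rho_true = [2.0, 4.0]
sigma = [0.1, 0.1, 0.1]

A_clean = forward(M, rho_true)                            # first term of (9)
A_noisy = simulate(M, rho_true, sigma, random.Random(0))  # full right-hand side of (9)
```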

In order to infer $ \hat{\rho}$, we apply Bayesian probability theory to calculate the posterior probability $ p({\boldsymbol{\hat{\rho}}}\vert{\boldsymbol{A}},\mathcal{I})$ of a density distribution $ \boldsymbol{\hat{\rho}}$ given the absorbance data $ \boldsymbol{A}$. Using Bayes' theorem we obtain:

$\displaystyle p({\boldsymbol{\hat{\rho}}}\vert{\boldsymbol{A}},\mathcal{I}) = \frac{p({\boldsymbol{A}}\vert{\boldsymbol{\hat{\rho}}},\mathcal{I})\, p({\boldsymbol{\hat{\rho}}}\vert\mathcal{I})}{p({\boldsymbol{A}}\vert\mathcal{I})}.$ (10)

Assuming uncorrelated Gaussian noise with mean zero and variance $ \sigma_i^2$, the likelihood is given by:

$\displaystyle p({\boldsymbol{A}}\vert{\boldsymbol{\hat{\rho}}},\mathcal{I}) = \frac{1}{\prod_{i=1}^{N_d} \sqrt{2\pi\sigma_i^2}} \exp\left(-\frac{1}{2} \sum_{i=1}^{N_d} \left[\frac{A_i - \sum_j M_{ij} \hat{\rho}_j }{\sigma_i} \right]^2 \right)$ (11)
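The Gaussian likelihood (11) is most conveniently evaluated through its logarithm. The sketch below assumes the standard normalization of independent Gaussians. The function names and the toy numbers are illustrative only.

```python
import math

def chi2(A, M, rho, sigma):
    """Misfit: sum_i [(A_i - sum_j M_ij rho_j) / sigma_i]^2."""
    total = 0.0
    for a_i, row, s_i in zip(A, M, sigma):
        model = sum(m_ij * r_j for m_ij, r_j in zip(row, rho))
        total += ((a_i - model) / s_i) ** 2
    return total

def log_likelihood(A, M, rho, sigma):
    """Logarithm of (11): -sum_i log sqrt(2 pi sigma_i^2) - chi^2 / 2."""
    log_norm = sum(0.5 * math.log(2.0 * math.pi * s * s) for s in sigma)
    return -log_norm - 0.5 * chi2(A, M, rho, sigma)

# Hypothetical toy data: residuals of +1, -1 and 0 standard deviations.
M = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
A = [2.1, 2.9, 4.0]
sigma = [0.1, 0.1, 0.1]

misfit = chi2(A, M, [2.0, 4.0], sigma)          # approximately 2
ll_good = log_likelihood(A, M, [2.0, 4.0], sigma)
ll_bad = log_likelihood(A, M, [3.0, 3.0], sigma)  # worse fit, lower likelihood
```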

As prior we choose the MaxEnt prior (see e.g. [Siv96] for further discussion):

$\displaystyle p(\boldsymbol{\hat{\rho}}\vert\alpha,\boldsymbol{m},I) = \frac{(2...
...0 \sqrt{\hat{\rho}_1\cdots \hat{\rho}_N}} \, \textcolor{red}{\exp{(\alpha S)}},$ (12)

with hyperparameter $ \alpha$, default model $ \boldsymbol{m}$ and entropy $ S$ defined by:

$\displaystyle S = \sum_{j=1}^N \left[ \hat{\rho}_j - m_j - \hat{\rho}_j \log \frac{\hat{\rho}_j}{m_j} \right]$ (13)

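The default model $ \boldsymbol{m}$ is the MaxEnt solution in the limit of strong regularization $ (\alpha\rightarrow\infty)$, because the entropy (13) attains its maximum $ S = 0$ at $ \boldsymbol{\hat{\rho}} = \boldsymbol{m}$.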
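The role of the default model can be checked numerically: the entropy (13) vanishes at $ \boldsymbol{\hat{\rho}} = \boldsymbol{m}$ and is negative for any other positive density. A minimal sketch with hypothetical values for `m`:

```python
import math

def entropy(rho, m):
    """Entropy (13) of a positive density rho relative to the default model m."""
    return sum(r - mj - r * math.log(r / mj) for r, mj in zip(rho, m))

m = [1.0, 2.0, 3.0]                     # hypothetical default model
S_default = entropy(m, m)               # zero: rho equals the default model
S_other = entropy([2.0, 2.0, 1.0], m)   # negative: rho deviates from m
```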

In order to obtain the maximum posterior or MAP solution for fixed hyperparameter $ \alpha$, we have to maximize

$\displaystyle \exp\left(-\frac{\chi^2}{2} + \alpha S \right),$

where $ \chi^2 = \sum_i (A_i - \sum_j M_{ij}\hat{\rho}_j)^2/\sigma_i^2$ denotes the misfit. We ignore the denominator in (12), since we assume that it varies little compared to the exponential. The maximum is calculated numerically by Newton's method; for the details we refer to [Lin01].
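To make the maximization concrete, the following sketch applies a damped Newton iteration to $ -\chi^2/2 + \alpha S$. It is not the implementation of [Lin01]: the step halving that keeps the density positive, the start at the default model, the dense linear solver and the toy problem are all assumptions made for this example. The gradient with respect to $ \hat{\rho}_k$ is $ \sum_i M_{ik}(A_i - \sum_j M_{ij}\hat{\rho}_j)/\sigma_i^2 - \alpha\log(\hat{\rho}_k/m_k)$, and the negative Hessian is $ \sum_i M_{ik}M_{il}/\sigma_i^2 + \alpha\,\delta_{kl}/\hat{\rho}_k$.

```python
import math

def solve(B, g):
    """Solve B x = g by Gaussian elimination with partial pivoting."""
    n = len(g)
    aug = [row[:] + [g_k] for row, g_k in zip(B, g)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def objective(rho, A, M, sigma, m, alpha):
    """Exponent to be maximized: -chi^2/2 + alpha*S."""
    chi2 = sum(((a - sum(mij * r for mij, r in zip(row, rho))) / s) ** 2
               for a, row, s in zip(A, M, sigma))
    S = sum(r - mj - r * math.log(r / mj) for r, mj in zip(rho, m))
    return -0.5 * chi2 + alpha * S

def map_newton(A, M, sigma, m, alpha, tol=1e-8, max_iter=100):
    """Damped Newton iteration for the MAP density at fixed alpha."""
    n_d, n = len(A), len(m)
    rho = list(m)  # assumption: start from the default model
    for _ in range(max_iter):
        # Weighted residuals (A_i - sum_j M_ij rho_j) / sigma_i^2
        w = [(a - sum(mij * r for mij, r in zip(row, rho))) / s ** 2
             for a, row, s in zip(A, M, sigma)]
        grad = [sum(M[i][k] * w[i] for i in range(n_d))
                - alpha * math.log(rho[k] / m[k]) for k in range(n)]
        if max(abs(g) for g in grad) < tol:
            break
        # Negative Hessian: M^T diag(1/sigma^2) M + alpha * diag(1/rho)
        neg_h = [[sum(M[i][k] * M[i][l] / sigma[i] ** 2 for i in range(n_d))
                  + (alpha / rho[k] if k == l else 0.0)
                  for l in range(n)] for k in range(n)]
        step = solve(neg_h, grad)
        q_old = objective(rho, A, M, sigma, m, alpha)
        t = 1.0
        for _ in range(60):  # halve the step until rho stays positive and Q does not drop
            trial = [r + t * d for r, d in zip(rho, step)]
            if min(trial) > 0.0 and objective(trial, A, M, sigma, m, alpha) >= q_old:
                rho = trial
                break
            t *= 0.5
    return rho

# Hypothetical toy problem: noise-free data generated by rho = (2, 4).
M = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
A = [2.0, 3.0, 4.0]
sigma = [0.1, 0.1, 0.1]
m = [3.0, 3.0]
rho_map = map_newton(A, M, sigma, m, alpha=1.0)
```

For very small $ \alpha$ the result approaches the least-squares solution, and for very large $ \alpha$ it approaches the default model.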

Finally we have to specify the optimal hyperparameter $ \alpha$, which is fixed by the relation:

$\displaystyle \chi^2 = N_d,$ (14)

since we expect each data point $ A_i$ to deviate by $ \sigma_i$ from its true value on average. Starting with large $ \alpha$ (in order to ensure convergence of the Newton iterations), the hyperparameter is determined by interval bisection.
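The search for $ \alpha$ can be sketched as follows. To keep the example short we assume an identity kernel ($ M_{ij} = \delta_{ij}$) and positive data, so that the MAP condition decouples into the scalar equations $ (A_j - \hat{\rho}_j)/\sigma_j^2 = \alpha \log(\hat{\rho}_j/m_j)$, each solved by a one-dimensional Newton iteration. Bisecting in $ \log\alpha$, the factor 10 used to find the bracket, the starting value and the data are further assumptions of this sketch, not details taken from the text.

```python
import math

def map_component(a, s, mj, alpha):
    """1D Newton solve of (a - rho)/s^2 = alpha*log(rho/mj), started at mj."""
    rho = mj
    for _ in range(200):
        f = (a - rho) / s ** 2 - alpha * math.log(rho / mj)
        df = -1.0 / s ** 2 - alpha / rho
        step = f / df
        rho -= step
        if abs(step) <= 1e-14 * rho:
            break
    return rho

def chi2_of_alpha(alpha, A, sigma, m):
    """Misfit of the MAP solution at fixed alpha (identity kernel assumed)."""
    return sum(((a - map_component(a, s, mj, alpha)) / s) ** 2
               for a, s, mj in zip(A, sigma, m))

def find_alpha(A, sigma, m, alpha_start=1e6, rel_tol=1e-8):
    """Find alpha with chi^2(alpha) = N_d, starting from a large alpha."""
    n_d = len(A)
    hi = alpha_start
    if chi2_of_alpha(hi, A, sigma, m) <= n_d:
        raise ValueError("chi^2 <= N_d already at alpha_start")
    lo = hi
    for _ in range(400):  # lower alpha until chi^2 drops below N_d
        lo *= 0.1
        if chi2_of_alpha(lo, A, sigma, m) <= n_d:
            break
        hi = lo
    else:
        raise ValueError("no bracket found")
    while hi / lo > 1.0 + rel_tol:  # interval bisection in log(alpha)
        mid = math.sqrt(lo * hi)
        if chi2_of_alpha(mid, A, sigma, m) > n_d:
            hi = mid
        else:
            lo = mid
    return math.sqrt(lo * hi)

# Hypothetical toy data with a flat default model.
A = [1.2, 0.7, 3.1, 2.4, 0.9]
sigma = [0.2] * 5
m = [1.0] * 5
alpha_opt = find_alpha(A, sigma, m)
chi2_opt = chi2_of_alpha(alpha_opt, A, sigma, m)  # close to N_d = 5
```

The bracket exists here because the misfit grows monotonically with $ \alpha$, from a perfect fit for weak regularization to the misfit of the default model for strong regularization.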

Danilo Neuber 2003-10-03