Welcome back. In deriving the maximum likelihood and MAP estimates of the original image, we need to statistically model the original image and the additive noise by defining their probability density functions. These represent our prior knowledge about the original image and the noise, and they are therefore justifiably referred to as prior models. Then, based on Bayes' theorem, our prior knowledge is modified by the likelihood function, which represents the most probable image that gave rise to the observed data, to form the probability of the image given the observed degraded image, justifiably referred to as the posterior density function. If the likelihood function is maximized, in other words, if no image prior model is used, a maximum likelihood estimate of the original image results. If, on the other hand, the posterior is maximized, then a MAP estimate results. We derive the exact expressions for the maximum likelihood and MAP estimates for a Gaussian noise model and the so-called simultaneous autoregressive image prior. We see that familiar expressions result in this case; these are the expressions for the least squares and the constrained least squares or Wiener filters. So, let us now look into the details of this material.

Let us now discuss the Bayesian formulation of the image restoration problem. We will derive the maximum likelihood and MAP estimates under specific prior models, and we will say a few words about the hierarchical Bayesian approach.

We show here, in matrix-vector form, the degradation model we have been using all along, y = Hf + w. So, y is the observed image, H the known blurring matrix, f the original image, and w the additive noise. With Bayesian approaches, f is assumed to be a sample of a random field, and so is w, and therefore y becomes a sample of a random field as well. From here, clearly, I can write that w = y - Hf.

The standard model that has been used for the noise is the normal or Gaussian distribution, named after Gauss. This is the celebrated distribution that has found numerous applications, from the image processing we are discussing here to economics and beyond. So, this is the expression for the multivariate Gaussian distribution; it is the probability density function of y given f. Cww is the covariance matrix of the noise, and these vertical bars denote the determinant of the matrix. We assume that the image has N samples.

We can further simplify this model if we assume that, in addition to being Gaussian, the noise is also white. This means that the covariance matrix is equal to a constant times the identity matrix, Cww equal to beta to the minus one times the identity; therefore its determinant is equal to beta to the minus N, since again the image has N samples, and its inverse is simply beta times the identity. So the form this distribution takes is shown here. This symbol denotes that the left-hand side, the density function of y given f, is proportional to the expression on the right, and clearly here I have the inner product of y minus Hf transpose times y minus Hf, which gives rise to the norm squared.

So, this is the standard model that has been used almost exclusively, I would say, for this additive type of noise in Bayesian approaches, and we are going to proceed with it and see how it affects the calculations we are going to undertake.
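As a minimal numerical sketch of this setup, the fragment below simulates the degradation model y = Hf + w with white Gaussian noise and evaluates the log-likelihood up to a constant. The matrix H, the image f, and the precision beta here are illustrative placeholders, not quantities from the lecture slides.

```python
import numpy as np

# Minimal sketch: simulate y = H f + w with white Gaussian noise and evaluate
# log p(y | f) up to a constant. H, f, and beta are illustrative placeholders.
rng = np.random.default_rng(0)

N = 64                                      # number of image samples (vectorized)
f = rng.random(N)                           # "original image" as a vector
H = np.eye(N) + 0.1 * rng.random((N, N))    # stand-in for a blurring matrix
beta = 100.0                                # noise precision, so C_ww = (1/beta) I

w = rng.normal(scale=np.sqrt(1.0 / beta), size=N)   # white Gaussian noise
y = H @ f + w                                        # observed image

# White-noise case: log p(y | f) = -(beta / 2) * ||y - H f||^2 + constant
log_likelihood = -0.5 * beta * np.sum((y - H @ f) ** 2)
print(log_likelihood)
```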
As we discussed earlier, when we were talking about the deterministic restoration techniques, prior knowledge is of the utmost importance in solving restoration and recovery types of problems, since it restricts the solution space. So, with deterministic approaches, we assumed that the original image is smooth, and the way we did that was to constrain the energy of the image at high frequencies. We took a Laplacian, a high-pass filter, and filtered the image; we measured the energy at high frequencies and tried to make that quantity as small as possible. We can do a similar thing here by imposing the simultaneous autoregressive, or SAR as it is typically referred to, image prior. So the probability density function of the original image is proportional to a Gaussian, an exponential with this quantity in the exponent. If we expand the norm, I have this expression here: a zero-mean Gaussian with inverse covariance equal to alpha C transpose C. C, again, is a high-pass filter, such as the Laplacian, and one way to interpret this prior is to look again at the norm: when the energy at high frequencies is small, I have e raised to minus a small number, which is close to one. So images that are smooth, images that have low high-frequency energy, are more probable according to this model. We will look at other image prior models later on, but let us proceed with these two and see how we can find a restored image based on them, the model for the noise in the previous slide and the model for the image in this slide.

We show here the joint distribution between f and y, that is, the probability of f and y. According to the product rule of probability, it equals the conditional of y given f times the marginal distribution of f, the probability of f. Similarly, the joint between y and f is shown here according to the same product rule. Now, since these joints are equal, the right-hand sides are equal as well, and this gives rise to Bayes' theorem. So this expression is the expression of Bayes' theorem. This is the posterior, the probability of the unknown image given the data; it equals our prior belief or knowledge about the image times the likelihood function, divided by the probability of the data. So our prior is altered by the likelihood to give the posterior, and the likelihood tells us which f was most probably the one that gave rise to the observed data.

So, according to the maximum likelihood estimation approach, we find an estimate of the original image which maximizes this likelihood function; we find, again, the most probable original image that describes the data we observed. We also have the maximum a posteriori, or MAP, estimate, which maximizes the posterior. Since the denominator of the posterior is independent of f, it is not included in the optimization, so the MAP estimate is the argument which maximizes the likelihood times the prior. Both the maximum likelihood and the maximum a posteriori are among the most commonly used estimation approaches, and we will see next what estimates we obtain under the models we introduced earlier.
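Before moving to those estimates, here is a minimal sketch of the SAR prior and of how it enters the posterior through Bayes' theorem. In this sketch C is taken to be a one-dimensional discrete Laplacian acting on the vectorized image, and alpha and beta are illustrative precision values.

```python
import numpy as np

# Minimal sketch of the SAR image prior and of the (unnormalized) log-posterior.
# C is a 1-D discrete Laplacian here; alpha and beta are illustrative values.
N = 64
alpha, beta = 0.5, 100.0

# High-pass operator C: second-difference (Laplacian) matrix
C = -2.0 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)

def log_prior(f):
    """log p(f) up to a constant: -(alpha / 2) * ||C f||^2 (SAR prior)."""
    return -0.5 * alpha * np.sum((C @ f) ** 2)

def log_posterior(f, y, H):
    """log p(f | y) up to a constant: log-likelihood plus log-prior."""
    return -0.5 * beta * np.sum((y - H @ f) ** 2) + log_prior(f)

# Smoother images (less high-frequency energy) receive a larger prior value
rng = np.random.default_rng(1)
f_smooth, f_rough = np.linspace(0.0, 1.0, N), rng.random(N)
print(log_prior(f_smooth) > log_prior(f_rough))   # True
```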
Let us see now what the expression is for the maximum likelihood restored image under the Gaussian noise model that we introduced earlier. It is shown here again; this is the conditional of y given f, and I only keep the terms that are relevant to the optimization, which is why I use here the proportionality symbol. Cww to the minus one is the inverse covariance matrix of the noise; I do not yet use the assumption that the noise is white. So, according to the maximum likelihood estimation approach, we need to maximize this conditional probability.

Since there are exponentials involved in these Gaussian functions, a standard step is, instead of maximizing the likelihood directly, to maximize the log of the likelihood; since the log is a monotonically increasing function, the maximizing point does not change. So, I maximize the log of the likelihood, and if we take the log of the exponential, the exponent simply comes down, and therefore I need to maximize this expression. Now, maximizing a negative function is equivalent to minimizing the positive function, and therefore we need to minimize this norm; I have taken the inverse covariance and used it here as the weight of the norm, so I have to minimize this weighted norm. We have looked before at how to minimize such a norm, and if you follow the steps that we explained earlier and have used multiple times in these two weeks in which we cover restoration, you can easily find that the maximum likelihood estimate is given by this expression. This is the generalized inverse, in case the matrix is not invertible. Now, if the noise is white, then the covariance equals beta to the minus one times the identity, so beta to the minus one is actually the variance of the noise. In this case, if I substitute Cww in the expression above, I find that the maximum likelihood estimate is given by this expression. This is exactly the expression we obtained when we covered least squares restoration in a deterministic fashion, and this result holds for any matrix H. If the degradation is due to a linear spatially invariant system, then we know that H becomes block circulant, and furthermore we can take this expression to the discrete frequency domain; this is what we call the generalized inverse or the inverse filter. So we see that under the assumption of Gaussian noise, the result of the maximum likelihood restoration equals the result of least squares estimation in a deterministic fashion: here it is a weighted least squares, and here it is the unweighted least squares or the inverse filter that we developed early on when we started talking about restoration.

Let us look now at the expression we can obtain for the MAP-based restoration. We will use the noise model that we introduced earlier and also used for maximum likelihood, and for the image prior we will use the SAR model that we also introduced earlier. I am reminding you here that the inverse covariance of the image is simply equal to alpha C transpose C, something you can see right away if you expand this squared norm here. According to the MAP estimation formula, we are looking for the f, the restored image, that maximizes the log of the posterior. This is the numerator of the posterior; the denominator is independent of f. So I have the likelihood here and the prior. If we multiply the exponentials and take the log, the exponents come down, and instead of a maximization, as before, I carry out a minimization of the negative function. I have used the noise covariance here as a weight to this norm, so we have the sum of a weighted norm and an L2 norm; again, we have looked at this multiple times and in different versions. We know that we have to take the gradient of what is inside the brackets and set it equal to zero, and if we do so, using what we have learned and used multiple times in this class, we obtain that the MAP estimate is given by this expression. So this again involves the generalized inverse of this sum, and you see that, as expected, both the noise covariance and the parameter alpha of the prior are involved.
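Pulling the two results together, the closed-form estimates just derived can be sketched as follows. This is only an illustration: H, y, C, the inverse noise covariance, and alpha stand in for the quantities on the slides, and the pseudo-inverse plays the role of the generalized inverse mentioned above.

```python
import numpy as np

def ml_estimate(H, y, C_ww_inv):
    """Maximum likelihood: f_ML = (H^T C_ww^{-1} H)^+ H^T C_ww^{-1} y.

    With white noise (C_ww^{-1} = beta * I) the weights cancel and this
    reduces to ordinary least squares, (H^T H)^+ H^T y.
    """
    A = H.T @ C_ww_inv @ H
    return np.linalg.pinv(A) @ (H.T @ C_ww_inv @ y)

def map_estimate(H, y, C, C_ww_inv, alpha):
    """MAP with the SAR prior:
    f_MAP = (H^T C_ww^{-1} H + alpha C^T C)^{-1} H^T C_ww^{-1} y.
    """
    A = H.T @ C_ww_inv @ H + alpha * (C.T @ C)
    return np.linalg.solve(A, H.T @ C_ww_inv @ y)
```

With alpha set to zero the MAP expression falls back to the maximum likelihood one, which matches the observation that the prior is exactly what distinguishes the two estimates.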
Now, if the noise is white, then beta to the minus one is the noise variance, and if I substitute this into the expression we already obtained, we find this form for the MAP estimate. We see right away that this is the familiar constrained least squares solution that we obtained when we talked about deterministic restoration; instead of the ratio alpha over beta here, we had a single parameter, let us call it lambda, the regularization parameter or the Lagrange multiplier. We also know that the CLS filter is equivalent to the Wiener filter under the appropriate interpretation of the noise covariance and the image covariance; more specifically, if I substitute here this expression for the image covariance, then from here I obtain the same expression as the Wiener filter. So, in summary, under these two specific models for the noise and for our prior knowledge about the image, we see that the MAP restoration gives exactly the same result as the constrained least squares restoration and the Wiener restoration that we have covered before.
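For a linear spatially invariant blur, where H is block circulant, the white-noise MAP or constrained least squares solution can be sketched directly in the DFT domain, as below. The point spread function, the periodic Laplacian kernel used for C, and the ratio alpha over beta (playing the role of lambda) are illustrative assumptions; the PSF is assumed to be normalized so that its DC response is nonzero.

```python
import numpy as np

def cls_restore(y, psf, lam):
    """Sketch of the CLS / white-noise MAP filter in the DFT domain:
    F_hat = conj(H) * Y / (|H|^2 + lam * |C|^2), with C a periodic Laplacian.
    Assumes a normalized PSF (nonzero DC response) and lam > 0.
    """
    M, N = y.shape
    H = np.fft.fft2(psf, s=(M, N))            # frequency response of the blur

    lap = np.zeros((M, N))                    # periodic Laplacian kernel for C
    lap[0, 0] = -4.0
    lap[0, 1] = lap[1, 0] = lap[0, -1] = lap[-1, 0] = 1.0
    C_freq = np.fft.fft2(lap)

    Y = np.fft.fft2(y)
    F = np.conj(H) * Y / (np.abs(H) ** 2 + lam * np.abs(C_freq) ** 2)
    return np.real(np.fft.ifft2(F))

# lam = alpha / beta; small lam approaches the inverse filter (the ML result),
# larger lam favors smoother images, which are more probable under the SAR prior.
```

The design choice here mirrors the summary above: the same expression can be read as the MAP estimate, the constrained least squares filter with regularization parameter lambda, or, with the appropriate interpretation of the image covariance, the Wiener filter.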