One of the reasons for the increasing popularity of MAP inference is that there
is a large subclass of MAP problems that actually allows tractable inference.
Now, we've already seen one class of MAP problems that allows tractable inference,
which is the graphical models that have low tree width, for which you can apply
variable elimination or clique trees. But it turns out that besides that,
there's actually a rich set of other sub-classes that can be solved using
efficient algorithms, often based on ideas from computer science theory.
So let's look at some of these examples and see how we might go about solving
them. One such important class of problems is
that of correspondence, or data association, problems.
Here we have two classes of objects, in this case the red objects on the one
side and the blue objects on the other, and we'd like to find a correspondence,
or matching, between them. The fact that it's a matching means
that the assignment has to be one to one:
you can't have a single red dot match more than one blue dot, or vice versa.
So here, for example, is one such matching.
And we may additionally require that all the red dots be matched or all the blue
dots be matched, which can't be done in this case.
So what is the quality of a matching, and given the set of possible matchings,
how do we select between them? We have in the model a notion of the
quality of the match between i and j, which we denote in this case as the
parameter theta ij. And so our goal in this problem
is to find the highest-scoring matching, where the score is defined as the sum,
over the pairs that are matched, of the quality theta ij of the match, and we constrain the
variables x ij, the binary variables that indicate whether i is mapped to j,
via the mutual exclusion constraint that we discussed earlier.
Now, this type of problem is easily solved using algorithms for bipartite matching
developed in computer science theory.
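As a concrete illustration of this objective (my own sketch, not something from the lecture), here is what the bipartite matching step can look like in code. The theta values are made up for illustration; the solver is SciPy's standard Hungarian-algorithm routine for one-to-one assignment.

```python
# A minimal sketch of the matching objective above, with made-up numbers.
# theta[i, j] is the quality of matching red dot i to blue dot j; we want the
# one-to-one assignment x_ij maximizing sum_ij theta_ij * x_ij.
import numpy as np
from scipy.optimize import linear_sum_assignment

theta = np.array([[0.9, 0.1, 0.3],
                  [0.2, 0.8, 0.4],
                  [0.5, 0.6, 0.7]])

rows, cols = linear_sum_assignment(theta, maximize=True)
# Each pair (i, j) in zip(rows, cols) has x_ij = 1; all other x_ij are 0,
# which enforces the mutual exclusion constraint automatically.
print(list(zip(rows, cols)), theta[rows, cols].sum())
```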
And it turns out that, in the context of graphical models
or probabilistic inference, this type of problem has numerous
applications. So, for example, we might have, on the
one hand, say the red side, a set of sensor readings from a number
of, say, airplanes that we're trying to keep track of,
and on the blue side a set of airplanes or objects that
we've previously detected, and we'd like to figure out which sensor
reading corresponds to which object. Or we might have features in two
related images, and we'd like to figure out which
feature in image one matches which feature in image
two. Or, if we're trying to do some kind of
text analysis, we might have a set of entities,
say people or organizations, that we're trying to reason about or learn
something about, and we're trying to match mentions in the
text to those entities, subject to the constraint that each
mention corresponds to exactly one entity. So let's look at one example application
of this, which corresponds to one of the examples
mentioned on the previous slide, which is that of 3D cell
reconstruction. In this case we have over here a cell
whose 3D structure we are trying to reconstruct, and we are looking at various
slices through that cell taken from different angles via a microscope.
And so these are various two-dimensional projections of the 3D object, and we are
trying to figure out from them the three-dimensional structure.
Now, within the cell we might have these little tiny gold beads that were put into
the cell so that we can somehow register the different images of the cell to each
other. So here are examples of two such
real images, and these are the little gold beads; you can see them in the little
circles. What we would like to do is
decide that this bead over here is the same as that bead over there.
And once we have that correspondence, we can then compute the 3D reconstruction
using some volumetric reconstruction algorithms.
And in this case the matching weights, those theta ij's that we saw on the
previous slide, represent the similarity between this dot and that dot,
in terms of the location in the image
and in terms of the appearance of the neighborhood in the vicinity of that
point. A very different application of what is
effectively the same kind of idea is if we have a 3D mesh scan of an
object such as a human being, and we'd like to figure out how points on
this mesh match up to points on a somewhat related but different mesh,
whether it's a different pose or a different shape.
So here we would like to realize that this point over here maps to this point
over here, or that this point on this mesh maps to
this point on this mesh. And once again these matching weights,
the theta ij's, represent exactly the similarity of both location and, again,
the local neighborhood appearance. Another very commonly used class of
tractable MAP subproblems are those that involve what are called associative
potentials. Associative potentials, also called regular in some
cases, or attractive, are ones where the variables that are connected to each
other like to take the same values. So let's consider this example in the
context of binary-valued random variables, where we have a variable Xi
and a variable Xj that are connected to each other, and we have the on-diagonal
potentials, the (zero, zero) and the (one, one) entries, which
are the cases where Xi and Xj take the same value.
And we have the off-diagonal potentials, (zero, one) and (one, zero), where the variables
take on different values, and what we would like in an associative model is for
the (zero, zero) and the (one, one) cases to be preferred over the off-diagonal cases.
And that can be formulated using the following constraint, in which we require
that a plus d, the sum of the on-diagonal entries, be greater than or equal
to b plus c, the sum of the off-diagonal entries, so that in aggregate we prefer the on-diagonal
to the off-diagonal entries. It turns out that
if you have even an arbitrary network, with fully dense connectivity, over
binary variables, that uses only singleton potentials and
pairwise potentials that are associative (these are also called supermodular, which
is a term that comes from mathematics), then, if the pairwise potentials satisfy this
constraint that we saw over here, you can find an exact solution regardless
of the network connectivity. And that can be done using algorithms
from combinatorial optimization, from computer science theory, for finding
minimum cuts in graphs.
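To make the reduction a little more concrete, here is a small, hedged sketch of my own (not from the lecture) of the simplest associative case: binary variables with nonnegative agreement bonuses, solved exactly via an s-t minimum cut. The unary scores, the agreement weights, and the variable names are all made up for illustration; the min-cut routine is the standard networkx API.

```python
# Sketch: MAP for a small binary associative MRF via s-t minimum cut.
# score[i] = (score for x_i = 0, score for x_i = 1); agree[(i, j)] = bonus w >= 0
# added when x_i == x_j.  Maximizing the total score is equivalent to minimizing
# a cut in the graph built below (label 0 = source side, label 1 = sink side).
import networkx as nx

score = {0: (0.2, 1.0), 1: (0.9, 0.1), 2: (0.4, 0.6)}   # made-up unary scores
agree = {(0, 1): 0.8, (1, 2): 0.5}                       # made-up agreement bonuses

G = nx.DiGraph()
for i, (s0, s1) in score.items():
    c = max(s0, s1)                       # shift so edge capacities are nonnegative
    G.add_edge('s', i, capacity=c - s1)   # paid if i ends up labeled 1
    G.add_edge(i, 't', capacity=c - s0)   # paid if i ends up labeled 0
for (i, j), w in agree.items():
    G.add_edge(i, j, capacity=w)          # paid if i and j disagree
    G.add_edge(j, i, capacity=w)

cut_value, (source_side, sink_side) = nx.minimum_cut(G, 's', 't')
assignment = {i: 0 if i in source_side else 1 for i in score}
print(assignment)   # the exact MAP assignment; here all three variables agree
```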
For non-binary-valued random variables, exact solutions are no longer possible
in general, but it turns out that very simple
heuristics can get very, very high-quality answers very efficiently.
And so many related variants that are not necessarily binary admit either exact or
approximate solutions; specifically, we talked before about the class of metric
MRFs that occurs a lot in computer vision as well as in other applications.
And it turns out that for metric MRFs, even over non-binary variables, you
can use these very efficient algorithms.
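As a tiny illustration of the kind of pairwise term meant here (again my own sketch, not from the lecture), a truncated linear penalty on the difference between two neighboring labels is a commonly used metric potential; the truncation threshold and label values below are made up.

```python
# Sketch of a metric pairwise energy over non-binary labels: a truncated
# linear penalty on the difference between neighboring labels (e.g. depths).
# Small differences are penalized linearly, large jumps are capped, and the
# resulting potential satisfies the triangle inequality that metric-MRF
# algorithms rely on.
def truncated_linear(zi, zj, trunc=3.0):
    return min(abs(zi - zj), trunc)

print(truncated_linear(4, 5))    # 1.0: nearby labels are cheap
print(truncated_linear(4, 12))   # 3.0: big jumps cost at most the truncation
```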
So, one such example is in the context of depth reconstruction,
where we have two views of the same object taken with a slight disparity, say
using a stereo camera. So here's view one and view two of the
same scene. You can see that they're slightly
different from each other, and from them
we're trying to infer the following image, which tells us, for each pixel in the image,
what the actual depth is, that is, the Z dimension.
And one of the constraints that we like to enforce is that because of smoothness
in the image, the z value for a pixel is similar to the z value of an adjacent
pixel. And that allows us to clean up a lot of
noise that might arise if we just try to infer the depth of individual pixels by
themselves. And there are many other such examples in
computer vision that are used quite commonly and use this kind of associative
potentials. Examples include, for example, denoising
of images infilling of missing regions, and a whole variety of other tasks,
including foreground, background segmentation, and so on.
Another kind of factor that admits efficient solutions is a cardinality factor.
And this is a factor that in general can involve arbitrarily many binary variables
X1 up to XK, but we're going to require that it be
sparsely representable, in the sense that the score for an assignment to X1 up to
XK is a function of how many Xi's are turned on.
So to take an example, here is a factor like this over A, B, C, and D, and you can
see that we have only five different values for entries in this factor.
We have this pink entry over here, which occurs when the sum of the Xi's is zero.
We have this blue entry over here, which occurs when the sum of the entries is
one. And then we have the
purple one, which occurs when we have two Xi's on, the green one, which occurs when we have
three, and the dark blue one, which occurs when we have four.
Okay, so there are only five different possible values here, even though there are
potentially two to the fourth different combinations.
And it turns out that this actually occurs in quite a number of different
applications. When we talked about parity constraints, for example, in message
decoding, parity constraints are something that only looks at the number
of Xi's that are on. Or if you're doing image segmentation and you want
to have some kind of prior on the number of pixels in a given category, you might want
to say that in this image we expect the number of pixels
that are labeled sky to be between 30 and 70 percent.
This is potentially a factor over all pixels in the image,
but it only counts the total number of pixels.
And so again, it turns out to be efficient.
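To make the "counts only" structure concrete, here is a small sketch of my own (not from the lecture): with a single cardinality factor plus arbitrary unary scores, the best assignment with exactly k variables on always turns on the k variables with the largest unary gain, so the MAP can be found by sorting. All the numbers are made up.

```python
# Sketch: exact MAP for unary scores plus one cardinality factor.
# gain[i] = score_i(1) - score_i(0); card[k] = factor value when k variables are on.
import numpy as np

gain = np.array([0.7, -0.2, 0.4, 0.1])        # made-up unary gains
card = np.array([0.0, 0.5, 2.0, 0.5, -1.0])   # made-up values for counts 0..4

order = np.argsort(-gain)                      # variables sorted by decreasing gain
prefix = np.concatenate(([0.0], np.cumsum(gain[order])))
totals = prefix + card                         # best achievable total for each count k
best_k = int(np.argmax(totals))                # (up to the constant sum of score_i(0))

x = np.zeros(len(gain), dtype=int)
x[order[:best_k]] = 1                          # turn on the top-k variables
print(best_k, x)                               # here: 2 [1 0 1 0]
```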
And there are many other examples as well. A different one that also turns out to be
very useful in practice is the notion of sparse patterns.
So here again you could have an arbitrary number of variables X1 up to XK, but you
only specify a score for some small number of assignments.
So here, for example, is a factor again over A, B, C, and D, and now only
three entries in that factor get their own values; they don't
have to get the same value, but only these get an explicit score, and all of the ones that
are white have exactly the same score,
generally zero.
This is actually surprisingly useful in practice, because it's very useful to give
a higher score to combinations that occur in real data.
So, for example, if we were doing a spelling problem, you would want to give a
higher probability to whatever letter combinations occur in a dictionary.
And that's a finite number; it's bounded by the size of the dictionary.
And now, for all the combinatorially many letter combinations that don't occur
in the dictionary, you don't have to give them all a
separate weight, and so again it turns out to be a sparse factor.
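As a concrete (and entirely made-up) sketch of what such a sparse factor can look like in code, following the spelling example: only the assignments listed in a small dictionary get an explicit score, and every other combination shares one default score.

```python
# Sketch of a sparse-pattern factor over three letter variables: a handful of
# explicitly scored assignments, one shared default score for everything else.
# The words and scores are invented for illustration.
dictionary_scores = {('c', 'a', 't'): 2.0,
                     ('c', 'a', 'r'): 1.5,
                     ('d', 'o', 'g'): 1.8}
DEFAULT_SCORE = 0.0

def sparse_factor(assignment):
    return dictionary_scores.get(tuple(assignment), DEFAULT_SCORE)

print(sparse_factor(['c', 'a', 't']))   # 2.0: an explicitly listed pattern
print(sparse_factor(['c', 'x', 'q']))   # 0.0: one of the exponentially many others
```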
The same thing actually occurs in image analysis problems where, for
example, you can look at all the five-by-five image patches that actually occur in
natural images; that's actually a surprisingly small set of patches
relative to all possible 25-pixel combinations.
And so, once again, this is a very commonly used kind of factor
that arises in practice. A different factor that has tractable
properties is a convexity factor. Here we have a set of binary variables
X1 up to XK that are denoted by these blue dots over here.
But now they're ordered: so this is X1, this is X2, and this is XK.
And the convexity constraint says you can assign these variables the values one
or zero, but you have to assign them in a way
such that only a convex set is assigned the value
one. So, for example, you can have this set be
one and everything else be zero, or you can have this set be one and
everything else be zero, but you can't have a scattered
set of ones that are interspersed with the zeros.
And again, this occurs in a surprising number of applications.
You can look at parts in an image segmentation problem and say that the
parts should be convex. You can think about word labeling in text
and say that the words labeled with a certain part of speech, or a
certain segment that corresponds to a part in the parse tree, for example,
need to be contiguous in the sequence. Or if you're trying to label a video with a
bunch of sub-activities, you can say that each sub-activity needs to have a
coherent span, with a beginning and an end, for example.
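As a minimal sketch (my own illustration, not from the lecture), the constraint itself is easy to state in code: an assignment to the ordered binary variables is allowed only if its ones form a single contiguous block, which also means there are only on the order of K squared allowed assignments rather than two to the K.

```python
# Sketch: checking the convexity constraint on ordered binary variables.
def is_convex(x):
    """True if the ones in the ordered binary sequence x form one contiguous run."""
    ones = [i for i, v in enumerate(x) if v == 1]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)

print(is_convex([0, 1, 1, 1, 0, 0]))   # True: a single block of ones
print(is_convex([1, 0, 1, 0, 0, 0]))   # False: scattered ones are not allowed
```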
So, to summarize, there are many specialized models that admit a MAP
solution using tractable algorithms. And a surprising number of those do not
actually have tractable algorithms for computing marginals.
These specialized models are useful in many cases on their own, like the
matching problem that I showed, but also, as we'll
see in a minute, as a component in a larger model with other types of factors.
And the fact that this, as a sub-model, has a tractable solution will turn out
to be very useful as well.