ACEnetica

Discussion about the use of self-organisation for automatically "programming" networks of processing nodes.

Sunday, October 23, 2005

Stochastic vector quantiser

We can now generalise the basic self-organising network known as the vector quantiser by combining results from several earlier postings:

  1. In Basic self-organising network: vector quantiser I described a basic type of self-organising network (SON) called a vector quantiser (VQ), which does two simultaneous jobs: encoding, which takes each continuous-valued input vector x and uses an encoder k(x) to map it to a discrete-valued code k; and decoding, which takes each discrete-valued code k and uses a decoder x(k) to map it back to a continuous-valued vector in the input space. This creation of mappings to and from new "representation" spaces is a defining feature of SONs, and the VQ is the simplest example using non-linear mappings that I know of.
  2. In Vector quantiser simulation I included Mathematica code for simulating a VQ, which learnt to partition a 2-dimensional region into "code cells".
  3. In Bayes theorem I introduced Bayes theorem for manipulating joint probabilities (restated just after this list). This is the key result that makes it possible to do useful computations with joint probabilities, without being constrained to those special cases that admit analytic solutions.
  4. In Markov chain Monte Carlo sampling (MCMC) I illustrated how to use Bayes theorem to draw samples from a joint probability. This generates a Markov chain of samples that visits each joint state with a frequency proportional to its joint probability, with the caveat that the sequence thus generated is correlated because it has a memory of its past history.
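For reference, the form of Bayes theorem used throughout this posting relates the two factorisations of the joint probability of an input vector and a code (a restatement in this post's notation, not new material):

```latex
\Pr(x, k) = \Pr(x)\,\Pr(k \mid x) = \Pr(k)\,\Pr(x \mid k),
\qquad
\Pr(k) = \int dx \,\Pr(x)\,\Pr(k \mid x)
```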

What do these results give you when they are folded together? The basic result is item 4 above on MCMC sampling joint probabilities. If several MCMC updates are applied to a joint state (x, k) to create the state sequence (?, ?) → (x, ?) → (x, k) → (x′, k), where the MCMC updates alternate between x and k spaces, then the encoding/decoding process that occurs in a VQ can be mimicked, giving the encode/decode loop x → k → x′ that I described in Basic self-organising network: vector quantiser. More generally, if the MCMC updates alternate randomly, then the same VQ encoding/decoding process occurs, but less efficiently.
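As a concrete (and entirely illustrative) picture of this alternation, the Mathematica sketch below runs Gibbs-style MCMC updates on a small made-up joint probability table over discrete x and k states; the table, its size, and the function names are my own assumptions, not anything taken from the earlier postings.

```mathematica
(* Toy joint probability table Pr(x, k): 4 x-states (rows), 2 k-states (columns). Made-up numbers. *)
pXK = {{0.30, 0.05}, {0.20, 0.05}, {0.05, 0.15}, {0.05, 0.15}};

updateK[x_] := RandomChoice[pXK[[x]] -> Range[Length[pXK[[x]]]]]    (* sample k from Pr(k|x) *)
updateX[k_] := RandomChoice[pXK[[All, k]] -> Range[Length[pXK]]]    (* sample x from Pr(x|k) *)

(* One encode/decode-style pass: update k given x, then update x given the new k. *)
step[{x_, k_}] := Module[{kNew = updateK[x]}, {updateX[kNew], kNew}]

(* A short Markov chain of joint states, starting from a random joint state. *)
chain = NestList[step, {RandomInteger[{1, 4}], RandomInteger[{1, 2}]}, 10]
```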

This connection between VQ encoding/decoding and MCMC algorithms allows a stochastic generalisation of the VQ to be formulated. This is called a stochastic vector quantiser (SVQ), and its structure is shown in the diagram below.

where Pr(k│x) is the encoder's conditional probability over codes k given the input vector x, and Pr(x´│k) is the corresponding decoder's conditional probability.

Compare this diagram with the following two previous diagrams:

  1. The basic VQ that I showed in Basic self-organising network: vector quantiser. The only difference is the notation that is used to describe the "encode" and "decode" blocks of the diagrams.
  2. The first few updates of the MCMC simulation that I showed in Markov chain Monte Carlo sampling. Each pair of successive diagrams is potentially a stochastic encoding followed by a decoding, assuming that the MCMC algorithm happens to update the right hand space followed by the left hand space.

A slightly more appropriate version of the MCMC diagram would look like the diagram below, where a larger number of states has been used in each of the two spaces than before, and arrows have been drawn to show which space the MCMC algorithm happens to update at each step. In this case the first two steps behave like an SVQ encode/decode loop, because the arrows form the characteristic out-and-back form shown in the SVQ diagram above.

To encode you compute the encoder's conditional probability Pr(k│x), and to decode you then compute the decoder's conditional probability Pr(x´│k). Overall, this specifies the chain x → k → x´ leading from the original input vector, via the code, and then back to a reconstruction of the input vector. Each of the arrows in this chain is stochastic in an SVQ. This means that for each input vector x there is a whole ensemble of alternative possible codes k, each member of which is weighted by the corresponding conditional probability Pr(k│x), and one such member is randomly selected as the coded version of the input. Similarly, for each code k there is a whole ensemble of alternative possible reconstructions x´, each member of which is weighted by the corresponding conditional probability Pr(x´│k). This procedure is equivalent to two steps of an MCMC algorithm.
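To make the two stochastic steps concrete, here is a minimal Mathematica sketch of a single encode/decode pass x → k → x´. The particular parameterisation (a "softmax" encoder built from squared distances to a set of code vectors, and a Gaussian decoder centred on the selected code vector) is purely my own illustrative assumption; the SVQ formalism does not prescribe these forms.

```mathematica
(* Illustrative parameterisation (assumed, not from the post). *)
codeVectors = {{0.2, 0.2}, {0.8, 0.2}, {0.5, 0.8}};   (* one reference vector per code k *)
beta = 20.;      (* encoder "sharpness" (assumed) *)
sigma = 0.05;    (* decoder noise scale (assumed) *)

(* Encoder: Pr(k|x) as a softmax over squared distances to the code vectors. *)
encodeProb[x_] := Module[{w = Exp[-beta Map[Norm[x - #]^2 &, codeVectors]]}, w/Total[w]]

(* Stochastic encode: sample a code k from Pr(k|x). *)
encode[x_] := RandomChoice[encodeProb[x] -> Range[Length[codeVectors]]]

(* Stochastic decode: sample a reconstruction x´ from an isotropic Gaussian Pr(x´|k). *)
decode[k_] := codeVectors[[k]] + RandomVariate[NormalDistribution[0, sigma], 2]

(* One pass through the chain x -> k -> x´. *)
xIn = {0.25, 0.3};
kCode = encode[xIn];
xRec = decode[kCode];
{xIn, kCode, xRec}
```

Running the last four lines repeatedly for the same input gives different codes and reconstructions each time, which is exactly the "ensemble of alternatives" described above.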

The SVQ generalisation of the Euclidean VQ objective function is

D = ∫dx Pr(x) ∑k Pr(k│x) ∫dx´ Pr(x´│k) ║x - x´║²

This may be interpreted thus:

  1. The joint probability Pr(x, k, x´) of the chain x → k → x´ can be split up as Pr(x, k, x´) = Pr(x) Pr(k│x) Pr(x´│k).
  2. ║x - x´║² is the squared Euclidean distance between the original input vector and its reconstruction after passing along the chain x → k → x´.
  3. The ∫dx Pr(x) ∑k Pr(k│x) ∫dx´ Pr(x´│k) (...) sums and integrates over all of the alternative choices of the chain x → k → x´.
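The nested sums and integrals in D can be read directly as a Monte Carlo recipe: draw x from Pr(x), run the stochastic chain x → k → x´, and average the squared reconstruction error. The sketch below does this for the same assumed softmax/Gaussian parameterisation as in the earlier sketch, with Pr(x) taken (again, purely as an assumption) to be uniform on the unit square.

```mathematica
(* Same illustrative encoder/decoder as in the earlier sketch (assumed forms). *)
codeVectors = {{0.2, 0.2}, {0.8, 0.2}, {0.5, 0.8}};
beta = 20.; sigma = 0.05;
encodeProb[x_] := Module[{w = Exp[-beta Map[Norm[x - #]^2 &, codeVectors]]}, w/Total[w]]
encode[x_] := RandomChoice[encodeProb[x] -> Range[Length[codeVectors]]]
decode[k_] := codeVectors[[k]] + RandomVariate[NormalDistribution[0, sigma], 2]

(* One Monte Carlo sample of ||x - x´||^2 along the chain x -> k -> x´,
   with x drawn from an assumed uniform Pr(x) on the unit square. *)
sampleDistortion := Module[{x = RandomReal[{0, 1}, 2]}, Norm[x - decode[encode[x]]]^2]

(* Monte Carlo estimate of D as the average over many independent chains. *)
dEstimate = Mean[Table[sampleDistortion, {10000}]]
```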

By using Bayes theorem this objective function can be simplified to the following

D = 2 ∫dx Pr(x) ∑k Pr(k│x) ║x - x´(k)║²

where x´(k) is defined as

x´(k) ≡ (∫dx Pr(x) Pr(k│x) x) / (∫dx Pr(x) Pr(k│x))
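The post does not spell out the intermediate steps, so here is a sketch of how Bayes theorem produces the simplification, under the assumption (implicit in the symmetric way D is defined) that the decoder is the Bayes inverse of the encoder, i.e. Pr(x´│k) = Pr(x´) Pr(k│x´) / Pr(k) with Pr(k) = ∫dx Pr(x) Pr(k│x):

```latex
% Step 1: expand about x'(k); the cross term vanishes because
% \int dx'\,\Pr(x' \mid k)\,x' = x'(k) when the decoder is the Bayes inverse of the encoder.
\int dx'\,\Pr(x' \mid k)\,\lVert x - x'\rVert^{2}
  = \lVert x - x'(k)\rVert^{2}
  + \int dx'\,\Pr(x' \mid k)\,\lVert x' - x'(k)\rVert^{2}

% Step 2: average the second term over \Pr(x)\Pr(k \mid x) and apply Bayes theorem;
% it becomes identical to the average of the first term, which is where the factor of 2 comes from.
\int dx\,\Pr(x)\sum_{k}\Pr(k \mid x)\int dx'\,\Pr(x' \mid k)\,\lVert x' - x'(k)\rVert^{2}
  = \sum_{k}\Pr(k)\int dx'\,\Pr(x' \mid k)\,\lVert x' - x'(k)\rVert^{2}
  = \int dx\,\Pr(x)\sum_{k}\Pr(k \mid x)\,\lVert x - x'(k)\rVert^{2}
```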

Compare these expressions with the results in Basic self-organising network: vector quantiser to see that the SVQ and the VQ are mathematically very similar, except that in an SVQ the encoder is stochastic because it is described by Pr(k│x). The overall factor of 2 in the SVQ objective function is not important; it comes from the symmetrical way that I defined D for an SVQ. The VQ is a special case of an SVQ where Pr(k│x) = δk,k(x).

Overall, an SVQ is the same as a VQ, except that there is now some "slack" in the choice of how to encode each input vector. Also, a computational structure that is equivalent to an SVQ is embedded within an MCMC simulation of the joint probability Pr(x, k). It turns out that the MCMC viewpoint is ultimately the most useful for making further generalisations of the VQ and SVQ self-organising networks.
