ACEnet = discretised quantum field theory
A while ago I blogged about Markov chain Monte Carlo sampling from joint probability density functions, which is background material for my paper Discrete network dynamics. Part 1: Operator theory (DND1). I want to explain this connection in more detail here.
The purpose of DND1 is to introduce an operator framework for expressing MCMC algorithms that explore discrete state spaces, where the state space itself looks like the bins of a histogram, and each state is a particular joint occupancy of all of the histogram bins.
Special cases of this type of space are:
- A single discrete-valued quantity that takes values in the set {0, 1, 2, ...} corresponds to a single histogram bin with a variable occupancy.
- m binary-valued quantities, exactly one of which is 1 whilst the rest are 0, corresponds to m histogram bins with a single occupant in exactly one of the bins, whilst the rest of the bins are empty.
- m discrete-valued quantities correspond to m histogram bins each with the appropriate occupancy. This directly generalises case (1) above.
- If the occupancy in case (3) is constrained in any way, then the corresponding histogram bin occupancies are analogously constrained. Case (2) above is a special case of this.
- nm discrete-valued quantities can be factorised as n separate instances of m quantities, which thus corresponds to n separate histograms each having m bins. This sort of factorisation generalises in a straightforward way.
- A graph with n nodes, each containing m discrete values, corresponds to n histograms each having m bins. This is an example of case (5) above.
Assume that a single histogram with m=5 bins and n=10 occupants has a discrete-time evolution that is generated by hopping a single sample from one bin to another when the time index advances by one unit. A typical evolution might look like the diagram below.
In the diagram time advances from left to right, at each time instant the histogram state is represented as a bar chart, and at each advance in the time index the change in the histogram state is indicated by the red highlighted connection which shows where a sample has hopped from one bin to another.
In this example each hopping operation is generated by first annihilating a sample and then creating a sample. The annihilation operation chooses which sample to annihilate at random from the whole histogram, whereas the creation operation generates a sample in bin i with probability pi.
The update operator H that generates this is given by (using standard quantum field theory notation for bosonic creation and annihilation operators)
H ≡ ∑i pi ai† ∑j aj
where ∑j aj is a sum (from j = 1 to m) of annihilation operators that annihilates a sample at random, and ∑i pi ai† is a probability-weighted sum (from i = 1 to m) of creation operators that creates a sample in the required probability-weighted fashion. If H is applied to a histogram it produces an ensemble of histograms, whose members correspond to each of the possible outcomes of applying H. In the diagram above a single member of this ensemble is randomly chosen at each time step, using an appropriate probability weighting to take account of the pi factor.
In this case, the equilibrium mixture of histograms that is stationary under application of the update operator can be readily deduced. Each update is a random annihilation followed by a probability-weighted creation, which is overall a memoryless probability weighted hopping operation. Thus the equilibrium mixture of histograms is the same as is obtained by starting with an empty histogram (the "vacuum" state), and then randomly and independently placing n samples into its bins using the same probability weighting for selecting bins as above.
This probability-weighted placement of n samples into m bins is exactly the procedure that is used in the ACEnet self-organising encoder network, where ACEnet is a generalisation of a stochastic vector quantiser to the case of encodings that use more than one sample to represent the code.
So, the discrete-time evolution of a histogram under the action of the update operator H defined above has an equilibrium mixture of histograms that is the same as the distribution of output codes that is obtained from ACEnet.
This is an incredibly useful result, because it allows ACEnet to be replaced by a discrete-time dynamical system whose update operator H is built out of standard QFT creation and annihilation operators. Essentially, discretised QFT is a way of describing a MCMC algorithms, and ACEnet is a special case of a discretised QFT. All information processing in ACEnet can now be analysed using standard QFT techniques. This result is what motivated me to write of the DND1 paper.
Expressing ACEnet in the language of QFT has made it possible to use a lot of existing physical insight into and theoretical techniques from QFT in the analysis of ACEnet. Future papers in the DND series will focus on various developments based on these ideas.
0 Comments:
Post a Comment
<< Home