Discussion about the use of self-organisation for automatically "programming" networks of processing nodes.

Sunday, February 26, 2006

Quantum field = tokens in bins

I wonder whether some people find the phrase "quantum field theory" off-putting, especially when it is used in the context of networks that are discrete time dynamical systems, such as is described in my paper Discrete network dynamics. Part 1: Operator theory.

Quantum field theory is defined in wikipedia as:

Quantum field theory (QFT) is the application of quantum mechanics to fields. It provides a theoretical framework, widely used in particle physics and condensed matter physics, in which to formulate consistent quantum theories of many-particle systems, especially in situations where particles may be created and destroyed.

This is a physicists' definition of QFT, which ignores the wider use of the very useful algebraic properties of QFT in the description of many-particle systems. In the discussion below I will focus on bosonic statistics only, where the algebraic properties are defined using commutation rather than anti-commutation relations. Also, I will assume that the theory is defined on a discrete lattice, rather than on a continuous background space.

Stripped down to its bare essentials, QFT is a theory of how to manipulate "tokens" (e.g. particles) that are stored in "bins" (e.g. states). Each configuration of tokens-in-bins defines a particular set of bin occupancies (e.g. pure state).

Creation and annihilation operators applied to the occupants of these bins causes tokens to be created or annihilated in exactly the way that one would intuitively expect:
  1. A creation operator adds a token to a bin (there is only one way of doing this to a bin), whereas an annihilation operator removes a token from a bin (the number of ways of doing this is equal to the number of tokens in the bin). This corresponds to the commutation relation [ai, aj†] = δi,j.
  2. An annihilation operator completely erases a bin if it is already empty of tokens. This corresponds to annihilation of the vacuum state ai│0> = 0.
  3. A weighted sum of products of creation operators produces a correspondingly weighted sum of bin occupancies (i.e. mixture state).

The above description of the properties of creation and annihilation operators does not mention Planck's constant, nor does it specify what weights to use in a weighted sum of operators. The algebraic structure of QFT is not specifically designed for a quantum theory in which Planck's constant appears, and where the weights are complex-valued probability-amplitudes. Rather, stripped down to its bare essentials, QFT is a very convenient algebraic structure for doing the combinatoric book-keeping associated with the creation and annihilation of tokens in bins, where the weights are real-valued probabilities.

So the algebra of QFT is perfect for describing the manipulation of tokens in bins. Many of the algebraic techniques of QFT that have been developed in the context of physics can be reused in the context of tokens-in-bins. For instance, Feynman diagrams are used in physics to represent algebraic expressions that are formed from operators that generate interlinked sets of particle creation and annihilation operations. These diagrams can also be used to describe interlinked sets of operations on tokens in bins.

So quantum field theory (bosons on a lattice) is entirely appropriate for describing networks that are discrete time dynamical systems, in which the basic update operation consists of the creation and annihilation of tokens in bins.

Saturday, February 04, 2006

ACEnet = discretised quantum field theory

A while ago I blogged about Markov chain Monte Carlo sampling from joint probability density functions, which is background material for my paper Discrete network dynamics. Part 1: Operator theory (DND1). I want to explain this connection in more detail here.

The purpose of DND1 is to introduce an operator framework for expressing MCMC algorithms that explore discrete state spaces, where the state space itself looks like the bins of a histogram, and each state is a particular joint occupancy of all of the histogram bins.

Special cases of this type of space are:

  1. A single discrete-valued quantity that takes values in the set {0, 1, 2, ...} corresponds to a single histogram bin with a variable occupancy.
  2. m binary-valued quantities, exactly one of which is 1 whilst the rest are 0, corresponds to m histogram bins with a single occupant in exactly one of the bins, whilst the rest of the bins are empty.
  3. m discrete-valued quantities correspond to m histogram bins each with the appropriate occupancy. This directly generalises case (1) above.
  4. If the occupancy in case (3) is constrained in any way, then the corresponding histogram bin occupancies are analogously constrained. Case (2) above is a special case of this.
  5. nm discrete-valued quantities can be factorised as n separate instances of m quantities, which thus corresponds to n separate histograms each having m bins. This sort of factorisation generalises in a straightforward way.
  6. A graph with n nodes, each containing m discrete values, corresponds to n histograms each having m bins. This is an example of case (5) above.

Assume that a single histogram with m=5 bins and n=10 occupants has a discrete-time evolution that is generated by hopping a single sample from one bin to another when the time index advances by one unit. A typical evolution might look like the diagram below.

In the diagram time advances from left to right, at each time instant the histogram state is represented as a bar chart, and at each advance in the time index the change in the histogram state is indicated by the red highlighted connection which shows where a sample has hopped from one bin to another.

In this example each hopping operation is generated by first annihilating a sample and then creating a sample. The annihilation operation chooses which sample to annihilate at random from the whole histogram, whereas the creation operation generates a sample in bin i with probability pi.

The update operator H that generates this is given by (using standard quantum field theory notation for bosonic creation and annihilation operators)

H ≡ ∑i pi ai† ∑j aj

where ∑j aj is a sum (from j = 1 to m) of annihilation operators that annihilates a sample at random, and ∑i pi ai† is a probability-weighted sum (from i = 1 to m) of creation operators that creates a sample in the required probability-weighted fashion. If H is applied to a histogram it produces an ensemble of histograms, whose members correspond to each of the possible outcomes of applying H. In the diagram above a single member of this ensemble is randomly chosen at each time step, using an appropriate probability weighting to take account of the pi factor.

In this case, the equilibrium mixture of histograms that is stationary under application of the update operator can be readily deduced. Each update is a random annihilation followed by a probability-weighted creation, which is overall a memoryless probability weighted hopping operation. Thus the equilibrium mixture of histograms is the same as is obtained by starting with an empty histogram (the "vacuum" state), and then randomly and independently placing n samples into its bins using the same probability weighting for selecting bins as above.

This probability-weighted placement of n samples into m bins is exactly the procedure that is used in the ACEnet self-organising encoder network, where ACEnet is a generalisation of a stochastic vector quantiser to the case of encodings that use more than one sample to represent the code.

So, the discrete-time evolution of a histogram under the action of the update operator H defined above has an equilibrium mixture of histograms that is the same as the distribution of output codes that is obtained from ACEnet.

This is an incredibly useful result, because it allows ACEnet to be replaced by a discrete-time dynamical system whose update operator H is built out of standard QFT creation and annihilation operators. Essentially, discretised QFT is a way of describing a MCMC algorithms, and ACEnet is a special case of a discretised QFT. All information processing in ACEnet can now be analysed using standard QFT techniques. This result is what motivated me to write of the DND1 paper.

Expressing ACEnet in the language of QFT has made it possible to use a lot of existing physical insight into and theoretical techniques from QFT in the analysis of ACEnet. Future papers in the DND series will focus on various developments based on these ideas.