<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-13056860</id><updated>2012-01-19T01:50:22.674Z</updated><title type='text'>ACEnetica</title><subtitle type='html'>Discussion about the use of self-organisation for automatically "programming" networks of processing nodes.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Stephen Luttrell</name><uri>http://www.blogger.com/profile/11094835879740297834</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/-D5M50onAlc4/Txd1bXcHtYI/AAAAAAAAAB0/Gc_52Is1k88/s220/me.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>36</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-13056860.post-3422204496533552155</id><published>2007-08-25T20:35:00.000Z</published><updated>2007-08-25T20:36:18.365Z</updated><title type='text'>This blog is now closed</title><content type='html'>Please go to &lt;a href="http://stephenluttrell.blogspot.com/"&gt;The Spline&lt;/a&gt; for future blog postings.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-3422204496533552155?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/3422204496533552155/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=3422204496533552155' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/3422204496533552155'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/3422204496533552155'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2007/08/this-blog-is-now-closed.html' title='This blog is now closed'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-7394419139315253362</id><published>2007-01-26T22:42:00.000Z</published><updated>2007-01-26T22:52:36.243Z</updated><title type='text'>Commentary</title><content type='html'>&lt;a href="http://en.wikipedia.org/wiki/Jorge_Luis_Borges"&gt;Jorge Luis Borges&lt;/a&gt; wrote:&lt;br /&gt;&lt;br /&gt;&lt;em&gt;The composition of vast books is a laborious and impoverishing extravagance. To go on for five hundred pages developing an idea whose perfect oral exposition is possible in a few minutes! A better course of procedure is to pretend that these books already exist, and then to offer a résumé, a commentary.&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Well, that should save a bit of time!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-7394419139315253362?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/7394419139315253362/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=7394419139315253362' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/7394419139315253362'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/7394419139315253362'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2007/01/commentary.html' title='Commentary'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-116966668604844909</id><published>2007-01-24T19:04:00.000Z</published><updated>2007-01-24T20:19:39.713Z</updated><title type='text'>Discrete network dynamics. Part 1: Operator theory (update 2)</title><content type='html'>I need to update what is going on with my paper "Discrete network dynamics. Part 1: Operator theory", which can be found in arXiv at &lt;a href="http://arxiv.org/abs/cs/0511027"&gt;cs.NE/0511027&lt;/a&gt;. Yes, it was that long ago and the paper is &lt;em&gt;still&lt;/em&gt; not published, nor do I know &lt;em&gt;how&lt;/em&gt; I can get it published. I previously blogged about this &lt;a href="http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-part-1.html"&gt;here&lt;/a&gt;, &lt;a href="http://acenetica.blogspot.com/2006/01/information-processing-researchers.html"&gt;here&lt;/a&gt;, &lt;a href="http://acenetica.blogspot.com/2006/03/discrete-network-dynamics-part-1.html"&gt;here&lt;/a&gt;, and &lt;a href="http://acenetica.blogspot.com/2006/04/information-processing-researchers.html"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I have been getting rather evasive feedback from various people about this paper, and I don't like the sound of what I hear at all. In a nutshell, there is a widespread view that my use of quantum field theory is misguided and/or wrong.&lt;br /&gt;&lt;br /&gt;One critic said that I didn't understand quantum mechanics, which I thought was rather odd given that my PhD is in quantum chromodynamics! That comment immediately told me that he had &lt;em&gt;not&lt;/em&gt; read my paper, or at least had not read it very carefully, because I &lt;em&gt;don't&lt;/em&gt; actually use quantum mechanics in the paper, as least not in the sense that physicists use it (i.e. the sense in which the criticism was made). In my paper I explain how I use operators to implement the elementary processes of adding samples to and removing samples from histogram bins, and that these operators have &lt;em&gt;exactly&lt;/em&gt; the same algebraic properties as the bosonic creation and annihilation operators that are used in physics. To this extent, what I am doing is &lt;em&gt;algebraically&lt;/em&gt; equivalent to a quantum field theory, and I explained this in greater detail in an earlier posting &lt;a href="http://acenetica.blogspot.com/2006/02/quantum-field-tokens-in-bins.html"&gt;here&lt;/a&gt;. What I am certainly &lt;em&gt;not&lt;/em&gt; doing is using a representation of these operators in which they act on an underlying wave function, because there &lt;em&gt;isn't&lt;/em&gt; any wave function defined in my samples-in-histogram-bins model. My creation and annihilation operators should be thought of as little pieces of algorithm whose effect is to add and remove samples from histogram bins.&lt;br /&gt;&lt;br /&gt;All I use is operators for creating and annihilating quanta; the model is defined at the level of these quanta, and there is no deeper level of theory. This is where the QFT that I use is a &lt;em&gt;subset&lt;/em&gt; of the type of QFT that physicists are familiar with; I make use of only the combinatoric properties of field quanta, which are elegantly summarised in the algebraic properties of the corresponding creation and annihilation operators. I also use a lattice QFT, rather than a continuum QFT, but I assume that the difference between these two is relatively benign (&lt;em&gt;pace&lt;/em&gt;, mathematical purists!). Because physicists are exposed to QFT in (relativistic) particle physics and in (non-relativistic) condensed matter physics, they naturally &lt;em&gt;assume&lt;/em&gt; that QFT needs an underlying wave function with the usual quantum mechanical behaviours that they have been taught about. This is a very narrow definition of QFT, because you can have a QFT whose quanta are built in any way that you want.&lt;br /&gt;&lt;br /&gt;One can manipulate these creation and annhilation operators by using their algebraic properties to rearrange "operator products" in various ways, and thus break operator expressions apart into a sum of contributions with different combinatoric properties, each weighted by a combinatoric factor that is automatically generated by the algebraic manipulations. These manipulations have the same general structure as the sorts of manipulation that occur in "operator product expansion" calculations in high energy physics; my PhD dissertation is full of this sort of calculation, and some relevant papers that I contributed to can be found by doing a search of the SPIRES database &lt;a href="http://www.slac.stanford.edu/spires/find/hep/www?rawcmd=FIND+AUTHOR+S+P+LUTTRELL"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;All the above leaves me no choice but to describe what I write about in my paper "Discrete network dynamics. Part 1: Operator theory" as a "quantum field theory".&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-116966668604844909?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/116966668604844909/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=116966668604844909' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/116966668604844909'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/116966668604844909'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2007/01/discrete-network-dynamics-part-1.html' title='Discrete network dynamics. Part 1: Operator theory (update 2)'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-116717701816811394</id><published>2006-12-26T23:44:00.000Z</published><updated>2006-12-26T23:52:10.283Z</updated><title type='text'>Four preconditions for civilisation?</title><content type='html'>If you are wondering why there is no news on the ACEnetica front, it's because none of the &lt;a href="http://luttrellica.blogspot.com/2006/12/four-preconditions-for-civilisation.html"&gt;Four preconditions for civilisation&lt;/a&gt; is currently being met.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-116717701816811394?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/116717701816811394/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=116717701816811394' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/116717701816811394'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/116717701816811394'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2006/12/four-preconditions-for-civilisation.html' title='Four preconditions for civilisation?'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-115679241860883061</id><published>2006-08-28T18:59:00.000Z</published><updated>2007-01-07T12:29:27.213Z</updated><title type='text'>Burden of proof</title><content type='html'>There is an article by &lt;a href="http://en.wikipedia.org/wiki/Marcus_du_Sautoy"&gt;Marcus du Sautoy&lt;/a&gt; entitled &lt;a href="http://www.newscientist.com/channel/fundamentals/mg19125661.400-mathematics-the-burden-of-proof.html"&gt;Burden of Proof&lt;/a&gt; in this week's New Scientist, which discusses the issue of whether a computer generated mathematical proof can be considered to actually be a proof, especially since some such proofs are so lengthy that they cannot be checked by a human. An example of this type of proof is the &lt;a href="http://en.wikipedia.org/wiki/Four_color_theorem"&gt;4-colour map theorem&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;My own view on this issue is that a computer generated proof has &lt;em&gt;exactly&lt;/em&gt; the same status as a human generated proof. The difference is only one of the degree of assistance provided to the brain of the human to help with the generation of the proof. A totally unassisted human would have to somehow do the whole proof mentally, which severely limits the length of proofs that are accessible. A human with the typical assistance that is allowed in an examination room (i.e. pen and paper) has the luxury of at least being able to write things down, which allows &lt;em&gt;much&lt;/em&gt; longer proofs to be reliably generated. The mechanics of generating a proof then reduce to using well-defined rules to manipulate symbolic expressions, where pen and paper are used as a medium for representing these symbols, and the rules are implemented in the human brain.&lt;br /&gt;&lt;br /&gt;The degree of assistence in generating a proof can be taken one stage further by using a computer to implement some or all of the rules for manipulating the symbolic expressions, rather than implementing all of the rules in the human brain. This seems to be a fairly radical step to take, because hitherto the only part of the proof that was "outside" the human brain was its "dumb" representation using pen and paper, whereas the "clever" bit involving the implementation of rules to manipulate this representation was "inside" the human brain.&lt;br /&gt;&lt;br /&gt;Let us consider what these rules of manipulation actually are. Effectively, they define a procedure for taking an initial expression constructed out of symbols, and repeatedly operating on it using the rules to eventually generate the required final expression. The cleverness is in the &lt;em&gt;construction&lt;/em&gt; of the set of rules, which is where a human is the best source of the cleverness needed to create the rules. There is &lt;em&gt;no&lt;/em&gt; cleverness in the repeated &lt;em&gt;application&lt;/em&gt; of these rules; all that is required is that their application is done &lt;em&gt;reliably&lt;/em&gt;, which is where a computer is the best approach, especially if the proof has many steps.&lt;br /&gt;&lt;br /&gt;Use a human to &lt;em&gt;define&lt;/em&gt; the rules of manipulation, and use a computer to &lt;em&gt;implement&lt;/em&gt; these rules. This approach seems to me to be entirely uncontroversial, and it is exactly how computer generated proofs are done. Note that software bugs in the computer part of the proof are dealt with in an &lt;em&gt;analogous&lt;/em&gt; way to "software" bugs in human part of the proof, i.e. try a variety of approaches on a variety of platforms.&lt;br /&gt;&lt;br /&gt;Interestingly, I wrote a paper (see &lt;a href="http://www.luttrell.org.uk/papers/combine2/combine2.html"&gt;here&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Luttrell S P, 1999, Self-organised modular neural networks for encoding data, in Combining Artificial Neural Networks: Ensemble and Modular Multi-Net Systems (Perspectives in Neural Computing, Springer-Verlag, ed. Sharkey A J C), pp. 235-263&lt;br /&gt;&lt;/em&gt;&lt;br /&gt;in which I used a computer (running the symbolic algebra program &lt;em&gt;&lt;a href="http://www.wolfram.com/"&gt;Mathematica&lt;/a&gt;&lt;/em&gt;) to help me derive some results on the optimisation of vector quantisers so that their Euclidean distortion was minimised, and this optimisation was done &lt;em&gt;non&lt;/em&gt;-parametrically in order to discover the globally optimal solution. The only way that I have found to do this type of optimisation involves intermediate expressions that can have a &lt;em&gt;large&lt;/em&gt; number of terms, but which eventually leads to results that can be quite &lt;em&gt;compact&lt;/em&gt;, and which are thus interesting. The only sensible way of handling this type of derivation is to automate it using a computer to implement the rules of manipulation (i.e. using symbolic algebra). As far as I can tell other people have &lt;em&gt;not&lt;/em&gt; taken up my idea of using symbolic algebra to derive this type of result; my paper went down like a lead balloon, which is a great pity.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Update:&lt;/strong&gt; OK, I can see that the &lt;em&gt;Mathematica&lt;/em&gt; notebook in which I derive the results that I published in the above paper (see &lt;a href="http://www.luttrell.org.uk/papers/combine2/combine2.html"&gt;here&lt;/a&gt;) has now been downloaded 22 times (as of 31 December 2006). It would be interesting to hear your comments on it. Maybe you think it is just a load of algebraic twiddling which is not much use in the real world, but that conclusion would ignore the fact that &lt;em&gt;exact&lt;/em&gt; algebraic solutions to toy problems are a powerful source of insight for interpreting &lt;em&gt;approximate&lt;/em&gt; numerical solutions to real world problems.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-115679241860883061?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/115679241860883061/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=115679241860883061' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/115679241860883061'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/115679241860883061'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2006/08/burden-of-proof.html' title='Burden of proof'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-115367743090073334</id><published>2006-07-23T18:50:00.000Z</published><updated>2006-07-23T17:57:10.910Z</updated><title type='text'>Holiday</title><content type='html'>I will be on holiday in the south-west of England for the next 2 weeks, and I won't be able to access the internet, so any comments will remain unanswered until I return.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-115367743090073334?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/115367743090073334/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=115367743090073334' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/115367743090073334'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/115367743090073334'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2006/07/holiday.html' title='Holiday'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-115221401036082182</id><published>2006-07-07T20:08:00.000Z</published><updated>2006-07-07T19:12:04.836Z</updated><title type='text'>Modelling the probability density of Markov sources</title><content type='html'>I have just uploaded a paper to arXiv &lt;a href="http://arxiv.org/abs/cs.NE/0607019"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Title:&lt;/strong&gt; Modelling the probability density of Markov sources&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; This paper introduces an objective function that seeks to minimise the average total number of bits required to encode the joint state of all of the layers of a Markov source. This type of encoder may be applied to the problem of optimising the bottom-up (recognition model) and top-down (generative model) connections in a multilayer neural network, and it unifies several previous results on the optimisation of multilayer neural networks.&lt;br /&gt;&lt;br /&gt;I wrote this paper about a decade ago, and submitted it to Neural Computation on 2 February 1998, but in effect it was not accepted for publication because the referees asked for changes to be made to the paper that I thought (and still think) were unreasonable. Apart from minor reformatting changes, this arXiv version of the paper is &lt;em&gt;identical&lt;/em&gt; to the version that I submitted to Neural Computation.&lt;br /&gt;&lt;br /&gt;The paper contains material that is based on a preprint that I wrote at the &lt;a href="http://www.newton.cam.ac.uk/programmes/NNM/"&gt;Neural Networks and Machine Learning Scientific Programme&lt;/a&gt; at the &lt;a href="http://www.newton.cam.ac.uk/"&gt;Isaac Newton Institute&lt;/a&gt; in 1997. The Newton Institute preprint number is NI97039, and it is available online on this &lt;a href="http://www.newton.cam.ac.uk/preprints97.html"&gt;page of 1997 preprints&lt;/a&gt;, or can be accessed directly &lt;a href="http://www.newton.cam.ac.uk/preprints/NI97039.pdf"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The main idea in the paper is to optimise a multi-layer density-modelling network so that the &lt;em&gt;joint&lt;/em&gt; probability of the state of &lt;em&gt;all&lt;/em&gt; of its layers has desirable properties, rather than for the &lt;em&gt;marginal&lt;/em&gt; probability of the state of &lt;em&gt;only&lt;/em&gt; its input layer to have desirable properties. In effect, all of the layers of the network are put on an equal footing, rather than selecting the input layer to be treated differently from the rest of the layers. The approach to density modelling described in this paper turns out to unify various results that I had published in several of my earlier papers.&lt;br /&gt;&lt;br /&gt;The objective function for optimising the joint probability of the state of all the layers of a network is &lt;em&gt;almost&lt;/em&gt; the same as the one that is used to optimise a &lt;a href="http://en.wikipedia.org/wiki/Helmholtz_machine"&gt;Helmholtz machine&lt;/a&gt;, but it &lt;em&gt;omits&lt;/em&gt; the so-called "bits-back" term that is used by Helmholtz machines, so it penalises the use of &lt;em&gt;distributed&lt;/em&gt; codes whilst encouraging the use of &lt;em&gt;sparse&lt;/em&gt; codes in the hidden layers of the network. At the &lt;a href="http://www.newton.cam.ac.uk/programmes/NNM/"&gt;Neural Networks and Machine Learning Scientific Programme&lt;/a&gt; in 1997, I was told by Geoffrey Hinton (who is one of the originators of the Helmholtz machine) that they had looked at what happened if they omitted the bits-back term, and they had concluded that it didn't lead anywhere interesting! I found this remark quite amusing because much of my useful research output had derived from omitting the bits-back term. Also, with the decade of additional insight that I have accumulated since 1997, I can now see that &lt;em&gt;joint&lt;/em&gt; (rather than &lt;em&gt;marginal&lt;/em&gt;) probability optimisation is the &lt;em&gt;key&lt;/em&gt; to obtaining useful results in multi-layer density-modelling networks. Watch this space, as they say!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-115221401036082182?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/115221401036082182/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=115221401036082182' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/115221401036082182'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/115221401036082182'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2006/07/modelling-probability-density-of.html' title='Modelling the probability density of Markov sources'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-115057158751433502</id><published>2006-06-17T19:07:00.000Z</published><updated>2006-08-15T19:13:04.346Z</updated><title type='text'>Discrete network dynamics stirs</title><content type='html'>After the setback where my DND1 paper &lt;a href="http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-part-1.html"&gt;Discrete network dynamics. Part 1: Operator theory&lt;/a&gt; (which can be found in arXiv at &lt;a href="http://arxiv.org/abs/cs/0511027"&gt;cs.NE/0511027&lt;/a&gt;) was rejected (see &lt;a href="http://acenetica.blogspot.com/2006/04/information-processing-researchers.html"&gt;here&lt;/a&gt; for some remarks about this problem, and in the comments &lt;a href="http://acenetica.blogspot.com/2006/01/information-processing-researchers.html"&gt;here&lt;/a&gt; for some less diplomatic remarks), I have finally started the ball rolling again by actually sitting down to prepare the draft of the &lt;em&gt;second&lt;/em&gt; paper in the series (i.e. DND2).&lt;br /&gt;&lt;br /&gt;I have published a &lt;em&gt;lot&lt;/em&gt; of material on vector quantisers of various types, but unfortunately the common underlying thread behind all of this work is not at all clear from my papers on the subject. Historically, I started working on VQs in the mid-1980's, and I was a refugee from Markov random field models which were too computationally costly for the rather slow computers that we used back then. I have always had an MRF style of thinking (i.e. joint probabilities, Bayesian inference, etc), and VQs offered me a computationally cheap and cheerful way of using &lt;em&gt;some&lt;/em&gt; of the properties of MRF models.&lt;br /&gt;&lt;br /&gt;One of the many reasons why I introduced (in DND1) the operator framework for formulating MCMC algorithms was to unify all of my past work on VQs. Therefore the provisional title of DND2 will be "Discrete network dynamics. Part 2: Vector quantisers". I don't know how long DND2 will be in gestation, because I usually get side-tracked by some new angle that occurs to me whilst I am writing, which then explodes into a whole new line of research that diverts me entirely away from the original task. I will try to avoid that pitfall, but I can't promise that it won't happen.&lt;br /&gt;&lt;br /&gt;Anyway, I thought that you should know that DND is alive and well (even if the reviewers of DND1 didn't think so!).&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Update:&lt;/strong&gt; In my spare time I have been busy "evolving" my DND2 paper. I can't just write down a paper in its final form, rather I have to evolve it from nothing until it reaches a fixed point. As predicted, during this process I have been side-tracked, but it has been onto fertile ground which directly relates to DND2, and which pulls together all of my vector quantiser work even more nicely than I had originally anticipated. As long as 20 years ago I knew all of the operator techniques that I am using in DND2, so you can imagine how annoyed I feel about the circuitous path my research has taken during the intervening years. My evolution of DND2 continues apace, and I will keep you updated as this process converges towards a fixed point.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Update:&lt;/strong&gt; I have just returned from a long Summer holiday during which I thought about my DND2 paper for precisely 0% of the time. Now I have to try to remember where I was in the process of writing DND2, so its gestation will be fairly long as I hinted at in my original posting.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-115057158751433502?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/115057158751433502/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=115057158751433502' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/115057158751433502'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/115057158751433502'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2006/06/discrete-network-dynamics-stirs.html' title='Discrete network dynamics stirs'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-114607607751328058</id><published>2006-04-26T18:07:00.000Z</published><updated>2006-04-26T18:40:45.666Z</updated><title type='text'>Information processing researchers don't like quantum field theory? (update)</title><content type='html'>It has now been officially confirmed. Information processing researchers don't like quantum field theory. My paper &lt;a href="http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-part-1.html"&gt;Discrete network dynamics. Part 1: Operator theory&lt;/a&gt; (which can be found in arXiv at &lt;a href="http://arxiv.org/abs/cs/0511027"&gt;cs.NE/0511027&lt;/a&gt;) has been rejected by the &lt;a href="http://www.jmlr.org/"&gt;Journal of Machine Learning Research&lt;/a&gt;. I presume that I should not quote verbatim from the rejection letter. However, the gist of the reasons for rejection were based on reports from two reviewers:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The need to include more citations to related work. [This is entirely reasonable. I had overlooked some prior work that didn't use the right key words to be caught by my literature search.]&lt;/li&gt;&lt;li&gt;The need to include a proof of the convergence to the correct joint PDF when &lt;em&gt;multiple&lt;/em&gt; iterations of the update operator were used. [I had already proved that a &lt;em&gt;single&lt;/em&gt; iteration of the update operator was exactly equivalent to a standard MCMC update, and we already know the convergence properties after multiple iterations of a standard MCMC update, so I don't actually need to provide any proof. There is nothing new to prove!]&lt;/li&gt;&lt;li&gt;The need to supply explicit examples of MCMC algorithms generated by use of this approach. [The whole section on ACEnet is such an example. I &lt;em&gt;could&lt;/em&gt; supply more examples, but the paper is already quite long, and I intend to publish these in future papers in the series.]&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;The paper was also sent to a third reviewer who gave an "informal" report which included the statement that my creation and annihilation operators were curious because they anti-commuted. This is news to me, so I invite anyone to point out where I use anti-commuting operators in my paper.&lt;/p&gt;&lt;p&gt;I am disappointed that my paper should be unconditionally rejected. However, it was &lt;em&gt;not&lt;/em&gt; a complete surprise to me (see the title of this posting), so I am not even angry about it. The net effect is that there is now a blockage in the way of all subsequent papers in the DND series, where I investigate various consequences of this approach. Never mind, what I do in this sort of circumstance is to dump a highly abbreviated (and thus incomprehensible) version of the paper on an unsuspecting conference somewhere, where it can languish in their proceedings to be read only by the most inquisitive researchers.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-114607607751328058?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/114607607751328058/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=114607607751328058' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/114607607751328058'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/114607607751328058'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2006/04/information-processing-researchers.html' title='Information processing researchers don&apos;t like quantum field theory? (update)'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-114288571477102763</id><published>2006-03-20T20:03:00.000Z</published><updated>2006-03-20T20:15:14.786Z</updated><title type='text'>Discrete network dynamics. Part 1: Operator theory (update)</title><content type='html'>My paper &lt;a href="http://arxiv.org/abs/cs.NE/0511027"&gt;Discrete Network Dynamics. Part 1: Operator Theory&lt;/a&gt; which I submitted to the  &lt;a href="http://www.jmlr.org/"&gt;Journal of Machine Learning Research&lt;/a&gt; turns out to be rather difficult to assign to reviewers. I just received an apologetic note from the action editor for my paper saying that the people who were the most suitable for reviewing my paper were unavailable for one reason or another. The reasons were not given, but I wonder whether it is because &lt;a href="http://acenetica.blogspot.com/2006/01/information-processing-researchers.html"&gt;Information processing researchers don't like quantum field theory&lt;/a&gt;. I hope this is &lt;em&gt;not&lt;/em&gt; the case, but I fear the worst.&lt;br /&gt;&lt;br /&gt;Optimistically, at least this gives me some more time to extend and refine the ideas in the paper, before I write further papers in the DND sequence of papers.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-114288571477102763?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/114288571477102763/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=114288571477102763' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/114288571477102763'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/114288571477102763'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2006/03/discrete-network-dynamics-part-1.html' title='Discrete network dynamics. Part 1: Operator theory (update)'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-114095268453848223</id><published>2006-02-26T11:00:00.000Z</published><updated>2006-03-22T18:49:41.006Z</updated><title type='text'>Quantum field = tokens in bins</title><content type='html'>I wonder whether some people find the phrase "quantum field theory" off-putting, especially when it is used in the context of networks that are discrete time dynamical systems, such as is described in my paper &lt;a href="http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-part-1.html"&gt;Discrete network dynamics. Part 1: Operator theory&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://en.wikipedia.org/wiki/Quantum_field_theory"&gt;Quantum field theory&lt;/a&gt; is defined in wikipedia as:&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Quantum field theory (QFT) is the application of &lt;/em&gt;&lt;em&gt;quantum mechanics&lt;/em&gt;&lt;em&gt; to &lt;/em&gt;&lt;em&gt;fields&lt;/em&gt;&lt;em&gt;. It provides a theoretical framework, widely used in &lt;/em&gt;&lt;em&gt;particle physics&lt;/em&gt;&lt;em&gt; and &lt;/em&gt;&lt;em&gt;condensed matter physics&lt;/em&gt;&lt;em&gt;, in which to formulate consistent &lt;/em&gt;&lt;em&gt;quantum theories&lt;/em&gt;&lt;em&gt; of many-particle systems, especially in situations where particles may be created and destroyed.&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;This is a &lt;em&gt;physicists'&lt;/em&gt; definition of QFT, which ignores the wider use of the very useful algebraic properties of QFT in the description of many-particle systems. In the discussion below I will focus on &lt;em&gt;bosonic&lt;/em&gt; statistics only, where the algebraic properties are defined using commutation rather than &lt;em&gt;anti&lt;/em&gt;-commutation relations. Also, I will assume that the theory is defined on a &lt;em&gt;discrete&lt;/em&gt; lattice, rather than on a &lt;em&gt;continuous&lt;/em&gt; background space.&lt;br /&gt;&lt;br /&gt;Stripped down to its bare essentials, QFT is a theory of how to manipulate "tokens" (e.g. particles) that are stored in "bins" (e.g. states). Each configuration of tokens-in-bins defines a particular set of bin occupancies (e.g. pure state).&lt;br /&gt;&lt;br /&gt;Creation and annihilation operators applied to the occupants of these bins causes tokens to be created or annihilated in exactly the way that one would intuitively expect:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;A creation operator adds a token to a bin (there is only one way of doing this to a bin), whereas an annihilation operator removes a token from a bin (the number of ways of doing this is equal to the number of tokens in the bin). This corresponds to the commutation relation [&lt;em&gt;a&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt;, &lt;em&gt;a&lt;sub&gt;j&lt;/sub&gt;&lt;/em&gt;†] = &lt;em&gt;δ&lt;/em&gt;&lt;sub&gt;&lt;em&gt;i&lt;/em&gt;,&lt;em&gt;j&lt;/em&gt;&lt;/sub&gt;.&lt;/li&gt;&lt;li&gt;An annihilation operator completely erases a bin if it is already empty of tokens. This corresponds to annihilation of the vacuum state &lt;em&gt;a&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt;│0&gt; = 0.&lt;/li&gt;&lt;li&gt;A weighted sum of products of creation operators produces a correspondingly weighted sum of bin occupancies (i.e. mixture state).&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;The above description of the properties of creation and annihilation operators does &lt;em&gt;not&lt;/em&gt; mention Planck's constant, &lt;em&gt;nor&lt;/em&gt; does it specify what weights to use in a weighted sum of operators. The algebraic structure of QFT is &lt;em&gt;not&lt;/em&gt; specifically designed for a quantum theory in which Planck's constant appears, and where the weights are &lt;em&gt;complex&lt;/em&gt;-valued probability-amplitudes. Rather, stripped down to its bare essentials, QFT is a very convenient algebraic structure for doing the combinatoric book-keeping associated with the creation and annihilation of tokens in bins, where the weights are &lt;em&gt;real&lt;/em&gt;-valued probabilities.&lt;/p&gt;&lt;p&gt;So the algebra of QFT is perfect for describing the manipulation of tokens in bins. Many of the algebraic techniques of QFT that have been developed in the context of physics can be reused in the context of tokens-in-bins. For instance, Feynman diagrams are used in physics to represent algebraic expressions that are formed from operators that generate interlinked sets of particle creation and annihilation operations. These diagrams can &lt;em&gt;also&lt;/em&gt; be used to describe interlinked sets of operations on tokens in bins.&lt;/p&gt;&lt;p&gt;So quantum field theory (bosons on a lattice) is entirely appropriate for describing networks that are discrete time dynamical systems, in which the basic update operation consists of the creation and annihilation of tokens in bins.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-114095268453848223?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/114095268453848223/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=114095268453848223' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/114095268453848223'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/114095268453848223'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2006/02/quantum-field-tokens-in-bins.html' title='Quantum field = tokens in bins'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-113882429401215833</id><published>2006-02-04T20:30:00.000Z</published><updated>2006-02-04T20:49:39.413Z</updated><title type='text'>ACEnet = discretised quantum field theory</title><content type='html'>A while ago I blogged about &lt;a href="http://acenetica.blogspot.com/2005/10/markov-chain-monte-carlo-sampling.html"&gt;Markov chain Monte Carlo sampling&lt;/a&gt; from joint probability density functions, which is background material for my paper &lt;a href="http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-part-1.html"&gt;Discrete network dynamics. Part 1: Operator theory&lt;/a&gt; (DND1). I want to explain this connection in more detail here.&lt;br /&gt;&lt;p&gt;The purpose of DND1 is to introduce an operator framework for expressing MCMC algorithms that explore discrete state spaces, where the state space itself looks like the bins of a histogram, and each state is a particular joint occupancy of all of the histogram bins.&lt;/p&gt;&lt;p&gt;Special cases of this type of space are:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;A single discrete-valued quantity that takes values in the set {0, 1, 2, ...} corresponds to a single histogram bin with a variable occupancy.&lt;/li&gt;&lt;li&gt;&lt;em&gt;m&lt;/em&gt; binary-valued quantities, exactly one of which is 1 whilst the rest are 0, corresponds to &lt;em&gt;m&lt;/em&gt; histogram bins with a single occupant in exactly one of the bins, whilst the rest of the bins are empty.&lt;/li&gt;&lt;li&gt;&lt;em&gt;m&lt;/em&gt; discrete-valued quantities correspond to &lt;em&gt;m&lt;/em&gt; histogram bins each with the appropriate occupancy. This directly generalises case (1) above.&lt;/li&gt;&lt;li&gt;If the occupancy in case (3) is constrained in any way, then the corresponding histogram bin occupancies are analogously constrained. Case (2) above is a special case of this.&lt;/li&gt;&lt;li&gt;&lt;em&gt;nm&lt;/em&gt; discrete-valued quantities can be factorised as &lt;em&gt;n&lt;/em&gt; separate instances of &lt;em&gt;m&lt;/em&gt; quantities, which thus corresponds to &lt;em&gt;n&lt;/em&gt; separate histograms each having &lt;em&gt;m&lt;/em&gt; bins. This sort of factorisation generalises in a straightforward way.&lt;/li&gt;&lt;li&gt;A graph with &lt;em&gt;n&lt;/em&gt; nodes, each containing &lt;em&gt;m&lt;/em&gt; discrete values, corresponds to &lt;em&gt;n&lt;/em&gt; histograms each having &lt;em&gt;m&lt;/em&gt; bins. This is an example of case (5) above.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Assume that a single histogram with &lt;em&gt;m&lt;/em&gt;=5 bins and &lt;em&gt;n&lt;/em&gt;=10 occupants has a discrete-time evolution that is generated by hopping a single sample from one bin to another when the time index advances by one unit. A typical evolution might look like the diagram below.&lt;/p&gt;&lt;p&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image0013.0.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image0013.0.jpg" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;In the diagram time advances from left to right, at each time instant the histogram state is represented as a bar chart, and at each advance in the time index the change in the histogram state is indicated by the red highlighted connection which shows where a sample has hopped from one bin to another.&lt;/p&gt;&lt;p&gt;In this example each hopping operation is generated by first annihilating a sample and then creating a sample. The annihilation operation chooses which sample to annihilate at random from the &lt;em&gt;whole&lt;/em&gt; histogram, whereas the creation operation generates a sample in bin &lt;em&gt;i&lt;/em&gt; with probability &lt;em&gt;p&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;The update operator &lt;em&gt;H&lt;/em&gt; that generates this is given by (using standard quantum field theory notation for bosonic creation and annihilation operators)&lt;br /&gt;&lt;br /&gt;&lt;em&gt;H&lt;/em&gt; ≡ ∑&lt;em&gt;&lt;sub&gt;i&lt;/sub&gt; p&lt;sub&gt;i&lt;/sub&gt; a&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt;† ∑&lt;em&gt;&lt;sub&gt;j&lt;/sub&gt; a&lt;sub&gt;j&lt;/sub&gt;&lt;/em&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;where ∑&lt;em&gt;&lt;sub&gt;j&lt;/sub&gt; a&lt;sub&gt;j&lt;/sub&gt;&lt;/em&gt; is a sum (from &lt;em&gt;j&lt;/em&gt; = 1 to &lt;em&gt;m&lt;/em&gt;) of annihilation operators that annihilates a sample at random, and ∑&lt;em&gt;&lt;sub&gt;i&lt;/sub&gt; p&lt;sub&gt;i&lt;/sub&gt; a&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt;† is a probability-weighted sum (from &lt;em&gt;i&lt;/em&gt; = 1 to &lt;em&gt;m&lt;/em&gt;) of creation operators that creates a sample in the required probability-weighted fashion. If &lt;em&gt;H&lt;/em&gt; is applied to a histogram it produces an ensemble of histograms, whose members correspond to each of the possible outcomes of applying &lt;em&gt;H&lt;/em&gt;. In the diagram above a &lt;em&gt;single&lt;/em&gt; member of this ensemble is randomly chosen at each time step, using an appropriate probability weighting to take account of the &lt;em&gt;p&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt; factor.&lt;/p&gt;&lt;p&gt;In this case, the equilibrium mixture of histograms that is stationary under application of the update operator can be readily deduced. Each update is a random annihilation followed by a probability-weighted creation, which is overall a &lt;em&gt;memoryless&lt;/em&gt; probability weighted hopping operation. Thus the equilibrium mixture of histograms is the &lt;em&gt;same&lt;/em&gt; as is obtained by starting with an empty histogram (the "vacuum" state), and then &lt;em&gt;randomly and independently&lt;/em&gt; placing &lt;em&gt;n&lt;/em&gt; samples into its bins using the same probability weighting for selecting bins as above.&lt;/p&gt;&lt;p&gt;This probability-weighted placement of &lt;em&gt;n&lt;/em&gt; samples into &lt;em&gt;m&lt;/em&gt; bins is exactly the procedure that is used in the ACEnet self-organising encoder network, where ACEnet is a generalisation of a &lt;a href="http://acenetica.blogspot.com/2005/10/stochastic-vector-quantiser.html"&gt;stochastic vector quantiser&lt;/a&gt; to the case of encodings that use &lt;em&gt;more&lt;/em&gt; than one sample to represent the code.&lt;/p&gt;&lt;p&gt;So, the discrete-time evolution of a histogram under the action of the update operator &lt;em&gt;H&lt;/em&gt; defined above has an equilibrium mixture of histograms that is the &lt;em&gt;same&lt;/em&gt; as the distribution of output codes that is obtained from ACEnet.&lt;/p&gt;&lt;p&gt;This is an incredibly useful result, because it allows ACEnet to be replaced by a discrete-time dynamical system whose update operator &lt;em&gt;H&lt;/em&gt; is built out of standard QFT creation and annihilation operators. Essentially, discretised QFT is a way of describing a MCMC algorithms, and ACEnet is a special case of a discretised QFT. All information processing in ACEnet can now be analysed using standard QFT techniques. This result is what motivated me to write of the &lt;a href="http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-part-1.html"&gt;DND1 paper&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Expressing ACEnet in the language of QFT has made it possible to use a lot of existing physical insight into and theoretical techniques from QFT in the analysis of ACEnet. Future papers in the DND series will focus on various developments based on these ideas.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-113882429401215833?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/113882429401215833/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=113882429401215833' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113882429401215833'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113882429401215833'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2006/02/acenet-discretised-quantum-field.html' title='ACEnet = discretised quantum field theory'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-113804131376958979</id><published>2006-01-23T18:18:00.000Z</published><updated>2006-01-25T18:31:59.970Z</updated><title type='text'>Information processing researchers don't like quantum field theory?</title><content type='html'>Having discovered that the problem uploading images was &lt;em&gt;my&lt;/em&gt; fault (see update comment &lt;a href="http://acenetica.blogspot.com/2005/11/problem-uploading-images-to-bloggercom.html"&gt;here&lt;/a&gt;) I am now trying to remember what I was planning to post about my &lt;a href="http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-part-1.html"&gt;Discrete network dynamics. Part 1: Operator theory&lt;/a&gt; paper. I am finding it difficult to pick up this thread.&lt;br /&gt;&lt;br /&gt;Thus far the paper has attracted only one comment, which I hope that I answered satisfactorily. I am disappointed not to have received more comments, and it makes it difficult for me to guess what best to focus on in my planned follow-up postings on the paper. It &lt;em&gt;also&lt;/em&gt; makes it difficult for me to fine tune the paper so that it is ready for peer-reviewed publication; I &lt;em&gt;still&lt;/em&gt; have not submitted it for publication.&lt;br /&gt;&lt;br /&gt;It &lt;em&gt;is&lt;/em&gt; possible that despite my efforts to make the quantum field theory aspect of the paper self-contained, it nevertheless proves to be unintelligible to my intended readers (i.e. people working on information processing networks). Well, that is not surprising to me, because QFT is &lt;em&gt;not&lt;/em&gt; really what you would expect to use when doing information processing.&lt;br /&gt;&lt;br /&gt;I guess my intended readers will pay attention only when I develop the theoretical methods further, and in particular when I derive practical information processing algorithms. I can promise these readers that they are in for some surprises.&lt;br /&gt;&lt;br /&gt;I have wondered whether a better target audience would be quantum field theorists, who might be interested to read about an unusual application of (some of) the QFT methods that they are &lt;em&gt;already&lt;/em&gt; familiar with. I guess that there are lots of researchers who are highly trained in QFT, who eventually end up doing research in an area where QFT is &lt;em&gt;not&lt;/em&gt; used, but who would &lt;em&gt;prefer&lt;/em&gt; to be doing research that &lt;em&gt;is&lt;/em&gt; based on QFT.&lt;br /&gt;&lt;br /&gt;Anyway, I now plan to write a few postings to explain some of the background research on information processing that leads to the QFT techniques that I describe in my paper (see &lt;a href="http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-part-1.html"&gt;here&lt;/a&gt;). This might alleviate some of the "QFT?! Is Luttrell mad?!" responses that I suspect might be lurking out there.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Update:&lt;/strong&gt; I have now submitted my paper (see &lt;a href="http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-part-1.html"&gt;here&lt;/a&gt;) to the &lt;a href="http://www.jmlr.org"&gt;Journal of Machine Learning Research&lt;/a&gt;, which "provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online." (quoted from their web site).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-113804131376958979?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/113804131376958979/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=113804131376958979' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113804131376958979'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113804131376958979'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2006/01/information-processing-researchers.html' title='Information processing researchers don&apos;t like quantum field theory?'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-113164820229438526</id><published>2005-11-27T16:34:00.000Z</published><updated>2006-01-09T18:58:24.006Z</updated><title type='text'>Problem uploading images to blogger.com</title><content type='html'>Merde, merde, and thrice merde!&lt;br /&gt;&lt;br /&gt;For some unknown reason blogger.com is &lt;em&gt;still&lt;/em&gt; not accepting uploaded images, so I am unable to post any images here until that is sorted out. Maybe this problem is related to the comment about "network maintenance" on the &lt;a href="http://status.blogger.com/"&gt;Blogger Status&lt;/a&gt; page.&lt;br /&gt;&lt;br /&gt;This pretty much makes it impossible for me to post anything meaningful, so this blog will be dormant until this problem is sorted out.&lt;br /&gt;&lt;br /&gt;I am more than a little annoyed by this.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Update:&lt;/strong&gt; My frustration is building to the point where I might start to host all of my images elsewhere, and to include links from my postings here.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Update:&lt;/strong&gt; I have &lt;em&gt;now&lt;/em&gt; discovered that the uploading problem was a 30 second timeout that prevented what would otherwise have been successful uploads from completing. For some reason this 30 second threshold was &lt;em&gt;not&lt;/em&gt; exceeded for the first 10 or so images that I uploaded, but it &lt;em&gt;was&lt;/em&gt; exceeded thereafter. Since I was attempting to upload &lt;em&gt;uncompressed&lt;/em&gt; bitmap versions of my images, the solution was simple: compress-before-upload. This trick worked on my first attempt, so my earlier &lt;a href="http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-part-1.html"&gt;Discrete network dynamics. Part 1: Operator theory&lt;/a&gt; posting is now complete with its images. I can now bring this blog out of its dormant state.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-113164820229438526?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/113164820229438526/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=113164820229438526' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113164820229438526'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113164820229438526'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/11/problem-uploading-images-to-bloggercom.html' title='Problem uploading images to blogger.com'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-113217763537500074</id><published>2005-11-16T21:19:00.000Z</published><updated>2005-11-16T23:26:48.086Z</updated><title type='text'>A new kind of science</title><content type='html'>A few years ago &lt;a href="http://www.stephenwolfram.com/"&gt;Stephen Wolfram&lt;/a&gt; published a book called &lt;a href="http://www.wolframscience.com/"&gt;A new kind of science&lt;/a&gt; (NKS). Both the man and the book cause a strong split in people's opinions. Some claim him to be a genius and others dismiss him as a megalomaniac crackpot. Some claim that the book represents the dawn of a new scientific era, whereas others dismiss it as a derivative work with nothing new to offer. As for me, I don't really pay much attention to what people say, because idle chatter is mostly heresay. I judge things by their demonstrable benefit, and for a book that means I judge it &lt;em&gt;subjectively&lt;/em&gt; by the new ideas that it gives me.&lt;br /&gt;&lt;br /&gt;I originally read the whole of NKS fairly quickly to get the gist of what it said. I already had many of its concepts on board, which helped this speed-reading along. Since then I have delved into selected areas of the book when I realised how the ideas that they held might be applied to my own work on self-organising networks. Usually, I use this book in "pick and mix" mode, where I mine it for concepts rather than complete solutions.&lt;br /&gt;&lt;br /&gt;My &lt;em&gt;subjective&lt;/em&gt; assessment of the book is that it is invaluable for me, because it has allowed me to enlarge the toolbox of scientific techniques that I use in my work. &lt;em&gt;Objectively&lt;/em&gt;, I am sure that most (and perhaps all) of these ideas were already available somewhere in the literature, but then I am told that needles can be found in haystacks if you look hard enough. The key point is that a lot of interesting ideas are contained in the NKS book, and they are repeated many times just in case you don't get the message the first time. I certainly find that thinking in the NKS cellular automaton style gives me a usefully different viewpoint, which I can use to &lt;em&gt;augment&lt;/em&gt; rather than &lt;em&gt;inhibit&lt;/em&gt; my understanding.&lt;br /&gt;&lt;br /&gt;I don't have much time for the scientific snobs who dismiss SW because of his personal and his scientific style. I think it all makes him a rather fascinating character.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-113217763537500074?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/113217763537500074/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=113217763537500074' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113217763537500074'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113217763537500074'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/11/new-kind-of-science.html' title='A new kind of science'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-113190563914207752</id><published>2005-11-13T17:55:00.000Z</published><updated>2005-11-13T18:36:13.056Z</updated><title type='text'>The human connectome</title><content type='html'>We had the human genome project, and now we have the human connectome project. In this week's New Scientist there is a short article entitled &lt;a href="http://www.newscientist.com/channel/opinion/mg18825251.000"&gt;Connectome&lt;/a&gt;. I found an on-line paper entitled &lt;a href="http://compbiol.plosjournals.org/perlserv?request=get-document&amp;doi=10.1371/journal.pcbi.0010042"&gt;The Human Connectome: A Structural Description of the Human Brain&lt;/a&gt; whose abstract reads:&lt;br /&gt;&lt;br /&gt;&lt;em&gt;The connection matrix of the human brain (the human “connectome”) represents an indispensable foundation for basic and applied neurobiological research. However, the network of anatomical connections linking the neuronal elements of the human brain is still largely unknown. While some databases or collations of large-scale anatomical connection patterns exist for other mammalian species, there is currently no connection matrix of the human brain, nor is there a coordinated research effort to collect, archive, and disseminate this important information. We propose a research strategy to achieve this goal, and discuss its potential impact.&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;This project is much more realistic in its scope than the ridiculously premature &lt;a href="http://acenetica.blogspot.com/2005/10/mission-to-build-simulated-brain.html"&gt;Mission to build a simulated brain&lt;/a&gt; that I wrote about a while ago. Measuring the connection matrix of the human brain might seem to be mundane and rather like butterfly collecting, but it is essential to collect lots of data in order to constrain the imagination of the scientists who construct models of the brain; without such data we would not be doing &lt;em&gt;science&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;I wish them well in the success of this project. However, knowing the connection matrix does &lt;em&gt;not&lt;/em&gt; imply that you understand the brain states that emerge from the corresponding network of neurons. It is not even clear that the connection matrix is a &lt;em&gt;sufficient&lt;/em&gt; description of the network of interconnections between neurons. Even if you know the connection matrix, it does &lt;em&gt;not&lt;/em&gt; tell you anything about the self-organisation of such a network of connections in the first place. Which aspects are genetically determined and which are the result of being driven by sensory inputs?&lt;br /&gt;&lt;br /&gt;Are there as yet undiscovered universal principles governing such self-organisation? I'll bet there are.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-113190563914207752?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/113190563914207752/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=113190563914207752' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113190563914207752'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113190563914207752'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/11/human-connectome.html' title='The human connectome'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-113156779751644235</id><published>2005-11-10T19:42:00.000Z</published><updated>2006-01-08T19:38:48.166Z</updated><title type='text'>Discrete network dynamics. Part 1: Operator theory</title><content type='html'>This posting is a brief overview of my recent paper entitled "Discrete network dynamics. Part 1: Operator theory", which can be found in arXiv at &lt;a href="http://arxiv.org/abs/cs/0511027"&gt;cs.NE/0511027&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The basic idea is that an MCMC simulation (see a description of MCMC algorithms in a previous posting of mine &lt;a href="http://acenetica.blogspot.com/2005/10/markov-chain-monte-carlo-sampling.html"&gt;here&lt;/a&gt;) of a joint probability Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;), where &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; is the joint state of an &lt;em&gt;N&lt;/em&gt;-dimensional system (e.g. &lt;em&gt;x&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt; is the state of the &lt;em&gt;i&lt;/em&gt;&lt;sup&gt;th&lt;/sup&gt; node of an &lt;em&gt;N&lt;/em&gt;-node network), generates a sequence of &lt;em&gt;&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt;'s (i.e. &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(1), &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(2), ... , &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;t&lt;/em&gt;), ...) that are sampled from Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;). &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;t&lt;/em&gt;+1) is generated from &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;t&lt;/em&gt;) by an operation that holds some of the components of &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;t&lt;/em&gt;) fixed, whilst stochastically updating the rest of the components of &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;t&lt;/em&gt;). The MCMC update operation is chosen so that if Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;t&lt;/em&gt;)) has the correct joint probability then so also does Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;t&lt;/em&gt;+1)). This was discussed in detail &lt;a href="http://acenetica.blogspot.com/2005/10/markov-chain-monte-carlo-sampling.html"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The MCMC update operation is particularly simple if the state of only &lt;em&gt;one&lt;/em&gt; of the nodes of the network is updated at a time, so that &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;t&lt;/em&gt;+1) differs from &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;t&lt;/em&gt;) in only &lt;em&gt;one&lt;/em&gt; of its vector components. Let us assume that it is the &lt;em&gt;i&lt;/em&gt;&lt;sup&gt;th&lt;/sup&gt; component that is updated, then MCMC update operation consists of two simpler operations:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Erase the state &lt;em&gt;x&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt;. This is an &lt;em&gt;unconditional&lt;/em&gt; erase operation, in which the state is erased with no memory of what it used to be.&lt;/li&gt;&lt;li&gt;Randomly rewrite the state &lt;em&gt;x&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt;. This is a &lt;em&gt;conditional&lt;/em&gt; rewrite operation, in which the state is drawn from the conditional probability Pr(&lt;em&gt;x&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;\&lt;em&gt;x&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt;).&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image0011.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image0011.jpg" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;The diagram shows an MCMC update in a visually intuitive way for a 2-node network (i.e. &lt;em&gt;N&lt;/em&gt;=2). Node 1 is at the top of the diagram, node 2 is at the bottom of the diagram, and time reads from left to right across the diagram. The initial state &lt;em&gt;x&lt;/em&gt;&lt;sub&gt;1&lt;/sub&gt; of node 1 is changed to final state &lt;em&gt;x&lt;/em&gt;&lt;sub&gt;1&lt;/sub&gt;´ by the influence of the state &lt;em&gt;x&lt;/em&gt;&lt;sub&gt;2&lt;/sub&gt; of node 2 (which is itself &lt;em&gt;unchanged&lt;/em&gt; in going from initial to final state). The solid lines from left to right indicate the evolution of the states of nodes 1 and 2. The vertical dashed line indicates the influence of the state of node 2 on the state of node 1, and the probability of the final state of node 1 is indicated as Pr(&lt;em&gt;x&lt;/em&gt;&lt;sub&gt;1&lt;/sub&gt;´│&lt;em&gt;x&lt;/em&gt;&lt;sub&gt;2&lt;/sub&gt;), which is memoryless because it does &lt;em&gt;not&lt;/em&gt; depend on &lt;em&gt;x&lt;/em&gt;&lt;sub&gt;1&lt;/sub&gt;.&lt;/p&gt;&lt;p&gt;The above diagram is a Feynman-like diagram describing the annihilation of the state of node 1, and then subsequent creation of a new state for node 1, and the "amplitude" for this process is actually a probability given by Pr(&lt;em&gt;x&lt;/em&gt;&lt;sub&gt;1&lt;/sub&gt;´│&lt;em&gt;x&lt;/em&gt;&lt;sub&gt;2&lt;/sub&gt;).&lt;/p&gt;&lt;p&gt;This identification allows the MCMC update operator to be written as&lt;/p&gt;&lt;p&gt;&lt;em&gt;H&lt;sub&gt;a&lt;/sub&gt;&lt;/em&gt; ≡ ∑&lt;sub&gt;&lt;em&gt;i&lt;/em&gt;&lt;/sub&gt; &lt;em&gt;a&lt;/em&gt;&lt;sub&gt;&lt;em&gt;i&lt;/em&gt;&lt;/sub&gt;† (∑&lt;em&gt;&lt;sub&gt;j&lt;/sub&gt;a&lt;sub&gt;j&lt;/sub&gt;&lt;/em&gt;) (∑&lt;em&gt;&lt;sub&gt;k&lt;/sub&gt;&lt;/em&gt;Pr(&lt;em&gt;i&lt;/em&gt;│&lt;em&gt;k&lt;/em&gt;) &lt;em&gt;b&lt;sub&gt;k&lt;/sub&gt;&lt;/em&gt;†&lt;em&gt;b&lt;sub&gt;k&lt;/sub&gt;&lt;/em&gt;)&lt;/p&gt;&lt;p&gt;where &lt;em&gt;a&lt;sub&gt;i&lt;/sub&gt;&lt;/em&gt;† is the creation operator for state &lt;em&gt;i&lt;/em&gt; at node 1, &lt;em&gt;b&lt;sub&gt;k&lt;/sub&gt;&lt;/em&gt;† is the creation operator for state &lt;em&gt;k&lt;/em&gt; at node 2, &lt;em&gt;b&lt;sub&gt;k&lt;/sub&gt;&lt;/em&gt;†&lt;em&gt;b&lt;sub&gt;k&lt;/sub&gt;&lt;/em&gt; is the number operator for state &lt;em&gt;k&lt;/em&gt; at node 2, ∑&lt;em&gt;&lt;sub&gt;k&lt;/sub&gt;&lt;/em&gt;Pr(&lt;em&gt;i&lt;/em&gt;│&lt;em&gt;k&lt;/em&gt;) &lt;em&gt;b&lt;sub&gt;k&lt;/sub&gt;&lt;/em&gt;†&lt;em&gt;b&lt;sub&gt;k&lt;/sub&gt;&lt;/em&gt; is the operator for computing the conditional probability factor that weights the MCMC update, and ∑&lt;em&gt;&lt;sub&gt;j&lt;/sub&gt;a&lt;sub&gt;j&lt;/sub&gt;&lt;/em&gt; unconditionally annihilates the state at node 1.&lt;/p&gt;&lt;p&gt;Application of the MCMC update operator &lt;em&gt;H&lt;sub&gt;a&lt;/sub&gt;&lt;/em&gt; automatically generates the correct probability-weighted ensemble of possible next states of the 2-node network in which the state of node 1 is updated. The full MCMC update operator is the sum of &lt;em&gt;H&lt;sub&gt;a&lt;/sub&gt;&lt;/em&gt;+&lt;em&gt;H&lt;sub&gt;b&lt;/sub&gt;&lt;/em&gt;, where &lt;em&gt;H&lt;sub&gt;b&lt;/sub&gt;&lt;/em&gt; is obtained from &lt;em&gt;H&lt;sub&gt;a&lt;/sub&gt;&lt;/em&gt; by interchanging the "a" and "b" operators (and appropriately changing the conditional probability factor). The Feynman-like diagram version of &lt;em&gt;H&lt;sub&gt;a&lt;/sub&gt;&lt;/em&gt;+&lt;em&gt;H&lt;sub&gt;b&lt;/sub&gt;&lt;/em&gt; is thus a &lt;em&gt;sum&lt;/em&gt; of two diagrams as shown below.&lt;/p&gt;&lt;p&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image0012.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image0012.jpg" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;The MCMC update operator correctly implements the 2-step MCMC update procedure described earlier. Each node of the network has a state with occupancy 1 that is erased (i.e. annhilated) during step 1 of the MCMC update to become the "vacuum" state with occupancy 0, and the state is rewritten (i.e. created) during step 2 of the MCMC update to become a state with occupancy 1 again. This update process is memoryless because its intermediate state is the "vacuum".&lt;/p&gt;&lt;p&gt;However, the MCMC update operator goes beyond the standard MCMC updates, because it can &lt;em&gt;also&lt;/em&gt; be applied to initial states that have an occupancy of 2 or more. The algebra of the creation and annihilation operators automatically does this generalisation for you, and thus defines a unique generalisation of the MCMC algorithm for multiply occupied states of the network nodes. &lt;/p&gt;&lt;p&gt;The above discussion has focussed on a network with only 2 nodes. The generalisation to an &lt;em&gt;N&lt;/em&gt;-node network is straightforward when the probability Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) of the joint state of the network is expanded using the Hammersley-Clifford expansion, as is shown in my paper.&lt;/p&gt;&lt;p&gt;The structure of this MCMC algorithm is similar to a quantum field theory of bosons:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Each state of node &lt;em&gt;i&lt;/em&gt; of the network can have a variable occupancy, which corresponds to each state of bosons of type&lt;em&gt; i&lt;/em&gt; having a variable occupancy.&lt;/li&gt;&lt;li&gt;An MCMC update causes the multiply occupied state of node &lt;em&gt;i&lt;/em&gt; of the network to change by using the neighbouring nodes to influence the hopping of an occupant from one state of node &lt;em&gt;i&lt;/em&gt; to another state of node &lt;em&gt;i&lt;/em&gt;, which corresponds to a boson of type &lt;em&gt;i&lt;/em&gt; scattering from one state to another under the influence of bosons of various other types.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;This completes the reformulation of MCMC algorithms using the language of creation and annihilation operators, and opens up a large number of possible lines of enquiry.&lt;/p&gt;&lt;p&gt;One particularly interesting consequence of multiple occupancy MCMC algorithms is self-organising networks, in which the state space of a multiply occupied network node can split into 2 or more smaller state spaces, each of which inherits some of the original node's multiple occupants, so that overall the original node splits into several daughter nodes. I will be saying &lt;em&gt;much&lt;/em&gt; more about this later.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-113156779751644235?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/113156779751644235/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=113156779751644235' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113156779751644235'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113156779751644235'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-part-1.html' title='Discrete network dynamics. Part 1: Operator theory'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-113140280201225075</id><published>2005-11-07T22:22:00.000Z</published><updated>2005-11-08T00:08:31.656Z</updated><title type='text'>Discrete network dynamics series of papers</title><content type='html'>I am now back after a quiet period during which I was writing my first paper in a forthcoming series of papers on "discrete network dynamics". This first paper is entitled "Discrete Network Dynamics. Part 1: Operator Theory", and I have just uploaded it to arXiv where it is available at &lt;a href="http://arxiv.org/abs/cs/0511027"&gt;cs.NE/0511027&lt;/a&gt;. I will be commenting in detail on this paper in due course.&lt;br /&gt;&lt;br /&gt;The "discrete network dynamics" series of papers will be about the use of operator techniques borrowed from quantum field theory to implement Markov chain Monte Carlo algorithms, and the emphasis will be on the adaptive cluster expansion network (ACEnet) which is a particular type of self-organising network.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-113140280201225075?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/113140280201225075/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=113140280201225075' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113140280201225075'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113140280201225075'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/11/discrete-network-dynamics-series-of.html' title='Discrete network dynamics series of papers'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-113011088298508420</id><published>2005-10-23T23:35:00.000Z</published><updated>2005-10-23T23:41:22.986Z</updated><title type='text'>Where is this all going?</title><content type='html'>You may well wonder where all of this maths is leading us. You will just have to be patient. I am creating the simplest path possible that short-circuits the route to my latest research work. Before long I will be publishing papers in arXiv, and then subsequently discussing them here. When that happens I hope that things will start to get more exciting.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-113011088298508420?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/113011088298508420/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=113011088298508420' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113011088298508420'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/113011088298508420'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/10/where-is-this-all-going.html' title='Where is this all going?'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112896717149372810</id><published>2005-10-23T14:10:00.000Z</published><updated>2005-10-23T23:33:53.843Z</updated><title type='text'>Stochastic vector quantiser</title><content type='html'>We can now generalise the basic self-organising network known as the vector quantiser by combining results from several earlier postings:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;In &lt;a href="http://acenetica.blogspot.com/2005/09/basic-self-organising-network-vector.html"&gt;Basic self-organising network: vector quantiser&lt;/a&gt; I described a basic type of self-organising network (SON) called a vector quantiser (VQ), which does two simultaneous jobs: &lt;strong&gt;Encoding&lt;/strong&gt; - which takes each continuous-valued input vector &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; and use an encoder &lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) to map it to a &lt;em&gt;discrete&lt;/em&gt;-valued code &lt;em&gt;k, &lt;/em&gt;&lt;strong&gt;Decoding&lt;/strong&gt; - which takes each discrete-valued code &lt;em&gt;k&lt;/em&gt; and use a decoder &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) to map it to a &lt;em&gt;continuous&lt;/em&gt;-valued input &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;. This creation of mappings to and from new "representation" spaces is a defining feature of SONs, and the VQ is the simplest example using &lt;em&gt;non-linear&lt;/em&gt; mappings that I know of.&lt;/li&gt;&lt;li&gt;In &lt;a href="http://acenetica.blogspot.com/2005/09/vector-quantiser-simulation.html"&gt;Vector quantiser simulation&lt;/a&gt; I included &lt;em&gt;Mathematica&lt;/em&gt; code for simulating a VQ, which learnt to partition a 2-dimensional regions into "code cells".&lt;/li&gt;&lt;li&gt;In &lt;a href="http://acenetica.blogspot.com/2005/10/bayes-theorem.html"&gt;Bayes theorem&lt;/a&gt; I introduced Bayes theorem for manipulating joint probabilities. This is the key result that makes it possible to do useful computations with joint probabilities, without being constrained to those special cases that admit analytic solutions.&lt;/li&gt;&lt;li&gt;In &lt;a href="http://acenetica.blogspot.com/2005/10/markov-chain-monte-carlo-sampling.html"&gt;Markov chain Monte Carlo sampling&lt;/a&gt; (MCMC) I illustrated how to use Bayes theorem to draw samples from a joint probability, which generates a Markov chain of samples that visits each joint state with a frequency that is proportional to its joint probablity, with the caveat that the sequence thus generates is &lt;em&gt;correlated&lt;/em&gt; because it has a memory of its past history.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;What do these results give you when they are folded together? The basic result is item 4 above on MCMC sampling joint probabilities. If several MCMC updates are applied to a joint state (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;, &lt;em&gt;k&lt;/em&gt;) to create the state sequence (?, ?) → (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;, ?) → (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;, &lt;em&gt;k&lt;/em&gt;) → (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;′, &lt;em&gt;k&lt;/em&gt;), where the MCMC updates alternate between &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; and &lt;em&gt;k&lt;/em&gt; spaces, then the encoding/decoding process that occurs in a VQ can be mimicked, giving the encode/decode loop &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; → &lt;em&gt;k&lt;/em&gt; → &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;′ that I described in &lt;a href="http://acenetica.blogspot.com/2005/09/basic-self-organising-network-vector.html"&gt;Basic self-organising network: vector quantiser&lt;/a&gt;. More generally, if the MCMC updates alternate randomly, then the same VQ encoding/decoding process occurs, but less efficiently.&lt;/p&gt;&lt;p&gt;This connection between VQ encoding/decoding and MCMC algorithms allows a &lt;em&gt;stochastic&lt;/em&gt; generalisation of the VQ to be formulated. This is called a stochastic vector quantiser (SVQ), and its structure is shown in the diagram below. &lt;/p&gt;&lt;p&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image0005.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image0005.jpg" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;where Pr(&lt;em&gt;k&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) is the encoder's conditional probability over codes &lt;em&gt;k&lt;/em&gt; given the input vector &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;, and Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´│&lt;em&gt;k&lt;/em&gt;) is the corresponding decoder's conditional probability.&lt;/p&gt;&lt;p&gt;Compare this diagram with the following two previous diagrams:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;The basic VQ that I showed in &lt;a href="http://acenetica.blogspot.com/2005/09/basic-self-organising-network-vector.html"&gt;Basic self-organising network: vector quantiser&lt;/a&gt;. The only difference is the notation that is used to describe the "encode" and "decode" blocks of the diagrams.&lt;/li&gt;&lt;li&gt;The first few updates of the MCMC simulation that I showed in &lt;a href="http://acenetica.blogspot.com/2005/10/markov-chain-monte-carlo-sampling.html"&gt;Markov chain Monte Carlo sampling&lt;/a&gt;. Each pair of successive diagrams is potentially a &lt;em&gt;stochastic&lt;/em&gt; encoding followed by a decoding, assuming that the MCMC algorithm happens to update the right hand space followed by the left hand space.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;A slightly more appropriate version of the MCMC diagram would look like the diagram below, where a larger number of states has been used in each of the two spaces than before, and arrows have been drawn to show which space the MCMC algorithm happens to update at each step. In this case the first two steps behave like an SVQ encode/decode loop, because the arrows form the characteristic out-and-back form shown in the SVQ diagram above. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image00041.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image00041.jpg" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;To encode you compute the encoder's conditional probability Pr(&lt;em&gt;k&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;), and to decode you then compute the decoder's conditional probability Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´│&lt;em&gt;k&lt;/em&gt;). Overall, this specifies the chain &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; → &lt;em&gt;k&lt;/em&gt; → &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´ leading from the original input vector, via the code, and then back to a reconstruction of the input vector. Each of the arrows in this chain is &lt;em&gt;stochastic&lt;/em&gt; in an SVQ. This means that for each input vector &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; there is a whole &lt;em&gt;ensemble&lt;/em&gt; of alternative possible codes &lt;em&gt;k&lt;/em&gt;, each member of which is weighted by the corresponding conditional probability Pr(&lt;em&gt;k&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;), and &lt;em&gt;one&lt;/em&gt; such member is randomly selected as the coded version of the input. Similarly, for each code &lt;em&gt;k&lt;/em&gt; there is a whole ensemble of alternative possible reconstructions &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´, each member of which is weighted by the corresponding conditional probability Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´│&lt;em&gt;k&lt;/em&gt;). This procedure is equivalent to two steps of an MCMC algorithm.&lt;/p&gt;&lt;p&gt;The SVQ generalisation of the Euclidean VQ objective function is&lt;/p&gt;&lt;p&gt;&lt;em&gt;D&lt;/em&gt; = ∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) ∑&lt;sub&gt;&lt;em&gt;k&lt;/em&gt;&lt;/sub&gt; Pr(&lt;em&gt;k&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) ∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt;´Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´│&lt;em&gt;k&lt;/em&gt;) ║&lt;strong&gt;&lt;em&gt;x &lt;/em&gt;&lt;/strong&gt;- &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´║&lt;sup&gt;2&lt;/sup&gt;&lt;/p&gt;&lt;p&gt;This may be interpreted thus:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;The joint probability Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;, &lt;em&gt;k&lt;/em&gt;, &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´) of the chain &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; → &lt;em&gt;k&lt;/em&gt; → &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´ can be split up as Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;, &lt;em&gt;k&lt;/em&gt;, &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´) = Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) Pr(&lt;em&gt;k&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´│&lt;em&gt;k&lt;/em&gt;).&lt;/li&gt;&lt;li&gt;║&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; - &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´║&lt;sup&gt;2&lt;/sup&gt; is the Euclidean distance between the original input vector and its reconstruction after passing along the chain &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; → &lt;em&gt;k&lt;/em&gt; → &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´.&lt;/li&gt;&lt;li&gt;The ∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) ∑&lt;sub&gt;&lt;em&gt;k&lt;/em&gt;&lt;/sub&gt; Pr(&lt;em&gt;k&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) ∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt;´Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´│&lt;em&gt;k&lt;/em&gt;) (...) sums and integrates over all of the alternative choices of the chain &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; → &lt;em&gt;k&lt;/em&gt; → &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;By using Bayes theorem this objective function can be simplified to the following&lt;/p&gt;&lt;p&gt;&lt;em&gt;D&lt;/em&gt; = 2 ∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) ∑&lt;sub&gt;&lt;em&gt;k&lt;/em&gt;&lt;/sub&gt; Pr(&lt;em&gt;k&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) ║&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; - &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´(&lt;em&gt;k&lt;/em&gt;)║&lt;sup&gt;2&lt;/sup&gt;&lt;/p&gt;&lt;p&gt;where &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´(&lt;em&gt;k&lt;/em&gt;) is defined as&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´(&lt;em&gt;k&lt;/em&gt;) ≡ (∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) Pr(&lt;em&gt;k&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) / (∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) Pr(&lt;em&gt;k&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;))&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Compare these expressions with the results in &lt;a href="http://acenetica.blogspot.com/2005/09/basic-self-organising-network-vector.html"&gt;Basic self-organising network: vector quantiser&lt;/a&gt; to see that the SVQ and the VQ are mathematically very similar, &lt;em&gt;except that&lt;/em&gt; in an SVQ the encoder is &lt;em&gt;stochastic&lt;/em&gt; because it is described by Pr(&lt;em&gt;k&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;). The overall factor of 2 in the SVQ objective function is not important; it comes from the symmetrical way that I defined &lt;em&gt;D&lt;/em&gt; for an SVQ. The VQ is a special case of an SVQ where Pr(&lt;em&gt;k&lt;/em&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) = δ&lt;sub&gt;&lt;em&gt;k&lt;/em&gt;,&lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;)&lt;/sub&gt;.&lt;/p&gt;&lt;p&gt;Overall, an SVQ is the same as a VQ, except that there is now some "slack" in the choice of how to encode each input vector. Also, a computational structure that is equivalent to an SVQ is embedded within an MCMC simulation of the joint probability Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;, &lt;em&gt;k&lt;/em&gt;). It turns out that the MCMC viewpoint is ultimately the most useful for making further generalisations of the VQ and SVQ self-organising networks.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112896717149372810?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112896717149372810/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112896717149372810' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112896717149372810'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112896717149372810'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/10/stochastic-vector-quantiser.html' title='Stochastic vector quantiser'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112967950558589305</id><published>2005-10-18T23:36:00.000Z</published><updated>2005-10-19T01:03:15.440Z</updated><title type='text'>One face, one neuron</title><content type='html'>This month in Scientific American there is a News Scan article entitled &lt;a href="http://www.sciamdigital.com/browse.cfm?sequencenameCHAR=item2&amp;methodnameCHAR=resource_getitembrowse&amp;amp;interfacenameCHAR=browse.cfm&amp;ISSUEID_CHAR=3BA448C4-2B35-221B-697D63E19B490C13&amp;amp;amp;amp;ARTICLEID_CHAR=3BB77100-2B35-221B-6B9ED95858117171&amp;sc=I100322"&gt;One Face, One Neuron&lt;/a&gt;, which describes some interesting results (observed in several experimental subjects) in which there was a strong response by a &lt;em&gt;single&lt;/em&gt; neuron to fairly abstract concepts. That is to say, when an experimental subject was shown various different things that were related to the &lt;em&gt;same&lt;/em&gt; abstract concept, the &lt;em&gt;same&lt;/em&gt; neuron fired in each case. It wasn't clear from the article what the &lt;em&gt;rest&lt;/em&gt; of the neurons were doing, but there are rather a lot of them so it isn't possible to measure them all.&lt;br /&gt;&lt;br /&gt;This observation is really significant, because it means that the brain is mapping a &lt;em&gt;highly variable&lt;/em&gt; input onto an &lt;em&gt;invariant&lt;/em&gt; response (at least, as far as one of the neurons is concerned), in which a &lt;em&gt;single&lt;/em&gt; neuron fires reliably in response to a whole range of inputs. No doubt the neural dynamics that leads to this invariant mapping involves &lt;em&gt;many&lt;/em&gt; neurons recurrently interacting with each other, with the net effect being the observed firing behaviour of the &lt;em&gt;single&lt;/em&gt; neuron.&lt;br /&gt;&lt;br /&gt;How is it that the neural dynamics in the brain can self-organise so that it creates these invariant mappings? Presumably they are something like fixed points (or limit cycles) of the dynamics. What properties of the external input are preserved at the output end of these mappings? In other words, what is the neural code used by the brain? Are the same properties of the external input preserved by different brains? If they are, presumably the detailed mappings are different in different brains. How much can the external input be varied whilst still being mapped to the same output? Is a few vague clues in the external input enough for the recurrent neural dynamics to home in on the appropriate output?&lt;br /&gt;&lt;br /&gt;It would be really useful to have a general theory of such self-organising mappings in recurrent networks, wouldn't it?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112967950558589305?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112967950558589305/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112967950558589305' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112967950558589305'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112967950558589305'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/10/one-face-one-neuron.html' title='One face, one neuron'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112957618714180360</id><published>2005-10-17T18:48:00.000Z</published><updated>2005-10-19T01:07:00.776Z</updated><title type='text'>Laughlin versus reductionism</title><content type='html'>In a posting &lt;a href="http://motls.blogspot.com/2005/10/laughlin-vs-reductionism.html"&gt;Laughlin vs. reductionism&lt;/a&gt; Luboš Motl chooses to shoot down Robert Laughlin's book "A Different Universe: Reinventing Physics from the Bottom Down" for its supposed attack on reductionism. I commented before on &lt;a href="http://acenetica.blogspot.com/2005/09/reductionism-versus-emergentism.html"&gt;Reductionism versus emergentism&lt;/a&gt;, where I came out in favour of Laughlin's book, because it strikes me as being a fairly balanced account of the merits and disadvantages of reductionism and emergentism. However, my salary is no longer paid to me for doing fundamental physics, so presumably I have less to lose when Laughlin's book damns reductionism with faint praise.&lt;br /&gt;&lt;br /&gt;More generally, it is very useful to think at many levels, all the way from deep underlying theories (i.e. reductionism) up to phenomenological descriptions (i.e. emergentism), because useful science can be done at all of these levels, and usually insight at one level can help progress to be made at another level, although this isn't guaranteed.&lt;br /&gt;&lt;br /&gt;As I said in &lt;a href="http://acenetica.blogspot.com/2005/09/reductionism-versus-emergentism.html"&gt;Reductionism versus emergentism&lt;/a&gt;, my hope is to use self-organising networks to address the relationship between reductionism and emergentism, and "if it all goes to plan then reductionism will emerge from emergentism by a process of self-organisation".&lt;br /&gt;&lt;br /&gt;In information processing I certainly do &lt;em&gt;not&lt;/em&gt; want to have to think in terms of individual binary digits the whole time. I want to build software objects that have well-defined behaviour at a much higher level, and to be able to ignore the detailed internal workings of these objects, and to deal instead only with their externally exposed properties. However, at the same time, I want to be able to interrogate those higher level properties to discover their internal dynamics. This is what a self-organising network will allow me to do.&lt;br /&gt;&lt;br /&gt;You certainly can't do useful (read "macroscopic") self-organising information processing without bumping straight into emergent properties.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112957618714180360?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112957618714180360/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112957618714180360' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112957618714180360'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112957618714180360'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/10/laughlin-versus-reductionism.html' title='Laughlin versus reductionism'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112907282513395359</id><published>2005-10-13T20:26:00.000Z</published><updated>2005-10-23T12:41:17.663Z</updated><title type='text'>Markov chain Monte Carlo sampling</title><content type='html'>In a previous posting I explained &lt;a href="http://acenetica.blogspot.com/2005/10/bayes-theorem.html"&gt;Bayes theorem&lt;/a&gt; for splitting up joint probabilities Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) into component pieces (either Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) or Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;)), which leads onto a very neat way of drawing sample pairs of vectors (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) so that they have the correct joint probability Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;). This is called Markov chain Monte Carlo (MCMC) sampling.&lt;br /&gt;&lt;br /&gt;Firstly, notice that the &lt;em&gt;pair&lt;/em&gt; of vectors (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) forms a &lt;em&gt;single&lt;/em&gt; (higher-dimensional) vector. In the description below this partitioning of the &lt;em&gt;single&lt;/em&gt; vector into a &lt;em&gt;pair&lt;/em&gt; of vectors (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) is done in whatever way is convenient. This will be explained below.&lt;br /&gt;&lt;br /&gt;Start off by supposing that someone hands you a pair (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) that is &lt;em&gt;known&lt;/em&gt; to have joint probability Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;). You can use this to generate &lt;em&gt;another&lt;/em&gt; pair (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) that &lt;em&gt;also&lt;/em&gt; has joint probability Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;). Here is how you do it:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Use Bayes theorem to split Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) as Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) = Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;), where Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) ≡ ∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;). This tells you that Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) = Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) / ∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) which will be used below.&lt;/li&gt;&lt;li&gt;Draw a sample vector (call it &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´) from the Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) computed above, and make this the new value of &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;. Note that you do not need to know Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) to draw this sample. Here I assume that it is &lt;em&gt;easy&lt;/em&gt; to draw an &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;-sample from Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) (e.g. &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; is low-dimensional) but it is &lt;em&gt;difficult&lt;/em&gt; to draw an (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;)-sample from Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) (e.g. (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) is high-dimensional) which is the sampling problem that we want to avoid.&lt;/li&gt;&lt;li&gt;The joint state is now (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;). Its joint probability is Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;), where Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) has the &lt;em&gt;same&lt;/em&gt; functional dependence on &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´ as Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) has on &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;, by construction.&lt;/li&gt;&lt;li&gt;Use Bayes theorem to join these together to obtain Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) = Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;), where Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) has the &lt;em&gt;same&lt;/em&gt; functional dependence on &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´ as Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) has on &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;, by construction.&lt;/li&gt;&lt;/ol&gt;The overall effect of the above steps is that (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) becomes (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;), and the functional form of the joint probability is unchanged by this operation. Excellent! That is exactly what we wanted to do. We now have two pairs of vectors (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) becomes (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;´,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) sampled from Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;). The only downside is that the value of &lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt; is the &lt;em&gt;same&lt;/em&gt; in both pairs of vectors, so they must be strongly correlated with each other.&lt;br /&gt;&lt;br /&gt;The partitioning of the &lt;em&gt;whole&lt;/em&gt; vector into a pair of &lt;em&gt;smaller&lt;/em&gt; vectors (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) can be done in any way, and as long as it is &lt;em&gt;easy&lt;/em&gt; to draw a sample of &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; from Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) then &lt;em&gt;another&lt;/em&gt; (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) (different &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;, same &lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) can be generated as described above. This process can be iterated (choosing a different partitioning of the whole vector at each iteration) to generate a whole sequence (or Markov chain) of vector samples (hence the name "Markov chain Monte Carlo" sampling) all of which have the correct joint probability. The main caveat is that the sequence is &lt;em&gt;correlated&lt;/em&gt;, because each vector in the sequence has a "memory" of the past history of the sequence.&lt;br /&gt;&lt;br /&gt;Note that the above algorithm does &lt;em&gt;not&lt;/em&gt; work if the initial pair of vectors does &lt;em&gt;not&lt;/em&gt; have joint probability Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;). However, under fairly mild assumptions (i.e. ergodicity), the algorithm will generate a sequence of vectors whose joint probability settles down to Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) after an initial convergence period, after which the assumptions of the above algorithm are satisfied.&lt;br /&gt;&lt;br /&gt;The diagrams below show how a typical MCMC sequence building up. This uses the same Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) and diagrammatic notation as was used in my posting on Bayes theorem &lt;a href="http://acenetica.blogspot.com/2005/10/bayes-theorem.html"&gt;here&lt;/a&gt;, and was in fact how I generated the diagrams there. A different random seed has been used here. Each of &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt; has 3 possible states, so (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) has 9 (=3*3) possible joint states.&lt;br /&gt;&lt;br /&gt;The first diagram shows the first 6 steps of the MCMC sequence, where the initial state (3,2) is at the top left, and successive states are shown in the graphics as read from left to right then top to bottom. After 1 MCMC update the state is unchanged (that is the draw of the random numbers), then after 2 MCMC updates the &lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt; part of the state updates so that the state is (3,3), then at the next step the &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; part of the state updates so that the state is (2,3), then the &lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt; part of the state updates so that the state is (2,1), then the next update can't be inferred from the diagram because it repeats a state that occurred earlier. The source of the correlations between successive MCMC samples is clear because each update moves only &lt;em&gt;one&lt;/em&gt; end of the line in the graphic.&lt;br /&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image00091.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image00091.jpg" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;The diagram below shows the same as the diagram above except that 25 MCMC updates elapse between each graphic. Also, for visualisation purposes, the end points of each line representing a joint state are vertically jiggled a little bit so that they don't all fall on top of each other.&lt;br /&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image0009.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image0009.jpg" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;This MCMC approach to sampling is &lt;em&gt;very&lt;/em&gt; powerful, and with some ingenuity it can be used to draw samples from &lt;em&gt;any&lt;/em&gt; joint probability, but always with the caveat that the sequence is correlated.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112907282513395359?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112907282513395359/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112907282513395359' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112907282513395359'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112907282513395359'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/10/markov-chain-monte-carlo-sampling.html' title='Markov chain Monte Carlo sampling'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112906060538696263</id><published>2005-10-12T20:05:00.000Z</published><updated>2005-10-13T17:53:31.786Z</updated><title type='text'>Bayes theorem</title><content type='html'>&lt;a href="http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Bayes.html"&gt;Thomas Bayes (1702-1761)&lt;/a&gt;&lt;br /&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image0006.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image0006.jpg" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;In probability theory Thomas Bayes is legendary for his theorem (known as Bayes theorem). which can be written thus:&lt;br /&gt;&lt;br /&gt;Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) = Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) = Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;)&lt;br /&gt;&lt;br /&gt;Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) ≡ ∫&lt;em&gt;d&lt;strong&gt;y&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;)&lt;br /&gt;Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) ≡ ∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;)&lt;br /&gt;&lt;br /&gt;where Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) is the joint probability of a &lt;em&gt;pair&lt;/em&gt; of vectors &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;, Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) and Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) are marginal probabilities of the &lt;em&gt;single&lt;/em&gt; vectors &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; or &lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt; (respectively), and Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) and Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) are conditional probabilities defined from the above as&lt;br /&gt;&lt;br /&gt;Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) ≡ Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) / ∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;)&lt;br /&gt;Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) ≡ Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) / ∫&lt;em&gt;d&lt;strong&gt;y&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;)&lt;br /&gt;&lt;br /&gt;These relationships can be intuitively explained using diagrams. Thus choose state spaces for &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt; that each have 3 discrete states, so the total number of joint states is 9. Choose a 3 by 3 matrix of joint probabilities Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;), and from Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) independently draw 150 samples of (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;em&gt;&lt;strong&gt;y&lt;/strong&gt;&lt;/em&gt;) at random.&lt;br /&gt;&lt;br /&gt;Now represent each of these samples in the diagram below as a line joining one of the 3 bins on the left (the &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;-bins) to one of the 3 bins on the right (the &lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;-bins). Vertically jiggle the positions of the end points of the lines so that they don't accidentally fall on top of each other. The &lt;em&gt;number&lt;/em&gt; of lines (joining any pair of bins) in this diagram represents the corresponding element of the &lt;em&gt;joint&lt;/em&gt; probability Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;), for each of the 9 (=3*3) possible pairs of bins.&lt;br /&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image0007.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image0007.jpg" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;Now display in the diagrams below the contributions that are anchored in each of the 6 (=3+3) possible bins. The &lt;em&gt;number&lt;/em&gt; of lines (joining any pair of bins) represents the &lt;em&gt;conditional&lt;/em&gt; probability Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) (top row) and Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;│&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) (bottom row). Note that the &lt;em&gt;total&lt;/em&gt; number of lines in each diagram represents the corresponding &lt;em&gt;marginal&lt;/em&gt; probability, i.e. Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) (top row) and Pr(&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;) (bottom row).&lt;br /&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image0008.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image0008.jpg" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;These diagrams give a neat visual summary of the various probability quantities that are interrelated by Bayes theorem. They will provide neat ways of visualising various results that I will describe in future postings to this blog.&lt;br /&gt;&lt;br /&gt;Bayesian purists will object to the above diagrams because they rely on drawing a set of samples &lt;strong&gt;&lt;em&gt;S&lt;/em&gt;&lt;/strong&gt; from the joint probability P(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;), and then using &lt;strong&gt;&lt;em&gt;S&lt;/em&gt;&lt;/strong&gt; as if it was actually the &lt;em&gt;same&lt;/em&gt; object as P(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;,&lt;strong&gt;&lt;em&gt;y&lt;/em&gt;&lt;/strong&gt;). Of course, they are &lt;em&gt;different&lt;/em&gt; objects, but for the purposes of visualisation &lt;strong&gt;&lt;em&gt;S&lt;/em&gt;&lt;/strong&gt; is very useful. You can always "fill in the gaps" in your mind's eye.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112906060538696263?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112906060538696263/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112906060538696263' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112906060538696263'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112906060538696263'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/10/bayes-theorem.html' title='Bayes theorem'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112864222568314340</id><published>2005-10-06T22:54:00.000Z</published><updated>2005-10-10T16:17:13.220Z</updated><title type='text'>Mission to build a simulated brain</title><content type='html'>&lt;p&gt;In New Scientist a while ago there was an article entitled &lt;a href="http://www.newscientist.com/article.ns?id=dn7470"&gt;Mission to build a simulated brain begins&lt;/a&gt; which describes "An effort to create the first computer simulation of the entire human brain, right down to the molecular level...". It goes on to say "The hope is that the virtual brain will help shed light on some aspects of human cognition, such as perception, memory and perhaps even consciousness", and so on.&lt;br /&gt;&lt;br /&gt;"...simulation of the entire human brain"?!! &lt;/p&gt;&lt;ol&gt;&lt;li&gt;That's rather a lot of neurons, and a mind-boggling number of connections. And that's not even going down to the molecular level. The state of the art is very limited compared to the goals of this project. I assume that the report is slightly inaccurate. &lt;/li&gt;&lt;li&gt;For instance, there are detailed computational models of how neurons function, and how they interact in small assemblies; there is the &lt;a href="http://www.neuron.yale.edu/neuron/"&gt;NEURON&lt;/a&gt; software for running these simulations. At a much higher level there are also general architectual models of whole regions of the brain.&lt;/li&gt;&lt;li&gt;In a few cases there are models that span several scales all the way from the bottom level to a fairly high level. I am thinking in particular of the visual cortex where we have the luxury of addressing the brain directly via the retina (which is effectively a layer of brain tissue). This has led to a very good (relative to other areas of the brain) neural model of visual processing in the brain. &lt;/li&gt;&lt;/ol&gt;&lt;p&gt;"...and perhaps even consciousness"?!! &lt;/p&gt;&lt;ol&gt;&lt;li&gt;When I hear the "C" word I reach for my gun! People routinely use the "C" word to inject an air of mystery and importance into what they are saying. The fact is that it is a non-word that is used by people who haven't seriously reflected on what goes on inside their heads.&lt;/li&gt;&lt;li&gt;One book (amongst many) that I have found very insightful on this issue is &lt;a href="http://www.amazon.com/gp/product/019280569X/102-3258438-3366535?v=glance&amp;n=283155&amp;amp;s=books&amp;v=glance"&gt;The Artful Universe Expanded&lt;/a&gt; by John Barrow, in which he lucidly explains how evolution ensures that the universe imprints itself on the way that we we think. This means that the thing that we think with is not as free thinking as we would like to believe that it is. Is it at all surprising that we feel as if there is an inner light (call it "consciousness" if you want)?&lt;/li&gt;&lt;li&gt;Isn't it obvious that evolution ensures that you (whatever you are, human, fish, ant, or whatever) have a sense of "self" to encourage you to survive and reproduce, or have a sense of "host" to make you sacrifice yourself to save the nest? There will be "degrees" of "self" according to the "amount" of "processing" that is used to "implement" it. Yes, there are inverted commas everywhere because the terminology is not precisely defined.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;What sort of brain studies should be given more prominence?&lt;br /&gt;&lt;br /&gt;There is a lot of research going on near the bottom level, such as modelling neural chemical processes, modelling the dynamics of membrane potentials, modelling the interaction of small assemblies of neurons, etc. This is all very good reductionist work that potentially will lead to a deeper understanding.&lt;br /&gt;&lt;br /&gt;However, I can't see much work going on that studies the &lt;em&gt;theory&lt;/em&gt; of low-level brain-like processing, although there is a lot of neural modelling work. What do I mean by that? Brain-like processing is massively parallel, which needs its own special kind of "algorithm" that can be distributed across a large number of small interconnected processors. For really large parallel systems (e.g. brains) you &lt;em&gt;can't&lt;/em&gt; go in and individually program every processor, but you &lt;em&gt;can&lt;/em&gt; envisage the processors having inbuilt behaviours (i.e. a bootstrap program) that allow them to sort out their collective behaviour (i.e. self-organise) to do something macroscopically useful. We certainly need a lot more work on the &lt;em&gt;theory&lt;/em&gt; of this type of massively parallel self-organising system.&lt;/p&gt;&lt;p&gt;There is a &lt;em&gt;vast&lt;/em&gt; amount of research into what is called "neural networks", but which is nothing of the sort. As with the "C" word above, the "NN" phrase is used to guild otherwise rather mediocre research to make it seem more sophisticated than it is. A better phrase to describe most of this research is "non-linear adaptive filtering". The small subset of "NN" research that uses discrete firing events, rather than firing rates, is much better qualified to be called "NN".&lt;/p&gt;&lt;p&gt;There is the important distinction between supervised and unsupervised training:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Supervised training is usually used to guide an adaptive filter so that it approximates an &lt;em&gt;externally&lt;/em&gt; specified output. From the point of view of brain-like processing this is &lt;em&gt;not&lt;/em&gt; a very interesting type of adaptation.&lt;/li&gt;&lt;li&gt;Unsupervised training is &lt;em&gt;much&lt;/em&gt; more interesting because it does &lt;em&gt;not&lt;/em&gt; require an output to be externally specified, so the adaptive filter must discover for &lt;em&gt;itself&lt;/em&gt; something "interesting" about its input; this is like answering an open-ended question, which is much more challenging than a question with a known answer. From the point of view of brain-like processing this is a &lt;em&gt;very&lt;/em&gt; interesting type of adaptation.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Too much time is spent on &lt;em&gt;supervised&lt;/em&gt; training by people doing what they &lt;em&gt;think&lt;/em&gt; is true "NN" research. They are kidding themselves. How do they expect &lt;em&gt;very large&lt;/em&gt; networks to be trained when all the external supervisor can do is to guide the network &lt;em&gt;outputs&lt;/em&gt;? Is supervised training really going to help the deep inner workings of the network to learn anything useful? Wouldn't it be much better to distribute throughout the network the ability to discover "interesting" structure? Doesn't this sound rather like what &lt;em&gt;unsupervised&lt;/em&gt; training is aiming to do anyway?&lt;/p&gt;&lt;p&gt;In &lt;a href="http://www.newscientist.com/article.ns?id=dn7470"&gt;Mission to build a simulated brain begins&lt;/a&gt; I want to see more theoretical work, especially on abstract brain-like processing and (more specifically) unsupervised training of neural networks that use discrete firing events, which would &lt;em&gt;then&lt;/em&gt; guide the modelling work that they hope to do anyway. To understand what a large network simulation is doing you need a theory that is formulated at an appropriate level. It is &lt;em&gt;not&lt;/em&gt; always a good idea to drill down to the bottom level of modelling (see my comments on &lt;a href="http://acenetica.blogspot.com/2005/09/reductionism-versus-emergentism.html"&gt;reductionism versus emergentism&lt;/a&gt;).&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112864222568314340?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112864222568314340/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112864222568314340' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112864222568314340'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112864222568314340'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/10/mission-to-build-simulated-brain.html' title='Mission to build a simulated brain'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112853299289120224</id><published>2005-10-05T17:01:00.000Z</published><updated>2005-10-10T20:01:18.820Z</updated><title type='text'>Whither this blog?</title><content type='html'>I have had look at the various postings that I have made thus far and notice that, although they are "on topic", they are quite variable in their style. Some verge on seeming to be philosophical (e.g. &lt;a href="http://acenetica.blogspot.com/2005/09/reductionism-versus-emergentism.html"&gt;here&lt;/a&gt;) whilst others use a fairly demanding level of maths/software (e.g. &lt;a href="http://acenetica.blogspot.com/2005/09/vector-quantiser-simulation.html"&gt;here&lt;/a&gt;), including several future postings that I currently have in draft form. I even had a comment by private email that their "brain started to explode half way through the VQ article"! I had rather hoped I would get all my comments directly to the blog; there seems to be rather a shortage of them. I seem to be talking to myself!&lt;br /&gt;&lt;br /&gt;I operate in any (all, some, or none) of these modes of expression, and my writing and maths tends to show it. It will take a while for me to converge on an optimum approach to posting on this blog, but I suspect it will always mix together different styles. I will try to make the maths as self-evident as possible. After all, I don't think you have really understood a piece of maths until it you can explain it in such an intuitively obvious way that people ask why they are paying you do do something so trivial.&lt;br /&gt;&lt;br /&gt;I will post software written in &lt;em&gt;Mathematica&lt;/em&gt; to illustrate the points I am making. Maybe some (most?) people don't like my choice of computer language, but I have given my reasons for using Mathematica &lt;a href="http://acenetica.blogspot.com/2005/09/mathematica.html"&gt;here&lt;/a&gt;. It blurs the distinction between maths and algorithms which I think is a &lt;em&gt;big step forwards&lt;/em&gt;. This is one of the main purposes of Stephen Wolfram's &lt;a href="http://www.wolframscience.com/"&gt;New Kind of Science&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I don't think I have yet started to put in enough links to the rest of the world, so I can hardly expect other people to link to me (they haven't yet, as far as I know!). I even tried to set up a trackback from one of my earlier postings &lt;a href="http://acenetica.blogspot.com/2005/09/my-online-papers.html"&gt;here&lt;/a&gt; to an arXiv paper of mine &lt;a href="http://arxiv.org/abs/cs.LG/0410036"&gt;here&lt;/a&gt;. The trackback was received (it appeared on arXiv's "recent trackbacks" page) , but it never got attached to my paper. I guess the arXiv administrators spiked it because it was a self-comment. That's OK, and it is the appropriate policy for arXiv.&lt;br /&gt;&lt;br /&gt;I have had a poke around blogspace and I can't find anyone at all who is posting in the area of self-organising networks (or even self-organi&lt;strong&gt;&lt;span style="font-size:130%;"&gt;z&lt;/span&gt;&lt;/strong&gt;ing networks!). There are blogs which have to do with self-organising "social" networks, but that isn't what this blog is about. So how do you start off a blogging community in a new subject area? My own areas of expertise span quantum physics as well as self-organising networks, and I intend to show just how inter-related these two areas are in future postings, so at some point my postings will start to look much more like quantum physics. I hope things will start to interest a lot more people then.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Update:&lt;/strong&gt; The above arXiv trackback eventually appeared a little over a week after I pinged arXiv. Because this was my first such trackback I must have triggered the &lt;em&gt;manual&lt;/em&gt; part of the "semi-automated editorial process that approves trackbacks for display". I pinged arXiv with some more trackbacks, and this time they appeared straight away. So it seems that I have been judged to be "worthy". I will use the privilege wisely.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112853299289120224?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112853299289120224/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112853299289120224' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112853299289120224'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112853299289120224'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/10/whither-this-blog.html' title='Whither this blog?'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112750367607239295</id><published>2005-09-30T18:19:00.000Z</published><updated>2005-09-30T18:53:39.360Z</updated><title type='text'>Vector quantiser simulation</title><content type='html'>&lt;p&gt;The aim here is to show how to write the code to implement the training of a vector quantiser (VQ) as described in my earlier posting &lt;a href="http://acenetica.blogspot.com/2005/09/basic-self-organising-network-vector.html"&gt;here&lt;/a&gt;. The code is implemented as a discrete time dynamical system (DTDS) as described in another of my earlier postings &lt;a href="http://acenetica.blogspot.com/2005/09/discrete-time-dynamical-systems.html"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Call in some graphics packages.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;Needs["Graphics`Graphics`"];&lt;br /&gt;Needs["Graphics`MultipleListPlot`"];&lt;br /&gt;Needs["DiscreteMath`ComputationalGeometry`"];&lt;/span&gt;&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Define the size of the training set (i.e. the number of vectors sampled from the input probability density) and the number of code indices to use (i.e. the size of the "code book").&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;numtrain=100;&lt;br /&gt;numcodes=20;&lt;/span&gt;&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Define a function &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;nn&lt;/span&gt;&lt;/strong&gt; to compute the index of the vector (in a list of reconstruction vectors &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;r&lt;/span&gt;&lt;/strong&gt;) that lies closest (i.e. is the "nearest neighbour") to an input vector &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;x&lt;/span&gt;&lt;/strong&gt;. This is the encoded version of the input vector.&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;&lt;strong&gt;&lt;br /&gt;nn[x_,r_]:=Position[#,Min[#]]&amp;[Map[#.#&amp;amp;[x-#]&amp;,r]][[1,1]];&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Read this definition from the inside out:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;Map[#.#&amp;[x-#]&amp;amp;,r]&lt;/span&gt;&lt;/strong&gt; computes the list of squared distances between the input vector &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;x&lt;/span&gt;&lt;/strong&gt; and each vector in the list of vectors &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;r&lt;/span&gt;&lt;/strong&gt; (the "code book").&lt;/li&gt;&lt;li&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;Position[#,Min[#]]&amp;[...][[1,1]]&lt;/span&gt;&lt;/strong&gt; computes the position in the list of the smallest of the above squared distances.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Define a function &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;y&lt;/strong&gt;&lt;/span&gt; to build a list of the codes corresponding to the whole set of training data at time step &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;t&lt;/span&gt;&lt;/strong&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;&lt;strong&gt;y[t_]:=(y[t]=Map[nn[#,r[t]]&amp;amp;,data[t]]);&lt;/strong&gt;&lt;/span&gt; &lt;/p&gt;&lt;p&gt;Read this definition from the inside out:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-family:courier new;"&gt;&lt;strong&gt;data[t]&lt;/strong&gt;&lt;/span&gt;&lt;span style="font-size:0;"&gt; &lt;/span&gt;is a function that returns the training set at time t.&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:courier new;"&gt;&lt;strong&gt;Map[nn[#,r[t]]&amp;,data[t]]&lt;/strong&gt;&lt;/span&gt; computes the list of encoded versions of all of the input vectors in the training set.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;y[t]=...&lt;/span&gt;&lt;/strong&gt; memorises the result of the above step.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Define a function &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;datay&lt;/span&gt;&lt;/strong&gt; to build a list of the training data that maps to each code at time step &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;t&lt;/span&gt;&lt;/strong&gt;.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;datay[t_]:=Table[Cases[Transpose[{data[t],y[t]}],{x_,i}:&gt;x],{i,numcodes}];&lt;/span&gt;&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Read this definition from the inside out:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;Transpose[{data[t],y[t]}]&lt;/span&gt;&lt;/strong&gt; builds a list of pairs of corresponding input vectors and their codes.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;Cases[...,{x_,i}:&gt;x]&lt;/span&gt;&lt;/strong&gt; extracts the subset of data vectors that are paired with code &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;i&lt;/span&gt;&lt;/strong&gt;.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;Table[...,{i,numcodes}]&lt;/span&gt;&lt;/strong&gt; builds a list of the above subsets for all codes.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Define a function &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;r&lt;/strong&gt;&lt;/span&gt; to compute the updated list of reconstruction vectors at time step &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;t&lt;/span&gt;&lt;/strong&gt;.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;r[t_]:=(r[t]=Map[If[Length[#]&gt;0,Apply[Plus,#]/Length[#],Table[Random[],{2}]]&amp;,datay[t-1]]);&lt;/span&gt;&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Read this definition from the inside out:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;Apply[Plus,#]/Length[#]&amp;[datay[t-1]]&lt;/span&gt;&lt;/strong&gt; computes the updated list of reconstruction vectors from the list of subsets of training data that maps to each code at time step &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;t-1&lt;/span&gt;&lt;/strong&gt;.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;If[Length[#]&gt;0,...,Table[Random[],{2}]]&lt;/span&gt;&lt;/strong&gt; protects the above code from a possible error codition when the length of a sublist is zero (i.e. none of the training vectors map to a particular code).&lt;/li&gt;&lt;li&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;r[t]=...&lt;/span&gt;&lt;/strong&gt; memorises the result of the above step.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Define the value that the &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;r&lt;/span&gt;&lt;/strong&gt; function returns at time step &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;0&lt;/span&gt;&lt;/strong&gt;. This initialises the reconstruction vectors.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;r[0]=Table[Random[],{numcodes},{2}];&lt;/span&gt;&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Define a function &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;data&lt;/span&gt;&lt;/strong&gt; that returns the set of training data at time step &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;t&lt;/span&gt;&lt;/strong&gt;. This is a &lt;em&gt;fixed&lt;/em&gt; set (for all time steps) of 2-dimensional vectors uniformly distributed in the unit square.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;data[t_]=Table[Random[],{numtrain},{2}];&lt;/span&gt;&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;Define the number of training updates. Typically, this VQ converges in far fewer updates than the 20 chosen.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;&lt;strong&gt;numupdates=20;&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Display the training of the reconstruction vectors.&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;&lt;strong&gt;&lt;br /&gt;vqgraphics=Table[MultipleListPlot[data[t],r[t],PlotRange-&gt;{{0,1},{0,1}},AspectRatio-&gt;1,Frame-&gt;True,SymbolStyle-&gt;{RGBColor[0,0,1],RGBColor[1,0,0]},SymbolShape-&gt;{PlotSymbol[Triangle],PlotSymbol[Box]},PlotLabel-&gt;"Step "&lt;&gt;ToString[t],FrameTicks-&gt;False],{t,0,numupdates}];&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;[GRAPHICS OMITTED]&lt;br /&gt;&lt;br /&gt;Display the encoding cells corresponding to the reconstruction vectors. &lt;span style="font-family:courier new;"&gt;&lt;br /&gt;&lt;strong&gt;&lt;br /&gt;codecellgraphics=Table[DiagramPlot[r[t],PlotRange-&gt;{{0,1},{0,1}},LabelPoints-&gt;False,PlotLabel-&gt;"Step "&lt;&gt;ToString[t],Frame-&gt;True,FrameTicks-&gt;False],{t,0,numupdates}];&lt;/strong&gt;&lt;/span&gt;&lt;strong&gt;&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;[GRAPHICS OMITTED]&lt;br /&gt;&lt;br /&gt;Display all of the graphics together.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;MapThread[Show[#1,#2]&amp;,{codecellgraphics,vqgraphics}];&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:0;"&gt;&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;[GRAPHICS OMITTED] &lt;/p&gt;&lt;p&gt;Display the first few update steps to show the convergence to an optimum VQ. This example has been chosen deliberately so that it converges fast. It usually takes more than 3 update steps for the optimisation of this VQ to converge.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;DisplayTogetherArray[Partition[Take[MapThread[Show[#1,#2]&amp;,{codecellgraphics,vqgraphics}],4],2]];&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:Courier New;"&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image0004.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image0004.jpg" border="0" /&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;Because this DTDS implementation of the training of a VQ &lt;em&gt;memorises&lt;/em&gt; the values returned by the &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;y&lt;/span&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;r&lt;/span&gt;&lt;/strong&gt; functions you need to erase these values using &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;y=.&lt;/span&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;r=.&lt;/span&gt;&lt;/strong&gt; before retraining the VQ.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112750367607239295?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112750367607239295/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112750367607239295' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112750367607239295'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112750367607239295'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/09/vector-quantiser-simulation.html' title='Vector quantiser simulation'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112802308166046068</id><published>2005-09-29T19:44:00.000Z</published><updated>2005-09-30T16:29:22.003Z</updated><title type='text'>Basic self-organising network: vector quantiser</title><content type='html'>The simplest self-organising network (SON) that computes non-linear transformations of data is the vector quantiser (VQ).&lt;br /&gt;&lt;br /&gt;A VQ does two simultaneous jobs:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Encoding:&lt;/strong&gt; take each continuous-valued input vector &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; and use an encoder &lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) to map it to a discrete-valued code &lt;em&gt;k&lt;/em&gt;.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Decoding:&lt;/strong&gt; take each discrete-valued code &lt;em&gt;k&lt;/em&gt; and use a decoder &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) to map it to a continuous-valued input &lt;em&gt;&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt;.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image0003.jpg"&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image00031.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image00031.jpg" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Thus a VQ defines a pair of &lt;em&gt;oppositely&lt;/em&gt; directed mappings &lt;strong&gt;&lt;em&gt;x &lt;/em&gt;&lt;/strong&gt;→ &lt;em&gt;k&lt;/em&gt; (i.e. &lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;)) and &lt;em&gt;k &lt;/em&gt;→ &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; (i.e. &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;)). The "quality" of the encoding may be judged by going once round the encode/decode loop thus &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; → &lt;em&gt;k&lt;/em&gt; → &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;′, and then comparing the original input vector &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; with its reconstruction &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;′. A good VQ will have a small reconstruction error, and to achieve this the two mappings &lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) and &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) must work together harmoniously. &lt;/p&gt;&lt;p&gt;Now let me explicitly show how to optimise a VQ.&lt;br /&gt;&lt;br /&gt;A simple measure of the "goodness" of a VQ is a "Euclidean" objective function &lt;em&gt;D&lt;/em&gt; that measures the average squared reconstruction error, which is defined as&lt;br /&gt;&lt;br /&gt;&lt;em&gt;D&lt;/em&gt; = ∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;)║&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; - &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;))║&lt;sup&gt;2&lt;/sup&gt;&lt;br /&gt;&lt;br /&gt;where the various pieces have the following interpretation:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;)) encodes (and then decodes) an input vector &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; to produce a reconstruction.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; - &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;)) computes the difference between the original input vector and its reconstruction.&lt;/li&gt;&lt;li&gt;║&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; - &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;))║&lt;sup&gt;2&lt;/sup&gt; computes the squared length of the above difference vector.&lt;/li&gt;&lt;li&gt;∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;)║...║&lt;sup&gt;2&lt;/sup&gt; averages the average of the above squared length over input vectors &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; with probability density Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;).&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Minimisation of this objective function with respect to the two mappings &lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) and &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) optimises the VQ. &lt;/p&gt;&lt;p&gt;By inspection, the optimum &lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) is given by&lt;br /&gt;&lt;br /&gt;&lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) = argmin&lt;sub&gt;&lt;em&gt;k&lt;/em&gt;&lt;/sub&gt;║&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; - &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;)║&lt;sup&gt;2&lt;/sup&gt;&lt;br /&gt;&lt;br /&gt;Differentiate &lt;em&gt;D&lt;/em&gt; with respect to &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;)&lt;br /&gt;&lt;br /&gt;∂&lt;em&gt;D&lt;/em&gt;/∂&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) = -2∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) δ&lt;sub&gt;&lt;em&gt;k&lt;/em&gt;,&lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;)&lt;/sub&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; - &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;))&lt;br /&gt;&lt;br /&gt;where δ&lt;sub&gt;&lt;em&gt;k&lt;/em&gt;,&lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;)&lt;/sub&gt; is a Kronecker delta. The optimum &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) satisfies ∂&lt;em&gt;D&lt;/em&gt;/∂&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) = &lt;strong&gt;0&lt;/strong&gt; (strictly, this is a necessary but &lt;em&gt;not&lt;/em&gt; sufficient condition to locate the global optimum) which is given by&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) = (∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) δ&lt;sub&gt;&lt;em&gt;k&lt;/em&gt;,&lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) &lt;/sub&gt;&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) / (∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) δ&lt;sub&gt;&lt;em&gt;k&lt;/em&gt;,&lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;)&lt;/sub&gt;)&lt;/p&gt;&lt;p&gt;This pair of coupled non-linear equations for the optimum mappings &lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) and &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) can be solved iteratively.&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Randomly initialise the &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) for &lt;em&gt;k&lt;/em&gt; = 1, 2, ..., &lt;em&gt;m&lt;/em&gt; where &lt;em&gt;m&lt;/em&gt; is the size of the "code book" of &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) vectors to be optimised.&lt;/li&gt;&lt;li&gt;Compute the right hand side of the &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) = (...) / (...) equation, using the &lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) = ... equation to compute &lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;x&lt;/strong&gt;) whenever it is needed. The integrals ∫&lt;em&gt;d&lt;strong&gt;x&lt;/strong&gt;&lt;/em&gt; Pr(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) (...) are approximated using a finite-sized set of input vectors &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; (normally called the "training set").&lt;/li&gt;&lt;li&gt;The result for (...) / (...) from step 2 above will in general &lt;em&gt;not&lt;/em&gt; be the same as the previously assigned values for the &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;). The resolution of this is to overwrite the previous values of the &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) with the newly computed (...) / (...) from step 2 above, and to iterate this process.&lt;/li&gt;&lt;li&gt;When the &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) have settled down so that successive iterations give (approximately) the same result the optimisation has converged to a minimum of the objective function. Note that this is guaranteed to be a &lt;em&gt;local&lt;/em&gt; minimum, but &lt;em&gt;not&lt;/em&gt; guaranteed to be a &lt;em&gt;global&lt;/em&gt; minimum.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Why is a VQ a SON?&lt;/p&gt;&lt;p&gt;The encoder and decoder pair of mappings &lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) and &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;) bidirectionally connect two spaces (&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;-space and &lt;em&gt;k&lt;/em&gt;-space) thus &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; ↔ &lt;em&gt;k&lt;/em&gt;. The encode/decode loop &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; → &lt;em&gt;k&lt;/em&gt; → &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;′ contains all the places that &lt;em&gt;k&lt;/em&gt; appears, i.e. mapping &lt;em&gt;k&lt;/em&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;), &lt;em&gt;k&lt;/em&gt;-space, and mapping &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;em&gt;k&lt;/em&gt;). The structure of this encode/decode loop is determined entirely by the minimisation of an objective function, and the term "self-organisation" is used to describe the optimisation process that creates the loop &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt; → &lt;em&gt;k&lt;/em&gt; → &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;′. As a side effect of the creation of this encode/decode loop a new space (i.e. &lt;em&gt;k&lt;/em&gt;-space) is created that is able to "represent" states in &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;-space.&lt;/p&gt;&lt;p&gt;This creation of mappings to and from new "representation" spaces is a defining feature of SONs, and the VQ is the simplest example using &lt;em&gt;non-linear&lt;/em&gt; mappings that I know of. If the discrete-valued &lt;em&gt;k&lt;/em&gt; is replaced by a continuous-valued &lt;em&gt;k&lt;/em&gt;, or more generally a vector &lt;strong&gt;&lt;em&gt;k&lt;/em&gt;&lt;/strong&gt;, then using a Euclidean objective function the optimum mappings &lt;strong&gt;&lt;em&gt;k&lt;/em&gt;&lt;/strong&gt;(&lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;) and &lt;strong&gt;&lt;em&gt;x&lt;/em&gt;&lt;/strong&gt;(&lt;strong&gt;&lt;em&gt;k&lt;/em&gt;&lt;/strong&gt;) are &lt;em&gt;linear&lt;/em&gt; (this is "principal components analysis", or PCA). Note that networks of linear mappings can build &lt;em&gt;only&lt;/em&gt; linear mappings, whereas networks of non-linear mappings can build &lt;em&gt;all&lt;/em&gt; mappings. It is very convenient that merely using a discrete-valued (rather than a continuous-valued) &lt;strong&gt;&lt;em&gt;k&lt;/em&gt;&lt;/strong&gt; or &lt;em&gt;k&lt;/em&gt; leads to the useful non-linearity found in a VQ.&lt;/p&gt;&lt;p&gt;This use of discrete-valued quantities turns out to absolutely crucial for building a generalised theory of SONs. I will repeatedly return to this theme in the future.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112802308166046068?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112802308166046068/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112802308166046068' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112802308166046068'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112802308166046068'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/09/basic-self-organising-network-vector.html' title='Basic self-organising network: vector quantiser'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112795356459513205</id><published>2005-09-29T00:22:00.000Z</published><updated>2005-10-09T14:41:06.326Z</updated><title type='text'>Quantum mechanics</title><content type='html'>I have always loved this cartoon which I saw in an old copy of Private Eye. I know it's &lt;em&gt;not&lt;/em&gt; about self-organising networks (SON), but QM runs deep in my veins so I thought I would share it.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image00022.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image00022.jpg" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;Actually, SONs and QM are more closely related than you might imagine; I will say more about that in the future.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112795356459513205?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112795356459513205/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112795356459513205' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112795356459513205'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112795356459513205'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/09/quantum-mechanics.html' title='Quantum mechanics'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112750587212405054</id><published>2005-09-25T15:22:00.000Z</published><updated>2005-09-29T17:55:45.123Z</updated><title type='text'>Discrete time dynamical systems</title><content type='html'>What is a discrete time dynamical system (DTDS)?&lt;br /&gt;&lt;br /&gt;A DTDS is a dynamical system that evolves its state in discrete time steps, rather than through continuous evolution as is usually the case. By choosing a small enough time step a DTDS can approximate a continuous-time dynamical system. All algorithms can be formulated in the language of DTDS, because algorithms are stepwise procedures whose steps can be implemented as individual DTDS time steps. An algorithm running on a piece of hardware is nothing more than a set of rules telling that hardware how to update its state at each tick of the clock. This idea generalises to multiple pieces of hardware with independent clocks.&lt;br /&gt;&lt;br /&gt;So DTDS are really useful. How can they be neatly implemented in software?&lt;br /&gt;&lt;br /&gt;DTDS are very easy to implement in &lt;em&gt;Mathematica&lt;/em&gt;. The archetypal DTDS is a cellular automaton, and where the "universe" consists of an array of discrete-valued cells, and the "laws of physics" consist of a set of cell update rules. A well known example of a cellular automaton is Conway's "Game of Life". More generally, the states of the cells can be anything you want (discrete-valued or continuous-valued), and the update rules can be implemented as any update algorithm you want to apply to the current state of the "universe" of cells. &lt;em&gt;Mathematica&lt;/em&gt; has been designed with the simulation of cellular automata in mind, which is why DTDS are very easy to implement in &lt;em&gt;Mathematica&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;The Lorenz attractor is a good example to illustrate the implementation of a DTDS in &lt;em&gt;Mathematica&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;Initialise a pair of constants. The value of &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;e&lt;/span&gt;&lt;/strong&gt; controls the size of the &lt;em&gt;discrete&lt;/em&gt; steps used by the DTDS version of the Lorentz equations to approximate the &lt;em&gt;continuous&lt;/em&gt; version of the Lorenz equations.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;&lt;strong&gt;e=0.013;&lt;br /&gt;r=28;&lt;br /&gt;&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;Define the initial conditions.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;&lt;strong&gt;x[0]={0.1,0.1,10.0};&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Define the Lorenz equations.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;x[t_]:=x[t]=&lt;br /&gt;{&lt;br /&gt;&amp;nbsp&amp;nbspx[t-1][[1]]+10e(x[t-1][[2]]-x[t-1][[1]]),&lt;br /&gt;&amp;nbsp&amp;nbspx[t-1][[2]]+e(r x[t-1][[1]]-x[t-1][[1]]x[t-1][[3]]-x[t-1][[2]]),&lt;br /&gt;&amp;nbsp&amp;nbspx[t-1][[3]]+e(x[t-1][[1]]x[t-1][[2]]-(8x[t-1][[3]])/3)&lt;br /&gt;};&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;The &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;x[t_]:=x[t]=...&lt;/strong&gt;&lt;/span&gt; construct is &lt;em&gt;Mathematica&lt;/em&gt;'s way of implementing dynamic programming, where the result of evaluating the right hand side of the Lorenz equations &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;&lt;right&gt;&lt;/strong&gt;&lt;/span&gt;is "memorised" using &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;x[t]=...&lt;right&gt;&lt;/strong&gt;&lt;/span&gt;, and then &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;x[t_]:=...&lt;memorised&gt;&lt;/strong&gt;&lt;/span&gt; can subsequently be used to retrieve this memorised value. This dynamical programming trick ensures that when going from &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;t-1&lt;/strong&gt;&lt;/span&gt; to &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;t&lt;/strong&gt;&lt;/span&gt; the DTDS memorises its updated state &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;x[t]&lt;/strong&gt;&lt;/span&gt;, so that it never needs to recompute &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;x[t]&lt;/strong&gt;&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Call in a graphics package.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;&lt;strong&gt;Needs["Graphics`Graphics3D`"];&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Plot the Lorenz attractor.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;&lt;strong&gt;ScatterPlot3D[Table[x[t],{t,0,1000}],PlotJoined-&gt;True,AspectRatio-&gt;1];&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://photos1.blogger.com/blogger/4156/1104/1600/image00021.jpg"&gt;&lt;img style="CURSOR: hand" alt="" src="http://photos1.blogger.com/blogger/4156/1104/320/image00021.jpg" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Although &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;{x[1],x[2],...,x[1000]}&lt;/strong&gt;&lt;/span&gt; have &lt;em&gt;not&lt;/em&gt; actually been evaluated prior to evaluating &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;ScatterPlot3D[...]&lt;/span&gt;&lt;/strong&gt;, their evaluation is then &lt;em&gt;forced&lt;/em&gt; by evaluating &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;Table[x[t],{t,0,1000}]&lt;/span&gt;&lt;/strong&gt;. In effect the evolution of the DTDS happens as a &lt;em&gt;side effect &lt;/em&gt;of asking for its evolution to be plotted. The memorisation of intermediate results using the &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;x[t_]:=x[t]=...&lt;/span&gt;&lt;/strong&gt; construct then ensures that all of &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;{x[0],x[1],x[2],...,x[1000]}&lt;/span&gt;&lt;/strong&gt; are available for immediate reuse, and do &lt;em&gt;not&lt;/em&gt; need to be recomputed.&lt;br /&gt;&lt;br /&gt;This memorisation of results means that if you want to restart the computation using different initial conditions &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;x[0]&lt;/strong&gt;&lt;/span&gt;, then you have to explicitly &lt;em&gt;erase&lt;/em&gt; the previously computed values &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;{x[1],x[2],...,x[1000]}&lt;/strong&gt;&lt;/span&gt; otherwise they will &lt;em&gt;not&lt;/em&gt; be recomputed based on the new initial condition. An approach that avoids the need to erase values is to enforce the new initial condition on &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;x[1001]&lt;/strong&gt;&lt;/span&gt;, and then to carry the computations forwards from there as &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;{x[1002],x[1003],...}&lt;/strong&gt;&lt;/span&gt;. I prefer this latter method because it avoids doing what is effectively a backwards jump in time.&lt;br /&gt;&lt;br /&gt;Tag: &lt;a href="http://technorati.com/tag/Mathematica" rel="tag"&gt;Mathematica&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112750587212405054?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112750587212405054/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112750587212405054' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112750587212405054'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112750587212405054'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/09/discrete-time-dynamical-systems.html' title='Discrete time dynamical systems'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112759417864874117</id><published>2005-09-25T13:00:00.000Z</published><updated>2005-09-26T18:52:40.140Z</updated><title type='text'>Including maths in blog postings</title><content type='html'>In the not too distant future it should be very easy to include maths on web pages in a way that is understood by &lt;em&gt;all&lt;/em&gt; web browsers. It is certainly possible to do this right now, but maths is very definitely a second class citizen as far as web pages are concerned.&lt;br /&gt;&lt;br /&gt;The technology that helps us to include maths (and other structured objects) on web pages is XML (e&lt;strong&gt;X&lt;/strong&gt;tensible &lt;strong&gt;M&lt;/strong&gt;ark-up &lt;strong&gt;L&lt;/strong&gt;anguage) which is like a very fancy version of the HTML that we all know and love. I have been tracking the development of &lt;a href="http://www.w3.org/XML"&gt;XML&lt;/a&gt; with interest for many years.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.w3.org/Math"&gt;MathML&lt;/a&gt; is one of the mark-up languages that has been defined using XML, and its use in blogs has been frequently discussed by Jacques Distler in his blog &lt;a href="http://golem.ph.utexas.edu/~distler/blog/"&gt;here&lt;/a&gt;. However, I am not aware of any shrink-wrapped solutions to the problem of blogging with MathML. It seems to always require too much hacking around with "utilities", which is something I &lt;em&gt;could&lt;/em&gt; do but am too lazy to bother with.&lt;br /&gt;&lt;br /&gt;MathML has two features that I very much like:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;It explicitly represents the full structure of mathematical expressions. This means that it is not a snapshot of the &lt;em&gt;finished&lt;/em&gt; equation, but is an algorithmic description of how to &lt;em&gt;construct&lt;/em&gt; the equation, keeping the interrelationship between its various parts intact.&lt;/li&gt;&lt;li&gt;The maths is part of the web page itself, and is not a set of embedded images (called in from hundreds of small image files) as has been the case in the past (e.g. &lt;a href="http://www.latex2html.org/"&gt;LaTeX2HTML&lt;/a&gt;, &lt;em&gt;Mathematica&lt;/em&gt;'s "Save as HTML", etc).&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Currently few browsers understand MathML directly (e.g. Internet Explorer needs the free &lt;a href="http://www.dessci.com/en/products/mathplayer"&gt;MathPlayer&lt;/a&gt; plug-in), and even then they do not necessarily display the maths very well. This technology is still not mature.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;I particularly like MathML (and more generally XML) because its design allows it to be easily imported/exported to/from &lt;em&gt;Mathematica&lt;/em&gt;. The reason for this ease of communication is that the internal representation used by &lt;em&gt;Mathematica&lt;/em&gt; is essentially the same as XML.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Mathematica&lt;/em&gt; also offers another less pretty way of including maths on web pages, which also preserves the structure of the maths in the same way that MathML does. Thus you can use &lt;em&gt;Mathematica&lt;/em&gt; &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;InputForm&lt;/strong&gt;&lt;/span&gt; to write out maths in a 1-dimensional notation. For instance, &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;Sin[x]==Integrate[Cos[x],x]&lt;/strong&gt;&lt;/span&gt; has the obvious meaning, and it will evaluate to &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;True&lt;/strong&gt;&lt;/span&gt; in &lt;em&gt;Mathematica&lt;/em&gt;. This approach generalises to arbitrarily complicated expressions but it can become rather "wordy", so careful indenting to highlight the structure of expressions is a good idea.&lt;br /&gt;&lt;br /&gt;The use of MathML has not yet become widely enough acknowledged that all web software knows how to display it. For instance, the blog editing software I am using (see &lt;a href="http://www.blogger.com"&gt;www.blogger.com&lt;/a&gt;) doesn't understand MathML (or, at least, I can't find a way of doing it). So I will limit myself to using &lt;em&gt;Mathematica&lt;/em&gt; &lt;strong&gt;&lt;span style="font-family:courier new;"&gt;InputForm&lt;/span&gt;&lt;/strong&gt; for now in my blog postings.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112759417864874117?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112759417864874117/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112759417864874117' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112759417864874117'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112759417864874117'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/09/including-maths-in-blog-postings.html' title='Including maths in blog postings'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112755188919678795</id><published>2005-09-25T02:00:00.000Z</published><updated>2005-09-29T17:45:00.993Z</updated><title type='text'>Functional versus procedural programming</title><content type='html'>The "functional programming" style that I describe here is &lt;em&gt;not&lt;/em&gt; the purist's version of functional programming, where assignment of results to variables does &lt;em&gt;not&lt;/em&gt; occur. I have a freer interpretation of the term "functional programming" which is explained by the physical analogy below.&lt;br /&gt;&lt;br /&gt;The relationship between functional and procedural programming is succinctly explained by using a physical analogy. Space-time is 4-dimensional and it contains a copy of the 3 spatial dimensions &lt;em&gt;for each and every&lt;/em&gt; possible time instant. At each time instant there will be fields recorded throughout the 3-dimensional space, and the evolution of these fields forwards to the next time instant will be governed by whatever laws apply to those fields.&lt;br /&gt;&lt;br /&gt;The two styles of programming may then be described thus:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Functional programming is the evolution of the fields so that they progressively fill up successive 3-dimensional slices of the 4-dimensional space-time.&lt;/li&gt;&lt;li&gt;Procedural programming is the same as functional programming &lt;em&gt;excep&lt;/em&gt;t that at each time instant the 3-dimensional slice containing the evolved copy of the fields &lt;em&gt;overwites&lt;/em&gt; the 3-dimensional slice containing the old copy of the fields.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Thus procedural programming &lt;em&gt;reuses&lt;/em&gt; storage whereas functional programming does &lt;em&gt;not&lt;/em&gt;. A procedural program operates on the state of the world and updates it in place, whereas functional programming makes a copy which it then updates.&lt;/p&gt;&lt;p&gt;A key advantage of functional programming is that the causal chain of updates is laid out explicitly so that an "audit trail" of what has happened during the updates is recorded, whereas in procedural programming this record is overwritten. This means that when a functional program needs to make use of the state of a particular variable at a particular point in time then that state is guaranteed to be available, whereas in a procedural program there is always the danger that the value has been overwritten.&lt;/p&gt;&lt;p&gt;In &lt;em&gt;Mathematica&lt;/em&gt; a concrete example of this is as follows:&lt;/p&gt;&lt;pre&gt;&lt;strong&gt;&lt;span style="font-family:courier new;"&gt;negativeQ[x_]:=x&lt;0;&lt;br /&gt;f[0]=0;&lt;br /&gt;g[0]=1;&lt;br /&gt;f[t_?negativeQ]:=0;&lt;br /&gt;g[t_?negativeQ]:=0;&lt;br /&gt;f[t_]:=f[t-1]-g[t-1];&lt;br /&gt;g[t_]:=g[t-1]-f[t-2];&lt;/span&gt;&lt;/strong&gt;&lt;/pre&gt;where the state of the world at time &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;t&lt;/strong&gt;&lt;/span&gt; is &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;{f[t],g[t]}&lt;/strong&gt;&lt;/span&gt; which depends on the state at the previous &lt;em&gt;two&lt;/em&gt; time instants &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;t-1&lt;/strong&gt;&lt;/span&gt; and &lt;span style="font-family:courier new;"&gt;&lt;strong&gt;t-2&lt;/strong&gt;&lt;/span&gt;. This dependence on various past states is handled completely automatically in functional programming, whereas in procedural programming the overwriting of past states must be carefully buffered in order to ensure that the state at the previous &lt;em&gt;two&lt;/em&gt; time instant is available. Clearly, this is a trivial example, but the principle generalises to cases where functional programming is &lt;em&gt;enormously&lt;/em&gt; more convenient than procedural programming.&lt;/p&gt;&lt;p&gt;Excessive memory usage by functional programs can be avoided by adopting two strategies:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Do as much computation as possible between each time step, because intermediate results that are &lt;em&gt;not&lt;/em&gt; assigned to variables are &lt;em&gt;not&lt;/em&gt; permanently recorded. The garbage collector will take care of this temporary memory usage.&lt;/li&gt;&lt;li&gt;Explicitly erase variables that are no longer required. In effect this creates blank areas in the "audit trail" where parts of the causal chain of updates are "shredded".&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Functional programming is a very natural way of writing out discretised evolution equations (as illustrated above). For instance, cellular automata have a completely natural formulation in the functional programming style. More generally, any discretised version of a differential equation can be written out easily in the functional programming style. The key advantage over procedural programming is that you don't need to worry about buffering the past states of the world because they are always available, at least until you erase them.&lt;/p&gt;&lt;p&gt;Functional programming should come naturally to anyone who has been exposed to the space-time way of thinking in physics, where the physicist surveys the whole 4-dimensional record of the world, and thinks in terms of complete histories (i.e. "audit trails").&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112755188919678795?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112755188919678795/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112755188919678795' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112755188919678795'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112755188919678795'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/09/functional-versus-procedural.html' title='Functional versus procedural programming'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112749679979607257</id><published>2005-09-24T20:00:00.000Z</published><updated>2005-09-25T16:17:01.210Z</updated><title type='text'>Mathematica</title><content type='html'>On &lt;a href="http://www.wolfram.com/products/mathematica"&gt;www.wolfram.com/products/mathematica&lt;/a&gt; it says &lt;em&gt;Mathematica&lt;/em&gt; is "The Way the World Calculates". That is sales blurb so I'll allow them a little exaggeration.&lt;br /&gt;&lt;br /&gt;It then goes on to say:&lt;br /&gt;&lt;br /&gt;"From simple calculator operations to large-scale programming and interactive-document preparation, Mathematica is the tool of choice at the frontiers of scientific research, in engineering analysis and modeling, in technical education from high school to graduate school, and wherever quantitative methods are used."&lt;br /&gt;&lt;br /&gt;That is is bit more restrained, and I imagine it is not too far from the truth.&lt;br /&gt;&lt;br /&gt;I have been a long-time user of &lt;em&gt;Mathematica&lt;/em&gt; since version 1 was released in 1988, and prior to that circa 1980 (when I was doing my PhD on QCD) I used SMP which was the direct ancestor of &lt;em&gt;Mathematica&lt;/em&gt;. I cannot exaggerate how much &lt;em&gt;Mathematica&lt;/em&gt; has influenced me. It was immediately clear right from version 1 that here was something really new and potentially useful. I say "potentially" because for many years (and versions) I was tantalised by what &lt;em&gt;Mathematica&lt;/em&gt; could do &lt;em&gt;in principle&lt;/em&gt;, but I was disappointed by its &lt;em&gt;slow&lt;/em&gt; compute speed which was a relic of its roots in symbolic algebra. All of these problems have now gone away, and &lt;em&gt;Mathematica&lt;/em&gt; is my main tool for technical computing of all sorts, including computationally challenging problems such as image processing. I can do everything I want within one environment, so my "laboratory notebook" is now a hyperlinked web of notebooks containing elements such as text, graphics, maths, software, etc. Apart from idle jottings and scribbles, I have not kept a paper notebook since around 2000. There are no artificial boundaries between the different types of element in &lt;em&gt;Mathematica&lt;/em&gt; notebooks which gives me an enormous efficiency advantage over people who don't use &lt;em&gt;Mathematica&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;I no longer think of &lt;em&gt;Mathematica&lt;/em&gt; as a computer language. It is actually a notation that extends standard mathematical notation into the world of algorithms. Loosely speaking, equations and algorithms are now the same thing for me, because &lt;em&gt;Mathematica&lt;/em&gt; is what "breathes the fire into the equations".&lt;br /&gt;&lt;br /&gt;I might sound as if I am evangelising about &lt;em&gt;Mathematica&lt;/em&gt;, but I am not. I have heard people complain about its price, but that is false economy. It currently costs around £1,600 for a commercial license, much cheaper for teachers, and extremely cheap for students. Based on its productivity and the amount of time you save it pays for itself very quickly.&lt;br /&gt;&lt;br /&gt;However, there is a downside. The learning curve for &lt;em&gt;Mathematica&lt;/em&gt; is long, especially for people who have used only procedural programming languages (e.g. Fortran, Basic, C++, Matlab, etc) because you have to make a mental shift to the functional programming style. However, these problems are alleviated by the extremely logical and unified structure of &lt;em&gt;Mathematica&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;There is another downside that I &lt;em&gt;do&lt;/em&gt; find irritating. You need a full &lt;em&gt;Mathematica&lt;/em&gt; installation on any computer on which you want to run software that you have written using &lt;em&gt;Mathematica&lt;/em&gt;. This means that it is &lt;em&gt;much&lt;/em&gt; more difficult to get non-&lt;em&gt;Mathematica&lt;/em&gt; users to pay attention to what you are doing than it would otherwise be. Currently it is &lt;em&gt;not&lt;/em&gt; possible for you to wrap your software in a standalone EXE. There are various translators that will convert &lt;em&gt;Mathematica&lt;/em&gt; to other languages, but they impose strong constraints on what &lt;em&gt;Mathematica&lt;/em&gt; constructs you are allowed to use; I have found them to be useless for serious work.&lt;br /&gt;&lt;br /&gt;There is a &lt;em&gt;Mathematica&lt;/em&gt; newsgroup at &lt;a href="news:comp.soft-sys.math.mathematica"&gt;comp.soft-sys.math.mathematica&lt;/a&gt; where world experts hang out to solve your every &lt;em&gt;Mathematica&lt;/em&gt; problem.&lt;br /&gt;&lt;br /&gt;I intend to use &lt;em&gt;Mathematica&lt;/em&gt; for any code that I might publish in this blog. I will try to ensure that the code can be read as pseudocode by non-&lt;em&gt;Mathematica&lt;/em&gt; users, though I will be programming functionally rather than dysfunctionally.&lt;br /&gt;&lt;br /&gt;Tag: &lt;a href="http://technorati.com/tag/Mathematica" rel="tag"&gt;Mathematica&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112749679979607257?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112749679979607257/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112749679979607257' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112749679979607257'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112749679979607257'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/09/mathematica.html' title='Mathematica'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112749319162699153</id><published>2005-09-23T16:12:00.000Z</published><updated>2005-09-25T16:17:42.773Z</updated><title type='text'>My online papers</title><content type='html'>I have a few papers published in arXiv:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;a href="http://arXiv.org/cs.LG/0410036"&gt;arXiv.org/cs.LG/0410036&lt;/a&gt;: Self-Organised Factorial Encoding of a Toroidal Manifold&lt;/li&gt;&lt;li&gt;&lt;a href="http://arXiv.org/cs.NE/0410020"&gt;arXiv.org/cs.NE/0410020&lt;/a&gt;: Adaptive Cluster Expansion (ACE): A Hierarchical Bayesian Network&lt;/li&gt;&lt;li&gt;&lt;a href="http://arXiv.org/cs.NE/0408050" name="cs.NE/0408050"&gt;arXiv.org/cs.NE/0408050&lt;/a&gt;: Invariant Stochastic Encoders&lt;/li&gt;&lt;li&gt;&lt;a href="http://arXiv.org/cs.NE/0408049" name="cs.NE/0408049"&gt;arXiv.org/cs.NE/0408049&lt;/a&gt;: Using Stochastic Encoders to Discover Structure in Data&lt;/li&gt;&lt;li&gt;&lt;a href="http://arXiv.org/cs.NE/0406017" name="cs.NE/0406017"&gt;arXiv.org/cs.NE/0406017&lt;/a&gt;: Using Self-Organising Mappings to Learn the Structure of Data Manifolds&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;I also have an archive of my Crown Copyright publications at &lt;a href="http://www.luttrell.org.uk/papers/index.html"&gt;www.luttrell.org.uk/papers&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Most of my papers are relevant in some way to the field of self-organising networks.&lt;/p&gt;Tag: &lt;a href="http://technorati.com/tag/arXiv" rel="tag"&gt;arXiv&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112749319162699153?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112749319162699153/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112749319162699153' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112749319162699153'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112749319162699153'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/09/my-online-papers.html' title='My online papers'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112743100879546931</id><published>2005-09-22T23:14:00.000Z</published><updated>2005-10-26T22:29:46.093Z</updated><title type='text'>Reductionism versus emergentism</title><content type='html'>The issue of reductionism versus emergentism amounts to whether you model your experimental observations in terms of underlying causes (this is reductionism), or whether you directly model the experimental observations in terms of themselves (this is emergentism). There are "wars" waged between the factions that support one or the other approach, with the reductionists being accused of being "coldly scientific", and the emergentists accused of being "vague and mystical". I think this polarisation of attitudes is silly; it seems that people simply like to be partisan and to spoil for a fight.&lt;br /&gt;&lt;br /&gt;A concrete example from physics is the modelling of experimental scattering amplitudes using an underlying quantum field theory (QFT), or modelling them directly using an S-matrix approach. QFT (reductionism) predicts the scattering amplitudes from the behaviour of underlying degrees of freedom, whereas the S-matrix approach (emergentism) models the observations directly by defining inter-relationships between the scattering amplitudes.&lt;br /&gt;&lt;br /&gt;Reductionism and emergentism are really the same thing, except that they have a different view about what the fundamental degrees of freedom are. Reductionism explains experimental observations in terms of degrees of freedom that are at a deeper level than the experimental observations, whereas emergentism regards the deepest level as being the level at which the experimental observations are made.&lt;br /&gt;&lt;br /&gt;A really good book that gives you an informal feel for the issues surrounding reductionism and emergentism is &lt;a href="http://www.amazon.com/gp/product/046503828X/102-5600871-4992938?v=glance&amp;n=283155&amp;amp;s=books&amp;amp;v=glance"&gt;A Different Universe: Reinventing Physics from the Bottom Down&lt;/a&gt; by &lt;a href="http://www.stanford.edu/dept/physics/people/faculty/laughlin_robert.html"&gt;Robert B Laughlin&lt;/a&gt;. This book explains why it may not always be useful to use a reductionist approach. Essentially, the up-side of a reductionist model is that it reduces everything to simpler deeper degrees of freedom, but the down-side is that you have to put in the effort to derive everything from these deeper degrees of freedom. Unfortunately, the amount of effort can be prohibitive, in which case the reductionist model explains the observations "in principle", but "in practice" it is too computationally expensive to be useful.&lt;br /&gt;&lt;br /&gt;How is all this relevant to self-organising networks?&lt;br /&gt;&lt;br /&gt;There are two general classes of network model called "generative" and "recognition" models:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The generative class of models are reductionist, because they explain experimental observations in terms of underlying causes (or hidden variables). Usually a generative model is primed in a fair amount of detail, so it contains quite a lot of prior knowledge.&lt;/li&gt;&lt;li&gt;The recognition class of models are emergentist, because they compute whatever they need without reference to underlying causes. Usually a recognition model is primed only loosely, so it contains little prior knowledge.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Reductionist and emergentist network models lie at two extremes of a spectrum of possibilities. However, most networks fall close to one or other of these extremes, because each approach has its own "mind set" that tends to repel the other.&lt;/p&gt;&lt;p&gt;The emphasis in this blog is to start with the emergentist approach to SONs, because this approach introduces less prior knowledge, and so is more "hands off" in its analysis of data. It is possible, although not guaranteed, that self-organisation can then be used to automatically discover the sorts of underlying degrees of freedom that are used in the reductionist approach.&lt;/p&gt;&lt;p&gt;If it all goes to plan then reductionism will emerge from emergentism by a process of self-organisation. This sets the scene for my research programme into self-organising networks.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112743100879546931?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112743100879546931/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112743100879546931' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112743100879546931'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112743100879546931'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/09/reductionism-versus-emergentism.html' title='Reductionism versus emergentism'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13056860.post-112742747316073234</id><published>2005-09-22T21:54:00.000Z</published><updated>2005-09-29T00:14:56.943Z</updated><title type='text'>ACEnetica</title><content type='html'>This is a new blog for discussing ideas about self-organising networks (SON).&lt;br /&gt;&lt;br /&gt;The term "ACEnetica" is derived from the phrase "Adaptive Cluster Expansion network", which is what I call my own offering in the field of self-organising networks.&lt;br /&gt;&lt;br /&gt;What is a SON?&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The "N" stands for "network" by which I actually mean a network of interconnected processing nodes, where the global processing activity is broken down into a number of smaller processing activities.&lt;/li&gt;&lt;li&gt;The "SO" stands for "self-organising" by which I mean that both the processing at each node &lt;em&gt;and&lt;/em&gt; the interconnections between nodes can be adjusted dynamically.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;This general definition of a SON covers a wide range of possibilities. However, the only SON approaches that I consider worth discussing in this blog are those that are "scientific"; the rest of the approaches fall into the category of "natural philosophy" which is definitely &lt;em&gt;not&lt;/em&gt; the focus of this blog.&lt;/p&gt;&lt;p&gt;This means that the approach must have the following minimal properties:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;It must be supported by a consistent theoretical framework.&lt;/li&gt;&lt;li&gt;It must have predictive power. &lt;/li&gt;&lt;li&gt;It must be open to falsification.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;There are two levels at which prediction and falsification are done with processing networks:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Relationship with the external physical world. This is conventional "science", where the processing network is judged by its ability to explain experimental observations.&lt;/li&gt;&lt;li&gt;Relationship with the world of algorithms. This is &lt;em&gt;un&lt;/em&gt;conventional "science", and could be called "a new kind of science" (see &lt;a href="http://www.wolframscience.com/"&gt;http://www.wolframscience.com/&lt;/a&gt;), where the processing network is judged by its ability to construct processing algorithms.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Some additional properties that are desirable (but not essential) in a SON approach are:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;It should scale well with network "size".&lt;/li&gt;&lt;li&gt;It should include "standard" approaches as special cases.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;An extra property that would be a real bonus if it emerged for free:&lt;/p&gt;&lt;p align="left"&gt;It explains some (or even all) of the properties of brain-like processing.&lt;/p&gt;&lt;p&gt;In this blog I will attempt to present SONs within the above context.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;Tag: &lt;a href="http://technorati.com/tag/ACEnetica" rel="tag"&gt;ACEnetica&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13056860-112742747316073234?l=acenetica.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://acenetica.blogspot.com/feeds/112742747316073234/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13056860&amp;postID=112742747316073234' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112742747316073234'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13056860/posts/default/112742747316073234'/><link rel='alternate' type='text/html' href='http://acenetica.blogspot.com/2005/09/acenetica.html' title='ACEnetica'/><author><name>Stephen Luttrell</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://3.bp.blogspot.com/_kyIFrMjt9ag/SP0C9a2y5pI/AAAAAAAAAG4/mfxx7PcZimI/S220/me2.jpg'/></author><thr:total>0</thr:total></entry></feed>
