Last Sunday (April 28th, 2013) was the 8th Black Board Day (BBD), a small informal workshop I organize every year. It started 8 years ago on my hero Kurt Gödel's 100th birthday. This year, I found out that April 30th (1916) is Claude Shannon's birthday, so I decided the theme would be his information theory.

I started by introducing probabilistic reasoning as an extension of logic in this uncertain world (as Michael Buice told us in BBD7). I quickly introduced two key concepts: Shannon's entropy $H(X) = -\sum_i p_i \log_2 p_i$, which additively quantifies the uncertainty of a sequence of independent random quantities in bits, and mutual information $I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)$, which quantifies how much uncertainty in $X$ is reduced by the knowledge of $Y$ (and vice versa; it's symmetric). I showed a simple example of the source coding theorem, which states that a symbol sequence can be maximally compressed to the length of its entropy (information content), and stated the noisy channel coding theorem, which provides an achievable limit on the information rate that can be passed through a channel (the channel capacity). Legend says that von Neumann told Shannon to use the word "entropy" due to its similarity to the concept in physics, so I gave a quick microcanonical picture that connects the Boltzmann entropy to Shannon's entropy.
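The two quantities above are easy to compute numerically. Here is a minimal sketch (my own helper functions, not from the talk) that computes entropy in bits and mutual information from a joint distribution table via the identity $I(X;Y) = H(X) + H(Y) - H(X,Y)$, which is equivalent to $H(X) - H(X|Y)$:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a discrete distribution p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # convention: 0 * log 0 = 0
    return -np.sum(p * np.log2(p))

def mutual_information(p_xy):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) from a joint probability table."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1)            # marginal over rows
    p_y = p_xy.sum(axis=0)            # marginal over columns
    return entropy(p_x) + entropy(p_y) - entropy(p_xy.ravel())

print(entropy([0.5, 0.5]))                              # fair coin: 1 bit
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))     # Y = X: 1 bit
print(mutual_information([[0.25, 0.25], [0.25, 0.25]])) # independent: 0 bits
```

The microcanonical connection is the special case of a uniform distribution: with $\Omega$ equally likely microstates, `entropy(np.ones(W)/W)` gives $\log_2 \Omega$, which matches Boltzmann's $S = k_B \ln \Omega$ up to the choice of base and units.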

Andrew Tan: Holographic entanglement entropy

Andrew wanted to show how space-time structure can be derived from holographic entanglement entropy, and furthermore to link it to graphical models such as the restricted Boltzmann machine. He gave overviews of quantum mechanics (deterministic linear dynamics of the quantum states), the density matrix, von Neumann entropy, and entanglement entropy (the entropy of a reduced density matrix, where we assume partial observation and marginalize over the rest). Then, he talked about the asymptotic behavior of entropy for the ground state and the critical regime, introduced a parameterized form of the Hamiltonian that gives rise to a specific dependence structure in space-time, and sketched the boundary dimension and area of that dependence structure. Unfortunately, we did not have enough time to finish what he wanted to tell us (see Swingle 2012 for details).
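To make "entropy of a reduced density matrix" concrete, here is a small sketch (my own illustration, not Andrew's) for the simplest entangled system: a two-qubit Bell state. Tracing out one qubit leaves a maximally mixed state, so the entanglement entropy is exactly 1 bit:

```python
import numpy as np

# Bell state |psi> = (|00> + |11>) / sqrt(2) on two qubits
psi = np.zeros(4)
psi[0] = psi[3] = 1 / np.sqrt(2)

rho = np.outer(psi, psi.conj())       # density matrix of the pure state

# Partial trace over qubit B: reshape to indices (a, b, a', b'), sum b = b'
rho_A = np.einsum('abcb->ac', rho.reshape(2, 2, 2, 2))

# von Neumann entropy S = -Tr[rho_A log2 rho_A], via eigenvalues
evals = np.linalg.eigvalsh(rho_A)
evals = evals[evals > 1e-12]
S = -np.sum(evals * np.log2(evals))
print(S)                              # 1 bit for a maximally entangled pair
```

For a product (unentangled) state the same computation gives zero, which is the sense in which entanglement entropy measures quantum correlation between a region and its complement.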

Jonathan Pillow: Information Schminformation

Information theory is widely applied in neuroscience and sometimes in machine learning. Jonathan sympathized with Shannon's 1956 note "The Bandwagon", which criticized the possible abuse/overselling of information theory. First, Jonathan focused on the derivation of a "universal" rate-distortion theory based on the "information bottleneck principle". Then, he continued with his recent ideas on optimal neural codes under different Bayesian distortion functions. He showed a multiple-choice exam example where maximizing mutual information can be worse, and a linear neural coding example for different cost functions.
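The point that mutual information can mislead is easy to illustrate with a toy version of the exam example. The numbers below are my own choices for illustration (not Jonathan's actual example): with the correct answer uniform over 4 options, a code that always eliminates one wrong option carries more mutual information than one that occasionally reveals the answer outright, yet leads to a lower probability of answering correctly:

```python
import numpy as np

H_X = 2.0                            # H(X): answer uniform over 4 options

# Code A: deterministically eliminates one wrong option.
# Posterior is uniform over the 3 remaining options.
I_A = H_X - np.log2(3)               # I = H(X) - H(X|Y) ~ 0.415 bits
acc_A = 1 / 3                        # best guess among 3 options

# Code B: with prob q reveals the answer exactly, otherwise says nothing.
q = 0.2
I_B = H_X - (1 - q) * H_X            # H(X|Y) = 0 if revealed, 2 bits if not
acc_B = q * 1.0 + (1 - q) * 0.25     # guess at random when not revealed

print(I_A, I_B)                      # code A carries MORE information
print(acc_A, acc_B)                  # ...but yields LOWER accuracy
```

Under a 0-1 loss (getting the exam question right), code B is the better neural code despite its lower mutual information, which is the kind of mismatch between information and task-relevant distortion the talk was about.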
