# NIPS 2014 main meeting

NIPS is growing fast, with 2400+ participants! I felt there were proportionally fewer “neuro” papers compared to last year, maybe because of the huge presence of deep network papers. My NIPS keywords of the year: deep learning, Bethe approximation/partition function, variational inference, climate science, game theory, and Hamming ball. Here are my notes on interesting papers and talks, from my biased sampling as a neuroscientist, as I did for previous meetings. Other bloggers have written about the conference: Paul Mineiro, John Platt, Yisong Yue and Yun Hyokun (in Korean).

## The NIPS Experiment

The program chairs, Corinna Cortes and Neil Lawrence, ran an experiment on the reviewing process to estimate its inconsistency. 10% of the papers were chosen to be reviewed independently by two pools of reviewers and area chairs, so those authors received 6-8 reviews and had to submit two author responses. The two pools disagreed on around 25% of the papers, meaning around half of the accepted papers could have been rejected (the baseline assuming independent random acceptance was around 38%). This tells you how large the variability in the NIPS reviewing process is, so keep that in mind whether or not your papers got in! All papers on which the two pools disagreed were accepted, so the overall acceptance rate was a bit higher this year. For details, see Eric Price’s blog post and Bert Huang’s post.
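As a back-of-the-envelope check (a sketch assuming a ~25% acceptance rate in the experiment pool, which matches the reported 38% random baseline), the numbers hang together:

```python
# Assumed rough numbers from the experiment: 25% acceptance, 25% disagreement.
a = 0.25      # acceptance rate of each committee
d_obs = 0.25  # fraction of papers the two committees disagreed on

# If each committee accepted papers independently at random,
# a paper lands in disagreement with probability 2*a*(1-a).
d_random = 2 * a * (1 - a)  # 0.375, i.e. the ~38% baseline

# Half of the disagreements are "A accepts, B rejects", so the fraction of
# one committee's accepted papers that the other would reject is:
flipped = (d_obs / 2) / a   # 0.5, i.e. about half of the accepted papers
```

So the “half of accepted papers could have been rejected” headline follows directly from a 25% disagreement rate at a 25% acceptance rate.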

## Latent variable modeling of neural population activity

**Extracting Latent Structure From Multiple Interacting Neural Populations**

Joao Semedo, Amin Zandvakili, Adam Kohn, Christian K. Machens, Byron M. Yu

How can we quantify how two populations of neurons interact? A full interaction model would require O(N^2) parameters, which quickly makes inference intractable. A low-dimensional interaction model can therefore be useful, and this paper does exactly that by extending the ideas of canonical correlation analysis to vector autoregressive processes.
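A minimal sketch of the low-rank idea (not the authors' exact model, and all sizes and data below are made up): fit an ordinary least-squares map from one population to the other, then truncate it to a small rank via SVD (reduced-rank regression), so only a few latent dimensions carry the interaction.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n1, n2, rank = 500, 20, 15, 3  # time bins, population sizes, latent rank (made up)

# Simulate population 1, and let population 2 depend on it through a rank-3 map.
X = rng.standard_normal((T, n1))
B_true = rng.standard_normal((n1, rank)) @ rng.standard_normal((rank, n2))
Y = X @ B_true + 0.1 * rng.standard_normal((T, n2))

# Full-rank OLS estimates n1*n2 interaction parameters ...
B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)

# ... but truncating the fitted values to rank r keeps only a few
# interaction dimensions (reduced-rank regression).
U, s, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
Y_hat = U[:, :rank] * s[:rank] @ Vt[:rank]  # best rank-r approximation of the fit
```

The paper's contribution is doing this kind of dimensionality reduction in a dynamical (vector autoregressive) setting rather than a static regression.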

**Clustered factor analysis of multineuronal spike data**

Lars Buesing, Timothy A. Machado, John P. Cunningham, Liam Paninski

How can you put more structure into a PLDS (Poisson linear dynamical system) model? They assume that disjoint groups of neurons have loadings from a restricted set of factors only. In their application, they additionally restrict the loading weights to be non-negative, in order to separate out the two underlying components of oscillation in the spinal cord. They have a clever subspace-clustering-based initialization and a variational inference procedure.
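A toy illustration of why non-negative loadings help separate additive components (this is plain multiplicative-update NMF on made-up data, not the paper's clustered factor analysis): two disjoint neuron groups, each driven by its own non-negative factor, can be factored without the sign cancellations an unconstrained factorization would allow.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, K = 12, 300, 2  # neurons, time bins, factors (hypothetical sizes)

# Two disjoint neuron groups, each loading on only one non-negative factor.
C_true = np.zeros((N, K))
C_true[:6, 0] = rng.uniform(0.5, 1.5, 6)   # group 1 loads only on factor 1
C_true[6:, 1] = rng.uniform(0.5, 1.5, 6)   # group 2 loads only on factor 2
F_true = rng.uniform(0, 1, (K, T))
Y = C_true @ F_true + 0.01 * rng.uniform(size=(N, T))

# Multiplicative updates keep loadings C and factors F non-negative throughout.
C = rng.uniform(0.1, 1, (N, K))
F = rng.uniform(0.1, 1, (K, T))
for _ in range(200):
    F *= (C.T @ Y) / (C.T @ C @ F + 1e-9)
    C *= (Y @ F.T) / (C @ F @ F.T + 1e-9)
```

The paper goes further by learning the group assignments themselves and embedding this in a Poisson dynamical system.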

**A Bayesian model for identifying hierarchically organised states in neural population activity**

Patrick Putzky, Florian Franzen, Giacomo Bassetto, Jakob H. Macke

How do you capture discrete states in the brain, such as UP/DOWN states? They propose a probabilistic hierarchical hidden Markov model for a population of spiking neurons. The hierarchical structure reduces the effective number of parameters of the state transition matrix. The full model captures the population variability better than coupled GLMs, though the number of states and their structure is not learned. Estimation is via variational inference.
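To see how a hierarchy cuts down transition parameters, here is a sketch with a hypothetical two-level structure (not the paper's exact model): two super-states (say, UP and DOWN), each containing three sub-states, with sub-state dynamics tied across super-states. A full 6-state transition matrix has 6×5 = 30 free parameters; the tied parameterization below has only a handful.

```python
import numpy as np

n_super, n_sub = 2, 3  # hypothetical: 2 super-states, 3 sub-states each

# Super-state dynamics: sticky, 1 free parameter instead of n_super*(n_super-1).
stay = 0.95
A_super = np.full((n_super, n_super), (1 - stay) / (n_super - 1))
np.fill_diagonal(A_super, stay)

# Sub-state dynamics shared across super-states (here: uniform mixing).
A_sub = np.full((n_sub, n_sub), 1.0 / n_sub)

# Kronecker structure ties the parameters of the full 6x6 transition matrix.
A = np.kron(A_super, A_sub)
```

The product of two row-stochastic matrices under `np.kron` is again row-stochastic, so `A` is a valid transition matrix over the 6 combined states.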

**On the relations of LFPs & Neural Spike Trains**

David E. Carlson, Jana Schaich Borg, Kafui Dzirasa, Lawrence Carin.

**Analysis of Brain States from Multi-Region LFP Time-Series**

Kyle R. Ulrich, David E. Carlson, Wenzhao Lian, Jana S. Borg, Kafui Dzirasa, Lawrence Carin

## Bayesian brain, optimal brain

**Fast Sampling-Based Inference in Balanced Neuronal Networks**

Guillaume Hennequin, Laurence Aitchison, Mate Lengyel

**Sensory Integration and Density Estimation**

Joseph G. Makin, Philip N. Sabes

**Optimal Neural Codes for Control and Estimation**

Alex K. Susemihl, Ron Meir, Manfred Opper

**Spatio-temporal Representations of Uncertainty in Spiking Neural Networks**

Cristina Savin, Sophie Denève

**Optimal prior-dependent neural population codes under shared input noise**

Agnieszka Grabska-Barwinska, Jonathan W. Pillow

**Neurons as Monte Carlo Samplers: Bayesian Inference and Learning in Spiking Networks**

Yanping Huang, Rajesh P. Rao

## Other Computational and/or Theoretical Neuroscience

**Using the Emergent Dynamics of Attractor Networks for Computation (Posner lecture)**

J. J. Hopfield

He introduced bump attractor networks via an analogy to magnetic bubble (shift-register) memory. He suggested that cadence and duration variations in voice can be naturally integrated with state-dependent synaptic input. Hopfield previously suggested using relative spike timings to solve a similar problem in olfaction. Note that this continuous attractor theory predicts a low-dimensional neural representation. His paper is available as a preprint.

**Deterministic Symmetric Positive Semidefinite Matrix Completion**

William E. Bishop and Byron M. Yu

See the workshop post, where Will gave a talk on this topic.

## General Machine Learning

**Identifying and attacking the saddle point problem in high-dimensional non-convex optimization**

Yann N. Dauphin, Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, Yoshua Bengio

Based on results from statistical physics, they hypothesize that in high dimensions saddle points, rather than local minima, are the main cause of slow convergence of stochastic gradient descent. Moreover, the exact Newton method converges to saddle points, and (stochastic) gradient descent is slow to escape them, causing lengthy plateaus when training neural networks. They provide a theoretical justification for a known heuristic optimization method: taking the absolute value of the eigenvalues of the Hessian when computing the Newton step. This avoids saddles and dramatically improves convergence speed.
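The saddle-free idea on a toy saddle f(x, y) = x² − y² (a minimal sketch, not the paper's scalable method): the plain Newton step jumps straight to the saddle at the origin and stays there, while replacing the Hessian's eigenvalues by their absolute values turns the step into a descent direction that escapes it.

```python
import numpy as np

# Toy saddle: f(x, y) = x^2 - y^2, gradient [2x, -2y], Hessian diag(2, -2).
def grad(p):
    return np.array([2 * p[0], -2 * p[1]])

H = np.diag([2.0, -2.0])

# |H|: replace each eigenvalue of the Hessian by its absolute value.
w, V = np.linalg.eigh(H)
H_abs = V @ np.diag(np.abs(w)) @ V.T

p_newton = np.array([1.0, 1.0])
p_sfn = np.array([1.0, 1.0])
for _ in range(10):
    p_newton -= np.linalg.solve(H, grad(p_newton))      # jumps to the saddle (0, 0)
    p_sfn -= 0.5 * np.linalg.solve(H_abs, grad(p_sfn))  # descends: x -> 0, |y| grows
```

After the loop, `p_newton` sits exactly at the saddle, while the saddle-free iterate has shrunk x toward 0 and moved y away from the saddle, decreasing f.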

**A* Sampling**

Chris J. Maddison, Daniel Tarlow, Tom Minka

Extends the Gumbel-Max trick to an exact sampling algorithm for general (low-dimensional) continuous distributions with intractable normalizers. The trick involves perturbing a discrete-domain function by adding independent samples from the Gumbel distribution. They construct a Gumbel process which gives bounds on the intractable log partition function, and use it to sample.
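For reference, the underlying Gumbel-Max trick in the discrete case (a quick sketch; the paper's contribution is extending this to continuous spaces): perturb each unnormalized log-probability with independent Gumbel(0, 1) noise and take the argmax. The argmax is an exact sample from the normalized distribution, without ever computing the partition function.

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.array([1.0, 2.0, 0.5])  # unnormalized log-probabilities (made up)

def gumbel_max_sample(logits, rng):
    # argmax(logits + Gumbel noise) ~ softmax(logits), exactly.
    g = rng.gumbel(size=logits.shape)  # i.i.d. Gumbel(0, 1) perturbations
    return np.argmax(logits + g)

# Empirical frequencies should match the softmax probabilities.
samples = [gumbel_max_sample(logits, rng) for _ in range(20000)]
freq = np.bincount(samples, minlength=len(logits)) / len(samples)
p = np.exp(logits) / np.exp(logits).sum()
```

Note that `p` is only computed here to check the result; the sampler itself never touches the normalizer, which is the whole point of the trick.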

**Divide-and-Conquer Learning by Anchoring a Conical Hull**

Tianyi Zhou, Jeff A. Bilmes, Carlos Guestrin

**Spectral Learning of Mixture of Hidden Markov Models**

Cem Subakan, Johannes Traa, Paris Smaragdis

**Clamping Variables and Approximate Inference**

Adrian Weller, Tony Jebara

His slides are available online.

**Information-based learning by agents in unbounded state spaces**

Shariq A. Mobin, James A. Arnemann, Fritz Sommer

**Expectation Backpropagation: Parameter-Free Training of Multilayer Neural Networks with Continuous or Discrete Weights**

Daniel Soudry, Itay Hubara, Ron Meir

**Self-Paced Learning with Diversity**

Lu Jiang, Deyu Meng, Shoou-I Yu, Zhenzhong Lan, Shiguang Shan, Alexander Hauptmann