Mixture of point processes


Suppose you mix two Gaussian random variables \mathcal{N}(-1, 1) and \mathcal{N}(1, 1) equally; that is, each sample from the mixture comes from the first Gaussian with probability 1/2 and from the second otherwise. The mixture of Gaussians is evidently not a Gaussian. (Do not confuse this with adding two independent Gaussian random variables, which produces another Gaussian random variable.)
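The distinction between mixing and adding can be checked numerically. The sketch below assumes NumPy and component means of -1 and +1 with unit variance; the sample size, the seed, and the helper name `kurtosis` are illustrative choices. Both constructions have variance close to 2, but only the sum has the Gaussian kurtosis of 3; for the equal mixture the fourth standardized moment works out to 2.5.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Mixture: pick one of the two components per sample, then draw from it.
component = rng.integers(0, 2, size=n)           # 0 or 1, each with probability 1/2
means = np.where(component == 0, -1.0, 1.0)
mixture = rng.normal(loc=means, scale=1.0)

# Sum: add one independent draw from each Gaussian; this is N(0, 2).
total = rng.normal(-1.0, 1.0, size=n) + rng.normal(1.0, 1.0, size=n)

def kurtosis(x):
    """Fourth standardized moment; equals 3 for any Gaussian."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4)

print("variance:", mixture.var(), total.var())            # both close to 2
print("kurtosis:", kurtosis(mixture), kurtosis(total))    # about 2.5 vs about 3
```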

Similarly, a mixture of inhomogeneous Poisson processes results in a non-Poisson point process. The figure below illustrates the difference between a mixture of two Poisson processes (B) and a Poisson process with the same marginal intensity (rate) function (A). The colored bars indicate the rate over the real line (e.g. time); in this case each rate is constant over a fixed interval. The four realizations from each of the processes A and B are represented by rows of vertical ticks.
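A rough simulation of the two cases, under assumptions that are not taken from the figure: component 0 fires at a constant `rate` on [0, 1) only, component 1 fires at the same rate on [1, 2) only, and the two are mixed equally. Process A then has rate `rate / 2` on both intervals. The value of `rate`, the intervals, and the helper `fano` are hypothetical. The counts in the first interval have the same mean in both processes, but only A has Poisson statistics (Fano factor 1, independent intervals).

```python
import numpy as np

rng = np.random.default_rng(1)
rate = 10.0          # hypothetical rate of each component on its own interval
trials = 100_000

# Process A: Poisson process with the marginal rate, rate/2 on [0,1) and on [1,2).
a_first = rng.poisson(rate / 2, size=trials)
a_second = rng.poisson(rate / 2, size=trials)

# Process B: per realization, choose one component; it fires only on its own interval.
z = rng.integers(0, 2, size=trials)
b_first = rng.poisson(rate, size=trials) * (z == 0)
b_second = rng.poisson(rate, size=trials) * (z == 1)

def fano(counts):
    """Variance-to-mean ratio of spike counts; equals 1 for a Poisson count."""
    return counts.var() / counts.mean()

print("mean count on [0,1):", a_first.mean(), b_first.mean())    # both near rate/2
print("Fano factor on [0,1):", fano(a_first), fano(b_first))      # near 1 vs near 1 + rate/2
print("correlation between intervals:",
      np.corrcoef(a_first, a_second)[0, 1],                       # near 0
      np.corrcoef(b_first, b_second)[0, 1])                       # clearly negative
```

In B, seeing spikes in the first interval rules out the second component, so the two intervals are dependent; this dependence is what makes the mixture non-Poisson.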

Several special cases of mixed Poisson processes have been studied [1]; however, they are mostly limited to modeling over-dispersed homogeneous processes. In theoretical neuroscience, it is necessary to mix arbitrary (inhomogeneous) point processes. For example, to maximize the mutual information between the input spike trains and the output spike train of a neuron model [3], the entropy of a mixture of point processes is needed.

In general, a regular point process on the real line can be completely described by the conditional intensity function \lambda(t|\mathcal{H}_t), where \mathcal{H}_t is the full spiking history up to time t [2]. Let us discretize time to approximate a regular point process. Let \rho_k be the probability of a spike (an event) in the k-th bin of size \Delta, that is,

\rho_k \simeq \lambda(k \Delta|y_{1:k-1}) \Delta,

where y_{1:k-1} are the 0-1 responses in all the previous bins. The likelihood of observing y_k = 0 or y_k = 1, given the history, is simply

P(y_k|y_{1:k-1}, \lambda) = {\rho_k}^{y_k} \left(1 - \rho_k\right)^{1 - y_k}.
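As a minimal sketch of how this per-bin likelihood is evaluated, the function below sums log P(y_k|y_{1:k-1}, \lambda) over a binned spike train. The names `binned_log_likelihood` and `refractory_intensity` are hypothetical, as is the toy intensity (20 spikes/s, dropping to 2 spikes/s in the bin right after a spike). Bins are 0-indexed in the code, so `y[:k]` plays the role of the history.

```python
import numpy as np

def binned_log_likelihood(y, intensity, delta):
    """Sum over bins of log[rho_k^{y_k} (1 - rho_k)^{1 - y_k}]."""
    log_lik = 0.0
    for k in range(len(y)):
        # rho_k ~ lambda(k * delta | history) * delta, history = earlier bins
        rho_k = intensity(k * delta, y[:k]) * delta
        log_lik += y[k] * np.log(rho_k) + (1 - y[k]) * np.log1p(-rho_k)
    return log_lik

def refractory_intensity(t, history):
    """Hypothetical conditional intensity in spikes per second."""
    if len(history) > 0 and history[-1] == 1:
        return 2.0       # reduced rate right after a spike
    return 20.0

y = np.array([0, 1, 0, 0, 1])
print(binned_log_likelihood(y, refractory_intensity, delta=0.001))
```

The approximation requires \rho_k to stay strictly between 0 and 1, which holds here because \Delta is small relative to the inverse of the rate.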

In the limit of small \Delta, this approximation converges to a regular point process. A fun fact is that a mixture of Bernoulli random variables is Bernoulli again, since it’s the only distribution for 0-1-valued random variables. Specifically, for a family of Bernoulli random variables with probability of 1 being \rho_z indexed by z, and a mixing distribution P(z), the probability of observing one symbol y=0 or y=1 is

P(y) = \int P(y|z)P(z) \mathrm{d}z = \int {\rho_z}^{y} \left(1 - \rho_z\right)^{1 - y} P(z) \mathrm{d}z = {\bar\rho}^{y} \left(1 - \bar\rho\right)^{1 - y}

where \bar\rho = \int \rho_z P(z) \mathrm{d}z is the average probability.
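A quick numerical check of this fact, assuming a discrete mixing variable z with three values; the probabilities in `rho_z` and the weights in `p_z` are made-up numbers for illustration. The empirical frequency of y = 1 in the mixture should match \bar\rho.

```python
import numpy as np

rng = np.random.default_rng(2)

rho_z = np.array([0.1, 0.5, 0.9])    # hypothetical P(y = 1 | z) for each z
p_z = np.array([0.2, 0.5, 0.3])      # hypothetical mixing distribution P(z)

rho_bar = np.sum(rho_z * p_z)        # average probability, here 0.54

# Sample from the mixture: draw z first, then a Bernoulli variable given z.
z = rng.choice(len(rho_z), size=200_000, p=p_z)
y = rng.random(z.size) < rho_z[z]

print(rho_bar, y.mean())             # the two numbers should be close
```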

Suppose we mix \lambda(t|\mathcal{H}_t, z) with P(z), where z is drawn once per realization. Given the history y_{1:k-1}, the mixing variable is distributed as P(z|y_{1:k-1}) \propto P(y_{1:k-1}|z) P(z), so for the binned point process representation the above implies that

P(y_k|y_{1:k-1}) = \int P(y_k|y_{1:k-1}, z) P(z|y_{1:k-1}) \mathrm{d}z = {\bar\rho}_k^{y_k} \left(1 - \bar\rho_k \right)^{1 - y_k}

where \bar\rho_k = \int \rho_{k,z} P(z|y_{1:k-1}) \mathrm{d}z is the marginal spike probability and \rho_{k,z} \simeq \lambda(k \Delta|y_{1:k-1}, z) \Delta. Moreover, due to the causal dependence between the y_k's, we can chain the expansion and get the marginal probability of observing y_{1:k},

P(y_{1:k}) = P(y_k|y_{1:k-1}) P(y_{1:k-1}) = P(y_k|y_{1:k-1}) P(y_{k-1}|y_{1:k-2}) \cdots P(y_1)

= \prod_{i=1}^k {\bar\rho}_i^{y_i} \left(1-\bar\rho_i\right)^{1-y_i}.

Therefore, in the limit, the mixture point process is represented by the conditional intensity function

\lambda(t|\mathcal{H}_t) = \int \lambda(t|\mathcal{H}_t, z) P(z|\mathcal{H}_t) \mathrm{d}z.

Conclusion: The conditional intensity function of a mixture of point processes is the expected conditional intensity function under the mixing distribution conditioned on the history, P(z|\mathcal{H}_t).
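The chain factorization can be verified exactly on a tiny example. The sketch below assumes a two-component mixture over three bins, with z drawn once per realization so that the weights on z used at bin i are the probabilities of z given the bins already observed. The table `rho`, the weights `prior`, and the function names are hypothetical. It compares the directly marginalized probability of every binary sequence with the product of averaged Bernoulli terms.

```python
import itertools
import numpy as np

# rho[z, i]: spike probability of component z in bin i (made-up numbers).
rho = np.array([[0.30, 0.30, 0.05],
                [0.05, 0.30, 0.30]])
prior = np.array([0.5, 0.5])          # mixing distribution P(z)

def joint_direct(y):
    """P(y_{1:k}) = sum_z P(z) prod_i rho_{z,i}^{y_i} (1 - rho_{z,i})^{1 - y_i}."""
    per_z = np.prod(np.where(y == 1, rho, 1 - rho), axis=1)
    return float(prior @ per_z)

def joint_chained(y):
    """Same probability as a product of averaged Bernoulli terms, bin by bin."""
    weights = prior.copy()             # P(z | y_{1:i-1}), starting from P(z)
    prob = 1.0
    for i, y_i in enumerate(y):
        lik = rho[:, i] if y_i == 1 else 1 - rho[:, i]
        step = float(weights @ lik)    # rho_bar_i^{y_i} (1 - rho_bar_i)^{1 - y_i}
        prob *= step
        weights = weights * lik / step # Bayes update to P(z | y_{1:i})
    return prob

for bits in itertools.product([0, 1], repeat=3):
    y = np.array(bits)
    print(bits, joint_direct(y), joint_chained(y))
```

Each row should show the same probability twice, and the probabilities over all eight sequences sum to one.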


  1. Grandell. Mixed Poisson processes. Chapman & Hall / CRC Press 1997
  2. Daley, Vere-Jones. An Introduction to the Theory of Point Processes. Springer.
  3. Taro Toyoizumi, Jean-Pascal Pfister, Kazuyuki Aihara, Wulfram Gerstner. Generalized Bienenstock–Cooper–Munro rule for spiking neurons that maximizes information transmission. PNAS, 2005. doi:10.1073/pnas.0500495102
2 Comments
  1. 2012/08/29 9:45 pm

    Interesting! I don’t understand how you have N(0,1) and N(0,-1). Does the second one have negative variance? I tried googling that and ended up in mostly esoteric stuff that I don’t know about. Your chart looks to me more like evenly mixing N(1,1) with N(-1,1).

    Also, do you mind explaining what is on the horizontal axis in the second figure?

    • memming
      2012/08/29 10:25 pm

      Thanks for your interest and for pointing out the errors. Each vertical tick in the second figure represents an event, such as an action potential, in time. I tried adding more explanation to the text.
