Suppose you mix two Gaussian random variables $\mathcal{N}(-1, 1)$ and $\mathcal{N}(1, 1)$ equally; that is, a sample from the mixture comes from the first Gaussian with probability 1/2 and from the second otherwise. It is evident that the mixture of Gaussians is not a Gaussian. (Do not confuse this with adding two Gaussian random variables, which produces another Gaussian random variable.) Similarly, a mixture of inhomogeneous Poisson processes results in a non-Poisson point process.

The figure below illustrates the difference between a mixture of two Poisson processes (B) and a Poisson process with the same marginal intensity (rate) function (A). The colored bars indicate the rate over the real line (e.g. time); in this case each rate is constant over a fixed interval. The 4 realizations from each of the processes A and B are represented by rows of vertical ticks.

Several special cases of mixed Poisson processes have been studied [1]; however, they are mostly limited to modeling over-dispersed homogeneous processes. In theoretical neuroscience, it is necessary to mix arbitrary (inhomogeneous) point processes. For example, to maximize the mutual information between the input spike trains and the output spike train of a neuron model [3], the entropy of a mixture of point processes is needed.
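To make the contrast between processes A and B concrete, here is a minimal simulation sketch. It assumes the simplest setting, which is not necessarily the one in the figure: two homogeneous Poisson processes on a unit interval with rates 2 and 10, mixed with equal weights. The function name `count_statistics` and all parameter values are illustrative choices. It compares spike counts only, using the Fano factor (variance divided by mean), which equals 1 for a Poisson count.

```python
import numpy as np


def count_statistics(rates, weights, duration=1.0, n_trials=100_000, seed=0):
    """Compare spike counts of a Poisson process (A) and a mixed Poisson process (B)."""
    rng = np.random.default_rng(seed)
    rates = np.asarray(rates, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Process A: a single Poisson process with the marginal (weight-averaged) rate.
    marginal_rate = float(weights @ rates)
    counts_a = rng.poisson(marginal_rate * duration, size=n_trials)

    # Process B: pick a component z once per trial, then draw a Poisson count
    # at that component's rate.
    z = rng.choice(len(rates), size=n_trials, p=weights)
    counts_b = rng.poisson(rates[z] * duration)

    def fano(counts):
        return counts.var() / counts.mean()

    return {
        "mean_a": counts_a.mean(), "fano_a": fano(counts_a),
        "mean_b": counts_b.mean(), "fano_b": fano(counts_b),
    }


# Assumed example values: rates 2 and 10 events per unit time, mixed equally.
stats = count_statistics(rates=[2.0, 10.0], weights=[0.5, 0.5])
print(stats)
```

Both processes have the same mean count (6 for these assumed rates), but by the law of total variance the mixed count has variance 6 + 16 = 22, so its Fano factor should be near 22/6 rather than 1. The mixture is over-dispersed and therefore not Poisson.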

In general, a regular point process on the real line can be completely described by the conditional intensity function $\lambda(t|\mathcal{H}_t)$, where $\mathcal{H}_t$ is the full spiking history up to time $t$ [2]. Let us approximate a regular point process in discrete time. Let $\rho_k$ be the probability of a spike (an event) in the $k$-th bin of size $\Delta$, that is, $\rho_k \simeq \lambda(k \Delta|y_{1:k-1}) \Delta,$

where $y_{1:k-1}$ are the 0-1 responses in all the previous bins. The likelihood of observing $y_k = 0$ or $y_k = 1$, given the history, is simply $P(y_k|y_{1:k-1}, \lambda) = {\rho_k}^{y_k} \left(1 - \rho_k\right)^{1 - y_k}.$
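The per-bin likelihood above multiplies across bins, so the log-likelihood of a whole binned spike train is a sum of Bernoulli terms. The sketch below assumes the conditional intensity has already been evaluated in every bin (given the history) and is passed in as an array. The function name `binned_log_likelihood` and the example values are hypothetical.

```python
import numpy as np


def binned_log_likelihood(y, intensity, bin_size):
    """Log-likelihood of a 0-1 spike train under the binned approximation.

    y         : 0-1 responses, one per bin
    intensity : conditional intensity evaluated in each bin (given the history)
    bin_size  : bin width Delta, in the same time units as 1 / intensity
    """
    y = np.asarray(y, dtype=float)
    rho = np.asarray(intensity, dtype=float) * bin_size  # rho_k ~ lambda_k * Delta
    if np.any(rho <= 0) or np.any(rho >= 1):
        raise ValueError("rho_k must lie strictly between 0 and 1; use a smaller bin")
    # sum_k [ y_k log(rho_k) + (1 - y_k) log(1 - rho_k) ]
    return float(np.sum(y * np.log(rho) + (1 - y) * np.log1p(-rho)))


# Assumed example: constant intensity of 20 events/s, 1 ms bins, two spikes.
y = [0, 1, 0, 0, 1]
intensity = np.full(5, 20.0)
loglik = binned_log_likelihood(y, intensity, bin_size=1e-3)
print(loglik)
```

The range check reflects that $\rho_k \simeq \lambda \Delta$ is only a valid probability when $\Delta$ is small relative to $1/\lambda$.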

In the limit of small $\Delta$, this approximation converges to a regular point process. A fun fact is that a mixture of Bernoulli random variables is Bernoulli again, since every distribution of a 0-1-valued random variable is Bernoulli. Specifically, for a family of Bernoulli random variables with probability of 1 being $\rho_z$ indexed by $z$, and a mixing distribution $P(z)$, the probability of observing one symbol $y=0$ or $y=1$ is $P(y) = \int P(y|z)P(z) \mathrm{d}z = \int {\rho_z}^{y} \left(1 - \rho_z\right)^{1 - y} P(z) \mathrm{d}z = {\bar\rho}^{y} \left(1 - \bar\rho\right)^{1 - y}$

where $\bar\rho = \int \rho_z P(z) \mathrm{d}z$ is the average probability.
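The Bernoulli identity above is easy to verify numerically. The following sketch assumes a discrete mixing distribution, so the integral becomes a sum. The values of `rho` and `p_z` are arbitrary illustrations, and the function names are hypothetical.

```python
import numpy as np


def mixture_pmf(rho, p_z):
    """P(y) for y in (0, 1), computed by summing P(y|z) P(z) over components z."""
    rho = np.asarray(rho, dtype=float)
    p_z = np.asarray(p_z, dtype=float)
    p_one = np.sum(rho * p_z)           # sum_z rho_z P(z)
    p_zero = np.sum((1.0 - rho) * p_z)  # sum_z (1 - rho_z) P(z)
    return np.array([p_zero, p_one])


def bernoulli_pmf(rho_bar):
    """P(y) for y in (0, 1) under a single Bernoulli with success probability rho_bar."""
    return np.array([1.0 - rho_bar, rho_bar])


# Assumed example: three components with mixing weights that sum to 1.
rho = np.array([0.1, 0.5, 0.9])
p_z = np.array([0.2, 0.3, 0.5])
rho_bar = float(np.dot(rho, p_z))  # average probability

print(mixture_pmf(rho, p_z), bernoulli_pmf(rho_bar))
```

The two arrays agree because $P(y=1)$ is linear in $\rho_z$, which is exactly why averaging $\rho_z$ under $P(z)$ is enough for a single bin.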

Suppose we mix $\lambda(t|\mathcal{H}_t, z)$ with $P(z)$. Then, similarly, for the binned point process representation, the above implies that $P(y_k|y_{1:k-1},\lambda) = \int P(y_k|y_{1:k-1},\lambda, z) P(z) \mathrm{d}z = {\bar\rho}_k^{y_k} \left(1 - \bar\rho_k \right)^{1 - y_k}$

where $\bar\rho_k = \int \rho_k P(z) \mathrm{d}z$ is the marginal rate, with $\rho_k$ now computed from $\lambda(t|\mathcal{H}_t, z)$. Moreover, due to the causal dependence between the $y_k$, we can chain the expansion and get the marginal probability of observing $y_{1:k}$, $P(y_{1:k}) = P(y_k|y_{1:k-1}) P(y_{1:k-1}) = P(y_k|y_{1:k-1}) P(y_{k-1}|y_{1:k-2}) \cdots P(y_1) = \prod_{i=1}^k {\bar\rho}_i^{y_i} \left(1-\bar\rho_i\right)^{1-y_i}.$

Therefore, in the limit the mixture point process is represented by the conditional intensity function, $\lambda(t|\mathcal{H}_t) = \int \lambda(t|\mathcal{H}_t, z) P(z) \mathrm{d}z$.

Conclusion: The conditional intensity function of a mixture of point processes is given by the expected conditional intensity function over the mixing distribution.

References

1. Grandell. Mixed Poisson processes. Chapman & Hall / CRC Press 1997
2. Daley, Vere-Jones. An Introduction to the Theory of Point Processes. Springer.
3. Taro Toyoizumi, Jean-Pascal Pfister, Kazuyuki Aihara, Wulfram Gerstner. Generalized Bienenstock–Cooper–Munro rule for spiking neurons that maximizes information transmission. PNAS, 2005. doi:10.1073/pnas.0500495102